在 Power BI 中使用 DirectQueryUsing DirectQuery in Power BI

使用 Power BI Desktop 或 Power BI 服务时,可以连接各种类型的数据源,并且可以通过不同的方式连接这些数据。You can connect to all sorts of different data sources when using Power BI Desktop or the Power BI service, and you can make those data connections in different ways. 可以将数据导入 Power BI,也可以在其原始源存储库中直接连接数据,前者是获取数据最常见的方法,后者称为 DirectQuery。You can either import data to Power BI, which is the most common way to get data, or you can connect directly to data in its original source repository, which is known as DirectQuery. 本文介绍 DirectQuery 及其功能,具体包括以下主题:This article describes DirectQuery and its capabilities, including the following topics:

  • DirectQuery 的各种连接选项Different connectivity options for DirectQuery
  • 何时应考虑使用 DirectQuery 而非导入的相关指导Guidance for when you should consider using DirectQuery rather than import
  • 使用 DirectQuery 时带来的弊端Drawbacks of using DirectQuery
  • 使用 DirectQuery 的最佳做法Best practice for using DirectQuery

简而言之,使用 DirectQuery 和导入的最佳做法是:In short, the best practice for using import versus DirectQuery is the following:

  • 如可能,应使用“导入”将数据导入 Power BI。You should import data to Power BI wherever possible. 这样做可以充分利用 Power BI 的高性能查询引擎,并提供高度交互和功能完善的数据体验。This takes advantage of the high performance query engine of Power BI, and provides a highly interactive and fully featured experience over your data.
  • 如果通过导入数据无法实现目标,则可以考虑使用 DirectQuery。If your goals can't be met by importing data, then consider using DirectQuery. 例如,如果数据被频繁更改并且报表必须反映最新数据,DirectQuery 可能是最佳选择。For example, if the data is changing frequently and reports must reflect the latest data, DirectQuery may be best. 但是,通常只有当基础数据源可以为典型聚合查询提供交互式查询(少于 5 秒)并且能够处理生成的查询负载时,使用 DirectQuery 才可行。However, using DirectQuery is generally only be feasible when the underlying data source can provide interactive queries (less than 5 seconds) for the typical aggregate query, and is able to handle the query load that will be generated. 此外,考虑结合使用 DirectQuery 的限制列表时应格外慎重,以确保仍然能够实现目标。Additionally, the list of limitations that accompany use of DirectQuery should be considered carefully, to ensure your goals can still be met.

Power BI 为两种连接模式(导入和 DirectQuery)提供的功能集将随时间的推移不断改进。The set of capabilities offered by Power BI for both connectivity modes – import and DirectQuery - will evolve over time. 这将包括在使用导入数据时提供更强的灵活性,以便可以在更多情况下使用导入模式,以及消除使用 DirectQuery 时带来的的一些弊端。This will include providing more flexibility when using imported data, such that import can be used in more cases, as well as eliminating some of the drawbacks of using DirectQuery. 无论如何改进,使用 DirectQuery 时都会始终将基础数据源的性能作为主要考虑因素。Regardless of improvements, when using DirectQuery the performance of the underlying data source will always remain a major consideration. 如果该基础数据源速度缓慢,则对该数据源使用 DirectQuery 将仍然不可行。If that underlying data source is slow, then using DirectQuery for that source it will remain unfeasible.

本主题介绍 Power BI(而不是 SQL Server Analysis Services)中的 DirectQuery。This topic covers DirectQuery with Power BI, and not SQL Server Analysis Services. DirectQuery 也是 SQL Server Analysis Services 的功能之一,下面描述的许多细节同样适用于 SQL Server Analysis Services,但还是存在重大区别。DirectQuery is also a feature of SQL Server Analysis Services, and many of the details described below apply to its use, there are also important differences. 若要详细了解如何使用 SQL Server Analysis Services 的 DirectQuery,请参阅 SQL Server Analysis Services 2016 中的 DirectQueryFor information about using DirectQuery with SQL Server Analysis Services, see the whitepaper that details DirectQuery in SQL Server Analysis Services 2016.

本文重点介绍 DirectQuery 的建议工作流、在 Power BI Desktop 中创建报表的位置,以及如何在 Power BI 服务中直接连接数据。This article focuses on the recommended workflow for DirectQuery, where the report is created in Power BI Desktop, but also covers connecting directly in the Power BI service.

Power BI 连接模式Power BI connectivity modes

Power BI 可连接大量不同类型的数据源,包括:Power BI connects to a very large number of varied data sources, encompassing:

  • 联机服务(Salesforce、Dynamics 365 等)Online services (Salesforce, Dynamics 365, others)
  • 数据库(SQL Server、Access、Amazon Redshift 等)Databases (SQL Server, Access, Amazon Redshift, others)
  • 简单文件(Excel、JSON 等)Simple files (Excel, JSON, others)
  • 其他数据源(Spark、网站、Microsoft Exchange 等)Other data sources (Spark, Web sites, Microsoft Exchange, others)

这些数据源通常都可以使用导入方法将数据导入 Power BI。For these sources, it's usually possible to import the data to Power BI. 其中某些数据源还可以使用 DirectQuery 进行连接。For some it is also possible to connect using DirectQuery. 有关支持 DirectQuery 的其他一组数据源的信息,请参阅 DirectQuery 支持的数据源一文。The exact set of sources that support DirectQuery is described in the Data Sources supported by DirectQuery article. 将来会有更多数据源支持 DirectQuery,我们会主要关注可以提供良好的交互式查询性能的数据源。More sources will be DirectQuery enabled in the future, focusing primarily on sources that can be expected to deliver good interactive query performance.

SQL Server Analysis Services 是一个特例。SQL Server Analysis Services is a special case. 连接到 SQL Server Analysis Services 时,你可以选择导入数据,或使用实时连接。When connecting to SQL Server Analysis Services, you can choose to import the data, or use a live connection. 使用实时连接与使用 DirectQuery 在以下方面类似:不导入任何数据,始终对基础数据源进行查询以刷新视觉对象;但在许多其他方面实时连接又有所不同,因此使用了不同的术语(“实时”和“DirectQuery”)。Using a live connection is similar to DirectQuery, in that no data is imported, and the underlying data source is always queried to refresh a visual, but a live connection is different in many other regards, so a different term (live versus DirectQuery) is used.

以下各节详细说明了这三个数据连接选项 -“导入”、“DirectQuery”和“实时连接”。These three options for connecting to data – import, DirectQuery, and live connection – are explained in detail in the following sections.

导入连接Import connections

如果在 Power BI Desktop 中使用“获取数据”以连接数据源(如 SQL Server),并且选择“导入”,则该连接行为如下所示:When using Get Data in Power BI Desktop to connect to a data source like SQL Server, and you choose Import, the behavior of that connection is as follows:

  • 在初始“获取数据”体验期间,所选的每个表定义一个将返回一组数据的查询(可以在加载数据之前对这些查询进行编辑,例如应用筛选器、聚合数据或联接不同的表)。During the initial Get Data experience, the set of tables selected each define a query that will return a set of data (those queries can be edited prior to loading the data, for example to apply filters, or aggregate the data, or join different tables).
  • 加载时,这些查询定义的所有数据都将被导入 Power BI 高速缓存。Upon load, all of the data defined by those queries will be imported into the Power BI cache.
  • 在 Power BI Desktop 中生成视觉对象时,将查询导入的数据。Upon building a visual within Power BI Desktop, the imported data will be queried. Power BI 存储可确保查询速度非常快,因此可立即反映视觉对象的所有更改。The Power BI store ensures the query will be very fast, hence all changes to the visual will be reflected immediately.
  • 但对基础数据的任何更改都不会反映在任何视觉对象中。Any changes to the underlying data will not be reflected in any visuals. 必须先“刷新”,然后重新导入数据。It is necessary to Refresh, whereupon the data will be re-imported.
  • 报表(.pbix 文件)发布到 Power BI 服务时,就会创建一个数据集并上传到 Power BI 服务。Upon publishing the report (the .pbix file) to the Power BI service, a dataset is created and uploaded to the Power BI service. 该数据集包含导入的数据。The imported data is included with that dataset. 随后可以对该数据设置按计划刷新,例如每天重新导入数据。It is then possible to set up scheduled refresh of that data, for example, to re-import the data every day. 可能必须配置本地数据网关,具体取决于原始数据源的位置。Depending upon the location of the original data source, it might be necessary to configure an on-premises data gateway.
  • 在 Power BI 服务中打开现有报表或创作新报表时,将再次查询导入的数据,确保交互性。When opening an existing report in the Power BI service, or authoring a new report, the imported data is queried again, ensuring interactivity.
  • 可以将视觉对象或整个报表页固定为仪表板磁贴。Visuals, or entire report pages, can be pinned as dashboard tiles. 每次刷新基础数据集时,将自动刷新磁贴。The tiles will be automatically refreshed whenever the underlying dataset is refreshed.

DirectQuery 连接DirectQuery connections

如果在 Power BI Desktop 中使用“获取数据”连接数据源,并且选择“DirectQuery”,则该连接行为如下所示:When using Get Data in Power BI Desktop to connect to a data source, and you choose DirectQuery, the behavior of that connection is as follows:

  • 在初始“获取数据”体验期间,会选择数据源。During the initial Get Data experience, the source is selected. 对于关系数据源,这意味着选择一组表,每个表仍定义一个查询,该查询在逻辑上返回一组数据。For relational sources, this means a set of tables are selected and each still define a query that logically returns a set of data. 对于多维数据源(如 SAP BW),将仅选择数据源。For multidimensional sources like SAP BW, only the source is selected.
  • 但在加载时,实际上不会将数据导入 Power BI 存储。However, upon load, no data will actually be imported into the Power BI store. 相反,在 Power BI Desktop 中生成视觉对象时,会向基础数据源发送查询以检索所需数据。Instead, upon building a visual within Power BI Desktop, queries will be sent to the underlying data source to retrieve the necessary data. 刷新视觉对象所花费的时间取决于基础数据源的性能。The time then taken to refresh the visual will depend on the performance of the underlying data source.
  • 对基础数据的任何更改都不会立即反映在任何现有视觉对象中。Any changes to the underlying data will not be immediately reflected in any existing visuals. 仍然需要先“刷新”,因此将针对每个视觉对象重新发送必需的查询,并根据需要对视觉对象进行更新。It is still necessary to Refresh, whereupon the necessary queries will be resent for each visual, and the visual updated as necessary.
  • 将报表发布到 Power BI 服务时,将再次在 Power BI 服务中生成数据集(和导入模式相同)。Upon publishing the report to the Power BI service, it will again result in a Dataset in the Power BI service, just as for import. 但是,该数据集不包含任何数据。However, no data is included with that dataset.
  • 在 Power BI 服务中打开现有报表或创作新报表时,将再次查询基础数据源以检索所需数据。When opening an existing report in the Power BI service, or authoring a new one, the underlying data source is again queried to retrieve the necessary data. 可能必须配置本地数据网关,具体取决于原始数据源的位置(和导入模式中刷新数据所需步骤相同)。Depending upon the location of the original data source, it might be necessary to configure an on-premises data gateway, just as is needed for Import mode if the data is refreshed.
  • 可以将视觉对象或整个报表页固定为仪表板磁贴。Visuals, or entire report pages, can be pinned as Dashboard tiles. 为了确保迅速打开仪表板,磁贴会按计划(例如每小时)自动刷新。To ensure that opening a dashboard will be fast, the tiles are automatically refreshed on a schedule (for example, every hour). 可以控制此刷新频率,以反映数据更改频率和查看最新数据的重要性。The frequency of this refresh can be controlled, to reflect how frequently the data is changing, and how important it is to see the very latest data. 因此,打开仪表板时,磁贴将反映自上次刷新后的数据,而不一定反映对基础源所做的最新更改。Thus, when opening a dashboard, the tiles will reflect the data as of the time of the last refresh, and not necessarily the very latest changes made to the underlying source. 可以反复刷新打开的仪表板,使其保持最新。An open dashboard can always be Refreshed to ensure it is up-to-date.

实时连接Live connections

如果连接到 SQL Server Analysis Services (SSAS),可选择从所选数据模型导入数据,或实时连接到所选数据模型。When connecting to SQL Server Analysis Services (SSAS), there is an option to either import data from, or connect live to, the selected data model. 如果选择“导入”,则需要针对该外部 SSAS 数据源定义查询,以便正常导入数据。If you select import, then you define a query against that external SSAS source, and the data is imported as normal. 如果选择“实时连接”,则不需要定义查询,字段列表中会显示整个外部模型。If you select to connect live then there is no query defined, and the entire external model is shown in the field list. 如果选择 DirectQuery,生成视觉对象时,将向外部 SSAS 源发送查询。If you select DirectQuery, as visuals are built, queries are sent to the external SSAS source. 与 DirectQuery 不同的是,实时连接在创建新模型方面没有意义,即不能定义新的计算列、层次结构、关系等。However, unlike DirectQuery, there is no sense in which a new model is being created; in other words, it's not possible to define new calculated columns, hierarchies, relationships, and so on. 相反,只需直接连接到外部 SSAS 模型即可。Instead you are simply connecting directly to the external SSAS model.

上段中所述的情况也适用于连接以下数据源(没有导入数据选项的情况除外):The situation described in the previous paragraph applies to connecting to the following sources as well, except that there is no option to import the data:

  • Power BI 数据集(例如,连接到以前已创建并发布到服务的 Power BI 数据集以创作新报表将其覆盖)Power BI datasets (for example, when connecting to a Power BI dataset that has previously been created and published to the service, to author a new report over it)
  • Common Data ServicesCommon Data Services

SSAS 报表发布到 Power BI 服务时,其行为在以下方面与 DirectQuery 报表类似:The behavior of reports over SSAS, upon publishing to the Power BI service, is similar to DirectQuery reports in the following ways:

  • 在 Power BI 服务中打开现有报表或创作新报表时,查询基础 SSAS 数据源(可能需要一个本地数据网关)When opening an existing report in the Power BI service or authoring a new report, the underlying SSAS source is queried (possibly requiring an on-premises data gateway)
  • 仪表板磁贴按计划(例如每小时,或定义的任何频率)自动刷新Dashboard tiles are automatically refreshed on a schedule (such as every hour, or whatever frequency is defined)

但两者也有重大区别:对于实时连接,打开报表的用户标识将始终传递到基础 SSAS 数据源。However, there are also important differences, including that for live connections the identity of the user opening the report will always be passed to the underlying SSAS source.

介绍这些比较后,本文剩余部分将着重介绍 DirectQuery。With these comparisons out of the way, let's focus solely on DirectQuery for the rest of this article.

DirectQuery 在什么情况下有用?When is DirectQuery useful?

下表描述了使用 DirectQuery 连接特别有用的场景,包括将数据保留在原始源中被认为是有益的情况。The following table describes scenarios where connecting with DirectQuery could be especially useful, including cases where leaving the data in the original source would be considered beneficial. 该描述内容包括对特定场景是否适用于 Power BI 的讨论。The description includes a discussion about whether the specified scenario is available in Power BI.

限制Limitation 说明Description
数据频繁变化,需要几乎“实时”的报表Data is changing frequently, and near ‘real-time’ reporting is needed 导入数据模型最快可以每小时刷新一次。Models with Imported data can be refreshed at most once per hour. 因此,如果数据不断改变,并且报表必须显示最新数据,则使用按计划刷新的导入可能无法满足需求。Hence, if the data is continually changing, and it is necessary for reports to show the latest data, then using Import with scheduled refresh might simply not meet those needs. 请注意,也可以将数据直接流式传输到 Power BI 中(尽管这种情况下数据量有限制)。Note also that it is also possible to stream data directly into Power BI, though there a limits on the data volumes supported for this case.

相比之下,使用 DirectQuery 意味着打开或刷新报表或仪表板将始终显示源中最新的数据。Using DirectQuery, by contrast, means that opening or refreshing a report or dashboard will always show the latest data in the source. 此外,可以更频繁地(每 15 分钟)更新仪表板磁贴。Additionally, the dashboard tiles can be updated more frequently (as often as every 15 mins).
数据量非常大Data is very large 如果数据量非常大,直接全部导入肯定不可行。If the data is very large, then it certainly would not be feasible to import it all. 相比之下,DirectQuery 不需要大量数据传输,因为可以进行就地查询。DirectQuery, by contrast, requires no large transfer of data, as it is queried in place.

但是,大量数据可能也意味着对该基础源查询的速度很慢(如本文后面部分的“使用 DirectQuery 的影响”中所述。However, large data might also imply that the performance of the queries against that underlying source are too slow (as discussed in Implications of using DirectQuery section, later in this article). 当然,并不总是需要导入全部详细数据。And of course it is not always necessary to import the full detailed data. 相反,数据可以在导入过程中预先聚合(通过“查询编辑器”可以轻松实现此操作)。Instead the data can be pre-aggregated during import (and Query Editor makes it easy to do exactly this). 在极端情况下,可以只导入每个视觉对象所需的聚合数据。In the extreme it would be possible to import exactly the aggregate data needed for each visual. 所以 DirectQuery 是处理大量数据的最简单方法,请始终记住,如果基础数据源太慢,导入聚合数据也许是一个解决方案。So while DirectQuery is the simplest approach to large data, you should always keep in mind that importing aggregate data might offer a solution if the underlying source is too slow.
基础数据源中定义的安全规则Security rules are defined in the underlying source 导入数据时,Power BI 将使用当前用户凭据(从 Power BI Desktop),或作为配置计划刷新(从 Power BI 服务)的一部分定义的凭据,连接到数据源。When the data is imported, Power BI will connect to the data source using the current users credentials (from Power BI Desktop), or the credentials defined as part of configuring scheduled refresh (from the Power BI service). 因此,在发布和共享此类报表时,必须注意只与允许查看相同数据的用户共享,或将行级别安全性定义为数据集的一部分。Thus, in publishing and sharing such a report, care must be taken to only share with users allowed to see the same data, or to define Row Level Security as part of the dataset.

理想情况下,由于 DirectQuery 始终查询基础数据源,这将允许应用该基础数据源中的任何安全规则。Ideally, because DirectQuery always queries the underlying source, this would allow any security in that underlying source to be applied. 但是,目前 Power BI 将始终使用与采用“导入”方法时使用的相同凭据来连接基础数据源。However, today Power BI will always connect to the underlying source using the same credentials as would be used for Import.

因此,除非 Power BI 允许将报表使用者的标识传递给基础数据源,否则 DirectQuery 在数据源安全性方面没有任何优势。Thus, until Power BI allows for the identity of the report consumer to pass through to the underlying source, DirectQuery offers no advantages with regard to data source security.
数据主权限制应用Data sovereignty restrictions apply 某些组织对数据主权制定有相应策略,这意味着数据不能离开组织规定的前提。Some organizations have policies around data sovereignty, meaning that data cannot leave the organization premises. 基于导入的解决方案很显然会存在问题。A solution based on import would clearly present issues. 相比之下,如果使用 DirectQuery,数据将保留在基础数据源中。By contrast, with DirectQuery that data remains in the underlying source.

但应注意,由于磁贴按计划刷新,即使使用 DirectQuery,某些视觉对象级别的数据缓存也会保留在 Power BI 服务中。However, it should be noted that even with DirectQuery, some caches of data at the visual level are retained in the Power BI service (due to scheduled refresh of tiles).
基础数据源是包含度量值的 OLAP 数据源Underlying data source is an OLAP source, containing measures 如果基础数据源包含度量值(如 SAP HANA 或 SAP Business Warehouse),则导入数据将引发其他问题。If the underlying data source contains *measures *(such as SAP HANA or SAP Business Warehouse) then importing the data brings other issues. 这意味着导入的数据处于由查询定义的特定聚合级别。It means that the data imported is at a particular level of aggregation, as defined by the query. 例如,按 Class、Year 和 City 衡量总销售额。For example, measure TotalSales by Class, Year, and City. 如果构建的视觉对象要求较高级别聚合数据(如按 Year 的总销售额),它会进一步聚合总值。Then if a visual is built asking for data at a higher level aggregate (such as TotalSales by Year) it is further aggregating the aggregate value. 这对于附加式度量(如 Sum、Min)没有问题,但对非附加式(如 Average、DistinctCount),会产生问题。This is fine for additive measures (such as Sum, Min) but it's an issue for non-additive (such as Average, DistinctCount).

为了轻松从源中直接获取正确的聚合数据(根据特定视觉对象的需要),必须像 DirectQuery 一样按视觉对象发送查询。To make it easy to get the correct aggregate data (as needed for the particular visual) directly from the source, it would be necessary to send queries per visual, as in DirectQuery.

连接到 SAP Business Warehouse (BW) 时,可以选择 DirectQuery 进行这种度量。When connecting to SAP Business Warehouse (BW), choosing DirectQuery allows for this treatment of measures. DirectQuery 和 SAP BW 中进一步介绍了对 SAP BW 的支持。Support for SAP BW is covered further in DirectQuery and SAP BW.

但是,当前 SAP HANA 上的 DirectQuery 将其视为关系数据源对待,因此提供的行为类似于导入。However, currently DirectQuery over SAP HANA treats it the same as a relational source, and hence provides similar behavior to import. 这在 DirectQuery 和 SAP HANA 中有进一步介绍。This is covered further in DirectQuery and SAP HANA.

因此,综上所述,鉴于 Power BI 中 DirectQuery 的当前功能,其优势体现在以下方面:So in summary, given the current capabilities of DirectQuery in Power BI, the scenarios where it offers benefits are the following:

  • 数据频繁变化,需要几乎“实时”的报表Data is changing frequently, and near ‘real-time’ reporting is needed
  • 处理非常大的数据量,而无需预先聚合Handling very large data, without the need to pre-aggregate
  • 数据主权限制应用Data sovereignty restrictions apply
  • 源是包含度量值(如 SAP BW)的多维度源The source is a multi dimensional source containing measures (such as SAP BW)

请注意,上一列表中的详细信息仅涉及 Power BI 的使用。Note that the details in the previous list relate to the use of Power BI alone. 另外,始终可以选择使用外部 SQL Server Analysis Services(或 Azure Analysis Services)模型导入数据,然后使用 Power BI 连接该模型。There is always the option of instead using an external SQL Server Analysis Services (or Azure Analysis Services) model to import data, and then using Power BI to connect to that model. 虽然这种方法需要其他技能,但它可以提供更强的灵活性。While that approach would require additional skills, it does provide greater flexibility. 例如,可以导入更多大容量数据,并且刷新数据的频率不受限制。For example, much larger volumes of data can be imported, and there is no restriction on how frequently the data can be refreshed.

使用 DirectQuery 的影响Implications of using DirectQuery

如本节内容所述,使用 DirectQuery 确实存在潜在的负面影响。Use of DirectQuery does have potentially negative implications, as detailed in this section. 其中一些限制因使用的具体数据源不同而略有不同。Some of those limitations are slightly different depending upon the exact source that is being used. 这在适当情况下会被提及,差异显著的数据源将由单独的主题介绍。This will be called out where applicable, and separate topics cover those sources that are substantially different.

基础源的性能和负载Performance and load on the underlying source

使用 DirectQuery 时,整体体验很大程度取决于基础数据源的性能。When using DirectQuery, the overall experience depends very much on the performance of the underlying data source. 如果刷新每个视觉对象(如更改切片器值后)都需要几秒钟时间(小于 5 秒),则等待体验是合理的,但与经常使用的将数据导入到 Power BI 的即时响应相比,可能仍感觉有些缓慢。If refreshing each visual (for example, after changing a slicer value) takes a few seconds (<5s) then the experience would be reasonable, yet might still feel sluggish compared to the immediate response we are used to when importing the data to Power BI. 相反,如果数据源缓慢意味着单个视觉对象花费的时间更长(几十秒),甚至可能超出查询的时间,则体验会非常差。If instead the slowness of the source means that individual visuals take longer than that (tens of seconds), then the experience becomes extremely poor, possibly even to the point of queries timing out.

除基础数据源的性能外,还应仔细考虑置于其上的负载(这通常会影响性能)。Along with the performance of the underlying source, careful consideration should be paid to the load that will be placed upon it (that of course then often impacts the performance). 正如下文所进一步讨论的,打开共享报表的每个用户以及定期刷新的每个仪表板磁贴都会向基础数据源发送查询(每个视觉对象至少发送一个查询)。As discussed further below, each user that opens a shared report, and each dashboard tile that is periodically refreshed, will be sending at least one query per visual to the underlying source. 这种情况需要数据源能够处理此类查询负载,同时仍保持良好的性能。This fact requires that the source be able to handle such a query load, while still maintaining reasonable performance.

单个数据源限制Limited to a single source

导入数据时,可以将来自多个源的数据组合到单个模型中,例如,可以轻松联接公司 SQL Server 数据库中的数据和保存在 Excel 文件中的本地数据。When importing data, it is possible to combine data from multiple sources into a single model, for example, to easily join some data from a corporate SQL Server database with some local data maintained in an Excel file. 使用 DirectQuery 时,无法实现该操作。This is not possible when using DirectQuery. 如果对源选择 DirectQuery,则只能使用来自单个源(如单个 SQL Server 数据库)中的数据。When selecting DirectQuery for a source, it will then only be possible to use data from that single source (such as a single SQL Server database).

数据转换限制Limited data transformations

同样,在“查询编辑器”中应用数据转换也存在限制。Similarly, there are limitations in the data transformations that can be applied within Query Editor. 如果使用导入数据,可以轻松应用一组复杂的变换来清理和重新整理数据,然后再使用数据创建视觉对象(例如解析 JSON 文档或将数据从列形式转换为行形式)。With imported data, a sophisticated set of transformations can easily be applied to clean and re-shape the data before using it to create visuals (such as parsing JSON documents, or pivoting data from a column to a row orientated form). 这些转换在 DirectQuery 中受到更多限制。Those transformations are more limited in DirectQuery. 首先,连接到 SAP Business Warehouse 等 OLAP 数据源时,不能定义任何转换,整个外部“模型”都来自该数据源。First, when connecting to an OLAP source like SAP Business Warehouse, no transformations can be defined at all, and the entire external ‘model’ is taken from the source. 对于类似 SQL Server 的关系数据源,每个查询仍可以定义一组转换,但是出于性能原因,这些转换将受到限制。For relational sources like SQL Server, it is still possible to define a set of transformations per query, but those transformations are limited, for performance reasons. 任何此类转换需要应用于基础数据源的每个查询,而不是在数据刷新时应用一次,因此它们仅限于可以合理地转换为单个本机查询的转换。Any such transformation will need to be applied on every query to the underlying source, rather than once on data refresh, so they are limited to those transformations that can reasonably be translated into a single native query. 如果使用的转换过于复杂,将收到错误警报,必须删除它,或将模型切换到导入模式。If you use a transformation that is too complex, then you will receive an error that either it must be deleted, or the model switched to Import mode.

此外,“获取数据”对话框或“查询编辑器”产生的查询将用于生成和发送的查询的子选择中,以检索视觉对象所需的数据。Additionally, the query that results from the Get Data dialog or Query Editor will be used in a subselect within the queries generated and sent to retrieve the necessary data for a visual. 因此,在“查询编辑器”中定义的查询必须在此上下文中有效。Thus the query defined in Query Editor must be valid within this context. 具体而言,这意味着不能使用包含公用表表达式或调用存储过程的查询。In particular this means it is not possible to use a query using Common Table Expressions, nor that invokes Stored Procedures.

建模限制Modelling limitations

术语“建模”在此上下文中表示完善和丰富原始数据(作为创建使用它的报表的一部分)。The term modelling in this context means the act of refining and enriching the raw data, as part of authoring a report using it. 示例包括:Examples include:

  • 定义表之间的关系Defining relationships between tables
  • 添加新计算(计算列和度量值)Adding new calculations (calculated columns and measures)
  • 重命名和隐藏列和度量值Renaming and hiding columns and measures
  • 定义层次结构Defining hierarchies
  • 定义列的格式设置、默认汇总以及排序顺序Defining the formatting, default summarization and sort order for a column
  • 分组或聚类值Grouping or clustering values

如果使用 DirectQuery,仍然可以丰富大多数这些模型,当然还有正被丰富的原始数据的原则,以便改进后续使用。When using DirectQuery, many of these model enrichments can still be made, and certainly there is still the principle that the raw data is being enriched, so as to improve later consumption. 但是,使用 DirectQuery 时,有些建模功能将不可用或受到限制。However, there some modeling capabilities are not available, or are limited, when using DirectQuery. 通常,应用限制的目的是避免性能问题。The limitations are generally applied to avoid performance issues. 下列项目符号列表列出了所有 DirectQuery 数据源通用的限制。The set of limitations that are common to all DirectQuery sources are listed in the following bulleted list. 如本文接近末尾处的“数据源详细信息”中所述,可能有其他限制适用于特定数据源。Additional limitations might apply to individual sources, as described in Data source specific details found near the end of this article.

  • 无内置日期层次结构:导入数据时,默认情况下,每个日期/日期时间列都将具有默认的内置日期层次结构。No Built in-date hierarchy: When importing data, then by default every date/datetime column will also have a built-in date hierarchy available by default. 例如,如果导入的销售订单表包含 OrderDate 列,则在视觉对象中使用 OrderDate 时,可以选择要使用的适当级别(年、月、日)。For example, if importing a table of sales orders including a column OrderDate, then upon using OrderDate in a visual, it will be possible to choose the appropriate level (Year, Month, Day) to use. 使用 DirectQuery 模式时,此内置日期层次结构不可用。This built-in date hierarchy is not available when using DirectQuery mode. 但是请注意,如果基础数据源中存在“日期”表(如许多数据仓库中常见的那样),则可以照常使用 DAX 时间智能函数。Note however that if there is Date table available in the underlying source (as is common in many data warehouses) then the DAX Time Intelligence functions can be used as normal.
  • 在计算列中的限制:计算列仅限于行内,因为它们只能引用同一表中其他列的值,不能使用任何聚合函数。Limitations in calculated columns: Calculated columns are limited to being intra-row, as in, they can only refer to values of other columns of the same table, without the use of any aggregate functions. 此外,可用的 DAX 标量函数(如 LEFT())被限制为只能推送到基础数据源,因此从很大程度上取决于数据源的具体功能。Additionally, the DAX scalar functions (such as LEFT()) that are allowed will be limited to those which can simply be pushed to the underlying source, hence will vary depending upon the exact capabilities of the source. 创建计算列的 DAX 时,不支持的功能不会在自动完成中列出,如果使用则会导致错误。Functions that are not supported will not be listed in autocomplete when authoring the DAX for a calculated column, and would result in an error if used.
  • 不支持父-子 DAX 函数:在 DirectQuery 模型中,不能使用 DAX PATH() 系列函数,这类函数通常处理父-子结构,如帐户图表或员工层次结构图表。No support for parent-child DAX functions: When in DirectQuery model, it is not possible to use the family of DAX PATH() functions, that generally handle Parent-Child structures (such as chart of accounts, or employee hierarchies).
  • 度量值的(默认)限制:默认情况下,可在度量值中使用的 DAX 函数和表达式受限。Limitations (by default) for measures: By default, the DAX functions and expressions that can be used in measures is restricted. 同样,如果使用无效的函数或表达式,自动完成将不会列出该函数,并且会出错。Again, autocomplete will restrict the functions listed, and an error will occur if an invalid function or expression is used. 可以确定的原因是,默认情况下,度量值被限制为简单的度量值,这些度量值本身不太可能导致任何性能问题。The reason is simply to ensure that, by default, measures are restricted to simple measures that are unlikely by themselves cause any performance issues. 高级用户可以选择绕过此限制,方法是依次选择“文件”>“选项”和“设置”>“选项”>“DirectQuery”,然后选择选项“允许 DirectQuery 模式下的度量值不受限制”。Advanced users can choose to bypass this limitation by selecting File > Options and then Settings > Options > DirectQuery, then selecting the option Allow unrestricted measures in DirectQuery mode. 选中该选项后,即可使用对度量值有效的任何 DAX 表达式。When that option is selected, any DAX expression that is valid for a measure can be used. 但是,用户必须知道,可在导入数据的情况下正常工作的某些表达式,在 DirectQuery 模式下则可能会导致针对后端数据源的查询速度缓慢。Users must be aware, however, that some expressions that perform well when the data is imported may result in very slow queries to the backend source when in DirectQuery mode.

    • 例如,默认情况下:For example, by default:

      • 可以创建一个简单的销售金额相加的度量值:It would be possible to author a measure that simply summed the sales amount:

        SalesAmount = SUMX(Web_Sales, [ws_sales_price]* [ws_quantity])
        
      • 不能创建一个度量值,然后在所有项上平均销售额:It would not be possible to author a measure that then averaged that SalesAmount over all of the Items:

        AverageItemSalesAmount = AVERAGEX('Item', [SalesAmount])
        

      原因是,如果有大量项,此类度量值可能导致性能不佳。The reason is that such a measure could result in poor performance if there were a very large number of items.

  • 不支持计算表:DirectQuery 模式不支持使用 DAX 表达式定义计算表。Calculated tables are not supported: The ability to define a calculated table using a DAX expression is not supported in DirectQuery mode.
  • 关系筛选仅限于单方向:使用 DirectQuery 时,不能将交叉筛选关系方向设置为“双向”。Relationship filtering is limited to a single direction: When using DirectQuery, it is not possible to set the Cross Filter direction on a relationship to “Both”. 例如,以下三个表格不能生成显示每个客户[性别]和每个客户购买的产品[类别]数量的视觉对象。For example, with the three tables below, it would not be possible to build a visual showing each Customer[Gender], and the number of Product[Category] bought by each. 本白皮书中详细说明了这类双向筛选的使用(本文介绍了 SQL Server Analysis Services 上下文中的示例,但基本要点也同样适用于 Power BI)。Use of such bi-directional filtering is described in this detailed whitepaper (the paper presents examples in the context of SQL Server Analysis Services, but the fundamental points apply equally to Power BI).

    同样,施加此限制也是出于性能影响的原因。Again, the limitation is imposed due to the performance implications. 此方面的一个特别重要的应用是,将行级别安全性定义为报表的一部分时,由于常见模式是在用户和其允许访问的实体之间建立多对多关系,因此必须使用双向筛选才能执行此操作。One particularly important application of this is when defining Row Level Security as part of the report, as a common pattern is to have a many-many relationship between the users and the entities they are allowed access to, and use of bi-directional filtering is necessary to enforce this. 但是,对 DirectQuery 模型使用双向筛选应十分谨慎,并注意性能方面产生的任何不利影响。However, use of bi-directional filtering for DirectQuery models should be used judiciously, with careful attention paid to any detrimental impact on performance.

  • 无聚类分析:使用 DirectQuery 时,不能使用聚类分析功能自动查找组No Clustering: When using DirectQuery, it is not possible to use the Clustering capability, to automatically find groups

报表限制Reporting limitations

几乎所有的报表功能都支持 DirectQuery 模型。Almost all reporting capabilities are supported for DirectQuery models. 在这种情况下,只要基础数据源提供了合适的性能水平,就可以使用相同的可视化组件集。As such, so long as the underlying source offers a suitable level of performance, the same set of visualizations can be used. 但在发布报表后,Power BI 服务中提供的一些其他功能会受到一些重要限制,如以下项目列表所述:However, there are some important limitations in some of the other capabilities offered in the Power BI service after a report is published, as described in the following bullets:

  • 不支持快速见解:Power BI 快速见解功能可快速搜索数据集的不同子集,同时应用一组复杂的算法来发现潜在相关的见解。Quick Insights is not supported: Power BI Quick Insights searches different subsets of your dataset while applying a set of sophisticated algorithms to discover potentially-interesting insights. 此功能对查询的性能要求非常高,因此在使用 DirectQuery 的数据集上不可用。Given the need for very high performance queries, this capability is not available on datasets using DirectQuery.
  • 不支持问答:使用 Power BI 中的“问答”可利用直观、自然的语言功能浏览数据并接收图表和图形形式的答案。Q&A is not supported: Power BI Q&A enables you to explore your data using intuitive, natural language capabilities and receive answers in the form of charts and graphs. 但是,当前使用 DirectQuery 的数据集不支持此功能。However, it is currently not supported on datasets using DirectQuery.
  • 使用“在 Excel 中浏览”可能会导致性能不佳:可以在数据集上使用“在 Excel 中浏览”功能浏览数据。Using Explore in Excel will likely result in poorer performance: It is possible to explore your data by using the “Explore in Excel” capability on a dataset. 这将允许在 Excel 中创建数据透视表和数据透视图。This will allow Pivot Tables and Pivot Charts to be created in Excel. 尽管使用 DirectQuery 的数据集支持此功能,但性能通常比在 Power BI 中创建视觉对象的性能更慢,因此如果使用 Excel 对你的方案非常重要,则决定使用 DirectQuery 时请考虑这一点。While this capability is supported on datasets using DirectQuery, the performance is generally slower than creating visuals in Power BI, and therefore if the use of Excel is important for your scenarios, this should be accounted for in your decision to use DirectQuery.

安全性Security

如本文前面所述,使用 DirectQuery 的报表在发布到 Power BI 服务后,将始终使用相同的固定凭据连接到基础数据源。As discussed earlier in this article, a report using DirectQuery will always use the same fixed credentials to connect to the underlying data source, after publish to the Power BI service. 同时请注意,此处专指 DirectQuery,而不是 SQL Server Analysis Services 的实时连接,两者在这方面不同。Again, note this refers specifically to DirectQuery, not to live connections to SQL Server Analysis Services, which is different in this respect. 因此,发布 DirectQuery 报表后,必须立即配置用户要使用的凭据。Hence immediately after publish of a DirectQuery report, it is necessary to configure the credentials of the user that will be used. 在完成此操作之前,打开 Power BI 服务报表将导致错误。Until this is done, opening the report on the Power BI service would result in an error.

提供用户凭据后,所有打开报表的用户都可使用这些凭据。Once the user credentials are provided, then those credentials will be used, irrespective of the user who opens the report. 这方面与导入数据完全相同,即除非行级别安全性已定义为报表的一部分,否则每个用户都将看到相同的数据。In this regard it is exactly like imported data, in that every user will see the same data, unless Row Level Security has been defined as part of the report. 因此,如果在基础数据源中定义了任何安全规则,共享报表时也必须注意这一点。Hence the same attention must be paid to sharing the report, if there are any security rules defined in the underlying source.

Power BI 服务中的行为Behavior in the Power BI service

本部分介绍 Power BI 服务中 DirectQuery 报表的行为,主要是为了使读者了解将置于后端数据源的负载程度、需要考虑的共享报表和仪表板的用户数、报表的复杂性以及行级别安全性是否已在报表中定义。This section describes the behavior of a DirectQuery report in the Power BI service, primarily so as to be able to understand the degree of load that will be placed on the back end data source, given the number of users that the report and dashboard will be shared with, the complexity of the report, and whether Row Level Security has been defined in the report.

报表 – 打开、与之交互、编辑Reports – opening, interacting with, editing

打开报表时,将刷新当前可见页面上的所有视觉对象。When a report is opened, then all the visuals on the currently visible page will refresh. 每个视觉对象通常需要至少一个查询以对基础数据源进行查询。Each visual will generally require at least one query to the underlying data source. 某些视觉对象可能需要多个查询,例如,显示来自两个不同事实数据表的聚合值、包含更复杂的度量值或包含非附加式度量值的总和(如 Count Distinct)。Some visuals might require more than one query (for example, if it showed aggregate values from two different fact tables, or contained a more complex measure, or contained totals of a non-additive measure like Count Distinct). 移动到新页面将导致刷新这些视觉对象,从而生成一组针对基础数据源的新查询。Moving to a new page will result in those visuals being refreshed, resulting in a new set of queries to the underlying source.

报表上的所有用户交互都可能导致刷新视觉对象。Every user interaction on the report might result in visuals being refreshed. 例如,选择切片器上的不同值需要发送一组新查询以刷新所有受影响的视觉对象。For example, selecting a different value on a slicer will require sending a new set of queries to refresh all of the effected visuals. 单击视觉对象以交叉突出显示其他视觉对象或更改筛选器也是如此。The same is true for clicking on a visual to cross-highlight other visuals, or changing a filter.

当然,类似情况还包括编辑新报表,该报表需要为每个步骤发送查询以产生所需的最终视觉对象。Similarly of course, editing a new report with require queries to be sent for each step on the path to produce the final desired visual.

存在一些包含结果的缓存,如果获得完全相同的最新结果,将即时刷新视觉对象。There is some caching of results, so that the refresh of a visual will be instantaneous if the exact same results have recently been obtained. 如果将任何行级别安全性定义为报表的一部分,此类缓存不会在所有用户中共享。Such caches are not shared across users, if there is any Row Level Security defined as part of the report.

仪表板刷新Dashboard Refresh

单个视觉对象或整个页面可以作为磁贴固定到仪表板。Individual visuals, or entire pages, can be pinned to dashboard as tiles. 基于 DirectQuery 数据集的磁贴会按计划自动刷新,并将查询发送到后端数据源。Tiles based on DirectQuery datasets are then refreshed automatically according to a schedule, resulting in queries being sent to the backend data source. 默认情况下,每小时刷新一次,但作为数据集设置的一部分,可以将其配置为每周和每 15 分钟刷新一次。By default, this occurs every hour, but can be configured as part of Dataset settings to be between weekly, and every 15 minutes.

如果模型中未定义行级别安全性,这意味着每个磁贴将刷新一次,并在所有用户之间共享结果。If no Row Level Security is defined in the model, this means that each tile would be refreshed once, and the results shared across all users. 如果定义了行级别安全性,则会有很大的乘数效应 – 每个磁贴需要按用户单独向基础源发送查询。If Row Level Security is defined, then there can be a large multiplier effect – each tile requires separate queries per user to be sent to the underlying source.

因此,如果某个仪表板有 10 个磁贴、与 100 个用户共享、是在使用具有行级别安全性的 DirectQuery 数据集创建的,并且配置为每 15 分钟刷新一次,则将导致每 15 分钟向后端数据源发送至少 1000 个查询。Hence a dashboard with ten tiles, shared with 100 users, created on a dataset using DirectQuery with Row Level Security, and configured to refresh every 15 minutes, would result in at least 1000 queries being sent every 15 minutes to the back end source.

因此必须慎重考虑使用行级别安全性和计划刷新配置。Hence careful consideration must be paid to the use of Row Level Security, and the configuring of the refresh schedule.

超时Timeouts

在 Power BI 服务中,4 分钟超时应用于单个查询,超过该时间的查询将失败。A timeout of four minutes is applied to individual queries in the Power BI service, and queries taking longer than that will simply fail. 如前所述,建议对提供近似交互式查询性能的数据源使用 DirectQuery,因此此限制旨在防止执行时间过长的问题。As stressed earlier, it is recommended that DirectQuery be used for sources that provide near interactive query performance, so this limit is intended to prevent issues from overly long execution times.

其他影响Other implications

使用 DirectQuery 的其他一般性影响如下:Some other general implications of using DirectQuery are the following:

  • 如果更改数据,则必须刷新以确保显示最新数据:由于给定使用缓存,无法确保视觉对象始终显示最新数据。If data is changing, it is necessary to Refresh to ensure the latest data is shown: Given the use of caches, there is no guarantee that the visual is always showing the latest data. 例如,视觉对象可能显示最后一天的事务。For example, a visual might show the transactions in the last day. 然后由于切片器被更改,它可能会刷新显示最近两天的事务,包括一些最新的事务。Then due to a slicer being changed, it might refresh to show the transactions for the last two days, including some very recent, newly arrived transactions. 将切片器返回到其原始值将导致再次显示之前获得的缓存值,不包括之前看到的新到达的事务。Returning the slicer to its original value would result in it again showing the cached value previously obtained, that would not include the newly arrived transaction seen before.

    选择刷新将清除所有缓存,并刷新页面上的所有视觉对象以显示最新数据。Selecting Refresh will clear any caches, and refresh all the visuals on the page to show the latest data.

  • 如果更改数据,不能保证视觉对象之间的一致性:不同的视觉对象(无论是在相同页面还是在不同页面上),可能会在不同的时间刷新。If data is changing, there is no guarantee of consistency between visuals: Different visuals, whether on the same page or on different pages, might be refreshed at different times. 因此如果基础数据源中的数据已更改,则不能保证每个视觉对象都在完全相同的时间点显示数据。Thus if the data in the underlying source is changing, there is no guarantee that each visual will be showing the data at the exact same point of time. 事实上,鉴于有时单个视觉对象需要多个查询(例如,获取详细信息和总计),因此,即使在单个视觉对象中,一致性也得不得保证。Indeed, given that sometimes more than one query is required for a single visual (for example, to obtain the details and the totals) then consistency even within a single visual is not guaranteed. 若要确保一致性,需要刷新任何视觉时带来的刷新所有视觉对象的开销,同时使用昂贵的功能,如基础数据源中的“快照隔离”。To guarantee this would require the overhead of refreshing all visuals whenever any visual refreshed, in tandem with the use of costly features like Snapshot Isolation in the underlying data source.

    再次选择“刷新”(刷新页面中所有视觉对象)可以很大程度上缓解此问题。This issue can be mitigated to a large extent by again selecting Refresh, to will refresh all of the visuals on the page. 而且应注意的是,即便使用导入模式,如果从多个表中导入数据,也存在类似的保证一致性的问题。And it should be noted that even if using Import mode, there is a similar problem of guaranteeing consistency if importing data from more than one table.

  • 需要 Power BI Desktop 中的刷新以反映元数据的任何更改:报表发布后,使用“刷新”将只需刷新报表中的视觉对象。Refresh in Power BI Desktop is needed to reflect any metadata changes: After a report is published, Refresh will simply refresh the visuals in the report. 如果基础数据源的架构已更改,这些更改将不会自动应用于更改字段列表中可用的字段。If the schema of the underlying source has changed, then those changes are not automatically applied to change the available fields in the field list. 因此如果已从基础数据源中删除表或列,则可能导致刷新时查询失败。Thus if tables or columns have been removed from the underlying source, it might result in query failure upon refresh. 请在 Power BI Desktop 中打开报表并选择“刷新”,更新模型中的字段以反映更改。Opening the report in Power BI Desktop, and choosing Refresh, will update the fields in the model to reflect the changes.
  • 对任何查询可以返回的行数限制为一百万行:对于基础数据源的任何单个查询,可以返回的行数限制为固定的一百万行。Limit of one million rows returned on any query: There is a fixed limit of one million rows placed on the number of rows that can be returned in any single query to the underlying source. 这通常没有实际意义,视觉对象本身不会显示那么多行。This generally has no practical implications, and visuals themselves aren’t going to display that many points. 但如果 Power BI 未完全优化发送的查询,并且有一些中间请求结果超出此限制,则可能会受此限制。However, the limit can occur in cases where Power BI is not fully optimizing the queries sent, and there is some intermediate result being requested that exceeds the limit. 在生成达到更合理最终状态的视觉对象过程中,也可能会受此限制。It can also occur whilst building a visual, on the path to a more reasonable final state. 例如,如果有超过 1 百万的客户,如果不应用某些筛选器,则 Customer 和 TotalSalesQuantity 将达到此限制。For example, including Customer and TotalSalesQuantity would hit this limit if there were more than 1m customers, until some filter were applied.

    返回的错误将是“外部数据源的查询结果集超过了允许的最大行数 1000000 行。”The error that would be returned would be “The resultset of a query to external data source has exceeded the maximum allowed size of '1000000' rows.”

  • 无法将导入模式更改为 DirectQuery 模式:请注意,通常可以将模型从 DirectQuery 模式切换到导入模式,这意味着必须导入所有需要的数据。Cannot change from import to DirectQuery mode: Note that while it's generally possible to switch a model from DirectQuery mode to use import mode, this means all the necessary data must be imported. 它无法切换回来(主要是由于 DirectQuery 模式下不支持此功能集)。It is also not possible to switch back (primarily due to the set of features not supported in DirectQuery mode). 由于外部度量值的处理方式完全不同,多维数据源(如 SAP BW)的 DirectQuery 模型也不能从 DirectQuery 切换到导入。DirectQuery models over multidimensional sources like SAP BW also cannot be switched from DirectQuery to import, due to the completely different treatment of external measures.

Power BI 服务中的 DirectQueryDirectQuery in the Power BI service

Power BI Desktop 支持所有数据源。All sources are supported from Power BI Desktop. 某些数据源还可直接在 Power BI 服务中使用。Some sources are also available directly from within the Power BI service. 例如,企业用户可以使用 Power BI 连接其 Salesforce 中的数据并立即获得仪表板,而无需使用 Power BI Desktop。For example, it is possible for a business user to use Power BI to connect to their data in Salesforce, and immediately get a dashboard, without use of Power BI Desktop.

启用了 DirectQuery 的数据源中,只有两个数据源可以直接在服务中使用:Only two of the DirectQuery enabled-sources are available directly in the service:

  • SparkSpark
  • Azure SQL 数据仓库Azure SQL Data Warehouse

但是,强烈建议在 Power BI Desktop 中对这两个数据源使用 DirectQuery。However, it is strongly recommended that any use of DirectQuery over those two sources start within Power BI Desktop. 原因是,当最初在 Power BI 服务中进行连接时,会应用许多关键限制,这意味着虽然开始时很容易(从Power BI 服务开始),但是会进一步限制生成报表的功能(例如,不能创建任何计算或使用许多分析功能,甚至不能刷新元数据以反映对基础架构的任何更改)。The reason is that when the connection is initially made in the Power BI service, many key limitations will apply, meaning that while the start point was easy (starting in the Power BI service), there are limitations on enhancing the resulting report any further (for example, it's not possible then to create any calculations, or use many analytical features, or even refresh the metadata to reflect any changes to the underlying schema).

成功使用 DirectQuery 的指南Guidance for using DirectQuery successfully

如果要使用 DirectQuery,此节提供一些有关如何确保使用成功的高级指导。If you're going to use DirectQuery, then this section provides you with some high level guidance on how to ensure success. 此部分中的指导源自本文所述的使用 DirectQuery 的影响。The guidance in this section is derived from the implications of using DirectQuery that have been described in this article.

后端数据源性能Backend data source performance

强烈建议验证简单的视觉对象是否可以在合理的时间内刷新。It is strongly recommended to validate that simple visuals will be able to refresh in a reasonable time. 要提供合理的交互体验,刷新时间应在 5 秒内。This should be within 5 seconds to have a reasonable interactive experience. 当然,如果视觉对象刷新时间超过 30 秒,发布报表后很可能会出现进一步的问题,这会使解决方案无效。Certainly if visuals are taking longer than 30 seconds, then it's highly likely that further issues will occur following publication of the report, which will make the solution unworkable.

如果查询速度慢,则最先检查发送到基础数据源的查询,以及所观测到的查询性能不佳的原因。If queries are slow, then the first point of investigation is to examine the queries being sent to the underlying source, and the reason for the query performance being observed. 本主题不涉及各种在完整的一组潜在基础数据源中优化数据库的最佳做法,但却适用于适合大多数情况的标准数据库实践:This topic doesn't cover the wide range of database optimization best practices across the full set of potential underlying sources, but it does apply to the standard database practices that apply to most situations:

  • 基于整数列的关系通常比其他数据类型的列的联接更好Relationships based on integer columns generally perform better than joins on columns of other data types
  • 应创建相应的索引,这通常意味着在支持列存储索引的数据源(如 SQL Server)中使用它们。The appropriate indexes should be created, which generally means the use of column store indexes in those sources that support them (for example, SQL Server).
  • 应更新数据源中的任何必要的统计信息Any necessary statistics in the source should be updated

模型设计指南Model Design Guidance

定义模型时,请考虑执行以下操作:When defining the model, consider doing the following:

  • 避免在查询编辑器中定义复杂的查询。Avoid complex queries in Query Editor. 在查询编辑器中定义的查询将转换为单个 SQL 查询,然后包含在发送到该表的每个查询的子选择中。The query that's defined in the Query Editor will be translated into a single SQL query, that will then be included in the subselect of every query sent to that table. 如果查询很复杂,则可能导致所发送的查询出现性能问题。If that query is complex, it might result in performance issues on ever query sent. 可以通过选择查询编辑器中的最后一步,并从上下文菜单中选择“查看本地查询”来获取一组步骤的实际 SQL 查询。The actual SQL query for a set of steps can be obtained by selecting the last step in Query Editor, and choosing View Native Query from the context menu.
  • 简化度量值。Keep measures simple. 至少在开始时,建议将度量值限制为简单聚合。At least initially, it is recommended to limit measures to simple aggregates. 如果这些度量值的执行令人满意,可以定义更复杂的度量值,但要注意每个度量值的性能。Then if those perform in a satisfactory manner, more complex measures can be defined, but paying attention to the performance for each.
  • 避免定义计算列上的关系。Avoid relationships on calculated columns. 这与需要执行多列联接的数据库尤其相关。This is particularly relevant to databases where it is necessary to perform multi-column joins. Power BI 目前不允许基于多列作为 FK/PK 的关系。Power BI today does not allow a relationship to be based on multiple columns as the FK/PK. 常见的解决方法是使用计算列来连接列,然后在其上创建联接。The common workaround is to concatenate the columns together using a calculated column, and base the join on that. 尽管此解决方法对于导入数据是合理的,但在 DirectQuery 情况下,会导致表达式联接,通常会阻止使用任何索引,并导致性能不佳。While this workaround is reasonable for imported data, in the case of DirectQuery it results in a join on an expression, that commonly prevents use of any indexes, and leads to poor performance. 唯一的解决方法是,在基础数据库中将多列具体化为单列。The only workaround is to actually materialize the multiple columns into a single column in the underlying database.
  • 避免定义 uniqueidentifier 列上的关系。Avoid relationships on uniqueidentifier columns. Power BI 在本机上不支持 uniqueidentifier 数据类型。Power BI does not natively support a datatype of uniqueidentifier. 因此,定义 uniqueidentifier 列之间的关系将导致一个包含涉及 Cast 的联接的查询。Hence defining a relationship between columns of type uniqueidentifier column will result in a query with a join involving a Cast. 这通常也会导致性能不佳。Again, this commonly leads to poor performance. 在进行特别优化之前,唯一的解决方法是在基础数据库中具体化替代类型的列。Until this case is specifically optimized, the only workaround is to materialize columns of an alternative type in the underlying database.
  • 隐藏关系中的 to 列Hide the to column on relationships. 应隐藏关系中的 to 列(通常是 to 表中的主键),使其不出现在字段列表中,从而不在视觉对象中使用。The to column on relationships (commonly the primary key on the to table) should be hidden, so that it does not appear in the field list, and therefore cannot be used in visuals. 通常,关系所在的列实际上是系统列(例如,数据仓库中的代理键),隐藏这些列总是很好的做法。Often the columns on which relationships are based are in fact system columns (for example, surrogate keys in a data warehouse) and hiding such columns is good practice anyway. 如果该列确实有意义,则引入一个可见并且具有等于主键的简单表达式的计算列。If the column does have meaning, then introduce a calculated column that is visible, and that has a simple expression of being equal to the primary key. 例如:For example:

    ProductKey_PK   (Destination of a relationship, hidden)
    ProductKey (= [ProductKey_PK],   visible)
    ProductName
    ...
    

    这样做只是为了避免在视觉对象包括主键列时可能发生的性能问题。The reason for doing this is simply to avoid a performance issue that can occur otherwise if a visual includes the primary key column.

  • 检查所有计算列和数据类型更改的使用。Examine all uses of calculated columns and data type changes. 使用这些功能并不一定产生不利影响,它们会将查询发送到包含表达式的基础数据源,而不是对列简单引用的基础数据源,这可能会导致索引未被使用。Use of these capabilities are not necessarily harmful, they result in the queries sent to the underlying source containing expressions rather than simple references to columns, that again might result in indexes not being used.
  • 避免对关系使用双向交叉筛选(预览)。Avoid use of the (preview) bi-directional cross filtering on relationships.
  • 设置“假设引用完整性”实验Experiment with setting Assume referential integrity. 关系的“假设引用完整性”设置使查询能够使用 INNER JOIN 语句,而不是 OUTER JOIN 语句。The Assume Referential Integrity setting on relationships enables queries to use INNER JOIN statements rather than OUTER JOIN. 这通常可以提高查询性能,但具体取决于数据源的详细情况。This generally improves query performance, though it does depend on the specifics of the data source.
  • 在查询编辑器中不使用相对数据筛选。Do not use the relative data filtering in Query Editor. 在查询编辑器中可以定义相对日期筛选。It's possible to define relative date filtering in Query Editor. 例如,假设筛选日期是过去 14 天内的行。For example, to filter to the rows where the date is in the last 14 days.

    但是,编写查询,它将被转换为基于固定日期的筛选器。However, this will be translated into a filter based on the fixed date, as at the time the query was authored. 这可以通过查看本机查询看到。This can be seen from viewing the native query.

    这种转换几乎肯定是不需要的。This is almost certainly not what was wanted. 为了确保根据执行报表时的日期应用筛选器,请将筛选器作为报表筛选器应用于报表中。To ensure the filter is applied based upon the date as at the time the report is executed then instead apply the filter in the report as a Report Filter. 目前的实现方法是创建计算列计算天数(使用 DAX DATE() 函数),然后在筛选器中使用该计算列。Currently this would be done by creating a calculated column calculating the number of days ago (using the DAX DATE() function), and then using that calculated column in a filter.

报表设计指南Report Design Guidance

创建使用 DirectQuery 连接的报表时,请遵循以下指南:When creating a report using a DirectQuery connection, adhere to the following guidance:

  • 先应用筛选器:始终在生成视觉对象开始时应用任何适用的筛选器。Apply filters first: Always apply any applicable filters at the start of building a visual. 例如,一开始就应用筛选器 Year,而不是拖动 TotalSalesAmount 和 ProductName,然后筛选特定年份。For example, rather than drag in the TotalSalesAmount, and ProductName, then filter to a particular year, apply the filter on Year at the very start. 这是因为生成视觉对象的每个步骤都将发送查询,而在第一个查询完成之前可能会进行其他更改,这仍然会对基础数据源造成不必要的负担。This is because each step of building a visual will send a query, and whilst it is possible to then make another change before the first query has completed, this still leaves unnecessary load on the underlying source. 尽早应用筛选器通常会降低中间查询成本。By applying filters early, it generally makes those intermediate queries less costly. 此外,不尽早应用筛选器可能导致达到 1 百万行的限制。Also, failing to apply filters early can result in hitting the 1m row limit above.
  • 限制页面上的视觉对象的数目:打开页面(或某些页级别的切片器或筛选器更改)后,将刷新页面上所有的视觉对象。Limit the number of visuals on a page: When a page is opened (or some page level slicer or filter changed) then all of the visuals on a page are refreshed. 由于并行发送的查询数量有限制,随着视觉对象数量的增加,一些视觉对象将以串行方式刷新,从而增加刷新整个页面所需的时间。There is also a limit on the number of queries that are sent in parallel, hence as the number of visuals increases, some of the visuals will be refreshed in a serial manner, increasing the time taken to refresh the entire page. 出于此原因,建议限制单个页面中的视觉对象数量,改为包含更多更简单的页面。For this reason it's recommended to limit the number of visuals on a single page, and instead have more, simpler pages.
  • 考虑关闭视觉对象之间的交互:默认情况下,报表页上的可视化组件可用于交叉筛选和交叉突出显示页面上的其他可视化组件。Consider switching off interaction between visuals: By default, visualizations on a report page can be used to cross-filter and cross-highlight the other visualizations on the page. 例如,选择了饼图上的“1999”之后,柱形图交叉突出显示“1999”类别的销售额。For example, having selected “1999” on the pie chart, the column chart is cross highlighted to show the sales by category for “1999”.

    但是,可以按此文所述控制交互。However, this interaction can be controlled as described in this article. 在 DirectQuery 中,交叉筛选和交叉突出显示需要向基础数据源发送查询,所以如果响应用户选择所花费的时间太长,应关闭交互。In DirectQuery such cross-filtering and cross-highlighting requires queries to be sent to the underlying source, so the interaction should be switched off if the time taken to respond to users' selections would be unreasonably long.

  • 考虑仅共享报表:发布到 Power BI 服务后,可以通过不同的方式共享内容。Consider sharing the report only: There are different ways of sharing content after publishing to the Power BI service. 对于 DirectQuery,建议只考虑共享已完成的报表,而不允许其他用户创作新报表(其生成的特定视觉对象可能会遇到性能问题)。In the case of DirectQuery, it's advisable to only considering sharing the finished report, rather than allow other users to author new reports (and potentially encounter performance issues for the particular visuals that they build).

除了上述列表中的建议,请注意,以下每一种报表功能都会导致性能问题:In addition to the above list of suggestions, note that each of the following reporting capabilities can cause performance issues:

  • 度量值筛选器:包含度量值(或列聚合)的视觉对象可以包含这些度量值中的筛选器。Measure filters: Visuals containing measures (or aggregates of columns) can contain filters in those measures. 例如,下面的视觉对象按类别显示 SalesAmount,但仅包括超过 2 千万的类别。For example, the visual below shows SalesAmount by Category, but only including those Categories with more than 20M of sales.

    这会导致两个查询被发送到基础数据源:This will result in two queries being sent to the underlying source:

    • 第一个查询检索符合条件 (Sales > 20M) 的类别The first query will retrieve the Categories meeting the condition (Sales > 20M)
    • 第二个查询检索视觉对象所需的数据,包括满足 WHJERE 子句中的条件的类别。The second query will then retrieve the necessary data for the visual, including the Categories that met the condition in the WHJERE clause.

    如果有数百或数千个类别(如此示例所示),这通常可以正常执行。This generally performs just fine if there are hundreds or thousands of categories, as in this example. 如果类别的数量大得多,则可能会降低性能(事实上,如果符合条件的类别超过 1 百万,由于前面所述的 1 百万行限制,查询将失败)。Performance can degrade if the number of categories is much larger (and indeed, the query will fail if there were more than a million categories meeting the condition, due to the one million row limit discussed earlier).

  • TopN 筛选器:可以定义高级筛选器以筛选根据某些度量值排列的前(或后)N 个值,例如,仅在上述视觉对象中包含前 10 个类别。TopN filters: Advanced filters can be defined to filter on only the Top (or Bottom) N values ranked by some measure, for example, to only include the Top 10 Categories in the visual above. 这会导致两个查询被发送到基础数据源。This will again result in two queries being sent to the underlying source. 但是,第一个查询会从基础数据源返回所有类别,然后 TopN 基于返回的结果进行筛选。However, the first query will return all categories from the underlying source, and then the TopN are determined based on the returned results. 具体取决于涉及的列的基数,这可能会导致性能问题(或由于 1 百万行限制引起的查询失败)。Depending on the cardinality of the column involved, this can lead to performance issues (or query failures due to the 1m row limit).
  • 中值:通常情况下,所有聚合(Sum、Count、Distinct 等)都会被推送到基础数据源。Median: Generally, any aggregation (Sum, Count Distinct, so on) is pushed to the underlying source. 但这并不适用于中值,因为基础数据源通常不支持此聚合。However, this is not true for Median, as this aggregate is generally not supported by the underlying source. 在这种情况下,会先从基础数据源中检索详细数据,然后从返回的结果中计算中值。In such cases, the detail data is retrieved from the underlying source, and the Median calculated from the returned results. 当中值在相对较少的结果中计算时,等待时间是合理的,但是如果基数很大,则会发生性能问题(或由于 1 百万行限制引起的查询失败)。This is reasonable when the median is to be calculated over a relatively small number of results, but performance issues (or query failures due to the 1m row limit) will occur if the cardinality is large. 例如,国家/地区人口的中值可能是合理的,但销售价格的中值可能不合理。For example, Median Country Population might be reasonable, but Median Sales Price might not be.
  • 高级文本筛选器(“contains”和类似筛选器):当对文本列进行筛选时,高级筛选功能允许使用筛选器“contains”、“begins with”等筛选器。Advanced text filters (‘contains’ and similar): When filtering on a text column, the advanced filtering allows filters like ‘contains’ and ‘begins with’ and so on. 对于某些数据源,这些筛选器肯定会导致性能下降。These filters can certainly result in degraded performance for some data sources. 特别是在真正需要完全匹配(“is”或“is not”)的情况下,不应使用默认的“contains”筛选器。In particular, the default ‘contains’ filter should not be used if what is really required is an exact match (‘is’ or ‘is not’). 尽管结果可能相同(具体取决于实际数据),但由于使用索引不同,性能可能也会完全不同。Although the results might be the same, depending on the actual data, the performance might be drastically different due to the use of indexes.
  • 多选切片器:默认情况下,切片器仅允许单选。Multi select slicers: By default, slicers only allow a single selection to be made. 允许多选筛选器可能会导致一些性能问题,因为用户在切片器中选择一组项时(例如,10 个感兴趣的产品),每个新选择都将导致向后端数据源发送查询。Allowing multi selection in filters can cause some performance issues, because as the user selects a set of items in the slicer (for example, the ten products they are interested in), then each new selection will result in queries being sent to the backend source. 同时用户可以在查询完成之前选择下一项,这会导致基础源上的额外负载。Whilst the user can select the next item prior to the query completing, this does result in extra load on the underlying source.

诊断性能问题Diagnosing performance issues

本部分介绍如何诊断性能问题,或如何获取更详细的信息以优化报表。This section describes how to diagnose performance issues, or how to get more detailed information to allow the reports to be optimized.

强烈建议在 Power BI Desktop 而不是 Power BI 服务中诊断任何性能问题。It's strongly recommended that any diagnosis of performance issues starts in Power BI Desktop, rather than in the Power BI service. 通常情况下,性能问题只取决于基础数据源的性能水平,因此,在 Power BI Desktop 更加独立的环境中就更容易识别和诊断这些问题,并且可以一开始就消除某些组件(如 Power BI 网关)。It's commonly the case that performance issues are simply based on the level of performance of the underlying source, and these are more easily identified and diagnosed in the much more isolated environment of Power BI Desktop, and initially eliminates certain components (such as the Power BI gateway). 只有在 Power BI Desktop 中未发现性能问题时,调查才应关注 Power BI 服务中的报表细节。Only if the performance issues are found to not be present with Power BI Desktop should investigation focus on the specifics of the report in the Power BI service.

同样,建议先尝试隔离单个视觉对象的任何问题,而不是调查页面上多个视觉对象的问题。Similarly, it is recommended to first try to isolate any issues to an individual visual, rather than many visuals on a page.

因此,假设已完成这些步骤(如本部分中前面段落所述)- 现在 Power BI Desktop 页面中单个视觉对象仍然缓慢。So, let's say those steps (in the previous paragraphs in this section) have been taken - we now have a single visual on a page in Power BI Desktop that is still sluggish. 为了确定 Power BI Desktop 发送到基础数据源的查询,可以查看可能由该数据源发出的跟踪/诊断信息。To determine the queries that are sent to the underlying source by Power BI Desktop, it's possible to view traces/diagnostic information that might be emitted by that source. 此类跟踪还可能包含关于如何执行查询的详细信息以及如何改进的有用信息。Such traces might also contain useful information about the details of how the query was executed, and how it can be improved.

此外,即使数据源中没有这类跟踪,也可以查看 Power BI 发送的查询及其执行时间,如下所述。Further, even in the absence of such traces from the source, it's possible to view the queries sent by Power BI, along with their execution times, as described next.

确定通过 Power BI Desktop 发送的查询Determining the queries sent by Power BI Desktop

默认情况下,Power BI Desktop 会在给定会话期间将事件记录到名为 FlightRecorderCurrent.trc 的跟踪文件中。By default, Power BI Desktop logs events during a given session to a trace file called FlightRecorderCurrent.trc.

对于某些 DirectQuery 数据源,此日志文件包含发送到基础数据源的所有查询(将来会包括其余的 DirectQuery 数据源)。For some DirectQuery sources, this log includes all queries sent to the underlying data source (the remaining DirectQuery sources will be included in the future). 将查询发送到日志的源如下所示:The sources that send queries to the log are the following:

  • SQL ServerSQL Server
  • Azure SQL 数据库Azure SQL Database
  • Azure SQL 数据仓库Azure SQL Data warehouse
  • OracleOracle
  • TeradataTeradata
  • SAP HANASAP HANA

当前用户的 AppData 文件夹中可以找到的跟踪文件:The trace file can be found in the AppData folder for the current user:

\<User>\AppData\Local\Microsoft\Power BI Desktop\AnalysisServicesWorkspaces

下面是获取对此文件夹的简单办法:在“Power BI Desktop”中,选择“文件”>“选项和设置”>“选项”,然后选择“诊断”。Here's an easy way to get to this folder: In Power BI Desktop select File > Options and settings > Options, and then select Diagnostics. 将显示以下对话框窗口:The following dialog window appears:

在“诊断选项”下,选择“打开跟踪文件夹”链接,此时会打开以下文件夹:When you select the Open traces folder link, under Diagnostic Options, the following folder opens:

\<User>\AppData\Local\Microsoft\Power BI Desktop\Traces

导航到该文件夹的父文件夹将显示包含 AnalysisServicesWorkspaces 的文件夹,该文件包含每个打开的 Power BI Desktop 实例的工作区子文件夹。Navigating to that folder's parent folder displays the folder containing AnalysisServicesWorkspaces, which will contain one workspace subfolder for every open instance of Power BI Desktop. 这些子文件夹名称中带有整数后缀,例如 AnalysisServicesWorkspace2058279583。These subfolders are named with an integer suffix, such as AnalysisServicesWorkspace2058279583.

该文件夹内是一个 \Data 子文件夹,其中包含当前 Power BI 会话的跟踪文件 FlightRecorderCurrent.trc。Inside that folder is a \Data subfolder that contains the trace file FlightRecorderCurrent.trc for the current Power BI session. 相关联的 Power BI Desktop 会话结束时,将删除相应的工作区文件夹。The corresponding workspace folder is deleted when the associated Power BI Desktop session ends.

跟踪文件可以使用 SQL Server Profiler 工具读取,该工具作为 SQL Server Management Studio 的一部分可以免费下载。The trace files can be read using the SQL Server Profiler tool, which is available as a free download as part of SQL Server Management Studio. 可以从此处获取。You can get that from this location.

下载并安装 SQL Server Management Studio 后,运行 SQL Server Profiler。Once you download and install SQL Server Management Studio, run SQL Server Profiler.

若要打开跟踪文件,请执行以下步骤:To open the trace file, take the following steps:

  1. 在 SQL Server Profiler 中,选择“文件”>“打开”>“跟踪文件”In SQL Server Profiler, select File > Open > Trace file
  2. 输入当前打开的 Power BI 会话的跟踪文件路径,如:Enter the path to the trace file for the currently open Power BI session, such as:

      C:\Users\<user>\AppData\Local\Microsoft\Power BI Desktop\AnalysisServicesWorkspaces\AnalysisServicesWorkspace2058279583\Data
    
  3. 打开 FilghtRecorderCurrent.trcOpen FilghtRecorderCurrent.trc

将显示当前会话的所有事件。All events from the current session are displayed. 注释示例如下所示,其中突出显示了事件组。An annotated example is shown below, which highlights groups of events. 每个组具有以下内容:Each group has the following:

  • 一个 Query Begin 和一个 Query End 事件,分别表示 UI(例如,从视觉对象,或从在筛选器 UI 中填充值的列表)生成的 DAX 查询的开始和结束A Query Begin and Query End event, which represent the start and end of a DAX query generated by the UI (for example, from a visual, or from populating a list of values in the filter UI)
  • 一对或多对 DirectQuery Begin 和 DirectQuery End 事件,表示发送到基础数据源的查询(作为评估 DAX 查询的一部分)。One or more pairs of DirectQuery Begin and DirectQuery End events, which represent a query sent to the underlying data source, as part of evaluating the DAX query.

请注意,可以并行执行多个 DAX 查询,因此来自不同组的事件可能互相交错。Note that multiple DAX queries can be executed in parallel, so events from different groups can be interleaved. ActivityID 值可以用于确定属于同一个组的具体事件。The value of the ActivityID can be used to determine which events belong to the same group.

其他较重要的列如下所示:Other columns of interest are the following:

  • TextData:事件的文本详细信息。TextData: The textual detail of the event. 对于“Query Begin/End”事件,该内容会是 DAX 查询。For “Query Begin/End” events this will be the DAX query. 对于“DirectQuery Begin/End”事件,该内容会是发送到基础数据源的 SQL 查询。For “DirectQuery Begin/End” events this will be the SQL query sent to the underlying source. 当前选中事件的 TextData 也显示在底部区域中。The TextData for the currently selected event is also displayed in the region at the bottom.
  • EndTime:事件完成的时间。EndTime: When the event completed.
  • Duration:执行 DAX 或 SQL 查询的持续时间,以毫秒为单位。Duration: The duration, in milliseconds, taken to execute the DAX or SQL query.
  • Error:指示是否发生了错误(发生错误时,该事件显示为红色)。Error: Indicates if an error occurred (in which case the event is also displayed in red).

请注意,在上图中,缩小了重要性较低的列,以便更容易看到重要性较高的列。Note that in the image above, some of the less interesting columns have narrowed, to allow the interesting columns to be seen more easily.

建议使用以下方法捕获跟踪以帮助诊断潜在性能问题:The recommended approach to capturing a trace to help diagnose a potential performance issue is the following:

  • 打开单个 Power BI Desktop 会话(避免多个工作区的文件夹产生混淆)Open a single Power BI Desktop session (to avoid the confusion of multiple workspace folders)
  • 在 Power BI Desktop 执行一组意向操作。Perform the set of actions of interest in Power BI Desktop. 再执行一些额外的操作,确保将意向操作事件刷新到跟踪文件中。Include a few additional actions beyond that, to ensure that the events of interest are flushed into the trace file.
  • 打开 SQL Server Profiler 并检查跟踪,如前面所述。Open SQL Server Profiler and examine the trace, as described earlier. 请记住,关闭 Power BI Desktop 时将删除跟踪文件。Remember that the trace file will be deleted upon closing Power BI Desktop. 此外,在 Power BI Desktop 中的进一步操作将不会立刻显示 – 应关闭跟踪文件并重新打开以查看新事件。Also, further actions in Power BI Desktop will not immediately appear – the trace file should be closed and re-opened to see the new events.
  • 保持较短的单个会话时间(10 秒的操作时间,而不是数百秒的操作时间),使跟踪文件更容易解释(并且因为跟踪文件的大小有限制,因此在时间很长的会话中可能会丢弃早期事件)。Keep individual sessions reasonably small (ten seconds of actions, not hundreds) to make it easier to interpret the trace file (and because there is a limit on the size of the trace file, thus for very long sessions there is a chance of early events being dropped).

了解 Power BI Desktop 发送的查询的形式Understanding the form of query sent by Power BI Desktop

Power BI Desktop 创建和发送的查询的一般格式会对每个引用的表使用子选项,其中子查询由“查询编辑器”中定义的查询定义。The general format of queries created and sent by Power BI Desktop use subselects for each of the tables referenced, where the subselect is as defined by the query defined in Query Editor. 例如,假设 SQL Server 中有以下 TPC-DS 表:For example, assume the following TPC-DS tables in SQL Server:

请考虑运行以下查询:Consider the following query:

该查询将生成下面的视觉对象:That query results in the following visual:

刷新该视觉对象将生成下一段所示的 SQL 查询。Refreshing that visual will result in the SQL query shown below the next paragraph. 可以看出,Web Sales、Item 和 Date_dim 各有三个子选项,每个子选项都返回相应表上的所有列,即使实际只有四列被视觉对象引用。As you can tell, there are three subselects for Web Sales, Item, and Date_dim, that each return all the columns on the respective table, even though only four columns are actually referenced by the visual. 子选项中的这些查询(颜色较深部分)完全是“查询编辑器”中定义的查询的结果。These queries in the subselects (they're shaded) are exactly the result of the queries defined in Query editor. 对于目前 DirectQuery 支持的数据源,尚未发现以这种方式使用子选项会影响性能。Use of subselects in this manner has not been found to impact performance, for the data sources so far supported for DirectQuery. SQL Server 等数据源简化了对其他列的引用。Data sources like SQL Server simply optimize away the references to the other columns.

Power BI 使用此模式的原因之一是,所使用的 SQL 查询可直接由分析师提供,因此可根据“提供”的模式使用,而无需尝试重写。One reason Power BI employs this pattern is because the SQL query used can be provided directly by the analyst, so it's used "as provided", without an attempt to rewrite it.

后续步骤Next steps

本文介绍了所有数据源中常见的 DirectQuery 的各个方面。This article describes aspects of DirectQuery that are common across all data sources. 某些细节特定于某些数据源。There are certain details that are specific to individual sources. 请参阅以下涵盖特定数据源的主题:See the following topics covering specific sources:

有关 DirectQuery 的详细信息,请查看以下资源:For more information about DirectQuery, check out the following resources: