DirectQuery 和 SAP HANADirectQuery and SAP HANA

可以使用 DirectQuery 直接连接到 SAP HANA 数据源。You can connect to SAP HANA data sources directly using DirectQuery.

使用 SAP HANA 时,请务必了解如何处理其连接的某些方面,以确保:When using SAP HANA it is important to understand some aspects of how connections to it are treated, to ensure that:

  • 当 SAP HANA 视图包含非累加性度量值(例如,非重复计数或平均值,而不是简单求和)时,结果与预期一致The results are as expected, when the SAP HANA view contains non-additive measures (for example, distinct counts, or averages, rather than simple sums)
  • 生成的查询是高效的The resulting queries are efficient

当“获取数据”或“查询编辑器”中定义的查询执行聚合时,最好先花点时间来弄清关系源(如 SQL Server)的行为。It's useful to start by taking a moment to clarify the behavior of a relational source such as SQL Server, when the query defined in Get Data or Query Editor performs an aggregation. 在以下示例中,查询编辑器中定义的查询按 ProductID 返回平均价格。In the example that follows, a query defined in Query Editor returns the average price by ProductID.

如果要将数据导入到 Power BI(而不是使用 DirectQuery),将产生以下结果:If the data is being imported into Power BI (versus using DirectQuery), the following would result:

  • 数据在查询编辑器中创建的查询所定义的聚合级别导入。The data is imported at the level of aggregation defined by the query created in Query Editor. 例如,按产品划分的平均价格。For example, average price by product. 这会导致一个表格具有两列:ProductID 和 AveragePrice,可以在视觉对象中使用它们。This results in a table with the two columns ProductID and AveragePrice that can be used in visuals.
  • 在视觉对象中,任何后续聚合(如 Sum、Average、Min 等)都在该导入的数据上执行。In a visual, any subsequent aggregation (such as Sum, Average, Min, others) is performed over that imported data. 例如,在视觉对象中包含 AveragePrice 将默认使用求和聚合,并将针对每个 ProductID 在 AveragePrice 上返回总和 - 在此情况下为 13.67。For example, including AveragePrice on a visual will use the Sum aggregate by default, and would return the sum over the AveragePrice for each ProductID – which in this case would be 13.67. 这同样适用于视觉对象中使用的任何替代聚合函数(如 Min、Average 等)。The same applies to any alternative aggregate function (such as Min, Average, so on) used on the visual. 例如,AveragePrice 的平均值返回 6.66、4 和 3 的平均值,即等于 4.56,而不是基础表中的 6 个记录中的价格的平均值 5.17。For example, Average of AveragePrice returns the average of 6.66, 4 and 3, which equates to 4.56, and not the average of Price on the 6 records in the underlying table, which is 5.17.

如果使用“DirectQuery”而不是导入,则相同的语义适用并且结果将完全相同:If DirectQuery is being used instead of Import, the same semantics apply and the results would be exactly the same:

  • 对于相同的查询,逻辑上完全相同的数据会提供给报表层 - 即使没有实际导入数据。Given the same query, logically exactly the same data is presented to the reporting layer – even though the data is not actually imported.
  • 在视觉对象中,任何后续聚合(Sum、Average、Min 等)再次通过查询中的逻辑表执行。In a visual, any subsequent aggregation (Sum, Average, Min, others) is again performed over that logical table from the query. 同样,一个包含 AveragePrice 的平均值的视觉对象返回相同的 4.56。And again, a visual containing Average of AveragePrice returns the same 4.56.

现在,我们来看一看 SAP HANASo now let's consider SAP HANA. 在 SAP HANA 中,Power BI 可以使用分析视图和计算视图,这两种视图都可以包含度量值。Power BI can work with both Analytic Views and Calculation Views in SAP HANA, both of which can contain measures. 但是现在,用于 SAP HANA 的方法遵循如上所述的相同原则:“获取数据”或“查询编辑器”中定义的查询将确定可用数据,然后视觉对象中的任何后续聚合都在该数据上执行,这同样适用于导入和 DirectQuery。Yet today the approach for SAP HANA follows the same principles as described previously: the query defined in Get Data or Query Editor will determine the data available, and then any subsequent aggregation in a visual is over that data, and the same applies for both Import and DirectQuery.

但是,鉴于 HANA 的性质,初始“获取数据”对话框或“查询编辑器”中定义的查询始终是聚合查询,并且通常会包括度量值,要使用的实际聚合由 HANA 视图定义。However, given the nature of HANA, the query defined in the initial Get Data dialog or Query Editor is always an aggregate query, and generally will include measures where the actual aggregation that will be used is defined by the HANA view.

上面 SQL Server 示例的等效为存在包含 ID、ProductID、DepotID 和度量值(包括 AveragePrice,在视图中定义为“平均价格”)的 HANA 视图。The equivalent of the SQL Server example above is that there is a HANA view containing ID, ProductID, DepotID, and measures including AveragePrice, defined in the view as Average of Price.

如果在“获取数据”体验中,所做的选择是针对 ProductID 和 AveragePrice 度量值,那么即是在定义对该视图的查询,请求该聚合数据(在上面的示例中,为简单起见,使用与 HANA SQL 的确切语法不匹配的伪 SQL)。If, in the Get Data experience, the selections made were for ProductID and the AveragePrice measure, then that is defining a query over the view, requesting that aggregate data (in the example above, for simplicity pseudo-SQL is used that doesn’t match the exact syntax of HANA SQL). 视觉对象中定义的任何更深入聚合都将进一步聚合此类查询的结果。Then any further aggregations defined in a visual are further aggregating the results of such a query. 如上文针对 SQL Server 所述,这同时适用于导入和 DirectQuery 用例。Again, as described above for SQL Server, this applies both for the Import and DirectQuery case. 请注意,在 DirectQuery 用例中,来自“获取数据”或“查询编辑器”的查询将在发送到 HANA 的单个查询中的嵌套 select 语句中使用,因此实际上并不是在更深入聚合之前读取所有数据。Note that in the DirectQuery case, the query from Get Data or Query Editor will be used in a subselect within a single query sent to HANA, and thus it is not actually the case that all the data would be read in, prior to aggregating further.

这导致在对 HANA 使用 DirectQuery 时要考虑以下重要注意事项:This gives rise to the following important considerations when using DirectQuery over HANA:

  • 每当 HANA 中的度量值为非累加性时(例如,不是简单的 Sum、Min 或 Max),必须注意在视觉对象中执行的任何更深入聚合。Attention must be paid to any further aggregation performed in visuals, whenever the measure in HANA is non-additive (for example, not a simple Sum, Min, or Max).
  • 在“获取数据”或“查询编辑器”中,仅应包含所需的列以检索所需的数据,这反映了结果将是查询的事实,该查询必须是可以发送到 HANA 的合理查询。In Get Data or Query Editor, only the required columns should be included to retrieve the necessary data, reflecting the fact that the result will be a query, that must be a reasonable query that can be sent to HANA. 例如,如果选择了多个列,认为在后续视觉对象中可能需要这些列,那么即使是对于 DirectQuery,简单的视觉对象也将意味着嵌套 select 语句中使用的聚合查询将包含这些列,这样通常执行效果将非常差。For example, if dozens of columns were selected, with the thought that they might be needed on subsequent visuals, then even for DirectQuery a simple visual will mean the aggregate query used in the subselect will contain those dozens of columns, which will generally perform very poorly.

我们来看一个示例。Let's look at an example. 在以下示例中,在“获取数据”对话框中选择五个列(CalendarQuarter、Color、LastName、ProductLine、SalesOrderNumber)以及度量值 OrderQuantity 意味着以后创建包含 Min OrderQuantity 的简单视觉对象会导致将以下 SQL 查询发送到 HANA。In the following example, selecting five columns (CalendarQuarter, Color, LastName, ProductLine, SalesOrderNumber) in the Get Data dialog, along with the measure OrderQuantity, will mean that later creating a simple visual containing the Min OrderQuantity will result in the following SQL query to HANA. 阴影部分是嵌套 select 语句,其中包含“获取数据” / “查询编辑器”中的查询。The shaded portion is the subselect, containing the query from Get Data / Query Editor. 如果此嵌套 select 语句提供非常高的基数结果,那么很可能由此产生的 HANA 性能会很差。If this subselect gives a very high cardinality result, then it is likely the resulting HANA performance will be poor.

因此,建议在“获取数据”或“查询编辑器”中选择的项应限于所需的项,同时仍然生成 HANA 的合理查询。Because of this, it is recommended that the items selected in Get Data or Query Editor should be limited to those items that are needed, while still resulting in a reasonable query for HANA.

后续步骤Next steps

有关 DirectQuery 的详细信息,请查看以下资源:For more information about DirectQuery, check out the following resources: