Apply insights in Power BI Desktop to discover where distributions vary (preview)

Often in visuals, you see a data point and wonder whether the distribution would be the same for different categories. With insights in Power BI Desktop, you can find out with just a few clicks.

Consider the following visual, which shows Total Sales by Country. As the chart shows, most sales come from the United States, which accounts for 57% of all sales, with lesser contributions coming from the other countries. In such cases it's often interesting to explore whether the same distribution would be seen for different sub-populations. For example, is it the same for all years, all sales channels, and all categories of products? While you could apply different filters and compare the results visually, doing so can be time-consuming and error-prone.

Chart showing a large distribution

You can tell Power BI Desktop to find where a distribution is different, and get fast, automated, insightful analysis of your data. Right-click a data point and select Analyze > Find where the distribution is different, and the insight is delivered to you in an easy-to-use window.

Insights on where the distribution is different

In this example, the automated analysis quickly shows that for Touring Bikes, the proportion of sales in the United States and Canada is lower, while the proportion coming from the other countries is higher.

Note

This feature is in preview and is subject to change. Beginning with the September 2017 version of Power BI Desktop, the insights feature is enabled by default (you don't need to check a Preview box to enable it).

Using insights

To use insights to find where distributions seen on charts are different, right-click any data point (or the visual as a whole) and select Analyze > Find where the distribution is different.

Right-click to get insights

Power BI Desktop then runs its machine learning algorithms over the data, and populates a window with a visual and a description of which categories (columns), and which values of those columns, result in the most significantly different distribution. Insights are provided as a column chart, as shown in the following image.

Column chart

The values with the selected filter applied are shown using the normal default color. The overall values, as seen on the original starting visual, are shown in grey for easy comparison. Up to three different filters might be included (Touring Bikes, Mountain Bikes, and Road Bikes in this example). You can choose a different filter by clicking it, or use Ctrl+click to select several.

For simple additive measures, like Total Sales in this example, the comparison is based on relative rather than absolute values. Hence, while the sales for Touring Bikes are lower than the overall sales for all categories, by default the visual uses a dual axis so that the proportion of sales across countries can be compared for Touring Bikes versus all categories of bikes. Switching the toggle below the visual displays the two values on the same axis, so that the absolute values can easily be compared (as shown in the following image).

Visual displayed when using insights

The descriptive text also gives some indication of the level of importance that might be attached to a filter value, by giving the number of records that match the filter. So in this example, you can see that while the distribution for Touring Bikes might be significantly different, Touring Bikes account for only 16.6% of records.

The thumbs up and thumbs down icons at the top of the page let you give feedback about the visual and the feature. Doing so provides feedback, but it does not currently train the algorithm to influence the results returned the next time you use the feature.

Importantly, the + button at the top of the visual lets you add the selected visual to your report, just as if you had created the visual manually. You can then format or otherwise adjust the added visual just as you would any other visual on your report. You can only add a selected insight visual when you're editing a report in Power BI Desktop.

You can use insights when your report is in reading or editing mode, which makes the feature versatile both for analyzing data and for creating visuals you can easily add to your reports.

Details of the returned results

You can think of the algorithm as taking all the other columns in the model and, for all of the values of those columns, applying them as filters to the original visual, then finding which of those filter values produce the most different result from the original.
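The description above can be sketched in a few lines of Python. This is only a conceptual illustration, not the algorithm Power BI Desktop actually runs: the sample rows are made up, the function names are hypothetical, and the difference metric (total variation distance between share distributions) is an assumption chosen for simplicity.

```python
from collections import defaultdict

def shares(rows, axis, measure):
    """Return each axis value's share of the summed measure."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[axis]] += row[measure]
    grand_total = sum(totals.values())
    return {key: value / grand_total for key, value in totals.items()}

def difference(base, filtered):
    """Total variation distance between two share dictionaries (assumed metric)."""
    keys = set(base) | set(filtered)
    return 0.5 * sum(abs(base.get(k, 0.0) - filtered.get(k, 0.0)) for k in keys)

def rank_filters(rows, axis, measure):
    """Apply every (column, value) pair as a filter and rank by difference."""
    base = shares(rows, axis, measure)
    other_columns = [c for c in rows[0] if c not in (axis, measure)]
    results = []
    for column in other_columns:
        for value in sorted({row[column] for row in rows}):
            subset = [row for row in rows if row[column] == value]
            score = difference(base, shares(subset, axis, measure))
            results.append((score, column, value))
    return sorted(results, reverse=True)

# Made-up sample rows, for illustration only.
rows = [
    {"Country": "USA", "Subcategory": "Road Bikes", "Channel": "Store", "Sales": 6.0},
    {"Country": "Canada", "Subcategory": "Road Bikes", "Channel": "Store", "Sales": 2.0},
    {"Country": "USA", "Subcategory": "Touring Bikes", "Channel": "Direct", "Sales": 1.0},
    {"Country": "Canada", "Subcategory": "Touring Bikes", "Channel": "Direct", "Sales": 3.0},
]

ranked = rank_filters(rows, "Country", "Sales")
for score, column, value in ranked:
    print(f"{column} = {value}: {score:.3f}")
```

In this toy data, the filters that flip the USA/Canada split (Touring Bikes, Direct) rank above those that merely sharpen it. The sketch assumes every filtered subset has a non-zero total.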

You might wonder what "different" means. For example, say that the overall split of sales between the USA and Canada was the following:

Country    Sales ($M)
USA        15
Canada     5

Then for a particular category of product (Road Bikes), the split of sales might be:

Country    Sales ($M)
USA        3
Canada     1

While the numbers are different in each of those tables, the relative values for the USA and Canada are identical: 75% and 25%, both overall and for Road Bikes. Because of that, these distributions are not considered different. For simple additive measures like this, the algorithm therefore looks for differences in the relative values.
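As a quick check of the arithmetic, the two sales tables can be converted to shares in Python. The helper name `to_shares` is hypothetical and is used here only for illustration.

```python
def to_shares(values):
    """Convert absolute values to fractions of their total."""
    total = sum(values.values())
    return {key: value / total for key, value in values.items()}

overall_sales = {"USA": 15, "Canada": 5}   # $M, all categories
road_bike_sales = {"USA": 3, "Canada": 1}  # $M, Road Bikes only

overall_shares = to_shares(overall_sales)
road_bike_shares = to_shares(road_bike_sales)

print(overall_shares)    # {'USA': 0.75, 'Canada': 0.25}
print(road_bike_shares)  # {'USA': 0.75, 'Canada': 0.25}
print(overall_shares == road_bike_shares)  # True
```

The absolute numbers differ by a factor of five, but the shares match exactly, which is why this filter would not be reported as different.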

By contrast, consider a measure like margin, which is calculated as Profit/Cost, and say that the overall margins for the USA and Canada were the following:

Country    Margin (%)
USA        15
Canada     5

Then for a particular category of product (Road Bikes), the margins might be:

Country    Margin (%)
USA        3
Canada     1

Given the nature of such measures, this is considered interestingly different. So for non-additive measures such as this margin example, the algorithm looks for differences in the absolute values.
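For the margin tables, the comparison is on the values themselves rather than on shares. The short sketch below just subtracts the two tables; the variable names are hypothetical, and how the product actually scores such gaps is not documented here.

```python
overall_margin = {"USA": 15, "Canada": 5}   # margin in %, all categories
road_bike_margin = {"USA": 3, "Canada": 1}  # margin in %, Road Bikes only

# Absolute gap in percentage points for each country.
margin_gaps = {
    country: overall_margin[country] - road_bike_margin[country]
    for country in overall_margin
}

print(margin_gaps)  # {'USA': 12, 'Canada': 4}
```

The ratio between the USA and Canada is the same in both tables, yet a 3% margin is clearly not the same as a 15% margin, so the absolute gap is what matters for this kind of measure.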

The visuals displayed are thus intended to clearly show the differences found between the overall distribution (as seen in the original visual) and the values with the particular filter applied.

So for additive measures, like Sales in the previous example, a column and line chart is used, with a dual axis scaled so that the relative values can easily be compared. The columns show the value with the filter applied, and the line shows the overall value (with the column axis on the left and the line axis on the right, as normal). The line is shown using a stepped style, with a dashed line, filled with grey. For the previous example, if the column axis maximum value is 4 and the line axis maximum value is 20, the chart allows easy comparison of the relative values between the USA and Canada for the filtered and overall values.
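The effect of the dual-axis scaling can be checked numerically. The sketch below uses the axis maximums from the example (4 and 20) and computes the height of each mark as a fraction of its own axis; the variable names are hypothetical.

```python
filtered_sales = {"USA": 3, "Canada": 1}   # columns: Road Bikes, $M
overall_sales = {"USA": 15, "Canada": 5}   # line: all categories, $M

column_axis_max = 4   # left axis, used by the columns
line_axis_max = 20    # right axis, used by the line

# Height of each mark as a fraction of its own axis.
column_heights = {k: v / column_axis_max for k, v in filtered_sales.items()}
line_heights = {k: v / line_axis_max for k, v in overall_sales.items()}

print(column_heights)  # {'USA': 0.75, 'Canada': 0.25}
print(line_heights)    # {'USA': 0.75, 'Canada': 0.25}
```

Because the shares are identical, the line sits exactly on top of the columns. Any filter that changes the shares would show up as a visible gap between the two.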

Similarly, for non-additive measures like Margin in the previous example, a column and line chart is used, where the use of a single axis means the absolute values can easily be compared. Again, the line (filled with grey) shows the overall value.

Whether comparing actual or relative numbers, determining the degree to which two distributions differ is not simply a matter of calculating the difference in the values. For example:

  • The size of the population is factored in, because a difference is less statistically significant and less interesting when it applies to a smaller proportion of the overall population. For example, the distribution of sales across countries might be very different for one particular product, but this would not be considered interesting if there were thousands of products and that product therefore accounted for only a small percentage of overall sales.

  • Differences for categories where the original values were very high or very close to zero are weighted more heavily than others. For example, if a country contributes only 1% of sales overall but 6% for some particular type of product, that is more statistically significant, and therefore considered more interesting, than a country whose contribution changed from 50% to 55%.

  • Various heuristics are employed to select the most meaningful results, for example by considering other relationships between the data.
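The first two points in the list above can be illustrated with a standard one-sample z-score for a proportion. This is a swapped-in textbook statistic, not the weighting Power BI Desktop is documented to use; the function name and the record count `n` are assumptions made for the illustration.

```python
import math

def proportion_z(expected, observed, n):
    """z-score of an observed share against an expected share, for n records.

    Illustrative stand-in only: the actual weighting used by the
    insights feature is not specified in this article.
    """
    standard_error = math.sqrt(expected * (1 - expected) / n)
    return (observed - expected) / standard_error

n = 1000  # hypothetical number of records matching the filter

# Same 5-point change, very different starting shares.
small_share_z = proportion_z(0.01, 0.06, n)  # 1% -> 6%
large_share_z = proportion_z(0.50, 0.55, n)  # 50% -> 55%
print(small_share_z > large_share_z)  # True

# Same change, fewer matching records: weaker evidence.
few_records_z = proportion_z(0.01, 0.06, 100)
print(few_records_z < small_share_z)  # True
```

Under this statistic, a shift away from a share near 0% or 100% scores higher than the same shift around 50%, and the score grows with the number of records behind the filter, which matches the qualitative behavior described above.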

After examining different columns, and the values for each of those columns, the set of values that give the biggest differences is chosen. For ease of understanding, these are then output grouped by column, with the column whose values give the biggest difference listed first. Up to three values are shown per column, but fewer might be shown if fewer than three values have a large effect, or if some values are much more impactful than others.

Not all of the columns in the model will necessarily be examined in the time available, so it is not guaranteed that the most impactful columns and values are displayed. However, various heuristics are employed to ensure that the most likely columns are examined first. For example, say that after examining all the columns, it is determined that the following columns/values have the biggest impact on the distribution, from most impact to least:

Subcategory = Touring Bikes
Channel = Direct
Subcategory = Mountain Bikes
Subcategory = Road Bikes
Subcategory = Kids Bikes
Channel = Store

These would be output in column order, as follows:

  • Subcategory: Touring Bikes, Mountain Bikes, Road Bikes (only three are listed, with the text including "...amongst others" to indicate that more than three have a significant impact)

  • Channel = Direct (only Direct is listed, if its level of impact was much greater than that of Store)
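The grouping step can be sketched as follows. The ranked list mirrors the example above, but the impact scores, the `KEEP_RATIO` cutoff, and the function name are all hypothetical: the article does not say how "much greater" is decided.

```python
# (column, value, impact score), ordered from most to least impact.
# Scores are made up for illustration.
ranked = [
    ("Subcategory", "Touring Bikes", 0.9),
    ("Channel", "Direct", 0.8),
    ("Subcategory", "Mountain Bikes", 0.7),
    ("Subcategory", "Road Bikes", 0.6),
    ("Subcategory", "Kids Bikes", 0.5),
    ("Channel", "Store", 0.1),
]

MAX_VALUES = 3    # at most three values shown per column
KEEP_RATIO = 0.5  # assumed cutoff relative to the column's top score

def group_by_column(ranked):
    """Group ranked values by column, keeping columns in first-seen order."""
    groups = {}
    for column, value, score in ranked:
        groups.setdefault(column, []).append((value, score))
    output = []
    for column, items in groups.items():
        top_score = items[0][1]
        significant = [v for v, s in items if s >= KEEP_RATIO * top_score]
        has_more = len(significant) > MAX_VALUES
        output.append((column, significant[:MAX_VALUES], has_more))
    return output

grouped = group_by_column(ranked)
for column, values, has_more in grouped:
    suffix = " ...amongst others" if has_more else ""
    print(f"{column}: {', '.join(values)}{suffix}")
```

With these made-up scores, Subcategory is listed first with three values and the "amongst others" marker, and Channel shows only Direct because Store falls below the assumed cutoff.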

Considerations and limitations

The following scenarios are currently not supported for insights:

  • TopN filters
  • Measure filters
  • Non-numeric measures
  • Use of "Show value as"
  • Filtered measures: these are visual-level calculations with a specific filter applied (for example, Total Sales for France), and are used on some of the visuals created by the insights feature

In addition, the following model types and data sources are currently not supported for insights:

  • DirectQuery
  • Live connect
  • On-premises Reporting Services
  • Embedding

Next steps

For more information about Power BI Desktop, and how to get started, check out the following articles.