Transformation functions in Power Query for data wrangling

APPLIES TO: Azure Data Factory Azure Synapse Analytics

Data Wrangling in Azure Data Factory allows you to do code-free agile data preparation and wrangling at cloud scale by translating Power Query M scripts into Data Flow script. ADF integrates with Power Query Online and makes Power Query M functions available for data wrangling via Spark execution using the data flow Spark infrastructure.

Note

Power Query in ADF is currently available in public preview

Currently not all Power Query M functions are supported for data wrangling despite being available during authoring. While building your mash-ups, you'll be prompted with the following error message if a function isn't supported:

UserQuery : Expression.Error: The transformation logic is not supported as it requires dynamic access to rows of data, which cannot be scaled out.

Below is a list of supported Power Query M functions.

Column Management

Row Filtering

Use M function Table.SelectRows to filter on the following conditions:

Adding and Transforming Columns

The following M functions add or transform columns: Table.AddColumn, Table.TransformColumns, Table.ReplaceValue, Table.DuplicateColumn. Below are the supported transformation functions.

Merging/Joining tables

  • Power Query will generate a nested join (Table.NestedJoin; users can also manually write Table.AddJoinColumn). Users must then expand the nested join column into a non-nested join (Table.ExpandTableColumn, not supported in any other context).
  • The M function Table.Join can be written directly to avoid the need for an additional expansion step, but the user must ensure that there are no duplicate column names among the joined tables
  • Supported Join Kinds: Inner, LeftOuter, RightOuter, FullOuter
  • Both Value.Equals and Value.NullableEquals are supported as key equality comparers

Group by

Use Table.Group to aggregate values.

Sorting

Use Table.Sort to sort values.

Reducing Rows

Keep and Remove Top, Keep Range (corresponding M functions, only supporting counts, not conditions: Table.FirstN, Table.Skip, Table.RemoveFirstN, Table.Range, Table.MinN, Table.MaxN)

Known unsupported functions

Function Status
Table.PromoteHeaders Not supported. The same result can be achieved by setting "First row as header" in the dataset.
Table.CombineColumns This is a common scenario that isn't directly supported but can be achieved by adding a new column that concatenates two given columns. For example, Table.AddColumn(RemoveEmailColumn, "Name", each [FirstName] & " " & [LastName])
Table.TransformColumnTypes This is supported in most cases. The following scenarios are unsupported: transforming string to currency type, transforming string to time type, transforming string to Percentage type.
Table.NestedJoin Just doing a join will result in a validation error. The columns must be expanded for it to work.
Table.Distinct Remove duplicate rows isn't supported.
Table.RemoveLastN Remove bottom rows isn't supported.
Table.RowCount Not supported, but can be achieved by adding a custom column containing the value 1, then aggregating that column with List.Sum. Table.Group is supported.
Row level error handling Row level error handling is currently not supported. For example, to filter out non-numeric values from a column, one approach would be to transform the text column to a number. Every cell which fails to transform will be in an error state and need to be filtered. This scenario isn't possible in scaled-out M.
Table.Transpose Not supported
Table.Pivot Not supported
Table.SplitColumn Partially supported

M script workarounds

For SplitColumn there is an alternate for split by length and by position

  • Table.AddColumn(Source, "First characters", each Text.Start([Email], 7), type text)
  • Table.AddColumn(#"Inserted first characters", "Text range", each Text.Middle([Email], 4, 9), type text)

This option is accessible from the Extract option in the ribbon

Power Query Add Column

For Table.CombineColumns

  • Table.AddColumn(RemoveEmailColumn, "Name", each [FirstName] & " " & [LastName])

Next steps

Learn how to create a data wrangling Power Query in ADF.