Import data sources using Power Query
Before you begin ingesting data into Audience Insights, you need to decide the best method to use for collecting the data. This will vary depending on factors such as the volume of data being ingested and the amount of data transformation required. In scenarios that require a large amount of data transformation or when you're going to be connecting to non-Common Data Model or Dataverse sources, importing data is often the best option.
The import option provides the most flexibility because it uses Power Query. Power Query is Microsoft's data connectivity and data preparation technology, used to access and reshape data from hundreds of data sources. For more information, see What is Power Query? In Audience Insights, the import option includes around 40 connectors for data sources such as Excel, Oracle, OData, Azure, and more.
It's important to note that when data is ingested using the import option, it's copied into your Audience Insights data lake. Depending on the volume of data, extra storage might be required to support it. You'll also need to make sure you can access the data you want to import, because different connectors can require different configuration and authentication. Although Audience Insights uses Power Query, not all connectors that Power Query supports allow you to bring data into Audience Insights, so you should be familiar with the options available for the connector you select. A complete list of Power Query connectors and their capabilities is available.
One of the main advantages of importing data using Power Query is the ability to perform data transformation. Data transformation changes data into the appropriate form so it can be used for a statistical test or method. From an Audience Insights standpoint, data typically needs to be transformed to ensure it can be used appropriately by application features such as activities and measures. For example, activities rely heavily on dates, so if you intend to use a data set for activities, you'll need to ensure that the data set has at least one date field in it. Numeric data types would be needed for fields used in calculations or as measures, such as calculating the total cost of something.
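The idea behind these transformations can be sketched in plain Python (an illustration only, not Power Query itself; the field names and values here are hypothetical):

```python
from datetime import datetime
from decimal import Decimal

# Hypothetical raw records as they might arrive from a text-based source:
# every value is a string, which activities and measures can't use directly.
raw_rows = [
    {"CustomerId": "CUST-001", "PurchaseDate": "2023-04-01", "TotalPrice": "19.99"},
    {"CustomerId": "CUST-002", "PurchaseDate": "2023-04-02", "TotalPrice": "5.50"},
]

def transform(row):
    """Convert string fields into the types downstream features need:
    a date field for activities, a numeric field for measures."""
    return {
        "CustomerId": row["CustomerId"],  # text value, left as-is
        "PurchaseDate": datetime.strptime(row["PurchaseDate"], "%Y-%m-%d").date(),
        "TotalPrice": Decimal(row["TotalPrice"]),  # numeric, usable in calculations
    }

transformed = [transform(r) for r in raw_rows]

# With a numeric type in place, a measure such as total cost becomes possible.
total = sum(r["TotalPrice"] for r in transformed)
```

Once the date and numeric fields carry proper types, date-based activities and sum-style measures can operate on them directly.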
During the import process, you can modify configuration settings such as the delimiter used (based on the data source selected). The preview area allows you to modify some of these settings before you begin the transformation process. While each situation will be unique, there are a few standard transformation steps that you should consider.
These transformations are highly recommended:
Use first row as headers - If you're ingesting data from a CSV file, you can use the first row of the CSV file as the header information. Select Transform Table, and you'll see the option to switch between using the first row as headers or using it as data.
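What this step does can be illustrated with Python's standard csv module (a conceptual sketch, not how Power Query implements it; the sample CSV content is hypothetical):

```python
import csv
import io

# Simulated CSV content in which the first row holds the column names.
csv_text = (
    "Customer ID,Purchase date,Total price\n"
    "CUST-001,2023-04-01,19.99\n"
    "CUST-002,2023-04-02,5.50\n"
)

rows = list(csv.reader(io.StringIO(csv_text)))

# "Use first row as headers": promote row 0 to field names;
# the remaining rows become the actual data records.
headers, data = rows[0], rows[1:]
records = [dict(zip(headers, row)) for row in data]
```

If the first row were treated as data instead, every record would gain a bogus row of literal header strings, which is why this option matters for CSV sources.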
Map your data to a standard Data Format - Audience Insights allows you to map your data to the Common Data Model. The Common Data Model (CDM) is a standard and extensible collection of schemas (entities, attributes, relationships) that represents business concepts and activities with well-defined semantics, to facilitate data interoperability. Examples of entities include: Account, Contact, Lead, Opportunity, Product, etc. To do this, go to Map to Standard, and then map fields from your source data to Common Data Model fields.
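Conceptually, this mapping step pairs each source column with a target attribute, which can be sketched as a simple lookup (the CDM attribute names below are illustrative assumptions, not the authoritative schema):

```python
# Hypothetical mapping from source column names to Common Data Model
# attribute names. Real CDM entities define their own attribute names;
# these are placeholders for illustration.
field_map = {
    "Customer ID": "contactId",
    "Purchase date": "transactionDate",
    "Total price": "amount",
}

source_record = {
    "Customer ID": "CUST-001",
    "Purchase date": "2023-04-01",
    "Total price": "19.99",
}

# Rename each source field to its mapped standard-schema field.
cdm_record = {field_map[col]: value for col, value in source_record.items()}
```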
Modify Field Data Types - Each field within a data set will have a data type associated with it.
More on data types
As mentioned previously, the data type associated with a field directly impacts how the item can be used and tracked in Audience Insights. Failure to define data types correctly could prevent you from creating activities or measures. While it's possible to adjust this information later, depending on where you are in the process, it could lead to considerable extra configuration work. It's important to spend time ensuring each field has the appropriate data type associated with it.
Consider the following example. The table represents an example of a transaction table that contains data, which represents purchases made on a company website. It contains the ID of the customer that made the purchase, and the purchase date and total price of the purchase.
| Customer ID | Purchase date | Total price |
|-------------|---------------|-------------|
To ensure that we can use the data from this table appropriately, we need to make sure that we've assigned the correct data type to the appropriate fields.
In the above example, we might set the data types for the fields as noted below:
Customer ID - Since this is a text-based value, we could leave this as a text field.
Purchase Date - This will likely be used to represent time windows so we would want this field to be set to some type of date-related data type.
Date - Stores just the date that the transaction took place.
Date/Time - Stores both the date and time of the purchase.
Date/Time/Zone - Stores date, time, and time zone details.
In our example, it can be set to date since we only need to store it as a date and no time-related data is necessary.
Total Price - This would likely be set as Currency to represent a dollar amount.
Be sure to review each field, especially if you're connected to a text-based data source to ensure everything is set appropriately.
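The three date-related data types listed above have close analogs in Python's standard library, which makes the distinction concrete (an analogy for illustration, not Power Query's own type system; the sample values are hypothetical):

```python
from datetime import date, datetime, timezone

# Date: stores just the calendar date of the transaction.
purchase_date = date(2023, 4, 1)

# Date/Time: stores both the date and the time of the purchase.
purchase_datetime = datetime(2023, 4, 1, 14, 30)

# Date/Time/Zone: stores date, time, and time zone details.
purchase_dt_zone = datetime(2023, 4, 1, 14, 30, tzinfo=timezone.utc)
```

In the example table, only the calendar date matters, so the plain Date type is sufficient; choosing a richer type would store detail the scenario doesn't need.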
As with anything, it's possible to make mistakes during the data transformation process, such as defining the wrong data type. As you make changes to a data set, those changes are tracked in the Applied Steps section under Query settings on the right-hand side of the edit queries screen. You can use the applied steps to edit or remove any of the steps you performed, as needed.
The Query settings also contain a Name field that you can use to name the query you're working with. Once you've completed your query changes, select Next to save your data set changes.
For more information on Importing using Power Query, see Import Using Power Query.