Use wizard for one-time ingestion of historical data with LightIngest (preview)
LightIngest is a command-line utility for ad-hoc data ingestion into Azure Data Explorer. To learn more about LightIngest, see Use LightIngest to ingest data into Azure Data Explorer.
LightIngest is particularly useful for loading historical data from an existing storage system into Azure Data Explorer. While you can build your own command using the list of Command-line arguments, an ingestion wizard can generate the command for you. The wizard also infers a schema mapping from your data set.
This article shows you how to create a new table, create schema mapping, and generate a LightIngest command for one-time ingestion using the LightIngest tool.
Prerequisites
- An Azure subscription. Create a free Azure account.
- A cluster and database.
- A storage account.
- LightIngest - download it as part of the Microsoft.Azure.Kusto.Tools NuGet package. For installation instructions, see Install LightIngest.
Access the wizard
The wizard can be accessed either from the Data tab, or from the Query tab of the Azure Data Explorer WebUI.
In the Data tab, from the Quick actions section, select Ingest new data. Alternatively, from the All actions section, select Ingest new data and then Ingest.
In the Query tab, right-click a database and select Ingest new data.
In the Ingest new data window, the Destination tab is selected. The Cluster and Database fields are automatically populated.
Destination tab
In Table, select either Existing table or Create new table. When creating a new table, enter a name for the new table. You can use alphanumeric characters, hyphens, and underscores. Special characters aren't supported.
Note
Table names must be between 1 and 1024 characters.
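The naming rules above can be checked before you open the wizard. The following is a minimal sketch, not the wizard's own validation: the function name `is_valid_table_name` is hypothetical, and it assumes "alphanumeric" means ASCII letters and digits.

```python
import re

# Assumption: "alphanumeric" is read as ASCII letters and digits.
# Allowed: letters, digits, hyphens, underscores; length 1 to 1024.
_TABLE_NAME = re.compile(r"[A-Za-z0-9_-]{1,1024}")


def is_valid_table_name(name: str) -> bool:
    """Return True if the name satisfies the rules stated in this section."""
    return _TABLE_NAME.fullmatch(name) is not None
```

For example, `is_valid_table_name("Historical_Data-2020")` passes, while a name that contains a space or is empty is rejected.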
Select Next: Source.
Source tab
- Under Source type, select From blob container (blob container, ADLS Gen2 container).
- Select Ingestion type > Historical data.
- You can either Add URL manually by copying the Account Key/SAS URL to source, or Select container from your storage account.
Note
The SAS URL can be created manually or automatically.
- When selecting from your storage account, select your Storage subscription, Storage account, and Container from the dropdown menus.
Advanced settings
To define additional settings for the ingestion process using LightIngest, select Advanced settings.
In the Advanced configuration panel, define the following settings:
| Property | Description |
|---|---|
| Creation time pattern | Specify to override the ingestion time property of the created extent with a pattern, for example, to apply a date based on the folder structure of the container. See also Creation time pattern. |
| Blob name pattern | Specify the pattern used to identify the files to be ingested. Ingests all the files that match the blob name pattern in the given container. Supports wildcards. Recommended to enclose in double quotes. |
| Tag | A tag assigned to the ingested data. The tag can be any string. |
| Limit amount of files | Specify the number of files that can be ingested. Ingests the first n files that match the blob name pattern, up to the number specified. |
| Don't wait for ingestion to complete | If set, queues the blobs for ingestion without monitoring the ingestion process. If not set, LightIngest continues to poll the ingestion status until ingestion is complete. |
| Display only selected items | Lists the files in the container, but doesn't ingest them. |

Enter values for relevant fields and select Done to return to the Source tab.
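As a rough illustration of how the settings in this panel might surface in the generated command, the sketch below maps each one to a LightIngest argument. The class and function names are hypothetical. The flag names (`-creationTimePattern`, `-pattern`, `-tag`, `-limit`, `-dontWait`, `-listOnly`) are taken from the LightIngest command-line argument reference as best recalled, so treat them as assumptions and compare them with the command the wizard actually produces.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AdvancedSettings:
    """Hypothetical container mirroring the Advanced configuration panel."""
    creation_time_pattern: Optional[str] = None
    blob_name_pattern: Optional[str] = None
    tag: Optional[str] = None
    limit: Optional[int] = None
    dont_wait: bool = False
    list_only: bool = False


def to_lightingest_args(settings: AdvancedSettings) -> List[str]:
    """Translate the panel settings into command-line arguments.

    Flag names are assumptions; verify them against the generated command.
    """
    args: List[str] = []
    if settings.creation_time_pattern:
        args.append(f'-creationTimePattern:"{settings.creation_time_pattern}"')
    if settings.blob_name_pattern:
        # The pattern is enclosed in double quotes, as recommended.
        args.append(f'-pattern:"{settings.blob_name_pattern}"')
    if settings.tag:
        args.append(f"-tag:{settings.tag}")
    if settings.limit is not None:
        args.append(f"-limit:{settings.limit}")
    if settings.dont_wait:
        args.append("-dontWait:true")
    if settings.list_only:
        args.append("-listOnly:true")
    return args
```

Settings left at their defaults produce no argument, which matches the idea that the panel is optional.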
Filter data
Optionally, filter the data to ingest only files in a specific folder path or with a particular file extension.
The system selects one of the files at random and generates the schema based on that schema-defining file. You can select a different file.
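To make the effect of the two filters concrete, here is a small sketch that applies a folder path and a file extension to a list of blob names. It only illustrates the idea. The function name `filter_blobs` and the sample blob names are hypothetical, and the wizard's own matching rules (for example, case sensitivity) may differ.

```python
from typing import Iterable, List


def filter_blobs(names: Iterable[str], folder_path: str = "",
                 extension: str = "") -> List[str]:
    """Keep blobs under folder_path whose name ends with extension.

    An empty folder_path or extension disables that filter.
    Assumption: the extension comparison is case-insensitive.
    """
    prefix = folder_path.strip("/")
    if prefix:
        prefix += "/"
    suffix = extension.lower()
    return [
        name for name in names
        if name.startswith(prefix) and name.lower().endswith(suffix)
    ]


blobs = ["2020/01/a.csv", "2020/01/b.json", "2021/c.csv"]
csv_2020 = filter_blobs(blobs, folder_path="2020", extension=".csv")
```

With the sample list above, only `2020/01/a.csv` remains, so it would also be the only candidate for the schema-defining file.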
Select Next: Schema to view and edit your table column configuration.
Edit the schema
In the Schema tab:
By looking at the name of the source, the service automatically identifies if it is compressed or not. Confirm that the Compression type is correct.
Confirm the format selected in Data format.
In this example, the data format is CSV.
For tabular data, you can select the Ignore the first record check box to skip the header row of the file.
In the Mapping name field, enter a mapping name. You can use alphanumeric characters and underscores. Spaces, special characters, and hyphens aren't supported.
When using an existing table, you can Keep current table schema if the table schema matches the selected format.
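The compression check in the Schema tab relies on the name of the source. A plausible reading is that the file extension decides the result. The sketch below shows that idea only. The function name, the extension table, and the `"uncompressed"` label are assumptions, not the service's actual logic.

```python
import os

# Assumption: only these two suffixes are recognized in this sketch.
_COMPRESSION_BY_EXTENSION = {".gz": "gzip", ".zip": "zip"}


def guess_compression(source_name: str) -> str:
    """Guess the compression type from a blob name or URL."""
    # Drop a query string (for example, a SAS token) before reading the suffix.
    path = source_name.split("?", 1)[0]
    extension = os.path.splitext(path)[1].lower()
    return _COMPRESSION_BY_EXTENSION.get(extension, "uncompressed")
```

Because a guess like this can be wrong for unusually named files, confirming the Compression type in the wizard remains the reliable check.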
Edit the table
When ingesting to a new table, you can alter various aspects of the table as you create it.
The changes you can make in a table depend on the following parameters:
- Table type is new or existing
- Mapping type is new or existing
| Table type | Mapping type | Available adjustments |
|---|---|---|
| New table | New mapping | Change data type, Rename column, New column, Delete column, Update column, Sort ascending, Sort descending |
| Existing table | New mapping | New column (on which you can then change data type, rename, and update), Update column, Sort ascending, Sort descending |
| Existing table | Existing mapping | Sort ascending, Sort descending |
Note
When adding a new column or updating a column, you can change mapping transformations. For more information, see Mapping transformations.
Note
For tabular formats, you can't map a column twice. To map to an existing column, first delete the new column.
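The table of available adjustments can be read as a lookup keyed by table type and mapping type. The sketch below encodes it that way. The names are hypothetical, and it assumes that an existing mapping always belongs to an existing table.

```python
from typing import Dict, List, Tuple

SORT = ["Sort ascending", "Sort descending"]

# Keys are (table type, mapping type).
ADJUSTMENTS: Dict[Tuple[str, str], List[str]] = {
    ("new", "new"): ["Change data type", "Rename column", "New column",
                     "Delete column", "Update column"] + SORT,
    ("existing", "new"): ["New column", "Update column"] + SORT,
    # Assumption: an existing mapping implies an existing table.
    ("existing", "existing"): SORT,
}


def available_adjustments(table_type: str, mapping_type: str) -> List[str]:
    """Return the adjustments allowed for the given combination."""
    key = (table_type.lower(), mapping_type.lower())
    if key not in ADJUSTMENTS:
        raise ValueError(f"Unsupported combination: {key}")
    return list(ADJUSTMENTS[key])
```

For instance, `available_adjustments("existing", "existing")` returns only the two sort options.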
Select Next: Summary to generate the LightIngest command.
Generate the LightIngest command
In the Data ingestion completed window, all three steps are marked with green check marks.
Copy the generated LightIngest command by selecting the copy icon at the top right of the command box.
In the tiles below the ingestion progress, you can download the LightIngest tool.
To complete the ingestion process, you must run LightIngest using this copied command.
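For orientation, the sketch below assembles the general shape of a LightIngest command from the values chosen in the wizard. It is not the wizard's output. The function name and all sample values are hypothetical, and the argument names (`-database`, `-table`, `-source`, `-format`, `-ingestionMappingRef`, `-ignoreFirstRow`) and the `;Fed=True` connection string suffix are assumptions based on the LightIngest argument reference. Always run the command copied from the wizard rather than one built this way.

```python
from typing import List, Optional


def build_lightingest_command(ingest_uri: str, database: str, table: str,
                              source: str, data_format: str,
                              mapping_name: Optional[str] = None,
                              ignore_first_record: bool = False) -> List[str]:
    """Assemble an argument list that resembles a LightIngest invocation.

    Argument names are assumptions; compare with the generated command.
    """
    command = [
        "LightIngest",
        f"{ingest_uri};Fed=True",   # connection string (assumed auth suffix)
        f"-database:{database}",
        f"-table:{table}",
        f"-source:{source}",        # container URL, including key or SAS
        f"-format:{data_format}",
    ]
    if mapping_name:
        command.append(f"-ingestionMappingRef:{mapping_name}")
    if ignore_first_record:
        command.append("-ignoreFirstRow:true")
    return command


# Hypothetical values for illustration only.
example = build_lightingest_command(
    "https://ingest-<cluster>.<region>.kusto.windows.net",
    "MyDatabase", "MyTable",
    "https://<account>.blob.core.windows.net/<container>?<SAS>",
    "csv", mapping_name="MyTable_mapping", ignore_first_record=True,
)
```

Returning a list rather than a single string keeps each argument separate, which avoids quoting problems if the list is later passed to a process launcher.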