As one of the first technical steps in a data science or AI project, you must identify the datasets to be used and bring them into your analytics environment. The Data Science Virtual Machine (DSVM) provides tools and libraries to bring data from different sources into analytical data storage locally on the DSVM, or into a data platform either on the cloud or on-premises.
Here are some data movement tools that are available in the DSVM.
Azure CLI

| Category | Value |
| --- | --- |
| What is it? | A management tool for Azure. It also contains command verbs to move data from Azure data platforms like Azure Blob storage and Azure Data Lake Store. |
| Supported DSVM versions | Windows, Linux |
| Typical uses | Importing and exporting data to and from Azure Storage and Azure Data Lake Store. |
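
As a rough illustration of the import and export use case, the sketch below builds `az storage blob upload` and `az storage blob download` command lines from Python and, by default, only prints them. The storage account, container, blob, and file names are hypothetical placeholders, and `--auth-mode login` assumes you have already signed in with `az login` and hold a suitable data role on the account. Treat it as one possible approach rather than the recommended workflow.

```python
import shlex
import shutil
import subprocess


def blob_upload_cmd(account, container, local_path, blob_name):
    """Build an `az storage blob upload` command as an argument list."""
    return [
        "az", "storage", "blob", "upload",
        "--account-name", account,
        "--container-name", container,
        "--name", blob_name,
        "--file", local_path,
        "--auth-mode", "login",  # assumption: Azure AD sign-in, not account keys
    ]


def blob_download_cmd(account, container, blob_name, local_path):
    """Build an `az storage blob download` command as an argument list."""
    return [
        "az", "storage", "blob", "download",
        "--account-name", account,
        "--container-name", container,
        "--name", blob_name,
        "--file", local_path,
        "--auth-mode", "login",
    ]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if not dry_run:
        # Resolve the executable first so that az.cmd is found on Windows.
        exe = shutil.which(cmd[0]) or cmd[0]
        subprocess.run([exe, *cmd[1:]], check=True)


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    run(blob_upload_cmd("mystorageaccount", "datasets", "train.csv", "raw/train.csv"))
    run(blob_download_cmd("mystorageaccount", "datasets", "raw/train.csv", "train_copy.csv"))
```

Calling `run(..., dry_run=False)` executes the command on a DSVM where the Azure CLI is installed and signed in. Keeping the default dry run lets you review the exact command line first.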

Azure Cosmos DB Data Migration Tool

| Category | Value |
| --- | --- |
| What is it? | A tool to import data from various sources into Azure Cosmos DB, a NoSQL database in the cloud. These sources include JSON files, CSV files, SQL, MongoDB, Azure Table storage, Amazon DynamoDB, and Azure Cosmos DB SQL API collections. |
| Supported DSVM versions | Windows |
| Typical uses | Importing files from a VM to Azure Cosmos DB, importing data from Azure Table storage to Azure Cosmos DB, and importing data from a Microsoft SQL Server database to Azure Cosmos DB. |
| How to use / run it? | To use the command-line version, open a command prompt and type dt. To use the GUI tool, open a command prompt and type dtui. |
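
To make the command-line usage more concrete, here is a minimal sketch that assembles a `dt` invocation for importing a JSON file into Azure Cosmos DB. The option names (`/s:JsonFile`, `/s.Files`, `/t:DocumentDBBulk`, `/t.ConnectionString`, `/t.Collection`) follow the pattern used in the tool's published examples as best recalled here, so treat them as assumptions and check them against the version installed on your DSVM. The file path, endpoint, key, database, and collection values are hypothetical placeholders. The sketch only prints the command and does not run it.

```python
def dt_json_import_cmd(json_path, connection_string, collection):
    """Build a `dt` command that imports a JSON file into a collection.

    Option names are assumptions based on the tool's published examples;
    verify them against your installed version before running.
    """
    return [
        "dt",
        "/s:JsonFile",                               # source type: JSON file
        f"/s.Files:{json_path}",                     # source file to import
        "/t:DocumentDBBulk",                         # target: bulk import
        f"/t.ConnectionString:{connection_string}",  # target account and database
        f"/t.Collection:{collection}",               # target collection name
    ]


if __name__ == "__main__":
    # Hypothetical placeholders; substitute your own account details.
    conn = "AccountEndpoint=<endpoint>;AccountKey=<key>;Database=<database>"
    cmd = dt_json_import_cmd(r".\sessions.json", conn, "Sessions")
    # Quote the connection string when pasting this into a command prompt.
    print(" ".join(cmd))
```

Because the connection string contains an account key, avoid hard-coding it in scripts that you share or commit to source control.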