Extract, transform, and load data on Linux with SSIS

THIS TOPIC APPLIES TO: yesSQL Server (Linux only)noAzure SQL DatabasenoAzure SQL Data WarehousenoParallel Data Warehouse

This topic describes how to run SQL Server Integration Services (SSIS) packages on Linux. SSIS solves complex data integration problems by extracting data from multiple sources and formats, transforming and cleansing the data, and loading the data into multiple destinations.

SSIS packages running on Linux can connect to Microsoft SQL Server running on Windows on-premises or in the cloud, on Linux, or in Docker. They can also connect to Azure SQL Database, Azure SQL Data Warehouse, ODBC data sources, flat files, and other data sources including ADO.NET sources, XML files, and OData services.

For more info about the capabilities of SSIS, see SQL Server Integration Services.

Prerequisites

To run SSIS packages on a Linux computer, first you have to install SQL Server Integration Services. SSIS is not included in the installation of SQL Server for Linux computers. For installation instructions, see Install SQL Server Integration Services.

You also have to have a Windows computer to create and maintain packages. The SSIS design and management tools are Windows applications that are not currently available for Linux computers.

Run an SSIS package

To run an SSIS package on a Linux computer, do the following things:

  1. Copy the SSIS package to the Linux computer.
  2. Run the following command: $ dtexec /F \<package name \> /DE <protection password>

Other common SSIS tasks

  • Design packages.

    • Connect to ODBC data sources. With SSIS on Linux CTP 2.1 Refresh and later, SSIS packages can use ODBC connections on Linux. This functionality has been tested with the SQL Server and the MySQL ODBC drivers, but is also expected to work with any Unicode ODBC driver that observes the ODBC specification. At design time, you can provide either a DSN or a connection string to connect to the ODBC data; you can also use Windows authentication. For more info, see the blog post announcing ODBC support on Linux.

    • Paths. Provide Windows-style paths in your SSIS packages. SSIS on Linux does not support Linux-style paths, but maps Windows-style paths to Linux-style paths at run time. Then, for example, SSIS on Linux maps the Windows-style path C:\test to the Linux-style path /test.

  • Deploy packages. You can only store packages in the file system on Linux in this release. The SSIS Catalog database and the legacy SSIS service are not available on Linux for package deployment and storage.

  • Schedule packages. You can use Linux system scheduling tools such as cron to schedule packages. You can't use SQL Agent on Linux to schedule package execution in this release. For more info, see Schedule SSIS packages on Linux with cron.

Limitations and known issues

For detailed info about the limitations and known issues of SSIS on Linux, see Limitations and known issues for SSIS on Linux.

More info about SSIS on Linux

For more info about SSIS on Linux, see the following blog posts:

More info about SSIS

Microsoft SQL Server Integration Services (SSIS) is a platform for building high-performance data integration solutions, including extraction, transformation, and loading (ETL) packages for data warehousing. For more info about SSIS, see SQL Server Integration Services.

SSIS includes the following features:

  • Graphical tools and wizards for building and debugging packages on Windows
  • A variety of tasks for performing workflow functions such as FTP operations, executing SQL statements, and sending e-mail messages
  • A variety of data sources and destinations for extracting and loading data
  • A variety of transformations for cleaning, aggregating, merging, and copying data
  • Application programming interfaces (APIs) for extending SSIS with your own custom scripts and components

To get started with SSIS, download the latest version of SQL Server Data Tools (SSDT).

See also