What's new in Azure Data Factory
The Azure Data Factory service is improved on an ongoing basis. To help you stay up to date with the most recent developments, this article provides information about:
- The latest releases
- Known issues
- Bug fixes
- Deprecated functionality
- Plans for changes
This page is updated monthly, so revisit it regularly.
November 2021
| Service Category | Service improvements | Details |
| CI/CD | GitHub integration improvements | Improvements to the ADF and GitHub integration remove the limit of 1,000 data factory resources per resource type (datasets, pipelines, etc.). For large data factories, this helps mitigate the impact of the GitHub API rate limit. Learn more |
| Data Flow | Set a custom error code and error message with the Fail activity | The Fail activity enables ETL developers to set a custom error code and error message for an Azure Data Factory pipeline. Learn more |
| | External call transformation | The mapping data flows External Call transformation enables ETL developers to use transformations and data enrichments provided by REST endpoints or third-party API services. Learn more |
| | Synapse quick re-use | When executing data flows in Synapse Analytics, use the time-to-live (TTL) feature. TTL uses the quick re-use feature so that sequential data flows execute within a few seconds. You can set the TTL when configuring an Azure integration runtime. Learn more |
| Data Movement | Copy activity supports reading data from FTP/SFTP without chunking | Copy activity can automatically determine the file length or the relevant offset to read when copying data from an FTP or SFTP server. With this capability, Azure Data Factory connects to the FTP/SFTP server to determine the file length. Once the length is known, Azure Data Factory divides the file into multiple chunks and reads them in parallel. Learn more |
| | UTF-8 without BOM support in Copy activity | Copy activity supports writing data with encoding type UTF-8 without BOM for JSON and delimited text datasets. |
| | Multi-character column delimiter support | Copy activity supports using multi-character column delimiters (for delimited text datasets). |
| Integration Runtime | Run any process anywhere in 3 easy steps with SSIS in Azure Data Factory | The linked article shows how to combine Azure Data Factory and SSIS capabilities in a pipeline. A sample SSIS package (with parameterized properties) is provided to help you get started. Using Azure Data Factory Studio, you can drag and drop the SSIS package into a pipeline and use it as part of an Execute SSIS Package activity. This lets you run the Azure Data Factory pipeline (with the SSIS package) on self-hosted or SSIS integration runtimes (SHIR/SSIS IR). By providing run-time parameter values, you can use Azure Data Factory and SSIS together to run any process (any executable, such as an application, program, utility, or batch file) anywhere in 3 easy steps. Learn more |
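
The FTP/SFTP entry in the table above describes determining a file's length, dividing the file into chunks, and reading the chunks in parallel. The following Python sketch illustrates that general idea only; it is not Azure Data Factory's implementation. The function names `plan_chunks` and `read_in_parallel`, the chunk size, the worker count, and the in-memory byte string standing in for a remote file are all assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def plan_chunks(file_length: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Split a file of file_length bytes into (offset, length) ranges."""
    if file_length < 0 or chunk_size <= 0:
        raise ValueError("file_length must be >= 0 and chunk_size must be > 0")
    return [
        (offset, min(chunk_size, file_length - offset))
        for offset in range(0, file_length, chunk_size)
    ]


def read_in_parallel(data: bytes, chunk_size: int, workers: int = 4) -> bytes:
    """Read each planned range concurrently and reassemble the parts in order."""

    def read_range(chunk: Tuple[int, int]) -> bytes:
        offset, length = chunk
        # Hypothetical stand-in for a ranged read against a remote server.
        return data[offset:offset + length]

    chunks = plan_chunks(len(data), chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so the parts line up by offset.
        return b"".join(pool.map(read_range, chunks))
```

For a 10-byte file and a chunk size of 4, `plan_chunks(10, 4)` returns `[(0, 4), (4, 4), (8, 2)]`: knowing the length up front is what allows every range to be planned before any read starts.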
October 2021
| Service Category | Service improvements | Details |
| Data Flow | Azure Data Explorer and Amazon Web Services S3 connectors | The Microsoft Data Integration team has just released two new connectors for mapping data flows. If you are using Azure Synapse, you can now connect directly to your AWS S3 buckets for data transformations. In both Azure Data Factory and Azure Synapse, you can now natively connect to your Azure Data Explorer clusters in mapping data flows. Learn more |
| | Power Query activity leaves preview for General Availability (GA) | Microsoft has released the Azure Data Factory Power Query pipeline activity as Generally Available. This feature provides scaled-out data prep and data wrangling for citizen integrators inside the ADF browser UI, giving data engineers an integrated experience. The Power Query data wrangling feature in ADF provides a powerful, easy-to-use pipeline capability to solve your most complex data integration and ETL patterns in a single service. Learn more |
| | New Stringify data transformation in mapping data flows | Mapping data flows adds a new data transformation called Stringify to make it easy to convert complex data types like structs and arrays into string form that can be sent to structured output destinations. Learn more |
| Integration Runtime | Azure Data Factory Managed VNet goes GA | You can now provision the Azure Integration Runtime as part of a managed virtual network and use private endpoints to securely connect to supported data stores. Data traffic goes through Azure Private Link, which provides secured connectivity to the data source. In addition, it prevents data exfiltration to the public internet. Learn more |
| | Express VNet injection for SSIS integration runtime (Public Preview) | The SSIS integration runtime now supports express VNet injection. Learn more: Overview of VNet injection for SSIS integration runtime; Standard vs. express VNet injection for SSIS integration runtime; Express VNet injection for SSIS integration runtime |
| Security | Azure Key Vault integration improvement | We have improved Azure Key Vault integration by adding user-selectable drop-downs for choosing secret values in the linked service. This increases productivity and removes the need for users to type in secrets, which could result in human error. |
| | Support for user-assigned managed identity in Azure Data Factory | Credential safety is crucial for any enterprise. With that in mind, the Azure Data Factory (ADF) team is committed to making the data engineering process secure yet simple for data engineers. We are excited to announce support for user-assigned managed identity (Preview) in all connectors and linked services that support Azure Active Directory (Azure AD) based authentication. Learn more |
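
The Stringify entry in the table above describes converting complex types such as structs and arrays into string form. As a rough illustration of that concept only, the Python sketch below serializes nested columns of a row to JSON text. It does not use data flow expression syntax, and the helper name `stringify_columns`, the sample row, and the choice of JSON as the string form are assumptions made for this example.

```python
import json


def stringify_columns(row: dict, columns: list) -> dict:
    """Return a copy of row with the named complex columns serialized to JSON text."""
    out = dict(row)
    for name in columns:
        # Structs (dicts) and arrays (lists) become compact JSON strings.
        out[name] = json.dumps(row[name], separators=(",", ":"), sort_keys=True)
    return out


if __name__ == "__main__":
    # Hypothetical sample row with one struct column and one array column.
    sample = {"id": 1, "address": {"city": "Oslo", "zip": "0150"}, "tags": ["a", "b"]}
    print(stringify_columns(sample, ["address", "tags"]))
```

After the conversion, every value in the row is a scalar or a string, which is the shape that a flat, structured destination such as a delimited text file or a relational table column expects.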
September 2021
| Service Category | Service improvements | Details |
| Continuous integration and delivery (CI/CD) | Expanded CI/CD capabilities | You can now create a new Git branch based on any other branch in Azure Data Factory. Learn more |
| Data Movement | Amazon Relational Database Service (RDS) for Oracle sources | The Amazon RDS for Oracle sources connector is now available in both Azure Data Factory and Azure Synapse. Learn more |
| | Amazon RDS for SQL Server sources | The Amazon RDS for SQL Server sources connector is now available in both Azure Data Factory and Azure Synapse. Learn more |
| | Support parallel copy from Azure Database for PostgreSQL | The Azure Database for PostgreSQL connector now supports parallel copy operations. Learn more |
| Data Flow | Use Azure Data Lake Storage (ADLS) Gen2 to execute pre- and post-processing commands | Hadoop Distributed File System (HDFS) pre- and post-processing commands can now be executed using ADLS Gen2 sinks in data flows. Learn more |
| | Edit data flow properties for existing instances of the Azure Integration Runtime (IR) | The Azure Integration Runtime (IR) has been updated to allow editing of data flow properties for existing IRs. You can now modify data flow compute properties without needing to create a new Azure IR. Learn more |
| | TTL setting for Azure Synapse to improve pipeline activities execution startup time | Azure Synapse Analytics has added TTL to the Azure Integration Runtime so that your data flow pipeline activities can begin execution in seconds, greatly reducing the runtime of your data flow pipelines. Learn more |
| Integration Runtime | Azure Data Factory Managed VNet goes GA | You can now provision the Azure Integration Runtime as part of a managed virtual network and use private endpoints to securely connect to supported data stores. Data traffic goes through Azure Private Link, which provides secured connectivity to the data source. In addition, it prevents data exfiltration to the public internet. Learn more |
| Orchestration | Operationalize and Provide SLA for Data Pipelines | The new Elapsed Time Pipeline Run metric, combined with Data Factory Alerts, empowers data pipeline developers to better deliver SLAs to their customers. You tell us how long a pipeline should run, and we notify you proactively when the pipeline runs longer than expected. Learn more |
| | Fail Activity (Public Preview) | The new Fail activity allows you to throw an error in a pipeline intentionally for any reason. For example, you might use the Fail activity if a Lookup activity returns no matching data or a Custom activity finishes with an internal error. Learn more |
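
The Fail activity entry in the table above describes throwing an error with a custom message and error code. The Python sketch below builds a dictionary shaped like such an activity and prints it as JSON. The property names (`type`, `typeProperties`, `message`, `errorCode`) reflect the author's understanding of the pipeline JSON shape and should be checked against the linked documentation; the helper name `build_fail_activity`, the activity name, and the sample values are hypothetical.

```python
import json


def build_fail_activity(name: str, message: str, error_code: str) -> dict:
    """Build a dictionary shaped like a Fail activity entry in a pipeline definition."""
    if not message or not error_code:
        raise ValueError("message and error_code are both required")
    return {
        "name": name,
        # Assumed property names; verify them against the Fail activity documentation.
        "type": "Fail",
        "typeProperties": {
            "message": message,
            "errorCode": error_code,
        },
    }


if __name__ == "__main__":
    # Hypothetical example: fail the pipeline when a Lookup activity finds nothing.
    activity = build_fail_activity(
        "FailOnEmptyLookup", "Lookup returned no matching data", "404"
    )
    print(json.dumps(activity, indent=2))
```

Generating the fragment from code rather than editing it by hand keeps the message and error code in one place when several pipelines need the same failure behavior.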
August 2021
| Service Category | Service improvements | Details |
| Continuous integration and delivery (CI/CD) | CI/CD improvements with GitHub support in Azure Government and Azure China | We have added support for GitHub in Azure Government and Azure China. Learn more |
| Data Movement | Azure Cosmos DB's API for MongoDB connector supports versions 3.6 and 4.0 in Azure Data Factory | The Azure Data Factory connector for Azure Cosmos DB's API for MongoDB now supports server versions 3.6 and 4.0. Learn more |
| | Enhanced use of the COPY statement to load data into Azure Synapse Analytics | The Azure Data Factory Azure Synapse Analytics connector now supports staged copy and a copy source with *.* as wildcardFilename for the COPY statement. Learn more |
| Data Flow | REST endpoints are available as source and sink in Data Flow | Data flows in Azure Data Factory and Azure Synapse Analytics now support REST endpoints as both a source and sink with full support for both JSON and XML payloads. Learn more |
| Integration Runtime | Diagnostic tool is available for self-hosted integration runtime | A diagnostic tool for the self-hosted integration runtime is designed to provide a better user experience and help users find potential issues. The tool runs a series of test scenarios on the self-hosted integration runtime machine, and every scenario has typical health check cases for common issues. Learn more |
| Orchestration | Custom Event Trigger with Advanced Filtering Option is GA | You can now create a trigger that responds to a Custom Topic posted to Event Grid. Additionally, you can use Advanced Filtering to get fine-grained control over which events to respond to. Learn more |
July 2021
| Service Category | Service improvements | Details |
| Data Movement | Get metadata-driven data ingestion pipelines on ADF Copy Data Tool within 10 minutes (Public Preview) | You can now build large-scale data copy pipelines with a metadata-driven approach in the Copy Data Tool (Public Preview) within 10 minutes. Learn more |
| Data Flow | New map functions added in data flow transformation functions | A new set of data flow transformation functions has been added to enable data engineers to easily generate, read, and update map data types and complex map structures. Learn more |
| Integration Runtime | 5 new regions available in Azure Data Factory Managed VNET (Public Preview) | These 5 new regions (China East2, China North2, US Gov Arizona, US Gov Texas, US Gov Virginia) are available in Azure Data Factory managed virtual network (Public Preview). Learn more |
| Developer Productivity | ADF homepage improvements | The Data Factory home page has been redesigned with better contrast and reflow capabilities. Additionally, a few sections have been introduced on the homepage to help you improve productivity in your data integration journey. Learn more |
| | New landing page for Azure Data Factory Studio | A new landing page is available for the Data Factory blade in the Azure portal. Learn more |
June 2021
| Service Category | Service improvements | Details |
| Data Movement | New user experience with Azure Data Factory Copy Data Tool | The redesigned Copy Data Tool is now available with an improved data ingestion experience. Learn more |
| | MongoDB and MongoDB Atlas are Supported as both Source and Sink | This improvement supports copying data between any supported data store and MongoDB or MongoDB Atlas database. Learn more |
| | Always Encrypted is supported for Azure SQL Database, Azure SQL Managed Instance, and SQL Server connectors as both source and sink | Always Encrypted is available in Azure Data Factory for Azure SQL Database, Azure SQL Managed Instance, and SQL Server connectors for copy activity. Learn more |
| | Setting custom metadata is supported in copy activity when sinking to ADLS Gen2 or Azure Blob | When writing to ADLS Gen2 or Azure Blob, copy activity supports setting custom metadata or storage of the source file's last modified info as metadata. Learn more |
| Data Flow | SQL Server is now supported as a source and sink in data flows | SQL Server is now supported as a source and sink in data flows. Follow the link for instructions on how to configure your networking using the Azure Integration Runtime managed VNET feature to talk to your on-premises and cloud VM-based SQL Server instances. Learn more |
| | Dataflow Cluster quick reuse is now enabled by default for all new Azure Integration Runtimes | ADF is happy to announce the general availability of the popular data flow quick start-up reuse feature. All new Azure Integration Runtimes will now have quick reuse enabled by default. Learn more |
| | Power Query activity (Public Preview) | You can now build complex field mappings to your Power Query sink using Azure Data Factory data wrangling. The sink is now configured in the pipeline in the Power Query (Public Preview) activity to accommodate this update. Learn more |
| | Updated data flows monitoring UI in Azure Data Factory | Azure Data Factory has a new update for the monitoring UI to make it easier to view your data flow ETL job executions and quickly identify areas for performance tuning. Learn more |
| SQL Server Integration Services (SSIS) | Run any SQL anywhere in 3 simple steps with SSIS in Azure Data Factory | This post provides 3 simple steps to run any SQL statements/scripts anywhere with SSIS in Azure Data Factory. |