What's New in Azure Synapse Analytics Archive

This article describes previous months' updates to Azure Synapse Analytics. For the most current month's release, check out Azure Synapse Analytics latest updates. Each update links to the Azure Synapse Analytics blog and an article that provides more information.

Generally available features

The following table lists the features of Azure Synapse Analytics that have transitioned from preview to general availability (GA).

Month Feature Learn more
July 2022 Apache Spark™ 3.2 for Synapse Analytics Apache Spark™ 3.2 for Synapse Analytics is now generally available. Review the official release notes and migration guidelines between Spark 3.1 and 3.2 to assess potential changes to your applications. For more details, read Apache Spark version support and Azure Synapse Runtime for Apache Spark 3.2. Read highlights of what got better in Spark 3.2 in the Azure Synapse Analytics July Update 2022.
July 2022 Apache Spark in Azure Synapse Intelligent Cache feature Intelligent Cache for Spark automatically stores each read within the allocated cache storage space, detecting underlying file changes and refreshing the files to provide the most recent data. To learn more, see how to Enable/Disable the cache for your Apache Spark pool.
June 2022 Map Data tool The Map Data tool is a guided process to help you create ETL mappings and mapping data flows from your source data to Synapse without writing code. To learn more about the Map Data tool, read Map Data in Azure Synapse Analytics.
June 2022 User Defined Functions User defined functions (UDFs) are now generally available. To learn more, read User defined functions in mapping data flows.
May 2022 Azure Synapse Data Explorer connector for Power Automate, Logic Apps, and Power Apps The Azure Data Explorer connector for Power Automate enables you to orchestrate and schedule flows and to send notifications and alerts as part of a scheduled or triggered task. To learn more, read Azure Data Explorer connector for Microsoft Power Automate and Usage examples for Azure Data Explorer connector to Power Automate.
April 2022 Cross-subscription restore for Azure Synapse SQL With the PowerShell Az.Sql module 3.8 update, the Restore-AzSqlDatabase cmdlet can be used for cross-subscription restore of dedicated SQL pools. To learn more, see Blog: Restore a dedicated SQL pool (formerly SQL DW) to a different subscription. This feature is now generally available for dedicated SQL pools (formerly SQL DW) and dedicated SQL pools in a Synapse workspace. What's the difference?
April 2022 Database Designer The database designer allows users to visually create databases within Synapse Studio without writing a single line of code. For more information, see Announcing General Availability of Database Designer. Read more about lake databases and learn How to modify an existing lake database using the database designer.
April 2022 Database Templates New industry-specific database templates were introduced in the Synapse Database Templates General Availability blog. Learn more about Database templates and the improved exploration experience.
April 2022 Synapse Monitoring Operator RBAC role The Synapse Monitoring Operator RBAC (role-based access control) role allows a user persona to monitor the execution of Synapse Pipelines and Spark applications without having the ability to run or cancel the execution of these applications. For more information, review the Synapse RBAC Roles.
March 2022 Flowlets Flowlets help you design portions of new data flow logic, or extract portions of an existing data flow, and save them as separate artifacts inside your Synapse workspace. Then, you can reuse these Flowlets inside other data flows. To learn more, review the Flowlets GA announcement blog post and read Flowlets in mapping data flow.
March 2022 Change Feed connectors Changed data capture (CDC) feed data flow source transformations for Azure Cosmos DB, Azure Blob Storage, ADLS Gen1, ADLS Gen2, and Common Data Model (CDM) are now generally available. By simply checking a box, you can tell ADF to manage a checkpoint automatically for you and only read the latest rows that were updated or inserted since the last pipeline run. To learn more, review the Change Feed connectors GA preview blog post and read Copy and transform data in Azure Data Lake Storage Gen2 using Azure Data Factory or Azure Synapse Analytics.
March 2022 Column level encryption for dedicated SQL pools Column level encryption is now generally available for use on new and existing Azure SQL logical servers with Azure Synapse dedicated SQL pools and dedicated SQL pools in Azure Synapse workspaces. SQL Server Data Tools (SSDT) support for column level encryption for the dedicated SQL pools is available starting with the 17.2 Preview 2 build of Visual Studio 2022.
March 2022 Synapse Spark Common Data Model (CDM) connector The CDM format reader/writer enables a Spark program to read and write CDM entities in a CDM folder via Spark dataframes. To learn more, see how the CDM connector supports reading, writing data, examples, & known issues.
November 2021 PREDICT The T-SQL PREDICT syntax is now generally available for dedicated SQL pools. Get started with the Machine learning model scoring wizard for dedicated SQL pools.
October 2021 Synapse RBAC Roles Synapse role-based access control (RBAC) roles are now generally available. Learn more about Synapse RBAC roles and Azure Synapse role-based access control (RBAC) using PowerShell.

Community

This section is an archive of Azure Synapse Analytics community opportunities and the Azure Synapse Influencer program from Microsoft.

Month Feature Learn more
May 2022 Azure Synapse Influencer program Sign up for our free Azure Synapse Influencer program and get connected with a community of Synapse-users who are dedicated to helping others achieve more with cloud analytics. Register now for our next Synapse Influencer Ask the Experts session. It's free to attend and everyone is welcome to participate and join the discussion on Synapse-related topics. You can watch past recorded Ask the Experts events on the Azure Synapse YouTube channel.
March 2022 Azure Synapse Analytics and Microsoft MVP YouTube video series A joint activity with the Azure Synapse product team and the Microsoft MVP community, a new YouTube MVP Video Series about the Azure Synapse features has launched. See more at the Azure Synapse Analytics YouTube channel.

Apache Spark for Azure Synapse Analytics

This section is an archive of features and capabilities of Apache Spark for Azure Synapse Analytics.

Month Feature Learn more
May 2022 Azure Synapse dedicated SQL pool connector for Apache Spark now available in Python Previously, the Azure Synapse Dedicated SQL Pool Connector for Apache Spark was only available using Scala. Now, the dedicated SQL pool connector for Apache Spark can be used with Python on Spark 3.
May 2022 Manage Azure Synapse Apache Spark configuration With the new Apache Spark configurations feature, you can create a standalone Spark configuration artifact with auto-suggestions and built-in validation rules. The Spark configuration artifact allows you to share your Spark configuration within and across Azure Synapse workspaces. You can also easily associate your Spark configuration with a Spark pool, a Notebook, and a Spark job definition for reuse and minimize the need to copy the Spark configuration in multiple places.
April 2022 Apache Spark 3.2 for Synapse Analytics Apache Spark 3.2 for Synapse Analytics is now available in preview. Review the official Spark 3.2 release notes and migration guidelines between Spark 3.1 and 3.2 to assess potential changes to your applications. For more details, read Apache Spark version support and Azure Synapse Runtime for Apache Spark 3.2.
April 2022 Parameterization for Spark job definition You can now assign parameters dynamically based on variables, metadata, or specifying Pipeline specific parameters for the Spark job definition activity. For more details, read Transform data using Apache Spark job definition.
April 2022 Apache Spark notebook snapshot You can access a snapshot of the Notebook when there's a Pipeline Notebook run failure or when there's a long-running Notebook job. To learn more, read Transform data by running a Synapse notebook and Introduction to Microsoft Spark utilities.
March 2022 Synapse Spark Common Data Model (CDM) connector The CDM format reader/writer enables a Spark program to read and write CDM entities in a CDM folder via Spark dataframes. To learn more, see how the CDM connector supports reading, writing data, examples, & known issues.
March 2022 Performance optimization for Synapse Spark dedicated SQL pool connector New improvements to the Azure Synapse Dedicated SQL Pool Connector for Apache Spark reduce data movement and leverage COPY INTO. Performance tests indicated at least ~5x improvement over the previous version. No action is required from the user to leverage these enhancements. For more information, see Blog: Synapse Spark Dedicated SQL Pool (DW) Connector: Performance Improvements.
March 2022 Support for all Spark Dataframe SaveMode choices The Azure Synapse Dedicated SQL Pool Connector for Apache Spark now supports all four Spark Dataframe SaveMode choices: Append, Overwrite, ErrorIfExists, Ignore. For more information on Spark SaveMode, read the official Apache Spark documentation.
March 2022 Apache Spark in Azure Synapse Analytics Intelligent Cache feature Intelligent Cache for Spark automatically stores each read within the allocated cache storage space, detecting underlying file changes and refreshing the files to provide the most recent data. To learn more on this preview feature, see how to Enable/Disable the cache for your Apache Spark pool or see the blog post.
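
The entry above on Spark Dataframe SaveMode choices names four modes: Append, Overwrite, ErrorIfExists, and Ignore. The sketch below models how those modes behave when the target table does or does not already exist. It is a pure-Python illustration, not the connector's code: the `save` function and the dict that stands in for a database are assumptions made for this example.

```python
from enum import Enum


class SaveMode(Enum):
    APPEND = "Append"
    OVERWRITE = "Overwrite"
    ERROR_IF_EXISTS = "ErrorIfExists"
    IGNORE = "Ignore"


def save(tables, name, rows, mode):
    """Write rows to tables[name] following the given save mode.

    `tables` is a plain dict standing in for a database (an assumption for
    illustration; the real connector writes to a dedicated SQL pool).
    """
    if name not in tables:
        tables[name] = list(rows)          # every mode creates a missing table
    elif mode is SaveMode.APPEND:
        tables[name].extend(rows)          # keep existing rows, add new ones
    elif mode is SaveMode.OVERWRITE:
        tables[name] = list(rows)          # replace existing rows
    elif mode is SaveMode.ERROR_IF_EXISTS:
        raise ValueError(f"table {name!r} already exists")
    elif mode is SaveMode.IGNORE:
        pass                               # leave the existing table untouched


tables = {}
save(tables, "sales", [1, 2], SaveMode.ERROR_IF_EXISTS)  # table is new, so it is created
save(tables, "sales", [3], SaveMode.APPEND)              # rows are added
save(tables, "sales", [9], SaveMode.IGNORE)              # table exists, nothing happens
print(tables["sales"])
save(tables, "sales", [7], SaveMode.OVERWRITE)           # contents are replaced
print(tables["sales"])
```

The modes differ only when the target already exists, which is why the sketch handles the missing-table case first.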

Data integration

This section is an archive of features and capabilities of Azure Synapse Analytics data integration. Learn how to Load data into Azure Synapse Analytics using Azure Data Factory (ADF) or a Synapse pipeline.

Month Feature Learn more
June 2022 SAP CDC connector preview A new data connector for SAP Change Data Capture (CDC) is now available in preview. For more information, see Announcing Public Preview of the SAP CDC solution in Azure Data Factory and Azure Synapse Analytics and SAP CDC solution in Azure Data Factory.
June 2022 Fuzzy join option in Join Transformation Fuzzy matching with a similarity threshold score slider has been added to the Join transformation in Mapping Data Flows.
June 2022 Map Data tool GA We're excited to announce that the Map Data tool is now Generally Available. The Map Data tool is a guided process to help you create ETL mappings and mapping data flows from your source data to Synapse without writing code.
June 2022 Rerun pipeline with new parameters You can now change pipeline parameters when rerunning a pipeline from the Monitoring page without having to return to the pipeline editor. To learn more, read Rerun pipelines and activities.
June 2022 User Defined Functions GA User defined functions (UDFs) in mapping data flows are now generally available (GA).
May 2022 Export pipeline monitoring as a CSV The ability to export pipeline monitoring to CSV and other monitoring improvements have been introduced to ADF.
May 2022 Automatic incremental source data loading from PostgreSQL and MySQL Automatic incremental source data loading from PostgreSQL and MySQL to Synapse SQL and Azure Database is now natively available in ADF.
May 2022 Assert transformation error handling Error handling has now been added to sinks following an assert transformation in mapping data flow. You can now choose whether to output the failed rows to the selected sink or to a separate file.
May 2022 Mapping data flows projection editing In mapping data flows, you can now update source projection column names and column types.
April 2022 Dataverse connector for Synapse Data Flows Dataverse is now a source and sink connector to Synapse Data Flows. You can Copy and transform data from Dynamics 365 (Microsoft Dataverse) or Dynamics CRM using Azure Data Factory or Azure Synapse Analytics.
April 2022 Configurable Synapse Pipelines Web activity response timeout With the response timeout property httpRequestTimeout, you can define a timeout for the HTTP request up to 10 minutes. Web activities work exceptionally well with APIs that follow the asynchronous request-reply pattern, a suggested approach for building scalable web APIs/services.
March 2022 SFTP connector for Synapse data flows A native SFTP connector in Synapse data flows lets you read data from and write data to SFTP using the visual low-code data flows interface in Synapse. To learn more, see Copy and transform data in SFTP server using Azure Data Factory or Azure Synapse Analytics.
March 2022 Data flow improvements to Data Preview Review features added to the Data Preview and debug improvements in Mapping Data Flows.
March 2022 Pipeline script activity You can now Transform data by using the Script activity to invoke SQL commands to perform both DDL and DML.
December 2021 Custom partitions for Synapse link for Azure Cosmos DB Improve query execution times for your Spark queries, by creating custom partitions based on fields frequently used in your queries. To learn more, see Custom partitioning in Azure Synapse Link for Azure Cosmos DB (Preview).

Database Templates & Database Designer

This section is an archive of features and capabilities of database templates and the database designer.

Month Feature Learn more
April 2022 Database Designer The database designer allows users to visually create databases within Synapse Studio without writing a single line of code. For more information, see Announcing General Availability of Database Designer. Read more about lake databases and learn How to modify an existing lake database using the database designer.
April 2022 Database Templates New industry-specific database templates were introduced in the Synapse Database Templates General Availability blog. Learn more about Database templates and the improved exploration experience.
April 2022 Clone lake database In Synapse Studio, you can now clone a database using the action menu available on the lake database. To learn more, read How-to: Clone a lake database.
April 2022 Use wildcards to specify custom folder hierarchies Lake databases sit on top of data that is in the lake and this data can live in nested folders that don't fit into clean partition patterns. You can now use wildcards to specify custom folder hierarchies. To learn more, read How-to: Modify a datalake.
January 2022 New database templates Learn more about new industry-specific Automotive, Genomics, Manufacturing, and Pharmaceuticals templates and get started with database templates in the Synapse Studio gallery.

Developer experience

This section is an archive of quality of life and feature improvements for developers in Azure Synapse Analytics.

Month Feature Learn more
May 2022 Updated Azure Synapse Analyzer Report Learn about the new features in version 2.0 of the Synapse Analyzer report.
April 2022 Azure Synapse Analyzer Report The Azure Synapse Analyzer Report helps you identify common issues that may be present in your database that can lead to performance issues.
April 2022 Reference unpublished notebooks Now, when using %run notebooks, you can enable 'unpublished notebook reference', which will allow you to reference unpublished notebooks. When enabled, notebook run will fetch the current contents in the notebook web cache, meaning the changes in your notebook editor can be referenced immediately by other notebooks without having to be published (Live mode) or committed (Git mode).
March 2022 Code cells with exception to show standard output Now in Synapse notebooks, both standard output and exception messages are shown when a code statement fails for Python and Scala languages. For examples, see Synapse notebooks: Code cells with exception to show standard output.
March 2022 Partial output is available for running notebook code cells Now in Synapse notebooks, you can see anything you write (with println commands, for example) as the cell executes, instead of waiting until it ends. For examples, see Synapse notebooks: Partial output is available for running notebook code cells.
March 2022 Dynamically control your Spark session configuration with pipeline parameters Now in Synapse notebooks, you can use pipeline parameters to configure the session with the notebook %%configure magic. For examples, see Synapse notebooks: Dynamically control your Spark session configuration with pipeline parameters.
March 2022 Reuse and manage notebook sessions Now in Synapse notebooks, it's easy to reuse an active session conveniently without having to start a new one and to see and manage your active sessions in the Active sessions list. To view your sessions, select the 3 dots in the notebook and select Manage sessions. For examples, see Synapse notebooks: Reuse and manage notebook sessions.
March 2022 Support for Python logging Now in Synapse notebooks, anything written through the Python logging module is captured, in addition to the driver logs. For examples, see Synapse notebooks: Support for Python logging.
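
The Support for Python logging entry above says that anything written through the Python `logging` module is captured. The following is a minimal sketch of what such notebook code might look like, using only the standard library. The logger name `my_notebook` and the `count_valid` function are hypothetical, and the explicit stream handler is an assumption so that the example also emits records when run outside a notebook.

```python
import logging

# Logger name is a hypothetical choice for illustration.
logger = logging.getLogger("my_notebook")
logger.setLevel(logging.INFO)

# Attach a stream handler once, so that re-running the cell does not
# produce duplicate log lines.
if not logger.handlers:
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    logger.addHandler(handler)


def count_valid(values):
    """Count non-None values and log a warning for each skipped entry."""
    valid = 0
    for index, value in enumerate(values):
        if value is None:
            logger.warning("skipping missing value at index %d", index)
        else:
            valid += 1
    logger.info("counted %d valid values", valid)
    return valid


result = count_valid([10, None, 30])
```

Passing arguments to the logging call (rather than pre-formatting the string) defers formatting until a handler actually emits the record.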

Machine Learning

This section is an archive of features and improvements to machine learning models in Azure Synapse Analytics.

Month Feature Learn more
June 2022 Distributed Deep Neural Network Training (preview) The Azure Synapse runtime also includes supporting libraries like Petastorm and Horovod, which are commonly used for distributed training. This feature is currently available in preview. The Azure Synapse Analytics runtime for Apache Spark 3.1 and 3.2 also now includes support for the most common deep learning libraries like TensorFlow and PyTorch. To learn more about how to leverage these libraries within your Azure Synapse Analytics GPU-accelerated pools, read the Deep learning tutorials.
November 2021 PREDICT The T-SQL PREDICT syntax is now generally available for dedicated SQL pools. Get started with the Machine learning model scoring wizard for dedicated SQL pools.

Samples and guidance

This section is an archive of guidance and sample project resources for Azure Synapse Analytics.

Month Feature Learn more
June 2022 Azure Orbital analytics with Synapse Analytics We now offer an Azure Orbital analytics sample solution showing an end-to-end implementation of extracting, loading, transforming, and analyzing spaceborne data by using geospatial libraries and AI models with Azure Synapse Analytics. The sample solution also demonstrates how to integrate geospatial-specific Azure AI services models, AI models from partners, and bring-your-own-data models.
June 2022 Migration guides for Oracle A new Microsoft-authored migration guide for Oracle to Azure Synapse Analytics is now available. Design and performance for Oracle migrations.
June 2022 Azure Synapse success by design The Azure Synapse proof of concept playbook provides a guide to scope, design, execute, and evaluate a proof of concept for SQL or Spark workloads.
June 2022 Migration guides for Teradata A new Microsoft-authored migration guide for Teradata to Azure Synapse Analytics is now available. Design and performance for Teradata migrations.
June 2022 Migration guides for IBM Netezza A new Microsoft-authored migration guide for IBM Netezza to Azure Synapse Analytics is now available. Design and performance for IBM Netezza migrations.

Security

This section is an archive of security features and settings in Azure Synapse Analytics.

Month Feature Learn more
April 2022 Synapse Monitoring Operator RBAC role The Synapse Monitoring Operator role-based access control (RBAC) role allows a user persona to monitor the execution of Synapse Pipelines and Spark applications without having the ability to run or cancel the execution of these applications. For more information, review the Synapse RBAC Roles.
March 2022 Enforce minimal TLS version You can now raise or lower the minimum TLS version for dedicated SQL pools in Synapse workspaces. To learn more, see Azure SQL connectivity settings. The workspace managed SQL API can be used to modify the minimum TLS settings.
March 2022 Azure Synapse Analytics now supports Azure Active Directory (Azure AD) only authentication You can now use Azure Active Directory authentication to centrally manage access to all Azure Synapse resources, including SQL pools. You can disable local authentication upon creation or after a workspace is created through the Azure portal.
December 2021 User-Assigned managed identities Now you can use user-assigned managed identities in linked services for authentication in Synapse Pipelines and Dataflows. To learn more, see Credentials in Azure Data Factory and Azure Synapse.
December 2021 Browse ADLS Gen2 folders in the Azure Synapse Analytics workspace You can now browse and secure an Azure Data Lake Storage Gen2 (ADLS Gen2) container or folder in your Azure Synapse Analytics workspace by connecting to a specific container or folder in Synapse Studio.
December 2021 TLS 1.2 enforced for new Synapse Workspaces Starting in December 2021, a requirement for TLS 1.2 has been implemented for new Synapse Workspaces only.

Azure Synapse Data Explorer

Azure Data Explorer (ADX) is a fast and highly scalable data exploration service for log and telemetry data. It offers ingestion from Event Hubs, IoT Hubs, blobs written to blob containers, and Azure Stream Analytics jobs. This section is an archive of features and capabilities of the Azure Synapse Data Explorer and the Kusto Query Language (KQL). Read more about What is the difference between Azure Synapse Data Explorer and Azure Data Explorer? (Preview)

Month Feature Learn more
June 2022 Web Explorer new homepage The new Azure Synapse Web Explorer homepage makes it even easier to get started with Synapse Web Explorer.
June 2022 Web Explorer sample gallery The Web Explorer sample gallery provides end-to-end samples of how customers leverage Synapse Data Explorer for popular use cases such as Logs Data, Metrics Data, IoT data and Basic big data examples.
June 2022 Web Explorer dashboards drill through capabilities You can now use drillthroughs as parameters in your Synapse Web Explorer dashboards.
June 2022 Time Zone settings for Web Explorer The Time Zone settings of the Web Explorer now apply to both the Query results and to the Dashboard. By changing the time zone, the dashboards will be automatically refreshed to present the data with the selected time zone.
May 2022 Synapse Data Explorer live query in Excel Using the new Data Explorer web experience Open in Excel feature, you can now provide access to live results of your query by sharing the connected Excel Workbook with colleagues and team members. You can open the live query in an Excel Workbook and refresh it directly from Excel to get the most up to date query results. To create an Excel Workbook connected to Synapse Data Explorer, start by running a query in the Web experience.
May 2022 Use Managed Identities for external SQL Server tables With Managed Identity support, Synapse Data Explorer table definition is now simpler and more secure. You can now use managed identities instead of entering in your credentials. To learn more about external tables, read Create and alter SQL Server external tables.
May 2022 Azure Synapse Data Explorer connector for Microsoft Power Automate, Logic Apps, and Power Apps New Azure Data Explorer connectors for Power Automate are generally available (GA). To learn more, read Azure Data Explorer connector for Microsoft Power Automate, the Microsoft Logic App and Azure Data Explorer, and the ability to Create Power Apps application to query data in Azure Data Explorer.
May 2022 Dynamic events routing from event hub to multiple databases We now support routing events data from Azure Event Hub/Azure IoT Hub/Azure Event Grid to multiple databases hosted in a single ADX cluster. To learn more about dynamic routing, read Ingest from event hub.
May 2022 Configure a database using a KQL inline script as part of JSON ARM deployment template Running a Kusto Query Language (KQL) script to configure your database can now be done using a script provided inline as a parameter to a JSON ARM template.
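
The Dynamic events routing entry above describes sending events from a single hub to multiple databases in one cluster. The sketch below models that routing decision in plain Python: each event may carry a property naming its target database, and events without one fall back to a default. The `route_events` function and the `Database` property name are assumptions for illustration; see Ingest from event hub for the properties the service actually reads.

```python
from collections import defaultdict


def route_events(events, default_database):
    """Group event bodies by target database.

    Each event is a dict with a 'body' and an optional 'properties' dict.
    The 'Database' property name is an assumption made for this sketch.
    """
    routed = defaultdict(list)
    for event in events:
        target = event.get("properties", {}).get("Database", default_database)
        routed[target].append(event["body"])
    return dict(routed)


events = [
    {"body": "cpu=91", "properties": {"Database": "Metrics"}},
    {"body": "login failed", "properties": {"Database": "Logs"}},
    {"body": "temp=21.5"},  # no routing property: goes to the default
]

routed = route_events(events, default_database="Telemetry")
print(routed)
```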

Azure Synapse Link

Azure Synapse Link is an automated system for replicating data from SQL Server or Azure SQL Database, Azure Cosmos DB, or Dataverse into Azure Synapse Analytics. This section is an archive of news about the Azure Synapse Link feature.

Month Feature Learn more
May 2022 Azure Synapse Link for SQL preview Azure Synapse Link for SQL is in preview for both SQL Server 2022 and Azure SQL Database. The Azure Synapse Link feature provides low- and no-code, near real-time data replication from your SQL-based operational stores into Azure Synapse Analytics. Provide BI reporting on operational data in near real-time, with minimal impact on your operational store. The Azure Synapse Link for SQL preview has been announced. For more information, see Blog: Azure Synapse Link for SQL Deep Dive.

Synapse SQL

This section is an archive of improvements and features in SQL pools in Azure Synapse Analytics.

Month Feature Learn more
June 2022 Result set size limit increase The maximum size of query result sets in serverless SQL pools has been increased from 200 GB to 400 GB.
May 2022 Automatic character column length calculation for serverless SQL pools It's no longer necessary to define character column lengths for serverless SQL pools in the data lake. You can get optimal query performance without having to define the schema, because the serverless SQL pool will use automatically calculated average column lengths and cardinality estimation.
April 2022 Cross-subscription restore for Azure Synapse SQL GA With the PowerShell Az.Sql module 3.8 update, the Restore-AzSqlDatabase cmdlet can be used for cross-subscription restore of dedicated SQL pools. To learn more, see Restore a dedicated SQL pool to a different subscription. This feature is now generally available for dedicated SQL pools (formerly SQL DW) and dedicated SQL pools in a Synapse workspace. What's the difference?
April 2022 Recover SQL pool from dropped server or workspace With the PowerShell Restore cmdlets in Az.Sql and Az.Synapse modules, you can now restore from a deleted server or workspace without filing a support ticket. For more information, see Restore a dedicated SQL pool from a deleted Azure Synapse workspace or Restore a standalone dedicated SQL pools (formerly SQL DW) from a deleted server, depending on your scenario.
March 2022 Column level encryption for dedicated SQL pools Column level encryption is now generally available for use on new and existing Azure SQL logical servers with Azure Synapse dedicated SQL pools and dedicated SQL pools in Azure Synapse workspaces. SQL Server Data Tools (SSDT) support for column level encryption for the dedicated SQL pools is available starting with the 17.2 Preview 2 build of Visual Studio 2022.
March 2022 Parallel execution for CETAS CREATE EXTERNAL TABLE AS SELECT (CETAS) and subsequent SELECT statements now perform better through the use of parallel execution plans. For examples, see Better performance for CETAS and subsequent SELECTs.

Previous monthly updates in Azure Synapse Analytics

What follows is the previous format of monthly news updates for Synapse Analytics.

June 2022 update

General

  • Azure Orbital analytics with Synapse Analytics - We now offer an Azure Orbital analytics sample solution showing an end-to-end implementation of extracting, loading, transforming, and analyzing spaceborne data by using geospatial libraries and AI models with Azure Synapse Analytics. The sample solution also demonstrates how to integrate geospatial-specific Azure AI services models, AI models from partners, and bring-your-own-data models.

  • Azure Synapse success by design - Project success is no accident and requires careful planning and execution. The Synapse Analytics' Success by Design playbooks are now available. The Azure Synapse proof of concept playbook provides a guide to scope, design, execute, and evaluate a proof of concept for SQL or Spark workloads. These guides contain best practices from the most challenging and complex solution implementations incorporating Azure Synapse. To learn more about the Azure Synapse proof of concept playbook, read Success by Design.

SQL

Result set size limit increase - We know that you turn to Azure Synapse Analytics to work with large amounts of data. With that in mind, the maximum size of query result sets in Serverless SQL pools has been increased from 200 GB to 400 GB. This limit is shared between concurrent queries. To learn more about this size limit increase and other constraints, read Self-help for serverless SQL pool.

Synapse data explorer

  • Web Explorer new homepage - The new Synapse Web Explorer homepage makes it even easier to get started with Synapse Web Explorer. The Web Explorer homepage now includes the following sections:

    • Get started – Sample gallery offering example queries and dashboards for popular Synapse Data Explorer use cases.
    • Recommended – Popular learning modules designed to help you master Synapse Web Explorer and KQL.
    • Documentation – Synapse Web Explorer basic and advanced documentation.
  • Web Explorer sample gallery - A great way to learn about a product is to see how it is being used by others. The Web Explorer sample gallery provides end-to-end samples of how customers leverage Synapse Data Explorer for popular use cases such as Logs Data, Metrics Data, IoT data and Basic big data examples. Each sample includes the dataset, well-documented queries, and a sample dashboard. To learn more about the sample gallery, read Azure Data Explorer in 60 minutes with the new samples gallery.

  • Web Explorer dashboards drill through capabilities - You can now add drill through capabilities to your Synapse Web Explorer dashboards. The new drill through capabilities allow you to easily jump back and forth between dashboard pages. This is made possible by using a contextual filter to connect your dashboards. Defining these contextual drill throughs is done by editing the visual interactions of the selected tile in your dashboard. To learn more about drill through capabilities, read Use drillthroughs as dashboard parameters.

  • Time Zone settings for Web Explorer - Being able to display data in different time zones is very powerful. You can now decide to view the data in UTC time, your local time zone, or the time zone of the monitored device/machine. The Time Zone settings of the Web Explorer now apply to both the Query results and to the Dashboard. By changing the time zone, the dashboards will be automatically refreshed to present the data with the selected time zone. For more information on time zone settings, read Change datetime to specific time zone.

Data integration

  • Fuzzy Join option in Join Transformation - Fuzzy matching with a sliding similarity score option has been added to the Join transformation in Mapping Data Flows. You can create inner and outer joins on data values that are similar rather than exact matches! Previously, you would have had to use an exact match. The sliding scale value goes from 60% to 100%, making it easy to adjust the similarity threshold of the match. To learn more about fuzzy joins, read Join transformation in mapping data flow.

  • Map Data [Generally Available] - We're excited to announce that the Map Data tool is now Generally Available. The Map Data tool is a guided process to help you create ETL mappings and mapping data flows from your source data to Synapse without writing code. To learn more about Map Data, read Map Data in Azure Synapse Analytics.

  • Rerun pipeline with new parameters - You can now change pipeline parameters when rerunning a pipeline from the Monitoring page without having to return to the pipeline editor. After running a pipeline with new parameters, you can easily monitor the new run against the old ones without having to toggle between pages. To learn more about rerunning pipelines with new parameters, read Rerun pipelines and activities.

  • User Defined Functions [Generally Available] - We're excited to announce that user defined functions (UDFs) are now Generally Available. With user-defined functions, you can create customized expressions that can be reused across multiple mapping data flows. You no longer have to use the same string manipulation, math calculations, or other complex logic several times. User-defined functions will be grouped in libraries to help developers group common sets of functions. To learn more about user defined functions, read User defined functions in mapping data flows.
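
As a rough illustration of the Fuzzy Join item in the list above, the following sketch pairs rows whose key values meet a similarity threshold instead of requiring equality. It uses Python's standard difflib.SequenceMatcher as a stand-in similarity measure; the scoring that mapping data flows actually applies is not described in this article. The function name fuzzy_inner_join and the sample rows are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score (a stand-in metric, not the service's)."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_inner_join(left, right, key, threshold=80.0):
    """Pair rows whose `key` values score at or above `threshold` percent."""
    # Mirror the 60%-100% sliding scale described for the Join transformation.
    if not 60.0 <= threshold <= 100.0:
        raise ValueError("threshold must be between 60 and 100")
    matches = []
    for l_row in left:
        for r_row in right:
            if similarity(l_row[key], r_row[key]) >= threshold:
                matches.append((l_row, r_row))
    return matches


# Hypothetical sample data: one near-duplicate name, one unrelated name.
customers = [{"name": "Contoso Ltd"}, {"name": "Fabrikam Inc"}]
vendors = [{"name": "Contoso Ltd."}, {"name": "Northwind"}]

pairs = fuzzy_inner_join(customers, vendors, key="name", threshold=80.0)
exact = fuzzy_inner_join(customers, vendors, key="name", threshold=100.0)

print(len(pairs), len(exact))
```

With the threshold at 100, only identical keys match, which corresponds to the exact-match behavior that was previously the only option.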

Machine learning

Distributed Deep Neural Network Training with Horovod and Petastorm [Public Preview] - To simplify the process for creating and managing GPU-accelerated pools, Azure Synapse takes care of pre-installing low-level libraries and setting up all the complex networking requirements between compute nodes. This integration allows users to get started with GPU-accelerated pools within just a few minutes.

Now, Azure Synapse Analytics provides built-in support for deep learning infrastructure. The Azure Synapse Analytics runtime for Apache Spark 3.1 and 3.2 now includes support for the most common deep learning libraries like TensorFlow and PyTorch. The Azure Synapse runtime also includes supporting libraries like Petastorm and Horovod, which are commonly used for distributed training. This feature is currently available in Public Preview.

To learn more about how to leverage these libraries within your Azure Synapse Analytics GPU-accelerated pools, read the Deep learning tutorials.

May 2022 update

The following updates are new to Azure Synapse Analytics this month.

General

Get connected with the new Azure Synapse Influencer program! Join a community of Azure Synapse Influencers who are helping each other achieve more with cloud analytics! The Azure Synapse Influencer program recognizes Azure Synapse Analytics users and advocates who actively support the community by sharing Synapse-related content, announcements, and product news via social media.

SQL

  • Data Warehouse Migration guide for Dedicated SQL Pools in Azure Synapse Analytics - With the benefits that cloud migration offers, we hear that you often look for steps, processes, or guidelines to follow for quick and easy migrations from existing data warehouse environments. We just released a set of Data Warehouse migration guides to make your transition to dedicated SQL Pools in Azure Synapse Analytics easier.

  • Automatic character column length calculation - It's no longer necessary to define character column lengths! Serverless SQL pools let you query files in the data lake without knowing the schema upfront. The best practice was to specify the lengths of character columns to get optimal performance. Not anymore! With this new feature, you can get optimal query performance without having to define the schema. The serverless SQL pool will calculate the average column length for each inferred character column or character column defined as larger than 100 bytes. The schema will stay the same, while the serverless SQL pool will use the calculated average column lengths internally. It will also automatically calculate the cardinality estimation in case there was no previously created statistic.

Apache Spark for Synapse

  • Azure Synapse Dedicated SQL Pool Connector for Apache Spark Now Available in Python - Previously, the Azure Synapse Dedicated SQL Pool connector was only available using Scala. Now, it can be used with Python on Spark 3. The only difference between the Scala and Python implementations is the optional Scala callback handle, which allows you to receive post-write metrics.

    The following are now supported in Python on Spark 3:

    • Read using Azure Active Directory (AD) Authentication or Basic Authentication
    • Write to Internal Table using Azure AD Authentication or Basic Authentication
    • Write to External Table using Azure AD Authentication or Basic Authentication

    To learn more about the connector in Python, read Azure Synapse Dedicated SQL Pool Connector for Apache Spark.

  • Manage Azure Synapse Apache Spark configuration - Apache Spark configuration management is always a challenging task because Spark has hundreds of properties. It is also challenging for you to know the optimal value for Spark configurations. With the new Spark configuration management feature, you can create a standalone Spark configuration artifact with auto-suggestions and built-in validation rules. The Spark configuration artifact allows you to share your Spark configuration within and across Azure Synapse workspaces. You can also easily associate your Spark configuration with a Spark pool, a Notebook, and a Spark job definition for reuse and minimize the need to copy the Spark configuration in multiple places. To learn more about the new Spark configuration management feature, read Manage Apache Spark configuration.

Synapse Data Explorer

  • Synapse Data Explorer live query in Excel - Using the new Data Explorer web experience Open in Excel feature, you can now provide access to live results of your query by sharing the connected Excel Workbook with colleagues and team members. You can open the live query in an Excel Workbook and refresh it directly from Excel to get the most up-to-date query results. To learn more about Excel live query, read Open live query in Excel.

  • Use Managed Identities for External SQL Server Tables - One of the key benefits of Azure Synapse is the ability to bring together data integration, enterprise data warehousing, and big data analytics. With Managed Identity support, Synapse Data Explorer table definition is now simpler and more secure. You can now use managed identities instead of entering your credentials.

    An external SQL table is a schema entity that references data stored outside the Synapse Data Explorer database. Using the Create and alter SQL Server external tables command, you can easily add external SQL tables to the Synapse Data Explorer database schema.

    To learn more about managed identities, read Managed identities overview.

    To learn more about external tables, read Create and alter SQL Server external tables.

  • New KQL Learn module (2 out of 3) is live! - The power of Kusto Query Language (KQL) is its simplicity to query structured, semi-structured, and unstructured data together. To make it easier for you to learn KQL, we are releasing Learn modules. Previously, we released Write your first query with Kusto Query Language. New this month is Gain insights from your data by using Kusto Query Language.

    KQL is the query language used to query Synapse Data Explorer big data. KQL has a fast-growing user community, with hundreds of thousands of developers, data engineers, data analysts, and students.

    Check out the newest KQL Learn module and see for yourself how easy it is to become a KQL master.

    To learn more about KQL, read Kusto Query Language (KQL) overview.

  • Azure Synapse Data Explorer connector for Microsoft Power Automate, Logic Apps, and Power Apps [Generally Available] - The Azure Data Explorer connector for Power Automate enables you to orchestrate and schedule flows, send notifications, and alerts, as part of a scheduled or triggered task. To learn more, read Azure Data Explorer connector for Microsoft Power Automate and Usage examples for Azure Data Explorer connector to Power Automate.

  • Dynamic events routing from event hub to multiple databases - Routing events from Event Hub/IoT Hub/Event Grid is an activity commonly performed by Azure Data Explorer (ADX) users. Previously, you could route events only to a single database per defined connection. If you wanted to route the events to multiple databases, you needed to create multiple ADX cluster connections.

    To simplify the experience, we now support routing events data to multiple databases hosted in a single ADX cluster. To learn more about dynamic routing, read Ingest from event hub.

  • Configure a database using a KQL inline script as part of JSON ARM deployment template - Previously, Azure Data Explorer supported running a Kusto Query Language (KQL) script to configure your database during Azure Resource Manager (ARM) template deployment. Now, you can also pass the script inline as a parameter to a JSON ARM template. To learn more about using a KQL inline script, read Configure a database using a Kusto Query Language script.
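
To make the inline KQL script item above more concrete, here is a minimal sketch that assembles an ARM template containing a script resource and prints it as JSON. The resource type Microsoft.Kusto/clusters/databases/scripts and its scriptContent property follow the Azure Data Explorer ARM schema as best understood here; the apiVersion value, the parameter names, and the KQL command itself are assumptions for illustration, so check them against the linked article before use.

```python
import json

# Hypothetical KQL command to run when the template is deployed.
kql_script = ".create-merge table Events (Timestamp: datetime, Message: string)"

# Script resource carrying the KQL inline instead of pointing at a script file.
script_resource = {
    "type": "Microsoft.Kusto/clusters/databases/scripts",
    "apiVersion": "2022-02-01",  # assumed version; verify before deploying
    "name": "[concat(parameters('clusterName'), '/', "
            "parameters('databaseName'), '/', parameters('scriptName'))]",
    "properties": {
        "scriptContent": kql_script,
        "continueOnErrors": False,
    },
}

# Minimal template wrapper; parameter names are placeholders.
template = {
    "$schema": "https://schema.management.azure.com/schemas/"
               "2019-04-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "clusterName": {"type": "string"},
        "databaseName": {"type": "string"},
        "scriptName": {"type": "string"},
    },
    "resources": [script_resource],
}

rendered = json.dumps(template, indent=2)
print(rendered)
```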

Data Integration

  • Export pipeline monitoring as a CSV - The ability to export pipeline monitoring to CSV has been added after receiving many community requests for the feature. Simply filter the Pipeline runs screen to the data you want and select Export to CSV. To learn more about exporting pipeline monitoring and other monitoring improvements, read Azure Data Factory monitoring improvements.

  • Incremental data loading made easy for Synapse and Azure Database for PostgreSQL and MySQL - In a data integration solution, incrementally loading data after an initial full data load is a widely used scenario. Automatic incremental source data loading is now natively available for Synapse SQL and Azure Database for PostgreSQL and MySQL. Users can "enable incremental extract" and only inserted or updated rows will be read by the pipeline. To learn more about incremental data loading, read Incrementally copy data from a source data store to a destination data store.

  • User-Defined Functions for Mapping Data Flows [Public Preview] - We've heard that you often find yourself repeating the same string manipulation, math calculations, or other complex logic several times. Now, with the new user-defined function feature, you can create customized expressions that can be reused across multiple mapping data flows. User-defined functions will be grouped in libraries to help developers group common sets of functions. Once you've created a data flow library, you can add in your user-defined functions. You can even add in multiple arguments to make your function more reusable. To learn more about user-defined functions, read User defined functions in mapping data flows.

  • Assert Error Handling - Error handling has now been added to sinks following an assert transformation. Assert transformations enable you to build custom rules for data quality and data validation. You can now choose whether to output the failed rows to the selected sink or to a separate file. To learn more about error handling, read Assert data transformation in mapping data flow.

  • Mapping data flows projection editing - New UI updates have been made to source projection editing in mapping data flows. You can now update source projection column names and column types. To learn more about source projection editing, read Source transformation in mapping data flow.
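
The incremental data loading item in the list above can be pictured with the common high-watermark pattern: remember the latest change timestamp already loaded, and on the next run read only rows modified after it. The sketch below is an in-memory illustration of that idea only. How the "enable incremental extract" option tracks changes internally is not described in this article, and the column name modified_at and the sample rows are hypothetical.

```python
from datetime import datetime


def incremental_extract(rows, last_watermark, column="modified_at"):
    """Return rows changed after `last_watermark` and the new watermark."""
    changed = [row for row in rows if row[column] > last_watermark]
    # If nothing changed, keep the previous watermark.
    new_watermark = max((row[column] for row in changed), default=last_watermark)
    return changed, new_watermark


# Hypothetical source table with a last-modified timestamp per row.
source = [
    {"id": 1, "modified_at": datetime(2022, 5, 1, 8, 0)},
    {"id": 2, "modified_at": datetime(2022, 5, 2, 9, 30)},
    {"id": 3, "modified_at": datetime(2022, 5, 3, 12, 15)},
]

# The initial full load already covered everything up to May 1, 23:59.
watermark = datetime(2022, 5, 1, 23, 59)

first_batch, watermark = incremental_extract(source, watermark)
second_batch, watermark = incremental_extract(source, watermark)

print([row["id"] for row in first_batch], [row["id"] for row in second_batch])
```

The second call returns no rows because the watermark has advanced past every row in the source.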

Azure Synapse Link for SQL Server - At Microsoft Build 2022, we announced the Public Preview availability of Azure Synapse Link for SQL, for both SQL Server 2022 and Azure SQL Database. Data-driven, quality insights are critical for companies to stay competitive. The speed to achieve those insights can make all the difference. Traditional ETL and ELT pipelines, which are costly and time-consuming, are no longer enough. With this release, you can now take advantage of low- and no-code, near real-time data replication from your SQL-based operational stores into Azure Synapse Analytics. This makes it easier to run BI reporting on operational data in near real-time, with minimal impact on your operational store. To learn more, read Announcing the Public Preview of Azure Synapse Link for SQL and watch our YouTube video.

Apr 2022 update

The following updates are new to Azure Synapse Analytics this month.

SQL

  • Cross-subscription restore for Azure Synapse SQL is now generally available. Previously, it took many undocumented steps to restore a dedicated SQL pool to another subscription. Now, with the PowerShell Az.Sql module 3.8 update, the Restore-AzSqlDatabase cmdlet can be used for cross-subscription restore. To learn more, see Restore a dedicated SQL pool (formerly SQL DW) to a different subscription.

  • It is now possible to recover a SQL pool from a dropped server or workspace. With the PowerShell Restore cmdlets in Az.Sql and Az.Synapse modules, you can now restore from a deleted server or workspace without filing a support ticket. For more information, read Synapse workspace SQL pools or standalone SQL pools (formerly SQL DW), depending on your scenario.

Synapse database templates and database designer

  • Based on popular customer feedback, we've made significant improvements to our exploration experience when creating a lake database using an industry template. To learn more, read Quickstart: Create a new Lake database leveraging database templates.

  • We've added the option to clone a lake database. This unlocks additional opportunities to manage new versions of databases or support schemas that evolve in discrete steps. You can quickly clone a database using the action menu available on the lake database. To learn more, read How-to: Clone a lake database.

  • You can now use wildcards to specify custom folder hierarchies. Lake databases sit on top of data that is in the lake, and this data can live in nested folders that don't fit into clean partition patterns. Previously, querying lake databases required that your data exist in a simple directory structure that you could browse using the folder icon; you couldn't manually specify the directory structure or use wildcard characters. To learn more, read How-to: Modify a datalake.

Apache Spark for Synapse

  • We are excited to announce the preview availability of Apache Spark™ 3.2 on Synapse Analytics. This new version incorporates user-requested enhancements and resolves 1,700+ Jira tickets. Please review the official release notes for the complete list of fixes and features and review the migration guidelines between Spark 3.1 and 3.2 to assess potential changes to your applications. For more details, read Apache Spark version support and Azure Synapse Runtime for Apache Spark 3.2.

  • Assigning parameters dynamically based on variables, metadata, or specifying Pipeline specific parameters has been one of your top feature requests. Now, with the release of parameterization for the Spark job definition activity, you can do just that. For more details, read Transform data using Apache Spark job definition.

  • We often receive customer requests to access the snapshot of the Notebook when there is a Pipeline Notebook run failure or there is a long-running Notebook job. With the release of the Synapse Notebook snapshot feature, you can now view the snapshot of the Notebook activity run with the original Notebook code, the cell output, and the input parameters. You can also access the snapshot of the referenced Notebook from the referencing Notebook cell output if you refer to other Notebooks through Spark utils. To learn more, read Transform data by running a Synapse notebook and Introduction to Microsoft Spark utilities.

Security

  • The Synapse Monitoring Operator RBAC role is now generally available. Since the GA of Synapse, customers have asked for a fine-grained RBAC (role-based access control) role that allows a user persona to monitor the execution of Synapse Pipelines and Spark applications without having the ability to run or cancel the execution of these applications. Now, customers can assign the Synapse Monitoring Operator role to such monitoring personas. This allows organizations to stay compliant while having flexibility in the delegation of tasks to individuals or teams. Learn more by reading Synapse RBAC Roles.

Data integration

  • Microsoft has added Dataverse as a source and sink connector to Synapse Data Flows so that you can now build low-code data transformation ETL jobs in Synapse directly accessing your Dataverse environment. For more details on how to use this new connector, read Mapping data flow properties.

  • We heard from you that a 1-minute timeout for Web activity was not long enough, especially in cases of synchronous APIs. Now, with the response timeout property 'httpRequestTimeout', you can define timeout for the HTTP request up to 10 minutes. Learn more by reading Web activity response timeout improvements.
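
As a sketch of the Web activity timeout item above, the following builds a Web activity definition with the httpRequestTimeout property and rejects values beyond the 10-minute limit. The hh:mm:ss value format is an assumption based on how pipeline timeouts are usually written; the activity name, URL, and helper function format_timeout are hypothetical.

```python
import json
from datetime import timedelta

MAX_TIMEOUT = timedelta(minutes=10)  # upper bound stated in the update


def format_timeout(timeout: timedelta) -> str:
    """Render a timeout as hh:mm:ss, rejecting values outside (0, 10 min]."""
    if not timedelta(0) < timeout <= MAX_TIMEOUT:
        raise ValueError("timeout must be greater than 0 and at most 10 minutes")
    total_seconds = int(timeout.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# Hypothetical Web activity calling a slow synchronous API.
web_activity = {
    "name": "CallSlowApi",
    "type": "WebActivity",
    "typeProperties": {
        "method": "GET",
        "url": "https://example.com/api/report",
        "httpRequestTimeout": format_timeout(timedelta(minutes=5)),
    },
}

print(json.dumps(web_activity, indent=2))
```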

Developer experience

  • Previously, if you wanted to reference a notebook in another notebook, you could only reference published or committed content. Now, when using %run notebooks, you can enable 'unpublished notebook reference' which will allow you to reference unpublished notebooks. When enabled, notebook run will fetch the current contents in the notebook web cache, meaning the changes in your notebook editor can be referenced immediately by other notebooks without having to be published (Live mode) or committed (Git mode). To learn more, read Reference unpublished notebook.

Mar 2022 update

The following updates are new to Azure Synapse Analytics this month.

Developer Experience

  • Code cells in Synapse notebooks that result in an exception will now show standard output along with the exception message. This feature is supported for Python and Scala languages. To learn more, see the example output when a code statement fails.

  • Synapse notebooks now support partial output when running code cells. To learn more, see the examples at this blog post.

  • You can now dynamically control Spark session configuration for the notebook activity with pipeline parameters. To learn more, see the variable explorer feature of Synapse notebooks.

  • You can now reuse and manage notebook sessions without having to start a new one. You can easily connect a selected notebook to an active session in the list started from another notebook. You can detach a session from a notebook, stop the session, and monitor it. To learn more, see how to manage your active notebook sessions.

  • Synapse notebooks now capture anything written through the Python logging module, in addition to the driver logs. To learn more, see support for Python logging.
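
For the Python logging item above, a notebook cell along the following lines writes messages through the standard logging module. This is a minimal sketch using only the Python standard library; the logger name synapse_demo and the helper safe_divide are hypothetical, and how Synapse surfaces the records next to the driver logs is not shown here.

```python
import logging

logger = logging.getLogger("synapse_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)

# Attach a handler only once so rerunning the cell doesn't duplicate output.
if not logger.handlers:
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)


def safe_divide(a, b):
    """Divide a by b, logging the traceback and returning None on failure."""
    try:
        return a / b
    except ZeroDivisionError:
        # logger.exception records the message at ERROR level with traceback.
        logger.exception("Division failed for a=%s, b=%s", a, b)
        return None


logger.info("Starting computation")
result = safe_divide(10, 0)
logger.info("Finished with result=%s", result)
```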

SQL

  • Column Level Encryption for Azure Synapse dedicated SQL Pools is now Generally Available. With column level encryption, you can use different protection keys for each column with each key having its own access permissions. The data in CLE-enforced columns are encrypted on disk and remain encrypted in memory until the DECRYPTBYKEY function is used to decrypt it. To learn more, see how to encrypt a data column.

  • Serverless SQL pools now support better performance for CETAS (Create External Table as Select) and subsequent SELECT queries. The performance improvements include a parallel execution plan, which results in faster CETAS execution and multiple output files. To learn more, see the CETAS with Synapse SQL article and the blog post.

Apache Spark for Synapse

  • Synapse Spark Common Data Model (CDM) Connector is now Generally Available. The CDM format reader/writer enables a Spark program to read and write CDM entities in a CDM folder via Spark dataframes. To learn more, see how the CDM connector supports reading, writing data, examples, & known issues.

  • Synapse Spark Dedicated SQL Pool (DW) Connector now supports improved performance. The new architecture eliminates redundant data movement and uses COPY-INTO instead of PolyBase. You can authenticate through SQL basic authentication or opt into the Azure Active Directory/Azure AD based authentication method. It now performs roughly 5x better than the previous version. To learn more, see Azure Synapse Dedicated SQL Pool Connector for Apache Spark.

  • Synapse Spark Dedicated SQL Pool (DW) Connector now supports all Spark Dataframe SaveMode choices. It supports Append, Overwrite, ErrorIfExists, and Ignore modes. The Append and Overwrite modes are critical for managing data ingestion at scale. To learn more, see DataFrame write SaveMode support.

  • Accelerate Spark execution speed using the new Intelligent Cache feature. This feature is currently in public preview. Intelligent Cache automatically stores each read within the allocated cache storage space, detecting underlying file changes and refreshing the files to provide the most recent data. To learn more, see how to Enable/Disable the cache for your Apache Spark pool or see the blog post.
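
The four SaveMode choices named in the connector item above differ only in what happens when the target table already exists. The following toy in-memory model illustrates those semantics as Spark defines them in general. It is not the connector's implementation, and the save function and table names are hypothetical.

```python
def save(store, table, rows, mode="errorifexists"):
    """Write `rows` to `table` in the dict `store` using Spark-like save modes."""
    mode = mode.lower()
    exists = table in store
    if mode == "append":
        # Add to existing rows, creating the table if needed.
        store.setdefault(table, []).extend(rows)
    elif mode == "overwrite":
        # Replace whatever was there.
        store[table] = list(rows)
    elif mode == "ignore":
        # Silently skip the write when the table already exists.
        if not exists:
            store[table] = list(rows)
    elif mode == "errorifexists":
        # Fail rather than touch an existing table.
        if exists:
            raise ValueError(f"table {table!r} already exists")
        store[table] = list(rows)
    else:
        raise ValueError(f"unknown save mode: {mode!r}")
    return store


warehouse = {}
save(warehouse, "dbo.sales", [1, 2])                 # creates the table
save(warehouse, "dbo.sales", [3], mode="append")     # rows become [1, 2, 3]
save(warehouse, "dbo.sales", [9], mode="ignore")     # no change
after_ignore = list(warehouse["dbo.sales"])
save(warehouse, "dbo.sales", [7, 8], mode="overwrite")
after_overwrite = list(warehouse["dbo.sales"])

print(after_ignore, after_overwrite)
```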

Security

Data Integration

Feb 2022 update

The following updates are new to Azure Synapse Analytics this month.

SQL

Data integration

Jan 2022 update

The following updates are new to Azure Synapse Analytics this month.

Apache Spark for Synapse

You can now use four new database templates in Azure Synapse. Learn more about Automotive, Genomics, Manufacturing, and Pharmaceuticals templates from the blog post or the database templates article. These templates are currently in public preview and are available within the Synapse Studio gallery.

Machine Learning

Improvements to the Synapse Machine Learning library v0.9.5 (previously called MMLSpark). This release simplifies the creation of massively scalable machine learning pipelines with Apache Spark. To learn more, read the blog post about the new capabilities in this release or see the full release notes.

Security

  • The Azure Synapse Analytics security overview - A whitepaper that covers the five layers of security. The security layers include authentication, access control, data protection, network security, and threat protection. Understand each security feature in detail to implement an industry-standard security baseline and protect your data on the cloud.

  • TLS 1.2 is now required for newly created Synapse Workspaces. To learn more, see how TLS 1.2 provides enhanced security using this article or the blog post. Sign-in attempts to a newly created Synapse workspace from connections using TLS versions lower than 1.2 will fail.

Data Integration

Synapse SQL

December 2021 update

The following updates are new to Azure Synapse Analytics this month.

Apache Spark for Synapse

  • Accelerate Spark workloads with NVIDIA GPU acceleration blog article
  • Mount remote storage to a Synapse Spark pool blog article
  • Natively read & write data in ADLS with Pandas blog article
  • Dynamic allocation of executors for Spark blog article

Machine Learning

  • The Synapse Machine Learning library blog article
  • Getting started with state-of-the-art pre-built intelligent models blog article
  • Building responsible AI systems with the Synapse ML library blog article
  • PREDICT is now GA for Synapse Dedicated SQL pools blog article
  • Simple & scalable scoring with PREDICT and MLFlow for Apache Spark for Synapse blog article
  • Retail AI solutions blog article

Security

  • User-Assigned managed identities now supported in Synapse Pipelines in preview blog article
  • Browse ADLS Gen2 folders in an Azure Synapse Analytics workspace in preview blog article

Data Integration

  • Pipeline Fail activity blog article
  • Mapping Data Flow gets new native connectors blog article
  • More notebook export formats: HTML, Python, and LaTeX blog
  • Three new chart types in notebook view: box plot, histogram, and pivot table blog
  • Reconnect to lost notebook session blog

Integrate

  • Azure Synapse Link for Dataverse blog article
  • Custom partitions for Azure Synapse Link for Azure Cosmos DB in preview blog article
  • Map data tool (Public Preview), a no-code guided ETL experience blog article
  • Quick reuse of spark cluster blog article
  • External Call transformation blog article
  • Flowlets (Public Preview) blog article

November 2021 update

The following updates are new to Azure Synapse Analytics this month.

Synapse Data Explorer

  • Synapse Data Explorer now available in preview blog article

Work with Databases and Data Lakes

  • Introducing Lake databases (formerly known as Spark databases) blog article
  • Lake database designer now available in preview blog article
  • Database Templates and Database Designer blog article

SQL

  • Delta Lake support for serverless SQL is generally available blog article
  • Query multiple file paths using OPENROWSET in serverless SQL blog article
  • Serverless SQL queries can now return up to 200 GB of results blog article
  • Handling invalid rows with OPENROWSET in serverless SQL blog article

Apache Spark for Synapse

  • Accelerate Spark workloads with NVIDIA GPU acceleration blog article
  • Mount remote storage to a Synapse Spark pool blog article
  • Natively read & write data in ADLS with Pandas blog article
  • Dynamic allocation of executors for Spark blog article

Machine Learning

  • The Synapse Machine Learning library blog article
  • Getting started with state-of-the-art pre-built intelligent models blog article
  • Building responsible AI systems with the Synapse ML library blog article
  • PREDICT is now GA for Synapse Dedicated SQL pools blog article
  • Simple & scalable scoring with PREDICT and MLFlow for Apache Spark for Synapse blog article
  • Retail AI solutions blog article

Security

  • User-Assigned managed identities now supported in Synapse Pipelines in preview blog article
  • Browse ADLS Gen2 folders in an Azure Synapse Analytics workspace in preview blog article

Data Integration

  • Azure Synapse Link for Dataverse blog article
  • Custom partitions for Azure Synapse Link for Azure Cosmos DB in preview blog article

October 2021 update

The following updates are new to Azure Synapse Analytics this month.

General

  • Manage your cost with Azure Synapse pre-purchase plans blog article
  • Move your Azure Synapse workspace across Azure regions blog article

Apache Spark for Synapse

  • Spark performance optimizations blog

Security

  • All Synapse RBAC roles are now generally available for use in production blog article
  • Apply User-Assigned Managed Identities for Double Encryption blog article
  • Synapse Administrators now have elevated access to dedicated SQL pools blog article

Governance

  • Synapse workspaces can now automatically push lineage data to Microsoft Purview blog article

Integrate

  • Use Stringify in data flows to easily transform complex data types to strings blog article
  • Control Spark session time-to-live (TTL) in data flows blog article

CI/CD & Git

  • Deploy Synapse workspaces using GitHub Actions blog article
  • More control creating Git branches in Synapse Studio blog article

Developer Experience

  • Enhanced Markdown editing in Synapse notebooks preview blog article
  • Pandas dataframes automatically render as nicely formatted HTML tables blog article
  • Use IPython widgets in Synapse Notebooks blog article
  • Mssparkutils runtime context now available for Python and Scala blog article

Next steps

Get started with Azure Synapse Analytics