What is Azure Synapse Analytics?
Azure Synapse is an enterprise analytics service that accelerates time to insight across data warehouses and big data systems. Azure Synapse brings together the best of SQL technologies used in enterprise data warehousing, Spark technologies used for big data, Pipelines for data integration and ETL/ELT, and deep integration with other Azure services such as Power BI, CosmosDB, and AzureML.
Synapse SQL is a distributed query system for T-SQL that enables data warehousing and data virtualization scenarios and extends T-SQL to address streaming and machine learning scenarios.
- Synapse SQL offers both serverless and dedicated resource models. For predictable performance and cost, create dedicated SQL pools to reserve processing power for data stored in SQL tables. For unplanned or bursty workloads, use the always-available, serverless SQL endpoint.
- Use built-in streaming capabilities to land data from cloud data sources into SQL tables.
- Load data into managed tables to get the best query performance or directly query data in Azure Data Lake Storage or Azure Cosmos DB without need to perform periodic import tasks and wait for them to finish.
- Integrate AI with SQL by using machine learning models to score data using the T-SQL PREDICT function.
Industry-standard Apache Spark
Apache Spark for Azure Synapse deeply and seamlessly integrates Apache Spark--the most popular open source big data engine used for data preparation, data engineering, ETL, and machine learning.
- ML models with SparkML algorithms and AzureML integration for Apache Spark 2.4 with built-in support for Linux Foundation Delta Lake.
- Simplified resource model that frees you from having to worry about managing clusters.
- Fast Spark start-up and aggressive autoscaling.
- Built-in support for .NET for Spark allowing you to reuse your C# expertise and existing .NET code within a Spark application.
Working with your Data Lake
Azure Synapse removes the traditional technology barriers between using SQL and Spark together. You can seamlessly mix and match based on your needs and expertise.
- Tables defined on files in the data lake are seamlessly consumed by either Spark or Hive.
- SQL and Spark can directly explore and analyze Parquet, CSV, TSV, Delta Lake, and JSON files stored in the data lake.
- Fast, scalable data loading between SQL and Spark databases
Built-in data integration
Azure Synapse contains the same Data Integration engine and experiences as Azure Data Factory, allowing you to create rich at-scale ETL pipelines without leaving Azure Synapse Analytics.
- Ingest data from 90+ data sources
- Code-Free ETL with Data flow activities
- Orchestrate notebooks, Spark jobs, stored procedures, SQL scripts, and more
Synapse Studio provides a single way for enterprises to build solutions, maintain, and secure all in a single user experience
- Perform key tasks: ingest, explore, prepare, orchestrate, visualize
- Monitor resources, usage, and users across SQL and Spark
- Use Role-based access control to simplify access to analytics resources
- Write SQL or Spark code and integrate with enterprise CI/CD processes