Azure Synapse Analytics

Azure Synapse Analytics is composed of five elements:

  • Azure Synapse SQL pool: Synapse SQL offers both serverless and dedicated resource models, both built on a node-based architecture. For predictable performance and cost, you can create dedicated SQL pools. For unplanned or bursty workloads, you can use the always-available serverless SQL endpoint:
    • Azure Synapse Dedicated SQL Pool
    • Azure Synapse Serverless SQL Pool
  • Azure Synapse Spark pool: This pool is a cluster of servers that runs Apache Spark, the open source big data engine used for data preparation, data engineering, ETL, and machine learning. You write your data processing logic in one of the four supported languages: Python, Scala, SQL, or C# (via .NET for Apache Spark).
  • Azure Synapse Pipelines: Azure Synapse Pipelines builds on the capabilities of Azure Data Factory. It is a cloud-based ETL and data integration service that lets you create data-driven workflows for orchestrating data movement and transforming data at scale. You can include activities that transform the data as it is transferred, or combine data from multiple sources.
  • Azure Synapse Link: This component allows you to connect to Azure Cosmos DB. You can use it to perform near real-time analytics over the operational data stored in an Azure Cosmos DB database.
  • Azure Synapse Studio: This element is a web-based IDE that can be used centrally to work with all capabilities of Azure Synapse Analytics. You can use Azure Synapse Studio to create SQL and Spark pools, define and run pipelines, and configure links to external data sources.
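
To make the serverless SQL pool described in the list above a little more concrete, here is a minimal sketch of how a query against files in a data lake might be put together. It only builds the T-SQL text around OPENROWSET, the function serverless SQL pools use to read files in place; it does not connect to any workspace. The helper name build_openrowset_query, the storage URL, and the short list of accepted formats are assumptions made for illustration, not part of any Azure SDK.

```python
# Illustrative sketch only: builds the text of a serverless SQL pool query.
# The helper name, the example URL, and the accepted formats are assumptions.

ALLOWED_FORMATS = {"PARQUET", "DELTA"}  # assumed subset, kept small on purpose


def build_openrowset_query(storage_url: str,
                           file_format: str = "PARQUET",
                           top: int = 10) -> str:
    """Return a T-SQL statement that reads files in place via OPENROWSET."""
    fmt = file_format.upper()
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format for this sketch: {file_format}")
    if "'" in storage_url:
        # Refuse quotes so the URL cannot break out of the SQL string literal.
        raise ValueError("storage_url must not contain single quotes")
    if top <= 0:
        raise ValueError("top must be a positive integer")
    return (
        f"SELECT TOP {top} *\n"
        f"FROM OPENROWSET(\n"
        f"    BULK '{storage_url}',\n"
        f"    FORMAT = '{fmt}'\n"
        f") AS [result];"
    )


if __name__ == "__main__":
    # Hypothetical storage account and path, for illustration only.
    url = "https://example.dfs.core.windows.net/data/sales/*.parquet"
    print(build_openrowset_query(url, top=5))
```

The resulting statement could be pasted into a SQL script in Azure Synapse Studio and run against the built-in serverless endpoint, assuming the workspace has access to the storage account.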
