Data Factory is a cloud-based data integration service that orchestrates and automates the movement and transformation of data. It works across on-premises data sources, cloud data sources, and SaaS applications to ingest, prepare, transform, analyze, and publish data.
Data Factory lets you compose services into managed data flow pipelines that transform data using the following services:
- Azure HDInsight (Hadoop) and Azure Batch for big data computing needs,
- Azure Machine Learning (AML) to operationalize analytics solutions,
- Azure Stream Analytics for complex event processing, etc.
Examples of data sources:
- Azure SQL Data Warehouse (DW) for storing and querying relational data,
- Azure Blob Storage,
- Azure Data Lake, etc.
Data Factory also lets you see the lineage and dependencies between data pipelines, and monitor all of your data flow pipelines.
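To make the idea of lineage and dependencies concrete, here is a minimal sketch in plain Python. It does not use the Data Factory SDK or its pipeline format; the pipeline and dataset names are hypothetical, and the model (each pipeline reads some datasets and writes others) is an assumption made only for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipelines: each one reads some datasets and writes others.
pipelines = {
    "ingest_logs": {"inputs": ["raw_blob_logs"], "outputs": ["staged_logs"]},
    "aggregate_logs": {"inputs": ["staged_logs"], "outputs": ["daily_stats"]},
    "load_warehouse": {"inputs": ["daily_stats"], "outputs": ["dw_stats_table"]},
}


def pipeline_dependencies(pipelines):
    """Map each pipeline to the set of pipelines that produce its inputs."""
    producer = {}
    for name, spec in pipelines.items():
        for dataset in spec["outputs"]:
            producer[dataset] = name
    return {
        name: {producer[ds] for ds in spec["inputs"] if ds in producer}
        for name, spec in pipelines.items()
    }


def dataset_lineage(dataset, pipelines):
    """Return every upstream dataset the given dataset is derived from."""
    produced_by = {
        ds: spec for spec in pipelines.values() for ds in spec["outputs"]
    }
    upstream, stack = set(), [dataset]
    while stack:
        spec = produced_by.get(stack.pop())
        if spec is None:
            continue  # a source dataset: nothing produces it
        for ds in spec["inputs"]:
            if ds not in upstream:
                upstream.add(ds)
                stack.append(ds)
    return upstream


deps = pipeline_dependencies(pipelines)
# A valid execution order: every pipeline runs after the ones it depends on.
run_order = list(TopologicalSorter(deps).static_order())
```

Here `deps` captures the dependencies between pipelines, `run_order` is one order that respects them, and `dataset_lineage("dw_stats_table", pipelines)` walks back to the datasets that table was derived from.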
Sources: