What is Azure Data Factory (ADF)?
- Pre-requisite Knowledge –
Before we look at what Azure Data Factory is, we should have –
- Basic knowledge of cloud computing and its services
- Basic knowledge of Microsoft Azure
- Basic knowledge of Database Management Systems (DBMS) and data warehouses
- Good to know - SQL Server Integration Services (SSIS)
- Background –
Almost every organization stores data in database systems, because data is central to how it operates. This data can be raw, organized, or unorganized. It is difficult, and sometimes impossible, for data scientists to draw insights for business decisions from raw and unorganized data.
Different applications can use the same or different database management systems, with the same or different data models. In large enterprise applications, it is important to integrate these disparate data systems, transform or transfer the data, and load a subset or all of it into another system. The refined data can then be used for business intelligence (BI), which helps the business decide its strategy and priorities and adds value to its business goals.
Azure Data Factory is a managed cloud service that's built for these complex hybrid extract-transform-load (ETL), extract-load-transform (ELT), and data integration projects.
Image Source: Microsoft Docs
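To make the difference between ETL and ELT concrete, here is a minimal sketch in plain Python. It does not use Azure Data Factory at all: an in-memory SQLite database stands in for the target system, and the sample rows, table names, and function names are all hypothetical. In the ETL path the data is cleaned before it is loaded; in the ELT path the raw data is loaded first and cleaned inside the target store with SQL.

```python
import sqlite3

# Hypothetical raw rows, as they might arrive from a source system.
RAW_ORDERS = [
    ("A-1", " alice ", "19.90"),
    ("A-2", "BOB", "5.00"),
    ("A-3", "alice", "0.10"),
]


def clean(row):
    """Transform step: normalize the customer name and parse the amount."""
    order_id, customer, amount = row
    return (order_id, customer.strip().lower(), float(amount))


def totals(conn, table):
    """Return the total amount per customer from the given table."""
    query = (
        f"SELECT customer, ROUND(SUM(amount), 2) FROM {table} "
        "GROUP BY customer ORDER BY customer"
    )
    return conn.execute(query).fetchall()


def run_etl(conn):
    """ETL: transform in the integration layer, then load the cleaned rows."""
    conn.execute(
        "CREATE TABLE orders_etl (order_id TEXT, customer TEXT, amount REAL)"
    )
    cleaned = [clean(row) for row in RAW_ORDERS]
    conn.executemany("INSERT INTO orders_etl VALUES (?, ?, ?)", cleaned)
    return totals(conn, "orders_etl")


def run_elt(conn):
    """ELT: load the raw rows first, then transform inside the target store."""
    conn.execute(
        "CREATE TABLE orders_raw (order_id TEXT, customer TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO orders_raw VALUES (?, ?, ?)", RAW_ORDERS)
    conn.execute(
        "CREATE TABLE orders_elt AS "
        "SELECT order_id, LOWER(TRIM(customer)) AS customer, "
        "CAST(amount AS REAL) AS amount FROM orders_raw"
    )
    return totals(conn, "orders_elt")


conn = sqlite3.connect(":memory:")
etl_result = run_etl(conn)
elt_result = run_elt(conn)
print(etl_result)
print(elt_result)
```

Both paths end with the same per-customer totals. What differs is where the transformation runs, which is the choice a data integration service lets us make per workload.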
- Introduction and how it works –
- Azure Data Factory (ADF) is a service from Microsoft Azure that comes under the ‘Integration’ category.
- It provides the capabilities needed to integrate different database systems.
- ADF is similar to SSIS in that it is used to extract, transform, and load (ETL) data.
- ADF can transform structured, semi-structured and unstructured data.
Image Source: Microsoft Docs
- ADF can connect to cloud data sources as well as on-premises data sources. On-premises sources are reached through a self-hosted integration runtime (called the Data Management Gateway in ADF version 1).
- Once the data is connected and loaded, we can process and transform it using Hive, Pig, and custom C# activities.
- ADF version 1 did not have drag-and-drop features like SSIS; the current version provides a visual authoring interface in the browser.
- A set of processing activities can be combined into a pipeline (also called a workflow), and we can schedule the pipeline as needed.
- We can view the pipeline activities and their data immediately in the Azure portal dashboards.
- This dashboard consists of visual layouts of pipeline and data input/outputs.
- With the help of the dashboards, we can view data relationships, dependencies, and how the data is being processed at the backend.
- We can monitor execution using Azure Monitor logs and its APIs, PowerShell, and the health panels in the portal.
- We can create a data factory with any of the following tools –
- Using the Azure portal
- PowerShell
- Visual Studio – Azure .NET SDK
- REST API
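The idea of combining activities into a scheduled pipeline can be sketched as data. The Python below builds a pipeline definition and a schedule trigger as plain dictionaries and prints them as JSON. The layout (`properties`, `activities`, `dependsOn`, `ScheduleTrigger`, `recurrence`) follows the JSON format described in Microsoft's documentation as I understand it, so treat it as an approximation and check the current docs before using it. All pipeline, activity, and trigger names are hypothetical, the activities omit the datasets and linked services a real definition needs, and nothing here calls Azure. The `execution_order` helper is my own illustration of how dependencies imply a run order; it is not part of ADF.

```python
import json
from graphlib import TopologicalSorter  # Python 3.9+


def activity(name, kind, depends_on=()):
    """Build one activity entry; depends_on names upstream activities."""
    return {
        "name": name,
        "type": kind,
        "dependsOn": [
            {"activity": dep, "dependencyConditions": ["Succeeded"]}
            for dep in depends_on
        ],
    }


# Hypothetical pipeline: copy raw data, transform it, load the result.
pipeline = {
    "name": "DailySalesPipeline",
    "properties": {
        "description": "Illustrative pipeline with three chained activities",
        "activities": [
            activity("CopyRawSales", "Copy"),
            activity("TransformWithHive", "HDInsightHive", ["CopyRawSales"]),
            activity("LoadWarehouse", "Copy", ["TransformWithHive"]),
        ],
    },
}

# Hypothetical schedule trigger that runs the pipeline once a day.
trigger = {
    "name": "DailyTrigger",
    "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
            "recurrence": {
                "frequency": "Day",
                "interval": 1,
                "startTime": "2024-01-01T02:00:00Z",  # assumed start time
                "timeZone": "UTC",
            }
        },
        "pipelines": [
            {
                "pipelineReference": {
                    "type": "PipelineReference",
                    "referenceName": pipeline["name"],
                }
            }
        ],
    },
}


def execution_order(pipeline_def):
    """Order activity names so each one runs after its dependencies."""
    graph = {
        act["name"]: {dep["activity"] for dep in act["dependsOn"]}
        for act in pipeline_def["properties"]["activities"]
    }
    return list(TopologicalSorter(graph).static_order())


order = execution_order(pipeline)
print(json.dumps(pipeline, indent=2))
print(json.dumps(trigger, indent=2))
print(order)
```

A definition like this could then be submitted through any of the tools listed above. Keeping it as data also makes it easy to put under version control.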
- Azure Data Factory Tangible Benefits –
- Integrate structured, semi-structured and unstructured data with the cloud platform.
- Easily perform ETL and ELT, either code-free or using custom business rules.
- Cost-efficient, fully managed, serverless cloud data integration tool that scales on demand.
- It can connect to and integrate with cloud, on-premises, and software as a service (SaaS) platforms.
- SSIS integration runtime to easily move SSIS ETL workloads into the cloud with minimal effort.
- Reduce overhead cost – take advantage of existing investments in SSIS and move SSIS workloads to the cloud with negligible effort.
- Well suited to complex hybrid extract-transform-load (ETL), extract-load-transform (ELT), and data integration projects.
- ADF has prebuilt connectors for ingesting data from a wide range of sources, so the data can then be transformed.
- Use the visual interface or write your own code in Python, .NET, or ARM templates to build pipelines.
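- We can integrate Azure DevOps with ADF for source control and continuous integration and delivery (CI/CD), while visual monitoring and alerts are available through the ADF monitoring views and Azure Monitor.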
- Reference Links –
- https://azure.microsoft.com/en-in/resources/videos/azure-data-factory-overview/
- https://azure.microsoft.com/en-in/services/data-factory/
- https://docs.microsoft.com/en-us/azure/data-factory/introduction
- https://www.jamesserra.com/archive/2014/11/what-is-azure-data-factory/
- https://blog.5nine.com/what-is-azure-data-factory-and-how-can-it-help
- https://azure.microsoft.com/en-in/services/devops/
Conclusion - In this article, we have learned what Azure Data Factory is, how it works, and what benefits it provides.