Apache Airflow using Google Cloud Composer: Introduction

Learn Apache Airflow with Google Cloud Composer, without any local install, so the focus stays on Airflow topics.

Apache Airflow is an open-source platform to programmatically author, schedule, and monitor workflows.

What you’ll learn

  • Understand how Airflow automates task workflows.
  • Airflow architecture: on-premises (local install), cloud, single-node, and multi-node.
  • How to use connections to reach different systems and automate data pipelines.
  • What Google Cloud BigQuery is and, briefly, how it can be used in data warehousing as well as in an Airflow DAG.
  • Master core functionalities such as DAGs, Operators, and Tasks through hands-on demonstrations.
  • Understand advanced functionalities like XCom, branching, and SubDAGs through hands-on demonstrations.
  • Get an overview of SLAs and the Kubernetes executor in Apache Airflow.
  • The source files of the Python DAG programs (9 .py files) used in the demonstrations are available for students to download and practice with.

Course Content

  • Course Overview –> 1 lecture • 10min.
  • Introduction –> 3 lectures • 23min.
  • What is Airflow – Directed Acyclic Graph (DAG) & operators? –> 1 lecture • 6min.
  • Apache Airflow architecture –> 2 lectures • 12min.
  • Google Cloud Platform: Cloud Composer used as Apache Airflow –> 3 lectures • 22min.
  • Understanding Apache Airflow program structure –> 1 lecture • 4min.
  • Activity 1: Create and submit an Apache Airflow DAG program –> 1 lecture • 13min.
  • Activity 2: Using template functionality in an Apache Airflow program –> 2 lectures • 11min.
  • Using Variables in Apache Airflow –> 2 lectures • 13min.
  • Activity 4: Calling a Bash script in a different folder or on a different machine –> 2 lectures • 11min.
  • Creating connections in Apache Airflow –> 4 lectures • 27min.
  • Using Google Cloud BigQuery with Apache Airflow data pipelines –> 4 lectures • 25min.
  • Cross communication between tasks – XCom –> 2 lectures • 12min.
  • Branching based on conditions –> 2 lectures • 12min.
  • SubDAGs –> 2 lectures • 9min.
  • Other functionalities –> 3 lectures • 13min.
  • Apache Airflow vs. Apache Beam and Spark – Quick comparison –> 1 lecture • 5min.
  • Bonus –> 1 lecture • 4min.

Requirements

Cloud Composer is a fully managed workflow orchestration service that empowers you to author, schedule, and monitor pipelines that span across clouds and on-premises data centers. Built on the popular Apache Airflow open-source project and operated using the Python programming language, Cloud Composer is free from lock-in and easy to use.

With Apache Airflow hosted in the cloud (Google Cloud Composer), learners can focus on Apache Airflow's functionality and learn quickly, without the hassle of installing Apache Airflow locally on a machine.

Cloud Composer pipelines are configured as directed acyclic graphs (DAGs) using Python, making it easy for users of any experience level to author and schedule a workflow. One-click deployment yields instant access to a rich library of connectors and multiple graphical representations of your workflow in action, increasing pipeline reliability by making troubleshooting easy.
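
To make the idea of a directed acyclic graph concrete, here is a minimal sketch in plain Python. It does not use Airflow's own API (in Airflow, tasks are operators attached to a DAG object and dependencies are usually declared with the >> operator). The task names and the helper functions run_order and is_acyclic are hypothetical, chosen only for illustration; the sketch relies on the standard-library graphlib module, available from Python 3.9.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "extract": set(),
    "clean": {"extract"},
    "validate": {"extract"},
    "load": {"clean", "validate"},
}


def run_order(dependencies):
    """Return one valid execution order for the tasks.

    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


def is_acyclic(dependencies):
    """Return True if the dependency graph has no cycles."""
    try:
        run_order(dependencies)
    except CycleError:
        return False
    return True


order = run_order(pipeline)
print(order)  # "extract" comes first and "load" comes last
```

A scheduler needs the graph to be acyclic because a cycle would leave no task that can start first; in this sketch, is_acyclic({"a": {"b"}, "b": {"a"}}) returns False for exactly that reason.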

This course is designed with beginners in mind, that is, first-time users of Cloud Composer / Apache Airflow. Each topic starts with a presentation that discusses the concepts and then moves to a hands-on demonstration to reinforce the understanding.

The source files of the Python DAG programs used in the demonstrations (9 Python files) are available for students to download for further practice.

Happy learning!!!
