Apache Airflow: Complete Hands-On Beginner to Advanced Class

Learn Apache Airflow step-by-step. Real-Life Data Pipelines & Quizzes Included. Learn by Doing!

Created by
Alexandra Abbas, Data Engineer
Rating
4.8/5

What you'll learn

  • Core and advanced concepts through real-world examples
  • Architecture components of Apache Airflow
  • How to set up connections to external resources
  • How to load and analyse data in a Data Warehouse using Airflow
  • How to schedule PySpark jobs using Apache Airflow
  • How to extend Airflow with custom operators and sensors
  • How to test Airflow DAGs and operators
  • How to deploy Airflow instances with different executors
  • How to set up error tracking and monitoring

Description

Hi there, my name is Alexandra Abbas. I’m an Apache Airflow Contributor and a Google Cloud Certified Data Engineer & Architect with over 3 years of experience as a Data Engineer.

Are you struggling to learn Apache Airflow on your own? In this course I will teach you Airflow in a practical manner: every lecture comes with a full coding screencast. By the end of the course you will be able to use Airflow professionally and add Airflow to your CV.

This course includes over 50 lectures and more than 4 hours of video, plus quizzes, coding exercises and 2 major real-life projects that you can add to your GitHub portfolio!

You will learn:

  • How to install and set up Airflow on your machine
  • Basic and advanced Airflow concepts
  • How to develop complex real-life data pipelines
  • How to interact with Google Cloud from your Airflow instance
  • How to extend Airflow with custom operators and sensors
  • How to test Airflow pipelines and operators
  • How to monitor your Airflow instance using Prometheus and Grafana
  • How to track errors with Sentry
  • How to set up and run Airflow in production

This course is for beginners. You do not need any previous knowledge of Apache Airflow, Data Engineering or Google Cloud. We will start right at the beginning and work our way through step by step.

You will get access to over 50 lectures, plus the corresponding cheat sheets, datasets and code base for the lectures!

Course content

54 lectures, 5h total length

Your Airflow Journey (2:36) Preview
What is Apache Airflow? (1:40) Preview
Comparing Airflow to Other Tools (0:45) Preview
Course Prerequisites Preview
Extra: Install Conda (Virtual Environment Manager) (2:43)
Quiz 1: Airflow Basics

Components of Airflow (2:12)
Install Airflow on MacOS (5:45)
Install Airflow on Linux
Install Airflow on Windows (19:30)
Install and Run Airflow with Docker
Run Airflow Locally (3:22)
Introduction to the Airflow UI (4:05)
Introduction to the Airflow CLI (2:37)
Quiz 2: Airflow Setup

What are DAGs? (4:49)
What are Default Arguments? (2:00)
What are Tasks and Operators? (5:21)
How to Define Dependencies? (5:16)
Quiz 3: Core Concepts

Use Case (2:06)
Set Up (4:55)
Connections (2:50)
Load Data from Storage to BigQuery (8:01)
Run SQL Query in BigQuery (9:47)
Use Hook to List Storage Objects (5:42)
Cross-Task Communication (XComs) (6:59)
Jinja Templating and Macros (5:33)
Variables (4:44)
Quiz 4: Advanced Concepts

Use Case (5:06)
Set Up (2:36)
Branching (4:33) Preview
Create Dataproc Hadoop Cluster (4:49)
Submit a PySpark Job (4:13)
Subdags (9:19)
Trigger Rules (3:55)
DAG Documentation (6:39)
Quiz 5: Advanced Concepts

Create a Custom Operator (17:38)
Create a Custom Sensor (8:33)
Run Custom Plugins (7:20)
Quiz 6: Custom Plugins

Load Test DAGs (6:00)
Unit Test DAGs and Operators (12:08)
Unit Test Custom Operators (10:25)
Quiz 7: Testing

Executors (7:45)
Configure Local Executor (21:50)
Configure Celery Executor
Service Level Agreements (SLAs) (3:48)
Security: Authentication, Roles, Encryption (7:10)
Write Logs to a Remote Location (4:32)
Monitor Airflow with StatsD, Prometheus and Grafana (19:26)
Error Tracking with Sentry (3:29)
Managed Airflow Services
Quiz 8: Airflow in Production

Who this course is for

  • Data Engineers
  • Data Scientists
  • Python Developers Interested in Data Engineering
  • Data Analysts with Python Programming Knowledge

Requirements

  • Intermediate Python programming knowledge
  • Beginner SQL knowledge
  • Beginner Docker knowledge
  • Git, Docker and Conda (or another virtual environment manager) installed on your machine

Instructor

Alexandra Abbas
Google Cloud Certified Data Engineer & Architect

Meet Alexandra, your instructor for this course

Alexandra is a Google Cloud Certified Data Engineer & Architect and Apache Airflow Contributor.

She has experience with large-scale data science and engineering projects. She spends her time building data pipelines using Apache Airflow and Apache Beam and creating production-ready Machine Learning pipelines with TensorFlow.

Alexandra was a speaker at Serverless Days London 2019 and presented at the TensorFlow London meetup.

Related Courses

  • Quick introduction to batch processing in Apache Beam
  • Everything you need to get started with Kubeflow in production
  • Explore the landscape with our Modern Data Engineer Roadmap 2020