Description
Hi there, my name is Alexandra Abbas. I’m an Apache Airflow Contributor and a Google Cloud Certified Data Engineer & Architect with over 3 years of experience as a Data Engineer.
Are you struggling to learn Apache Airflow on your own? In this course I will teach you Airflow in a practical manner: every lecture comes with a full coding screencast. By the end of the course you will be able to use Airflow professionally and add Airflow to your CV.
This course includes 50 lectures and more than 4 hours of video, along with quizzes, coding exercises and 2 major real-life projects that you can add to your GitHub portfolio!
You will learn:
- How to install and set up Airflow on your machine
- Basic and advanced Airflow concepts
- How to develop complex real-life data pipelines
- How to interact with Google Cloud from your Airflow instance
- How to extend Airflow with custom operators and sensors
- How to test Airflow pipelines and operators
- How to monitor your Airflow instance using Prometheus and Grafana
- How to track errors with Sentry
- How to set up and run Airflow in production
This course is for beginners. You do not need any previous knowledge of Apache Airflow, Data Engineering or Google Cloud. We will start right at the beginning and work our way through step by step.
You will get access to over 50 lectures plus corresponding cheat sheets, datasets and code base for the lectures!
Course content
54 lectures, 5h total length
Components of Airflow (2:12)
Install Airflow on macOS (5:45)
Install Airflow on Linux
Install Airflow on Windows (19:30)
Install and Run Airflow with Docker
Run Airflow Locally (3:22)
Introduction to the Airflow UI (4:05)
Introduction to the Airflow CLI (2:37)
Quiz 2: Airflow Setup
What are DAGs? (4:49)
What are Default Arguments? (2:00)
What are Tasks and Operators? (5:21)
How to Define Dependencies? (5:16)
Quiz 3: Core Concepts
Use Case (2:06)
Set Up (4:55)
Connections (2:50)
Load Data from Storage to BigQuery (8:01)
Run SQL Query in BigQuery (9:47)
Use Hook to List Storage Objects (5:42)
Cross-Task Communication (XComs) (6:59)
Jinja Templating and Macros (5:33)
Variables (4:44)
Quiz 4: Advanced Concepts
Use Case (5:06)
Set Up (2:36)
Branching (4:33)
Create Dataproc Hadoop Cluster (4:49)
Submit a PySpark Job (4:13)
Subdags (9:19)
Trigger Rules (3:55)
DAG Documentation (6:39)
Quiz 5: Advanced Concepts
Create a Custom Operator (17:38)
Create a Custom Sensor (8:33)
Run Custom Plugins (7:20)
Quiz 6: Custom Plugins
Load Test DAGs (6:00)
Unit Test DAGs and Operators (12:08)
Unit Test Custom Operators (10:25)
Quiz 7: Testing
Executors (7:45)
Configure Local Executor (21:50)
Configure Celery Executor
Service Level Agreements (SLAs) (3:48)
Security: Authentication, Roles, Encryption (7:10)
Write Logs to a Remote Location (4:32)
Monitor Airflow with StatsD, Prometheus and Grafana (19:26)
Error Tracking with Sentry (3:29)
Managed Airflow Services
Quiz 8: Airflow in Production
Who this course is for
- Data Engineers
- Data Scientists
- Python Developers Interested in Data Engineering
- Data Analysts with Python Programming Knowledge
Requirements
- Intermediate Python programming knowledge
- Beginner SQL knowledge
- Beginner Docker knowledge
- Git, Docker and Conda (or another virtual environment manager) installed on your machine
Instructor
Alexandra Abbas
Google Cloud Certified Data Engineer & Architect

Alexandra is a Google Cloud Certified Data Engineer & Architect and Apache Airflow Contributor.
She has experience with large-scale data science and engineering projects. She spends her time building data pipelines using Apache Airflow and Apache Beam and creating production-ready Machine Learning pipelines with TensorFlow.
Alexandra was a speaker at Serverless Days London 2019 and presented at the TensorFlow London meetup.