Course Outline

Introduction to Apache Airflow

  • What workflow orchestration is
  • Key features and benefits of Apache Airflow
  • Airflow 2.x improvements and ecosystem overview

Architecture and Core Concepts

  • Scheduler, web server, and worker processes
  • DAGs, tasks, and operators
  • Executors and backends (Local, Celery, Kubernetes)

Installation and Setup

  • Installing Airflow in local and cloud environments
  • Configuring Airflow with different executors
  • Setting up metadata databases and connections

Navigating the Airflow UI and CLI

  • Exploring the Airflow web interface
  • Monitoring DAG runs, tasks, and logs
  • Using the Airflow CLI for administration

Authoring and Managing DAGs

  • Creating DAGs with the TaskFlow API
  • Using operators, sensors, and hooks
  • Managing dependencies and scheduling intervals

Integrating Airflow with Data and Cloud Services

  • Connecting to databases, APIs, and message queues
  • Running ETL pipelines with Airflow
  • Cloud integrations: operators for AWS, GCP, and Azure

Monitoring and Observability

  • Task logs and real-time monitoring
  • Metrics with Prometheus and Grafana
  • Alerting and notifications with email or Slack

Securing Apache Airflow

  • Role-based access control (RBAC)
  • Authentication with LDAP, OAuth, and SSO
  • Secrets management with Vault and cloud secret stores

Scaling Apache Airflow

  • Parallelism, concurrency, and task queues
  • Using CeleryExecutor and KubernetesExecutor
  • Deploying Airflow on Kubernetes with Helm

Best Practices for Production

  • Version control and CI/CD for DAGs
  • Testing and debugging DAGs
  • Maintaining reliability and performance at scale

Troubleshooting and Optimization

  • Debugging failed DAGs and tasks
  • Optimizing DAG performance
  • Common pitfalls and how to avoid them

Summary and Next Steps

Requirements

  • Experience with Python programming
  • Familiarity with data engineering or DevOps concepts
  • Understanding of ETL or workflow orchestration

Audience

  • Data scientists
  • Data engineers
  • DevOps and infrastructure engineers
  • Software developers

Duration

  • 21 hours
