Guide to Installing and Configuring Apache Airflow 3.2.0 with PostgreSQL and Running Your First DAG

Published: May 2, 2026 at 02:46 PM EDT
5 min read
Source: Dev.to

Introduction

As a data engineer, you may have recently learned about Apache Airflow, what it is, and how it orchestrates and automates data workflows.
The next step is gaining hands‑on experience by setting it up in your own environment.

This article provides a step‑by‑step guide to:

  • Installing and configuring Apache Airflow
  • Connecting it to PostgreSQL
  • Running your first DAG

By the end, you will have a fully functional Airflow environment ready for building and managing data pipelines.

We will follow the official installation guide.

Prerequisites

  • A Linux environment (e.g., a Linux VPS)
  • Python 3 installed
sudo apt install python-is-python3   # makes `python` point to Python 3

1. Set the Airflow home directory

Airflow uses ~/airflow by default, but you can choose another location.
Set the environment variable before installing Airflow:

export AIRFLOW_HOME=~/airflow
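An exported variable only lasts for the current shell session. To make the setting permanent, you can append it to your shell profile (a small sketch, assuming bash):

```shell
# persist AIRFLOW_HOME across shell sessions (bash assumed)
echo 'export AIRFLOW_HOME=~/airflow' >> ~/.bashrc
source ~/.bashrc
```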

2. Create a project folder and virtual environment

cd ~                     # go to your home directory
mkdir airflow && cd airflow
python -m venv airflow_venv
source airflow_venv/bin/activate

Upgrade pip:

pip install --upgrade pip

3. Install Apache Airflow

Specify the Airflow version you want (example: 3.2.0) and the constraints file that matches your Python version (3.12 in the example below):

pip install apache-airflow[celery]==3.2.0 \
    --constraint https://raw.githubusercontent.com/apache/airflow/constraints-3.2.0/constraints-3.12.txt

Wait a few seconds for the installation to finish, then verify:

airflow version

4. Starting Airflow

4.1 Using Airflow Standalone (quick start)

airflow standalone

This command starts all Airflow components at once, but it runs in the foreground and streams logs to the terminal, tying up your session.

Run it in the background and redirect logs:

nohup airflow standalone > airflow.log 2>&1 &

Check the processes:

ps aux | grep airflow

Open the web UI at http://<your-ip>:8080 (e.g., http://102.209.32.65:8080).
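If the page does not load in your browser, you can first confirm locally that the API server is answering (a quick check; this assumes port 8080 is open in your firewall):

```shell
# expect HTTP 200, or a 302 redirect to the login page
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080
```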

4.2 Running components manually

If you prefer to start each service yourself:

airflow db migrate

airflow users create \
    --username admin \
    --firstname Peter \
    --lastname Parker \
    --role Admin \
    --email spiderman@superhero.org

airflow api-server --port 8080
airflow scheduler
airflow dag-processor
airflow triggerer

Note: In Airflow 3+, user-management commands such as airflow users create require the Flask AppBuilder (FAB) auth manager.

Enable FAB auth manager

Edit airflow.cfg (default location: $AIRFLOW_HOME/airflow.cfg):

nano $AIRFLOW_HOME/airflow.cfg
# in the [core] section, add/ensure:
auth_manager = airflow.providers.fab.auth_manager.fab_auth_manager.FABAuthManager

If you encounter

ModuleNotFoundError: No module named 'airflow.providers.fab'

install the missing provider:

pip install apache-airflow-providers-fab

Run the migration again:

airflow db migrate

Create the admin user (if not already done) and start the services in the background:

nohup airflow api-server --port 8080 > api-server.log 2>&1 &
nohup airflow scheduler               > scheduler.log 2>&1 &
nohup airflow dag-processor           > dag-processor.log 2>&1 &
nohup airflow triggerer               > triggerer.log 2>&1 &

Airflow is now reachable via the browser UI.

5. Adjusting Airflow configuration

Before editing, stop any running Airflow processes:

pkill -9 airflow   # force-kills all running Airflow processes

Open the configuration file:

nano $AIRFLOW_HOME/airflow.cfg

Common changes (optional)

| Setting | Desired value | Comment |
| --- | --- | --- |
| `dags_folder` | `/root/workflows` | Location where you will store your DAG files |
| `default_timezone` | your local timezone (e.g., `Europe/Paris`) | Align timestamps with your region |
| `executor` | `LocalExecutor` | Use when running locally and you need parallel tasks |
| `sql_alchemy_conn` (in the `[database]` section) | `postgresql+psycopg2://user:password@localhost:5432/airflowdb` | Point to an external PostgreSQL instance |
| `load_examples` | `False` | Disable the example DAGs that ship with Airflow |

Save (Ctrl+O, then Enter) and exit (Ctrl+X).
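The same settings can also be applied without editing the file: Airflow reads environment variables of the form AIRFLOW__{SECTION}__{KEY}, which override airflow.cfg. A sketch mirroring the values above (adjust them to your setup):

```shell
export AIRFLOW__CORE__DAGS_FOLDER=/root/workflows
export AIRFLOW__CORE__DEFAULT_TIMEZONE=Europe/Paris
export AIRFLOW__CORE__EXECUTOR=LocalExecutor
export AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://user:password@localhost:5432/airflowdb
export AIRFLOW__CORE__LOAD_EXAMPLES=False
```

Note that these must be exported in the same shell session (or profile) that launches the Airflow services.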

6. Install database drivers

Inside the activated virtual environment:

pip install psycopg2-binary   # PostgreSQL driver
pip install asyncpg           # Async PostgreSQL driver (optional but recommended)
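If you pointed sql_alchemy_conn at PostgreSQL, the database and role must exist before migrating. A minimal sketch, assuming a local PostgreSQL server; the names airflow_user, airflow_pass, and airflowdb are placeholders and must match your connection string:

```shell
# create the role and database Airflow will use (run as the postgres superuser)
sudo -u postgres psql <<'SQL'
CREATE USER airflow_user WITH PASSWORD 'airflow_pass';
CREATE DATABASE airflowdb OWNER airflow_user;
GRANT ALL PRIVILEGES ON DATABASE airflowdb TO airflow_user;
SQL
```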

Run the migration to create Airflow tables in the new database:

airflow db migrate

7. Add your first DAG

Create the directory you pointed dags_folder to (e.g., /root/workflows) and add a simple DAG:

mkdir -p /root/workflows
cd /root/workflows
nano simple.py

Paste the following Python code and save the file:

from airflow import DAG
from datetime import datetime, timedelta
from airflow.providers.standard.operators.python import PythonOperator

def say_hello():
    print("Hello from Airflow!")

default_args = {
    "owner": "airflow",
    "depends_on_past": False,
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="hello_world",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 3 renamed schedule_interval to schedule
    default_args=default_args,
    catchup=False,
) as dag:

    hello_task = PythonOperator(
        task_id="say_hello",
        python_callable=say_hello,
    )

After saving, refresh the Airflow UI – the hello_world DAG should appear and be ready to run.
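You can also exercise the DAG from the command line without waiting for the scheduler (a quick sanity check; assumes the virtual environment is active):

```shell
# run a single test instance of the DAG for a given logical date
airflow dags test hello_world 2024-01-01
```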

🎉 You now have a fully functional Apache Airflow installation, connected to PostgreSQL, and running your first DAG! 🎉

Feel free to explore more complex DAGs, integrate additional providers, and scale your executor as needed. Happy data engineering!

Simple Airflow DAG Example

from datetime import datetime, timedelta

from airflow import DAG
from airflow.providers.standard.operators.python import PythonOperator

def say_hello():
    print("Hello from Airflow!")

def say_goodbye():
    print("Goodbye from Airflow!")

with DAG(
    dag_id="simple_dag",
    start_date=datetime(2026, 1, 1),
    schedule=timedelta(minutes=5),  # Airflow 3 renamed schedule_interval to schedule
    catchup=False,
) as dag:

    hello_task = PythonOperator(
        task_id="hi",
        python_callable=say_hello,
    )

    goodbye_task = PythonOperator(
        task_id="bye",
        python_callable=say_goodbye,
    )

    hello_task >> goodbye_task

Note: Airflow automatically picks up the DAG and loads it. DAGs are listed in the DAGs section of the UI.
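Individual tasks can be checked the same way. `airflow tasks test` runs one task in isolation without recording state in the database (again assuming the virtual environment is active):

```shell
airflow tasks test simple_dag hi 2026-01-01
airflow tasks test simple_dag bye 2026-01-01
```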

Viewing the DAG

  • Click on the DAG name to see its details, including run history and success/failure status.

Recap

In this article you have:

  • Successfully installed and configured Apache Airflow 3.2.0.
  • Connected Airflow to a PostgreSQL backend.
  • Explored two ways of launching Airflow:
    1. The simplified standalone approach.
    2. A production‑style setup with manually created users and individual Airflow services.
  • Made essential configuration changes in the airflow.cfg file.
  • Deployed your first DAG into the Airflow environment.

With this foundation you now have a functional orchestration platform capable of scheduling, monitoring, and managing data workflows. As you continue learning Airflow, you can dive into more advanced topics such as:

  • Complex task dependencies
  • Advanced scheduling strategies
  • Integrations with cloud platforms
  • Building production‑grade ETL and data‑engineering pipelines

Happy orchestrating!
