How to install Airflow on Ubuntu

Apache Airflow is an open-source platform for authoring, scheduling, and monitoring workflows. It manages multi-stage pipelines made up of tasks that run on various schedules, supporting both cron-like time-based jobs and event-driven execution.

Using Apache Airflow, an organization can design, implement, schedule, monitor, and maintain highly available pipelines that include batch processing, continuous processing, staging, and other custom scheduling strategies.

Before going any further, we assume that you have a basic understanding of how the Linux shell works and how to send commands to it. You also need to know how to run terminal commands with root privileges safely using sudo.

In this article, we will show you how to install Apache Airflow on Ubuntu 20.04 (codename Focal Fossa). The installation instructions are applicable to any Linux distro based on Ubuntu or Debian, such as Linux Mint, elementary OS or Pop!_OS.


Installing Pip

The standard package manager for Python is pip. It allows you to install and manage packages that aren’t part of the Python standard library. Since Airflow is distributed as a Python package, you’re going to need pip to install it. To install pip, run the following commands one after another in a terminal emulator.

sudo apt-get install software-properties-common
sudo apt-add-repository universe
sudo apt-get update
sudo apt-get install python3-setuptools
sudo apt install python3-pip

Once the installation is complete, verify the installation by checking the pip version:

pip3 --version

The version number may vary, but it will look something like this:

pip 9.0.1 from /usr/lib/python3/dist-packages (python 3.6)

Please note that recent Debian/Ubuntu versions have modified pip to use the “User Scheme” by default, so packages installed by a regular user go under ~/.local instead of a system-wide location. This may come as a surprise to users who are already familiar with the old, global behavior.
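One practical consequence of the User Scheme is that command-line scripts installed by pip land in ~/.local/bin, which is not always on PATH in a session that started before the directory existed. The sketch below is a minimal way to check for this and fix it for the current shell only; the variable name USER_BIN is just an illustration.

```shell
# Directory where pip's "User Scheme" places command-line scripts on Linux.
USER_BIN="$HOME/.local/bin"

# Prepend it to PATH for the current shell only if it is not already listed.
case ":$PATH:" in
  *":$USER_BIN:"*)
    echo "$USER_BIN is already on PATH"
    ;;
  *)
    export PATH="$USER_BIN:$PATH"
    echo "Added $USER_BIN to PATH for this session"
    ;;
esac
```

On a stock Ubuntu setup, logging out and back in usually has the same effect, because the default ~/.profile adds ~/.local/bin to PATH when that directory exists.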

Install Airflow and its dependencies on Ubuntu

Before installing Apache Airflow, make sure you have the necessary dependencies installed. Airflow uses SQLite as its default database, but you can use something more scalable like PostgreSQL or MySQL if you like.

However, if you’re just starting out and just want to learn the basics, you can stick with SQLite to keep things simple. In that case, you can skip this part.

In order to install additional support packages, run the following commands:

sudo apt-get install libmysqlclient-dev
sudo apt-get install libssl-dev
sudo apt-get install libkrb5-dev

Once all Airflow dependencies are on your system, you can begin installing the software itself.

First of all, Airflow needs a home directory where it stores its settings and configuration files. It is usually set to ~/airflow using the following command:

export AIRFLOW_HOME=~/airflow
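Note that export only affects the current shell session. Airflow generally creates the home directory itself the first time an airflow command runs, so creating it by hand is optional. The following is a minimal sketch that sets the variable, creates the directory up front, and prints the result for confirmation.

```shell
# Same location as above; $HOME is used instead of ~ so it expands inside quotes.
export AIRFLOW_HOME="$HOME/airflow"

# Optional: create the directory now. -p makes this a no-op if it already exists.
mkdir -p "$AIRFLOW_HOME"

# Confirm the value that commands started from this shell will see.
echo "AIRFLOW_HOME is $AIRFLOW_HOME"
```

To keep the setting across terminal sessions, the same export line can be appended to ~/.bashrc. This matters when a second terminal is opened later, since that shell does not inherit a variable exported only in the first one.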

After that, install Apache Airflow using the commands below:

pip3 install apache-airflow
pip3 install typing_extensions
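An unpinned install picks up whatever Airflow release is newest at the time, and the command-line interface differs between major versions. The Airflow project publishes constraint files for reproducible installs. The sketch below builds the matching constraint URL and prints the resulting install command for review instead of running it. The version 1.10.15 is an assumption, chosen only because airflow initdb is a 1.10-series command; substitute the release you actually want.

```shell
# Assumption: the Airflow release to pin. Replace with the version you need.
AIRFLOW_VERSION="1.10.15"

# Detect the major.minor version of the local Python 3 interpreter, e.g. 3.8.
PYTHON_VERSION="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"

# Constraint files are published per Airflow release and per Python version.
CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"

# Print the command so it can be reviewed before it is actually run.
printf 'pip3 install "apache-airflow==%s" --constraint "%s"\n' \
  "$AIRFLOW_VERSION" "$CONSTRAINT_URL"
```

A constraint file may not exist for every combination of Airflow release and Python version, so it is worth opening the printed URL in a browser before running the command.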

Once the two commands above complete, Airflow is installed on your system. But you cannot access its interface just yet because the Airflow services are not running. To start Airflow, you need to initialize its database, then start its web server and scheduler by executing the following commands.

# initialize the database (Airflow 1.10.x syntax; Airflow 2.x renamed this to "airflow db init")
airflow initdb
# start the web server, default port is 8080
airflow webserver -p 8080

Now open up another terminal window and run the following command to start the Airflow scheduler, as it must run in a separate process.

# start the scheduler; run this in a separate terminal window
airflow scheduler

And that’s it: Apache Airflow is now running! To verify, open a web browser and go to localhost:8080. You should be able to see the following screen once logged in.

[Screenshot: Apache Airflow interface]

We hope that the information above is useful to you. If you’re interested in advanced source editing in Visual Studio Code, check out our posts on how to enable/disable word wrap in VSCode, how to use LaTeX in VSCode, or how to automatically indent your code in Visual Studio Code.

1 thought on “How to install Airflow on Ubuntu”

  1. Airflow needs a home directory where it stores all its settings, configurations. It is usually set to ~/airflow
    Who will create this, and how is it created? Once the installation of Airflow using pip3 is over, will it be created automatically?

