Short Description
Airflow is a platform to programmatically author, schedule and monitor workflows.
Full Description

docker-airflow

This repository contains the Dockerfile of apache-airflow for Docker's automated build, published to the public Docker Hub Registry.

Information

/!\ If you want to run Airflow with Python 2, use tag 1.8.1.
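
To pull that tag explicitly:

    docker pull puckel/docker-airflow:1.8.1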

Installation

Pull the image from the Docker repository.

    docker pull puckel/docker-airflow

Build

For example, if you need to install Extra Packages, edit the Dockerfile and then build it.

    docker build --rm -t puckel/docker-airflow .
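
For instance, a hypothetical edit to pull in the crypto and postgres extras (the exact install line in the Dockerfile may differ):

    RUN pip install apache-airflow[crypto,postgres]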

Usage

By default, docker-airflow runs Airflow with SequentialExecutor:

    docker run -d -p 8080:8080 puckel/docker-airflow

If you want to run another executor, use the other docker-compose.yml files provided in this repository.

For LocalExecutor:

    docker-compose -f docker-compose-LocalExecutor.yml up -d

For CeleryExecutor:

    docker-compose -f docker-compose-CeleryExecutor.yml up -d

NB: If you don't want the example DAGs to be loaded (default=True), you have to set the following environment variable:

    LOAD_EX=n

    docker run -d -p 8080:8080 -e LOAD_EX=n puckel/docker-airflow

If you want to use Ad hoc query, make sure you've configured connections:
Go to Admin -> Connections, edit "postgres_default" and set these values (equivalent to the values in airflow.cfg/docker-compose*.yml); a quick way to verify them is shown after the list:

  • Host: postgres
  • Schema: airflow
  • Login: airflow
  • Password: airflow
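
To verify those credentials from the command line, something like the following should work (assuming the compose service is named postgres, matching the Host value above):

    docker-compose -f docker-compose-LocalExecutor.yml exec postgres psql -U airflow airflow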

For encrypted connection passwords (with the Local or Celery Executor), all containers must share the same fernet_key. By default, docker-airflow generates a fernet_key at startup, so you have to set an environment variable in the docker-compose file (e.g. docker-compose-LocalExecutor.yml) to use the same key across containers. To generate a fernet_key:

    python -c "from cryptography.fernet import Fernet; FERNET_KEY = Fernet.generate_key().decode(); print(FERNET_KEY)"
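
A minimal sketch of what that could look like in docker-compose-LocalExecutor.yml, assuming the image's entrypoint reads a FERNET_KEY environment variable; set the same value on every container that runs Airflow:

    webserver:
        environment:
            - FERNET_KEY=<your generated key>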

Check Airflow Documentation

Install custom Python packages

  • Create a file "requirements.txt" with the desired Python modules
  • Mount this file as a volume: -v $(pwd)/requirements.txt:/requirements.txt
  • The entrypoint.sh script executes the pip install command (with the --user option); see the sketch below
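
A minimal sketch (the module name is arbitrary; the entrypoint installs whatever /requirements.txt lists, as described above):

    echo "requests" > requirements.txt
    docker run -d -p 8080:8080 -v $(pwd)/requirements.txt:/requirements.txt puckel/docker-airflow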

UI Links

  • Airflow: localhost:8080

Scale the number of workers

Easy scaling using docker-compose:

    docker-compose scale worker=5

This can be used to scale to a multi-node setup using Docker Swarm.
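
To verify the number of running workers afterwards (standard Compose command):

    docker-compose ps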

Wanna help?

Fork, improve and PR. ;-)

Comments (4)
cmourouvin
4 months ago

@catch22

In the *.yml files you have to uncomment the volumes lines:

    volumes:
        # - ~/docker-airflow/dags:/usr/local/airflow/dags

Note: the format is yourLocalPathToDags:/dockerInstanceDagPath
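
Uncommented, with the host path adjusted to where your DAGs live (the path below is just an example):

    volumes:
        - ~/docker-airflow/dags:/usr/local/airflow/dags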

BTW, I tested the new image with 1.8.0, with CeleryExecutor and the others: it loads the example DAGs, but when I activate one in the interface nothing happens, no task is scheduled. Nothing.

If I activate my own DAG it's fine, it works. Any known issue with the example DAGs?

Regards and thanks :)

catch22
5 months ago

Newbie question here, but where do you place your DAG files? I logged onto my MobyLinuxVM (running on Win10) but I'm unable to figure out where the files go.

Thanks!
tom

cmourouvin
a year ago

Hi,

First of all, thanks for your work. I'd like to share a problem I've found:


    Log file isn't local. Fetching here: http://0532d0891dcd:8793/log/dmp_profilehub_billing/bash_get_logs_from_gcs_v2/2016-07-21T00:00:00
    * Failed to fetch log file from worker.

I'll try to fix it and make a PR.

Regards,

harryzhu
a year ago

    Running setup.py install for mysqlclient
    building '_mysql' extension
    x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -Dversion_info=(1,3,7,'final',1) -Dversion=1.3.7 -I/usr/include/mysql -I/usr/include/python2.7 -c _mysql.c -o build/temp.linux-x86_64-2.7/_mysql.o -DBIG_JOINS=1 -fno-strict-aliasing -g -DNDEBUG
    x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-z,relro -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wl,-z,relro -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security build/temp.linux-x86_64-2.7/_mysql.o -L/usr/lib/x86_64-linux-gnu -lmysqlclient_r -lpthread -lz -lm -ldl -o build/lib.linux-x86_64-2.7/_mysql.so

    Could not find .egg-info directory in install record for mysqlclient>=1.3.6 (from airflow[mysql]==1.7.0)
    Successfully installed mysqlclient
    Cleaning up...
    (Reading database ... 15105 files and directories currently installed.)
    Removing build-essential (11.7) ...
    Removing libffi-dev:amd64 (3.1-2+b2) ...
    Removing libkrb5-dev (1.12.1+dfsg-19+deb8u2) ...
    Removing libmysqlclient-dev (5.5.49-0+deb8u1) ...
    Removing libsasl2-dev (2.1.26.dfsg1-13+deb8u1) ...
    Removing libssl-dev:amd64 (1.0.1k-3+deb8u5) ...
    Removing python-dev (2.7.9-1) ...
    Removing python-pip (1.5.6-5) ...
    ---> 57c96c7bcdca
    Removing intermediate container f437df30f0e1
    Step 13 : ADD script/entrypoint.sh ${AIRFLOW_HOME}/entrypoint.sh
    script/entrypoint.sh: no such file or directory