Deployment: Docker + Airflow + MySQL + LocalExecutor

Use the Airflow Docker image:

https://hub.docker.com/r/puckel/docker-airflow

First, start with the image defaults, SQLite + SequentialExecutor:
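With the puckel image, a default start only needs the webserver command; the port mapping below follows the image's README (a sketch, assuming Airflow's default web port 8080):

```shell
# Start Airflow with the image defaults (SQLite + SequentialExecutor)
docker run -d -p 8080:8080 puckel/docker-airflow webserver
```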

Copy airflow.cfg out of the container so it can be modified:
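Assuming AIRFLOW_HOME is /usr/local/airflow in this image (an assumption; verify inside the container), the copy can be done with `docker cp`. The container name below is hypothetical:

```shell
# Copy the config out of a running container (container name "airflow-web" is hypothetical)
docker cp airflow-web:/usr/local/airflow/airflow.cfg ./airflow.cfg
```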

Then try starting with the customized airflow.cfg.

In it, change sql_alchemy_conn to point at MySQL, and set executor = LocalExecutor.
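Both settings live in the [core] section of airflow.cfg; the host, database name, and credentials below are placeholders:

```ini
[core]
executor = LocalExecutor
sql_alchemy_conn = mysql://airflow:airflow@mysql-host:3306/airflow
```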

The running instance, however, still uses SequentialExecutor.

Inspect the Dockerfile: docker-airflow/Dockerfile

Its ENTRYPOINT is the script entrypoint.sh.

Inspect entrypoint.sh: docker-airflow/script/entrypoint.sh

1) It reads the environment variable EXECUTOR (values such as Sequential, Local, etc.) and uses it to construct the environment variable AIRFLOW__CORE__EXECUTOR;
2) If AIRFLOW__CORE__EXECUTOR is not SequentialExecutor, it waits for Postgres (a hard dependency on Postgres here);
3) If the start argument is webserver and AIRFLOW__CORE__EXECUTOR=LocalExecutor, it automatically starts the scheduler as well;
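Step 1 can be sketched as the following shell fragment (a simplified sketch, not the verbatim entrypoint.sh): when EXECUTOR is unset, the derived value defaults to SequentialExecutor, which is exactly why a LocalExecutor setting in airflow.cfg alone has no effect.

```shell
# Derive AIRFLOW__CORE__EXECUTOR from EXECUTOR (simplified sketch,
# not the verbatim entrypoint.sh):
#   EXECUTOR=Local  ->  AIRFLOW__CORE__EXECUTOR=LocalExecutor
#   EXECUTOR unset  ->  AIRFLOW__CORE__EXECUTOR=SequentialExecutor
: "${EXECUTOR:=Sequential}"
export AIRFLOW__CORE__EXECUTOR="${EXECUTOR}Executor"
echo "$AIRFLOW__CORE__EXECUTOR"
```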

The Airflow documentation notes: "Due to Airflow's automatic environment variable expansion, you can also set the env var AIRFLOW__CORE__* to temporarily overwrite airflow.cfg."

Because environment variables take precedence over airflow.cfg, the instance still runs SequentialExecutor even though airflow.cfg says executor = LocalExecutor: the entrypoint exports AIRFLOW__CORE__EXECUTOR itself. So, copy entrypoint.sh out of the container and modify it.

Comment out the following lines:

Start command:
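The original command is not recorded here; a hypothetical invocation consistent with the steps above (all mount paths, the mount target for entrypoint.sh, and the MySQL service name are assumptions) might look like:

```shell
# Hypothetical start command for this setup (paths and names are assumptions):
# mount the modified airflow.cfg and entrypoint.sh over the image's copies
docker run -d -p 8080:8080 \
  -e EXECUTOR=Local \
  -v "$(pwd)/airflow.cfg:/usr/local/airflow/airflow.cfg" \
  -v "$(pwd)/entrypoint.sh:/entrypoint.sh" \
  puckel/docker-airflow webserver
```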

Although this is a single node, it can be made highly available for production by combining it with Mesos and HDFS/NFS.

References:
https://github.com/puckel/docker-airflow

05-02 08:48