Problem description
The setting catchup_by_default = False in airflow.cfg does not seem to work. Adding catchup=False to the DAG doesn't work either.
Here's how to reproduce the issue. I always start from a clean slate by running airflow resetdb. As soon as I unpause the DAG, the tasks start to backfill.
Here's the setup for the DAG. I'm just using the tutorial example.
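from datetime import datetime, timedelta

from airflow import DAG

# Defaults applied to every task in the DAG.
default_args = {
    "owner": "airflow",
    "depends_on_past": False,
    "start_date": datetime(2018, 9, 16),
    "email": ["[email protected]"],
    "email_on_failure": False,
    "email_on_retry": False,
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

# Daily schedule expressed as a timedelta; catchup=False is meant to skip past intervals.
dag = DAG("tutorial", default_args=default_args, schedule_interval=timedelta(days=1), catchup=False)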
Recommended answer
As @dlamblin mentioned, and as the docs also state, Airflow still creates a single DagRun for the most recent valid interval. Setting catchup=False instructs the scheduler to create a DAG run only for the most current instance of the DAG interval series.
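To make the expected behavior concrete, here is a minimal sketch in plain Python of which runs one would expect with and without catchup. It is a simplified model and not Airflow's actual scheduler code: the function name runs_to_create and its parameters are hypothetical, and it assumes a run for an interval is created only once that interval has ended.

```python
from datetime import datetime, timedelta


def runs_to_create(start_date, interval, now, catchup):
    """Return the execution dates a scheduler would be expected to create.

    Hypothetical helper for illustration only; not part of Airflow.
    """
    completed = []
    execution_date = start_date
    # Assumption: an interval becomes schedulable only after it has fully elapsed.
    while execution_date + interval <= now:
        completed.append(execution_date)
        execution_date += interval
    if catchup:
        # Backfill every completed interval since start_date.
        return completed
    # Without catchup, keep only the most recent completed interval (if any).
    return completed[-1:]


start = datetime(2018, 9, 16)
now = datetime(2018, 9, 20, 12, 0)

with_catchup = runs_to_create(start, timedelta(days=1), now, catchup=True)
without_catchup = runs_to_create(start, timedelta(days=1), now, catchup=False)

print(len(with_catchup), "runs with catchup=True")
print(len(without_catchup), "run(s) with catchup=False")
```

Under this model, a single run right after unpausing is expected even with catchup=False; a whole series of runs starting from start_date is what indicates backfilling.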
However, there was a bug when using a timedelta for schedule_interval instead of a cron expression or cron preset. This has been fixed in Airflow master with https://github.com/apache/airflow/pull/8776. We will release Airflow 1.10.11 with this fix.