Increase the default min_file_process_interval to decrease CPU Usage (apache#13664)

With the previous default of `0`, CPU usage mostly stays around 100%.
Because scheduling decisions were moved out of the DagFileProcessor and
into the Scheduler in Airflow 2.0.0, we can keep this number higher.

closes apache#13637
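
The effect described in the commit message can be illustrated with a small sketch. This is not Airflow's actual DagFileProcessor code: the function name `files_due_for_parsing` and the `last_parsed` mapping are hypothetical, and the sketch only assumes that a file becomes eligible for re-parsing once at least `min_file_process_interval` seconds have passed since its last parse.

```python
from typing import Dict, List


def files_due_for_parsing(
    last_parsed: Dict[str, float],
    file_paths: List[str],
    now: float,
    min_file_process_interval: float,
) -> List[str]:
    """Return the files whose last parse is at least the interval old.

    Hypothetical helper for illustration only; not part of Airflow.
    """
    due = []
    for path in file_paths:
        finished_at = last_parsed.get(path)
        # Files that were never parsed are always due; the rest must
        # wait until the configured interval has elapsed.
        if finished_at is None or now - finished_at >= min_file_process_interval:
            due.append(path)
    return due


last_parsed = {"a.py": 100.0, "b.py": 125.0}
paths = ["a.py", "b.py", "c.py"]

# Interval 0 (the old default): every file is due on every loop.
print(files_due_for_parsing(last_parsed, paths, now=130.0, min_file_process_interval=0))
# Interval 30 (the new default): only a.py (30 s old) and the unseen c.py are due.
print(files_due_for_parsing(last_parsed, paths, now=130.0, min_file_process_interval=30))
```

With an interval of `0`, nothing throttles the loop, so every file is re-parsed on each pass, which is consistent with the constant CPU load the commit message reports.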
kaxil authored Jan 14, 2021
1 parent 9536ad9 commit e4b8ee6
Showing 3 changed files with 17 additions and 4 deletions.
9 changes: 9 additions & 0 deletions UPDATING.md
@@ -59,6 +59,15 @@ However, it was unintentionally changed to `8` in 2.0.0.

 From Airflow 2.0.1, we revert to the old default of `16`.

+### Default `[scheduler] min_file_process_interval` is changed to `30`
+
+The default value for `[scheduler] min_file_process_interval` was `0`,
+due to which the CPU Usage mostly stayed around 100% as the DAG files are parsed
+constantly.
+
+From Airflow 2.0.0, the scheduling decisions have been moved from
+DagFileProcessor to Scheduler, so we can keep the default a bit higher: `30`.
+
 ## Airflow 2.0.0

 ### The experimental REST API is disabled by default
6 changes: 4 additions & 2 deletions airflow/config_templates/config.yml
@@ -1648,11 +1648,13 @@
       default: "1"
     - name: min_file_process_interval
       description: |
-        after how much time (seconds) a new DAGs should be picked up from the filesystem
+        Number of seconds after which a DAG file is parsed. The DAG file is parsed every
+        ``min_file_process_interval`` number of seconds. Updates to DAGs are reflected after
+        this interval. Keeping this number low will increase CPU usage.
       version_added: ~
       type: string
       example: ~
-      default: "0"
+      default: "30"
     - name: dag_dir_list_interval
       description: |
         How often (in seconds) to scan the DAGs directory for new files. Default to 5 minutes.
6 changes: 4 additions & 2 deletions airflow/config_templates/default_airflow.cfg
@@ -814,8 +814,10 @@ num_runs = -1
 # The number of seconds to wait between consecutive DAG file processing
 processor_poll_interval = 1

-# after how much time (seconds) a new DAGs should be picked up from the filesystem
-min_file_process_interval = 0
+# Number of seconds after which a DAG file is parsed. The DAG file is parsed every
+# ``min_file_process_interval`` number of seconds. Updates to DAGs are reflected after
+# this interval. Keeping this number low will increase CPU usage.
+min_file_process_interval = 30

 # How often (in seconds) to scan the DAGs directory for new files. Default to 5 minutes.
 dag_dir_list_interval = 300
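
Deployments that prefer a different interval can set the option explicitly instead of relying on the default. The sketch below shows one way the lookup could work, assuming Airflow's documented `AIRFLOW__{SECTION}__{KEY}` environment-variable convention takes precedence over the cfg file. It uses only the standard library; `resolve_option` and `CFG_TEXT` are hypothetical names, not Airflow APIs.

```python
import configparser
from typing import Mapping

# A fragment mirroring the new defaults shown in the diff above.
CFG_TEXT = """
[scheduler]
min_file_process_interval = 30
dag_dir_list_interval = 300
"""


def resolve_option(
    parser: configparser.ConfigParser,
    section: str,
    key: str,
    environ: Mapping[str, str],
) -> str:
    """Return the environment override if present, else the cfg value.

    Hypothetical helper for illustration only; not part of Airflow.
    """
    env_name = f"AIRFLOW__{section.upper()}__{key.upper()}"
    if env_name in environ:
        return environ[env_name]
    return parser.get(section, key)


parser = configparser.ConfigParser()
parser.read_string(CFG_TEXT)

# No override: the value from the cfg fragment is used.
print(resolve_option(parser, "scheduler", "min_file_process_interval", {}))
# Override: the environment variable wins over the cfg value.
override = {"AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL": "0"}
print(resolve_option(parser, "scheduler", "min_file_process_interval", override))
```

In a real deployment the mapping passed as `environ` would be `os.environ`. It is a plain argument here so the sketch can be exercised without modifying the process environment.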