Pool option does not work in backfill command #23249

Closed
Swalloow opened this issue Apr 26, 2022 · 1 comment · Fixed by #23258
Labels
area:core, kind:bug

Comments

Swalloow (Contributor) commented Apr 26, 2022

Apache Airflow version

2.2.4

What happened

Discussion Ref: #22201

I added the --pool option to the backfill command, but the task instances only use default_pool.
The log below shows --pool backfill_pool being passed, yet the Task Instance Details and the Pools list in the UI show that default_pool was used.

[2022-03-12, 20:03:44 KST] {taskinstance.py:1244} INFO - Starting attempt 1 of 1
[2022-03-12, 20:03:44 KST] {taskinstance.py:1245} INFO - 
--------------------------------------------------------------------------------
[2022-03-12, 20:03:44 KST] {taskinstance.py:1264} INFO - Executing <Task(BashOperator): runme_0> on 2022-03-05 00:00:00+00:00
[2022-03-12, 20:03:44 KST] {standard_task_runner.py:52} INFO - Started process 555 to run task
[2022-03-12, 20:03:45 KST] {standard_task_runner.py:76} INFO - Running: ['***', 'tasks', 'run', 'example_bash_operator', 'runme_0', 'backfill__2022-03-05T00:00:00+00:00', '--job-id', '127', '--pool', 'backfill_pool', '--raw', '--subdir', '/home/***/.local/lib/python3.8/site-packages/***/example_dags/example_bash_operator.py', '--cfg-path', '/tmp/tmprhjr0bc_', '--error-file', '/tmp/tmpkew9ufim']
[2022-03-12, 20:03:45 KST] {standard_task_runner.py:77} INFO - Job 127: Subtask runme_0
[2022-03-12, 20:03:45 KST] {logging_mixin.py:109} INFO - Running <TaskInstance: example_bash_operator.runme_0 backfill__2022-03-05T00:00:00+00:00 [running]> on host 56d55382c860
[2022-03-12, 20:03:45 KST] {taskinstance.py:1429} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=***
AIRFLOW_CTX_DAG_ID=example_bash_operator
AIRFLOW_CTX_TASK_ID=runme_0
AIRFLOW_CTX_EXECUTION_DATE=2022-03-05T00:00:00+00:00
AIRFLOW_CTX_DAG_RUN_ID=backfill__2022-03-05T00:00:00+00:00
[2022-03-12, 20:03:45 KST] {subprocess.py:62} INFO - Tmp dir root location: 
 /tmp
[2022-03-12, 20:03:45 KST] {subprocess.py:74} INFO - Running command: ['bash', '-c', 'echo "example_bash_operator__runme_0__20220305" && sleep 1']
[2022-03-12, 20:03:45 KST] {subprocess.py:85} INFO - Output:
[2022-03-12, 20:03:46 KST] {subprocess.py:89} INFO - example_bash_operator__runme_0__20220305
[2022-03-12, 20:03:47 KST] {subprocess.py:93} INFO - Command exited with return code 0
[2022-03-12, 20:03:47 KST] {taskinstance.py:1272} INFO - Marking task as SUCCESS. dag_id=example_bash_operator, task_id=runme_0, execution_date=20220305T000000, start_date=20220312T110344, end_date=20220312T110347
[2022-03-12, 20:03:47 KST] {local_task_job.py:154} INFO - Task exited with return code 0
[2022-03-12, 20:03:47 KST] {local_task_job.py:264} INFO - 0 downstream tasks scheduled from follow-on schedule check

What you think should happen instead

The backfill task instance should use a slot in the backfill_pool.

How to reproduce

  1. Create a backfill_pool in the UI.
  2. Run the backfill command on the example DAG.
$ docker exec -it airflow_airflow-scheduler_1 /bin/bash
$ airflow dags backfill example_bash_operator -s 2022-03-05 -e 2022-03-06 \
--pool backfill_pool --reset-dagruns -y

[2022-03-12 11:03:52,720] {backfill_job.py:386} INFO - [backfill progress] | finished run 0 of 2 | tasks waiting: 2 | succeeded: 8 | running: 2 | failed: 0 | skipped: 2 | deadlocked: 0 | not ready: 2
[2022-03-12 11:03:57,574] {dagrun.py:545} INFO - Marking run <DagRun example_bash_operator @ 2022-03-05T00:00:00+00:00: backfill__2022-03-05T00:00:00+00:00, externally triggered: False> successful
[2022-03-12 11:03:57,575] {dagrun.py:590} INFO - DagRun Finished: dag_id=example_bash_operator, execution_date=2022-03-05T00:00:00+00:00, run_id=backfill__2022-03-05T00:00:00+00:00, run_start_date=2022-03-12 11:03:37.530158+00:00, run_end_date=2022-03-12 11:03:57.575869+00:00, run_duration=20.045711, state=success, external_trigger=False, run_type=backfill, data_interval_start=2022-03-05T00:00:00+00:00, data_interval_end=2022-03-06 00:00:00+00:00, dag_hash=None
[2022-03-12 11:03:57,582] {dagrun.py:545} INFO - Marking run <DagRun example_bash_operator @ 2022-03-06T00:00:00+00:00: backfill__2022-03-06T00:00:00+00:00, externally triggered: False> successful
[2022-03-12 11:03:57,583] {dagrun.py:590} INFO - DagRun Finished: dag_id=example_bash_operator, execution_date=2022-03-06T00:00:00+00:00, run_id=backfill__2022-03-06T00:00:00+00:00, run_start_date=2022-03-12 11:03:37.598927+00:00, run_end_date=2022-03-12 11:03:57.583295+00:00, run_duration=19.984368, state=success, external_trigger=False, run_type=backfill, data_interval_start=2022-03-06 00:00:00+00:00, data_interval_end=2022-03-07 00:00:00+00:00, dag_hash=None
[2022-03-12 11:03:57,584] {backfill_job.py:386} INFO - [backfill progress] | finished run 2 of 2 | tasks waiting: 0 | succeeded: 10 | running: 0 | failed: 0 | skipped: 4 | deadlocked: 0 | not ready: 0
[2022-03-12 11:03:57,589] {backfill_job.py:851} INFO - Backfill done. Exiting.

Operating System

macOS Big Sur, docker-compose

Versions of Apache Airflow Providers

No response

Deployment

Docker-Compose

Deployment details

Follow the Running Airflow in Docker guide (https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html), using CeleryExecutor.

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Swalloow added the area:core and kind:bug labels on Apr 26, 2022
tirkarthi (Contributor) commented:
In _get_ti, pool_override is not passed when the task instance is refreshed, so the default pool is used. Passing pool_override=args.pool will fix the issue. I am working on a PR with tests for this.

ti.refresh_from_task(task)
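The mechanism behind the bug can be sketched with a toy model (hypothetical simplified classes, not the real Airflow internals; the point is that refresh_from_task accepts a pool_override parameter, and the CLI's --pool value was never forwarded to it):

```python
# Toy model of the pool_override behavior described in the comment above.
# These classes are illustrative stand-ins, not Airflow's real Task/TaskInstance.

class Task:
    def __init__(self, task_id, pool="default_pool"):
        self.task_id = task_id
        self.pool = pool  # pool declared on the task itself

class TaskInstance:
    def __init__(self, task):
        self.task = task
        self.pool = task.pool

    def refresh_from_task(self, task, pool_override=None):
        # Take the override when given, otherwise fall back to the task's pool.
        self.pool = pool_override if pool_override is not None else task.pool

task = Task("runme_0")
ti = TaskInstance(task)

# Buggy path: the CLI's --pool argument is never forwarded, so the TI
# silently keeps the task's default pool.
ti.refresh_from_task(task)
print(ti.pool)  # default_pool

# Fixed path: forward --pool (args.pool) as pool_override.
ti.refresh_from_task(task, pool_override="backfill_pool")
print(ti.pool)  # backfill_pool
```

In the real code, the fix amounts to passing the CLI pool argument through to the refresh call instead of calling ti.refresh_from_task(task) with no override.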
