Mapped task failure causes scheduler crash #24985

Description

@arroadie

Apache Airflow version

2.3.3 (latest released)

What happened

When a dynamic (mapped) task in a DAG with a retry delay set fails, the scheduler dies with an unhandled AttributeError, because MappedOperator does not inherit the retry attributes that the dependency check expects on the task.

The following stack trace was produced:

Traceback (most recent call last):
  File "/home/airflow/.local/bin/airflow", line 8, in <module>
    sys.exit(main())
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/__main__.py", line 38, in main
    args.func(args)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/cli/cli_parser.py", line 51, in command
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/cli.py", line 99, in wrapper
    return f(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/cli/commands/scheduler_command.py", line 75, in scheduler
    _run_scheduler_job(args=args)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/cli/commands/scheduler_command.py", line 46, in _run_scheduler_job
    job.run()
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/jobs/base_job.py", line 244, in run
    self._execute()
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/jobs/scheduler_job.py", line 751, in _execute
    self._run_scheduler_loop()
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/jobs/scheduler_job.py", line 839, in _run_scheduler_loop
    num_queued_tis = self._do_scheduling(session)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/jobs/scheduler_job.py", line 921, in _do_scheduling
    callback_to_run = self._schedule_dag_run(dag_run, session)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/jobs/scheduler_job.py", line 1163, in _schedule_dag_run
    schedulable_tis, callback_to_run = dag_run.update_state(session=session, execute_callbacks=False)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/session.py", line 68, in wrapper
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/dagrun.py", line 524, in update_state
    info = self.task_instance_scheduling_decisions(session)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/session.py", line 68, in wrapper
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/dagrun.py", line 654, in task_instance_scheduling_decisions
    schedulable_tis, changed_tis, expansion_happened = self._get_ready_tis(
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/dagrun.py", line 701, in _get_ready_tis
    if not schedulable.are_dependencies_met(session=session, dep_context=dep_context):
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/session.py", line 68, in wrapper
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1166, in are_dependencies_met
    for dep_status in self.get_failed_dep_statuses(dep_context=dep_context, session=session):
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1187, in get_failed_dep_statuses
    for dep_status in dep.get_dep_statuses(self, session, dep_context):
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/ti_deps/deps/base_ti_dep.py", line 95, in get_dep_statuses
    yield from self._get_dep_statuses(ti, session, dep_context)
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/ti_deps/deps/not_in_retry_period_dep.py", line 47, in _get_dep_statuses
    next_task_retry_date = ti.next_retry_datetime()
  File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1241, in next_retry_datetime
    if self.task.max_retry_delay:
AttributeError: 'MappedOperator' object has no attribute 'max_retry_delay'

What you think should happen instead

The scheduler should handle task-definition errors gracefully: report the exception and continue its scheduling loop instead of crashing.
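The crash comes from next_retry_datetime reading self.task.max_retry_delay directly, which raises AttributeError when the task is an unexpanded MappedOperator. A minimal sketch of the defensive-lookup pattern that would keep the scheduler alive (the classes and function here are simplified stand-ins, not Airflow's actual code):

```python
from datetime import datetime, timedelta

class MappedOperator:
    """Simplified stand-in: has a retry_delay but, like the real
    MappedOperator in 2.3.3, no max_retry_delay attribute."""
    retry_delay = timedelta(minutes=5)

def next_retry_datetime(task, end_date):
    """Compute the next retry time without assuming max_retry_delay exists."""
    delay = task.retry_delay
    # Defensive lookup: fall back to None instead of raising AttributeError,
    # so a missing attribute cannot take the whole scheduler loop down.
    max_delay = getattr(task, "max_retry_delay", None)
    if max_delay is not None:
        delay = min(delay, max_delay)
    return end_date + delay

# A mapped task with no max_retry_delay no longer crashes the computation:
print(next_retry_datetime(MappedOperator(), datetime(2022, 7, 1)))
```

The real fix would live in next_retry_datetime (taskinstance.py) or in MappedOperator itself, but the getattr-with-default pattern is the general shape of the defensive handling suggested above.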

How to reproduce

From a fresh install:

  • Create a DAG with retries and a retry delay configured
  • Add a dynamic (mapped) task that raises an error
  • Watch the scheduler when the failed task enters its retry period
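The steps above can be sketched as a minimal DAG file (DAG and task names are hypothetical; this uses the TaskFlow API and .expand() available in Airflow 2.3):

```python
from datetime import datetime, timedelta

from airflow.decorators import dag, task

@dag(
    dag_id="mapped_retry_crash_repro",
    schedule_interval=None,
    start_date=datetime(2022, 1, 1),
    catchup=False,
)
def mapped_retry_crash_repro():
    # retry_delay forces the scheduler through NotInRetryPeriodDep,
    # which is where next_retry_datetime() is called.
    @task(retries=2, retry_delay=timedelta(seconds=30))
    def always_fail(x):
        raise RuntimeError("boom")

    # Mapping the task makes it a MappedOperator, which lacks
    # max_retry_delay and triggers the AttributeError.
    always_fail.expand(x=[1, 2, 3])

mapped_retry_crash_repro()
```

Trigger the DAG, then watch the scheduler logs once the mapped task fails and waits to retry; the traceback above should appear.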

Operating System

Docker on macOS Big Sur

Versions of Apache Airflow Providers

No response

Deployment

Docker-Compose

Deployment details

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

  • I agree to follow this project's Code of Conduct


Labels

    duplicate (Issue that is duplicated)
