Airflow 2.6.3 MySQL: Deadlock found when trying to get lock; try restarting transaction #35144

@liangriyu

Apache Airflow version

Other Airflow 2 version (please specify below)

What happened

2023-10-24 00:10:20,620 ERROR - Exception when executing SchedulerJob._run_scheduler_loop
Traceback (most recent call last):
  File "/usr/local/miniconda3/envs/airflow/lib/python3.9/site-packages/airflow/jobs/scheduler_job_runner.py", line 835, in _execute
    self._run_scheduler_loop()
  File "/usr/local/miniconda3/envs/airflow/lib/python3.9/site-packages/airflow/jobs/scheduler_job_runner.py", line 969, in _run_scheduler_loop
    num_queued_tis = self._do_scheduling(session)
  File "/usr/local/miniconda3/envs/airflow/lib/python3.9/site-packages/airflow/jobs/scheduler_job_runner.py", line 1051, in _do_scheduling
    callback_tuples = self._schedule_all_dag_runs(guard, dag_runs, session)
  File "/usr/local/miniconda3/envs/airflow/lib/python3.9/site-packages/airflow/utils/retries.py", line 78, in wrapped_function
    for attempt in run_with_db_retries(max_retries=retries, logger=logger, **retry_kwargs):
  File "/usr/local/miniconda3/envs/airflow/lib/python3.9/site-packages/tenacity/__init__.py", line 347, in __iter__
    do = self.iter(retry_state=retry_state)
  File "/usr/local/miniconda3/envs/airflow/lib/python3.9/site-packages/tenacity/__init__.py", line 314, in iter
    return fut.result()
  File "/usr/local/miniconda3/envs/airflow/lib/python3.9/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/usr/local/miniconda3/envs/airflow/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/usr/local/miniconda3/envs/airflow/lib/python3.9/site-packages/airflow/utils/retries.py", line 87, in wrapped_function
    return func(*args, **kwargs)
  File "/usr/local/miniconda3/envs/airflow/lib/python3.9/site-packages/airflow/jobs/scheduler_job_runner.py", line 1355, in _schedule_all_dag_runs
    for dag_run in dag_runs:
  File "/usr/local/miniconda3/envs/airflow/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 2901, in __iter__
    result = self._iter()
  File "/usr/local/miniconda3/envs/airflow/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 2916, in _iter
    result = self.session.execute(
  File "/usr/local/miniconda3/envs/airflow/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1716, in execute
    conn = self._connection_for_bind(bind)
  File "/usr/local/miniconda3/envs/airflow/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1555, in _connection_for_bind
    return self._transaction._connection_for_bind(
  File "/usr/local/miniconda3/envs/airflow/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 724, in _connection_for_bind
    self._assert_active()
  File "/usr/local/miniconda3/envs/airflow/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 604, in _assert_active
    raise sa_exc.PendingRollbackError(
sqlalchemy.exc.PendingRollbackError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (mysql.connector.errors.InternalError) 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
[SQL: UPDATE dag_run SET last_scheduling_decision=%(last_scheduling_decision)s, updated_at=%(updated_at)s WHERE dag_run.id = %(dag_run_id)s]
[parameters: ({'last_scheduling_decision': datetime.datetime(2023, 10, 23, 16, 10, 20, 153406), 'updated_at': datetime.datetime(2023, 10, 23, 16, 10, 20, 389092), 'dag_run_id': 154}, {'last_scheduling_decision': datetime.datetime(2023, 10, 23, 16, 10, 19, 835645), 'updated_at': datetime.datetime(2023, 10, 23, 16, 10, 20, 389109), 'dag_run_id': 157}, {'last_scheduling_decision': datetime.datetime(2023, 10, 23, 16, 10, 20, 81685), 'updated_at': datetime.datetime(2023, 10, 23, 16, 10, 20, 389115), 'dag_run_id': 158}, {'last_scheduling_decision': datetime.datetime(2023, 10, 23, 16, 10, 20, 116207), 'updated_at': datetime.datetime(2023, 10, 23, 16, 10, 20, 389119), 'dag_run_id': 159})]
(Background on this error at: https://sqlalche.me/e/14/2j85) (Background on this error at: https://sqlalche.me/e/14/7s2a)
2023-10-24 00:10:21,697 INFO - Sending Signals.SIGTERM to group 31151. PIDs of all processes in the group: [32813, 32842, 31151]
2023-10-24 00:10:21,698 INFO - Sending the signal Signals.SIGTERM to group 31151
2023-10-24 00:10:21,991 INFO - Process psutil.Process(pid=32813, status='terminated', started='00:10:20') (32813) terminated with exit code None
2023-10-24 00:10:22,084 INFO - Process psutil.Process(pid=32842, status='terminated', started='00:10:20') (32842) terminated with exit code None
2023-10-24 00:10:22,136 INFO - Process psutil.Process(pid=31151, status='terminated', exitcode=0, started='00:09:41') (31151) terminated with exit code 0
2023-10-24 00:10:22,138 INFO - Exited execute loop
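
The traceback shows two stacked failures: the original MySQL deadlock (error 1213) during a flush, and then a PendingRollbackError because the same session was reused without calling Session.rollback() first. The following is a minimal, stdlib-only sketch of the general pattern of rolling back before retrying. It is not Airflow's implementation (Airflow uses run_with_db_retries from airflow/utils/retries.py, as the traceback shows). FakeSession, DeadlockError, is_deadlock, run_with_deadlock_retry, and update_dag_runs are hypothetical names made up for illustration, and checking an errno attribute is an assumption about how the driver error is surfaced; with SQLAlchemy the driver exception is usually reached through the wrapping exception's orig attribute.

```python
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for a driver error carrying MySQL code 1213."""
    errno = 1213


class FakeSession:
    """Hypothetical session that deadlocks on its first flush only."""

    def __init__(self):
        self.needs_rollback = False
        self.flush_calls = 0
        self.rollbacks = 0

    def flush(self):
        # Mimics the PendingRollbackError behaviour: a failed session
        # refuses further work until rollback() is called.
        if self.needs_rollback:
            raise RuntimeError("pending rollback: call rollback() first")
        self.flush_calls += 1
        if self.flush_calls == 1:
            self.needs_rollback = True
            raise DeadlockError("1213 (40001): Deadlock found when trying to get lock")

    def rollback(self):
        self.needs_rollback = False
        self.rollbacks += 1


def is_deadlock(exc):
    # Assumption: the error exposes the MySQL error number as `errno`.
    return getattr(exc, "errno", None) == 1213


def run_with_deadlock_retry(session, work, max_retries=3, delay=0.0):
    """Run work(session); on deadlock, roll back and try again."""
    for attempt in range(1, max_retries + 1):
        try:
            return work(session)
        except Exception as exc:
            # Always roll back first so the session is usable again.
            session.rollback()
            if not is_deadlock(exc) or attempt == max_retries:
                raise
            time.sleep(delay * attempt)  # simple linear backoff


def update_dag_runs(session):
    # Placeholder for the batched UPDATE on dag_run seen in the log.
    session.flush()
    return "updated"


session = FakeSession()
result = run_with_deadlock_retry(session, update_dag_runs)
```

In this sketch the first attempt hits the simulated deadlock, the session is rolled back, and the second attempt succeeds. Skipping the rollback() call would make the second attempt fail with the "pending rollback" error instead, which mirrors the PendingRollbackError in the log above.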

What you think should happen instead

No response

How to reproduce

When multiple DAG tasks fail simultaneously.

Operating System

CentOS 7

Versions of Apache Airflow Providers

apache-airflow 2.6.3
apache-airflow-providers-celery 3.2.1
apache-airflow-providers-common-sql 1.5.2
apache-airflow-providers-datadog 3.3.1
apache-airflow-providers-ftp 3.4.2
apache-airflow-providers-http 4.4.2
apache-airflow-providers-imap 3.2.2
apache-airflow-providers-mysql 5.1.1
apache-airflow-providers-redis 3.2.1
apache-airflow-providers-sqlite 3.4.2

Deployment

Virtualenv installation

Deployment details

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Labels

Can't Reproduce, affected_version:2.6, area:MetaDB, kind:bug, pending-response, stale
