DataprocCreateBatchOperator in deferrable mode doesn't reattach with deferment.  #32215

@kristopherkane

Apache Airflow version

main (development)

What happened

The DataprocCreateBatchOperator (Google provider) handles the case where a batch_id already exists in the Dataproc API by 'reattaching' to the existing, potentially still running, batch.

Current reattachment logic uses the non-deferrable method even when the operator is in deferrable mode.

What you think should happen instead

The operator should reattach in deferrable mode.
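
To make the expected behaviour concrete, here is a minimal sketch of the control flow, not the provider's actual code. `FakeDataprocHook`, `execute`, and the local `AlreadyExists` / `TaskDeferred` classes are hypothetical stand-ins (for the Dataproc hook, the operator's execute method, `google.api_core.exceptions.AlreadyExists`, and `airflow.exceptions.TaskDeferred`), so the example runs without Airflow installed.

```python
from dataclasses import dataclass, field


class AlreadyExists(Exception):
    """Stand-in for google.api_core.exceptions.AlreadyExists."""


class TaskDeferred(Exception):
    """Stand-in for airflow.exceptions.TaskDeferred (raised when a task defers)."""


@dataclass
class FakeDataprocHook:
    """Hypothetical hook stub that records which wait path was taken."""

    existing_batch_ids: set = field(default_factory=set)
    calls: list = field(default_factory=list)

    def create_batch(self, batch_id):
        # Mimic the API rejecting a batch_id that is already in use.
        if batch_id in self.existing_batch_ids:
            raise AlreadyExists(batch_id)
        self.existing_batch_ids.add(batch_id)
        self.calls.append("create_batch")

    def wait_for_batch(self, batch_id):
        # Blocking wait: in a real operator this holds a worker slot.
        self.calls.append("wait_for_batch")
        return "SUCCEEDED"


def execute(hook, batch_id, deferrable):
    """Create the batch, or reattach if it already exists, then wait."""
    try:
        hook.create_batch(batch_id)
    except AlreadyExists:
        # Reattach: fall through and wait on the existing batch.
        pass

    if deferrable:
        # Expected behaviour: the reattach path defers too,
        # instead of falling back to the blocking wait.
        raise TaskDeferred(batch_id)
    return hook.wait_for_batch(batch_id)
```

In this sketch the `deferrable` check sits after the `AlreadyExists` handler, so a new batch and a reattached batch take the same wait path. The issue reports that the reattach path currently always takes the blocking branch.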

How to reproduce

Create a DAG with a long-running DataprocCreateBatchOperator task, and set deferrable=True in the operator's constructor.

Restart local Airflow to simulate having to 'reattach' to a running job in Google Cloud Dataproc.

The operator reattaches to the running job, but it goes through the code path for the non-deferrable logic.

Operating System

macOS 13.4.1 (22F82)

Versions of Apache Airflow Providers

Current main.

Deployment

Official Apache Airflow Helm Chart

Deployment details

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
