Skip to content

Conversation

@TobKed
Copy link
Contributor

@TobKed TobKed commented Oct 9, 2020


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.

@boring-cyborg boring-cyborg bot added the provider:google Google (including GCP) related issues label Oct 9, 2020
@TobKed TobKed force-pushed the dataflow-add-drain-option branch from 49efb51 to 98673d1 Compare October 9, 2020 12:06
@TobKed TobKed marked this pull request as draft October 9, 2020 12:07
@TobKed TobKed changed the title Add drain option when canceling Dataflow pipelines [WIP] Add drain option when canceling Dataflow pipelines Oct 9, 2020
@TobKed TobKed mentioned this pull request Oct 9, 2020
6 tasks
@github-actions
Copy link

github-actions bot commented Oct 9, 2020

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DataflowJobStatus.JOB_STATE_DRAINED instead of DataflowJobType.JOB_TYPE_STREAMING ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Fixed.

Copy link
Member

@aaltay aaltay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thank you.

I know little about airflow, 2 clarifying questions:

  • Would users generally decide on drain vs cancel when they are starting the operation? (Would they change their minds before onkill?)
  • Would they want to wait until drain is completed? Drain may or may not fully complete in 5 minutes timeout window.

@TobKed TobKed force-pushed the dataflow-add-drain-option branch from 98673d1 to 8e5a516 Compare October 12, 2020 17:09
@TobKed
Copy link
Contributor Author

TobKed commented Oct 13, 2020

* Would users generally decide on drain vs cancel when they are starting the operation? (Would they change their minds before onkill?)

Users define this parameter in the code which defines the DAG. AFAIK when DAG is running it is not possible to change this parameters afterwards. User can edit DAG and run it again with the new parameters.
@turbaszek could you confirm am I right, please?

* Would they want to wait until drain is completed? Drain may or may not fully complete in 5 minutes timeout window.

I created separate PR to handle timeout, I think it is good idea to allow configure it so users could decide what timeout is suitable for them.
#11501

@TobKed TobKed force-pushed the dataflow-add-drain-option branch from 8e5a516 to 8525bf9 Compare October 14, 2020 09:44
@TobKed TobKed changed the title [WIP] Add drain option when canceling Dataflow pipelines Add drain option when canceling Dataflow pipelines Oct 14, 2020
@TobKed
Copy link
Contributor Author

TobKed commented Oct 14, 2020

Thanks @aaltay
I rebased on the latest master and added tests.
cc @mik-laj @turbaszek @potiuk

@TobKed TobKed marked this pull request as ready for review October 14, 2020 10:19
@github-actions
Copy link

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.

@TobKed TobKed force-pushed the dataflow-add-drain-option branch 3 times, most recently from 8e51958 to 07e3e33 Compare October 22, 2020 07:06
@github-actions
Copy link

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.

@TobKed TobKed force-pushed the dataflow-add-drain-option branch 2 times, most recently from a34b65e to ee87cda Compare October 23, 2020 11:03
@github-actions
Copy link

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.

@github-actions
Copy link

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.

@TobKed TobKed force-pushed the dataflow-add-drain-option branch from c701cd2 to 766a251 Compare October 23, 2020 16:08
@TobKed TobKed force-pushed the dataflow-add-drain-option branch from 766a251 to 5a17511 Compare October 23, 2020 16:13
@mik-laj mik-laj merged commit e5713e0 into apache:master Oct 29, 2020
@mik-laj mik-laj deleted the dataflow-add-drain-option branch October 29, 2020 13:22
michalmisiewicz pushed a commit to michalmisiewicz/airflow that referenced this pull request Oct 30, 2020
* Add drain option when cancel Dataflow pipelines

* fixup! Add drain option when cancel Dataflow pipelines

* fixup! fixup! Add drain option when cancel Dataflow pipelines

* fixup! fixup! fixup! Add drain option when cancel Dataflow pipelines
szn pushed a commit to szn/airflow that referenced this pull request Nov 1, 2020
* Add drain option when cancel Dataflow pipelines

* fixup! Add drain option when cancel Dataflow pipelines

* fixup! fixup! Add drain option when cancel Dataflow pipelines

* fixup! fixup! fixup! Add drain option when cancel Dataflow pipelines
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

provider:google Google (including GCP) related issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants