@AchimGaedkeLynker
Contributor

This PR terminates EC2 instances that are still being created when the task instance is stopped (for example, externally marked for retry) before the instance IDs have been returned. It uses the operator's on_kill method to shut the instances down.

Use case/problem:

The EC2CreateInstanceOperator is often used as a setup task, as described in https://airflow.apache.org/docs/apache-airflow/stable/howto/setup-and-teardown.html. If the task inside the DAG is cancelled, an EC2CreateInstanceOperator task that may still be running is killed. This is likely when the AWS initialization and/or the post-processing step takes considerable time (large instance storage, long cloud-init processes) and wait_for_completion is True.

Without the on_kill cleanup code, partially initialized instances (i.e. instances whose IDs were never pushed to XCom) will not be terminated by the teardown task.
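
For illustration, the cleanup pattern described above could look roughly like the sketch below. It is not the code from this PR: `CreateInstanceSketch`, `FakeEC2Client`, and the `waiter` callable are hypothetical stand-ins so that the example runs without Airflow or AWS. Only the names `execute`, `on_kill`, `wait_for_completion`, and `_on_kill_instance_ids` come from the discussion. The idea is to record the instance IDs before any long wait, so that on_kill can terminate them if the task is killed in the meantime.

```python
class FakeEC2Client:
    """Hypothetical stand-in for an EC2 client; records calls instead of hitting AWS."""

    def __init__(self):
        self.terminated = []

    def run_instances(self, **kwargs):
        # Pretend that two instances were started.
        return {"Instances": [{"InstanceId": "i-0001"}, {"InstanceId": "i-0002"}]}

    def terminate_instances(self, InstanceIds):
        self.terminated.extend(InstanceIds)


class CreateInstanceSketch:
    """Illustrative operator-like class, NOT the real EC2CreateInstanceOperator."""

    def __init__(self, client, wait_for_completion=False, waiter=None):
        self.client = client
        self.wait_for_completion = wait_for_completion
        self.waiter = waiter  # hypothetical callable that blocks until instances run
        self._on_kill_instance_ids = []

    def execute(self, context):
        instances = self.client.run_instances(MinCount=1, MaxCount=1)["Instances"]
        # Record the IDs *before* the potentially long wait, so on_kill can see them.
        self._on_kill_instance_ids = [inst["InstanceId"] for inst in instances]
        if self.wait_for_completion and self.waiter is not None:
            self.waiter(self._on_kill_instance_ids)
        return list(self._on_kill_instance_ids)

    def on_kill(self):
        # Terminate whatever was started but never handed over to XCom.
        if self._on_kill_instance_ids:
            self.client.terminate_instances(InstanceIds=self._on_kill_instance_ids)
```

In this sketch, a kill that arrives while `waiter` is blocking leaves the IDs in `_on_kill_instance_ids`, so a subsequent `on_kill()` call still reaches `terminate_instances`.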

This happened to me several times before I finally found the cause and fixed it.

Alternative: Split the setup into many small tasks: EC2CreateInstance as a setup task (without wait_for_completion), wait for the instance state, wait for the cloud-init setup to finish, ... and EC2TerminateInstance as the teardown task.

Questions:

  • Is it required to delete _on_kill_instance_ids? What comes after execute in the TI lifecycle?
  • How can this be unit-tested?

Contributor

@dirrao left a comment

Nice work. Can you add the test cases for this?

@AchimGaedkeLynker
Contributor Author

AchimGaedkeLynker commented Jan 17, 2024

> Can you add the test cases for this?

Absolutely, I just have to find out how to test on_kill... It's on my todo list. Will dig into the AWS mock setup...

@vincbeck
Contributor

> > Can you add the test cases for this?
>
> Absolutely, I just have to find out how to test on_kill... It's on my todo list. Will dig into the AWS mock setup...

You can call the on_kill function as any other function ;)
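
As a rough illustration of that suggestion, a test can set the recorded IDs by hand, call on_kill directly, and assert on a mocked client. The sketch below uses only `unittest.mock`. `OperatorUnderTest` is a hypothetical stand-in so that the example is self-contained; an actual test would presumably import the real operator and patch its hook instead.

```python
from unittest import mock


class OperatorUnderTest:
    """Hypothetical stand-in; a real test would import the actual operator."""

    def __init__(self, client):
        self.client = client
        self._on_kill_instance_ids = []

    def on_kill(self):
        if self._on_kill_instance_ids:
            self.client.terminate_instances(InstanceIds=self._on_kill_instance_ids)


def test_on_kill_terminates_recorded_instances():
    client = mock.MagicMock()
    op = OperatorUnderTest(client)
    # Simulate an execute() that was interrupted after the IDs were recorded.
    op._on_kill_instance_ids = ["i-123"]
    op.on_kill()
    client.terminate_instances.assert_called_once_with(InstanceIds=["i-123"])


def test_on_kill_without_instances_is_a_no_op():
    client = mock.MagicMock()
    op = OperatorUnderTest(client)
    op.on_kill()
    client.terminate_instances.assert_not_called()
```

Both functions follow the pytest naming convention, so a test runner would collect them as they are. No real AWS calls are made because the client is a `MagicMock`.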

Labels

area:providers provider:amazon AWS/Amazon - related issues

4 participants