@stelsemeyer-m60
Contributor


Parametrize the interval at which the Kubernetes pod status is polled when launching a new pod.

When using serverless Kubernetes services such as Google GKE Autopilot, pod startup can take longer than usual because of cold starts. With the default check every second, the logs fill up with repeated warnings (see below), so a lower check frequency may be desirable:

[2023-05-02, 05:33:22 UTC] {pod_manager.py:187} WARNING - Pod not yet started: some-pod-he2j8139
[2023-05-02, 05:33:23 UTC] {pod_manager.py:187} WARNING - Pod not yet started: some-pod-he2j8139
[2023-05-02, 05:33:24 UTC] {pod_manager.py:187} WARNING - Pod not yet started: some-pod-he2j8139
[2023-05-02, 05:33:25 UTC] {pod_manager.py:187} WARNING - Pod not yet started: some-pod-he2j8139
[2023-05-02, 05:33:26 UTC] {pod_manager.py:187} WARNING - Pod not yet started: some-pod-he2j8139
[2023-05-02, 05:33:27 UTC] {pod_manager.py:187} WARNING - Pod not yet started: some-pod-he2j8139
[2023-05-02, 05:33:28 UTC] {pod_manager.py:187} WARNING - Pod not yet started: some-pod-he2j8139
[2023-05-02, 05:33:29 UTC] {pod_manager.py:187} WARNING - Pod not yet started: some-pod-he2j8139
[2023-05-02, 05:33:30 UTC] {pod_manager.py:187} WARNING - Pod not yet started: some-pod-he2j8139
[2023-05-02, 05:33:31 UTC] {pod_manager.py:187} WARNING - Pod not yet started: some-pod-he2j8139
[2023-05-02, 05:33:32 UTC] {pod_manager.py:187} WARNING - Pod not yet started: some-pod-he2j8139
[2023-05-02, 05:33:33 UTC] {pod_manager.py:187} WARNING - Pod not yet started: some-pod-he2j8139
[2023-05-02, 05:33:34 UTC] {pod_manager.py:187} WARNING - Pod not yet started: some-pod-he2j8139
[2023-05-02, 05:33:35 UTC] {pod_manager.py:187} WARNING - Pod not yet started: some-pod-he2j8139
[2023-05-02, 05:33:36 UTC] {pod_manager.py:187} WARNING - Pod not yet started: some-pod-he2j8139
[2023-05-02, 05:33:37 UTC] {pod_manager.py:187} WARNING - Pod not yet started: some-pod-he2j8139
...
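
To illustrate the idea, here is a minimal, self-contained sketch of a startup polling loop with a configurable interval. It is not the provider's code: the function name await_pod_start, the is_started callback, and the injectable sleep and clock arguments are all hypothetical and exist only so the behaviour can be checked without a cluster.

```python
import logging
import time
from typing import Callable

log = logging.getLogger(__name__)


def await_pod_start(
    is_started: Callable[[], bool],
    pod_name: str = "some-pod",
    startup_timeout: float = 120,
    startup_check_interval: float = 1,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Poll until is_started() returns True; return the number of checks made.

    Hypothetical helper: one warning is logged per failed check, so a larger
    startup_check_interval means proportionally fewer log lines.
    """
    start = clock()
    checks = 0
    while True:
        checks += 1
        if is_started():
            return checks
        if clock() - start >= startup_timeout:
            raise TimeoutError(
                f"Pod took longer than {startup_timeout} seconds to start."
            )
        log.warning("Pod not yet started: %s", pod_name)
        # Wait for the configured interval instead of a fixed one second.
        sleep(startup_check_interval)
```

With an interval of 10 seconds instead of 1, a pod that needs about 15 seconds to start would produce roughly two warnings rather than sixteen.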

@boring-cyborg bot added the area:providers, kind:documentation, provider:cncf-kubernetes, and provider:google labels Sep 9, 2023
@stelsemeyer-m60
Contributor Author

Refers to #31008.
Finishing off work in this PR.

Member

You need to add an else branch after

    if delta.total_seconds() >= self.startup_timeout:
        message = (
            f"Pod took longer than {self.startup_timeout} seconds to start. "
            "Check the pod events in kubernetes to determine why."
        )
        yield TriggerEvent(
            {
                "name": self.pod_name,
                "namespace": self.pod_namespace,
                "status": "timeout",
                "message": message,
            }
        )
        return

and call await asyncio.sleep(self.startup_check_interval) inside it.
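
For readers following along, the suggestion can be sketched as a standalone async generator. This is an assumption-laden illustration rather than the trigger's actual code: wait_for_start, the is_started coroutine, and the plain dict events (standing in for TriggerEvent) are hypothetical.

```python
import asyncio
import datetime
from typing import Any, AsyncIterator, Awaitable, Callable, Dict


async def wait_for_start(
    is_started: Callable[[], Awaitable[bool]],
    startup_timeout: float,
    startup_check_interval: float,
) -> AsyncIterator[Dict[str, Any]]:
    """Yield one event: 'started' or 'timeout' (hypothetical event shape)."""
    start = datetime.datetime.now(tz=datetime.timezone.utc)
    while True:
        if await is_started():
            yield {"status": "started"}
            return
        delta = datetime.datetime.now(tz=datetime.timezone.utc) - start
        if delta.total_seconds() >= startup_timeout:
            yield {
                "status": "timeout",
                "message": (
                    f"Pod took longer than {startup_timeout} seconds to start."
                ),
            }
            return
        else:
            # Not started and not timed out: wait without blocking the
            # event loop, then check again.
            await asyncio.sleep(startup_check_interval)
```

Using asyncio.sleep rather than time.sleep matters here because the loop runs inside a coroutine, and a blocking sleep would stall every other task sharing the same event loop.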

@stelsemeyer-m60 stelsemeyer-m60 marked this pull request as ready for review September 12, 2023 23:02
@eladkal eladkal requested a review from hussein-awala October 1, 2023 09:15
@eladkal eladkal merged commit 2b0bfea into apache:main Nov 1, 2023
@boring-cyborg
boring-cyborg bot commented Nov 1, 2023

Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.

romsharon98 pushed a commit to romsharon98/airflow that referenced this pull request Nov 10, 2023
…pache#34231)

* add startup_check_interval_seconds

* change default value in method

* fix static checks, add missing param, fix typo

* default is 1s

* fix outdated docs

* add test to check time.sleep is called with specific value

* add more documentation

* rephrase

* add sleep in else clause

* Update airflow/providers/cncf/kubernetes/triggers/pod.py

---------

Co-authored-by: eladkal <45845474+eladkal@users.noreply.github.com>
Co-authored-by: Hussein Awala <hussein@awala.fr>