KPO async logging callback #60778
Conversation
SameerMesiah97 left a comment:
I left a few comments. Please review.
```python
if mode == ExecutionMode.ASYNC:
    return f(*args, mode=mode, **kwargs)
return asyncio.run(f(*args, mode=mode, **kwargs))
```
Are you confident that serializable_callback will not be called from within a running event loop? Since this is a helper, we cannot guarantee that it will not be invoked from within a pre-existing event loop, and that would result in a RuntimeError. I believe it would be better to not special-case ASYNC mode and let the wrapper return the function as-is regardless of ExecutionMode, without invoking asyncio.run(). Unless you have a strong reason to do it this way.
Well, the idea was to define a helper function that could be used in a callback like:
```python
class AsyncCallback(KubernetesPodOperatorCallback):
    @staticmethod
    @serializable_callback
    async def progress_callback(*, line: str, **kwargs) -> None:
        ...
```
so that the same callback could be used in the triggerer or in the operator, which invokes it when the triggerer hands back to the operator if there are remaining logs in the pod that haven't been processed. Ideally the callback should then be written in async form, since a sync callback would block in the triggerer, though that obviously depends on what the callback is doing.
Invocations of these callbacks are in the operator/triggerer, I don't know of anywhere else they are used.
So the invocation of @serializable_callback is not being guarded, and you are relying on the user correctly inferring the intent of the function? I agree with your motivation for introducing this helper, but we cannot guarantee that it will not be called within an event loop in the operator. Is there any reason why the operator cannot do something like this:
```python
asyncio.run(callback(...))
```
If that is not feasible, I believe that at the very minimum:
1. The possibility of encountering RuntimeError should be documented very clearly (docstring and comment).
2. The RuntimeError should be caught in a try/except block with a more informative error message.
Below is a suggested implementation if you are still intent on keeping 2 separate modes:
```python
def serializable_callback(f):
    """
    Convert an async callback so it can run in sync or async mode.

    In ASYNC mode (e.g. triggerer), the callback is expected to be awaited
    by the caller. In SYNC mode (e.g. operator fallback), the callback is
    executed via asyncio.run(); callers should ensure this is only used
    when no event loop is already running.
    """

    @wraps(f)
    def wrapper(*args, mode: str, **kwargs):
        if mode == ExecutionMode.ASYNC:
            return f(*args, mode=mode, **kwargs)
        # SYNC mode owns the event loop; calling this while a loop is already
        # running is a hard error and indicates a misclassified execution context.
        try:
            return asyncio.run(f(*args, mode=mode, **kwargs))
        except RuntimeError as e:
            raise RuntimeError(
                "Cannot call serializable_callback in SYNC mode while an event "
                "loop is running. Use ExecutionMode.ASYNC and await the callback "
                "instead."
            ) from e

    return wrapper
```
This will immediately inform the user of the reason for the RuntimeError and mitigate against further unsafe usage.
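To illustrate the failure mode being discussed, here is a self-contained sketch of the suggested wrapper. `ExecutionMode` is a minimal stand-in for the class referenced in the PR, and `progress_callback` is a dummy callback, both assumptions for illustration only:

```python
import asyncio
from functools import wraps

# Minimal stand-in for the ExecutionMode referenced in the PR (assumption).
class ExecutionMode:
    SYNC = "sync"
    ASYNC = "async"

def serializable_callback(f):
    @wraps(f)
    def wrapper(*args, mode: str, **kwargs):
        if mode == ExecutionMode.ASYNC:
            # ASYNC mode: hand the coroutine back for the caller to await.
            return f(*args, mode=mode, **kwargs)
        try:
            # SYNC mode: we own the event loop.
            return asyncio.run(f(*args, mode=mode, **kwargs))
        except RuntimeError as e:
            raise RuntimeError(
                "Cannot call serializable_callback in SYNC mode while an "
                "event loop is running; use ExecutionMode.ASYNC and await "
                "the callback instead."
            ) from e
    return wrapper

@serializable_callback
async def progress_callback(*, line: str, mode: str, **kwargs) -> None:
    pass  # dummy body for the demo

# SYNC mode outside any loop: asyncio.run() succeeds.
progress_callback(line="ok", mode=ExecutionMode.SYNC)

# SYNC mode inside a running loop: asyncio.run() fails, and the wrapper
# re-raises with the informative message.
async def misuse():
    progress_callback(line="boom", mode=ExecutionMode.SYNC)

try:
    asyncio.run(misuse())
except RuntimeError as e:
    print("caught:", e)
```

Running this shows the informative error surfacing immediately instead of the bare "asyncio.run() cannot be called from a running event loop".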
```python
    pod=pod,
)
if asyncio.iscoroutine(cb):
    await cb
```
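For reference, the dispatch pattern in the diff above can be reproduced in isolation: invoke a callback that may be sync or async, and await the result only if it is a coroutine. The names below are illustrative, not the operator's actual API:

```python
import asyncio

received = []

def sync_cb(line):
    received.append(("sync", line))

async def async_cb(line):
    received.append(("async", line))

async def emit_progress(callback, line):
    # A sync callback returns None; an async callback returns a coroutine
    # that still needs to be awaited.
    cb = callback(line)
    if asyncio.iscoroutine(cb):
        await cb

async def main():
    await emit_progress(sync_cb, "hello")
    await emit_progress(async_cb, "world")

asyncio.run(main())
print(received)  # [('sync', 'hello'), ('async', 'world')]
```

This is what lets the same call site accept either callback style without two code paths.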
Should we really be executing arbitrary user code inside the triggerer? Even though this is async, a long-running or blocking progress callback (for example calling an external API without a timeout) can still starve the triggerer’s event loop. That at least blocks the trigger executing it, and potentially other triggers handled by the same triggerer process. This feels like a fairly big design trade-off just to support progress logging, and I’m not sure it’s worth it.
The specific use case I have in mind is in support of the watcher pattern that is being implemented in astronomer cosmos for running DBT. astronomer/astronomer-cosmos#2207
With this you have a pod that's actually running DBT, and an Airflow task that is parsing Kubernetes logs to extract DBT events and create XCom variables that are consumed by sensors. I think the parsing of the logs and setting of XCom variables should be lightweight enough that it can be run from the triggerer. Implementing this from the triggerer was part of the original design of the deferred mode for KPO, but the implementation ran into issues so it was stripped back out.
I did wonder if I should check the callbacks and only pass the ones that actually implement progress_callback, to make everything a little lighter.
The key phrase here is “should be lightweight”. Can we really guarantee that? Reading the code, it doesn’t seem like there are any restrictions on what a callback could execute, so we are effectively trusting users not to do any heavy lifting in them.
One possible compromise would be to enforce a small global timeout for callbacks (e.g. a few seconds at most). However, even with that, this still sets a precedent for executing arbitrary user code in the triggerer.
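A sketch of that compromise, assuming a hypothetical `CALLBACK_TIMEOUT` setting and helper (neither exists in Airflow today): bound each user callback with `asyncio.wait_for` so a slow or blocked callback cannot starve the triggerer's event loop indefinitely.

```python
import asyncio

CALLBACK_TIMEOUT = 0.1  # hypothetical global limit, in seconds
results = []

async def run_callback_guarded(cb, **kwargs):
    try:
        # wait_for cancels the callback task once the timeout expires.
        await asyncio.wait_for(cb(**kwargs), timeout=CALLBACK_TIMEOUT)
    except asyncio.TimeoutError:
        # Record and move on; one misbehaving callback should not block others.
        results.append("timed out")

async def fast_cb(*, line):
    results.append(f"processed: {line}")

async def slow_cb(*, line):
    await asyncio.sleep(5)  # simulates a hung external API call

async def main():
    await run_callback_guarded(fast_cb, line="ok")
    await run_callback_guarded(slow_cb, line="never")

asyncio.run(main())
print(results)  # ['processed: ok', 'timed out']
```

Note this bounds how long a callback can run, but a callback doing blocking (non-async) work would still freeze the loop until cancellation can take effect, so it mitigates rather than eliminates the concern.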
I agree the motivation here is solid and I can see the value of the feature, but this crosses into a fundamental triggerer design decision, which I’m not comfortable approving or disapproving unilaterally.
@jscheffl I know you requested my review — I’d be interested in your thoughts on whether this is a precedent we’re happy to set for triggerers.
Left more comments. I think this needs some discussion.
When running the KubernetesPodOperator in deferred mode, we have implemented support for most callbacks; however, the progress_callback one was still missing. This PR implements that callback from the triggerer, along with a fallback to the operator if any logs were missing from what the triggerer reports.
When the callbacks were first added, the original committer noted in #35714 that:
> We could probably do something a lot more complicated to serialize/deserialize the callbacks, but as the callbacks need to be static anyway, I'm just putting their module path as a string and importing when de-serializing the trigger. There might be a need to do something more complicated, but I think this is better than nothing.
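The scheme described in that quote can be sketched in a few lines: serialize a static callback class as its `"module.QualName"` string and re-import it on deserialization. The helper names are illustrative, not the provider's actual API, and the sketch assumes the class is defined at module top level:

```python
import importlib

def serialize_callback(cls) -> str:
    # Record only the import path; the class itself never crosses the wire.
    return f"{cls.__module__}.{cls.__qualname__}"

def deserialize_callback(path: str):
    # Split on the last dot: everything before it is the module,
    # the rest is a top-level attribute of that module.
    module_name, _, name = path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, name)

# Round-trip, using a stdlib class as a stand-in for a callback class:
from collections import OrderedDict

path = serialize_callback(OrderedDict)
print(path)  # collections.OrderedDict
assert deserialize_callback(path) is OrderedDict
```

This is also why the callbacks need to be static: only the path is serialized, so any per-instance state would be lost on the triggerer side.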