Skip to content

Conversation

@lwyszomi
Copy link
Contributor

@lwyszomi lwyszomi commented Jan 11, 2023

Fixes: #30324


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@boring-cyborg boring-cyborg bot added provider:cncf-kubernetes Kubernetes (k8s) provider related issues area:providers labels Jan 11, 2023
@lwyszomi lwyszomi force-pushed the kubernetes_pod branch 3 times, most recently from c0bef5e to 76ff473 Compare January 11, 2023 13:40
@jedcunningham
Copy link
Member

@dstandish, do you remember why we didn't do this initially?

@potiuk
Copy link
Member

potiuk commented Jan 11, 2023

@dstandish, do you remember why we didn't do this initially?

My question exactly.

@dstandish
Copy link
Contributor

@dstandish, do you remember why we didn't do this initially?

My question exactly.

Backward compatibility. Previously, KPO did not use airflow conn. If we now use default, this will break things for users. We can do but must be major release.

@potiuk
Copy link
Member

potiuk commented Jan 11, 2023

Backward compatibility. Previously, KPO did not use airflow conn. If we now use default, this will break things for users. We can do but must be major release.

So it was using the bult-in "local" kubernetes (basically whatever kubectl sees ?) if None is used?

Maybe we can check for existence of the connection and fallback to default behaviour when it is not available? What's the likelihood someone will have default k8s connection set and would not want to use it ? (still likely requring major version bump and technically breaking but with far less radius blast)

@dstandish
Copy link
Contributor

Backward compatibility. Previously, KPO did not use airflow conn. If we now use default, this will break things for users. We can do but must be major release.

So it was using the bult-in "local" kubernetes (basically whatever kubectl sees ?) if None is used?

Maybe we can check for existence of the connection and fallback to default behaviour when it is not available? What's the likelihood someone will have default k8s connection set and would not want to use it ? (still likely requring major version bump and technically breaking but with far less radius blast)

i think quite likely:

airflow/airflow/utils/db.py

Lines 364 to 370 in 6d09fc7

merge_conn(
Connection(
conn_id="kubernetes_default",
conn_type="kubernetes",
),
session,
)

@dstandish
Copy link
Contributor

So it was using the bult-in "local" kubernetes (basically whatever kubectl sees ?) if None is used?

it was using the settings from core, i.e. get_kube_client

@potiuk
Copy link
Member

potiuk commented Jan 11, 2023

i think quite likely:

airflow/airflow/utils/db.py

Lines 364 to 370 in 6d09fc7

merge_conn(
Connection(
conn_id="kubernetes_default",
conn_type="kubernetes",
),
session,
)

Let me modify it slightly. Why don't we fall-back to get_kube_client when either the connection is not defined or when it is not configured (i.e no extras defined). I believe the only used connection fields are "extras" and it would require an effort to edit the connection to add any extra there - the default one will have no extras at all and we can easily detect it. A bit long shot, but maybe it is the most sensible thing to do? (just wild thought)

@dstandish
Copy link
Contributor

Hard no with falling back to get_kube_client -- no reason to bring that back. The kube hook with works with no conn Id and it's fully back compat with get kube client

Now should we ever disregard the default connection? Not sure about that

Copy link
Contributor

@dstandish dstandish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just want to block merge (for now) to allow some time for debate to play out re whether this is good idea. ultimately i do not veto this just want to take a moment

@dimberman
Copy link
Contributor

Hey @dstandish @potiuk can we restart this conversation? I wouldn't want this PR to fall into the abyss just because we're unsure.

FWIW I think it's fine for us to make this breaking change and then do a major update. to me this is where the magic of providers exists. If we want to release a version before that contains a deprecation warning/write a warning somewhere just in case I'm not against that, but I'd love to bring the KPO more in line with the rest of the operators.

@eladkal
Copy link
Contributor

eladkal commented Feb 18, 2023

If we want to release a version before that contains a deprecation warning/write a warning somewhere just in case I'm not against that, but I'd love to bring the KPO more in line with the rest of the operators.

That is preferred

@dstandish
Copy link
Contributor

I'm fine with using kubernetes_default by default, just should be in a major. And i don't have an opinion re whether we need to introduce warning first so I defer to y'all on that.

@eladkal
Copy link
Contributor

eladkal commented Feb 18, 2023

So if decided to go with major release directly (which is OK)
please update provider.yaml to version 6.0.0 and add entry in the provider change log

@potiuk
Copy link
Member

potiuk commented Mar 4, 2023

Needs conflict resolution now, I am afraid

@eladkal
Copy link
Contributor

eladkal commented Mar 29, 2023

@lwyszomi can you rebase and resolve conflicts?

@iJanki-gr
Copy link
Contributor

Has this change actually got into 2.6.0 release? I had to change KPO conf to include kubernetes_conn_id=None, or I would get The conn_id kubernetes_default isn't defined

@eladkal
Copy link
Contributor

eladkal commented May 5, 2023

Has this change actually got into 2.6.0 release?

This is not a core PR It's not released with apache-airlfow it's released with Kubernetes provider: apache-airflow-providers-cncf-kubernetes=6.0.0

@iJanki-gr
Copy link
Contributor

Ok but apparently it was bundled with 2.6.0 https://raw.githubusercontent.com/apache/airflow/constraints-2.6.0/constraints-3.10.txt
And in the docker image as I'm using that.

@potiuk
Copy link
Member

potiuk commented May 5, 2023

Ok but apparently it was bundled with 2.6.0 https://raw.githubusercontent.com/apache/airflow/constraints-2.6.0/constraints-3.10.txt And in the docker image as I'm using that.

From you message it's not clear if you have problem with or not. and what you are asking for. But no matter what - if you think you have a problem and needs older version of the provider, you can alwas build your own custom image where the provider is downgraded. The image we publish is the reference image and you are free to extend it as you see fit.

What's your ask really @iJanki-gr ? Since you've already checked the constraints? Is your question answered or do you expect something else here?

@eladkal
Copy link
Contributor

eladkal commented May 5, 2023

This is a breaking change that replaces the default value of the parameter. hence why it was released in a major version (6.0.0) so yet if you don't have kubernetes_defaultdefined and you don't specifically override it then it will break your dags. You need to go trough the change log when you are updating major versions of providers.

@iJanki-gr
Copy link
Contributor

Yes that's ok, but in the release notes of airflow 2.6.0 there was no mention of cncf kubernetes provider upgrade.

@iJanki-gr
Copy link
Contributor

I fixed my issue, next time i'll check the providers release notes too. Thanks

@potiuk
Copy link
Member

potiuk commented May 5, 2023

Yes that's ok, but in the release notes of airflow 2.6.0 there was no mention of cncf kubernetes provider upgrade.

Because the image bundles airflow 2.6.0 AND providers. The image is a reference one that contains airflow and latest providers, but you are in control on which providers you want to keep or use. As @eladkal mentioned, if you chose to use the image, you need to check the release notes of the providers.

@iJanki-gr
Copy link
Contributor

I will thank you and sorry for the bother.

@seeholza
Copy link

seeholza commented May 9, 2023

@iJanki-gr don't be sorry, i also just hit the same point exactly. My deployment broke updating the bundled image to airflow==2.6.0 and i didn't figure out directly checking in the cncf provider release notes.

In defense of @iJanki-gr, the breaking change announcement on the release notes does not announce the change needed to make older dags compatible, or at least i couldn't directly find it.

Since i got here only via a few corners, for the sake of google results:

My dags were breaking for KubernetesPodOperator with

The conn_id `kubernetes_default` isn't defined

Adding kubernetes_conn_id=None "fixed" the "issue".

@bmoon4
Copy link

bmoon4 commented May 9, 2023

2.6.0 KubernetesPodOperator is not working when running my dag..

@iJanki-gr
Copy link
Contributor

It works for me. Can you provide more details?

@bmoon4
Copy link

bmoon4 commented May 9, 2023

Here's the error log

[2023-05-09, 08:54:45 EDT] {pod.py:905} ERROR - The conn_id `kubernetes_default` isn't defined
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py", line 533, in execute_sync
    self.pod_request_obj = self.build_pod_request_obj(context)
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py", line 859, in build_pod_request_obj
    "airflow_kpo_in_cluster": str(self.hook.is_in_cluster),
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/hooks/kubernetes.py", line 249, in is_in_cluster
    self.api_client  # so we can determine if we are in_cluster or not
  File "/usr/local/lib/python3.8/functools.py", line 967, in __get__
    val = self.func(instance)
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/hooks/kubernetes.py", line 257, in api_client
    return self.get_conn()
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/hooks/kubernetes.py", line 171, in get_conn
    in_cluster = self._coalesce_param(self.in_cluster, self._get_field("in_cluster"))
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/hooks/kubernetes.py", line 164, in _get_field
    if field_name in self.conn_extras:
  File "/usr/local/lib/python3.8/functools.py", line 967, in __get__
    val = self.func(instance)
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/hooks/kubernetes.py", line 147, in conn_extras
    connection = self.get_connection(self.conn_id)
  File "/usr/local/lib/python3.8/site-packages/airflow/hooks/base.py", line 72, in get_connection
    conn = Connection.get_connection_from_secrets(conn_id)
  File "/usr/local/lib/python3.8/site-packages/airflow/models/connection.py", line 434, in get_connection_from_secrets
    raise AirflowNotFoundException(f"The conn_id `{conn_id}` isn't defined")
airflow.exceptions.AirflowNotFoundException: The conn_id `kubernetes_default` isn't defined
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py", line 745, in patch_already_checked
    self.client.patch_namespaced_pod(
  File "/usr/local/lib/python3.8/functools.py", line 967, in __get__
    val = self.func(instance)
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py", line 473, in client
    return self.hook.core_v1_client
  File "/usr/local/lib/python3.8/functools.py", line 967, in __get__
    val = self.func(instance)
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/hooks/kubernetes.py", line 261, in core_v1_client
    return client.CoreV1Api(api_client=self.api_client)
  File "/usr/local/lib/python3.8/functools.py", line 967, in __get__
    val = self.func(instance)
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/hooks/kubernetes.py", line 257, in api_client
    return self.get_conn()
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/hooks/kubernetes.py", line 171, in get_conn
    in_cluster = self._coalesce_param(self.in_cluster, self._get_field("in_cluster"))
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/hooks/kubernetes.py", line 164, in _get_field
    if field_name in self.conn_extras:
  File "/usr/local/lib/python3.8/functools.py", line 967, in __get__
    val = self.func(instance)
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/hooks/kubernetes.py", line 147, in conn_extras
    connection = self.get_connection(self.conn_id)
  File "/usr/local/lib/python3.8/site-packages/airflow/hooks/base.py", line 72, in get_connection
    conn = Connection.get_connection_from_secrets(conn_id)
  File "/usr/local/lib/python3.8/site-packages/airflow/models/connection.py", line 434, in get_connection_from_secrets
    raise AirflowNotFoundException(f"The conn_id `{conn_id}` isn't defined")
airflow.exceptions.AirflowNotFoundException: The conn_id `kubernetes_default` isn't defined

@iJanki-gr

Im going to test Adding kubernetes_conn_id=None in /usr/local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py

@iJanki-gr
Copy link
Contributor

Yes that should fix it

kubernetes_conn_id=None

@potiuk
Copy link
Member

potiuk commented May 9, 2023

The kubernetes provider as of 6.0.0 releases uses kubernetes connection to be defined https://airflow.apache.org/docs/apache-airflow-providers-cncf-kubernetes/stable/index.html#breaking-changes and you should switch to it. That was a breaking change in the provider. Have you checked the release notes ?

@potiuk
Copy link
Member

potiuk commented May 9, 2023

And your error tells exactly this:

airflow.exceptions.AirflowNotFoundException: The conn_id `kubernetes_default` isn't defined

@potiuk
Copy link
Member

potiuk commented May 9, 2023

The whole discussion above explaines all about this in case you have not read it @bmoon4

@karunpoudel-chr
Copy link
Contributor

I am getting this error too.
Should it include logic to fallback to None if conn_id kubernetes_default doesn't exists. Otherwise we will have to include kubernetes_conn_id=None in 100s of our tasks

@bmoon4
Copy link

bmoon4 commented May 9, 2023

i added kubernetes_conn_id=None explicitly in KubernetesPodOperator parameter and seems working :)
Sorry I only checked release note in this git org repo..roughly ..@potiuk
https://github.com/apache/airflow/releases/tag/2.6.0

@hussein-awala
Copy link
Member

@karunpoudel-chr breaking change is something which stops the code from compiling / running, so this is normal.

If you want to avoid adding kubernetes_conn_id=None to all the tasks, you can create the connection kubernetes_default and configure it to access the cluster, it's the recommended way to configure the hooks.

dstandish added a commit to astronomer/airflow that referenced this pull request May 10, 2023
It's common when using KPO / K8s hook to want to simply use the RBAC of the cluster.  This was the default behavior prior to apache#28848.  After that change, users are forced to add a k8s connection or their dags will break.

To fix this, in the special case where conn_id=="kubernetes_default", if the conn is missing, we ignore the failure.
@dstandish
Copy link
Contributor

I offer a solution in #31187

If kubernetes_default doesn't exist, ignore it. If other conn id doesn't exist, fail.

jedcunningham pushed a commit that referenced this pull request May 10, 2023
It's common when using KPO / K8s hook to want to simply use the RBAC of the cluster.  This was the default behavior prior to #28848.  After that change, users are forced to add a k8s connection or their dags will break.

To fix this, in the special case where conn_id=="kubernetes_default", if the conn is missing, we ignore the failure.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:providers provider:cncf-kubernetes Kubernetes (k8s) provider related issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

KPO deferrable needs kubernetes_conn_id while non deferrable does not