Labels
area:providers, kind:bug, needs-triage, provider:cncf-kubernetes
Description
Apache Airflow version
2.8.2
If "Other Airflow 2 version" selected, which one?
2.8.3rc1
What happened?
I'm running a spark-pi example using the SparkKubernetesOperator:
SparkKubernetesOperator(
    task_id='spark_pi_submit',
    namespace='lot1-spark-jobs',
    application_file="/example_spark_kubernetes_operator_pi.yaml",
    kubernetes_conn_id="kubernetes_default",
    do_xcom_push=True,
    in_cluster=True,
    delete_on_termination=True,
    dag=dag,
)
It was running fine on 2.8.1. After upgrading to Airflow 2.8.2, I got the following error:
    kube_client=self.client,
                ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/functools.py", line 1001, in __get__
    val = self.func(instance)
          ^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/spark_kubernetes.py", line 250, in client
    return self.hook.core_v1_client
           ^^^^^^^^^
  File "/usr/local/lib/python3.11/functools.py", line 1001, in __get__
    val = self.func(instance)
          ^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/spark_kubernetes.py", line 242, in hook
    or self.template_body.get("kubernetes", {}).get("kube_config_file", None),
       ^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/spark_kubernetes.py", line 198, in template_body
    return self.manage_template_specs()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/airflow/.local/lib/python3.11/site-packages/airflow/providers/cncf/kubernetes/operators/spark_kubernetes.py", line 127, in manage_template_specs
    template_body = _load_body_to_dict(open(self.application_file))
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'apiVersion: "sparkoperator.k8s.io/v1beta2"\nkind: SparkApplication\nmetadata:\n name: spark-pi\n namespace: lot1-spark-jobs\ns
[2024-03-10T10:29:15.613+0000] {taskinstance.py:1149} INFO - Marking task as UP_FOR_RETRY. dag_id=spark_pi, task_id=spark_pi_submit, execution_date=20240310T102910, start_date=20240310T
It looks like self.application_file ends up containing the contents of the file it points to, rather than the path itself.
I suspect this was caused by the changes introduced in PR-22253. I'm quite new to Airflow and Python, but my guess is that the application_file property should no longer be treated as a templated field, since the rendered template representation was moved to template_body.
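To illustrate the failure mode I suspect, here is a minimal sketch (illustrative only, not the actual provider code; the class and method names just mirror the provider loosely): because application_file is a templated field and ".yaml" is among the operator's template extensions, Airflow's template rendering replaces the attribute's value with the rendered file contents before execution, and the later open() call then treats that YAML text as a path:

class SketchOperator:
    # Hypothetical stand-in for SparkKubernetesOperator.
    template_fields = ("application_file",)
    template_ext = (".yaml", ".yml")

    def __init__(self, application_file: str):
        self.application_file = application_file

    def render_templates(self) -> None:
        # Airflow treats templated string fields ending in template_ext
        # as file paths and substitutes their rendered contents.
        if self.application_file.endswith(self.template_ext):
            with open(self.application_file) as f:
                self.application_file = f.read()

    def manage_template_specs(self):
        # By this point application_file holds YAML text, so open()
        # raises FileNotFoundError with the YAML content shown as the
        # "missing path" -- matching the traceback above.
        with open(self.application_file) as f:
            return f.read()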
What do you think should happen instead?
No response
How to reproduce
Given my understanding of the issue, a minimal SparkKubernetesOperator example that uses the application_file property should reproduce it, as sketched below.
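A minimal DAG along these lines should trigger the error (a sketch: dag_id, start_date, and schedule are illustrative, and it assumes the spark-pi SparkApplication manifest exists at the given path):

from datetime import datetime

from airflow import DAG
from airflow.providers.cncf.kubernetes.operators.spark_kubernetes import (
    SparkKubernetesOperator,
)

with DAG(
    dag_id="spark_pi",
    start_date=datetime(2024, 1, 1),
    schedule=None,
) as dag:
    SparkKubernetesOperator(
        task_id="spark_pi_submit",
        namespace="lot1-spark-jobs",
        application_file="/example_spark_kubernetes_operator_pi.yaml",
        kubernetes_conn_id="kubernetes_default",
    )

On 2.8.1 the task submits the SparkApplication; after the provider upgrade, it should fail at template_body resolution with the FileNotFoundError shown above.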
Operating System
kind kubernetes
Versions of Apache Airflow Providers
No response
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
Anything else?
No response
Are you willing to submit PR?
- Yes, I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct