Revert "enhance spark_k8s_operator (#29977)" #31716
This reverts commit 9a4f674. Based on the discussion found [here](apache#31183), previous changes to the Spark K8s Operator broke existing functionality and did not update the documentation for the newly enabled functionality. The Spark Sensor no longer works, XCOM no longer works on the Operator itself, and the Operator does not fail when the Spark job fails. Rather than attempt to fix or resolve the current implementation, I am reverting to the existing, documented implementation. I would propose creating a _new_ Operator with alternative functionality (one which does not need a Sensor, copies logs, etc.) if that is desired.
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
I don't think a revert is required? The PR in question was released in April. There may be users out there who rely on it, and reverting it would cause their code to break.
I have an immediate need to have a working DAG, meaning I need to revert to an older version of the Operator (as this version does not fail when a job fails). On top of that, if the change moves forward as-is, I will not only need to update my DAGs in the future, but will also lose features I may be relying on (the ability to have a fork in a DAG after launching a Spark job, but before it's finished). I'm definitely for an Operator that behaves in this way, but either by adding flags to the existing operator, or creating an Operator with a different name, and a deprecation schedule for the existing Operator.
Can you do it this way rather than reverting the change? While I understand your case is impacted, it can be easily fixed by downgrading to an earlier version of the provider, so "properly" fixing it seems to be a better way than reverting. Reverting a change that has already been released only adds confusion, because if someone relies on what has already been released, then you break their workflows while you are fixing yours. Can you add a flag to control the behaviour instead?
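As a rough illustration of the flag-based approach suggested above, here is a minimal sketch in plain Python. It is not the Airflow operator: the class `SubmitSparkJob`, the flag `wait_for_completion`, the callables `submit` and `poll_state`, and the state strings are all hypothetical names chosen for this example. The point is only that the default keeps the older submit-and-return behaviour, while the newer blocking behaviour is opt-in and raises when the job fails.

```python
from dataclasses import dataclass
from typing import Callable


class SparkJobFailed(Exception):
    """Raised when the watched job ends in a failed state."""


@dataclass
class SubmitSparkJob:
    """Hypothetical stand-in for an operator; not the Airflow class."""

    submit: Callable[[], str]          # submits the job and returns its name
    poll_state: Callable[[str], str]   # returns "RUNNING", "COMPLETED" or "FAILED"
    wait_for_completion: bool = False  # assumed flag; False keeps the older behaviour

    def execute(self) -> str:
        job_name = self.submit()
        if not self.wait_for_completion:
            # Older path: return immediately so a separate sensor can watch the job.
            return job_name
        # Opt-in path: poll until a terminal state is reached.
        # (A real implementation would sleep between polls.)
        while True:
            state = self.poll_state(job_name)
            if state == "COMPLETED":
                return job_name
            if state == "FAILED":
                # Fail the task so the scheduler sees the failure.
                raise SparkJobFailed(f"{job_name} ended in state FAILED")
```

With a default of `False`, existing DAGs that pair the operator with a sensor would keep working unchanged, and only callers who set the flag would get the blocking behaviour.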
I agree with @potiuk. The fact that you agree with and support the code modifications made by the previous PR makes an even stronger case not to revert.
I'm not able to spend that much time on changes like that right now. You can close this PR if it's not something you plan on accepting. I find it frustrating that non-backwards compatible changes are allowed into "stable" API versions, as will many others. For the "Some people may be relying on it" statement... while that is possible, the current version is fundamentally broken, so those people will have jobs that fail with no notification. I believe most of those people will have ended up finding #31183, as using the code provided in the examples errors out very quickly. I understand this is already a sticky situation and there are no good solutions; this is simply the solution that I had time for, and that appeared to be in line with having stable API version numbers.
Then maybe @blcksrx will have time to fix it since that was the #31183 change that broke it. Do you agree @blcksrx that this is "fundamentally broken"? Can you please fix it @blcksrx? I think if not then indeed we will have to revert it, unless someone provides a fix.
Surely @jamescnowell. It is frustrating for everyone. But maybe you do not understand the perspective and OSS spirit - it might happen. Maybe you could do it better with 80 providers, 5000 or so operators, 30 commits merged a day, and multiple thousands of tests to prevent such errors from happening. Airflow has 2500 contributors and you get it for absolutely free. There are many ways you and others can give back - spending time (as the 2500 contributors do) is one of the ways - and I understand you do not have more time, but then maybe GitHub Sponsorship, or finding someone in your company (which uses the software for free), would be a good way to give back for the countless hours people spend on contributing to Airflow so that your company can use it for free.
@potiuk
This was not intentional. The issue here is that significant time has passed since the release, and reverting / yanking the version might cause more harm than good, as there may be users who have already used the modified version. Your issue is solved by simply not updating the provider until a fix is merged.
Can you make a fix then, please?
Closing in favor of #31798
closes: #31183