[AIRFLOW-6557] Add test for newly added fields in BaseOperator #7162
Having said that I'm now not sure this last part is true.
Since DAG serialization schema version 1.0 has now been released with 1.10.7, we might have to start thinking about versioning this schema and how to migrate/update it. But the important bit is that a row that exists in the DB right now must continue to work, and that is what the "ground truth" (not the best name) is meant to represent.
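One way to picture the "ground truth" idea: a blob captured from an already released version is pinned in the test, and the current deserializer must keep accepting it. The following is only a minimal sketch; `deserialize_task`, the blob layout, and the field names are hypothetical stand-ins, not the actual Airflow serialization code.

```python
import json

# Hypothetical blob, as it might have been written to the DB by an older release.
GROUND_TRUTH_BLOB = json.dumps({
    "version": 1,
    "task": {"task_id": "t1", "retries": 2},
})

# Defaults for fields introduced after the blob was stored (illustrative names).
OPTIONAL_FIELD_DEFAULTS = {"new_optional_field": None}


def deserialize_task(blob: str) -> dict:
    """Load a stored task, filling in defaults for fields the old blob lacks."""
    data = json.loads(blob)
    task = dict(OPTIONAL_FIELD_DEFAULTS)
    task.update(data["task"])
    return task


# The pinned blob never changes; only the deserializer is allowed to evolve.
task = deserialize_task(GROUND_TRUTH_BLOB)
```

Because the new field is optional and has a default, the old row still loads without any migration.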
I think we then need to decide how to handle this and tell @lokeshlal what to do in #7162 (and update this description).
So the goal of the ground truth test is to ensure that DAGs that are currently in people's databases get correctly handled as the model changes. The important bit, I think, is that this data can't change, as it's fixed in a DB.
But some changes, such as adding an optional field to Operators/Tasks, are allowed, as the JSON Schema allows unknown fields at the task level.
But by default BaseOperator.get_serialized_fields will include extra fields. So I guess the only check I would like here is that new fields from the base operator get specified with a type in the JSON schema, and that they should be optional. (If a field is not optional then we would have to bump the version field in our schema, but we haven't worked out how to handle versioning of schemas and blobs yet!) That is sadly harder to test for.
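The check described above could look roughly like the sketch below. It assumes a simplified, made-up schema fragment (not the real `airflow/serialization/schema.json`) and a hypothetical helper `find_schema_problems`; it only illustrates the two rules: every serialized field has a declared type, and no new field is required.

```python
# Illustrative fragment of a JSON Schema operator definition (not the real schema.json).
OPERATOR_SCHEMA = {
    "type": "object",
    "required": ["task_id"],
    "properties": {
        "task_id": {"type": "string"},
        "retries": {"type": "number"},
    },
    "additionalProperties": True,  # unknown task-level fields are tolerated
}


def find_schema_problems(serialized_fields, schema, known_required):
    """Return human-readable problems for fields that break the rules above."""
    problems = []
    properties = schema.get("properties", {})
    required = set(schema.get("required", []))
    for name in sorted(serialized_fields):
        # Rule 1: each serialized field needs a declared type in the schema.
        if "type" not in properties.get(name, {}):
            problems.append(f"{name}: no type declared in schema")
        # Rule 2: a field that becomes required would need a schema version bump.
        if name in required and name not in known_required:
            problems.append(f"{name}: newly required field needs a version bump")
    return problems


problems = find_schema_problems(
    serialized_fields={"task_id", "retries", "new_field"},
    schema=OPERATOR_SCHEMA,
    known_required={"task_id"},
)
```

Here `new_field` is reported because it is serialized but has no entry in the schema's `properties`.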
Similar rules as for protobuf. I will modify the message. We cannot test a lot of that fully automatically yet but we can at least provide the right message.
I think this should be good enough for the time being.
Oh yes, and switch away from JSON Schema. Geez, it's 2020 :-P
JSON you only use to interface externally.
Codecov Report
@@ Coverage Diff @@
## master #7162 +/- ##
==========================================
- Coverage 85.4% 85.09% -0.32%
==========================================
Files 723 723
Lines 39537 39546 +9
==========================================
- Hits 33768 33651 -117
- Misses 5769 5895 +126
8ad2704 to 2fc738f
kaxil left a comment
Happy to merge it once we have GREEN CI
:) The bot caught you @potiuk
Adding new field in BaseOperator requires some manual updates in serialization code. This test detects new fields and informs what should be done in case new field is added.
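A rough sketch of how such a detection test might work: compare the attributes of a freshly constructed operator against a pinned snapshot of known fields, and fail with instructions when something new shows up. `FakeBaseOperator`, `KNOWN_FIELDS`, and `detect_new_fields` are hypothetical names used for illustration only; the actual test in this PR may be structured differently.

```python
class FakeBaseOperator:
    """Hypothetical stand-in for BaseOperator; not Airflow code."""

    def __init__(self, task_id, retries=0, owner="airflow"):
        self.task_id = task_id
        self.retries = retries
        self.owner = owner  # pretend this attribute was added recently


# Snapshot of fields the serialization code is known to handle.
KNOWN_FIELDS = {"task_id", "retries"}


def detect_new_fields(operator, known_fields):
    """Return instance attributes that are missing from the pinned snapshot."""
    return sorted(set(vars(operator)) - set(known_fields))


new_fields = detect_new_fields(FakeBaseOperator(task_id="t1"), KNOWN_FIELDS)
message = ""
if new_fields:
    # In a real test this message would accompany a failing assertion.
    message = (
        f"New fields detected: {new_fields}. "
        "Declare them with a type in the schema and keep them optional."
    )
```

The value of the snapshot approach is that the failure message can tell the contributor exactly which manual updates are still needed.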
2fc738f to 2acd697
Good bot! Nice bot!
`airflow/serialization/schema.json` - they should have correct type defined there.

Note that we do not support versioning yet so you should only add optional fields. We do not support
versioning yet so you should make sure all fields added to the BaseOperator should be optional.
You've duplicated(ish) the message here.
Ah yeah :)
Just dropped a thought. I am not happy with the choice for JSON Schema.
Not that I am for JSON Schema (I was barely involved in DAG serialisation). For me it's simply one of the options one could consider (alongside Protocol Buffers, for example). But I am really curious what your arguments against it would be, @bolkedebruin. It seems you have a pretty strong opinion.
Would be curious to know why you feel so? And what would you recommend as an alternative?
Adding new field in BaseOperator requires some manual updates in serialization code. This test detects new fields and informs what should be done in case new field is added. (cherry picked from commit 5abce47)
…e#7162) Adding new field in BaseOperator requires some manual updates in serialization code. This test detects new fields and informs what should be done in case new field is added.

Adding new field in BaseOperator requires some manual updates
in serialization code. This test detects new fields and informs
what should be done in case new field is added.
Issue link: AIRFLOW-6557
Make sure to mark the boxes below before creating PR: [x]
[AIRFLOW-NNNN]. AIRFLOW-NNNN = JIRA ID*
* For document-only changes commit message can start with [AIRFLOW-XXXX].
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.