@sunank200
Collaborator

@sunank200 sunank200 commented Mar 5, 2025

  • Gracefully handle Airflow 2.x default values that break in Airflow 3.0. This change adds a method, handle_incompatible_airflow2_defaults, to the AirflowConfigParser class. It runs by default when the config is loaded.
  • Add two more options to airflow config lint:
    1. Automatically upgrade problematic default values:
      airflow config lint --upgrade-problematic-defaults

    2. Skip checking default values:
      airflow config lint --skip-problematic-default-checks

Screenshot of testing done locally:

  • airflow config lint and importing conf in shell
[Screenshot, 2025-03-13 3:20:04 PM]

Below is a summary table of the changes:

Parameter key: logging.log_filename_template
  Legacy (Airflow 2.x) default: Either {{ ti.dag_id }}/{{ ti.task_id }}/{{ ts }}/{{ try_number }}.log or a variant like dag_id={{ ti.dag_id }}/run_id={{ ti.run_id }}/task_id={{ ti.task_id }}/{%% if ti.map_index >= 0 %%}map_index={{ ti.map_index }}/{%% endif %%}attempt={{ try_number }}.log
  New (Airflow 3.0) default: Uses logical_date (with a Jinja filter on try_number), for example a template that looks like dag_id={{ ti.dag_id }}/run_id={{ ti.run_id }}/task_id={{ ti.task_id }}/ {%% if ti.map_index >= 0 %%}map_index={{ ti.map_index }}/{%% endif %%}attempt={{ try_number

Parameter key: core.dag_ignore_file_syntax
  Legacy (Airflow 2.x) default: "regexp"
  New (Airflow 3.0) default: "glob"
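
A minimal sketch of the detect-and-upgrade behaviour described above, assuming a plain dict stands in for AirflowConfigParser. The mapping layout and the dag_ignore_file_syntax entry follow the diff excerpts in this PR; NEW_DEFAULTS and handle_incompatible_defaults are hypothetical names used only for illustration and are not the PR's actual implementation.

```python
import re
from re import Pattern

# (section, key) -> (pattern matching the legacy value, version, message),
# mirroring the tuple layout shown in this PR's diff.
LEGACY_INCOMPATIBLE_DEFAULTS: dict[tuple[str, str], tuple[Pattern[str], str, str]] = {
    ("core", "dag_ignore_file_syntax"): (
        re.compile(r"^regexp$"),
        "3.0",
        "The default value changed from 'regexp' in Airflow 2.x to 'glob' in Airflow 3.0.",
    ),
}

# Hypothetical lookup of replacement values; not taken from the PR.
NEW_DEFAULTS: dict[tuple[str, str], str] = {
    ("core", "dag_ignore_file_syntax"): "glob",
}


def handle_incompatible_defaults(
    config: dict[tuple[str, str], str], upgrade: bool = True
) -> list[str]:
    """Report legacy defaults found in ``config``; replace them in place if ``upgrade``."""
    messages = []
    for option, (pattern, _version, message) in LEGACY_INCOMPATIBLE_DEFAULTS.items():
        current = config.get(option)
        if current is None or not pattern.search(current):
            continue  # option unset or already on a non-legacy value
        section, key = option
        text = f"[{section}] {key}: {message}"
        if upgrade:
            new_value = NEW_DEFAULTS[option]
            config[option] = new_value
            text += f" Auto-upgraded from {current!r} to {new_value!r}."
        messages.append(text)
    return messages


config = {("core", "dag_ignore_file_syntax"): "regexp"}
for line in handle_incompatible_defaults(config):
    print(line)
print(config[("core", "dag_ignore_file_syntax")])  # glob
```

The change is made only to the in-memory mapping; nothing is written back to a config file.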

closes: #46972



@sunank200 sunank200 force-pushed the handle-default-values branch from 10a4da2 to 7d539a4 Compare March 5, 2025 19:14
@sunank200 sunank200 requested a review from jedcunningham March 5, 2025 19:14
@sunank200 sunank200 force-pushed the handle-default-values branch 5 times, most recently from f8dd35a to 9ff5266 Compare March 7, 2025 06:57
@sunank200 sunank200 force-pushed the handle-default-values branch 2 times, most recently from d00db25 to 69b7eb1 Compare March 12, 2025 07:58
@sunank200 sunank200 assigned sunank200 and unassigned vatsrahul1001 Mar 12, 2025
@sunank200 sunank200 force-pushed the handle-default-values branch from 69b7eb1 to 5aba5aa Compare March 12, 2025 08:19
@sunank200 sunank200 force-pushed the handle-default-values branch 3 times, most recently from 4cb04be to 4c438c4 Compare March 12, 2025 11:26
@sunank200 sunank200 marked this pull request as ready for review March 12, 2025 12:38
@sunank200 sunank200 force-pushed the handle-default-values branch from 02372a6 to c03bac3 Compare March 13, 2025 06:27
@Lee-W Lee-W self-requested a review March 13, 2025 07:04
help="The section name",
)
ARG_LINT_CONFIG_UPGRADE_DEFAULTS = Arg(
("--upgrade-defaults",),
Member

Suggested change
- ("--upgrade-defaults",),
+ ("--upgrade-problematic-defaults",),

It's probably worth renaming these flags to include problematic - without it, it implies it'll upgrade all of the old defaults (which isn't a bad idea!).

Probably also means the Arg variable name should change too.

Member

This only “upgrades” them in the current process, right? As far as I can tell the new values are not persisted. I wonder if there is a better verb to describe this.

Collaborator Author

@uranusjr any suggestions for a prefix I could add to the name?

Member

@uranusjr uranusjr Mar 13, 2025

Not sure. Maybe --patch-problematic-defaults instead?

Member

So, now that you bring it up, not sure this flag is even worth having. Upgrading it or not in the config lint command process doesn't seem that helpful?

But wait! In configuration.py you are already upgrading to the new default, meaning by the time config lint runs, well, it's already run and updated stuff. So having this run in config lint at all isn't doing anything really.

Collaborator Author

I can maybe have upgrade=False in configuration.py, which wouldn't upgrade the confs by default. Is this what we intend to do? Or do we always want to update in any case?

Member

We do want upgrading by default; these are known bad values. I was just pointing out that doing things in config lint would never find/warn/upgrade anything - it's already too late.
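
To illustrate the ordering problem raised in this thread, here is a small sketch under simplified assumptions: load_config, lint, and the LEGACY mapping are hypothetical stand-ins, not Airflow code. If the loader already replaces legacy values, a lint step that runs afterwards has nothing left to find.

```python
import re

# Hypothetical mapping: (section, key) -> (legacy-value pattern, replacement value).
LEGACY = {("core", "dag_ignore_file_syntax"): (re.compile(r"^regexp$"), "glob")}


def find_legacy(config):
    """Return the options whose current value still matches a legacy pattern."""
    return [opt for opt, (pattern, _) in LEGACY.items() if pattern.search(config.get(opt, ""))]


def load_config(raw, upgrade=True):
    """Copy the raw config; optionally swap legacy values for their replacements."""
    config = dict(raw)
    if upgrade:
        for opt in find_legacy(config):
            config[opt] = LEGACY[opt][1]
    return config


def lint(config):
    """A lint pass that runs after loading only sees the already-loaded values."""
    return find_legacy(config)


raw = {("core", "dag_ignore_file_syntax"): "regexp"}
print(lint(load_config(raw)))                 # []
print(lint(load_config(raw, upgrade=False)))  # [('core', 'dag_ignore_file_syntax')]
```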

@sunank200 sunank200 force-pushed the handle-default-values branch 3 times, most recently from df0c37f to ae65d26 Compare March 13, 2025 10:28
@sunank200 sunank200 force-pushed the handle-default-values branch from 3cd1b18 to c39acba Compare March 13, 2025 17:30
Comment on lines +417 to +421
    ("core", "dag_ignore_file_syntax"): (
        re.compile(r"^regexp$"),
        "3.0",
        "The default value changed from 'regexp' in Airflow 2.x to 'glob' in Airflow 3.0.",
    ),
Member

Suggested change
- ("core", "dag_ignore_file_syntax"): (
-     re.compile(r"^regexp$"),
-     "3.0",
-     "The default value changed from 'regexp' in Airflow 2.x to 'glob' in Airflow 3.0.",
- ),

Looking at this, it is the automatic switch to the new default that may break things, not continued use of the old default. For example, if their airflowignore was written as regex but we automatically swap to glob, things won't be happy.

We should leave this one out I believe.
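
A small sketch of why such a swap can break an existing ignore file. It assumes a hypothetical ignore line and file name, and uses fnmatch.fnmatchcase purely as a stand-in for glob matching; Airflow's own glob handling is not reproduced here.

```python
import fnmatch
import re

# Hypothetical ignore-file line written with regexp syntax in mind.
ignore_line = r"scratch_.*\.py"
path = "scratch_etl.py"

# Interpreted as a regular expression, the line matches the file.
ignored_as_regexp = re.search(ignore_line, path) is not None

# Interpreted as a glob, "." and "\" are literal characters, so it does not match.
ignored_as_glob = fnmatch.fnmatchcase(path, ignore_line)

print(ignored_as_regexp, ignored_as_glob)  # True False
```

The same line silently stops ignoring the file once it is read as a glob, which is the failure mode described above.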

Comment on lines +563 to +565
    if not args.skip_problematic_default_checks:
        conf.handle_incompatible_airflow2_defaults(upgrade=args.upgrade_problematic_defaults)

Member

Which means, since this has already run and upgraded, you can get rid of this completely - and the args, etc. All of it.

Member

It is unfortunate that this is what you get:

$ airflow config lint
/Users/jedc/github/airflow/airflow/configuration.py:440 FutureWarning: [core] dag_ignore_file_syntax: The default value changed from 'regexp' in Airflow 2.x to 'glob' in Airflow 3.0. Auto-upgraded from 'regexp' to 'glob'.
No issues found in your airflow.cfg. It is ready for Airflow 3!

The warning is from configuration.py, where it got auto-upgraded.

Collaborator Author

This was behaviour I added intentionally: it upgrades automatically by default. If we want to upgrade only via airflow config lint, I can pass upgrade=False to the method in configuration.py.

assert expected_message in normalized_output
assert config_change.suggestion in normalized_output

def test_lint_detects_default_value_change_upgrade(self):
Member

Also means this test should be refactored and moved into test_configuration.py.


legacy_incompatible_defaults: dict[tuple[str, str], tuple[Pattern, str, str]] = {
("logging", "log_filename_template"): (
re.compile(
Member

fwiw, I took the default out of a fresh 2.10.5 config file, and this regex did not match against it. The regex may have been cut off at the end of the task_id line (that's naively what printed, at least).

Give that a test.
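
One way such a check could be sketched, using the standard library configparser as a stand-in for Airflow's parser and the legacy template variant quoted in the PR description. The thread does not confirm why the regex failed; this only shows that a pattern built from the interpolated value (single %) will not match the raw file text (escaped %%), so a test should compare against the form the code actually sees.

```python
import configparser
import re

# Legacy template as it would appear in a config file, with "%" escaped as "%%".
CFG_TEXT = """[logging]
log_filename_template = dag_id={{ ti.dag_id }}/run_id={{ ti.run_id }}/task_id={{ ti.task_id }}/{%% if ti.map_index >= 0 %%}map_index={{ ti.map_index }}/{%% endif %%}attempt={{ try_number }}.log
"""

parser = configparser.ConfigParser()
parser.read_string(CFG_TEXT)

# raw=True returns the text as written; the default applies interpolation ("%%" -> "%").
raw_value = parser.get("logging", "log_filename_template", raw=True)
interpolated = parser.get("logging", "log_filename_template")

# re.escape avoids hand-escaping the braces and other special characters.
pattern = re.compile(re.escape(interpolated))

matches_interpolated = pattern.fullmatch(interpolated) is not None
matches_raw = pattern.fullmatch(raw_value) is not None
print(matches_interpolated, matches_raw)  # True False
```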

@jedcunningham
Member

Looking closer, this duplicates the functionality of deprecated_values. We can just update that.

I've opened #47761 to do that.

@sunank200
Collaborator Author

Closing this as discussed with @jedcunningham as per #47371 (comment)

@sunank200 sunank200 closed this Mar 14, 2025
@sunank200 sunank200 deleted the handle-default-values branch May 8, 2025 12:53
Linked issue: Handle default values from Airflow 2 gracefully