Skip to content

Conversation

@Adaverse
Copy link
Contributor

@Adaverse Adaverse commented Jul 22, 2023

Currently, the container name for remote logging in Azure Blob Storage is hard coded in airflow_local_settings.py. Hence, in this PR -

  • Removed the hard-coded name and added as a configurable option - remote_wasb_log_container
  • Updated doc to reflect the above change.
  • Also, keeping the change backward compatible

^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@Adaverse Adaverse marked this pull request as draft July 23, 2023 06:35
@Adaverse Adaverse marked this pull request as ready for review July 23, 2023 09:58
@Adaverse Adaverse changed the title Fixed hardcoded container name in remote logging option for Azure Blob Storage Fix hardcoded container name in remote logging option for Azure Blob Storage Jul 23, 2023
@Adaverse
Copy link
Contributor Author

@eladkal since it touches both airflow core and provider, do we need to separate into two PRs?

Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I like the doc improvement too :)

Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rethought it a bit.

I know we still have provider-specific code in "logging" section, but with the recent move of configuration to providers, we should rather aim to eventually move all of those to provider-specific configuration.

Can you please do it this way @Adaverse (even if it is not consistent with the other remote logging, this will be the "target" - so we can start with this wasb change). This will also setup convention on how to name such "remote logging" section for other providers.

Few things to change:

  1. add config section in provider.yaml of the azure provider ("logging-azure-wasb" ?)
  2. read it from there
  3. add configuration page in azure provider
  4. update examples

@potiuk
Copy link
Member

potiuk commented Jul 31, 2023

And "version_added" in this case should be the next minor version of the azure provider, not 2.7.0 of airflow.

@potiuk
Copy link
Member

potiuk commented Jul 31, 2023

BTW. This should make the change "azure-provider-only". As it should be.

amoghrajesh

This comment was marked as outdated.

@Adaverse
Copy link
Contributor Author

Adaverse commented Jul 31, 2023

I know we still have provider-specific code in "logging" section, but with the recent move of configuration to providers, we should rather aim to eventually move all of those to provider-specific configuration.

So in order to remove provider specific config from loading from core Airflow, below are the steps I would follow -

  1. Introduce a Airflow config variable (ex - AIRFLOW__LOGGING__REMOTE_LOGGER ?) that can take values like s3, cloudwatch, gcs, wasb etc.. that will help us eliminate the if else we have based on provider specific scheme and replace it with name (👇) field in provider specific yaml
  2. Next, we can dynamically create the task handler dictionary based on the input from provider.yaml from the specific provider
    that matches with the name above for example for name = wasb, for Azure provider.yaml-->
logging:
  - airflow.providers.microsoft.azure.log.wasb_task_handler.WasbTaskHandler
  - name: wasb 
  - formatter: airflow
  - base_log_folder: AIRFLOW__LOGGING__BASE_LOG_FOLDER
  - wasb_container: AIRFLOW_LOGGING__REMOTE_WASB_LOG_CONTAINER

So we would compare the value in name with the value provided in AIRFLOW__LOGGING__REMOTE_LOGGER and then load the corresponding dict and update the DEFAULT_LOGGING_CONFIG

If we can dynamically update the config doc to only contain the possible values of the supported remote loggers, then i guess we are good to go.

Let me know if above sounds fine or any changes you suggest?

@potiuk
Copy link
Member

potiuk commented Jul 31, 2023

Introduce a Airflow config variable (ex - AIRFLOW__LOGGING__REMOTE_LOGGER ?) ....
Let me know if above sounds fine or any changes you suggest?

That approach would be far in the future I think. We should approach it gradually - maybe in Airlfow 2.8 or 2.9 we might want to do a bigger change. For now I just propose to create the new configuration option via the mixrosoft/azure/provider.yaml so that it will "belong" to azure provider (see for exaple config sections in celery/provider.yaml or cncf/kubernetes/provider.yaml.

@Adaverse
Copy link
Contributor Author

That approach would be far in the future I think. We should approach it gradually - maybe in Airlfow 2.8 or 2.9 we might want to do a bigger change. For now I just propose to create the new configuration option via the mixrosoft/azure/provider.yaml so that it will "belong" to azure provider (see for exaple config sections in celery/provider.yaml or cncf/kubernetes/provider.yaml.

Got it @potiuk! Let me make the change.

@Adaverse
Copy link
Contributor Author

Adaverse commented Aug 2, 2023

That approach would be far in the future I think. We should approach it gradually - maybe in Airlfow 2.8 or 2.9 we might want to do a bigger change. For now I just propose to create the new configuration option via the mixrosoft/azure/provider.yaml so that it will "belong" to azure provider (see for exaple config sections in celery/provider.yaml or cncf/kubernetes/provider.yaml

Moved the config to the provider but having the below issue where the config won't load in the doc. Am I missing anything here?

image

@Adaverse
Copy link
Contributor Author

Adaverse commented Aug 5, 2023

And "version_added" in this case should be the next minor version of the azure provider, not 2.7.0 of airflow.

The config is hidden when I put the version (for this case) anything greater than 2.7.0
Do we have control somewhere, where we hide (or filter) those configs which have the version greater than the Airflow version @potiuk?

For the above case, had put the next release of Azure provider i.e 6.3.0

@potiuk
Copy link
Member

potiuk commented Aug 6, 2023

Do we have control somewhere, where we hide (or filter) those configs which have the version greater than the Airflow version @potiuk?

Here:

version_added = option["version_added"]
. Yes it looks like it always compares with airflow version, so we should likely fix.

@Adaverse
Copy link
Contributor Author

Adaverse commented Aug 6, 2023

. Yes it looks like it always compares with airflow version, so we should likely fix.

Let's fix the above before merging this PR. Let me give it a try.

@potiuk
Copy link
Member

potiuk commented Sep 7, 2023

. Yes it looks like it always compares with airflow version, so we should likely fix.

Let's fix the above before merging this PR. Let me give it a try.

I already fixed it in #34011 -> can yo plese reabase and fix tests @Adaverse ?

@Adaverse
Copy link
Contributor Author

Adaverse commented Sep 7, 2023

I already fixed it in #34011 -> can yo plese reabase and fix tests @Adaverse ?
Sure @potiuk.

@eladkal
Copy link
Contributor

eladkal commented Oct 10, 2023

@Adaverse what is the status of this PR? Is it ready for review after @potiuk comments?

@Adaverse
Copy link
Contributor Author

@Adaverse what is the status of this PR? Is it ready for review after @potiuk comments?

Yes, the changes are ready for the review.
++@potiuk

@eladkal eladkal merged commit 8e383e8 into apache:main Oct 13, 2023
@Adaverse Adaverse deleted the azure-logs branch October 13, 2023 18:03
@ephraimbuddy ephraimbuddy added the type:improvement Changelog: Improvements label Oct 27, 2023
@ephraimbuddy ephraimbuddy added this to the Airflow 2.8.0 milestone Oct 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants