Skip to content

Conversation

@Datkros
Copy link
Contributor

@Datkros Datkros commented Jul 6, 2020

Added support to inject the remote logging configuration into the airflow.cfg configmap. I kept it backwards compatible with the current version of the chart by not dropping the elasticsearch.enabled value.


Make sure to mark the boxes below before creating PR: [x]

  • Description above provides context of the change
  • Unit tests coverage for changes (not needed for documentation changes)
  • Target Github ISSUE in description if exists
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions.
  • I will engage committers as explained in Contribution Workflow Example.

In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

@boring-cyborg boring-cyborg bot added the area:helm-chart Airflow Helm Chart label Jul 6, 2020
@boring-cyborg
Copy link

boring-cyborg bot commented Jul 6, 2020

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst)
Here are some useful points:

  • Pay attention to the quality of your code (flake8, pylint and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it’s a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: dev@airflow.apache.org
    Slack: https://apache-airflow-slack.herokuapp.com/

@Datkros Datkros changed the title Adding support for Remote logging to helm chart to be injected into a… Add support for remote logging to be injected into the airflow.cfg configmap in helm chart Jul 6, 2020
@mik-laj
Copy link
Member

mik-laj commented Jul 13, 2020

How do you benefit from changing your environment configuration? I am afraid that in a moment we will have in Helm copies of all options, but with a different interface and worse documentation.

# Environment variables for all airflow containers
env: []
# - name: ""
# value: ""

An additional option in Helm Chart is useful when changing options in Airflow also requires changing elsewhere in Helm. An example here:
https://github.com/apache/airflow/pull/9778/files

Configuring logging would be useful if the bucket were automatically configured at the same time. You can do it using the product below for GCP.
https://github.com/GoogleCloudPlatform/k8s-config-connector

@Datkros
Copy link
Contributor Author

Datkros commented Jul 13, 2020

How do you benefit from changing your environment configuration? I am afraid that in a moment we will have in Helm copies of all options, but with a different interface and worse documentation.

# Environment variables for all airflow containers
env: []
# - name: ""
# value: ""

An additional option in Helm Chart is useful when changing options in Airflow also requires changing elsewhere in Helm. An example here:
https://github.com/apache/airflow/pull/9778/files

Configuring logging would be useful if the bucket were automatically configured at the same time. You can do it using the product below for GCP.
https://github.com/GoogleCloudPlatform/k8s-config-connector

It benefits me because we'd be managing multiple airflow instances with different logging folders and credentials, this is specially important if it's across multiple teams / use cases, as some of them would have different isolated environment configuration (prod v staging i.e).

I could always overwrite it as an env var as you mentioned, but we'd lose visibility on what's the current configuration running by just looking at the source. In this particular use case, everything is in AWS, so it'd be preferable to use terraform / cloudformation (or maybe someone else would prefer Puppet, Chef, or ansible) to better manage those bucket and credential resources instead of locking you into a solution through having the helm release create the buckets as well. This is what happens right now by having the helm chart depend on the k8s secret creation, as you cannot choose somewhere else to store DB DNS secrets, locking you into that solution.

Essentially, one big issue of having both and the configmap is that your entire configuration becomes disjointed, being partially loaded through a .cfg, and from environment variables. It becomes hard to track what gets changed where, what is being already provided to you. Not having a single entry source makes it difficult for end users to see the full picture of their airflow configuration without having to dig through the code. This is why I chose to modify the configmap, as I believe that maybe it should be exposed through here since airflow already has templates for how this configuration would look like.

@Datkros
Copy link
Contributor Author

Datkros commented Jul 13, 2020

To clarify on the above, would this env be the entry source for any configuration that needs to be passed to Airflow itself? (is this the consensus?) As far as I understood, it seems to be more about k8s pods, than isolated to airflow only.

@mik-laj
Copy link
Member

mik-laj commented Jul 13, 2020

we'd lose visibility on what's the current configuration running by just looking at the source.

In Airflow 1.101.11, I added new command - airflow config. Will this help solve this problem?

I am not involved in the development of the Helm Chart, so it is difficult for me to address the other topics, but maybe @potiuk or @potiuk will be able to help you. I don't know the full vision of how we want to develop it. I will only add that when it comes to logger configurations it is sometimes helpful to be configured by environment variables, because secrets are easier to provide - https://airflow.readthedocs.io/en/latest/configurations-ref.html#stackdriver-key-path

@Datkros
Copy link
Contributor Author

Datkros commented Jul 14, 2020

we'd lose visibility on what's the current configuration running by just looking at the source.

In Airflow 1.101.11, I added new command - airflow config. Will this help solve this problem?

Not really, these commands are meant for adhoc configurations, not direct injections or automations, right? I

@Datkros
Copy link
Contributor Author

Datkros commented Jul 17, 2020

Closed in favor of #9816

@Datkros Datkros closed this Jul 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:helm-chart Airflow Helm Chart

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants