diff --git a/docs/conf.py b/docs/conf.py
index 10dd8cee42a59..f2065439750dd 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -350,7 +350,7 @@ def _get_rst_filepath_from_path(filepath: pathlib.Path):
 manual_substitutions_in_generated_html = ["example-dags.html", "operators.html", "index.html"]
 if PACKAGE_NAME == "docker-stack":
     # Substitute in links
-    manual_substitutions_in_generated_html = ["build.html"]
+    manual_substitutions_in_generated_html = ["build.html", "index.html"]
 
 html_css_files = ["custom.css"]
 
diff --git a/docs/docker-stack/index.rst b/docs/docker-stack/index.rst
index 07d845bf5dd2d..38f2b11cb3d5b 100644
--- a/docs/docker-stack/index.rst
+++ b/docs/docker-stack/index.rst
@@ -71,7 +71,7 @@ The Apache Airflow image provided as convenience package is optimized for size,
 it provides just a bare minimal set of the extras and dependencies installed and in most cases
 you want to either extend or customize the image. You can see all possible extras in :doc:`apache-airflow:extra-packages-ref`.
 The set of extras used in Airflow Production image are available in the
-`Dockerfile `_.
+`Dockerfile `__.
 
 However, Airflow has more than 60 community-managed providers (installable via extras) and some of the
 default extras/providers installed are not used by everyone, sometimes others extras/providers
@@ -94,29 +94,114 @@ not even when those versions contain critical security fixes. The process of Air
 around upgrading dependencies automatically where applicable but only when we release a new version of Airflow,
 not for already released versions.
 
-If you want to make sure that Airflow dependencies are upgraded to the latest released versions containing
-latest security fixes in the image you use, you should implement your own process to upgrade
-those yourself when you build custom image based on the Airflow reference one.
Airflow usually does not
-upper-bound versions of its dependencies via requirements, so you should be able to upgrade them to the
-latest versions - usually without any problems. And you can follow the process described in
-:ref:`Building the image ` to do it (even in automated way).
-
-Obviously - since we have no control over what gets released in new versions of the dependencies, we
-cannot give any guarantees that tests and functionality of those dependencies will be compatible with
-Airflow after you upgrade them - testing if Airflow still works with those is in your hands,
-and in case of any problems, you should raise issue with the authors of the dependencies that are problematic.
-You can also - in such cases - look at the `Airflow issues `_
-`Airflow Pull Requests `_ and
-`Airflow Discussions `_, searching for similar
-problems to see if there are any fixes or workarounds found in the ``main`` version of Airflow and apply them
-to your custom image.
-
-The easiest way to keep-up with the latest released dependencies is however, to upgrade to the latest released
-Airflow version via switching to newly released images as base for your images, when a new version of
-Airflow is released. Whenever we release a new version of Airflow, we upgrade all dependencies to the latest
-applicable versions and test them together, so if you want to keep up with those tests - staying up-to-date
-with latest version of Airflow is the easiest way to update those dependencies.
-
+What should I do if my security scan shows critical and high vulnerabilities in the image?
+==========================================================================================
+
+Users often tell us that they ran various security scanners on the image and found
+critical and high vulnerabilities in it - coming not from Airflow itself but from other
+components.
In general, it is normal and expected that such vulnerabilities are found in the image after
+it has been released - precisely because we do NOT update the images after they are released, as
+explained above. Sometimes even the latest releases contain vulnerabilities that are not yet fixed
+in the base image we use or in dependencies that we cannot upgrade, because some of our providers
+have version limits that have not been lifted yet and we have no control over that. So it is possible
+that even the most recent release of our image contains some High and Critical vulnerabilities that
+are not yet fixed.
+
+**What can you do in such a case?**
+
+First of all - you should know what NOT to do.
+
+Do NOT send a private email to the Airflow Security Team with scan results, asking what to do.
+The Airflow Security Team deals exclusively with undisclosed vulnerabilities in Airflow itself, not
+in the dependencies or in the base image. The security email should only be used to privately report
+security issues that can be exploited via Airflow. This is explained in our
+`Security Policy `__ where you can find more details,
+including the need to provide reproducible scenarios and to submit ONE issue per email. NEVER submit multiple
+vulnerabilities in one email - those are rejected immediately, as they make the process of handling the issue
+much harder for everyone, including the reporters.
+
+Also do NOT open a GitHub Issue with the scan results, asking what to do. GitHub Issues are for
+reporting bugs and feature requests for Airflow itself, not for asking for help with security scans of
+3rd-party components.
+
+So what are your options?
+
+You have four options:
+
+1. Build your own custom image following the examples we share - using the latest base image and
+   possibly manually bumping the dependencies you want to bump. There are quite a few examples
+   in :ref:`Building the image ` which you can follow.
You can use the "slim" image as a base
+   for your images: rather than basing your image on the "reference" image, which has a number of extras
+   and providers installed, you can install only what you actually need and upgrade some dependencies that
+   otherwise could not be upgraded - because some of the provider libraries have version limits that
+   have not been lifted yet and we have no control over that. This is the most flexible way to
+   build your image, and you can combine it with a process for quickly upgrading to the latest Airflow
+   versions (see point 2. below).
+
+2. Wait for a new version of Airflow and upgrade to it. Airflow images are updated to the latest "non-conflicting"
+   dependencies and use the latest "base" image at release time, so what you have in the reference images
+   at the moment we publish the image / release the version is the "latest and greatest"
+   available at that moment for the base platform we use (Debian Bookworm is the reference base image).
+   This is a good strategy to take - build a process to upgrade your Airflow version regularly,
+   soon after each release by the community. This will help you keep up with the latest
+   security fixes in the dependencies.
+
+3. If the base platform we use (currently Debian Bookworm) does not contain the latest versions you want
+   and you want to use another base image, you can look at the system dependencies and scripts
+   in the latest ``Dockerfile`` of Airflow, take inspiration from it and build your own image,
+   or copy it and modify it to your needs. See the
+   `Dockerfile `__ for the latest version.
+
+4. Research whether the vulnerability affects you. Even if there is a dependency with a high or critical
+   vulnerability, it does not mean that it can be exploited in Airflow (or specifically in the way you are
+   using Airflow).
If you do have a reproducible scenario showing how a vulnerability can be exploited in Airflow, you should -
+   of course - privately report it to the security team. But if you do not have a reproducible
+   scenario, please do some research and try to understand the impact of the vulnerability on Airflow. That
+   research might result in a public GitHub Discussion, where you can discuss the impact of the
+   vulnerability if your research indicates Airflow might not be impacted, or in a private security email if
+   you find a reproducible scenario for exploiting it.
+
+
+**How do I publicly discuss public CVEs in the image?**
+
+The security scans report public vulnerabilities in 3rd-party components of Airflow. Since those
+vulnerabilities are already public, you can talk about them openly, and others are talking about them too.
+So you can do research on your own first. Try to find discussions about the issues and how others handled
+them, and possibly even explore whether the vulnerability can be exploited in Airflow or not.
+This is a very valuable contribution you can make to the community, helping others to
+understand the impact of the vulnerability on Airflow. We highly appreciate it when our commercial users do this,
+because Airflow is maintained by volunteers. If you or your company can spend some of your security
+researchers' time and skills to help the community understand the impact of the vulnerability on Airflow, it
+can be a fantastic contribution to the community and a way to give back to the project that your company uses
+for free.
+
+You are free to discuss it publicly: open a `GitHub Discussion `_
+mentioning your findings and the research you have done so far. Ideally (as a way to contribute to Airflow) you
+should share the findings of your company's own security team to help research and understand
+the impact of the vulnerability on Airflow (and your way of using it).
+Again - we strongly suggest opening ONE discussion per vulnerability.
You should NOT post scan results in
+bulk - this is not helpful for a discussion, and you will not get meaningful answers if you attempt to
+discuss all the issues in one discussion thread.
+
+Yes - we know the easy way is to copy & paste your results and ask others what to do, but doing so is
+likely to be met with silence, because the community sees such actions as a rather selfish way of
+getting your problems solved by tapping into the time of volunteers, without spending your own time on making it
+easier for them to help. If you really want to get help from the community, focus your discussion on
+a particular CVE and provide your findings - including a detailed analysis of your report that identifies which
+binaries and base images exactly are causing the scanner to report the vulnerability. Remember that only
+you have access to your scanner, so you should bring as much helpful information as possible so that others can
+comment on it. Show that you have done your homework and that you bring valuable information to the community.
+
+Opening a GitHub Discussion for this kind of issue is also a great way to communicate with the
+maintainers and security team in an open and transparent way - without resorting to the private security
+mailing list (which serves a different purpose, as explained above). If such a discussion reveals
+a way to remove the vulnerability from the scanned image - great, you can even contribute a PR to the
+Dockerfile to remove the vulnerability from the image. Maybe such a discussion will lead to a PR that allows
+Airflow to upgrade to a newer dependency that fixes the vulnerability or removes it altogether, or maybe
+there is already a way to mitigate it, or maybe someone is already working on a PR to fix it.
+All this can (and should) be discussed publicly and transparently in a GitHub Discussion, not via private
+security email or GitHub Issues, which are exclusively for issues in Airflow itself, not for public
+security issues in 3rd-party components.
 
 Support
 =======