From a2b36f23cf601a094ab7b9832eeebcb82fe1c81a Mon Sep 17 00:00:00 2001
From: Jarek Potiuk
Date: Sat, 22 Apr 2023 14:17:37 +0200
Subject: [PATCH] Add instructions on how to avoid accidental airflow upgrade/downgrade

Some of our users raised issues that, when extending the image, airflow
suddenly started reporting problems with database versions and migrations
not applied or out of sync. This almost always turns out to be a
dependency conflict that leads to an automated downgrade or upgrade of
the installed airflow version. This is, obviously, undesired (you should
be upgrading airflow consciously rather than accidentally).

However, there is no way to prevent this implicitly: `pip` might decide
to upgrade or downgrade airflow as it sees fit. From its point of view,
airflow is just one of the packages and has no special meaning. The only
way to "keep" the airflow version is to specify it together with the
other requirements, pinned to the specific version.

This PR updates our examples to do this and explains why airflow is
added there. There is, of course, another risk that the user will
forget to update the version of airflow when they upgrade; however,
since this is an explicit action performed during image extension, it is
much easier to diagnose and notice. We also warn the users that they
should update the pinned version when airflow is upgraded.
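The pinning rule described above can be sketched as a small check. This is only an illustration, not part of the patch: `find_airflow_pin` and `pin_matches_image` are hypothetical helper names, and the sketch assumes the image tag is a bare version (a tag with a suffix such as a Python version would need extra handling).

```python
import re
from typing import List, Optional

# Matches an exact pin such as "apache-airflow==2.6.0.dev0".
# Provider packages ("apache-airflow-providers-...") do not match,
# because "==" must follow "apache-airflow" directly.
AIRFLOW_PIN = re.compile(r"^apache-airflow==(\S+)$")


def find_airflow_pin(requirements: List[str]) -> Optional[str]:
    """Return the pinned apache-airflow version, or None if it is not pinned."""
    for line in requirements:
        match = AIRFLOW_PIN.match(line.strip())
        if match:
            return match.group(1)
    return None


def pin_matches_image(requirements: List[str], image_tag: str) -> bool:
    """True only when apache-airflow is pinned to the same version as the image tag."""
    return find_airflow_pin(requirements) == image_tag


requirements = ["beautifulsoup4==4.10.0", "apache-airflow==2.6.0.dev0"]
print(pin_matches_image(requirements, "2.6.0.dev0"))
```

A check like this only confirms that the pin is present and matches; the conflict error itself still comes from `pip` at install time.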
---
 .../test_examples_of_prod_image_building.py   |  4 +-
 docs/docker-stack/build.rst                   | 52 +++++++++++++++++++
 .../customizing/own-requirements.sh           |  1 +
 .../extending/add-providers/Dockerfile        |  2 +-
 .../extending/add-pypi-packages/Dockerfile    |  2 +-
 .../add-requirement-packages/Dockerfile       |  2 +-
 .../extending/custom-providers/Dockerfile     |  2 +-
 7 files changed, 60 insertions(+), 5 deletions(-)

diff --git a/docker_tests/test_examples_of_prod_image_building.py b/docker_tests/test_examples_of_prod_image_building.py
index 858e4c1e8fc9c..178d62680a59e 100644
--- a/docker_tests/test_examples_of_prod_image_building.py
+++ b/docker_tests/test_examples_of_prod_image_building.py
@@ -55,9 +55,11 @@ def test_dockerfile_example(dockerfile):
     rel_dockerfile_path = Path(dockerfile).relative_to(DOCKER_EXAMPLES_DIR)
     image_name = str(rel_dockerfile_path).lower().replace("/", "-")
     content = Path(dockerfile).read_text()
+    latest_released_version: str = get_latest_airflow_version_released()
     new_content = re.sub(
-        r"FROM apache/airflow:.*", rf"FROM apache/airflow:{get_latest_airflow_version_released()}", content
+        r"FROM apache/airflow:.*", rf"FROM apache/airflow:{latest_released_version}", content
     )
+    new_content = re.sub(r"apache-airflow==\S*", rf"apache-airflow=={latest_released_version}", new_content)
     try:
         run_command(
             ["docker", "build", ".", "--tag", image_name, "-f", "-"],
diff --git a/docs/docker-stack/build.rst b/docs/docker-stack/build.rst
index 0020697cdb6e3..7beed0afed6b0 100644
--- a/docs/docker-stack/build.rst
+++ b/docs/docker-stack/build.rst
@@ -55,6 +55,17 @@ The following example adds ``lxml`` python package from PyPI to the image. When
 ``pip`` you need to use the ``airflow`` user rather than ``root``. Attempts to install ``pip`` packages
 as ``root`` will fail with an appropriate error message.
 
+.. note::
+    In the example below, we also add the apache-airflow package to be installed, in the very same version
+    as the image you are extending. This is not strictly necessary, but it is a good practice
+    to always install the same version of apache-airflow as the one you are using. This way you can
+    be sure that the version you are using is the same as the one you are extending. In some cases where
+    your new packages have conflicting dependencies, ``pip`` might decide to downgrade or upgrade
+    apache-airflow for you, so adding it explicitly is a good practice: this way, if you have conflicting
+    requirements, you will get an error message with conflict information, rather than a surprise
+    downgrade or upgrade of airflow. If you upgrade the airflow base image, you should also update the version
+    to match the new version of airflow.
+
 .. exampleinclude:: docker-examples/extending/add-pypi-packages/Dockerfile
     :language: Dockerfile
     :start-after: [START Dockerfile]
@@ -67,11 +78,26 @@ The following example adds few python packages from ``requirements.txt`` from Py
 Note that similarly when adding individual packages, you need to use the ``airflow`` user rather than
 ``root``. Attempts to install ``pip`` packages as ``root`` will fail with an appropriate error message.
 
+.. note::
+    In the example below, we also add the apache-airflow package to be installed, in the very same version
+    as the image you are extending. This is not strictly necessary, but it is a good practice
+    to always install the same version of apache-airflow as the one you are using. This way you can
+    be sure that the version you are using is the same as the one you are extending. In some cases where
+    your new packages have conflicting dependencies, ``pip`` might decide to downgrade or upgrade
+    apache-airflow for you, so adding it explicitly is a good practice: this way, if you have conflicting
+    requirements, you will get an error message with conflict information, rather than a surprise
+    downgrade or upgrade of airflow. If you upgrade the airflow base image, you should also update the version
+    to match the new version of airflow.
+
+
 .. exampleinclude:: docker-examples/extending/add-requirement-packages/Dockerfile
     :language: Dockerfile
     :start-after: [START Dockerfile]
     :end-before: [END Dockerfile]
 
+.. exampleinclude:: docker-examples/extending/add-requirement-packages/requirements.txt
+    :language: text
+
 Embedding DAGs
 ..............
 
@@ -385,11 +411,25 @@ The following example adds few python packages from ``requirements.txt`` from Py
 Note that similarly when adding individual packages, you need to use the ``airflow`` user rather than
 ``root``. Attempts to install ``pip`` packages as ``root`` will fail with an appropriate error message.
 
+.. note::
+    In the example below, we also add the apache-airflow package to be installed, in the very same version
+    as the image you are extending. This is not strictly necessary, but it is a good practice
+    to always install the same version of apache-airflow as the one you are using. This way you can
+    be sure that the version you are using is the same as the one you are extending. In some cases where
+    your new packages have conflicting dependencies, ``pip`` might decide to downgrade or upgrade
+    apache-airflow for you, so adding it explicitly is a good practice: this way, if you have conflicting
+    requirements, you will get an error message with conflict information, rather than a surprise
+    downgrade or upgrade of airflow. If you upgrade the airflow base image, you should also update the version
+    to match the new version of airflow.
+
 .. exampleinclude:: docker-examples/extending/add-requirement-packages/Dockerfile
     :language: Dockerfile
     :start-after: [START Dockerfile]
     :end-before: [END Dockerfile]
 
+.. exampleinclude:: docker-examples/extending/add-requirement-packages/requirements.txt
+    :language: text
+
 Example when writable directory is needed
 .........................................
 
@@ -558,6 +598,18 @@ You can use ``docker-context-files`` for the following purposes:
 * you can place ``requirements.txt`` and add any ``pip`` packages you want to install in the
   ``docker-context-file`` folder. Those requirements will be automatically installed during the build.
 
+.. note::
+    In the example below, we also add the apache-airflow package to be installed, in the very same version
+    as the image you are extending. This is not strictly necessary, but it is a good practice
+    to always install the same version of apache-airflow as the one you are using. This way you can
+    be sure that the version you are using is the same as the one you are extending. In some cases where
+    your new packages have conflicting dependencies, ``pip`` might decide to downgrade or upgrade
+    apache-airflow for you, so adding it explicitly is a good practice: this way, if you have conflicting
+    requirements, you will get an error message with conflict information, rather than a surprise
+    downgrade or upgrade of airflow. If you upgrade the airflow base image, you should also update the version
+    to match the new version of airflow.
+
+
 .. exampleinclude:: docker-examples/customizing/own-requirements.sh
     :language: bash
     :start-after: [START build]
diff --git a/docs/docker-stack/docker-examples/customizing/own-requirements.sh b/docs/docker-stack/docker-examples/customizing/own-requirements.sh
index a6e5f297b2e27..fc614f86e91b9 100755
--- a/docs/docker-stack/docker-examples/customizing/own-requirements.sh
+++ b/docs/docker-stack/docker-examples/customizing/own-requirements.sh
@@ -28,6 +28,7 @@ mkdir -p docker-context-files
 
 cat <<EOF >./docker-context-files/requirements.txt
 beautifulsoup4==4.10.0
+apache-airflow==2.6.0.dev0
 EOF
 
 export DOCKER_BUILDKIT=1
diff --git a/docs/docker-stack/docker-examples/extending/add-providers/Dockerfile b/docs/docker-stack/docker-examples/extending/add-providers/Dockerfile
index aa3738f571d6a..faad8ab8fd95f 100644
--- a/docs/docker-stack/docker-examples/extending/add-providers/Dockerfile
+++ b/docs/docker-stack/docker-examples/extending/add-providers/Dockerfile
@@ -25,5 +25,5 @@ RUN apt-get update \
   && rm -rf /var/lib/apt/lists/*
 USER airflow
 ENV JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
-RUN pip install --no-cache-dir apache-airflow-providers-apache-spark==2.1.3
+RUN pip install --no-cache-dir apache-airflow-providers-apache-spark==2.1.3 apache-airflow==2.6.0.dev0
 # [END Dockerfile]
diff --git a/docs/docker-stack/docker-examples/extending/add-pypi-packages/Dockerfile b/docs/docker-stack/docker-examples/extending/add-pypi-packages/Dockerfile
index 2f56a757d8256..a146091602617 100644
--- a/docs/docker-stack/docker-examples/extending/add-pypi-packages/Dockerfile
+++ b/docs/docker-stack/docker-examples/extending/add-pypi-packages/Dockerfile
@@ -16,5 +16,5 @@
 # This is an example Dockerfile. It is not intended for PRODUCTION use
 # [START Dockerfile]
 FROM apache/airflow:2.6.0.dev0
-RUN pip install --no-cache-dir lxml
+RUN pip install --no-cache-dir lxml apache-airflow==2.6.0.dev0
 # [END Dockerfile]
diff --git a/docs/docker-stack/docker-examples/extending/add-requirement-packages/Dockerfile b/docs/docker-stack/docker-examples/extending/add-requirement-packages/Dockerfile
index 9e41b40741aa3..82b12d4744263 100644
--- a/docs/docker-stack/docker-examples/extending/add-requirement-packages/Dockerfile
+++ b/docs/docker-stack/docker-examples/extending/add-requirement-packages/Dockerfile
@@ -17,5 +17,5 @@
 # [START Dockerfile]
 FROM apache/airflow:2.6.0.dev0
 COPY requirements.txt /
-RUN pip install --no-cache-dir -r /requirements.txt
+RUN pip install --no-cache-dir -r /requirements.txt apache-airflow==2.6.0.dev0
 # [END Dockerfile]
diff --git a/docs/docker-stack/docker-examples/extending/custom-providers/Dockerfile b/docs/docker-stack/docker-examples/extending/custom-providers/Dockerfile
index e5d4ad6eacf7f..882e0eb41dd8c 100644
--- a/docs/docker-stack/docker-examples/extending/custom-providers/Dockerfile
+++ b/docs/docker-stack/docker-examples/extending/custom-providers/Dockerfile
@@ -16,5 +16,5 @@
 # This is an example Dockerfile. It is not intended for PRODUCTION use
 # [START Dockerfile]
 FROM apache/airflow:2.6.0.dev0
-RUN pip install --no-cache-dir apache-airflow-providers-docker==2.5.1
+RUN pip install --no-cache-dir apache-airflow-providers-docker==2.5.1 apache-airflow==2.6.0.dev0
 # [END Dockerfile]
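
For reviewers, the two substitutions that `test_dockerfile_example` applies to each example Dockerfile can be tried in isolation. The sketch below is a minimal stand-alone version: the wrapper name `pin_example_to_version` and the target version `2.5.3` are made up for illustration, and only the two regular expressions come from the patch.

```python
import re


def pin_example_to_version(dockerfile_content: str, version: str) -> str:
    """Rewrite the base image tag and any apache-airflow pin to ``version``."""
    # Replace the tag on the FROM line; ".*" stops at the end of that line.
    content = re.sub(
        r"FROM apache/airflow:.*", f"FROM apache/airflow:{version}", dockerfile_content
    )
    # Replace every "apache-airflow==<old>" pin. Provider pins such as
    # "apache-airflow-providers-docker==2.5.1" are untouched, because the
    # pattern requires "==" directly after "apache-airflow".
    return re.sub(r"apache-airflow==\S*", f"apache-airflow=={version}", content)


example = (
    "FROM apache/airflow:2.6.0.dev0\n"
    "RUN pip install --no-cache-dir apache-airflow-providers-docker==2.5.1 "
    "apache-airflow==2.6.0.dev0\n"
)
print(pin_example_to_version(example, "2.5.3"))
```

Keeping both substitutions driven by the same version string is what lets the examples build against the latest released image without the pin and the base image drifting apart.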