Description
Apache Airflow version: 1.10.10
Kubernetes version (if you are using kubernetes) (use kubectl version): N/A
Environment:
- Cloud provider or hardware configuration: ECS on AWS
- OS (e.g. from /etc/os-release): Debian GNU/Linux 9 (stretch)
- Kernel (e.g. uname -a): Linux 025b01778cfa 4.19.76-linuxkit #1 SMP Fri Apr 3 15:53:26 UTC 2020 x86_64 GNU/Linux
- Install tools:
- Others: Python 3.7.8
What happened:
When a DAG inside a packaged DAG zip refers to non-Python files, Airflow tries to access the file as if the zip file were a regular OS directory. This fails with the error Broken DAG: [/usr/local/airflow/dags/dags.zip] [Errno 20] Not a directory: '/usr/local/airflow/dags/dags.zip/test_dag/scripts/query.sql'.
What you expected to happen:
I expect any file access that works in an unzipped DAGs directory to also work in a packaged file.
This seems related to AIRFLOW-6853 and this Stack Overflow question. The cause seems to be that __file__ or path operations return values such as dags/dags.zip/package1/file.sql, which do not exist as paths on the file system.
Ideally, Airflow would handle this either transparently to the user or by providing a utility to load these files. At the very least this limitation should be documented, as there is currently no indication that this is not possible.
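To illustrate the kind of utility meant here, below is a minimal sketch using only the Python standard library. The function name read_packaged_text is hypothetical and is not part of Airflow. It assumes the path looks like the one in the error message, i.e. a zip file appears as one of the path components, and it reads the member out of the archive instead of opening it as a regular file.

```python
import os
import zipfile


def read_packaged_text(path, encoding="utf-8"):
    """Read a text file that may live inside a zip archive.

    Hypothetical helper, not an Airflow API. `path` may be a regular file
    path, or a path such as dags/dags.zip/package1/file.sql in which one
    of the components is a zip file.
    """
    path = os.path.normpath(path)

    # Unzipped case: the path is an ordinary file on disk.
    if os.path.isfile(path):
        with open(path, encoding=encoding) as handle:
            return handle.read()

    # Packaged case: walk up the path until a component is a zip archive.
    archive = os.path.dirname(path)
    while archive and archive != os.path.dirname(archive):
        if os.path.isfile(archive) and zipfile.is_zipfile(archive):
            # Member names inside a zip always use forward slashes.
            member = os.path.relpath(path, archive).replace(os.sep, "/")
            with zipfile.ZipFile(archive) as bundle:
                return bundle.read(member).decode(encoding)
        archive = os.path.dirname(archive)

    raise FileNotFoundError(path)
```

In a DAG file this could be called as read_packaged_text(os.path.join(os.path.dirname(__file__), "scripts", "query.sql")), with the returned string passed to the operator in place of a template file name. This only works around the path lookup; it does not make Jinja template loading itself aware of zip files.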
How to reproduce it:
Attached is a zip file that exhibits the error when used as a packaged DAG file, but loads correctly when unzipped. The unzipped DAG will raise a psycopg2 error because the accounts table is not set up. The packaged DAG file will fail with jinja2.exceptions.TemplateNotFound: scripts/test_sql.sql
dags.zip