Production-level support for MSSQL #18382
Could MSSQL be used as a Celery backend? See airflow/scripts/in_container/prod/entrypoint_prod.sh, lines 98 to 106 in f76eaec.

Very good point! Indeed, I believe it can.
Added support for the entrypoint. Good catch @mik-laj!

I am missing one more element. On the database configuration page, we should add an example URI. For SQLite, we have the following fragment:
For MySQL, we have the following fragment:
For PostgreSQL, we have the following fragment:
Only MSSQL does not include an example URI that I could use to configure the database. SQLAlchemy supports multiple drivers. Which one is the best for production deployment?

MSSQL has been somewhat experimental in the `main` branch, but as we near the 2.2.0 release, the image should support MSSQL at the same level as it supports other databases. This PR adds proper support for both PROD and CI images.
I added mssql+pyodbc, as this is the driver that a) we support in the image and b) we test during CI. Since this is rather fresh, I will leave it at that until we have evidence that other drivers should be preferred (but then we should also change our tests and image to use them).

Good comments. All solved, @mik-laj.

@potiuk SQLAlchemy published some recommendations about preferred drivers:
https://docs.sqlalchemy.org/en/14/dialects/mssql.html#dialect-mssql-mxodbc-connect
https://docs.sqlalchemy.org/en/14/dialects/mssql.html#module-sqlalchemy.dialects.mssql.pymssql

The PR most likely needs to run the full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly, please rebase it onto the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

.. code-block:: text

    mssql+pyodbc://<user>:<password>@<host>
On CI, we use a slightly more complex URI. This leads me to suspect that this example may be incomplete.
airflow/scripts/ci/docker-compose/backend-mssql.yml
Lines 23 to 24 in c686241
- AIRFLOW__CORE__SQL_ALCHEMY_CONN=mssql+pyodbc://sa:Airflow123@mssql:1433/airflow?driver=ODBC+Driver+17+for+SQL+Server
- AIRFLOW__CELERY__RESULT_BACKEND=db+mssql+pyodbc://sa:Airflow123@mssql:1433/airflow?driver=ODBC+Driver+17+for+SQL+Server
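To see which parts of the CI URI go beyond the documented example, the value can be split with Python's standard library. This is only an illustrative sketch: it uses `urllib.parse` rather than SQLAlchemy's own URL parser, and the names `CI_CONN`, `components`, and `celery_result_backend` are made up for this example.

```python
from urllib.parse import parse_qs, urlsplit

# Connection string copied from the backend-mssql.yml lines quoted above.
CI_CONN = (
    "mssql+pyodbc://sa:Airflow123@mssql:1433/airflow"
    "?driver=ODBC+Driver+17+for+SQL+Server"
)

parts = urlsplit(CI_CONN)
components = {
    "dialect+driver": parts.scheme,
    "user": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,
    "database": parts.path.lstrip("/"),
    # parse_qs decodes "+" back into spaces in the ODBC driver name.
    "odbc_driver": parse_qs(parts.query)["driver"][0],
}

# In the quoted file, the Celery result backend is the same URI with a "db+" prefix.
celery_result_backend = "db+" + CI_CONN

for name, value in components.items():
    print(f"{name}: {value}")
```

Compared with `mssql+pyodbc://<user>:<password>@<host>`, the CI value adds a port, a database name, and a `driver` query parameter.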
This is the default one from the SQLAlchemy docs (I took it from there):
https://docs.sqlalchemy.org/en/14/core/engines.html#microsoft-sql-server
We use a more complex one, but those are optional parameters, and I wouldn't dive into the details of it (especially not before we have the first users using it 'for real' and commenting on it).
Should we add a database name to the example? I think it is a commonly used parameter. It is needed to connect to the database created in the paragraph above.
mssql+pyodbc://<user>:<password>@<host>
mssql+pyodbc://<user>:<password>@<host>/<database_name>
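The two forms differ only in the trailing path segment. A hypothetical helper shows how the optional pieces stack on top of the minimal URI. The function `build_mssql_uri` and its parameters are assumptions made for this sketch and are not part of Airflow or SQLAlchemy.

```python
from urllib.parse import quote_plus, urlencode


def build_mssql_uri(user, password, host, port=None, database=None, driver=None):
    """Assemble an mssql+pyodbc URI; port, database and driver are optional."""
    # Percent-encode credentials so characters such as "@" or "/" do not break the URI.
    uri = f"mssql+pyodbc://{quote_plus(user)}:{quote_plus(password)}@{host}"
    if port is not None:
        uri += f":{port}"
    if database is not None or driver is not None:
        # Keep the "/" before the query string even when no database is given.
        uri += "/" + (database or "")
    if driver is not None:
        # The ODBC driver name goes in the query string, with spaces encoded as "+".
        uri += "?" + urlencode({"driver": driver})
    return uri


# Minimal form, as in the current docs example.
print(build_mssql_uri("sa", "Airflow123", "mssql"))
# With the database name, as proposed in this comment.
print(build_mssql_uri("sa", "Airflow123", "mssql", database="airflow"))
# With every optional part, matching the URI used on CI.
print(build_mssql_uri("sa", "Airflow123", "mssql", port=1433, database="airflow",
                      driver="ODBC Driver 17 for SQL Server"))
```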
I created a PR: #18404

How much does this add to the Docker image size, btw?
The ODBC extra has been missing from apache#18382. This PR adds the missing extra and verifies if pyodbc is importable in the PROD image.
It increases the size of the image by 4 MB, from 972 MB to 976 MB (~0.5%). See the discussion here: https://apache-airflow.slack.com/archives/CQAMHKWSJ/p1632131354042400
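An importability check of the kind mentioned above could look roughly like the following. This is a sketch, not the verification script that the follow-up PR adds; the helper name `is_importable` is hypothetical.

```python
import importlib


def is_importable(module_name):
    """Return True if the named module can actually be imported."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # pyodbc is the driver behind the mssql+pyodbc dialect discussed in this thread.
    status = "OK" if is_importable("pyodbc") else "MISSING"
    print(f"pyodbc: {status}")
```

Importing the module, rather than only locating it, also catches the case where the Python package is installed but a shared library it depends on is missing from the image.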