Skip to content

Conversation

@potiuk
Copy link
Member

@potiuk potiuk commented Sep 20, 2021

MSSQL has been somewhat experimental in the main branch, but as
we near releasing for 2.2.0 version, the image should support
the mssql at the level as it supports other databases.

This PR adds proper support for both PROD and CI images.


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.

@potiuk potiuk requested review from ashb and kaxil as code owners September 20, 2021 17:14
@boring-cyborg boring-cyborg bot added area:dev-tools area:production-image Production image improvements and fixes kind:documentation labels Sep 20, 2021
@potiuk potiuk requested a review from mik-laj September 20, 2021 17:14
@mik-laj
Copy link
Member

mik-laj commented Sep 20, 2021

Could MSSQL be used as a celery backend?

if [[ ${BACKEND} == "postgres"* ]]; then
detected_port=5432
elif [[ ${BACKEND} == "mysql"* ]]; then
detected_port=3306
elif [[ ${BACKEND} == "redis"* ]]; then
detected_port=6379
elif [[ ${BACKEND} == "amqp"* ]]; then
detected_port=5672
fi

@potiuk
Copy link
Member Author

potiuk commented Sep 20, 2021

Could MSSQL be used as a celery backend?

Very good point! Indeed I believe it can

@potiuk potiuk force-pushed the add-mssql-support-to-dockerfile branch from 17f1c54 to 6579ba9 Compare September 20, 2021 19:00
@potiuk
Copy link
Member Author

potiuk commented Sep 20, 2021

Could MSSQL be used as a celery backend?

Added support for the entrypoint . Good catch @mik-laj !

@mik-laj
Copy link
Member

mik-laj commented Sep 21, 2021

I am missing one more element. On the database configuration page, we should add an example URL address.

For SQLite, we have the following fragment:

An example URI for the sqlite database:

sqlite:////home/airflow/airflow.db

For MySQL, we have the following fragment:

We recommend using the mysqlclient driver and specifying it in your SqlAlchemy connection string.

mysql+mysqldb://:@[:]/
But we also support the mysql-connector-python driver, which lets you connect through SSL without any cert options provided.

mysql+mysqlconnector://:@[:]/

For. PostgresSQL, we have the following fragment:

We recommend using the psycopg2 driver and specifying it in your SqlAlchemy connection string.

postgresql+psycopg2://:@/

Only MSSQL does not include an example URI address for me to be able to. configure the database. SQLAlchemy supports multiple drivers. Which one is the best for production deployment?

# pyodbc
mssql+pyodbc://scott:tiger@mydsn

# pymssql
mssql+pymssql://scott:tiger@hostname:port/dbname

# mxODBC
mssql+mxodbc://<username>:<password>@<dsnname>

@potiuk
Copy link
Member Author

potiuk commented Sep 21, 2021

Only MSSQL does not include an example URI address for me to be able to. configure the database. SQLAlchemy supports multiple drivers. Which one is the best for production deployment?

@aneesh-joseph ?

MSSQL has been somewhat experimental in the `main` branch, but as
we near releasing for 2.2.0 version, the image should support
the mssql at the level as it supports other databases.

This PR adds proper support for both PROD and CI images.
@potiuk potiuk force-pushed the add-mssql-support-to-dockerfile branch from 6579ba9 to 23596ce Compare September 21, 2021 01:03
@potiuk
Copy link
Member Author

potiuk commented Sep 21, 2021

Only MSSQL does not include an example URI address for me to be able to. configure the database. SQLAlchemy supports multiple drivers. Which one is the best for production deployment?

I added mssql+pyodbc as this is the one a) we support in the the image b) we test during CI. Since this is rather fresh - I will leave it at that until we have an evidence that other drivers should be preferred (but then we should also change our tests and image to use them).

@potiuk
Copy link
Member Author

potiuk commented Sep 21, 2021

good comments - All solved @mik-laj

@mik-laj
Copy link
Member

mik-laj commented Sep 21, 2021

@potiuk SQLAlchemy published some recommendations about preferred drivers:
PyODBC has the best Unicode support.
https://docs.sqlalchemy.org/en/14/dialects/mssql.html#driver-unicode-support

Deprecated since version 1.4: The mxODBC DBAPI is deprecated and will be removed in a future version. Please use one of the supported DBAPIs to connect to mssql.

https://docs.sqlalchemy.org/en/14/dialects/mssql.html#dialect-mssql-mxodbc-connect

pymssql is currently not included in SQLAlchemy’s continuous integration (CI) testing.

https://docs.sqlalchemy.org/en/14/dialects/mssql.html#module-sqlalchemy.dialects.mssql.pymssql

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Sep 21, 2021
@github-actions
Copy link

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.


.. code-block:: text

mssql+pyodbc://<user>:<password>@<host>
Copy link
Member

@mik-laj mik-laj Sep 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On CI, we use a little more complex URI. This leads me to suppose that this example may be incomplete.

- AIRFLOW__CORE__SQL_ALCHEMY_CONN=mssql+pyodbc://sa:Airflow123@mssql:1433/airflow?driver=ODBC+Driver+17+for+SQL+Server
- AIRFLOW__CELERY__RESULT_BACKEND=db+mssql+pyodbc://sa:Airflow123@mssql:1433/airflow?driver=ODBC+Driver+17+for+SQL+Server

Copy link
Member Author

@potiuk potiuk Sep 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the default one from the sqlalchemy docs (I took it from there):

https://docs.sqlalchemy.org/en/14/core/engines.html#microsoft-sql-server

We use mor complex one, but those are optional parameters, and I wouldn't dive into details of it (especially not before we have first users using 'for real' and making comments about it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a database name to the example? I think it is a commonly used parameter. It needed to set up a database created in the paragraph above.

mssql+pyodbc://<user>:<password>@<host>
mssql+pyodbc://<user>:<password>@<host>/<database_name>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created a PR: #18404

@potiuk potiuk merged commit ab7acfd into apache:main Sep 21, 2021
@potiuk potiuk deleted the add-mssql-support-to-dockerfile branch September 21, 2021 09:27
@ashb
Copy link
Member

ashb commented Sep 21, 2021

How much does this add to the docker images size btw?

potiuk added a commit to potiuk/airflow that referenced this pull request Sep 21, 2021
The ODBC extra has been missing from apache#18382. This PR adds the
missing extra and verifies if pyodbc is importable in the PROD
image.
potiuk added a commit that referenced this pull request Sep 21, 2021
The ODBC extra has been missing from #18382. This PR adds the
missing extra and verifies if pyodbc is importable in the PROD
image.
@potiuk
Copy link
Member Author

potiuk commented Sep 21, 2021

How much does this add to the docker images size btw?

It increases the size of the image by 4MB - from 972 MB to 976 MB (~ 0.5%) See the discussion here: https://apache-airflow.slack.com/archives/CQAMHKWSJ/p1632131354042400

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:dev-tools area:production-image Production image improvements and fixes full tests needed We need to run full set of tests for this PR to merge kind:documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants