Skip to content

Trino SQL split trailing semi-colon #34828

@MikeWallis42

Description

@MikeWallis42

Apache Airflow version

Other Airflow 2 version (please specify below)

What happened

We're currently using Airflow 2.6.1 but I believe the issue persists in the latest versions.

The SQLExecuteQueryOperator has the functionality to split_statements using a semi-colon. Trino is one of the few execution engines that does not accept a trailing semi-colon. sqlparse leaves the semi-colon after splitting.

The TrinoHook doesn't implement it's own run method so this is then handled by the DbApiHook.
split_statements on the DbApiHook uses sqlparse to do the splitting.

What you think should happen instead

When passing in multiple SQL statements that are semi-colon seperated, they should be split without keeping the semi-colon so that Trino considers it to be valid.

Because of Trino's current implementation if we want to make use of this functionality I believe that custom splitting logic should be implemented in Trino's hook.
If the Trino implementation is revised later on so that it accepts semi-colons then this could be removed again.

How to reproduce

Run the SQLExecuteQueryOperator using Trino with multiple SQL statements that are semi-colon seperated.
E.g.

SQLExecuteQueryOperator(
        task_id="semi_colon_test",
        conn_id="trino_default",
        sql="SELECT 1; SELECT 2",
        split_statements=True,
        handler=list,
    )

Operating System

Debian GNU/Linux 11 (bullseye)

Versions of Apache Airflow Providers

apache-airflow-providers-amazon==8.0.0
apache-airflow-providers-celery==3.1.0
apache-airflow-providers-common-sql==1.4.0
apache-airflow-providers-databricks==4.1.0
apache-airflow-providers-ftp==3.3.1
apache-airflow-providers-http==4.3.0
apache-airflow-providers-imap==3.1.1
apache-airflow-providers-mysql==5.0.0
apache-airflow-providers-postgres==5.4.0
apache-airflow-providers-presto==5.0.0
apache-airflow-providers-redis==3.1.0
apache-airflow-providers-sqlite==3.3.2
apache-airflow-providers-trino==5.0.0

Deployment

Other

Deployment details

Self hosted in AWS using ECS Fargate, RDS (Postgres) and Elasticache (Redis).

Anything else

Error message

trino.exceptions.TrinoUserError: TrinoUserError(type=USER_ERROR, name=SYNTAX_ERROR, message="line 1:84: mismatched input ';'. Expecting: '.', <EOF>", query_id=20231009_081239_17877_i2yca)

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions