Skip to content

How to avoid using transactions / Long running processes are using transactions that have expired #34

@neilferreira

Description

@neilferreira

I've been encountering an issue where I have a long-running process that primarily only reads data and is constantly returning (BadRequestException) when calling the ExecuteStatement operation: Transaction is not found.

According to AWS: A transaction times out if there are no calls that use its transaction ID in three minutes (ref)

An example to reproduce this issue:

with aurora_data_api.connect(aurora_cluster_arn=cluster_arn, secret_arn=secret_arn, database=database) as conn:
    with conn.cursor() as cursor:
        while True:
            cursor.execute("select * from pg_catalog.pg_tables")
            print(cursor.fetchall())
            time.sleep(3 * 60 + 1)

This problem stems from the following code:

    def cursor(self):
        if self._transaction_id is None:
            res = self._client.begin_transaction(database=self._dbname,
                                                 resourceArn=self._aurora_cluster_arn,
                                                 # schema="string", TODO
                                                 secretArn=self._secret_arn)
            self._transaction_id = res["transactionId"]
        cursor = AuroraDataAPICursor(client=self._client,
                                     dbname=self._dbname,
                                     aurora_cluster_arn=self._aurora_cluster_arn,
                                     secret_arn=self._secret_arn,
                                     transaction_id=self._transaction_id)

Things to note:

  • I have not started any transaction and have had no requirement or need to call .commit or .rollback as I have made no changes to the database through any UPDATE or DELETE queries
  • Transactions are deemed as optional in the rds-data API:

As per https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/data-api.html

--transaction-id (optional) – The identifier of a transaction that was started using the begin-transaction CLI command. Specify the transaction ID of the transaction that you want to include the SQL statement in.

According to this information, there is no need to actually create and use transactions to perform database queries.

Suggestion:

  • Update def cursor() to NOT automatically create a transaction if one doesn't currently exist
  • Introduce a begin method that starts a new transaction
  • Users can opt to perform their database work in transactions or not.

Do my suggestions seem to be valid? I'm happy to (try to) contribute this change back.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions