Skip to content

Fix compareAndSwap() in SQLMetadataConnector#7661

Merged
fjy merged 2 commits intoapache:masterfrom
jon-wei:fix_auth_update_fail
May 15, 2019
Merged

Fix compareAndSwap() in SQLMetadataConnector#7661
fjy merged 2 commits intoapache:masterfrom
jon-wei:fix_auth_update_fail

Conversation

@jon-wei
Copy link
Copy Markdown
Contributor

@jon-wei jon-wei commented May 14, 2019

This PR fixes SQLMetadataConnector.compareAndSwap() by adding " FOR UPDATE" to the initial read and using REPEATABLE_READ isolation level for the transaction.

This method is used by druid-basic-security extension APIs, and the broken CAS method was causing issues where some concurrent update requests (such as assigning roles to a user) failed to take effect.

A test is added that concurrently makes 100 role assignment requests and later concurrently removes the 100 roles.

clintropolis
clintropolis previously approved these changes May 15, 2019
Copy link
Copy Markdown
Contributor

@gianm gianm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!!

Do we need similar changes in IndexerSQLMetadataStorageCoordinator's announceHistoricalSegments method? There's a similar kind of CAS going on there.

@jon-wei
Copy link
Copy Markdown
Contributor Author

jon-wei commented May 15, 2019

Marking this WIP, while testing just found that Postgres has somewhat different behavior from Derby/MySQL:

https://www.postgresql.org/docs/10/transaction-iso.html#XACT-REPEATABLE-READ

...if the first updater commits (and actually updated or deleted the row, not just locked it) then the repeatable read transaction will be rolled back with the message ERROR: could not serialize access due to concurrent update because a repeatable read transaction cannot modify or lock rows changed by other transactions after the repeatable read transaction began. When an application receives this error message, it should abort the current transaction and retry the whole transaction from the beginning.

So I'll need to add retries for that case for postgres

Do we need similar changes in IndexerSQLMetadataStorageCoordinator's announceHistoricalSegments method? There's a similar kind of CAS going on there.

I'll take a look at that, thanks!

@jon-wei jon-wei removed the WIP label May 15, 2019
@jon-wei
Copy link
Copy Markdown
Contributor Author

jon-wei commented May 15, 2019

Added handling for Postgres serialization_failure error, removed WIP tag

Copy link
Copy Markdown
Member

@clintropolis clintropolis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm 👍

* the calling code will attempt retries.
*/
private boolean checkRootCauseForPSQLSerializationFailure(
Throwable root
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: method argument doesn't seem like it needs to be on a new line... alternatively any reason not to just inline this whole method in the compareAndSwap method?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the somewhat non-trivial boolean conditions I felt like it looked nicer as a separate method

Copy link
Copy Markdown

@surekhasaharan surekhasaharan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@surekhasaharan
Copy link
Copy Markdown

Do we need similar changes in IndexerSQLMetadataStorageCoordinator's announceHistoricalSegments method? There's a similar kind of CAS going on there.

I'll take a look at that, thanks!

Does that code need any fixes ?

@jon-wei jon-wei added this to the 0.15.0 milestone May 15, 2019
@jon-wei
Copy link
Copy Markdown
Contributor Author

jon-wei commented May 15, 2019

Do we need similar changes in IndexerSQLMetadataStorageCoordinator's announceHistoricalSegments method? There's a similar kind of CAS going on there.

I'll take a look at that, thanks!

Does that code need any fixes ?

I'm still looking into that, but it can be a separate PR if there are similar issues there.

@fjy fjy merged commit 6901123 into apache:master May 15, 2019
jon-wei added a commit to jon-wei/druid that referenced this pull request May 17, 2019
* Fix compareAndSwap() in SQLMetadataConnector

* Catch serialization_failure and retry for Postgres
jon-wei added a commit that referenced this pull request May 17, 2019
* Fix compareAndSwap() in SQLMetadataConnector

* Catch serialization_failure and retry for Postgres
jon-wei added a commit to implydata/druid-public that referenced this pull request May 20, 2019
* Fix compareAndSwap() in SQLMetadataConnector

* Catch serialization_failure and retry for Postgres
jihoonson pushed a commit to implydata/druid-public that referenced this pull request Jun 4, 2019
* Fix compareAndSwap() in SQLMetadataConnector

* Catch serialization_failure and retry for Postgres
jihoonson pushed a commit to implydata/druid-public that referenced this pull request Jun 26, 2019
* Fix compareAndSwap() in SQLMetadataConnector

* Catch serialization_failure and retry for Postgres
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants