[release/9.0] Acquire the migrations database lock in a transaction.#34489
[release/9.0] Acquire the migrations database lock in a transaction.#34489AndriySvyryd merged 4 commits intorelease/9.0from
Conversation
|
@roji Let me know if this is an acceptable design |
roji
left a comment
There was a problem hiding this comment.
@AndriySvyryd thanks, overall looks to me... There are some slight things I don't understand about flowing the state across Migrator/MigrationCommandExecutor, but that may just be the draft state of this PR. Otherwise see comments below.
I'm personally not seeing the value of the separation of Migrator vs. MigrationCommandExecutor - the separation may actually add more complexity/cognitive load rather than simplifying (especially now that transaction management and locking are also managed in Migrator...). Something to consider for the future...
This is mainly to reduce code duplication as it's also called from |
OK. This is academic at the moment in any case, I'm not actually proposing we do anything for 9.0... |
e41cba4 to
444fb34
Compare
444fb34 to
49f3b71
Compare
roji
left a comment
There was a problem hiding this comment.
I think it's getting closer @AndriySvyryd but I can still see some trouble (or stuff I don't understand), take a look.
| /// any release. You should only use it directly in your code with extreme caution and knowing that | ||
| /// doing so can result in application failures when updating to a new Entity Framework Core release. | ||
| /// </summary> | ||
| public abstract LockReleaseBehavior LockReleaseBehavior { get; } |
There was a problem hiding this comment.
Naming: I'd love to just call this LockingScope, but it's true that for Explicit we still take it ourselves, so it seems to indeed apply only to the release part...
There was a problem hiding this comment.
@dotnet/efteam API review question
There was a problem hiding this comment.
Am also OK with the current naming if we don't get around to discussing this (same with the other naming proposals below).
There was a problem hiding this comment.
Ok. We won't have any meetings for this, so if anyone else in @dotnet/efteam has an opinion - speak up
| /// <see langword="null"/> if there's no current transaction. | ||
| /// </param> | ||
| /// <returns>An object that can be disposed to release the lock.</returns> | ||
| IMigrationsDatabaseLock Refresh(bool connectionReopened, bool? transactionRestarted) |
There was a problem hiding this comment.
This feels like it might belong in the callsites, rather than as part of the lock implementation - but I'm not sure... The behavior here seems universal for the various LockReleaseBehavior values, i.e. I don't see an implementation of IMigrationsDatabaseLock ever needing to change it - do you have a specific case in mind?
If we're not sure, it may be better to not prematurely expose this hook before we have a request, or at least flag it as pubternal...
Otherwise, we may just want to call this Reacquire() instead of Refresh(), for consistency (or ReacquireIfNeeded())
There was a problem hiding this comment.
The main reason it's here is because I don't want to expose HistoryRepository to the callsites as that would allow them to acquire new locks.
The secondary benefit is having an escape hatch in case there's a behavior that can't be currently expressed in LockReleaseBehavior. Any request for this would be too late to incorporate into 9.0.0
There was a problem hiding this comment.
@dotnet/efteam API Review
There was a problem hiding this comment.
The main reason it's here is because I don't want to expose HistoryRepository to the callsites as that would allow them to acquire new locks.
OK, that may be a bit over-defensive for very low-level code that we maintain (or possibly a provider).
The secondary benefit is having an escape hatch in case there's a behavior that can't be currently expressed in LockReleaseBehavior. Any request for this would be too late to incorporate into 9.0.0
I'm a bit doubtful about the extent to which actual custom logic could be implemented in the current design... Simply because it's not really possible to override Migrator and MigrationCommandExecutor behavior which determines exactly where the lock gets taken/released...
Anyway, I'm OK with the design as-is. I'd maybe slightly prefer putting pubternal/experimental on this as if feels like we may want to refactor/improve the design later, but I'll defer to you.
| /// <see langword="null"/> if there's no current transaction. | ||
| /// </param> | ||
| /// <returns>An object that can be disposed to release the lock.</returns> | ||
| IMigrationsDatabaseLock Refresh(bool connectionReopened, bool? transactionRestarted) |
There was a problem hiding this comment.
nit: transactionStarted rather than restarted?
There was a problem hiding this comment.
I don't think this name accurately conveys the tri-value aspect.
There was a problem hiding this comment.
@dotnet/efteam API Review
There was a problem hiding this comment.
I don't think this name accurately conveys the tri-value aspect.
Maybe, but transactionRestarted is also inaccurate for the case where we're just starting out and there nothing to re-start.
But never mind, this isn't important and it's a bit late for minor naming discussions...
378c56a to
e27f07a
Compare
Reacquire the lock when retrying. Execute all migrations in a single transaction. Fixes #34439
5c7c4c5 to
3b884d9
Compare
|
@roji — is there a way to disable this? I’m using YugabyteDB (Postgres‑compatible), but it doesn’t support the advisory lock mechanism. How can I run automatic migrations without relying on advisory locks? |
Fixes #34439
Description
The migrations database lock introduced in #34115 doesn't work with providers that use a transaction-based lock implementation.
With this the lock can be acquired inside the transaction and all migrations are executed in a single transaction (when possible) to allow them to be covered by a single lock.
Additionally, the non-transaction based locks will be reacquired when retrying on a transient failure.
Customer impact
Currently, the migrations database lock does not work on some providers. And they aren't reacquired when retrying on a transient failure for the other providers.
How found
Reported by a provider maintainer.
Regression
No, new feature.
Testing
Tests added.
Risk
Medium. It's a breaking change for providers from rc1, but it's unlikely that the existing implementations were working correctly.