docs: repare the transaction conflicts resolution flow #5133
majin1102 wants to merge 4 commits into lance-format:main
Hi @wjones127, please take a look when you have time.
If there are any, begin the rebase process: read their transaction structures and check for conflicts.
If there are any conflicts and the conflicts are not retriable, abort the commit.
If the conflicts are retriable, writer should go back to step 2 and rebuild the transaction based on the newest version to resolve the conflicts.
4. Create a transaction file in the _transactions directory which describes the operations that were performed for two purposes:
I think we could actually just make this a mermaid diagram. Then it's more understandable by AIs and can just be edited as text.
Here's how I would write it:
flowchart LR
A[Write data files] --> C[Check for concurrent commits]
A -.-> DF{{data/31a7060e-4898-4ecd-a428-afbff3539fa6.lance}}
C --> D{Are there conflicts?}
D -->|None| E[Write transaction file]
E -.-> Txn{{_transactions/42-76019405-8d5a-43c3-a7a2-324ed49a9d75.txn}}
D -->|Resolvable| G[Resolve conflicts] --> E
G -.->|merged| Deletions{{_deletions/31a7060e-4898-4ecd-a428-afbff3539fa6.lance}}
D -->|Retryable| H[Retry operation 🔄]
D -->|Non-retryable| F[Abort ✗]
E --> I[Atomically write manifest]
I -.-> Manifest{{_versions/43.manifest}}
I --> J{Success?}
J -->|Yes| K[Complete ✓]
J -->|No| C
style A fill:#e1f5fe
style K fill:#c8e6c9
style F fill:#ffcdd2
style H fill:#fff3e0
style DF fill:#ddd
style Txn fill:#ddd
style Manifest fill:#ddd
style Deletions fill:#ddd
This basically describes four outcomes of checking for conflicts:
- No conflicts
- Some conflicts, but we can resolve them just using the transactions
- Conflict, but it allows retries, so we can just retry the operation.
- Conflict, non-retryable. This is rare, but happens, for example, if we are trying to append but another transaction completely changed the schema to remove columns we are trying to insert into.
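As a rough illustration of how a commit path might branch on these four outcomes, here is a minimal Python sketch. All names in it (`ConflictOutcome`, `handle_outcome`, `RetryableConflict`, `NonRetryableConflict`) are hypothetical and are not Lance's actual API; `resolve` stands in for whatever rebases the transaction.

```python
from enum import Enum, auto


class ConflictOutcome(Enum):
    """The four outcomes of checking for conflicts (hypothetical names)."""
    NONE = auto()
    RESOLVABLE = auto()
    RETRYABLE = auto()
    NON_RETRYABLE = auto()


class RetryableConflict(Exception):
    """The caller may redo the whole write operation."""


class NonRetryableConflict(Exception):
    """The commit must be aborted."""


def handle_outcome(outcome, txn, resolve):
    """Return the transaction to write, or raise so the caller can retry or abort.

    `txn` and `resolve` are placeholders for illustration only.
    """
    if outcome is ConflictOutcome.NONE:
        # Nothing to do: write the transaction file as-is.
        return txn
    if outcome is ConflictOutcome.RESOLVABLE:
        # Resolve using only the transactions, then continue to the write.
        return resolve(txn)
    if outcome is ConflictOutcome.RETRYABLE:
        raise RetryableConflict("retry the write operation")
    raise NonRetryableConflict("abort the commit")
```

Modelling the last two outcomes as distinct exception types is just one design choice; it lets the caller decide whether to rerun the operation or give up.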
Sorry, I'm a bit late to this conversation; I did not notice this PR. I just published #5209, which refreshes a lot of content for the table format and overall format intro. I also updated conflict resolution with a mermaid diagram, but it is a bit different from this one. I will check tomorrow to see if I can reuse the content here.
OK, I guess this PR would be useless.
If the conflicts are retryable, attempt to resolve conflicts between the candidate transaction and the conflicting transaction. For example, if they delete disjoint sets of rows, then the deletion files can be merged. If conflicts can't be resolved, the retryable conflict error is bubbled up. This can be handled as a full retry of the write operation.
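The deletion example can be sketched roughly as follows. This is only an illustration: it models each deletion file as a Python set of row ids and assumes, for simplicity, that overlapping deletions cannot be merged. The names `merge_deletions` and `RetryableConflict` are hypothetical, not Lance's API.

```python
class RetryableConflict(Exception):
    """Bubbled up so the caller can fully retry the write operation."""


def merge_deletions(ours, theirs):
    """Merge two sets of deleted row ids when they are disjoint.

    Assumption for this sketch: any overlap counts as an unresolvable
    conflict and is reported as a retryable error.
    """
    overlap = ours & theirs
    if overlap:
        raise RetryableConflict(f"both transactions deleted rows {sorted(overlap)}")
    # Disjoint deletions: the merged deletion set is simply the union.
    return ours | theirs
```

A caller would catch `RetryableConflict` and rerun the whole write operation against the latest version.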
Thanks for correcting me, @wjones127. I had misunderstood what retryable means.
Is this good to close? @jackye1995
Also refreshes all the contents regarding the table format and separates them into multiple documents for clarity. Closes lance-format#4136 lance-format#5133
Close #5122