Skip to content

consensus WAL rewrite#2582

Merged
pompon0 merged 120 commits intomainfrom
gprusak-wal3
Dec 8, 2025
Merged

consensus WAL rewrite#2582
pompon0 merged 120 commits intomainfrom
gprusak-wal3

Conversation

@pompon0
Copy link
Copy Markdown
Contributor

@pompon0 pompon0 commented Dec 4, 2025

New implementation uses the same representation as the old one, so it is backward compatible.

  • Made repair an implementation detail of the WAL.
  • Made WAL guarantee crash-tolerant atomicity (under assumption that the file system guarantees crash-tolerant atomic appends; if not, torn append writes are detected best effort by using crc32)
  • Added file locking to prevent concurrent access to WAL
  • Removed the background fsync task - background syncing is executed by the file system anyway (afaik), and we fsync multiple times a second (before signing) anyway.
  • Removed autofile and its weird semantics of autoclosing files which are not accessed after a delay - afaiu the idea was that one could hold ephemeral descriptors to all WAL files, not just the head, at all times. I've replaced it with explicit fd management.
  • simplified consensus WAL reads to only support reading last height messages - it is the only case used anyway.

@pompon0 pompon0 requested a review from yzang2019 December 4, 2025 14:18
@github-actions
Copy link
Copy Markdown

github-actions bot commented Dec 4, 2025

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed✅ passed✅ passed✅ passedDec 8, 2025, 9:28 AM

@codecov
Copy link
Copy Markdown

codecov bot commented Dec 4, 2025

Codecov Report

❌ Patch coverage is 61.15702% with 141 lines in your changes missing coverage. Please review.
✅ Project coverage is 44.16%. Comparing base (1ab64fc) to head (2adde9a).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
sei-tendermint/internal/libs/wal/log.go 59.61% 32 Missing and 10 partials ⚠️
sei-tendermint/internal/libs/wal/writer.go 56.89% 13 Missing and 12 partials ⚠️
sei-tendermint/internal/consensus/state.go 48.64% 7 Missing and 12 partials ⚠️
sei-tendermint/internal/consensus/wal.go 64.15% 13 Missing and 6 partials ⚠️
sei-tendermint/internal/libs/wal/view.go 70.00% 9 Missing and 9 partials ⚠️
sei-tendermint/internal/libs/wal/reader.go 69.44% 6 Missing and 5 partials ⚠️
sei-tendermint/internal/consensus/replay.go 41.66% 4 Missing and 3 partials ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2582      +/-   ##
==========================================
- Coverage   46.08%   44.16%   -1.93%     
==========================================
  Files        1174     1796     +622     
  Lines      101716   149301   +47585     
==========================================
+ Hits        46879    65941   +19062     
- Misses      50738    77487   +26749     
- Partials     4099     5873    +1774     
Flag Coverage Δ
sei-chain 42.18% <ø> (+0.01%) ⬆️
sei-cosmos 40.15% <ø> (?)
sei-db 45.06% <ø> (+0.06%) ⬆️
sei-ibc-go 56.39% <ø> (ø)
sei-tendermint 47.43% <61.15%> (-0.09%) ⬇️
sei-wasmd 42.61% <ø> (ø)
sei-wasmvm 39.88% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
sei-tendermint/internal/state/execution.go 61.47% <100.00%> (ø)
sei-tendermint/internal/state/state.go 82.82% <100.00%> (ø)
sei-tendermint/internal/state/validation.go 100.00% <100.00%> (ø)
sei-tendermint/internal/consensus/replay.go 73.57% <41.66%> (+4.42%) ⬆️
sei-tendermint/internal/libs/wal/reader.go 69.44% <69.44%> (ø)
sei-tendermint/internal/libs/wal/view.go 70.00% <70.00%> (ø)
sei-tendermint/internal/consensus/state.go 72.31% <48.64%> (+0.96%) ⬆️
sei-tendermint/internal/consensus/wal.go 80.00% <64.15%> (+9.47%) ⬆️
sei-tendermint/internal/libs/wal/writer.go 56.89% <56.89%> (ø)
sei-tendermint/internal/libs/wal/log.go 59.61% <59.61%> (ø)

... and 636 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

// Remember to call OpenForAppend before Append.
// You need to call Sync afterwards to ensure entry is persisted on disk.
func (w *WAL) Append(msg WALMessage) error {
entry, err := proto.Marshal(&tmcons.TimedWALMessage{Time: time.Now(), Msg: msg.toProto()})

Check warning

Code scanning / CodeQL

Calling the system time Warning

Calling the system time may be a possible source of non-determinism
}
defer func() {
if resErr != nil {
f.Close()

Check warning

Code scanning / CodeQL

Writable file handle closed without error handling Warning

File handle may be writable as a result of data flow from a
call to OpenFile
and closing it may result in data loss upon failure, which is not handled explicitly.

Copilot Autofix

AI 4 months ago

To fix the issue, ensure that errors returned by f.Close() in the deferred cleanup handler in openLogWriter are checked and returned appropriately. The fix should update the deferred function (starting at line 67 in openLogWriter) so that if f.Close() returns an error, it is propagated as the error for openLogWriter—but only if no prior error was already being returned (resErr).
This requires changing the function's signature to use named return values, e.g. (res *logWriter, resErr error), and then updating the deferred function to set resErr if it is nil and f.Close() returns an error.
No changes to imports or additional helpers are needed. The intention and functionality of the code remain unchanged: just more robust error reporting.


Suggested changeset 1
sei-tendermint/internal/libs/wal/writer.go

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/sei-tendermint/internal/libs/wal/writer.go b/sei-tendermint/internal/libs/wal/writer.go
--- a/sei-tendermint/internal/libs/wal/writer.go
+++ b/sei-tendermint/internal/libs/wal/writer.go
@@ -66,7 +66,9 @@
 	}
 	defer func() {
 		if resErr != nil {
-			f.Close()
+			if err := f.Close(); err != nil && resErr == nil {
+				resErr = err
+			}
 		}
 	}()
 	// Truncate the file to non-corrupted prefix and sync.
EOF
@@ -66,7 +66,9 @@
}
defer func() {
if resErr != nil {
f.Close()
if err := f.Close(); err != nil && resErr == nil {
resErr = err
}
}
}()
// Truncate the file to non-corrupted prefix and sync.
Copilot is powered by AI and may make mistakes. Always verify output.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

f.Close() closes the descriptor, even if an error is returned. This is the only property this code needs.

@pompon0 pompon0 requested a review from sei-will December 5, 2025 22:36
@pompon0 pompon0 marked this pull request as ready for review December 5, 2025 22:40
}

group, err := auto.OpenGroup(ctx, logger, walFile, groupOptions...)
inner, err := wal.OpenLog(walFile, wal.DefaultConfig())
Copy link
Copy Markdown
Contributor

@yzang2019 yzang2019 Dec 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this seems to be much simpler and cleaner than the previous autofile approach

@pompon0 pompon0 enabled auto-merge (squash) December 8, 2025 09:36
@pompon0 pompon0 merged commit 0d3dd84 into main Dec 8, 2025
44 of 45 checks passed
@pompon0 pompon0 deleted the gprusak-wal3 branch December 8, 2025 09:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants