HDFS-17863: CannotObtainBlockLengthException after DataNode restart #8203
Description of PR
This PR fixes HDFS-17863.
The bug makes under-construction files unreadable after a DataNode restart, even though the data was successfully flushed with hflush(). This breaks HDFS's visibility guarantee for flushed data.
When a DataNode restarts, under-construction block replicas in the "rbw" (replica being written) directory are loaded as ReplicaWaitingToBeRecovered (RWR state). The getVisibleLength() method in this class unconditionally returned -1:
When a client tries to read the file:
This violates HDFS's hflush() contract, which guarantees that flushed data remains visible to readers.
Changes
Changed ReplicaWaitingToBeRecovered.getVisibleLength() to return getNumBytes() instead of -1:
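A minimal sketch of the behavioral change, using a simplified stand-in class rather than the real Hadoop `ReplicaWaitingToBeRecovered`. The class name `ReplicaSketch` and the `Before`/`After` method names are hypothetical and exist only to put the two behaviors side by side:

```java
// Simplified stand-in for illustration only; not the real Hadoop class.
class ReplicaSketch {
    // Assumed to hold the length validated against checksums at load time.
    private final long numBytes;

    ReplicaSketch(long numBytes) {
        this.numBytes = numBytes;
    }

    long getNumBytes() {
        return numBytes;
    }

    // Behavior before the change: the visible length is always reported as -1.
    long getVisibleLengthBefore() {
        return -1;
    }

    // Behavior after the change: the validated on-disk length is exposed.
    long getVisibleLengthAfter() {
        return getNumBytes();
    }
}
```

With a replica of 4096 validated bytes, the first method reports -1 while the second reports 4096, which is the length a reader needs in order to compute the file's visible size.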
Why This Fix Is Safe
The fix is safe because the block length returned by getNumBytes() has already been validated against checksums when the replica is loaded from disk.
In BlockPoolSlice.addReplicaToReplicasMap() (lines 693-700), RWR replicas are created with a validated length:
The validateIntegrityAndSetLength() method (lines 871-920):
Therefore, getNumBytes() returns a checksum-verified length that is safe to expose to readers. This is the same validation used for RBW replicas loaded with valid restart metadata.
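To illustrate what a checksum-verified length means, here is a simplified model of checksum-based length validation. It is an assumption-laden sketch, not the Hadoop implementation: the class `ValidatedLengthSketch`, its method names, the in-memory arrays, and the choice of CRC32 are all hypothetical, and the actual steps in validateIntegrityAndSetLength() may differ. The sketch checks only the last chunk, assuming an unclean shutdown can damage only the tail of the block.

```java
import java.util.zip.CRC32;

// Hypothetical helper for illustration; not the Hadoop implementation.
class ValidatedLengthSketch {

    // Computes the CRC32 of data[off, off + len).
    static long crcOf(byte[] data, int off, int len) {
        CRC32 crc = new CRC32();
        crc.update(data, off, len);
        return crc.getValue();
    }

    // Returns how many leading bytes of data are covered by stored checksums,
    // dropping the last chunk if its checksum does not match.
    static long validatedLength(byte[] data, long[] chunkCrcs, int bytesPerChunk) {
        if (bytesPerChunk <= 0) {
            throw new IllegalArgumentException("bytesPerChunk must be positive");
        }
        // Only chunks that have both data and a stored checksum are considered.
        int chunksInData = (data.length + bytesPerChunk - 1) / bytesPerChunk;
        int numChunks = Math.min(chunksInData, chunkCrcs.length);
        if (numChunks == 0) {
            return 0;
        }
        int lastStart = (numChunks - 1) * bytesPerChunk;
        int lastLen = Math.min(bytesPerChunk, data.length - lastStart);
        if (crcOf(data, lastStart, lastLen) == chunkCrcs[numChunks - 1]) {
            return (long) lastStart + lastLen; // last chunk is intact
        }
        return lastStart; // fall back to the previous chunk boundary
    }
}
```

In this model, a 10-byte block with 4-byte chunks and three matching checksums validates to 10 bytes. If the last checksum does not match, the length falls back to 8, so a reader is never offered bytes that failed verification.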
A new test was added:
I also ran other tests related to this change:
Regression tests - all pass (58 tests):
For code changes:
LICENSE, LICENSE-binary, NOTICE-binary files?
AI Tooling
If an AI tool was used:
where <tool> is the name of the AI tool used.
https://www.apache.org/legal/generative-tooling.html