fix: Ensure buffer space before reading in Decompress method#127105
fix: Ensure buffer space before reading in Decompress method#127105rzikm merged 9 commits intodotnet:mainfrom
Conversation
|
Tagging subscribers to this area: @karelz, @dotnet/area-system-io-compression |
MihaZupan
left a comment
There was a problem hiding this comment.
Thank you. Can you please also add a test that covers the described case?
…Zstandard/ZstandardStream.Decompress.cs Co-authored-by: Miha Zupan <mihazupan.zupan1@gmail.com>
…Zstandard/ZstandardStream.Decompress.cs Co-authored-by: Miha Zupan <mihazupan.zupan1@gmail.com>
|
@dotnet-policy-service agree |
done added a test case for ZstandardStream. |
|
Did you check that the test fails before the fix in this PR? |
Yes, I checked using a test to simulate the situation with and without my fix. Without the fix, the decompressor caught an exception. With my fix, the buffer moved correctly without errors. |
|
Testing locally, the test doesn't seem to be triggering the changed case at all. Can you share the inputs & code you've used to hit this issue in the first place? |
In my test (.net), I use for the .NET test, you can try precisely adjusted sizes or a hardcore using System;
public class Program
{
public static void Main()
{
Console.WriteLine("running without fix...\n");
try {
RunSimulation(applyFix: false);
} catch (Exception ex) {
Console.WriteLine($"\nerror: {ex.Message}");
}
Console.WriteLine("\n\nrunning with fix applied...\n");
try {
RunSimulation(applyFix: true);
Console.WriteLine("\ndone all data successfully read without truncation");
} catch (Exception ex) {
Console.WriteLine($"\nerror {ex.Message}");
}
}
static void RunSimulation(bool applyFix)
{
var buffer = new MockBuffer(10);
int totalToRead = 15;
int streamRemaining = 15;
int iteration = 1;
while (streamRemaining > 0 || buffer.ActiveLength > 0)
{
Console.WriteLine($"\niteration {iteration++}:");
int consumed = Math.Min(3, buffer.ActiveLength);
buffer.Discard(consumed);
Console.WriteLine($"decompressor huk {consumed} took it the buffer has moved");
if (streamRemaining <= 0 && buffer.ActiveLength == 0) break;
if (applyFix)
{
buffer.EnsureAvailableSpace(1);
}
int spaceAvailable = buffer.AvailableLength;
int bytesToRead = Math.Min(4, Math.Min(spaceAvailable, streamRemaining));
Console.WriteLine($"available at the end of the array: {spaceAvailable} bytes. reading is stream...");
if (bytesToRead <= 0 && streamRemaining > 0)
{
throw new Exception("ThrowTruncatedInvalidData() stream has more data but no space to read into");
}
buffer.Commit(bytesToRead);
streamRemaining -= bytesToRead;
Console.WriteLine($"reading {bytesToRead} bytes remaining in stream: {streamRemaining}.");
}
}
}
class MockBuffer
{
private byte[] _array;
private int _start;
private int _end;
public MockBuffer(int size) { _array = new byte[size]; }
public int ActiveLength => _end - _start;
public int AvailableLength => _array.Length - _end;
public void Discard(int count) { _start += count; }
public void Commit(int count) { _end += count; }
public void EnsureAvailableSpace(int minimumBytes)
{
if (AvailableLength < minimumBytes)
{
Console.WriteLine($"ensureAvailableSpace: little space move {_start} unread bytes to the beginning of the array");
int active = ActiveLength;
Array.Copy(_array, _start, _array, 0, active);
_start = 0;
_end = active;
}
}
} |
|
The real logic has two important differences compared to your simulation:
Hence my question of whether you've hit this in practice or if it's just based on code observations. |
rzikm
left a comment
There was a problem hiding this comment.
zstd is likely buffering all the data we pass to it internally, so the buffer is always drained completely, still, the call to EnsureAvailableSpace does not hurt in case this changes (or the implementation gets copied for other compression algs in the future)
I would be okay with taking the changes without the unit tests, since they don't reproduce the claimed issue (I ran the code coverage and only the fast-path of EnsureAvailableSpace ever gets executed)
I agree. Even if the current native implementation typically flushes the buffer completely, having this safety check ensures the logic is robust to future changes or specific edge cases related to partial consumption, as this issue is extremely rare and depends on very specific state alignment I don't mind removing unit tests if they don't allow you to reliably reproduce the bug in the current environment. In any case, it's better to have protection in the code. |
|
Yeah, we shouldn't commit the current test. |
…nStreamUnitTests.Zstandard.cs
|
The new test, we should keep the existing file :) |
Sorry, it happened by accident. |
|
I restored the file itself and deleted my test from it. I hope this time everything will work out without any accidental errors on my part. :) |
The essence of the bug:
During decompression, a portion of compressed data ends exactly at the boundary of a 64 kb buffer, and the remaining bytes, not yet decompressed by the decoder, are waiting for the next portion, resulting in a exception. The buffer is physically full, and AvailableSpan returns an empty fragment,
Span<byte>empty reading from the file into an empty Span returns 0 bytes read.Solution:
Checks have been added to the main Read and ReadAsync loops. If AvailableLength is zero, we force a call to
_buffer.EnsureAvailableSpace(1);.the bytes are shifted to the beginning of the array, the read is released, and the stream then reads new data.