seekable decompression fixes #2594

Merged
senhuang42 merged 4 commits into facebook:dev from azat-archive:seekable_decompression-fixes
May 5, 2021
@azat azat commented Apr 30, 2021

Changelog:

  • seekable_format: cap the offset+len up to the last dOffset
    This allows reading the whole file without getting a corruption error when
    the requested range extends past the data left in the file, e.g.:

    $ ./seekable_compression seekable_compression.c 8192 | head
    $ zstd -cdq seekable_compression.c.zst | wc -c
    4737
    

    Before this patch:

    $ ./seekable_decompression seekable_compression.c.zst 0 10000000 | wc -c
    ZSTD_seekable_decompress() error : Corrupted block detected
    0
    

    After:

    $ ./seekable_decompression seekable_compression.c.zst 0 10000000 | wc -c
    4737
    
  • seekable_decompression: break when ZSTD_seekable_decompress() returns zero

  • seekable_decompression_mem: break when ZSTD_seekable_decompress() returns zero

  • seekable_format: fix from-file reading (not in-memory)
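
The capping described in the first changelog item can be sketched roughly as follows. This is only an illustration of the idea, not the code from the patch: `clamp_read_len` and its parameters are hypothetical names, and `total` stands in for the last dOffset (the total decompressed size).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, NOT part of the zstd seekable API: limit a read
 * request so that offset + len never runs past the end of the decompressed
 * data. `total` plays the role of the last dOffset in the seek table. */
static size_t clamp_read_len(size_t offset, size_t len, size_t total)
{
    if (offset >= total) return 0;              /* nothing left to read */
    size_t const remaining = total - offset;
    /* Compare against `remaining` instead of computing offset + len,
     * so a huge `len` cannot overflow. */
    size_t const capped = len < remaining ? len : remaining;
    assert(capped <= len);                      /* never more than requested */
    return capped;
}
```

With a 4737-byte file, as in the example above, a request for 10000000 bytes at offset 0 would be reduced to 4737 bytes instead of running past the last frame.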


while (startOffset < endOffset) {
    size_t const result = ZSTD_seekable_decompress(seekable, buffOut, MIN(endOffset - startOffset, buffOutSize), startOffset);
    if (!result) {

Nice optimization.
Is there more to it? (Does it dodge an error case?)


Is there more to it?

Added the same for seekable_decompression_mem.c

(Does it dodge an error case?)

In theory, before the frame overrun fix and without checksums, it was possible to end up in an endless loop.
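
To illustrate the endless-loop concern, here is a minimal, self-contained sketch of such a read loop. `FakeSeekable`, `fake_decompress` and `read_range` are hypothetical stand-ins introduced only for this example; `fake_decompress` merely mimics a decompressor that returns 0 once the offset is past the end of the data, and error handling is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical in-memory source standing in for a seekable archive. */
typedef struct { const char *data; size_t size; } FakeSeekable;

/* Hypothetical stand-in for ZSTD_seekable_decompress(): copies up to `len`
 * bytes starting at `offset` and returns the number of bytes produced,
 * which is 0 once `offset` reaches the end of the data. */
static size_t fake_decompress(const FakeSeekable *s, void *dst, size_t len, size_t offset)
{
    assert(s != NULL && dst != NULL);
    if (offset >= s->size) return 0;
    size_t const n = MIN(len, s->size - offset);
    memcpy(dst, s->data + offset, n);
    return n;
}

/* Reads [startOffset, endOffset) in chunks; returns total bytes produced. */
static size_t read_range(const FakeSeekable *s, size_t startOffset, size_t endOffset)
{
    char buffOut[64];
    size_t total = 0;
    while (startOffset < endOffset) {
        size_t const result = fake_decompress(s, buffOut,
                                              MIN(endOffset - startOffset, sizeof(buffOut)),
                                              startOffset);
        if (!result) break;   /* no progress: stop instead of spinning forever */
        total += result;
        startOffset += result;
    }
    return total;
}
```

Without the `break`, a zero return would leave `startOffset` unchanged, so the `while` condition would stay true and the loop would never terminate.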

azat added 4 commits April 30, 2021 21:46
It tries to check the buffer boundary, but there is no buffer for
from-file reading.
This allows reading the whole file without getting a corruption error when
the requested range extends past the data left in the file, e.g.:

    $ ./seekable_compression seekable_compression.c 8192 | head
    $ zstd -cdq seekable_compression.c.zst | wc -c
    4737

Before this patch:

    $ ./seekable_decompression seekable_compression.c.zst 0 10000000 | wc -c
    ZSTD_seekable_decompress() error : Corrupted block detected
    0

After:

    $ ./seekable_decompression seekable_compression.c.zst 0 10000000 | wc -c
    4737
@azat azat force-pushed the seekable_decompression-fixes branch from 475c49a to 32d0813 Compare April 30, 2021 18:46
@senhuang42 senhuang42 merged commit 53a60e9 into facebook:dev May 5, 2021