
GzipDecoder chokes on extra headers #175

@jhwgh1968

It seems that some gzip files will not decode with this library, even though the underlying flate2 library handles them correctly.

The trouble seems to be the optional extra field (FEXTRA) in the gzip header, which has tripped up implementations in other languages as well:

https://forum.crystal-lang.org/t/error-when-read-extra-field-gzip-compressed-data/1840/9
https://bugs.python.org/issue17681
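
For anyone unfamiliar with the extra field, here is a minimal sketch of how a gzip member header with FEXTRA set is laid out according to RFC 1952. This is not how this library or flate2 parses headers; `GzipHeader` and `parse_gzip_header` are hypothetical names made up for illustration, and the sketch deliberately ignores the other optional fields (FNAME, FCOMMENT, FHCRC). The sample bytes are the first 18 bytes of the test file from this issue.

```rust
// Sketch only: parses the fixed 10-byte gzip header plus the optional
// FEXTRA field. FNAME, FCOMMENT and FHCRC are not handled here.

const FEXTRA: u8 = 0x04; // FLG bit 2 in RFC 1952

/// Hypothetical result type, used only for this illustration.
#[derive(Debug, PartialEq)]
struct GzipHeader {
    flags: u8,
    /// Raw bytes of the extra field, if FEXTRA is set.
    extra: Option<Vec<u8>>,
    /// Offset of the first byte after the parts parsed here.
    next_offset: usize,
}

fn parse_gzip_header(input: &[u8]) -> Result<GzipHeader, String> {
    if input.len() < 10 {
        return Err("truncated fixed header".to_string());
    }
    if input[0] != 0x1f || input[1] != 0x8b {
        return Err("not a gzip stream (bad magic bytes)".to_string());
    }
    if input[2] != 8 {
        return Err("unsupported compression method".to_string());
    }
    let flags = input[3];
    // Bytes 4..10 hold MTIME (4 bytes), XFL and OS; skipped in this sketch.
    let mut pos = 10;
    let mut extra = None;
    if flags & FEXTRA != 0 {
        // XLEN is a little-endian u16 giving the length of the extra field.
        let len_bytes = input
            .get(pos..pos + 2)
            .ok_or_else(|| "truncated XLEN".to_string())?;
        let xlen = u16::from_le_bytes([len_bytes[0], len_bytes[1]]) as usize;
        pos += 2;
        let data = input
            .get(pos..pos + xlen)
            .ok_or_else(|| "truncated extra field".to_string())?;
        extra = Some(data.to_vec());
        pos += xlen;
    }
    Ok(GzipHeader { flags, extra, next_offset: pos })
}

fn main() {
    // First 18 bytes of test.gz from this issue: fixed header, XLEN = 6,
    // then six bytes of extra data.
    let header: [u8; 18] = [
        0x1f, 0x8b, 0x08, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff,
        0x06, 0x00,
        0x42, 0x43, 0x02, 0x00, 0x32, 0x00,
    ];
    match parse_gzip_header(&header) {
        Ok(h) => println!(
            "flags={:#04x} extra={:?} next_offset={}",
            h.flags, h.extra, h.next_offset
        ),
        Err(e) => eprintln!("error: {}", e),
    }
}
```

A decoder that does not skip the XLEN bytes would start inflating in the middle of the header, which would be one plausible way to end up with errors like the ones reported here, though I have not confirmed that this is what happens in this library.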

Cribbing from the first link, here is an easy way to create a small gzip file that triggers the problem:

echo -n 'H4sIBAAAAAAA/wYAQkMCADIAS0ksTuNKSyxO4UqtSC4pSuQqLilNS+MCAI56o3cXAAAAH4sIBAAAAAAA/wYAQkMCABsAAwAAAAAAAAAAAA==' | base64 -d > test.gz

If you run gunzip on this or use the original flate2 library, it decodes correctly, including checksum verification.

But with this library, the reproducer program below fails with "unexpected end of file". Other files, which I can't share, fail with "deflate decompression error" instead.

Reproducer program:

use anyhow::Context;
// Cargo features required for async-compression: gzip, tokio
use async_compression::tokio::bufread::GzipDecoder;
use tokio::fs::File;
use tokio::io::{AsyncReadExt, BufReader};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let input_file = File::open("test.gz").await.context("failed to open file")?;
    let reader = BufReader::new(input_file);
    let mut decoder = GzipDecoder::new(reader);

    let mut buf = vec![0; 8192];
    let n = decoder.read(&mut buf).await.context("decompress failed")?;
    buf.truncate(n);

    println!("{}", String::from_utf8_lossy(&buf));

    Ok(())
}
