zstd fails with archives using skippable frames #271

@cgwalters

xref ostreedev/ostree-rs-ext#616

The containers zstd:chunked format uses zstd's skippable frames.
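For reference, here is a minimal sketch of how a skippable frame can be told apart from a regular zstd frame by its header alone. It assumes the layout from the zstd frame format specification (RFC 8878): a 4-byte little-endian magic in the range 0x184D2A50 to 0x184D2A5F, followed by a 4-byte little-endian payload length. The names `FrameKind` and `classify_frame` are hypothetical and not part of any crate.

```rust
/// Magic number of a regular zstd frame (RFC 8878).
const ZSTD_FRAME_MAGIC: u32 = 0xFD2F_B528;
/// Skippable frames use any of the 16 magics 0x184D2A50..=0x184D2A5F.
const SKIPPABLE_MAGIC_BASE: u32 = 0x184D_2A50;
const SKIPPABLE_MAGIC_MASK: u32 = 0xFFFF_FFF0;

#[derive(Debug, PartialEq, Eq)]
enum FrameKind {
    /// A regular compressed zstd frame.
    Zstd,
    /// A skippable frame carrying `len` bytes of opaque user data.
    Skippable { len: u32 },
    /// Not a zstd frame of any kind.
    Unknown,
}

/// Decode a little-endian u32 at `pos`, or `None` if the buffer is too short.
fn le_u32(buf: &[u8], pos: usize) -> Option<u32> {
    let b = buf.get(pos..pos + 4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Classify the frame that starts at the beginning of `buf`.
/// Returns `None` if `buf` is too short to decide.
fn classify_frame(buf: &[u8]) -> Option<FrameKind> {
    let magic = le_u32(buf, 0)?;
    if magic == ZSTD_FRAME_MAGIC {
        return Some(FrameKind::Zstd);
    }
    if magic & SKIPPABLE_MAGIC_MASK == SKIPPABLE_MAGIC_BASE {
        // The payload length follows the magic directly.
        let len = le_u32(buf, 4)?;
        return Some(FrameKind::Skippable { len });
    }
    Some(FrameKind::Unknown)
}
```

A decoder that wants to ignore such a frame only has to skip `8 + len` bytes and continue with whatever follows.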

It seems that async_compression's zstd decoder fails on these: as the output below shows, the decompressed stream is cut off partway through.

zstd skippable frames don't seem to be very widely used. To get a test archive, I run e.g. skopeo copy --dest-compress-format=zstd:chunked docker://docker.io/library/busybox oci:busybox, which leaves a zstd:chunked tar archive in busybox/blobs/sha256.
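A smaller reproducer that does not need skopeo might be to append a hand-built skippable frame to an ordinary .zst file. The helper below is only a sketch under the RFC 8878 layout (magic 0x184D2A50 plus a 4-bit variant, then a 4-byte little-endian length, then the payload). `skippable_frame` is a hypothetical name, and I have not checked whether such a file triggers the same failure as a real zstd:chunked blob.

```rust
/// Wrap `payload` in a zstd skippable frame.
///
/// `variant` selects one of the 16 skippable magic numbers; only its low
/// 4 bits are used. Returns `None` if the payload does not fit in the
/// 32-bit length field.
fn skippable_frame(variant: u8, payload: &[u8]) -> Option<Vec<u8>> {
    if payload.len() > u32::MAX as usize {
        return None;
    }
    let len = payload.len() as u32;
    let magic = 0x184D_2A50u32 | u32::from(variant & 0x0F);

    let mut frame = Vec::with_capacity(8 + payload.len());
    frame.extend_from_slice(&magic.to_le_bytes()); // 4-byte magic
    frame.extend_from_slice(&len.to_le_bytes()); // 4-byte payload length
    frame.extend_from_slice(payload); // opaque user data
    Some(frame)
}
```

Writing the returned bytes after the contents of an existing .zst file yields a stream of one regular frame followed by one skippable frame.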

I wrote up this quick test program:

use anyhow::Result;

use async_compression::tokio::bufread::ZstdDecoder;
use tokio::io::{stdin, BufReader};

async fn async_decompress() -> Result<()> {
    // Read zstd encoded data from stdin and decode
    let mut reader = ZstdDecoder::new(BufReader::new(stdin()));
    let mut stdout = tokio::io::stdout();
    tokio::io::copy(&mut reader, &mut stdout).await?;
    Ok(())
}

fn sync_decompress() -> Result<()> {
    zstd::stream::copy_decode(std::io::stdin(), std::io::stdout())?;
    Ok(())
}

#[tokio::main(flavor = "current_thread")]
async fn main() -> Result<()> {
    // The first argument after the program name selects the decoder.
    let arg = std::env::args()
        .nth(1)
        .ok_or_else(|| anyhow::anyhow!("Missing arg"))?;
    match arg.as_str() {
        "async" => async_decompress().await?,
        "sync" => sync_decompress()?,
        o => anyhow::bail!("invalid {o}"),
    }

    Ok(())
}

And the result is (using nushell syntax on a test archive I have handy):

$ open -r blobs/sha256/bf6f77dfa3bbed41a513875363df862130cdc01bb64bfa2fcabd5b4a65faa1c2 | testcompress sync | tar tf -
<tar extracted ok>

vs

$ open -r blobs/sha256/bf6f77dfa3bbed41a513875363df862130cdc01bb64bfa2fcabd5b4a65faa1c2 | testcompress async | tar tf -
usr/
usr/src/
usr/src/wordpress/
usr/src/wordpress/.htaccess
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
$

Digging into this a bit, I don't see many references to skippable frame support in the Rust zstd bindings, and it's not clear to me how one even accesses them with the C library. By contrast, I see a clear Skippable flag in the Go implementation.
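One thing that may be worth ruling out is whether the blob contains several frames, since a decoder that stops after the first frame would truncate the output in much the same way. The sketch below walks a buffer frame by frame without decompressing anything, using only the header and block-header layout from RFC 8878. The function names are hypothetical. As far as I know, the C library's `ZSTD_findFrameCompressedSize` serves a similar purpose, but I have not compared the two.

```rust
const ZSTD_MAGIC: u32 = 0xFD2F_B528;
const SKIPPABLE_BASE: u32 = 0x184D_2A50;

/// Read `n` (at most 4) little-endian bytes at `pos`; `None` if out of bounds.
fn read_le(buf: &[u8], pos: usize, n: usize) -> Option<u32> {
    let bytes = buf.get(pos..pos.checked_add(n)?)?;
    Some(bytes.iter().rev().fold(0u32, |acc, &b| (acc << 8) | u32::from(b)))
}

/// Size in bytes of the frame at the start of `buf`, and whether it is
/// skippable. Returns `None` on truncated or unrecognized input.
fn frame_size(buf: &[u8]) -> Option<(usize, bool)> {
    let magic = read_le(buf, 0, 4)?;
    if magic & 0xFFFF_FFF0 == SKIPPABLE_BASE {
        let len = read_le(buf, 4, 4)? as usize;
        let total = 8usize.checked_add(len)?;
        return if total <= buf.len() { Some((total, true)) } else { None };
    }
    if magic != ZSTD_MAGIC {
        return None;
    }
    // Frame_Header_Descriptor: the flags decide which header fields follow.
    let desc = *buf.get(4)?;
    let single_segment = desc & 0x20 != 0;
    let has_checksum = desc & 0x04 != 0;
    let dict_id_len = [0usize, 1, 2, 4][(desc & 0x03) as usize];
    let fcs_len = match desc >> 6 {
        0 => usize::from(single_segment),
        1 => 2,
        2 => 4,
        _ => 8,
    };
    let window_len = usize::from(!single_segment);
    let mut pos = 5 + window_len + dict_id_len + fcs_len;
    // Blocks: 3-byte header = last-block bit, 2-bit type, 21-bit size.
    loop {
        let header = read_le(buf, pos, 3)?;
        let last = header & 1 != 0;
        let block_size = (header >> 3) as usize;
        pos += 3 + match (header >> 1) & 3 {
            0 | 2 => block_size, // raw or compressed block
            1 => 1,              // RLE block stores a single byte
            _ => return None,    // reserved block type
        };
        if last {
            break;
        }
    }
    if has_checksum {
        pos += 4;
    }
    if pos <= buf.len() { Some((pos, false)) } else { None }
}

/// Count (regular, skippable) frames in `buf`; `None` if it is malformed.
fn count_frames(mut buf: &[u8]) -> Option<(usize, usize)> {
    let (mut regular, mut skippable) = (0, 0);
    while !buf.is_empty() {
        let (size, is_skippable) = frame_size(buf)?;
        if is_skippable {
            skippable += 1;
        } else {
            regular += 1;
        }
        buf = &buf[size..];
    }
    Some((regular, skippable))
}
```

If `count_frames` reports more than one regular frame for the test blob, it would be worth checking whether the async decoder has a setting for concatenated frames. I believe async_compression exposes `multiple_members(true)` on its decoders for this, but treat that as an assumption to verify against the crate docs.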
