fix(libutil/tarfile): normalize legacy HTTP Content-Encoding names#14417

Closed
lovesegfault wants to merge 1 commit into NixOS:master from lovesegfault:fix-content-encoding

@lovesegfault
Member

Motivation

Nix failed to download files served with Content-Encoding: x-gzip
because libarchive doesn't recognize the legacy x-* compression
format names. Per RFC 9110 §8.4.1.3, HTTP recipients should treat
these as equivalent to their standard counterparts.

Adds normalizeCompressionMethod() to map legacy encoding names
before passing to libarchive:

  • x-gzip → gzip
  • x-compress → compress
  • x-bzip2 → bzip2
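
The mapping above could look roughly like the following. This is a minimal sketch, not the code from the PR: the diff is not shown in this thread, so the signature (taking a `std::string_view` and returning a `std::string`) is an assumption, and only the three aliases listed above are handled.

```cpp
#include <string>
#include <string_view>

// Hypothetical sketch of normalizeCompressionMethod(); the real signature
// in the PR may differ. Maps the legacy x-* names listed above to the
// standard names and passes everything else through unchanged.
std::string normalizeCompressionMethod(std::string_view method)
{
    if (method == "x-gzip")
        return "gzip";
    if (method == "x-compress")
        return "compress";
    if (method == "x-bzip2")
        return "bzip2";
    return std::string(method);
}
```

Names that are not legacy aliases, such as `gzip` or `xz`, are returned as-is, so the function can be applied unconditionally before the name is handed to libarchive.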

Context

Fixes: #14324


Contributor

@xokdvium xokdvium left a comment

I think that what we should do is stop using strings to represent enumeration types. We confuse the compression algorithm names used by libarchive with our non-standard Content-Encoding headers. Those need to become clearly separated.

@Mic92
Member

Mic92 commented Oct 29, 2025

I think that what we should do is stop using strings to represent enumeration types. We confuse the compression algorithm names used by libarchive with our non-standard Content-Encoding headers. Those need to become clearly separated.

you mean libarchive has enum types for compression?

@xokdvium
Contributor

libarchive has enum types for compression?

It doesn't unfortunately, but we really should have our own to wrap around libarchive.

@Ericson2314
Member

I agree that making our own enum sounds like the right call.

Contributor

@tomberek tomberek left a comment

Can refactor into enum in another PR.

Contributor

@xokdvium xokdvium left a comment

Note that this must only affect Content-Encoding parsing, e.g. it must not be possible to specify the deprecated name as the store parameter. Since there's currently no distinction in the code, it's a no-go IMO.
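
To illustrate the separation being asked for here, one possible shape is an enum with two parsers: a strict one for canonical names and a lenient one used only for HTTP headers. All names below (`CompressionAlgo`, `parseCompressionAlgo`, `parseContentEncoding`, `toString`) are hypothetical and are not taken from Nix or from #15336; only the three algorithms mentioned in this PR are covered.

```cpp
#include <optional>
#include <string_view>

// Hypothetical enum wrapping the string names passed to libarchive.
enum class CompressionAlgo { Gzip, Compress, Bzip2 };

// Strict parser: accepts canonical names only. Something like this would
// back user-facing settings such as a store parameter.
std::optional<CompressionAlgo> parseCompressionAlgo(std::string_view name)
{
    if (name == "gzip")
        return CompressionAlgo::Gzip;
    if (name == "compress")
        return CompressionAlgo::Compress;
    if (name == "bzip2")
        return CompressionAlgo::Bzip2;
    return std::nullopt;
}

// Lenient parser: used only for HTTP Content-Encoding values, where the
// deprecated x-* aliases are additionally accepted.
std::optional<CompressionAlgo> parseContentEncoding(std::string_view value)
{
    if (value == "x-gzip")
        return CompressionAlgo::Gzip;
    if (value == "x-compress")
        return CompressionAlgo::Compress;
    if (value == "x-bzip2")
        return CompressionAlgo::Bzip2;
    return parseCompressionAlgo(value);
}

// Canonical name to hand to the libarchive-facing code.
std::string_view toString(CompressionAlgo algo)
{
    switch (algo) {
    case CompressionAlgo::Gzip:
        return "gzip";
    case CompressionAlgo::Compress:
        return "compress";
    case CompressionAlgo::Bzip2:
        return "bzip2";
    }
    return ""; // unreachable for valid enum values
}
```

With this split, `x-gzip` is accepted when it arrives in a response header but rejected anywhere the strict parser is used, which is the distinction the string-based code cannot express.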

@lovesegfault
Member Author

fwiw I agree the enum approach is better here, just haven't found the time to do it

@xokdvium
Contributor

just haven't found the time to do it

I have some WIP commits for that. In the meantime I don't see a need to rush. This (not accepting deprecated non-standard aliases) is not a regression.

@xokdvium
Contributor

xokdvium commented Mar 5, 2026

I think this has been fully addressed with #15336.

@xokdvium xokdvium closed this Mar 5, 2026
Development

Successfully merging this pull request may close these issues.

Content-Encoding: x-gzip should be supported

5 participants