feat(parser): implement pstring multi-byte length prefix variants (/B, /H, /h, /L, /l, /J) #183
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
…ation Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
…version notes Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
…l quirks Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
📝 Walkthrough
Add multi-width Pascal-string (pstring) support: new
Changes
Sequence Diagram
sequenceDiagram
participant Client as CLI/Parser
participant Grammar as Grammar<br/>(parse_pstring_suffix)
participant AST as AST<br/>(TypeKind::PString)
participant Evaluator as Evaluator
participant Reader as StringReader<br/>(read_pstring)
participant Buffer as Buffer
Client->>Grammar: parse "pstring/H/J" token
Grammar->>Grammar: detect "pstring" and call parse_pstring_suffix
Grammar->>AST: construct TypeKind::PString { length_width, length_includes_itself }
AST-->>Client: return TypeKind
Client->>Evaluator: evaluate rule with TypeKind::PString
Evaluator->>Reader: read_pstring(buffer, offset, max_length, length_width, length_includes_itself)
Reader->>Buffer: read N prefix bytes (N = length_width.byte_count())
Buffer-->>Reader: prefix bytes
Reader->>Reader: parse length (BE/LE), apply /J adjust
Reader->>Buffer: slice string bytes according to computed length
Buffer-->>Reader: string bytes
Reader-->>Evaluator: return Value::String or TypeReadError
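The reader steps in the diagram (read N prefix bytes, decode BE/LE, apply the `/J` adjustment, slice the payload) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `LengthWidth`, `ReadError`, and `read_pstring_sketch` are hypothetical names, the `max_length` cap is omitted, and only the variant names `OneByte`, `TwoByteBE`, and `FourByteLE` are quoted in the review.

```rust
/// Hypothetical stand-in for the PR's length-width enum (names assumed).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LengthWidth {
    OneByte,
    TwoByteBE,
    TwoByteLE,
    FourByteBE,
    FourByteLE,
}

impl LengthWidth {
    /// Number of bytes occupied by the length prefix.
    fn byte_count(self) -> usize {
        match self {
            LengthWidth::OneByte => 1,
            LengthWidth::TwoByteBE | LengthWidth::TwoByteLE => 2,
            LengthWidth::FourByteBE | LengthWidth::FourByteLE => 4,
        }
    }
}

/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq, Eq)]
enum ReadError {
    BufferOverrun,
    InvalidLength,
}

fn read_pstring_sketch(
    buffer: &[u8],
    offset: usize,
    width: LengthWidth,
    length_includes_itself: bool,
) -> Result<String, ReadError> {
    let n = width.byte_count();
    // Bounds-checked read of the prefix; no direct indexing.
    let prefix_end = offset.checked_add(n).ok_or(ReadError::BufferOverrun)?;
    let prefix = buffer
        .get(offset..prefix_end)
        .ok_or(ReadError::BufferOverrun)?;

    // Decode the prefix with the requested width and byte order.
    let raw_len = match (width, prefix) {
        (LengthWidth::OneByte, &[a]) => usize::from(a),
        (LengthWidth::TwoByteBE, &[a, b]) => usize::from(u16::from_be_bytes([a, b])),
        (LengthWidth::TwoByteLE, &[a, b]) => usize::from(u16::from_le_bytes([a, b])),
        (LengthWidth::FourByteBE, &[a, b, c, d]) => u32::from_be_bytes([a, b, c, d]) as usize,
        (LengthWidth::FourByteLE, &[a, b, c, d]) => u32::from_le_bytes([a, b, c, d]) as usize,
        _ => return Err(ReadError::BufferOverrun),
    };

    // With /J the stored length counts the prefix bytes too, so subtract them;
    // a stored length smaller than the prefix width is rejected.
    let payload_len = if length_includes_itself {
        raw_len.checked_sub(n).ok_or(ReadError::InvalidLength)?
    } else {
        raw_len
    };

    let end = prefix_end
        .checked_add(payload_len)
        .ok_or(ReadError::BufferOverrun)?;
    let bytes = buffer.get(prefix_end..end).ok_or(ReadError::BufferOverrun)?;
    // Invalid UTF-8 is replaced rather than rejected in this sketch.
    Ok(String::from_utf8_lossy(bytes).into_owned())
}
```

For instance, under these assumptions `read_pstring_sketch(b"\x05\x00abc", 0, LengthWidth::TwoByteLE, true)` yields `"abc"`, because the stored length 5 includes the 2 prefix bytes.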
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~35 minutes
🚥 Pre-merge checks | ✅ 10 passed
Merge Protections
Your pull request matches the following merge protections and will not be merged until they are valid.
🟢 CI must pass: Wonderful, this rule succeeded. All CI checks must pass. Release-plz PRs are exempt because they only bump versions and changelogs (code was already tested on main), and GITHUB_TOKEN-triggered force-pushes suppress CI.
🟢 Do not merge outdated PRs: Wonderful, this rule succeeded. Make sure PRs are within 10 commits of the base branch before merging.
Related Documentation 6 document(s) may need updating based on files changed in this PR: libMagic-rs architecture
Codecov Report❌ Patch coverage is
…r and evaluator Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Warning
CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.
Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
docs/src/ast-structures.md (1)
172-190: ⚠️ Potential issue | 🟠 Major
AST docs still describe the old `TypeKind::PString` shape.
`TypeKind::PString` is documented with only `max_length`, but the PR adds width and `/J`-related semantics. Please update the enum snippet and PString section to include the new fields and add a short migration note for downstream code that constructs or pattern-matches this variant.
As per coding guidelines, "Review documentation quality, completeness, and accuracy. Ensure comprehensive API documentation, usage examples, and migration guides. Focus on user experience and developer onboarding."
Also applies to: 232-267
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/src/ast-structures.md` around lines 172 - 190, Update the TypeKind enum snippet and documentation to reflect the new PString shape: modify TypeKind::PString to include the added fields (the new width field and the /J-related flag/semantics introduced in the PR) and update the descriptive paragraph for "Pascal string (length-prefixed)" to explain how width and the /J semantics affect encoding/decoding; add a short migration note explaining how to construct the new PString variant and how to update pattern matches (matching on fields or using wildcard for added fields) so downstream code compiles, and apply the same changes to the other occurrence mentioned (the block around the later lines referenced).
docs/src/magic-format.md (1)
228-240: ⚠️ Potential issue | 🟠 Major
Magic-format reference is missing new pstring suffix grammar.
The section should document supported `pstring` suffixes (`/B`, `/H`, `/h`, `/L`, `/l`, `/J`, and combined forms like `/HJ`) with concrete examples and behavior notes. Right now readers are told only 1-byte length-prefix behavior, which is no longer accurate.
As per coding guidelines, "Review documentation quality, completeness, and accuracy. Ensure comprehensive API documentation, usage examples, and migration guides. Focus on user experience and developer onboarding."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/src/magic-format.md` around lines 228 - 240, Update the "Pascal String Type" pstring docs to reflect the new suffix grammar: list supported suffixes (/B, /H, /h, /L, /l, /J) and allowed combined forms (e.g., /HJ), explain how each suffix changes the length-prefix size and endianness (e.g., /B=1-byte, /H=2-byte big-endian, /h=2-byte little-endian, /L=4-byte big-endian, /l=4-byte little-endian), note that /J makes the length include the prefix itself, document how optional max_length still caps the decoded length, and add concrete examples showing the magic lines and resulting behavior (e.g., "0 pstring/B =JPEG" vs "0 pstring/HJ x =...") so readers can see exact parsing semantics and migration implications for code expecting 1-byte-only pstrings.
docs/src/parser.md (1)
187-226: ⚠️ Potential issue | 🟠 Major
Update pstring parser docs to match new suffix support.
This section is now stale: it documents only 1-byte pstring behavior, but parser/evaluator now support `/B`, `/H`, `/h`, `/L`, `/l`, and `/J` combinations. Please add syntax, endianness semantics, `/J` self-inclusive length behavior, and at least one example per width family so readers can author rules correctly.
As per coding guidelines, "Review documentation quality, completeness, and accuracy. Ensure comprehensive API documentation, usage examples, and migration guides. Focus on user experience and developer onboarding."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/src/parser.md` around lines 187 - 226, Update the "Pascal String (pstring) Type" docs to document the new suffixes and their semantics: list the allowed suffixes (/B, /H, /h, /L, /l, /J), explain width mapping (B=1, H=2, L=4) and that uppercase means big-endian while lowercase means little-endian for multi-byte length fields, and describe /J as the self-inclusive length variant (length includes the size of the length field itself). Show the exact parsing behavior used by parse_type_keyword() and the evaluator: how the length field is read (with chosen width and endianness), how max_length still caps the decoded length, how /J adjusts the resulting byte count, UTF‑8 validation/replacement, and that values are produced as Value::String for comparisons mapped to TypeKind::PString. Add one short usage example per width family (B, H/h, L/l) including a /J example and one with a max_length (e.g., pstring/64), and ensure the examples mirror the syntax the parser recognizes so readers can author rules correctly.
src/evaluator/types/string.rs (1)
86-139: 🛠️ Refactor suggestion | 🟠 Major
Document the two new `read_pstring` knobs in rustdoc.
The public signature now exposes `length_width` and `length_includes_itself`, but the `# Arguments` section still documents only `buffer`, `offset`, and `max_length`, and the examples never show `/J`. Please add those parameters plus the underflow error case so the new behavior is discoverable in rustdoc.
As per coding guidelines, "All public APIs require rustdoc with examples; include error conditions and recovery strategies; provide usage examples for common patterns."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/evaluator/types/string.rs` around lines 86 - 139, Update the rustdoc for the public function read_pstring to document the two new parameters: add entries in the # Arguments section for length_width (type PStringLengthWidth) and length_includes_itself (bool) explaining their meanings and how they affect the parsed length, and mention the underflow/error case where a length prefix that implies fewer bytes than the header or causes negative effective payload should return/raise TypeReadError::BufferOverrun; update the # Examples to include calls that demonstrate different PStringLengthWidth variants (OneByte, TwoByteBE, FourByteLE) and both values of length_includes_itself (true/false) including an example that triggers the underflow/BufferOverrun case so the new behavior and recovery are visible in rustdoc.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.mcp.json:
- Around line 1-12: This PR should not include the MCP server configuration;
either remove the .mcp.json entry entirely from this PR or extract it into its
own PR that adds the file plus clear documentation; specifically delete or
relocate the "mcpServers" object containing the "tessl" entry (type "stdio",
command "tessl", args ["mcp","start"]) and if moved, add a README or PR
description explaining the tessl dependency, its purpose, how it integrates with
the project, and why the MCP server is needed.
In `@AGENTS.md`:
- Around line 208-209: The pstring documentation omits the implemented /J
(self-inclusive length) suffix and its combinability with size suffixes (e.g.,
/HJ, /lJ); update the pstring sentence in AGENTS.md to state that pstring
supports 1/2/4-byte length prefixes via /B, /H (or /h), and /L (or /l) and also
supports the /J modifier which makes the length count include the length
byte(s), and note that /J can be combined with any size suffix (examples: /HJ,
/lJ) so the docs match actual behavior of pstring parsing.
In `@GOTCHAS.md`:
- Around line 94-97: Update the GOTCHAS.md section to correctly describe pstring
endianness: state that /H denotes a 2-byte big-endian length prefix and /L
denotes a 4-byte big-endian length prefix (while /h and /l remain the
little-endian counterparts), and adjust the example bytes and wording
accordingly so examples show big-endian ordering for /H and /L and little-endian
for /h and /l.
In `@src/evaluator/types/string.rs`:
- Around line 829-833: The test
test_read_pstring_j_flag_length_less_than_prefix_width is using the wrong byte
order for a TwoByteLE prefix: change the buffer used in the test (referencing
read_pstring and PStringLengthWidth::TwoByteLE) so the two-byte prefix encodes 1
in little-endian (swap the bytes to b"\x01\x00xx") to trigger the checked_sub
underflow branch; alternatively, if you prefer not to change bytes, switch the
length width to TwoByteBE so the current bytes represent 1.
- Around line 149-156: The PStringLengthWidth::OneByte branch in read_pstring
currently uses direct indexing (len_bytes[0]); change it to use safe accessor
semantics like len_bytes.get(0) and propagate the same
TypeReadError::BufferOverrun (or appropriate error) if missing, then convert the
retrieved byte to usize (e.g., usize::from(*byte)) so this branch conforms to
the repo invariant of always using .get() for buffer access.
In `@src/parser/ast.rs`:
- Around line 239-240: The PString doc comment incorrectly states the length
prefix is little-endian; update the comment for the PString type (and any
related functions/constructors named PString, parse_pstring, or similar in
ast.rs) to say the prefix endianness is configurable and can be either
little-endian or big-endian (supported by format specifiers /H, /h, /L, /l), and
describe that the length is stored in the specified endianness followed by that
many bytes of string data (not null-terminated); keep the note about 1, 2, or 4
byte lengths and mention the specifiers that control byte order so future
readers can find the supported options.
- Around line 949-958: The round-trip serialization test only covers PString
with PStringLengthWidth::OneByte and length_includes_itself: false; add
additional TypeKind::PString cases to that test exercising 2-byte and 4-byte
width variants for both endianness and a case where length_includes_itself is
true (the “/J” case). Concretely, in the existing round-trip test that builds
TypeKind::PString values, add at least: a 2-byte big-endian variant, a 2-byte
little-endian variant, a 4-byte big-endian variant, a 4-byte little-endian
variant (using the appropriate PStringLengthWidth enum variants for 2/4 bytes
and BE/LE), and one PString with length_includes_itself: true; ensure each new
instance is included in the same serialize->deserialize assertions to catch
serde shape regressions.
In `@src/parser/grammar/mod.rs`:
- Around line 727-737: The code currently consumes the '/' before validating the
pstring suffix which lets invalid suffixes like '/x' be silently accepted;
change the logic in the pstring handling block so you first attempt to
parse/validate the suffix (e.g., call or adapt parse_pstring_suffix to return a
Result or an Option without mutating input) and only update input,
pstring_length_width, and pstring_length_includes_itself when the suffix parse
succeeds; on failure, emit/collect a syntax error (with location) and leave the
input unchanged so parsing can continue. Ensure references to
parse_pstring_suffix, input, pstring_length_width, and
pstring_length_includes_itself are used to locate and implement the change.
In `@tessl.json`:
- Around line 5-7: The dependency entry for "actionbook/rust-skills" is pinned
to a raw commit hash in the "version" field which hinders maintainability;
replace the commit-hash value with a stable tag or semantic release (e.g.,
"v1.2.3") when available, and if a tag is not available document an update
policy for this vendored dependency and add an automated periodic check (CI job
or script) to verify the commit remains accessible and to alert when newer
tags/commits appear; update the "version" field accordingly and add a short
comment or README note describing the chosen update cadence and the existence of
the periodic check.
- Around line 1-40: tessl.json appears to be a development-only AI/tooling
config (contains "name": "stringy" and a pinned dependency
"actionbook/rust-skills") and should not be included in the published crate;
either move this file into a non-distributed location (e.g., .github/ or a
tools/ directory) or add tessl.json to .gitignore, correct the "name" field to
match the repository ("libmagic-rs") if you keep it, and add a short README
entry documenting why the actionbook/rust-skills dependency is needed and
whether pinning to the specific commit is intentional or should be
relaxed/removed.
---
Outside diff comments:
In `@docs/src/ast-structures.md`:
- Around line 172-190: Update the TypeKind enum snippet and documentation to
reflect the new PString shape: modify TypeKind::PString to include the added
fields (the new width field and the /J-related flag/semantics introduced in the
PR) and update the descriptive paragraph for "Pascal string (length-prefixed)"
to explain how width and the /J semantics affect encoding/decoding; add a short
migration note explaining how to construct the new PString variant and how to
update pattern matches (matching on fields or using wildcard for added fields)
so downstream code compiles, and apply the same changes to the other occurrence
mentioned (the block around the later lines referenced).
In `@docs/src/magic-format.md`:
- Around line 228-240: Update the "Pascal String Type" pstring docs to reflect
the new suffix grammar: list supported suffixes (/B, /H, /h, /L, /l, /J) and
allowed combined forms (e.g., /HJ), explain how each suffix changes the
length-prefix size and endianness (e.g., /B=1-byte, /H=2-byte big-endian,
/h=2-byte little-endian, /L=4-byte big-endian, /l=4-byte little-endian), note
that /J makes the length include the prefix itself, document how optional
max_length still
caps the decoded length, and add concrete examples showing the magic lines and
resulting behavior (e.g., "0 pstring/B =JPEG" vs "0 pstring/HJ x =...") so
readers can see exact parsing semantics and migration implications for code
expecting 1-byte-only pstrings.
In `@docs/src/parser.md`:
- Around line 187-226: Update the "Pascal String (pstring) Type" docs to
document the new suffixes and their semantics: list the allowed suffixes (/B,
/H, /h, /L, /l, /J), explain width mapping (B=1, H=2, L=4) and that uppercase
means big-endian while lowercase means little-endian for multi-byte length
fields, and describe /J as the self-inclusive length variant (length includes
the size of the length field itself). Show the exact parsing behavior used by
parse_type_keyword() and the evaluator: how the length field is read (with
chosen width and endianness), how max_length still caps the decoded length, how
/J adjusts the resulting byte count, UTF‑8 validation/replacement, and that
values are produced as Value::String for comparisons mapped to
TypeKind::PString. Add one short usage example per width family (B, H/h, L/l)
including a /J example and one with a max_length (e.g., pstring/64), and ensure
the examples mirror the syntax the parser recognizes so readers can author rules
correctly.
In `@src/evaluator/types/string.rs`:
- Around line 86-139: Update the rustdoc for the public function read_pstring to
document the two new parameters: add entries in the # Arguments section for
length_width (type PStringLengthWidth) and length_includes_itself (bool)
explaining their meanings and how they affect the parsed length, and mention the
underflow/error case where a length prefix that implies fewer bytes than the
header or causes negative effective payload should return/raise
TypeReadError::BufferOverrun; update the # Examples to include calls that
demonstrate different PStringLengthWidth variants (OneByte, TwoByteBE,
FourByteLE) and both values of length_includes_itself (true/false) including an
example that triggers the underflow/BufferOverrun case so the new behavior and
recovery are visible in rustdoc.
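Several comments above ask for a migration note covering construction and pattern matching of the widened `TypeKind::PString`. A hedged sketch of what such a note might show is below. The enum definitions are simplified stand-ins written for this example (the field names `max_length`, `length_width`, and `length_includes_itself` come from the review text; the `Byte` variant, `describe`, and `legacy_pstring` are hypothetical), and the assumption that the pre-PR behavior corresponds to `OneByte` with `length_includes_itself: false` is inferred from the review's description of the old docs.

```rust
/// Simplified stand-in for the PR's width enum (variant names assumed).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PStringLengthWidth {
    OneByte,
    TwoByteBE,
    TwoByteLE,
    FourByteBE,
    FourByteLE,
}

/// Simplified stand-in for the AST type; only the PString fields matter here.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq)]
enum TypeKind {
    Byte,
    PString {
        max_length: Option<usize>,
        length_width: PStringLengthWidth,
        length_includes_itself: bool,
    },
}

/// Construction: code that used to pass only `max_length` now names the two
/// new fields. These defaults are assumed to reproduce the 1-byte behavior.
fn legacy_pstring(max_length: Option<usize>) -> TypeKind {
    TypeKind::PString {
        max_length,
        length_width: PStringLengthWidth::OneByte,
        length_includes_itself: false,
    }
}

/// Matching: `..` ignores fields a call site does not care about, so the
/// match keeps compiling if more fields are added later.
fn describe(kind: &TypeKind) -> String {
    match kind {
        TypeKind::PString {
            length_includes_itself: true,
            length_width,
            ..
        } => format!("pstring ({length_width:?} prefix, self-inclusive)"),
        TypeKind::PString { length_width, .. } => {
            format!("pstring ({length_width:?} prefix)")
        }
        _ => "other".to_string(),
    }
}
```

The trailing wildcard arm also matters if the enum is marked `#[non_exhaustive]`, since downstream matches then need a catch-all pattern.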
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: d45f7fed-b735-4e94-9b40-7b94b86db8cb
⛔ Files ignored due to path filters (7)
- `.codex/config.toml` is excluded by none and included by none
- `.gemini/settings.json` is excluded by none and included by none
- `.gitignore` is excluded by none and included by none
- `.tessl/.gitignore` is excluded by none and included by none
- `docs/API_REFERENCE.md` is excluded by none and included by none
- `tests/evaluator_tests.rs` is excluded by none and included by none
- `tests/property_tests.rs` is excluded by none and included by none
📒 Files selected for processing (19)
- `.coderabbit.yaml`
- `.mcp.json`
- `AGENTS.md`
- `CONTRIBUTING.md`
- `GOTCHAS.md`
- `docs/src/ast-structures.md`
- `docs/src/magic-format.md`
- `docs/src/parser.md`
- `src/build_helpers.rs`
- `src/evaluator/engine/tests.rs`
- `src/evaluator/strength.rs`
- `src/evaluator/types/mod.rs`
- `src/evaluator/types/string.rs`
- `src/parser/ast.rs`
- `src/parser/codegen.rs`
- `src/parser/grammar/mod.rs`
- `src/parser/grammar/tests.rs`
- `src/parser/types.rs`
- `tessl.json`
💤 Files with no reviewable changes (1)
- .coderabbit.yaml
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/src/parser.md (1)
187-263: ⚠️ Potential issue | 🔴 Critical
Remove unsupported `max_length` example from documentation.
The pstring parser does not support the `/64` suffix syntax shown in the line 250 example. The test cases in `src/parser/grammar/tests.rs:2138-2180` confirm the parser only handles width variants (`/B`, `/H`, `/h`, `/L`, `/l`) and the self-inclusive flag (`/J`), with `max_length` always initialized to `None`. The `max_length` field exists in the `TypeKind::PString` AST struct for programmatic use, but cannot be specified through magic rule syntax.
Remove the example `0 pstring/H/64 x` from line 250 or clarify that `max_length` is only available when constructing rules programmatically via the Rust API.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/src/parser.md` around lines 187 - 263, Update the pstring docs to remove the unsupported suffix example (the `0 pstring/H/64 x` line) or explicitly state that max_length cannot be specified in magic syntax; instead note that max_length exists only in the AST/ Rust API. Reference parse_type_keyword() and parse_pstring_suffix() as the parser entrypoints and TypeKind::PString as the AST variant so readers understand that only /B,/H,/h,/L,/l and /J suffixes are parsed and that max_length is set programmatically, not via rule suffixes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@docs/src/parser.md`:
- Around line 187-263: Update the pstring docs to remove the unsupported suffix
example (the `0 pstring/H/64 x` line) or explicitly state that max_length cannot
be specified in magic syntax; instead note that max_length exists only in the
AST/ Rust API. Reference parse_type_keyword() and parse_pstring_suffix() as the
parser entrypoints and TypeKind::PString as the AST variant so readers
understand that only /B,/H,/h,/L,/l and /J suffixes are parsed and that
max_length is set programmatically, not via rule suffixes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 46e46d0e-2016-4321-97fd-f5a090d065ee
⛔ Files ignored due to path filters (1)
- `docs/MAGIC_FORMAT.md` is excluded by none and included by none
📒 Files selected for processing (6)
- `.coderabbit.yaml`
- `docs/src/architecture.md`
- `docs/src/ast-structures.md`
- `docs/src/evaluator.md`
- `docs/src/magic-format.md`
- `docs/src/parser.md`
- Reject unrecognized pstring suffix characters (e.g., /Z) with parse error instead of silently defaulting to OneByte
- Add # Examples rustdoc to all PStringLengthWidth enum variants
- Fix doc comment claiming "little-endian" when both endiannesses are supported
- Re-export PStringLengthWidth from lib.rs alongside other parser::ast types
- Add InvalidPStringLength error variant for /J underflow (was misleadingly reported as BufferOverrun)
- Remove unused is_big_endian() helper method
- Add integration test for 2-byte BE prefix with /J flag
- Add test for /J + max_length interaction
- Add test for /J zero-length edge case across all widths
- Add test for invalid suffix rejection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>

- Fix GOTCHAS.md: document correct endianness mapping (uppercase=BE, lowercase=LE) and /J flag semantics
- Document /J flag support in AGENTS.md feature list and limitations section
- Use safe .first() accessor instead of len_bytes[0] indexing for OneByte variant
- Add round-trip serialization test coverage for TwoByteBE and FourByteLE variants with length_includes_itself variations
- Remove unrelated config files (.mcp.json, tessl.json, .codex/, .gemini/) from PR

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
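The suffix handling described in these commits (uppercase = big-endian, lowercase = little-endian, `/J` as a flag, unknown characters such as `/Z` rejected instead of defaulting to one byte) might look roughly like the sketch below. It is an illustration under assumptions, not the PR's `parse_pstring_suffix`: the function name, the `Width` enum, the `String` error type, and the choice to treat everything up to the first whitespace as the suffix are all hypothetical, and conflicting width letters simply let the last one win.

```rust
/// Hypothetical width enum for this sketch (variant names assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Width {
    OneByte,
    TwoByteBE,
    TwoByteLE,
    FourByteBE,
    FourByteLE,
}

/// Parses the text that follows the `pstring` keyword.
/// Returns (width, length_includes_itself, remaining input) on success.
/// Nothing is consumed on failure: the caller keeps its original input.
fn parse_pstring_suffix_sketch(input: &str) -> Result<(Width, bool, &str), String> {
    // The suffix is assumed to run up to the first whitespace character.
    let end = input.find(char::is_whitespace).unwrap_or(input.len());
    let (suffix, rest) = input.split_at(end);

    // A non-empty suffix must start with '/', e.g. "/HJ" or "/H/J".
    if !suffix.is_empty() && !suffix.starts_with('/') {
        return Err(format!("expected '/' before pstring suffix, found {suffix:?}"));
    }

    let mut width = Width::OneByte; // default when no width letter is given
    let mut includes_itself = false;
    for c in suffix.chars() {
        match c {
            '/' => {} // separator between flag groups
            'B' => width = Width::OneByte,
            'H' => width = Width::TwoByteBE,
            'h' => width = Width::TwoByteLE,
            'L' => width = Width::FourByteBE,
            'l' => width = Width::FourByteLE,
            'J' => includes_itself = true,
            other => return Err(format!("invalid pstring suffix character '{other}'")),
        }
    }
    Ok((width, includes_itself, rest))
}
```

Because the function borrows `input` immutably and returns the remainder only on success, a caller can validate first and advance its cursor afterwards, which is the ordering the review asked for.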
## 🤖 New release
* `libmagic-rs`: 0.5.0 -> 0.6.0 (⚠ API breaking changes)
### ⚠ `libmagic-rs` breaking changes
```text
--- failure constructible_struct_adds_field: externally-constructible struct adds field ---
Description:
A pub struct constructible with a struct literal has a new pub field. Existing struct literals must be updated to include the new field.
ref: https://doc.rust-lang.org/reference/expressions/struct-expr.html
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/constructible_struct_adds_field.ron
Failed in:
field MagicRule.value_transform in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:1189
field MagicRule.value_transform in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:1189
field MagicRule.value_transform in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:1189
--- failure copy_impl_added: type now implements Copy ---
Description:
A public type now implements Copy, causing non-move closures to capture it by reference instead of moving it.
ref: rust-lang/rust#100905
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/copy_impl_added.ron
Failed in:
libmagic_rs::mime::MimeMapper in /tmp/.tmpwFvgw1/libmagic-rs/src/mime.rs:98
--- failure enum_marked_non_exhaustive: enum marked #[non_exhaustive] ---
Description:
A public enum has been marked #[non_exhaustive]. Pattern-matching on it outside of its crate must now include a wildcard pattern like `_`, or it will fail to compile.
ref: https://doc.rust-lang.org/cargo/reference/semver.html#attr-adding-non-exhaustive
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/enum_marked_non_exhaustive.ron
Failed in:
enum OffsetSpec in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:198
enum OffsetSpec in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:198
enum OffsetSpec in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:198
enum LibmagicError in /tmp/.tmpwFvgw1/libmagic-rs/src/error.rs:15
enum LibmagicError in /tmp/.tmpwFvgw1/libmagic-rs/src/error.rs:15
enum IoError in /tmp/.tmpwFvgw1/libmagic-rs/src/io/mod.rs:26
enum Operator in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:838
enum Operator in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:838
enum Operator in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:838
enum TypeReadError in /tmp/.tmpwFvgw1/libmagic-rs/src/evaluator/types/mod.rs:56
enum ParseError in /tmp/.tmpwFvgw1/libmagic-rs/src/error.rs:74
enum ParseError in /tmp/.tmpwFvgw1/libmagic-rs/src/error.rs:74
enum Value in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:965
enum Value in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:965
enum Value in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:965
enum TypeKind in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:398
enum TypeKind in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:398
enum TypeKind in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:398
enum EvaluationError in /tmp/.tmpwFvgw1/libmagic-rs/src/error.rs:148
enum EvaluationError in /tmp/.tmpwFvgw1/libmagic-rs/src/error.rs:148
--- failure enum_struct_variant_field_added: pub enum struct variant field added ---
Description:
An enum's exhaustive struct variant has a new field, which has to be included when constructing or matching on this variant.
ref: https://doc.rust-lang.org/reference/attributes/type_system.html#the-non_exhaustive-attribute
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/enum_struct_variant_field_added.ron
Failed in:
field base_relative of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:251
field adjustment_op of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:266
field result_relative of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:272
field base_relative of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:251
field adjustment_op of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:266
field result_relative of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:272
field base_relative of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:251
field adjustment_op of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:266
field result_relative of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:272
--- failure function_missing: pub fn removed or renamed ---
Description:
A publicly-visible function cannot be imported by its prior path. A `pub use` may have been removed, or the function itself may have been renamed or removed entirely.
ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/function_missing.ron
Failed in:
function libmagic_rs::parser::grammar::is_empty_line, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:1025
function libmagic_rs::parser::grammar::parse_strength_directive, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:846
function libmagic_rs::parser::grammar::parse_type_and_operator, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:683
function libmagic_rs::parser::grammar::parse_offset, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:179
function libmagic_rs::parser::parse_offset, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:179
function libmagic_rs::parser::grammar::parse_comment, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:1004
function libmagic_rs::parser::grammar::parse_message, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:810
function libmagic_rs::parser::grammar::parse_value, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:633
function libmagic_rs::parser::grammar::parse_number, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:133
function libmagic_rs::parser::parse_number, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:133
function libmagic_rs::parser::grammar::has_continuation, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:1060
function libmagic_rs::parser::grammar::parse_magic_rule, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:946
function libmagic_rs::parser::grammar::parse_rule_offset, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:779
function libmagic_rs::parser::grammar::is_comment_line, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:1042
function libmagic_rs::parser::grammar::is_strength_directive, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:902
function libmagic_rs::parser::grammar::parse_type, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:749
function libmagic_rs::parser::grammar::parse_operator, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:227
--- failure function_parameter_count_changed: pub fn parameter count changed ---
Description:
A publicly-visible function now takes a different number of parameters.
ref: https://doc.rust-lang.org/cargo/reference/semver.html#fn-change-arity
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/function_parameter_count_changed.ron
Failed in:
libmagic_rs::evaluator::evaluate_single_rule now takes 3 parameters instead of 2, in /tmp/.tmpwFvgw1/libmagic-rs/src/evaluator/engine/mod.rs:196
--- failure inherent_method_missing: pub method removed or renamed ---
Description:
A publicly-visible method or associated fn is no longer available under its prior name. It may have been renamed or removed entirely.
ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/inherent_method_missing.ron
Failed in:
FileBuffer::create_symlink, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/io/mod.rs:326
EvaluationContext::increment_recursion_depth, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/evaluator/mod.rs:114
EvaluationContext::decrement_recursion_depth, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/evaluator/mod.rs:130
EvaluationContext::increment_recursion_depth, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/evaluator/mod.rs:114
EvaluationContext::decrement_recursion_depth, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/evaluator/mod.rs:130
--- failure module_missing: pub module removed or renamed ---
Description:
A publicly-visible module cannot be imported by its prior path. A `pub use` may have been removed, or the module may have been renamed, removed, or made non-public.
ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/module_missing.ron
Failed in:
mod libmagic_rs::parser::grammar, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:4
--- failure struct_marked_non_exhaustive: struct marked #[non_exhaustive] ---
Description:
A public struct has been marked #[non_exhaustive], which will prevent it from being constructed using a struct literal outside of its crate. It previously had no private fields, so a struct literal could be used to construct it outside its crate.
ref: https://doc.rust-lang.org/cargo/reference/semver.html#attr-adding-non-exhaustive
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/struct_marked_non_exhaustive.ron
Failed in:
struct EvaluationConfig in /tmp/.tmpwFvgw1/libmagic-rs/src/config.rs:42
```
<details><summary><i><b>Changelog</b></i></summary><p>
<blockquote>
## [0.6.0] - 2026-04-25
### Features
- **parser**: Add Date and QDate types with serialization support
([#165](#165))
- **parser**: Implement pstring (Pascal string) type
([#170](#170))
- **parser**: Implement pstring multi-byte length prefix variants (/B,
/H, /h, /L, /l, /J)
([#183](#183))
- **evaluator**: Add debug-level tracing for skipped rules
([#184](#184))
- **evaluator**: Implement indirect offset resolution
([#37](#37))
([#199](#199))
- **evaluator**: Implement relative offset resolution
([#38](#38))
([#211](#211))
- **deps**: Add new skills to actionbook/rust-skills and
trailofbits/skills
- **evaluator**: Regex and search types (closes #39)
([#214](#214))
- Implement libmagic meta-type directives and format substitution
([#42](#42))
([#230](#230))
### Bug Fixes
- **regex**: PR #214 follow-up review findings
([#215](#215))
- Load and correctly evaluate /usr/share/file/magic/filesystems and
adjacent magic files
([#233](#233))
### Documentation
- **gotchas**: Clarify requirements for adding TypeKind variants
### Miscellaneous Tasks
- Rename .coderabbitai.yaml to .coderabbit.yaml
- **Mergify**: Configuration update
([#173](#173))
- Update .gitignore to exclude local AI assistant files
- **mergify**: Upgrade configuration to current format
([#205](#205))
- Resolve all pending TODO items
([#212](#212))
- **mergify**: Upgrade configuration to current format
([#231](#231))
<!-- generated by git-cliff -->
### Security
- **io**: Close TOCTOU race in `FileBuffer::new` metadata validation
(CWE-367). `validate_file_metadata` now uses `File::metadata()` on the
open descriptor instead of re-canonicalizing the path, so an attacker
cannot swap the path between `open_file` and validation. Error paths now
report the caller-supplied path rather than the canonicalized variant.
- **cli**: Remove relative-path fallbacks from `default_magic_file_path`
(CWE-426). `./missing.magic`, `./third_party/magic.mgc`, and the
`CI`/`GITHUB_ACTIONS` env-var branch no longer resolve against the
process cwd. CI pipelines must pass `--magic <path>` explicitly.
- **evaluator**: `build_regex` now bounds `size_limit` and
`dfa_size_limit` to 1 MiB (`REGEX_COMPILE_SIZE_LIMIT`) to reject
compile-time DoS patterns (CWE-1333) from adversarial magic files.
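
The TOCTOU fix in the **io** entry can be illustrated with a minimal sketch, using only the standard library. The helper name `open_regular_file` and its return shape are hypothetical and are not the crate's `validate_file_metadata`. The point is that `File::metadata()` queries the handle that is already open, so swapping the path after `open` cannot change what gets validated.

```rust
use std::fs::File;
use std::io;
use std::path::Path;

/// Hypothetical helper (not part of libmagic-rs): open `path`, then validate
/// the metadata of the *open handle* rather than looking the path up again.
pub fn open_regular_file(path: &Path) -> io::Result<(File, u64)> {
    let file = File::open(path)?;

    // `File::metadata` inspects the open descriptor, so a rename or symlink
    // swap of `path` after this point does not affect the check.
    let meta = file.metadata()?;

    if !meta.is_file() {
        // Report the caller-supplied path, not a canonicalized variant.
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            format!("{} is not a regular file", path.display()),
        ));
    }

    Ok((file, meta.len()))
}
```

By contrast, calling `std::fs::metadata(path)` after opening would resolve the path a second time, which is the window the changelog entry describes closing.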
### Features
- **parser**: Implement meta-type directives: `name`/`use` subroutines,
`default`/`clear` per-level fallback, and `indirect` re-evaluation.
`parse_text_magic_file` now returns `ParsedMagic { rules, name_table }`
(breaking change from `Vec<MagicRule>`). Named subroutines are hoisted
into `NameTable` at load time and dispatched via `RuleEnvironment` in
the evaluator. Recursion is bounded by
`EvaluationConfig::max_recursion_depth`. Resolves
[#42](#42).
- **evaluator**: Thread-local regex compile cache eliminates the
double-compile paid by every successful regex match.
`regex_bytes_consumed` now reuses the compiled `Regex` from `read_regex`
instead of recompiling the pattern to derive the anchor advance. The
cache is reset at the start of every `evaluate_rules_with_config` call,
bounding memory to one evaluation.
- **config**: `EvaluationConfig` is now `#[non_exhaustive]`; new
builder-style setters (`with_max_recursion_depth`,
`with_max_string_length`, `with_stop_at_first_match`, `with_mime_types`,
`with_timeout_ms`) let external crates construct configurations without
struct literals.
- **parser**: `MagicRule::new()` smart constructor with
`::with_children()`, `::with_strength_modifier()`, `::with_level()`
builder methods and a `::validate()` method enforcing structural
invariants (non-empty message, `level <= MAX_LEVEL`, children nested
strictly deeper than parent). New `MagicRuleValidationError` error type.
- **parser**: `RegexFlags::with_case_insensitive()` and
`::with_start_offset()` builder methods.
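
As a rough illustration of the `#[non_exhaustive]` plus builder-setter pattern described in the **config** entry, the sketch below uses a stand-in struct. The type name `EvalConfigSketch`, the field types, and the default values are assumptions made for the example; only the setter names mirror the changelog.

```rust
/// Stand-in for `EvaluationConfig`; field types and defaults are assumed.
#[non_exhaustive]
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EvalConfigSketch {
    pub max_recursion_depth: u32,
    pub max_string_length: usize,
    pub stop_at_first_match: bool,
    pub timeout_ms: Option<u64>,
}

impl Default for EvalConfigSketch {
    fn default() -> Self {
        // Placeholder defaults, chosen only for this sketch.
        Self {
            max_recursion_depth: 20,
            max_string_length: 8192,
            stop_at_first_match: true,
            timeout_ms: None,
        }
    }
}

impl EvalConfigSketch {
    /// Each setter takes `self` by value and returns it, so calls chain.
    #[must_use]
    pub fn with_max_recursion_depth(mut self, depth: u32) -> Self {
        self.max_recursion_depth = depth;
        self
    }

    #[must_use]
    pub fn with_stop_at_first_match(mut self, stop: bool) -> Self {
        self.stop_at_first_match = stop;
        self
    }

    #[must_use]
    pub fn with_timeout_ms(mut self, timeout_ms: Option<u64>) -> Self {
        self.timeout_ms = timeout_ms;
        self
    }
}
```

A downstream crate cannot write `EvalConfigSketch { .. }` as a struct literal once the attribute is present, so it would start from `EvalConfigSketch::default()` and chain setters, for example `.with_max_recursion_depth(8).with_timeout_ms(Some(500))`. New fields can then be added later without breaking such callers.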
### Refactor
- **engine**: Extract `evaluate_pattern_rule()` and
`evaluate_value_rule()` helpers from
`evaluate_single_rule_with_anchor`'s 90-line body. Dispatch is now a
two-arm type-category split; each helper has focused rustdoc on
semantics and invariants.
- **types**: Replace the `_ =>` catch-all in
`bytes_consumed_with_pattern` with an explicit listing of the
fixed-width `TypeKind` variants. Adding a new variable-width variant
without updating this match is now a compile error instead of a silent
relative-offset anchor corruption in release builds.
- **parser**: Split the 185-line `type_keyword_to_kind` match into
per-family helpers (`byte_family`, `short_family`, `long_family`,
`quad_family`, `float_family`, `double_family`, `date_family`,
`qdate_family`, `string_family`). Drops the
`#[allow(clippy::too_many_lines)]` attribute.
- **main**: `main()` returns `std::process::ExitCode` instead of calling
`process::exit`, so destructors run on the happy path. Ctrl-C
`AtomicBool` flag uses `Ordering::Relaxed` instead of `SeqCst`.
- **grammar**: `parse_strength_directive` uses nom 8's `preceded` +
`Parser::map` instead of the legacy `map(pair(char(...), parse_number),
|(_, n)| ...)` pattern.
- **output**: Add `#[serde(skip_serializing_if = "Option::is_none",
default)]` to public `Option<T>` fields so JSON output no longer emits
`"field": null` for unset optional values.
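
The **types** entry relies on Rust's exhaustiveness checking. A minimal sketch of the idea follows, with a hypothetical `KindSketch` enum standing in for `TypeKind` and a hypothetical `fixed_width` function standing in for `bytes_consumed_with_pattern`.

```rust
/// Stand-in for `TypeKind`; the real enum has more variants and payloads.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum KindSketch {
    Byte,
    Short,
    Long,
    Quad,
    String,
}

/// Returns the fixed width in bytes, or `None` for variable-width kinds
/// whose consumed length depends on the matched data.
pub fn fixed_width(kind: KindSketch) -> Option<usize> {
    // Every variant is listed explicitly and there is no `_ =>` arm, so
    // adding a variant to `KindSketch` without extending this match is a
    // compile error rather than a silent fall-through.
    match kind {
        KindSketch::Byte => Some(1),
        KindSketch::Short => Some(2),
        KindSketch::Long => Some(4),
        KindSketch::Quad => Some(8),
        KindSketch::String => None,
    }
}
```

With a catch-all arm, a newly added variable-width variant would quietly inherit whatever the `_` arm returned, which is the anchor corruption the entry mentions.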
### Documentation
- **lib**: Add `# Security` sections to
`MagicDatabase::with_builtin_rules`, `::with_builtin_rules_and_config`,
`::load_from_file`, and `::load_from_file_with_config` warning about the
unbounded default timeout and recommending
`EvaluationConfig::performance()` for untrusted input.
- **lib**: Document `MagicDatabase: Send + Sync` for parallel scanning.
- **README**: Update `TypeKind` enum example to match the current AST,
add `regex` and `search/N` to the supported types table, add pre-1.0 API
stability warning, correct the roadmap to mark v0.2-v0.4 as shipped.
- **AGENTS.md**: Relabel "Currently Implemented (v0.1.0)" and "Current
Limitations (v0.1.0)" to v0.5.0 and rewrite the Development Phases
section to reflect actual shipped scope.
### Testing
- Security regression tests for S-H1 (planted-magic-file in cwd), S-H2
(TOCTOU path-swap contract), S-M2 (pathological regex bounded runtime),
S-L2 (codegen message escape round-trip), and GOTCHAS S13.1
(`EvaluationConfig::default()` unbounded timeout invariant).
- Backspace message concatenation regression tests for first-match,
consecutive, and empty-rest edge cases.
- `MagicRule::validate()` tests covering empty message, child level
invariant, and max-depth rejection.
- `RegexCache` population/clear/reuse tests.
### Breaking Changes
- **parser**: `parse_text_magic_file` return type changed from
`Result<Vec<MagicRule>, ParseError>` to `Result<ParsedMagic,
ParseError>`. Callers must destructure `ParsedMagic { rules, name_table
}`. Low-level callers that only need the rule list can use
`parsed.rules`. `load_magic_file` and `load_magic_directory` return the
same new type.
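
For callers migrating across this change, the two access styles might look like the sketch below. All types and the `parse_sketch` function are stand-ins invented for the example; the real `ParsedMagic`, `MagicRule`, and `NameTable` are defined by libmagic-rs and are not reproduced here.

```rust
/// Stand-in for `NameTable`; the real structure is not shown in the changelog.
#[derive(Debug, Default)]
pub struct NameTableSketch {
    pub names: Vec<String>,
}

/// Stand-in for `MagicRule`.
#[derive(Debug)]
pub struct RuleSketch {
    pub message: String,
}

/// Stand-in for `ParsedMagic { rules, name_table }`.
#[derive(Debug)]
pub struct ParsedMagicSketch {
    pub rules: Vec<RuleSketch>,
    pub name_table: NameTableSketch,
}

/// Toy parser: one rule per non-empty line. It does not parse magic syntax.
pub fn parse_sketch(input: &str) -> Result<ParsedMagicSketch, String> {
    let rules = input
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(|line| RuleSketch { message: line.to_string() })
        .collect();
    Ok(ParsedMagicSketch { rules, name_table: NameTableSketch::default() })
}

/// Caller that previously consumed a `Vec` and only needs the rule list.
pub fn rules_only(input: &str) -> Result<Vec<RuleSketch>, String> {
    Ok(parse_sketch(input)?.rules)
}

/// Caller that needs both parts destructures the result.
pub fn rule_and_name_counts(input: &str) -> Result<(usize, usize), String> {
    let ParsedMagicSketch { rules, name_table } = parse_sketch(input)?;
    Ok((rules.len(), name_table.names.len()))
}
```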
</blockquote>
</p></details>
---
This PR was generated with
[release-plz](https://github.com/release-plz/release-plz/).
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Summary
- Add `/H` (2-byte BE), `/h` (2-byte LE), `/L` (4-byte BE), `/l` (4-byte LE), and `/B` (1-byte, default) length prefix variants
- Add `/J` flag support (JPEG-style self-inclusive length) with combinable syntax (`/HJ`, `/lJ`, etc.)
- Replace `unwrap()` in library code with proper error propagation
Changes
- `PStringLengthWidth` enum expanded to 5 variants: `OneByte`, `TwoByteBE`, `TwoByteLE`, `FourByteBE`, `FourByteLE`
- Add `length_includes_itself: bool` field to `TypeKind::PString` for the `/J` flag
- `read_pstring` uses `from_be_bytes`/`from_le_bytes` based on variant; subtracts prefix width when `/J` is set
- Parse `/J` standalone and in combinations via the `parse_pstring_suffix` helper
Test Plan
- `cargo clippy -- -D warnings` clean
- `just ci-check` passes
- `/J` flag with all width variants
- `/J` where length equals prefix width (empty string), length < prefix width (error)
Closes #171
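
The length-prefix handling described under Changes can be sketched roughly as follows. This is an illustration under stated assumptions, not the crate's `read_pstring`: the function name `read_pstring_sketch`, the error type `PStringError`, the omission of `max_length`, and the choice to return an error rather than truncate on a short buffer are all hypothetical. Only the `PStringLengthWidth` variant names and the `/J` subtraction rule come from the PR description.

```rust
/// Width and byte order of the length prefix (variant names from the PR).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PStringLengthWidth {
    OneByte,
    TwoByteBE,
    TwoByteLE,
    FourByteBE,
    FourByteLE,
}

impl PStringLengthWidth {
    /// Number of bytes occupied by the prefix itself.
    pub fn byte_count(self) -> usize {
        match self {
            Self::OneByte => 1,
            Self::TwoByteBE | Self::TwoByteLE => 2,
            Self::FourByteBE | Self::FourByteLE => 4,
        }
    }
}

/// Hypothetical error type used only by this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum PStringError {
    OutOfBounds,
    LengthSmallerThanPrefix,
}

/// Reads a length-prefixed string at `offset` and returns its payload bytes.
pub fn read_pstring_sketch(
    buffer: &[u8],
    offset: usize,
    width: PStringLengthWidth,
    length_includes_itself: bool,
) -> Result<&[u8], PStringError> {
    let n = width.byte_count();
    let start = offset.checked_add(n).ok_or(PStringError::OutOfBounds)?;
    let prefix = buffer.get(offset..start).ok_or(PStringError::OutOfBounds)?;

    // Decode the prefix according to its width and endianness.
    let raw_len = match width {
        PStringLengthWidth::OneByte => usize::from(prefix[0]),
        PStringLengthWidth::TwoByteBE => usize::from(u16::from_be_bytes([prefix[0], prefix[1]])),
        PStringLengthWidth::TwoByteLE => usize::from(u16::from_le_bytes([prefix[0], prefix[1]])),
        PStringLengthWidth::FourByteBE => {
            let v = u32::from_be_bytes([prefix[0], prefix[1], prefix[2], prefix[3]]);
            usize::try_from(v).map_err(|_| PStringError::OutOfBounds)?
        }
        PStringLengthWidth::FourByteLE => {
            let v = u32::from_le_bytes([prefix[0], prefix[1], prefix[2], prefix[3]]);
            usize::try_from(v).map_err(|_| PStringError::OutOfBounds)?
        }
    };

    // With `/J`, the stored length counts the prefix bytes too, so subtract them.
    let payload_len = if length_includes_itself {
        raw_len.checked_sub(n).ok_or(PStringError::LengthSmallerThanPrefix)?
    } else {
        raw_len
    };

    let end = start.checked_add(payload_len).ok_or(PStringError::OutOfBounds)?;
    buffer.get(start..end).ok_or(PStringError::OutOfBounds)
}
```

For instance, with `/hJ` semantics the bytes `07 00` followed by `hello` decode to a stored length of 7, which yields a 5-byte payload after subtracting the 2-byte prefix. A stored length of 2 yields an empty string, and a stored length of 1 is rejected, matching the edge cases listed in the test plan.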
🤖 Generated with Claude Code