feat(evaluator): implement relative offset resolution (#38)#211
Conversation
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Implement OffsetSpec::Relative evaluator support using GNU file/libmagic semantics. The AST variant existed but the evaluator stub returned UnsupportedType. This change threads a previous-match anchor through EvaluationContext, advances it after each successful match by the bytes consumed (variable-width types include c-string NUL terminators and pstring length prefixes), and resolves Relative(delta) as last_match_end + delta with bounds and overflow checks. Top-level relative offsets resolve from anchor 0, matching libmagic. The anchor is global-monotonic across child recursion (no save/restore) so siblings see the deepest descendant's anchor -- documented in GOTCHAS S3.8. Pascal-string consumption is clamped against the remaining buffer to prevent attacker-controlled length prefixes (e.g., 0xFFFFFFFF) from poisoning the anchor to usize::MAX and silently suppressing all subsequent relative-offset rules. Engine-internal items (bytes_consumed, set_last_match_end, resolve_offset_with_context) are pub(crate) -- the anchor-threading contract is not stable API surface. - Add EvaluationContext::last_match_end (pub(crate) getter/setter) - Add evaluator::types::bytes_consumed (pub(crate)) with attacker-input clamping for pstring - Implement evaluator::offset::relative::resolve_relative_offset with isize::try_from + checked_add_signed bounds - Add resolve_offset_with_context (pub(crate)) dispatcher; resolve_offset delegates with anchor=0 for backwards-compatible callers - Add evaluate_single_rule_with_anchor (private engine helper); evaluate_rules updates anchor before recursing into children - Document the Relative=anchor 0 behavior on evaluate_single_rule and resolve_offset, and the context-reuse warning on evaluate_rules - 30 new unit tests + 12 integration tests in tests/relative_offset_evaluation.rs covering positive/negative deltas, string/pstring parents, sibling propagation, top-level zero anchor, graceful skip on overrun/underflow, /J underflow with multi-byte width, oversized pstring prefix clamping, non-matching intermediate sibling, and context reset between buffers - Update AGENTS.md and GOTCHAS.md S3.8 Closes #38 Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
|
Important Review skippedAuto reviews are limited based on label configuration. 🏷️ Required labels (at least one) (3)
Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
Merge ProtectionsYour pull request matches the following merge protections and will not be merged until they are valid. 🟢 Enforce conventional commitWonderful, this rule succeeded.Require conventional commit format per https://www.conventionalcommits.org/en/v1.0.0/. Skipped for bots.
🟢 Full CI must passWonderful, this rule succeeded.All CI checks must pass. Release-plz PRs are exempt because they only bump versions and changelogs (code was already tested on main), and GITHUB_TOKEN-triggered force-pushes suppress CI.
🟢 Do not merge outdated PRsWonderful, this rule succeeded.Make sure PRs are within 10 commits of the base branch before merging
|
|
Related Documentation 3 document(s) may need updating based on files changed in this PR: libMagic-rs architecture
|
There was a problem hiding this comment.
Pull request overview
Implements evaluator support for OffsetSpec::Relative (issue #38) by threading a “previous match end” anchor through EvaluationContext and using it during offset resolution, aligning behavior with GNU file/libmagic relative-offset semantics.
Changes:
- Add relative-offset resolution (
last_match_end + delta) with overflow/underflow + bounds checks and a context-aware offset dispatcher. - Track and advance
EvaluationContext::last_match_endafter each successful match using newtypes::bytes_consumed(including C-string NUL and pstring prefix semantics). - Add extensive unit + integration tests and update project documentation/tooling pins.
Reviewed changes
Copilot reviewed 15 out of 16 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
src/evaluator/offset/relative.rs |
Implements relative offset resolution with error mapping and unit tests. |
src/evaluator/offset/mod.rs |
Adds resolve_offset_with_context and makes resolve_offset delegate with anchor=0; updates tests/docs. |
src/evaluator/mod.rs |
Extends EvaluationContext with last_match_end getter/setter and reset behavior. |
src/evaluator/engine/mod.rs |
Threads anchor into rule evaluation and advances it via bytes_consumed; documents behavior. |
src/evaluator/types/mod.rs |
Adds bytes_consumed + helpers for string/pstring consumption (anchor advancement). |
src/evaluator/types/tests.rs |
Adds unit tests for bytes_consumed. |
src/evaluator/tests.rs |
Adds tests for EvaluationContext::last_match_end and reset/clone behavior. |
tests/relative_offset_evaluation.rs |
Integration tests covering relative chains, nesting, bounds failures, and context reset. |
GOTCHAS.md |
Documents relative-offset anchor semantics (no save/restore across recursion). |
AGENTS.md |
Updates evaluator/offset module docs and “currently implemented” feature list. |
mise.toml |
Pins several dev tool versions (was latest for some). |
mise.lock |
Updates lockfile to match tool pin changes. |
.bun-version |
Bumps Bun version to 1.3.11. |
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
Apply 8 fixes from the pr-review-toolkit second-pass review (PR #211): - Replace "monotonic" anchor language in EvaluationContext field doc and the engine inline comment with accurate "may increase or decrease" wording. The anchor reflects the end of the most recent match, not a high-watermark, and can move backwards when a sibling matches at a lower position. - Replace bytes_consumed _ => 0 catch-all with debug_assert! that fires on unhandled variable-width TypeKind variants. Update GOTCHAS S2.1 to list bytes_consumed in the new-TypeKind-variant exhaustive-match checklist; previously a future variable-width variant could silently corrupt the relative-offset anchor with no signal. - Fix AGENTS.md Module Organization comment that still labeled src/evaluator/offset/indirect.rs as "stub (issue #37)" -- #37 shipped in commit 6426d18. - Add 2 unit tests in src/evaluator/engine/tests.rs covering evaluate_single_rule with OffsetSpec::Relative (delta != 0 and delta == 0). The public API's anchor=0 contract had no test coverage at that layer; integration tests all went through evaluate_rules. - Tighten bytes_consumed rustdoc to scope the "returns 0 for unexpected inputs" claim to variable-width types only -- the fixed-width branch ignores offset validity (correct under the engine invariant, but the doc was imprecise). - Extend evaluate_rules rustdoc context-reuse warning to cover the Err return path (Timeout / RecursionLimitExceeded leave the anchor partially advanced; callers retrying must call reset()). - Split test_bytes_consumed_offset_at_buffer_end_returns_zero into two tests with accurate names: test_bytes_consumed_string_at_past_end_ returns_zero and test_bytes_consumed_fixed_width_ignores_offset_ validity. The original test name claimed "returns_zero" but the fixed-width assertion returned 1 (the bit_width fallback). - Document EvaluationError::InternalError in resolve_relative_offset's # Errors section, noting it propagates instead of being caught by the engine's graceful-skip arm. All 943 lib tests + 12 relative-offset integration tests + 13 other suites + 163 doctests passing. Clippy and fmt clean. Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Apply 9 follow-up items from comment-analyzer, silent-failure-hunter, and code-reviewer that fell below the first synthesis pass: - Upgrade defensive child-error arm in evaluate_rules from debug! to warn!. If this arm fires, the parent match is still emitted but the entire child subtree is silently dropped -- previously logged at debug with no signal that a partial classification was returned. - Tighten bytes_consumed rustdoc NUL semantics: distinguish nul_index+1 (byte after the NUL) from window-size return when no NUL is found, and explicitly state that the NUL inclusion is intentional GNU `file` semantics for chained record parsing. - Document Relative(0)-after-string semantics on string_bytes_consumed (anchor lands one byte past the NUL terminator). - Add recursion_depth to the evaluate_rules context-reuse warning -- it also leaks across calls if the caller forgets reset(). - Clarify set_last_match_end doc that reset() clears the anchor, current offset, and recursion depth together (not just the anchor). - Reword "prior to v0.5" to "before the relative-offset feature landed in v0.5" in resolve_offset and evaluate_single_rule rustdocs (the v0.5 file referencing "prior to v0.5" reads oddly). - Narrow GOTCHAS S3.8 push-ordering claim to "incidental for anchor correctness" so a future caller iterating matches alongside the context anchor is not misled. - Add integration test that pins the documented "anchor may decrease" behavior: rule A matches at offset 8, rule B matches at offset 2, rule C uses Relative(0) and reads at offset 3 (not 12). - Add unit test that injects a near-saturation anchor (usize::MAX) via the pub(crate) setter and verifies subsequent Relative(0)/(+1)/(-1) rules skip gracefully without panic. 944 lib tests + 13 relative-offset integration tests + 13 other suites + 163 doctests passing. Clippy and fmt clean. Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Apply 4 review threads from PR #211: - evaluate_single_rule rustdoc no longer claims Relative is "equivalent to absolute of the same delta". Negative deltas underflow the anchor and return InvalidOffset, while Absolute(-N) means from-end -- the two are NOT equivalent. Same fix applied to resolve_offset rustdoc which had the same misleading wording. - bytes_consumed: bounds-check the fixed-width branch instead of just documenting the gap. The function now returns 0 if offset + width exceeds buffer.len() (mirroring the variable-width path). The engine guarantees in-bounds invocation in practice, but the guard makes the contract self-consistent for any future caller. Test updated from pinning the old "ignores offset" behavior to pinning the new bounds-checked behavior at the boundary. - bytes_consumed catch-all comment no longer says debug_assert "logs" the case -- debug_assert! panics in test/dev builds and compiles out in release. Reworded to reflect the actual behavior. - docs/src/evaluator.md: split EvaluationContext methods into "Public Methods" and "Internal (pub(crate))" sections so external users see that last_match_end / set_last_match_end are not callable from outside the crate. Direct external users to evaluate_rules + EvaluationContext for relative-offset behavior. 944 lib tests + 13 relative-offset integration tests + 13 other suites + 163 doctests passing. Clippy and fmt clean. Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
- Add docs/solutions/security-issues/pstring-anchor-poisoning.md capturing the SEC-001 finding from PR #211 review: attacker-controlled 4-byte pstring length prefix poisoning the GNU `file` previous-match anchor. Includes the failed first attempt (documenting the gap), the working fix (clamping payload against remaining buffer), why-this-works rationale, and prevention rules generalizing to any future variable- width parser advancing internal state from attacker input. - Cross-references the two existing indirect-offset learnings as the sibling work that established the offset-resolver patterns this PR followed. - Add docs/solutions/ pointer to AGENTS.md Quick Reference so future contributors and agents (without the compound-engineering plugin) can discover the searchable knowledge store of past learnings. Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Apply 6 findings from the fourth internal review pass (code-reviewer, pr-test-analyzer, comment-analyzer) on PR #211: - docs/src/evaluator.md: the Offset Resolution section still contained the old "Relative(N) resolves as if it were Absolute(N)" claim that directly contradicts the rustdoc fix already applied in cd3a5e0. Rewrote the paragraph with the two-case split (non-negative delta vs negative delta underflow -> InvalidOffset). - docs/src/evaluator.md: Implementation Status checklist showed "[ ] Indirect offset support" but #37 shipped in commit 6426d18. Ticked it off and added the new "[x] Relative offset support". - src/evaluator/engine/mod.rs: style consistency fix. The new defensive warn! call used `log::warn!(...)` path-qualified while the file imports `use log::debug;` and uses `debug!` bare. Changed import to `use log::{debug, warn};` and call site to bare `warn!(...)` per the project-wide convention. - src/evaluator/engine/tests.rs: add test_evaluate_single_rule_relative_negative_with_zero_anchor_errors pinning the documented contract that evaluate_single_rule with OffsetSpec::Relative(-1) returns Err(InvalidOffset { offset: -1 }), not Ok(None). Previously the contract had zero test coverage at the public API boundary. - src/evaluator/types/tests.rs: stale comment on test_bytes_consumed_fixed_width_types said "Buffer is irrelevant for fixed-width types; bytes_consumed reads bit_width". The bounds guard added in cd3a5e0 makes the buffer relevant. Updated the comment to reflect that the 16-byte buffer is sized to fit all tested widths and points to the companion test that exercises the guard's 0-return path. - tests/relative_offset_evaluation.rs: remove the abandoned drafting comment "offset 0: 0x42 (matched by rule_b at offset 2 via Absolute(2)... wait, we need rule_a to match HIGHER...)" from relative_anchor_can_decrease_when_later_sibling_matches_at_lower_position. The final layout description that follows is accurate; the mid- sentence course-correction was a draft artifact. 945 lib tests + 13 relative-offset integration tests + 13 other suites + 163 doctests passing. Clippy and fmt clean. Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
| | Feature | GNU file | rmagic | Status | Notes | | ||
| | ------------------ | -------- | ------ | ----------- | ---------------------------- | | ||
| | Basic patterns | ✅ | ✅ | Complete | String, numeric matching | | ||
| | Hierarchical rules | ✅ | 🔄 | In Progress | Parent-child relationships | | ||
| | Indirect offsets | ✅ | 📋 | Planned | Pointer dereferencing | | ||
| | Relative offsets | ✅ | 📋 | Planned | Position-relative addressing | | ||
| | Relative offsets | ✅ | ✅ | Complete | Position-relative addressing (PR #211) | | ||
| | Search patterns | ✅ | 📋 | Planned | Pattern searching in ranges | |
There was a problem hiding this comment.
In the Magic File Format Compatibility matrix, both offset-related rows look inconsistent with current implementation details:
Indirect offsetsis marked as “📋 Planned”, butresolve_indirect_offsetis implemented and the parser grammar already supports(base.type)syntax.Relative offsetsis marked “✅ Complete”, but magic-file&+N/&-Nparsing is still TODO (relative offsets are only reachable programmatically via the AST).
Consider adjusting the table to reflect this split (e.g., indirect ✅, relative 🔄/partial with a note that evaluation is implemented but parser wiring is pending).
| ## Implementation Status | ||
|
|
||
| - [x] Basic evaluation engine structure | ||
| - [x] Offset resolution (absolute, relative, from-end) | ||
| - [x] Relative offset support with previous-match anchor tracking (PR #211, issue #38) | ||
| - [x] Type reading with endianness support (Byte, Short, Long, Quad, Float, Double, Date, QDate, String, PString with 1/2/4-byte prefixes) | ||
| - [x] Operator application (Equal, NotEqual, LessThan, GreaterThan, LessEqual, GreaterEqual, BitwiseAnd, BitwiseAndMask) | ||
| - [x] Hierarchical rule processing with child evaluation | ||
| - [x] Error handling with graceful degradation | ||
| - [x] Timeout protection | ||
| - [x] Recursion depth limiting | ||
| - [x] Comprehensive test coverage (150+ tests) | ||
| - [ ] Indirect offset support (pointer dereferencing) | ||
| - [x] Indirect offset support (pointer dereferencing, issue #37) | ||
| - [x] Relative offset support (GNU `file` anchor semantics, issue #38) | ||
| - [ ] Regex type support |
There was a problem hiding this comment.
The “Implementation Status” checklist is redundant/inconsistent:
- It already has “Offset resolution (absolute, relative, from-end)”, which includes relative offsets.
- It then lists “Relative offset support …” twice (lines 566 and 575).
Consider collapsing these into a single relative-offset bullet (or keeping only the more specific anchor-tracking line) to avoid confusion about what’s actually newly implemented.
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Add docs/solutions/developer-experience/rust-test-visibility-boundary.md capturing the lesson from PR #211: tests in tests/ compile as external crates and cannot use pub(crate) items; tests that need internal-state injection must live in src/.../tests.rs instead. The test_evaluate_rules_anchor_near_saturation_skips_relative_child_ gracefully test that regression-guards the pstring anchor poisoning fix is the concrete case that prompted this — the initial attempt to put it in tests/relative_offset_evaluation.rs failed because EvaluationContext::set_last_match_end is pub(crate). Documents the three-way split between: - src/.../tests.rs -> same crate, full pub(crate) access - tests/foo.rs -> external crate, public API only - doctests in rustdoc -> external crate, same as tests/ Also update AGENTS.md Quick Reference to list the new developer-experience/ category alongside the existing ones. Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
| ```rust | ||
| pub struct EvaluationContext { | ||
| /// Current offset position for relative calculations | ||
| current_offset: usize, | ||
| /// End offset of the most recent successful match (GNU file anchor) | ||
| last_match_end: usize, | ||
| /// Current recursion depth for safety limits |
There was a problem hiding this comment.
In this EvaluationContext snippet, current_offset is described as being used for "relative calculations", but relative offsets now resolve via the GNU file anchor (last_match_end). If current_offset is no longer used for relative offset resolution (and isn’t used by evaluate_rules), consider rewording this comment to avoid suggesting it drives OffsetSpec::Relative semantics (or documenting what current_offset is used for today).
- bytes_consumed rustdoc header: drop the misleading "mirrors the consumption logic of read_typed_value" framing. The helper intentionally includes framing bytes (NUL terminator, pstring length prefix) that the read functions do not return, so describing it as "anchor-advance distance" is more accurate. Callers should not equate this with the byte count read_typed_value returns. - string_bytes_consumed rustdoc: replace "Mirrors read_string" with an explicit statement that the function uses the same scan logic but intentionally counts the NUL terminator as consumed. Adds a do-not-fix note so a future maintainer doesn't try to align the two behaviors. - GOTCHAS.md S3.8 title: "Anchor is Global-Monotonic" -> "Global Anchor, No Save/Restore". The anchor numeric value is not non-decreasing -- a Relative(-N) match (or any later rule at a lower absolute position) can move it earlier. The original title contradicted the documented and tested behavior. - docs/src/architecture.md module list: offset/indirect.rs was still labeled as "stub (issue #37)" even though #37 shipped in commit 6426d18. Same fix previously applied to docs/src/evaluator.md in a prior review round. 945 lib tests + 14 relative-offset integration + 13 other suites + 163 doctests still passing. Clippy and fmt clean. Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
## 🤖 New release
* `libmagic-rs`: 0.5.0 -> 0.6.0 (⚠ API breaking changes)
### ⚠ `libmagic-rs` breaking changes
```text
--- failure constructible_struct_adds_field: externally-constructible struct adds field ---
Description:
A pub struct constructible with a struct literal has a new pub field. Existing struct literals must be updated to include the new field.
ref: https://doc.rust-lang.org/reference/expressions/struct-expr.html
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/constructible_struct_adds_field.ron
Failed in:
field MagicRule.value_transform in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:1189
field MagicRule.value_transform in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:1189
field MagicRule.value_transform in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:1189
--- failure copy_impl_added: type now implements Copy ---
Description:
A public type now implements Copy, causing non-move closures to capture it by reference instead of moving it.
ref: rust-lang/rust#100905
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/copy_impl_added.ron
Failed in:
libmagic_rs::mime::MimeMapper in /tmp/.tmpwFvgw1/libmagic-rs/src/mime.rs:98
--- failure enum_marked_non_exhaustive: enum marked #[non_exhaustive] ---
Description:
A public enum has been marked #[non_exhaustive]. Pattern-matching on it outside of its crate must now include a wildcard pattern like `_`, or it will fail to compile.
ref: https://doc.rust-lang.org/cargo/reference/semver.html#attr-adding-non-exhaustive
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/enum_marked_non_exhaustive.ron
Failed in:
enum OffsetSpec in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:198
enum OffsetSpec in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:198
enum OffsetSpec in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:198
enum LibmagicError in /tmp/.tmpwFvgw1/libmagic-rs/src/error.rs:15
enum LibmagicError in /tmp/.tmpwFvgw1/libmagic-rs/src/error.rs:15
enum IoError in /tmp/.tmpwFvgw1/libmagic-rs/src/io/mod.rs:26
enum Operator in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:838
enum Operator in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:838
enum Operator in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:838
enum TypeReadError in /tmp/.tmpwFvgw1/libmagic-rs/src/evaluator/types/mod.rs:56
enum ParseError in /tmp/.tmpwFvgw1/libmagic-rs/src/error.rs:74
enum ParseError in /tmp/.tmpwFvgw1/libmagic-rs/src/error.rs:74
enum Value in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:965
enum Value in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:965
enum Value in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:965
enum TypeKind in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:398
enum TypeKind in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:398
enum TypeKind in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:398
enum EvaluationError in /tmp/.tmpwFvgw1/libmagic-rs/src/error.rs:148
enum EvaluationError in /tmp/.tmpwFvgw1/libmagic-rs/src/error.rs:148
--- failure enum_struct_variant_field_added: pub enum struct variant field added ---
Description:
An enum's exhaustive struct variant has a new field, which has to be included when constructing or matching on this variant.
ref: https://doc.rust-lang.org/reference/attributes/type_system.html#the-non_exhaustive-attribute
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/enum_struct_variant_field_added.ron
Failed in:
field base_relative of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:251
field adjustment_op of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:266
field result_relative of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:272
field base_relative of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:251
field adjustment_op of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:266
field result_relative of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:272
field base_relative of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:251
field adjustment_op of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:266
field result_relative of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:272
--- failure function_missing: pub fn removed or renamed ---
Description:
A publicly-visible function cannot be imported by its prior path. A `pub use` may have been removed, or the function itself may have been renamed or removed entirely.
ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/function_missing.ron
Failed in:
function libmagic_rs::parser::grammar::is_empty_line, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:1025
function libmagic_rs::parser::grammar::parse_strength_directive, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:846
function libmagic_rs::parser::grammar::parse_type_and_operator, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:683
function libmagic_rs::parser::grammar::parse_offset, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:179
function libmagic_rs::parser::parse_offset, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:179
function libmagic_rs::parser::grammar::parse_comment, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:1004
function libmagic_rs::parser::grammar::parse_message, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:810
function libmagic_rs::parser::grammar::parse_value, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:633
function libmagic_rs::parser::grammar::parse_number, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:133
function libmagic_rs::parser::parse_number, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:133
function libmagic_rs::parser::grammar::has_continuation, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:1060
function libmagic_rs::parser::grammar::parse_magic_rule, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:946
function libmagic_rs::parser::grammar::parse_rule_offset, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:779
function libmagic_rs::parser::grammar::is_comment_line, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:1042
function libmagic_rs::parser::grammar::is_strength_directive, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:902
function libmagic_rs::parser::grammar::parse_type, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:749
function libmagic_rs::parser::grammar::parse_operator, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:227
--- failure function_parameter_count_changed: pub fn parameter count changed ---
Description:
A publicly-visible function now takes a different number of parameters.
ref: https://doc.rust-lang.org/cargo/reference/semver.html#fn-change-arity
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/function_parameter_count_changed.ron
Failed in:
libmagic_rs::evaluator::evaluate_single_rule now takes 3 parameters instead of 2, in /tmp/.tmpwFvgw1/libmagic-rs/src/evaluator/engine/mod.rs:196
--- failure inherent_method_missing: pub method removed or renamed ---
Description:
A publicly-visible method or associated fn is no longer available under its prior name. It may have been renamed or removed entirely.
ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/inherent_method_missing.ron
Failed in:
FileBuffer::create_symlink, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/io/mod.rs:326
EvaluationContext::increment_recursion_depth, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/evaluator/mod.rs:114
EvaluationContext::decrement_recursion_depth, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/evaluator/mod.rs:130
EvaluationContext::increment_recursion_depth, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/evaluator/mod.rs:114
EvaluationContext::decrement_recursion_depth, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/evaluator/mod.rs:130
--- failure module_missing: pub module removed or renamed ---
Description:
A publicly-visible module cannot be imported by its prior path. A `pub use` may have been removed, or the module may have been renamed, removed, or made non-public.
ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/module_missing.ron
Failed in:
mod libmagic_rs::parser::grammar, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:4
--- failure struct_marked_non_exhaustive: struct marked #[non_exhaustive] ---
Description:
A public struct has been marked #[non_exhaustive], which will prevent it from being constructed using a struct literal outside of its crate. It previously had no private fields, so a struct literal could be used to construct it outside its crate.
ref: https://doc.rust-lang.org/cargo/reference/semver.html#attr-adding-non-exhaustive
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/struct_marked_non_exhaustive.ron
Failed in:
struct EvaluationConfig in /tmp/.tmpwFvgw1/libmagic-rs/src/config.rs:42
```
<details><summary><i><b>Changelog</b></i></summary><p>
<blockquote>
## [0.6.0] - 2026-04-25
### Features
- **parser**: Add Date and QDate types with serialization support
([#165](#165))
- **parser**: Implement pstring (Pascal string) type
([#170](#170))
- **parser**: Implement pstring multi-byte length prefix variants (/B,
/H, /h, /L, /l, /J)
([#183](#183))
- **evaluator**: Add debug-level tracing for skipped rules
([#184](#184))
- **evaluator**: Implement indirect offset resolution
([#37](#37))
([#199](#199))
- **evaluator**: Implement relative offset resolution
([#38](#38))
([#211](#211))
- **deps**: Add new skills to actionbook/rust-skills and
trailofbits/skills
- **evaluator**: Regex and search types (closes #39)
([#214](#214))
- Implement libmagic meta-type directives and format substitution
([#42](#42))
([#230](#230))
### Bug Fixes
- **regex**: PR #214 follow-up review findings
([#215](#215))
- Load and correctly evaluate /usr/share/file/magic/filesystems and
adjacent magic files
([#233](#233))
### Documentation
- **gotchas**: Clarify requirements for adding TypeKind variants
### Miscellaneous Tasks
- Rename .coderabbitai.yaml to .coderabbit.yaml
- **Mergify**: Configuration update
([#173](#173))
- Update .gitignore to exclude local AI assistant files
- **mergify**: Upgrade configuration to current format
([#205](#205))
- Resolve all pending TODO items
([#212](#212))
- **mergify**: Upgrade configuration to current format
([#231](#231))
<!-- generated by git-cliff -->
### Security
- **io**: Close TOCTOU race in `FileBuffer::new` metadata validation
(CWE-367). `validate_file_metadata` now uses `File::metadata()` on the
open descriptor instead of re-canonicalizing the path, so an attacker
cannot swap the path between `open_file` and validation. Error paths now
report the caller-supplied path rather than the canonicalized variant.
- **cli**: Remove relative-path fallbacks from `default_magic_file_path`
(CWE-426). `./missing.magic`, `./third_party/magic.mgc`, and the
`CI`/`GITHUB_ACTIONS` env-var branch no longer resolve against the
process cwd. CI pipelines must pass `--magic <path>` explicitly.
- **evaluator**: `build_regex` now bounds `size_limit` and
`dfa_size_limit` to 1 MiB (`REGEX_COMPILE_SIZE_LIMIT`) to reject
compile-time DoS patterns (CWE-1333) from adversarial magic files.
### Features
- **parser**: Implement meta-type directives: `name`/`use` subroutines,
`default`/`clear` per-level fallback, and `indirect` re-evaluation.
`parse_text_magic_file` now returns `ParsedMagic { rules, name_table }`
(breaking change from `Vec<MagicRule>`). Named subroutines are hoisted
into `NameTable` at load time and dispatched via `RuleEnvironment` in
the evaluator. Recursion is bounded by
`EvaluationConfig::max_recursion_depth`. Resolves
[#42](#42).
- **evaluator**: Thread-local regex compile cache eliminates the
double-compile paid by every successful regex match.
`regex_bytes_consumed` now reuses the compiled `Regex` from `read_regex`
instead of recompiling the pattern to derive the anchor advance. The
cache is reset at the start of every `evaluate_rules_with_config` call,
bounding memory to one evaluation.
- **config**: `EvaluationConfig` is now `#[non_exhaustive]`; new
builder-style setters (`with_max_recursion_depth`,
`with_max_string_length`, `with_stop_at_first_match`, `with_mime_types`,
`with_timeout_ms`) let external crates construct configurations without
struct literals.
- **parser**: `MagicRule::new()` smart constructor with
`::with_children()`, `::with_strength_modifier()`, `::with_level()`
builder methods and a `::validate()` method enforcing structural
invariants (non-empty message, `level <= MAX_LEVEL`, children nested
strictly deeper than parent). New `MagicRuleValidationError` error type.
- **parser**: `RegexFlags::with_case_insensitive()` and
`::with_start_offset()` builder methods.
### Refactor
- **engine**: Extract `evaluate_pattern_rule()` and
`evaluate_value_rule()` helpers from
`evaluate_single_rule_with_anchor`'s 90-line body. Dispatch is now a
two-arm type-category split; each helper has focused rustdoc on
semantics and invariants.
- **types**: Replace the `_ =>` catch-all in
`bytes_consumed_with_pattern` with an explicit listing of the
fixed-width `TypeKind` variants. Adding a new variable-width variant
without updating this match is now a compile error instead of a silent
relative-offset anchor corruption in release builds.
- **parser**: Split the 185-line `type_keyword_to_kind` match into
per-family helpers (`byte_family`, `short_family`, `long_family`,
`quad_family`, `float_family`, `double_family`, `date_family`,
`qdate_family`, `string_family`). Drops the
`#[allow(clippy::too_many_lines)]` attribute.
- **main**: `main()` returns `std::process::ExitCode` instead of calling
`process::exit`, so destructors run on the happy path. Ctrl-C
`AtomicBool` flag uses `Ordering::Relaxed` instead of `SeqCst`.
- **grammar**: `parse_strength_directive` uses nom 8's `preceded` +
`Parser::map` instead of the legacy `map(pair(char(...), parse_number),
|(_, n)| ...)` pattern.
- **output**: Add `#[serde(skip_serializing_if = "Option::is_none",
default)]` to public `Option<T>` fields so JSON output no longer emits
`"field": null` for unset optional values.
### Documentation
- **lib**: Add `# Security` sections to
`MagicDatabase::with_builtin_rules`, `::with_builtin_rules_and_config`,
`::load_from_file`, and `::load_from_file_with_config` warning about the
unbounded default timeout and recommending
`EvaluationConfig::performance()` for untrusted input.
- **lib**: Document `MagicDatabase: Send + Sync` for parallel scanning.
- **README**: Update `TypeKind` enum example to match the current AST,
add `regex` and `search/N` to the supported types table, add pre-1.0 API
stability warning, correct the roadmap to mark v0.2-v0.4 as shipped.
- **AGENTS.md**: Relabel "Currently Implemented (v0.1.0)" and "Current
Limitations (v0.1.0)" to v0.5.0 and rewrite the Development Phases
section to reflect actual shipped scope.
### Testing
- Security regression tests for S-H1 (planted-magic-file in cwd), S-H2
(TOCTOU path-swap contract), S-M2 (pathological regex bounded runtime),
S-L2 (codegen message escape round-trip), and GOTCHAS S13.1
(`EvaluationConfig::default()` unbounded timeout invariant).
- Backspace message concatenation regression tests for first-match,
consecutive, and empty-rest edge cases.
- `MagicRule::validate()` tests covering empty message, child level
invariant, and max-depth rejection.
- `RegexCache` population/clear/reuse tests.
### Breaking Changes
- **parser**: `parse_text_magic_file` return type changed from
`Result<Vec<MagicRule>, ParseError>` to `Result<ParsedMagic,
ParseError>`. Callers must destructure `ParsedMagic { rules, name_table
}`. Low-level callers that only need the rule list can use
`parsed.rules`. `load_magic_file` and `load_magic_directory` return the
same new type.
</blockquote>
</p></details>
---
This PR was generated with
[release-plz](https://github.com/release-plz/release-plz/).
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Summary
Implements
OffsetSpec::Relativeevaluator support per #38 using GNUfile/libmagic semantics. Threads a "previous match end" anchor throughEvaluationContext, advances it after each successful match by the bytes consumed (variable-width types include c-string NUL terminators and pstring length prefixes), and resolvesRelative(delta)aslast_match_end + deltawith bounds and overflow checks.Impact: 21 files changed (1984 insertions, 213 deletions). 945 lib tests + 14 relative-offset integration tests + 11 other integration suites + 163 doctests, all green. Clippy clean with
-D warnings. CI: 22 pass, 0 failures, mergeable.Review rounds completed: 4 internal passes (ce:review autofix, pr-review-toolkit ×2, Copilot PR reviewer). 0 critical/high findings outstanding.
Key design decisions
filesemantics. The anchor value may increase or decrease as successive rules match at different positions; it is not a high-watermark. Documented inGOTCHAS.md§3.8.Absolute(N); negative deltas underflow toInvalidOffset(NOT interpreted likeAbsolute(-N)'s from-end semantics). Called out explicitly in the rustdoc for bothresolve_offsetandevaluate_single_rule.\xFF\xFF\xFF\xFF) from poisoning the anchor tousize::MAXand silently suppressing all subsequent relative-offset rules. Found by adversarial review (SEC-001 / ADV-001), regression-tested with both BE and LE oversized-prefix cases. Documented as a reusable learning indocs/solutions/security-issues/pstring-anchor-poisoning.md.bytes_consumedis also bounds-checked. Per Copilot reviewer feedback, addedoffset.checked_add(width) <= buffer.len()guard so the helper is infallible regardless of caller discipline — no relying on "the engine only calls this after a successful read" as the only line of defense.pub(crate):bytes_consumed,last_match_end()getter,set_last_match_end()setter, andresolve_offset_with_contextare deliberately not stable API surface. The anchor-threading contract is internal and shouldn't lock in semver guarantees.Behavior changes
OffsetSpec::Relativepreviously returnedEvaluationError::UnsupportedType. It now resolves:Relative(N)for N ≥ 0Relative(-N)for N > 0evaluate_rules+EvaluationContextlast_match_end + Nlast_match_end - Nif in bounds, elseInvalidOffsetevaluate_single_rule(no context)NInvalidOffset(anchor 0 underflow)resolve_offset(no context)evaluate_single_ruleevaluate_single_ruleCallers with existing error-handling code that pattern-matched
UnsupportedTypefor relative offsets must remove that arm. Documented in the rustdoc for all three entry points.Implementation
EvaluationContext::last_match_endfield withpub(crate)getter/setter, cleared byreset()alongsidecurrent_offsetandrecursion_depthevaluator::types::bytes_consumed(pub(crate)) with buffer clamping for both fixed-width and variable-width (c-string, pstring) branches;string_bytes_consumedandpstring_bytes_consumedre-derive consumption directly from the buffer (not fromValue::String, which can drift viafrom_utf8_lossy)evaluator::offset::relative::resolve_relative_offsetwithisize::try_from+checked_add_signedbounds, mapping over/underflow toInvalidOffsetand out-of-buffer toBufferOverrunevaluator::offset::resolve_offset_with_context(pub(crate)) dispatcher; existingresolve_offset(pub) delegates withlast_match_end = 0so existing call sites are unchangedevaluator::engine::evaluate_single_rule_with_anchorprivate helper;evaluate_rulesupdates the anchor before recursing into children, never saves/restores (GNUfileglobal semantics)evaluate_rulesgraceful-skip arm for child-evaluation errors:debug!→warn!with a tightened comment noting the asymmetry (parent emitted, child silently dropped if the defensive arm fires)Test coverage
Unit tests (30+ new):
EvaluationContext::last_match_end(field / clone / reset / set-get)bytes_consumed(fixed-width all types, c-string with/without NUL, max_length truncation, pstring 1/2/4-byte BE/LE,/Jflag well-formed,/Junderflow with multi-byte width, oversized prefix clamping BE+LE, boundary cases,usize::MAXoverflow)resolve_relative_offset(positive/negative/zero deltas, top-level anchor=0, last-valid-index, underflow, overflow atusize::MAX, target past end, target ==buffer.len())offset/mod.rs(relative via context, top-level default, passthrough for Absolute/FromEnd/Indirect)engine/tests.rspinningevaluate_single_rulecontracts (positive delta → absolute N, zero delta → start, negative delta →Err(InvalidOffset))usize::MAX) via thepub(crate)setter to verify all three Relative variants skip gracefully without panicIntegration tests (14 in
tests/relative_offset_evaluation.rs):Documentation
AGENTS.md— "Currently Implemented" / Module Organization / Quick Reference sections updated;docs/solutions/pointer added for knowledge-store discoverabilityGOTCHAS.md— new §3.8 documents the no-save-restore anchor semantics; §2.1 checklist updated to includebytes_consumedexhaustive-match requirement for newTypeKindvariantsdocs/src/evaluator.md— Offset Resolution section rewritten for the two-case Relative behavior;EvaluationContext"Key Methods" split into Public vs Internal (pub(crate)) sections; Implementation Status checklist updated for Evaluator: implement indirect offset resolution #37 and Evaluator: implement relative offset resolution #38docs/solutions/security-issues/pstring-anchor-poisoning.md— NEW bug-track solution doc capturing the SEC-001 finding, failed first attempts, generalizable rule ("when advancing internal state by attacker-controlled byte counts, clamp against actual buffer reality, not raw input"), and prevention strategiesdocs/solutions/developer-experience/rust-test-visibility-boundary.md— NEW knowledge-track doc capturing the lesson that tests needingpub(crate)items must live insrc/.../tests.rs, nottests/Scope notes
A handful of files in the diff are incidental and not part of the #38 implementation:
.bun-version+mise.lock+mise.toml— tool version pins that came in via merge-from-maintessl.json— unrelatedneonwatty/logo-designer-skilldependency added in commitb1f62fadocs/src/architecture.md+docs/src/compatibility.md— Dosu auto-updates from commit3cf5d5aReviewers can skim these; they don't need deep review as part of the #38 work.
Post-Deploy Monitoring & Validation
No additional operational monitoring required. This is a library change to evaluator semantics:
OffsetSpec::Relativenow evaluate correctly instead of erroring withUnsupportedTyperesolve_offsetandevaluate_single_rulesignatures are unchanged; internal items arepub(crate)and don't lock semvertessl.jsonchange is documented abovebytes_consumedadds one bounds check per successful match (negligible);memchrinstring_bytes_consumedis O(n) over a region the existingread_stringalready scans, so worst case is a bounded 2x constant factor on string matchesBehavior change is documented in the rustdoc and
AGENTS.mdso downstream callers see the migration note inline.Review history
c9a89f1) — feature + tests per the planb7e5938) — 12 safe_auto fixes including the SEC-001 pstring anchor poisoning fix, visibility narrowing topub(crate), doc gaps on context reuse4ebd9b7) — pr-review-toolkit feedback: "monotonic" wording fix,recursion_depthin context-reuse warning, 2 new testscd3a5e0) — bounds guard on fixed-widthbytes_consumed, Relative negative-delta clarification,debug_assert!wording fix,docs/src/evaluator.mdPublic/Internal method split41c75c3) — pstring-anchor-poisoning learning +docs/solutions/discoverability pointer in AGENTS.md58210ab) — 6 findings: staleresolve_offsetequivalence claim indocs/src/evaluator.md,log::warn!style consistency,evaluate_single_rule(Relative(-1))test coverage, stale comments, implementation status checklist03f389b) — Rust developer-experience solution doc capturing thetests/vssrc/.../tests.rsgotcha hit during implementationTest Plan
just ci-checkpasses locallycargo testpasses (945 lib + 14 integration + 11 other suites + 163 doctest)cargo clippy --all-targets -- -D warningscleancargo fmt --checkcleanCloses #38