refactor(evaluator): reorganize types module into submodules by unclesp1d3r · Pull Request #156 · EvilBit-Labs/libmagic-rs

unclesp1d3r · 2026-03-06T22:09:47Z

This commit restructures the types module by splitting it into focused submodules for numeric and string handling. The main types.rs file has been removed, and its functionality has been distributed across numeric.rs, string.rs, and a new mod.rs that serves as the public API surface. This change enhances code organization and maintainability while preserving existing functionality.

- Removed `types.rs` and migrated its content to `numeric.rs` and `string.rs`.
- Introduced a new `mod.rs` to expose the public API and manage submodule imports.
- Updated error handling and type reading functions to ensure consistency across the new structure.

No public API changes have been made, and all existing tests have been updated accordingly to reflect the new module organization.

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>

coderabbitai · 2026-03-06T22:10:05Z

Caution

Review failed

Pull request was closed or merged during review

Summary by CodeRabbit

Release Notes

Refactor
- Reorganized internal code structure into specialized submodules.
Tests
- Expanded test coverage for type operations and edge cases.
Documentation
- Updated architecture documentation to reflect structural changes.

Walkthrough

The monolithic src/evaluator/types.rs (1836 lines) was split into a types/ submodule: mod.rs (public API, error type, dispatcher, coercion), numeric.rs (numeric readers), string.rs (string reader), and tests.rs (comprehensive tests). Public API re-exports preserve existing surface.
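As a rough illustration of how a directory module can keep its old public paths through re-exports, here is a minimal single-file sketch. It uses inline modules in place of the separate `numeric.rs` and `string.rs` files, and the function bodies and `Option`-based signatures are simplified assumptions, not the crate's actual code.

```rust
// Hypothetical, heavily simplified stand-in for src/evaluator/types/.
// Inline modules are used so the sketch fits in one file.
pub mod types {
    // Plays the role of numeric.rs.
    mod numeric {
        /// Reads one byte, returning `None` when `offset` is out of bounds.
        pub fn read_byte(buffer: &[u8], offset: usize) -> Option<u8> {
            buffer.get(offset).copied()
        }
    }

    // Plays the role of string.rs.
    mod string {
        /// Reads bytes up to the first NUL (or end of buffer) as lossy UTF-8.
        pub fn read_string(buffer: &[u8], offset: usize) -> Option<String> {
            let rest = buffer.get(offset..)?;
            let end = rest.iter().position(|&b| b == 0).unwrap_or(rest.len());
            Some(String::from_utf8_lossy(&rest[..end]).into_owned())
        }
    }

    // Re-exports keep paths such as `types::read_byte` valid for callers,
    // even though the functions now live in private submodules.
    pub use self::numeric::read_byte;
    pub use self::string::read_string;
}
```

Callers keep writing `types::read_byte(...)` and never name the submodules, which is what lets a split like this avoid a breaking change.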

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Module Removal / Replaced `src/evaluator/types.rs` | Removed monolithic types implementation; functionality relocated into a modular layout. |
| Public API & Dispatch `src/evaluator/types/mod.rs` | New module exposing `TypeReadError`, `read_typed_value`, `coerce_value_to_type`, and re-exports of numeric/string readers; dispatches by `TypeKind`. |
| Numeric Readers `src/evaluator/types/numeric.rs` | Adds `read_byte`, `read_short`, `read_long`, `read_quad` with bounds checks, endianness and signed/unsigned handling; includes extensive tests. |
| String Reader `src/evaluator/types/string.rs` | Adds `read_string` for NUL-terminated UTF‑8 reads with optional `max_length`, bounds checks, lossy UTF‑8 handling, and tests. |
| Tests Consolidation `src/evaluator/types/tests.rs` | Comprehensive tests for error variants, typed dispatch, reader consistency, coercion, and buffer-overrun scenarios. |
| Docs `docs/src/architecture.md` | Updated architecture docs describing the evaluator refactor and new module topology. |
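The table mentions that `mod.rs` dispatches by `TypeKind` and that the numeric readers handle endianness and signedness. The sketch below shows one way such a dispatcher could look. `Endianness`, the two-variant `TypeKind`, `Value`, and the single-variant `TypeReadError` are hypothetical simplifications made up for this example; the crate's real types have more variants and may be shaped differently.

```rust
// Hypothetical, simplified stand-ins; not the crate's actual definitions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Endianness {
    Little,
    Big,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TypeKind {
    Byte { signed: bool },
    Short { endian: Endianness, signed: bool },
}

#[derive(Debug, PartialEq)]
pub enum Value {
    Int(i64),
    Uint(u64),
}

#[derive(Debug, PartialEq)]
pub enum TypeReadError {
    BufferOverrun { offset: usize, buffer_len: usize },
}

fn overrun(buffer: &[u8], offset: usize) -> TypeReadError {
    TypeReadError::BufferOverrun { offset, buffer_len: buffer.len() }
}

/// Dispatches on `kind` and decodes the value found at `offset`.
pub fn read_typed_value(
    buffer: &[u8],
    offset: usize,
    kind: TypeKind,
) -> Result<Value, TypeReadError> {
    match kind {
        TypeKind::Byte { signed } => {
            let byte = *buffer.get(offset).ok_or_else(|| overrun(buffer, offset))?;
            Ok(if signed {
                // Reinterpret the raw byte as two's complement.
                Value::Int(i64::from(byte as i8))
            } else {
                Value::Uint(u64::from(byte))
            })
        }
        TypeKind::Short { endian, signed } => {
            let end = offset.checked_add(2).ok_or_else(|| overrun(buffer, offset))?;
            let slice = buffer.get(offset..end).ok_or_else(|| overrun(buffer, offset))?;
            let raw = [slice[0], slice[1]];
            let unsigned = match endian {
                Endianness::Little => u16::from_le_bytes(raw),
                Endianness::Big => u16::from_be_bytes(raw),
            };
            Ok(if signed {
                Value::Int(i64::from(unsigned as i16))
            } else {
                Value::Uint(u64::from(unsigned))
            })
        }
    }
}
```

Under these assumptions, `read_typed_value(&[0x01, 0x02], 0, TypeKind::Short { endian: Endianness::Big, signed: false })` decodes to `Value::Uint(0x0102)`, while the same bytes read as little-endian give `0x0201`.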

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Test infrastructure, compatibility tests, and architecture improvements #31: Reorganizes and reimplements the evaluator type-reading module into types/{mod.rs,numeric.rs,string.rs,tests.rs} — directly overlaps this refactor.
feat(parser): implement quad 64-bit integer type with endian variants #133: Modifies evaluator types with similar splitting and additions (TypeReadError, typed readers, coercion) and likely touches the same APIs.
docs: comprehensive documentation, security, and CI hardening #58: Adds/updates type-reading APIs and TypeKind usage that this refactor depends on; strong code-level relation.

Poem

I nibble bytes and hop with glee,
Split big files into tidy three. 🐰
Numeric hops, string skips along,
Tests keep rhythm, steady and strong.
A tiny rabbit's modular song.

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title accurately and concisely summarizes the main change: reorganizing the types module into focused submodules. The title is specific, clear, and directly reflects the changeset. |
| Description check | ✅ Passed | The PR description provides relevant context about the refactoring, explaining what was removed, what was added, and that no public API changes were made. The description is directly related to the changeset. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


mergify · 2026-03-06T22:10:24Z

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 CI must pass

Wonderful, this rule succeeded.

All CI checks must pass. Release-plz PRs are exempt because they only bump versions and changelogs (code was already tested on main), and GITHUB_TOKEN-triggered force-pushes suppress CI.

check-success = coverage
check-success = quality
check-success = test
check-success = test-cross-platform (macos-latest, macOS)
check-success = test-cross-platform (ubuntu-22.04, Linux)
check-success = test-cross-platform (ubuntu-latest, Linux)
check-success = test-cross-platform (windows-latest, Windows)

🟢 Do not merge outdated PRs

Wonderful, this rule succeeded.

Make sure PRs are within 10 commits of the base branch before merging

#commits-behind <= 3

dosubot · 2026-03-06T22:11:37Z

Related Documentation

1 document(s) may need updating based on files changed in this PR:

libMagic-rs

architecture `/libmagic-rs/blob/main/docs/src/architecture.md`

View Suggested Changes

@@ -130,7 +130,11 @@
 - `engine/`: Core evaluation engine submodule
   - `mod.rs`: `evaluate_single_rule`, `evaluate_rules`, and `evaluate_rules_with_config` functions
   - `tests.rs`: Engine unit tests
-- `types.rs`: Type interpretation with endianness handling and signedness coercion
+- `types/`: Type interpretation submodule
+  - `mod.rs`: Public API surface with `read_typed_value`, `coerce_value_to_type`, and type re-exports
+  - `numeric.rs`: Numeric type handling (`read_byte`, `read_short`, `read_long`, `read_quad`) with endianness and signedness support
+  - `string.rs`: String type handling (`read_string`) with null-termination and UTF-8 conversion
+  - `tests.rs`: Module tests
 - `offset/`: Offset resolution submodule
   - `mod.rs`: Dispatcher (`resolve_offset`) and re-exports
   - `absolute.rs`: `OffsetError`, `resolve_absolute_offset`
@@ -142,7 +146,7 @@
   - `comparison.rs`: `compare_values`, `apply_less_than`/`greater_than`/`less_equal`/`greater_equal`
   - `bitwise.rs`: `apply_bitwise_and`, `apply_bitwise_and_mask`, `apply_bitwise_xor`, `apply_bitwise_not`
 
-**Organization Note:** The evaluator module was refactored to split a monolithic 2,638-line `mod.rs` into focused submodules, keeping the public API surface in `mod.rs` and moving core evaluation logic to `engine/mod.rs`. This maintains the same public API through re-exports (no breaking changes) while improving code organization and staying within the 500-600 line module guideline.
+**Organization Note:** The evaluator module has been refactored to split monolithic files into focused submodules. The initial refactoring split a 2,638-line `mod.rs` into `engine/` submodules, and a subsequent refactoring reorganized the 1,836-line `types.rs` into `types/` submodules for numeric and string handling. The public API surface remains in `mod.rs` with core logic distributed across focused submodules. This maintains the same public API through re-exports (no breaking changes) while improving code organization and staying within the 500-600 line module guideline.
 
 **Implemented Features:**

✅ Accepted


cursor

Stale comment

Automated risk assessment for this synchronize update:

Risk level: Low

Why: The diff is a structural refactor of src/evaluator/types.rs into src/evaluator/types/{mod,numeric,string,tests}.rs. Public API and core type-reading/coercion logic are preserved, and no new production codepaths outside the evaluator types module were introduced.

Evidence checked: file-level diff, current review state, and direct comparison of key functions (read_byte, read_short, read_long, read_quad, read_string, read_typed_value, coerce_value_to_type) showing no behavioral logic changes.

Reviewer assignment: Not required for Low risk under policy.

Approving based on low-risk mechanical refactor characteristics.

cursor

Stale comment

Security Review Result

No high-confidence vulnerabilities found in this PR.

What I reviewed

src/evaluator/types/mod.rs

src/evaluator/types/numeric.rs

src/evaluator/types/string.rs

src/evaluator/types/tests.rs

removal of src/evaluator/types.rs

Threat-model checks

Buffer safety: numeric readers continue to use checked_add + slice .get(...) bounds checks; string reader validates offset and bounds all slicing.

Integer safety: no new unchecked offset arithmetic introduced.

DoS/resource exhaustion: no new unbounded recursion/allocations; string scan remains bounded by buffer length or max_length.

Unsafe/panics in library code: no new unsafe and no unwrap/expect in non-test code paths.

Dependency risk: no changes to Cargo.toml/Cargo.lock in this PR.

Information disclosure: error messages remain limited to offset/buffer length metadata.
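The buffer-safety and integer-safety checks above describe a pattern that is easy to show in isolation: compute the end offset with `checked_add`, then slice with `.get(..)`. The following is a minimal sketch of that pattern for a 4-byte little-endian read. The function name `read_long_le` and the `ReadError` enum are assumptions made for this example and are not taken from the crate.

```rust
/// Hypothetical error type for this sketch; it carries only offset and
/// buffer-length metadata.
#[derive(Debug, PartialEq)]
pub enum ReadError {
    BufferOverrun { offset: usize, buffer_len: usize },
}

/// Reads a little-endian `u32` at `offset` without panicking on any input.
pub fn read_long_le(buffer: &[u8], offset: usize) -> Result<u32, ReadError> {
    let overrun = || ReadError::BufferOverrun { offset, buffer_len: buffer.len() };

    // `checked_add` yields `None` instead of wrapping (or panicking in debug
    // builds) when `offset` is close to `usize::MAX`.
    let end = offset.checked_add(4).ok_or_else(overrun)?;

    // `get` yields `None` for an out-of-range slice instead of panicking.
    let bytes = buffer.get(offset..end).ok_or_else(overrun)?;

    // The slice is exactly four bytes long here, so indexing cannot fail.
    Ok(u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}
```

With this shape, a hostile offset such as `usize::MAX` yields an ordinary `Err` rather than an arithmetic overflow or an out-of-bounds panic.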

Validation

Ran focused tests: cargo test evaluator::types --quiet (pass).

Residual note: cargo audit / cargo deny are not available in this runtime, but there are no dependency changes in this diff.

Copilot

Pull request overview

This PR refactors the evaluator’s type-reading layer by replacing the monolithic src/evaluator/types.rs module with a src/evaluator/types/ directory that splits numeric and string handling into focused submodules while keeping the crate::evaluator::types public entrypoint.

Changes:

Replaced src/evaluator/types.rs with src/evaluator/types/mod.rs as the public API surface and dispatcher.
Added numeric.rs and string.rs submodules to isolate type-reading logic by domain.
Reorganized and expanded unit tests for the new module layout.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file

| File | Description |
| --- | --- |
| src/evaluator/types/mod.rs | New public API surface for type-reading + dispatcher + error type + coercion helper |
| src/evaluator/types/numeric.rs | Numeric type readers (byte/short/long/quad) + unit tests |
| src/evaluator/types/string.rs | String reader implementation + unit tests |
| src/evaluator/types/tests.rs | New consolidated tests for `types` module behavior and dispatch |
| src/evaluator/types.rs | Removed old monolithic implementation |

Copilot · 2026-03-06T22:13:02Z

+#[test]
+fn test_unsupported_type_error_variants() {
+    let error = TypeReadError::UnsupportedType {
+        type_name: "CustomType".to_string(),
+    };
+    assert!(format!("{error}").contains("CustomType"));
+    assert!(format!("{error:?}").contains("UnsupportedType"));
+
+    assert_eq!(
+        error,
+        TypeReadError::UnsupportedType {
+            type_name: "CustomType".to_string(),
+        }
+    );
+}
+


This test only validates derived Display/Debug/PartialEq for TypeReadError::UnsupportedType, but the variant is not produced by the type-reading API anywhere in the codebase. If UnsupportedType is removed (or made meaningful), this test should be adjusted accordingly to avoid locking in unused/unsupported API surface.

Suggested change

#[test]
fn test_unsupported_type_error_variants() {
    let error = TypeReadError::UnsupportedType {
        type_name: "CustomType".to_string(),
    };
    assert!(format!("{error}").contains("CustomType"));
    assert!(format!("{error:?}").contains("UnsupportedType"));

    assert_eq!(
        error,
        TypeReadError::UnsupportedType {
            type_name: "CustomType".to_string(),
        }
    );
}

Copilot · 2026-03-06T22:13:03Z

+/// # Returns
+///
+/// Returns `Ok(Value::String(string))` if the read is successful, or an appropriate error
+/// if the read fails due to buffer overrun or invalid UTF-8 sequences.
+///


The rustdoc claims read_string can fail due to “invalid UTF-8 sequences”, but the implementation uses String::from_utf8_lossy(...) and therefore never returns an error for invalid UTF-8 (it replaces invalid bytes with U+FFFD). Please update the docs to match the actual behavior (or change the implementation to return an error on invalid UTF-8 if that’s the intended contract).
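To make the difference concrete, here is a small sketch that contrasts the lossy conversion with a strict one, using only the standard library. The wrapper names `decode_lossy` and `decode_strict` are hypothetical and exist only for this illustration.

```rust
use std::borrow::Cow;

/// Lossy decoding: invalid sequences are replaced with U+FFFD, so this
/// function has no error path at all.
pub fn decode_lossy(bytes: &[u8]) -> Cow<str> {
    String::from_utf8_lossy(bytes)
}

/// Strict decoding: invalid UTF-8 is reported as an error instead of
/// being replaced.
pub fn decode_strict(bytes: &[u8]) -> Result<&str, std::str::Utf8Error> {
    std::str::from_utf8(bytes)
}
```

For the input `b"ab\xffcd"`, the lossy version returns `"ab\u{FFFD}cd"` while the strict version returns an `Err`. Which contract a reader should promise is a design decision; the point raised in the comment is that the documentation and the implementation need to agree on it.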

codecov · 2026-03-06T22:13:56Z

Codecov Report

❌ Patch coverage is 98.75000% with 7 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/evaluator/types/numeric.rs	98.33%	6 Missing ⚠️
src/evaluator/types/string.rs	99.42%	1 Missing ⚠️


cursor

Stale comment

Security Review Result

No high-confidence vulnerabilities found in this PR.

What I verified

Buffer safety: numeric readers in src/evaluator/types/numeric.rs continue using checked_add plus slice .get(..) before decoding (read_short, read_long, read_quad), and read_byte uses .get(offset).

Offset arithmetic: no unchecked offset math was introduced; arithmetic remains overflow-checked before slicing.

String handling: src/evaluator/types/string.rs validates offset < buffer.len() before slicing, bounds search_len with min(max_len, remaining_buffer.len()), and only slices within validated bounds.

Unsafe/panics in library code: no new unsafe, unwrap(), or expect() in non-test library paths touched by this PR.

Dependency risk: no Cargo.toml/Cargo.lock changes in this diff; CI check audit is passing for this PR.
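The string-handling check above can be sketched as follows: validate the offset first, clamp the scan window with `min`, then stop at the first NUL. This is an illustrative approximation only. It returns `Option<String>` for brevity, whereas the review describes the real `read_string` as returning a `Value::String` or an error, and the variable names are guesses based on the description.

```rust
/// Hypothetical, simplified string reader following the checks described
/// above. Returns `None` when `offset` is not inside the buffer.
pub fn read_string(buffer: &[u8], offset: usize, max_length: Option<usize>) -> Option<String> {
    // Reject offsets at or past the end before any slicing happens.
    if offset >= buffer.len() {
        return None;
    }
    let remaining = &buffer[offset..];

    // Never scan past the end of the buffer, even if `max_length` is larger.
    let search_len = max_length.map_or(remaining.len(), |max| max.min(remaining.len()));
    let window = &remaining[..search_len];

    // Stop at the first NUL byte; without one, take the whole window.
    let end = window.iter().position(|&b| b == 0).unwrap_or(search_len);

    // Invalid UTF-8 is replaced with U+FFFD rather than reported as an error.
    Some(String::from_utf8_lossy(&window[..end]).into_owned())
}
```

Because every slice bound is derived from a length that has already been checked, none of the indexing operations in this sketch can panic.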

Residual risk / notes

This change is a module split/refactor; behavior appears equivalent to the previous types.rs implementation. I did not find newly introduced resource-exhaustion or info-leak paths in the changed code.

Local cargo audit / cargo deny execution was not possible in this runner because cargo-audit and cargo-deny are not installed, so dependency posture was validated via PR CI status instead.

mergify · 2026-03-06T22:15:20Z

🧪 CI Insights

Here's what we observed from your CI run for eb0aac2.

🟢 All jobs passed!

But CI Insights is watching 👀

cursor

Stale comment

Automated Risk Assessment

Risk level: Medium

Evidence from diff

Core library evaluator path changed substantially: src/evaluator/types.rs removed and replaced with src/evaluator/types/{mod.rs,numeric.rs,string.rs,tests.rs}.

Large structural refactor footprint (about 3.3k changed lines, shown as delete/add because of the file split), which increases the regression surface even when the intent is organizational.

Production logic in key functions (read_byte, read_short, read_long, read_quad, read_string, read_typed_value, coerce_value_to_type) appears behaviorally equivalent after split; observed differences are predominantly docs/comments/module organization.

Targeted test validation run for evaluator types passed (65 passed, 0 failed).

Decision

Code review required (Medium risk due to a core shared-module refactor and broad structural change).

Requested reviewer: @param-jasani.

No approval posted per policy for Medium risk.

cursor

Stale comment

Follow-up on reviewer assignment: automated request could not be completed because the candidate was not a repository collaborator. I could not assign a non-author reviewer automatically in this run; maintainer action is needed to request an eligible reviewer.

- Fix inaccurate read_string docs that claimed errors for invalid UTF-8 (actually uses from_utf8_lossy replacement)
- Restore # Arguments doc sections on numeric read functions
- Add checked_add overflow tests for read_short/read_long/read_quad
- Clarify UnsupportedType variant is reserved for future types

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>

cursor

Stale comment

Automated Risk Assessment

Risk level: Low

This PR is a structural refactor of evaluator::types (file split into submodules) with no material behavior change detected in production logic.

Evidence checked:

Core functions (read_byte, read_short, read_long, read_quad, read_string, read_typed_value, coerce_value_to_type) remain behaviorally equivalent after reorganization.

Changes are primarily module layout/docs/tests movement.

Targeted validation passed: cargo test evaluator::types --lib (65 passed, 0 failed).

Per policy for Low risk, this PR is approved.

cursor

No high-confidence vulnerabilities found in this refactor.

Security-focused review results:

Buffer safety: Numeric readers continue using checked_add + slice .get(offset..end) before decoding; byte reads use .get(offset). String reads validate offset < buffer.len() before slicing and clamp reads with min(max_len, remaining.len()).
Integer safety: Offset arithmetic for fixed-width numeric reads still uses checked arithmetic, preventing overflow/underflow-driven OOB access.
Unsafe/panics in library code: No new unsafe blocks, raw pointers, unwrap(), or expect() introduced in production paths.
DoS/resource exhaustion: No new unbounded recursion or allocation patterns were introduced by this PR; behavior appears equivalent to pre-refactor implementation.
Dependency risk: PR diff does not modify Cargo.toml/Cargo.lock; no new dependency attack surface introduced.

Validation run:

cargo test --lib evaluator::types passed (68 tests).
cargo audit / cargo deny could not be executed in this runner because the corresponding cargo subcommands are not installed (no such command: audit/deny). Given no dependency changes in this PR, this does not indicate a new PR-local supply-chain risk, but CI should continue enforcing these checks.

cursor

Automated Risk Assessment

Risk level: Low

Evidence from actual diff

Changed paths are isolated to src/evaluator/types* and consist of splitting types.rs into types/mod.rs, types/numeric.rs, types/string.rs, and types/tests.rs.
Core production functions were compared across base/head and remain behaviorally equivalent (read_byte, read_short, read_long, read_quad, read_string, read_typed_value, coerce_value_to_type).
No dependency, infra, auth, permissions, schema, or cross-system changes.
Focused validation passed on current head: cargo test evaluator::types --lib (68 passed, 0 failed).

Decision

Code review required: No (Low risk under policy).
Reviewer assignment: None requested.
Approval action: No new approval posted because this PR is already approved on the current head commit (3f031417690f9d7f5201e52b3bc21978ece49411).
Re-approval/unapproval check: No risk increase detected versus current approved head, so no dismissal/revocation needed.

## 🤖 New release

* `libmagic-rs`: 0.4.1 -> 0.4.2 (✓ API compatible changes)

Changelog

## [0.4.2] - 2026-03-06

### Refactor

- **evaluator**: Reorganize types module into submodules ([#156](#156))

---

This PR was generated with [release-plz](https://github.com/release-plz/release-plz/).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

unclesp1d3r linked an issue Mar 6, 2026 that may be closed by this pull request

refactor: convert evaluator/types.rs to a types/ directory module #63

Closed

6 tasks

Copilot AI review requested due to automatic review settings March 6, 2026 22:09

dosubot Bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Mar 6, 2026

Merge branch 'main' into 63-refactor-convert-evaluatortypesrs-to-a-ty…

32dd9e2

…pes-directory-module

Copilot started reviewing on behalf of unclesp1d3r March 6, 2026 22:10 View session

dosubot Bot added the evaluator Rule evaluation engine and logic label Mar 6, 2026

cursor Bot previously approved these changes Mar 6, 2026


cursor Bot reviewed Mar 6, 2026


Copilot AI reviewed Mar 6, 2026


coderabbitai Bot added enhancement New feature or request testing Test infrastructure and coverage labels Mar 6, 2026

cursor Bot reviewed Mar 6, 2026


unclesp1d3r dismissed cursor[bot]’s stale review via 3f03141 March 6, 2026 22:20

cursor Bot previously approved these changes Mar 6, 2026


unclesp1d3r added the lgtm Approved for the merge queue label Mar 6, 2026

unclesp1d3r self-assigned this Mar 6, 2026

docs: Dosu updates for PR #156

eb0aac2

Copilot AI review requested due to automatic review settings March 6, 2026 22:23

dosubot Bot dismissed cursor[bot]’s stale review via eb0aac2 March 6, 2026 22:23

unclesp1d3r review requested due to automatic review settings March 6, 2026 22:23

cursor Bot reviewed Mar 6, 2026


coderabbitai Bot approved these changes Mar 6, 2026


unclesp1d3r enabled auto-merge (squash) March 6, 2026 22:25

unclesp1d3r merged commit 1745507 into main Mar 6, 2026
27 of 28 checks passed

unclesp1d3r deleted the 63-refactor-convert-evaluatortypesrs-to-a-types-directory-module branch March 6, 2026 22:27

github-actions Bot mentioned this pull request Mar 6, 2026

chore: release v0.4.2 #157

Merged

This was referenced Mar 7, 2026

feat(parser): implement float and double types with endian variants #162

Merged

feat(parser): implement pstring (Pascal string) type #170

Merged

coderabbitai Bot mentioned this pull request Mar 25, 2026

feat(parser): implement pstring multi-byte length prefix variants (/B, /H, /h, /L, /l, /J) #183

Merged

6 tasks

coderabbitai Bot mentioned this pull request Apr 10, 2026

chore: resolve all pending TODO items #212

Merged

6 tasks

This was referenced Apr 22, 2026

feat: Implement libmagic meta-type directives and format substitution (#42) #230

Merged

fix: load and correctly evaluate /usr/share/file/magic/filesystems and adjacent magic files #233

Merged
