
refactor(evaluator): reorganize types module into submodules #156

Merged
unclesp1d3r merged 4 commits into main from 63-refactor-convert-evaluatortypesrs-to-a-types-directory-module
Mar 6, 2026

Conversation

@unclesp1d3r
Member

This commit restructures the `types` module by splitting it into focused submodules for numeric and string handling. The main `types.rs` file has been removed, and its functionality has been distributed across `numeric.rs`, `string.rs`, and a new `mod.rs` that serves as the public API surface. This change enhances code organization and maintainability while preserving existing functionality.

- Removed `types.rs` and migrated its content to `numeric.rs` and `string.rs`.
- Introduced a new `mod.rs` to expose the public API and manage submodule imports.
- Updated error handling and type reading functions to ensure consistency across the new structure.

No public API changes have been made, and all existing tests have been updated accordingly to reflect the new module organization.

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
@unclesp1d3r unclesp1d3r linked an issue Mar 6, 2026 that may be closed by this pull request
6 tasks
Copilot AI review requested due to automatic review settings March 6, 2026 22:09
@dosubot dosubot Bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Mar 6, 2026
@coderabbitai

coderabbitai Bot commented Mar 6, 2026

Caution

Review failed

Pull request was closed or merged during review

Summary by CodeRabbit

Release Notes

  • Refactor

    • Reorganized internal code structure into specialized submodules.
  • Tests

    • Expanded test coverage for type operations and edge cases.
  • Documentation

    • Updated architecture documentation to reflect structural changes.

Walkthrough

The monolithic src/evaluator/types.rs (1836 lines) was split into a types/ submodule: mod.rs (public API, error type, dispatcher, coercion), numeric.rs (numeric readers), string.rs (string reader), and tests.rs (comprehensive tests). Public API re-exports preserve existing surface.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Module Removal / Replaced | `src/evaluator/types.rs` | Removed monolithic types implementation; functionality relocated into a modular layout. |
| Public API & Dispatch | `src/evaluator/types/mod.rs` | New module exposing `TypeReadError`, `read_typed_value`, `coerce_value_to_type`, and re-exports of numeric/string readers; dispatches by `TypeKind`. |
| Numeric Readers | `src/evaluator/types/numeric.rs` | Adds `read_byte`, `read_short`, `read_long`, `read_quad` with bounds checks, endianness and signed/unsigned handling; includes extensive tests. |
| String Reader | `src/evaluator/types/string.rs` | Adds `read_string` for NUL-terminated UTF-8 reads with optional `max_length`, bounds checks, lossy UTF-8 handling, and tests. |
| Tests Consolidation | `src/evaluator/types/tests.rs` | Comprehensive tests for error variants, typed dispatch, reader consistency, coercion, and buffer-overrun scenarios. |
| Docs | `docs/src/architecture.md` | Updated architecture docs describing the evaluator refactor and new module topology. |
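
For readers unfamiliar with the pattern, the layout summarized above can be sketched roughly as follows. This is a simplified illustration, not the crate's code: the inline modules stand in for the separate files, the `Value` and `TypeKind` enums are reduced to two variants each, and every signature here is an assumption made for the example.

```rust
// Hypothetical, simplified stand-in for src/evaluator/types/.
// Inline modules keep the sketch in one file; the PR uses separate files.
mod types {
    mod numeric {
        /// Reads one byte at `offset`, or `None` if it is out of bounds.
        pub fn read_byte(buffer: &[u8], offset: usize) -> Option<u8> {
            buffer.get(offset).copied()
        }
    }

    mod string {
        /// Reads bytes from `offset` up to the first NUL or the end of the buffer.
        pub fn read_string(buffer: &[u8], offset: usize) -> Option<String> {
            let rest = buffer.get(offset..)?;
            let end = rest.iter().position(|&b| b == 0).unwrap_or(rest.len());
            Some(String::from_utf8_lossy(&rest[..end]).into_owned())
        }
    }

    // Re-exports keep the readers reachable at `types::read_byte` and
    // `types::read_string`, the same paths callers used before the split.
    pub use numeric::read_byte;
    pub use string::read_string;

    /// Assumed, reduced value type for the sketch.
    #[derive(Debug, PartialEq)]
    pub enum Value {
        Uint(u64),
        String(String),
    }

    /// Assumed, reduced type selector for the sketch.
    pub enum TypeKind {
        Byte,
        String,
    }

    /// Dispatches to the matching reader based on `kind`.
    pub fn read_typed_value(buffer: &[u8], offset: usize, kind: &TypeKind) -> Option<Value> {
        match kind {
            TypeKind::Byte => read_byte(buffer, offset).map(|b| Value::Uint(u64::from(b))),
            TypeKind::String => read_string(buffer, offset).map(Value::String),
        }
    }
}
```

Because the submodules are private and only their functions are re-exported, callers see a single flat `types` namespace regardless of how the files are arranged underneath.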

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

I nibble bytes and hop with glee,
Split big files into tidy three. 🐰
Numeric hops, string skips along,
Tests keep rhythm, steady and strong.
A tiny rabbit's modular song.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The PR title accurately and concisely summarizes the main change: reorganizing the types module into focused submodules. The title is specific, clear, and directly reflects the changeset. |
| Description check | ✅ Passed | The PR description provides relevant context about the refactoring, explaining what was removed, what was added, and that no public API changes were made. The description is directly related to the changeset. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |


@mergify
Contributor

mergify Bot commented Mar 6, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 CI must pass

Wonderful, this rule succeeded.

All CI checks must pass. Release-plz PRs are exempt because they only bump versions and changelogs (code was already tested on main), and GITHUB_TOKEN-triggered force-pushes suppress CI.

  • check-success = coverage
  • check-success = quality
  • check-success = test
  • check-success = test-cross-platform (macos-latest, macOS)
  • check-success = test-cross-platform (ubuntu-22.04, Linux)
  • check-success = test-cross-platform (ubuntu-latest, Linux)
  • check-success = test-cross-platform (windows-latest, Windows)

🟢 Do not merge outdated PRs

Wonderful, this rule succeeded.

Make sure PRs are within 10 commits of the base branch before merging

  • #commits-behind <= 3

@dosubot dosubot Bot added the evaluator Rule evaluation engine and logic label Mar 6, 2026
@dosubot
Contributor

dosubot Bot commented Mar 6, 2026

Related Documentation

1 document(s) may need updating based on files changed in this PR:

libMagic-rs

architecture /libmagic-rs/blob/main/docs/src/architecture.md
View Suggested Changes
@@ -130,7 +130,11 @@
 - `engine/`: Core evaluation engine submodule
   - `mod.rs`: `evaluate_single_rule`, `evaluate_rules`, and `evaluate_rules_with_config` functions
   - `tests.rs`: Engine unit tests
-- `types.rs`: Type interpretation with endianness handling and signedness coercion
+- `types/`: Type interpretation submodule
+  - `mod.rs`: Public API surface with `read_typed_value`, `coerce_value_to_type`, and type re-exports
+  - `numeric.rs`: Numeric type handling (`read_byte`, `read_short`, `read_long`, `read_quad`) with endianness and signedness support
+  - `string.rs`: String type handling (`read_string`) with null-termination and UTF-8 conversion
+  - `tests.rs`: Module tests
 - `offset/`: Offset resolution submodule
   - `mod.rs`: Dispatcher (`resolve_offset`) and re-exports
   - `absolute.rs`: `OffsetError`, `resolve_absolute_offset`
@@ -142,7 +146,7 @@
   - `comparison.rs`: `compare_values`, `apply_less_than`/`greater_than`/`less_equal`/`greater_equal`
   - `bitwise.rs`: `apply_bitwise_and`, `apply_bitwise_and_mask`, `apply_bitwise_xor`, `apply_bitwise_not`
 
-**Organization Note:** The evaluator module was refactored to split a monolithic 2,638-line `mod.rs` into focused submodules, keeping the public API surface in `mod.rs` and moving core evaluation logic to `engine/mod.rs`. This maintains the same public API through re-exports (no breaking changes) while improving code organization and staying within the 500-600 line module guideline.
+**Organization Note:** The evaluator module has been refactored to split monolithic files into focused submodules. The initial refactoring split a 2,638-line `mod.rs` into `engine/` submodules, and a subsequent refactoring reorganized the 1,836-line `types.rs` into `types/` submodules for numeric and string handling. The public API surface remains in `mod.rs` with core logic distributed across focused submodules. This maintains the same public API through re-exports (no breaking changes) while improving code organization and staying within the 500-600 line module guideline.
 
 **Implemented Features:**
 

✅ Accepted


cursor[bot]
cursor Bot previously approved these changes Mar 6, 2026
@cursor cursor Bot left a comment

Stale comment

Automated risk assessment for this synchronize update:

  • Risk level: Low
  • Why: The diff is a structural refactor of src/evaluator/types.rs into src/evaluator/types/{mod,numeric,string,tests}.rs. Public API and core type-reading/coercion logic are preserved, and no new production codepaths outside the evaluator types module were introduced.
  • Evidence checked: file-level diff, current review state, and direct comparison of key functions (read_byte, read_short, read_long, read_quad, read_string, read_typed_value, coerce_value_to_type) showing no behavioral logic changes.
  • Reviewer assignment: Not required for Low risk under policy.

Approving based on low-risk mechanical refactor characteristics.


@cursor cursor Bot left a comment

Stale comment

Security Review Result

No high-confidence vulnerabilities found in this PR.

What I reviewed

  • src/evaluator/types/mod.rs
  • src/evaluator/types/numeric.rs
  • src/evaluator/types/string.rs
  • src/evaluator/types/tests.rs
  • removal of src/evaluator/types.rs

Threat-model checks

  • Buffer safety: numeric readers continue to use checked_add + slice .get(...) bounds checks; string reader validates offset and bounds all slicing.
  • Integer safety: no new unchecked offset arithmetic introduced.
  • DoS/resource exhaustion: no new unbounded recursion/allocations; string scan remains bounded by buffer length or max_length.
  • Unsafe/panics in library code: no new unsafe and no unwrap/expect in non-test code paths.
  • Dependency risk: no changes to Cargo.toml/Cargo.lock in this PR.
  • Information disclosure: error messages remain limited to offset/buffer length metadata.
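
The buffer-safety pattern named in the checks above (`checked_add` followed by a slice `.get(...)`) can be illustrated with a small sketch. The function name `read_u16`, the `Endianness` enum, and the `Option` return type are assumptions for the example; the crate's readers return their own error type and may be shaped differently.

```rust
/// Hypothetical byte-order selector; the crate's real type may differ.
#[derive(Clone, Copy)]
pub enum Endianness {
    Little,
    Big,
}

/// Reads a 16-bit unsigned value at `offset`.
/// Returns `None` on arithmetic overflow or out-of-bounds access.
pub fn read_u16(buffer: &[u8], offset: usize, endian: Endianness) -> Option<u16> {
    // checked_add rejects offsets near usize::MAX instead of wrapping around.
    let end = offset.checked_add(2)?;
    // .get returns None rather than panicking when the range exceeds the buffer.
    let slice = buffer.get(offset..end)?;
    // The slice is exactly two bytes long here, so indexing cannot fail.
    let bytes = [slice[0], slice[1]];
    Some(match endian {
        Endianness::Little => u16::from_le_bytes(bytes),
        Endianness::Big => u16::from_be_bytes(bytes),
    })
}
```

With this shape, an offset of `usize::MAX` fails at the `checked_add` step, and an offset one byte short of the end fails at the `.get` step; neither path can panic.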

Validation

  • Ran focused tests: cargo test evaluator::types --quiet (pass).

Residual note: cargo audit / cargo deny are not available in this runtime, but there are no dependency changes in this diff.


Copilot AI left a comment

Pull request overview

This PR refactors the evaluator’s type-reading layer by replacing the monolithic src/evaluator/types.rs module with a src/evaluator/types/ directory that splits numeric and string handling into focused submodules while keeping the crate::evaluator::types public entrypoint.

Changes:

  • Replaced src/evaluator/types.rs with src/evaluator/types/mod.rs as the public API surface and dispatcher.
  • Added numeric.rs and string.rs submodules to isolate type-reading logic by domain.
  • Reorganized and expanded unit tests for the new module layout.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file
| File | Description |
|---|---|
| `src/evaluator/types/mod.rs` | New public API surface for type-reading + dispatcher + error type + coercion helper |
| `src/evaluator/types/numeric.rs` | Numeric type readers (byte/short/long/quad) + unit tests |
| `src/evaluator/types/string.rs` | String reader implementation + unit tests |
| `src/evaluator/types/tests.rs` | New consolidated tests for types module behavior and dispatch |
| `src/evaluator/types.rs` | Removed old monolithic implementation |

Comment on lines +18 to +33
#[test]
fn test_unsupported_type_error_variants() {
    let error = TypeReadError::UnsupportedType {
        type_name: "CustomType".to_string(),
    };
    assert!(format!("{error}").contains("CustomType"));
    assert!(format!("{error:?}").contains("UnsupportedType"));

    assert_eq!(
        error,
        TypeReadError::UnsupportedType {
            type_name: "CustomType".to_string(),
        }
    );
}

Copilot AI Mar 6, 2026

This test only validates derived Display/Debug/PartialEq for TypeReadError::UnsupportedType, but the variant is not produced by the type-reading API anywhere in the codebase. If UnsupportedType is removed (or made meaningful), this test should be adjusted accordingly to avoid locking in unused/unsupported API surface.

Suggested change
#[test]
fn test_unsupported_type_error_variants() {
    let error = TypeReadError::UnsupportedType {
        type_name: "CustomType".to_string(),
    };
    assert!(format!("{error}").contains("CustomType"));
    assert!(format!("{error:?}").contains("UnsupportedType"));
    assert_eq!(
        error,
        TypeReadError::UnsupportedType {
            type_name: "CustomType".to_string(),
        }
    );
}
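
For context, an error type that would satisfy the assertions in this test might look roughly like the sketch below. Only the `UnsupportedType { type_name }` variant appears in the PR; the `BufferOverrun` variant, its fields, the message wording, and the hand-written `Display` impl are assumptions (the crate may well derive these with a helper crate instead).

```rust
use std::fmt;

/// Hypothetical reconstruction of the error type for illustration only.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TypeReadError {
    /// Assumed variant: a read would extend past the end of the buffer.
    BufferOverrun { offset: usize, buffer_len: usize },
    /// Variant exercised by the test above.
    UnsupportedType { type_name: String },
}

impl fmt::Display for TypeReadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::BufferOverrun { offset, buffer_len } => {
                write!(f, "buffer overrun at offset {offset} (buffer length {buffer_len})")
            }
            Self::UnsupportedType { type_name } => {
                write!(f, "unsupported type: {type_name}")
            }
        }
    }
}

impl std::error::Error for TypeReadError {}
```

A test like the one above only checks the formatting and equality traits of the variant, which is the reviewer's point: it passes whether or not any reader ever constructs `UnsupportedType`.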

Comment on lines +23 to +27
/// # Returns
///
/// Returns `Ok(Value::String(string))` if the read is successful, or an appropriate error
/// if the read fails due to buffer overrun or invalid UTF-8 sequences.
///
Copilot AI Mar 6, 2026

The rustdoc claims read_string can fail due to “invalid UTF-8 sequences”, but the implementation uses String::from_utf8_lossy(...) and therefore never returns an error for invalid UTF-8 (it replaces invalid bytes with U+FFFD). Please update the docs to match the actual behavior (or change the implementation to return an error on invalid UTF-8 if that’s the intended contract).
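
The difference between the two contracts can be seen with the standard library alone. The helper names `decode_lossy` and `decode_strict` below are made up for the illustration and are not part of the crate.

```rust
use std::borrow::Cow;
use std::string::FromUtf8Error;

/// Lossy decoding: invalid sequences become U+FFFD, so this never fails.
pub fn decode_lossy(bytes: &[u8]) -> Cow<'_, str> {
    String::from_utf8_lossy(bytes)
}

/// Strict decoding: invalid sequences are reported as an error.
pub fn decode_strict(bytes: &[u8]) -> Result<String, FromUtf8Error> {
    String::from_utf8(bytes.to_vec())
}
```

For the input `b"ab\xFFcd"`, `decode_lossy` yields `"ab\u{FFFD}cd"` while `decode_strict` returns `Err`, so a doc comment promising an error for invalid UTF-8 only fits the strict variant.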

Comment thread src/evaluator/types/mod.rs Outdated
@coderabbitai coderabbitai Bot added enhancement New feature or request testing Test infrastructure and coverage labels Mar 6, 2026
@codecov

codecov Bot commented Mar 6, 2026

Codecov Report

❌ Patch coverage is 98.75000% with 7 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| `src/evaluator/types/numeric.rs` | 98.33% | 6 Missing ⚠️ |
| `src/evaluator/types/string.rs` | 99.42% | 1 Missing ⚠️ |

@cursor cursor Bot left a comment

Stale comment

Security Review Result

No high-confidence vulnerabilities found in this PR.

What I verified

  • Buffer safety: numeric readers in src/evaluator/types/numeric.rs continue using checked_add plus slice .get(..) before decoding (read_short, read_long, read_quad), and read_byte uses .get(offset).
  • Offset arithmetic: no unchecked offset math was introduced; arithmetic remains overflow-checked before slicing.
  • String handling: src/evaluator/types/string.rs validates offset < buffer.len() before slicing, bounds search_len with min(max_len, remaining_buffer.len()), and only slices within validated bounds.
  • Unsafe/panics in library code: no new unsafe, unwrap(), or expect() in non-test library paths touched by this PR.
  • Dependency risk: no Cargo.toml/Cargo.lock changes in this diff; CI check audit is passing for this PR.
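
The string-handling checks described above (validate the offset, clamp the scan length, then slice) can be sketched as follows. The name `read_c_string` and the `Option` return type are assumptions for the example; the crate's `read_string` returns its own value and error types.

```rust
/// Reads a NUL-terminated string starting at `offset`.
/// Returns `None` when `offset` is not inside the buffer.
pub fn read_c_string(buffer: &[u8], offset: usize, max_length: Option<usize>) -> Option<String> {
    // Validate the offset first so the slice below cannot panic.
    if offset >= buffer.len() {
        return None;
    }
    let remaining = &buffer[offset..];
    // Clamp the scan window to the caller's limit or the end of the buffer,
    // whichever is smaller.
    let search_len = max_length.map_or(remaining.len(), |max| max.min(remaining.len()));
    let window = &remaining[..search_len];
    // Stop at the first NUL inside the window, or use the whole window.
    let end = window.iter().position(|&b| b == 0).unwrap_or(window.len());
    // Lossy conversion: invalid UTF-8 becomes U+FFFD instead of an error.
    Some(String::from_utf8_lossy(&window[..end]).into_owned())
}
```

Since the scan never looks past `search_len` bytes, the work done per call is bounded by the smaller of `max_length` and the remaining buffer, which matches the resource-exhaustion note in the review.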

Residual risk / notes

  • This change is a module split/refactor; behavior appears equivalent to the previous types.rs implementation. I did not find newly introduced resource-exhaustion or info-leak paths in the changed code.
  • Local cargo audit / cargo deny execution was not possible in this runner because cargo-audit and cargo-deny are not installed, so dependency posture was validated via PR CI status instead.

@mergify
Contributor

mergify Bot commented Mar 6, 2026

🧪 CI Insights

Here's what we observed from your CI run for eb0aac2.

🟢 All jobs passed!

But CI Insights is watching 👀

@cursor cursor Bot left a comment

Stale comment

Automated Risk Assessment

Risk level: Medium

Evidence from diff

  • Core library evaluator path changed substantially: src/evaluator/types.rs removed and replaced with src/evaluator/types/{mod.rs,numeric.rs,string.rs,tests.rs}.
  • Large structural refactor footprint (about 3.3k changed lines, shown as deletions and additions because of the file split), which increases the regression surface even when the intent is organizational.
  • Production logic in key functions (read_byte, read_short, read_long, read_quad, read_string, read_typed_value, coerce_value_to_type) appears behaviorally equivalent after split; observed differences are predominantly docs/comments/module organization.
  • Targeted test validation run for evaluator types passed (65 passed, 0 failed).

Decision

  • Code review required (Medium risk due to the refactor of a core shared module and the broad structural change).
  • Requested reviewer: @param-jasani.
  • No approval posted per policy for Medium risk.

@cursor cursor Bot left a comment

Stale comment

Follow-up on reviewer assignment: automated request could not be completed because the candidate was not a repository collaborator. I could not assign a non-author reviewer automatically in this run; maintainer action is needed to request an eligible reviewer.



- Fix inaccurate read_string docs that claimed errors for invalid UTF-8
  (actually uses from_utf8_lossy replacement)
- Restore # Arguments doc sections on numeric read functions
- Add checked_add overflow tests for read_short/read_long/read_quad
- Clarify UnsupportedType variant is reserved for future types

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
cursor[bot]
cursor Bot previously approved these changes Mar 6, 2026
@cursor cursor Bot left a comment

Stale comment

Automated Risk Assessment

Risk level: Low

This PR is a structural refactor of evaluator::types (file split into submodules) with no material behavior change detected in production logic.

Evidence checked:

  • Core functions (read_byte, read_short, read_long, read_quad, read_string, read_typed_value, coerce_value_to_type) remain behaviorally equivalent after reorganization.
  • Changes are primarily module layout/docs/tests movement.
  • Targeted validation passed: cargo test evaluator::types --lib (65 passed, 0 failed).

Per policy for Low risk, this PR is approved.



@unclesp1d3r unclesp1d3r added the lgtm Approved for the merge queue label Mar 6, 2026
@unclesp1d3r unclesp1d3r self-assigned this Mar 6, 2026
Copilot AI review requested due to automatic review settings March 6, 2026 22:23
@unclesp1d3r unclesp1d3r review requested due to automatic review settings March 6, 2026 22:23
@cursor cursor Bot left a comment

No high-confidence vulnerabilities found in this refactor.

Security-focused review results:

  • Buffer safety: Numeric readers continue using checked_add + slice .get(offset..end) before decoding; byte reads use .get(offset). String reads validate offset < buffer.len() before slicing and clamp reads with min(max_len, remaining.len()).
  • Integer safety: Offset arithmetic for fixed-width numeric reads still uses checked arithmetic, preventing overflow/underflow-driven OOB access.
  • Unsafe/panics in library code: No new unsafe blocks, raw pointers, unwrap(), or expect() introduced in production paths.
  • DoS/resource exhaustion: No new unbounded recursion or allocation patterns were introduced by this PR; behavior appears equivalent to pre-refactor implementation.
  • Dependency risk: PR diff does not modify Cargo.toml/Cargo.lock; no new dependency attack surface introduced.

Validation run:

  • cargo test --lib evaluator::types passed (68 tests).
  • cargo audit / cargo deny could not be executed in this runner because the corresponding cargo subcommands are not installed (no such command: audit/deny). Given no dependency changes in this PR, this does not indicate a new PR-local supply-chain risk, but CI should continue enforcing these checks.

@cursor cursor Bot left a comment

Automated Risk Assessment

Risk level: Low

Evidence from actual diff

  • Changed paths are isolated to src/evaluator/types* and consist of splitting types.rs into types/mod.rs, types/numeric.rs, types/string.rs, and types/tests.rs.
  • Core production functions were compared across base/head and remain behaviorally equivalent (read_byte, read_short, read_long, read_quad, read_string, read_typed_value, coerce_value_to_type).
  • No dependency, infra, auth, permissions, schema, or cross-system changes.
  • Focused validation passed on current head: cargo test evaluator::types --lib (68 passed, 0 failed).

Decision

  • Code review required: No (Low risk under policy).
  • Reviewer assignment: None requested.
  • Approval action: No new approval posted because this PR is already approved on the current head commit (3f031417690f9d7f5201e52b3bc21978ece49411).
  • Re-approval/unapproval check: No risk increase detected versus current approved head, so no dismissal/revocation needed.


@unclesp1d3r unclesp1d3r enabled auto-merge (squash) March 6, 2026 22:25
@unclesp1d3r unclesp1d3r merged commit 1745507 into main Mar 6, 2026
27 of 28 checks passed
@unclesp1d3r unclesp1d3r deleted the 63-refactor-convert-evaluatortypesrs-to-a-types-directory-module branch March 6, 2026 22:27
@github-actions github-actions Bot mentioned this pull request Mar 6, 2026
mergify Bot pushed a commit that referenced this pull request Mar 6, 2026
## 🤖 New release

* `libmagic-rs`: 0.4.1 -> 0.4.2 (✓ API compatible changes)

<details><summary><i><b>Changelog</b></i></summary><p>

<blockquote>

## [0.4.2] - 2026-03-06

### Refactor

- **evaluator**: Reorganize types module into submodules
([#156](#156))
<!-- generated by git-cliff -->
</blockquote>


</p></details>

---
This PR was generated with
[release-plz](https://github.com/release-plz/release-plz/).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@coderabbitai coderabbitai Bot mentioned this pull request Apr 10, 2026
6 tasks

Labels

  • enhancement: New feature or request
  • evaluator: Rule evaluation engine and logic
  • lgtm: Approved for the merge queue
  • size:XXL: This PR changes 1000+ lines, ignoring generated files.
  • testing: Test infrastructure and coverage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

refactor: convert evaluator/types.rs to a types/ directory module

2 participants