
Test infrastructure, compatibility tests, and architecture improvements#31

Merged
unclesp1d3r merged 19 commits into main from 22-test-infrastructure-compatibility-tests-coverage on Feb 12, 2026


@unclesp1d3r
Member

@unclesp1d3r unclesp1d3r commented Feb 7, 2026

Summary

  • Add comprehensive integration test suites for evaluator, MIME, tags, property tests, and end-to-end flows (~940 tests total)
  • Add benchmark infrastructure (parser, evaluation, I/O) with CI workflow for regression tracking
  • Fix CI workflows for benchmarks and compatibility tests
  • Refactor architecture based on review: eliminate redundant work in evaluator, fix lossy error conversion, add proper error variants, cross-type integer coercion, deduplicate result-building logic, cache TagExtractor
  • Optimize evaluation pipeline: SIMD null scanning via memchr, remove redundant bounds checks, release profile with LTO
  • Simplify code: idiomatic then_some(), reuse build_result, remove library eprintln calls
  • Split parser module into focused submodules (format, hierarchy, loader, preprocessing)
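
The "cache TagExtractor" item above refers to building the keyword set once rather than on every call. A minimal sketch of that idea with `std::sync::LazyLock` follows; the keyword list and the `extract_tags` function are hypothetical stand-ins, not the project's actual `TagExtractor` API.

```rust
use std::collections::HashSet;
use std::sync::LazyLock;

// Hypothetical keyword set, built on first use and then shared by every call.
// The real TagExtractor's keywords are not shown in this PR.
static KEYWORDS: LazyLock<HashSet<&'static str>> = LazyLock::new(|| {
    ["executable", "archive", "image", "text"].into_iter().collect()
});

/// Returns the lowercased words of `description` that appear in KEYWORDS.
fn extract_tags(description: &str) -> Vec<String> {
    description
        .split(|c: char| !c.is_alphanumeric())
        .map(str::to_lowercase)
        .filter(|word| KEYWORDS.contains(word.as_str()))
        .collect()
}
```

Because the static is initialized lazily, callers that never extract tags pay nothing, and repeated calls avoid reallocating the `HashSet`.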

Test plan

  • just ci-check passes (940 tests, clippy, fmt, cargo check)
  • All pre-commit hooks pass
  • Benchmarks compile and run
  • Doc tests pass
  • CI workflows pass on GitHub Actions

🤖 Generated with Claude Code

unclesp1d3r and others added 2 commits February 6, 2026 22:16
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Add test infrastructure components for Phase 1 MVP completion:

Benchmarks (Criterion):
- parser_bench.rs: Magic file parsing performance
- evaluation_bench.rs: Rule evaluation against file types
- io_bench.rs: Memory-mapped I/O operations

Property Tests (proptest):
- 15 property tests for serialization, evaluation, and safety
- Tests for ELF/ZIP detection, config validation, buffer handling

CI/CD Enhancements:
- Enable compatibility test workflow (remove if: false)
- Add benchmarks.yml for weekly runs and PR comparisons
- Add coverage-report and coverage-summary recipes to justfile

Documentation:
- Update testing.md with current implementation status
- Add benchmark documentation section

Closes #22

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@unclesp1d3r unclesp1d3r linked an issue Feb 7, 2026 that may be closed by this pull request
10 tasks
@dosubot dosubot Bot added the size:XXL label (This PR changes 1000+ lines, ignoring generated files.) Feb 7, 2026

coderabbitai Bot commented Feb 7, 2026

Caution

Review failed

Failed to post review comments

Summary by CodeRabbit

  • New Features

    • CI benchmarks (scheduled/PR/manual) and many local benchmark targets; large new integration, property, and unit test suites.
  • Documentation

    • Expanded testing docs: benchmarks, benchmark CI comparisons, fuzzing, compatibility, and developer patterns.
  • Performance

    • Faster evaluation with periodic timeout checks and cached MIME lookups.
  • API Changes

    • Evaluation APIs now borrow configuration and single-rule checks return optional match payloads; new error variants for config/file issues.
  • UX / CLI

    • Streamlined JSON output, clearer load-error messages, and distinct exit codes for config/file failures.
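
The "periodic timeout checks" item under Performance can be illustrated with a small sketch. It assumes an interval of 16 rules and a `trailing_zeros()` bit test, as the commit messages in this PR describe; the function names and the placeholder loop body are hypothetical, not the evaluator's real code.

```rust
use std::time::{Duration, Instant};

/// Interval between clock reads; a power of two so the check is a bit test.
const CHECK_INTERVAL: usize = 16;

/// True when `index` is a multiple of CHECK_INTERVAL (0, 16, 32, ...).
fn is_check_point(index: usize) -> bool {
    index.trailing_zeros() >= CHECK_INTERVAL.trailing_zeros()
}

/// Walks `rule_count` hypothetical rules, reading the clock only at check points.
/// Returns Ok(count) on completion or Err(index) where the timeout was noticed.
fn evaluate_with_timeout(rule_count: usize, timeout: Duration) -> Result<usize, usize> {
    let start = Instant::now();
    for index in 0..rule_count {
        if is_check_point(index) && start.elapsed() > timeout {
            return Err(index);
        }
        // A real evaluator would evaluate rule `index` here.
    }
    Ok(rule_count)
}
```

The trade-off is that a timeout may be detected up to 15 rules late, in exchange for far fewer `Instant::now()` calls on large rule sets.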

Walkthrough

Splits the parser into format/preprocessing/hierarchy/loader modules, adds Criterion benchmarks and a Benchmarks CI workflow with PR-vs-main comparison, changes evaluator APIs to Option-based matches and borrowed EvaluationConfig, adds MIME caching and output conversion helpers, and introduces extensive tests, docs, and tooling updates.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Workflows & CI | .github/workflows/benchmarks.yml, .github/workflows/compatibility.yml | Adds a scheduled/manual/PR Benchmarks workflow with PR-vs-main comparison and a compatibility workflow matrix with caching, verification, and artifact uploads. |
| Benchmarks & Manifest | benches/parser_bench.rs, benches/evaluation_bench.rs, benches/io_bench.rs, Cargo.toml | Adds three Criterion benchmark suites, registers benches in Cargo.toml, bumps dev-deps, and adjusts release/dist profiles. |
| Parser refactor & new modules | src/parser/mod.rs, src/parser/format.rs, src/parser/preprocessing.rs, src/parser/hierarchy.rs, src/parser/loader.rs | Reorganizes parser into new modules (format, preprocessing, hierarchy, loader), moves/exports detect_format/load_magic_file/load_magic_directory/parse_text_magic_file, implements line preprocessing and hierarchy builder, and adds comprehensive tests. |
| Evaluator & types | src/evaluator/mod.rs, src/evaluator/operators.rs, src/evaluator/types.rs | Changes evaluate_single_rule to return Option<(offset, Value)>, evaluate_rules_with_config now takes &EvaluationConfig and uses synchronous evaluation with periodic timeout checks; adds cross-type integer coercion and memchr-based string reads. |
| Library surface & errors | src/lib.rs, src/error.rs | Adds LibmagicError::ConfigError{reason} and LibmagicError::FileError(String); MagicDatabase now caches a mime_mapper field and initializers updated; From<IoError> maps to FileError. |
| Output & CLI | src/output/mod.rs, src/main.rs | Adds conversion helpers (from_evaluator_match, from_library_result) and a lazy TagExtractor; CLI uses EvaluationResult::from_library_result and maps new error variants to exit codes. |
| Parser helpers & loader tests | src/parser/..., tests/* | Adds LineInfo, preprocess_lines, parse_magic_rule_line, deterministic directory loader, and many unit tests for parsing/loader edge cases and error paths. |
| Tests (property, integration, unit, mime, tags) | tests/property_tests.rs, tests/integration_tests.rs, tests/evaluator_tests.rs, tests/mime_tests.rs, tests/tags_tests.rs, tests/compatibility_tests.rs | Adds property-based tests (proptest), broad integration suites, evaluator/mime/tag tests, and updates compatibility tests to use --use-builtin. |
| Parser detection | src/parser/format.rs | Adds MagicFileFormat enum and detect_format(Path) detecting Text/Directory/Binary formats with unit tests. |
| Docs & Tooling | docs/src/testing.md, docs/src/evaluator.md, docs/src/getting-started.md, justfile, mise.toml, .gitignore | Docs updated to document benchmarks, testing, and evaluate_rules signature changes (&EvaluationConfig); justfile adds coverage-report/summary targets; mise adds tools; .gitignore updated. |
| Internal guidance | .claude/..., .claude/skills/SKILL.md | Adds multiple project guidance documents (AST derive, error-handling, testing conventions, co-change awareness, strict Clippy, just usage, skills). |
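
The "Evaluator & types" change, where a single-rule check returns an optional match payload instead of a bool, might look roughly like the sketch below. The `Rule` and `Value` types here are simplified, hypothetical stand-ins; only the `Option<(usize, Value)>` return shape and the use of `then_some()` come from the PR text.

```rust
// Simplified stand-in for the evaluator's value type (assumption).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Bytes(Vec<u8>),
}

// Hypothetical rule: expect `expected` bytes at `offset`.
struct Rule {
    offset: usize,
    expected: Vec<u8>,
}

/// Returns the resolved offset and the value read when the rule matches,
/// so the caller does not have to resolve them a second time.
fn evaluate_single_rule(rule: &Rule, buffer: &[u8]) -> Option<(usize, Value)> {
    let end = rule.offset.checked_add(rule.expected.len())?;
    // `get` performs the bounds check; out-of-range reads yield None.
    let actual = buffer.get(rule.offset..end)?;
    // then_some builds its argument eagerly and keeps it only on a match.
    (actual == rule.expected.as_slice()).then_some((rule.offset, Value::Bytes(actual.to_vec())))
}
```

A caller can then write `if let Some((offset, value)) = evaluate_single_rule(&rule, buffer)` and build its result directly from the payload.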

Sequence Diagram(s)

sequenceDiagram
    participant PR as Pull Request
    participant GH as GitHub Actions
    participant Runner as Runner (ubuntu-latest)
    participant CheckoutPR as Checkout PR
    participant Setup as Setup Tools & Cache
    participant BenchPR as Run benches (baseline=pr)
    participant UploadPR as Upload PR artifact
    participant CheckoutMain as Checkout main
    participant BenchMain as Run benches (baseline=main)
    participant Critcmp as critcmp Compare & Report

    PR->>GH: trigger (PR update / manual / schedule)
    GH->>Runner: start benchmark-compare job
    Runner->>CheckoutPR: checkout PR branch (fetch-depth: 0)
    Runner->>Setup: install tools (mise), restore caches
    Runner->>BenchPR: run `cargo bench` (save baseline=pr)
    BenchPR->>UploadPR: upload `target/criterion/` (pr)
    Runner->>CheckoutMain: checkout main branch
    Runner->>Setup: ensure tools & caches
    Runner->>BenchMain: run `cargo bench` (save baseline=main)
    BenchMain->>Critcmp: run `critcmp` to compare baselines
    Critcmp->>GH: append formatted comparison to PR summary

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Poem

🐇 I hopped through benches at Sunday dawn,

I stacked the rules and kept the errors gone,
MIME maps cached, timeouts checked in line,
Tests sprouted carrots, parser roots align,
CI hums softly — cozy, quick, and fine.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The pull request title accurately summarizes the main changes: test infrastructure additions, compatibility test improvements, and architecture refactoring with clear focus on the three primary areas of work. |
| Description check | ✅ Passed | The pull request description is directly related to the changeset, providing a well-organized summary of test additions, benchmark infrastructure, CI fixes, architectural refactoring, performance optimizations, and code cleanups with a clear test plan. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


@dosubot dosubot Bot added the following labels Feb 7, 2026: compatibility (libmagic compatibility and migration), documentation (Improvements or additions to documentation), evaluator (Rule evaluation engine and logic), io (File I/O and memory mapping), parser (Magic file parsing components and grammar), performance (Performance optimizations and benchmarks), testing (Test infrastructure and coverage)
@coderabbitai coderabbitai Bot added the enhancement label (New feature or request) Feb 7, 2026
@codecov

codecov Bot commented Feb 7, 2026

Generated SKILL.md with project patterns (architecture, co-change maps,
clippy config, error handling, testing conventions) and 8 instinct files
for strict clippy compliance, error handling, co-change awareness,
testing, AST derives, build script testing, just tasks, and commits.
chore(deps): reorder dependencies in mise.toml for clarity

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
@unclesp1d3r unclesp1d3r self-assigned this Feb 11, 2026
- clap: 4.5.54 -> 4.5.57
- criterion: 0.8.1 -> 0.8.2
- insta: 1.46.1 -> 1.46.3
- proptest: 1.9.0 -> 1.10.0
- regex: 1.12.2 -> 1.12.3
- tempfile: 3.24.0 -> 3.25.0
- Remove temp-env (unused, no imports in codebase)
Remove 10 tests that were not providing additional coverage:
- 5 fake proptests using unused _seed parameter (converted 4 to regular
  #[test], removed 1 covered by prop_evaluation_never_panics)
- 4 individual serde roundtrip tests already covered by
  prop_rule_serde_roundtrip
- 2 random-prefix tests that are subsets of prop_evaluation_never_panics

346 -> 218 lines, 868 tests still pass.
- Cache MimeMapper on MagicDatabase to avoid rebuilding HashMap per evaluation
- Pass EvaluationConfig by reference to eliminate clones at call sites
- Remove thread-spawn timeout path (per-rule timeout check is sufficient)
- Replace strip_suffix().to_string() with pop() for continuation lines
- Track last line number during parsing instead of re-scanning on error
- Pre-allocate concatenate_messages String with exact capacity
- Use smaller initial Vec capacity for matches (8 vs rules.len())
- Check timeout every 16 rules instead of every rule
- Fix lossy IoError conversion: preserve ErrorKind and error chain by
  wrapping full IoError as source via Error::new(kind, err) instead of
  flattening to string with Error::other(err.to_string())
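
The difference between the two conversions can be seen with `std::io::Error` alone. This is a minimal sketch with hypothetical helper names, not the project's actual `From` implementation:

```rust
use std::io::{Error, ErrorKind};

/// Lossy: the message survives, but the kind collapses to ErrorKind::Other
/// and the original error object is gone.
fn flatten_lossy(err: Error) -> Error {
    Error::other(err.to_string())
}

/// Preserving: the outer error keeps the original kind and carries the
/// original error as its inner payload.
fn wrap_preserving(err: Error) -> Error {
    Error::new(err.kind(), err)
}
```

With the preserving form, callers can still branch on `kind()` (for example `ErrorKind::NotFound`) and reach the wrapped error through `get_ref()`.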

- Add output conversion methods: MatchResult::from_evaluator_match() and
  EvaluationResult::from_library_result() formalize the adapter pattern,
  removing ~40 lines of manual conversion boilerplate from main.rs

- Split parser/mod.rs (2098 lines) into focused submodules:
  - format.rs: MagicFileFormat enum, detect_format()
  - preprocessing.rs: LineInfo, preprocess_lines(), parse_magic_rule_line()
  - hierarchy.rs: build_rule_hierarchy()
  - loader.rs: load_magic_directory(), load_magic_file()
  - mod.rs: public interface, re-exports, parse_text_magic_file()
Remove --noplot flag from criterion bench invocations (removed in
criterion 0.8.2). Switch compatibility tests from --magic-file with
binary .mgc to --use-builtin, since the parser rejects binary magic
files by design.
…tags, and end-to-end flows

Add 72 new integration tests across 4 test files:
- evaluator_tests.rs (21 tests): confidence calculation, rule ordering,
  config variants, metadata, edge cases
- mime_tests.rs (15 tests): MIME type detection via MagicDatabase API,
  MimeMapper direct tests for executables, archives, images, documents
- tags_tests.rs (13 tests): keyword extraction, case insensitivity,
  custom keywords, rule path tags
- integration_tests.rs (23 tests): end-to-end flows including builtin
  rules, custom magic files, hierarchy evaluation, directory loading,
  file evaluation, metadata, and error cases

Total test count: 940 (all passing)
Coverage: 93.86% (target: >85%)
The parser requires quoted strings for string-type values. Unquoted
bare words like `test` are ambiguous and fail with a parse error.
Remove the --noplot flag, which was removed in criterion 0.8.2. Use explicit --bench
targets for benchmark comparison to avoid passing --save-baseline to
the lib unit test binary, which does not recognize criterion flags.
Fix unquoted string value in parser benchmark that fails to parse.
The benchmark comparison job checks out main to run benchmarks for
comparison. If main doesn't have the bench targets yet, the step now
gracefully skips the comparison instead of failing the workflow.
Also use explicit bench targets in the run job for consistency.
@unclesp1d3r unclesp1d3r enabled auto-merge (squash) February 12, 2026 00:48
The byteorder crate is actively used in src/evaluator/types.rs for
endianness-aware binary data reading. The machete ignore was added
when the crate was planned but not yet used; it is no longer needed.
- Eliminate redundant offset/value resolution in evaluator by returning
  Option<(usize, Value)> from evaluate_single_rule instead of bool
- Fix lossy IoError conversion with FileError variant preserving context
- Remove eprintln! calls and dead code from library evaluator
- Add cross-type integer coercion (Uint/Int) in operators via i128
- Add ConfigError variant for config validation instead of misusing ParseError
- Extract build_result method to deduplicate result construction logic
- Cache TagExtractor with LazyLock to avoid per-call HashSet allocation
- Remove redundant double bounds check in read_byte (manual check + .get())
- Use trailing_zeros() for timeout interval check instead of modulo
- Add memchr for SIMD-accelerated null terminator scanning in read_string
- Simplify duplicate try_from in BitwiseAndMask operator to single call
- Add release profile with lto="thin" and codegen-units=1
- Replace duplicated empty-rules early returns with build_result reuse
- Use then_some() for idiomatic match-to-option conversion
- Remove debug-only eprintln from set_confidence (library code)
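
The cross-type integer coercion item above can be sketched as follows. Widening both operands to `i128` lets a `u64` and an `i64` be compared without wraparound, since `i128` represents every value of both types. The `Value` enum and `compare_integers` function are simplified, hypothetical stand-ins for the operator code.

```rust
use std::cmp::Ordering;

// Simplified stand-in for the evaluator's integer values (assumption).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Value {
    Uint(u64),
    Int(i64),
}

/// Widens either variant to i128; both conversions are lossless.
fn widen(value: Value) -> i128 {
    match value {
        Value::Uint(u) => i128::from(u),
        Value::Int(i) => i128::from(i),
    }
}

/// Compares two integer values regardless of signedness.
fn compare_integers(left: Value, right: Value) -> Ordering {
    widen(left).cmp(&widen(right))
}
```

A naive `as i64` cast would treat `u64::MAX` as `-1`; going through `i128` keeps it strictly greater than any `i64`.
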
@unclesp1d3r unclesp1d3r changed the title from "feat: implement comprehensive test infrastructure" to "Test infrastructure, compatibility tests, and architecture improvements" Feb 12, 2026
- Add rustdoc examples to ConfigError and FileError variants
- Fix incorrect error mapping in format.rs: use ParseError::IoError
  instead of ParseError::invalid_syntax for I/O read failures

Labels

  • compatibility: libmagic compatibility and migration
  • documentation: Improvements or additions to documentation
  • enhancement: New feature or request
  • evaluator: Rule evaluation engine and logic
  • io: File I/O and memory mapping
  • parser: Magic file parsing components and grammar
  • performance: Performance optimizations and benchmarks
  • size:XXL: This PR changes 1000+ lines, ignoring generated files.
  • testing: Test infrastructure and coverage


Development

Successfully merging this pull request may close these issues.

Test Infrastructure: Compatibility Tests & Coverage

1 participant