
feat: Create rmagic cli#7

Merged
unclesp1d3r merged 30 commits into main from create_rmagic_cli on Dec 15, 2025

@unclesp1d3r
Member

This pull request introduces compatibility testing infrastructure, updates CI/CD workflows, and refines error handling and test file management for the project. The most significant changes include adding a GitHub Actions workflow for compatibility tests, updating the error handling model to use more structured error types, and marking key implementation milestones as complete. The changes also improve the handling of test and result files across the codebase and update development dependencies to support enhanced testing.

Compatibility Testing Infrastructure

  • Added .github/workflows/compatibility.yml to run compatibility tests across Ubuntu, Windows, and macOS, including scheduled nightly runs and verification of compatibility test files.
  • Updated justfile with new recipes for verifying compatibility test files and running compatibility tests, including platform-specific checks and integration with CI. Added ci-check-compatibility for full CI parity including compatibility tests. [1] [2]

Error Handling Improvements

  • Refactored error types in src/lib.rs to use the new LibmagicError, EvaluationError, and ParseError enums from src/error.rs, replacing previous string-based error handling. Improved conversion from IO errors and updated error propagation in offset resolution logic. [1] [2] [3]
  • Enhanced error propagation and test assertions in src/evaluator/offset.rs to use structured error enums, improving clarity and robustness of error handling in offset resolution and related tests. [1] [2] [3]

Project Management and Documentation

  • Marked major implementation milestones as completed in .kiro/specs/rust-libmagic-implementation/tasks.md, including CLI argument structure, file processing, JSON output, string type support, and error types.

Test and Result File Management

  • Updated .gitattributes, .prettierignore, .mdformat.toml to treat *.result and *.testfile as binary files and exclude them from formatting and prettification. [1] [2] [3]
  • Added minimal magic files for testing and CI/CD environments: missing.magic and nonexistent.magic. [1] [2]

Development Dependency Updates

  • Added insta, tempfile, and updated nix as dev-dependencies in Cargo.toml to support snapshot testing and cross-platform file operations.

These changes collectively improve the project's reliability, test coverage, and maintainability, especially in cross-platform and CI/CD scenarios.

- Add comprehensive CLI argument structure using clap derive macros
- Implement platform-specific magic file path resolution
- Add support for JSON and text output formats
- Create `Args` struct with flexible configuration options
- Add default magic file path detection for Unix and Windows
- Implement output format selection with `OutputFormat` enum
- Add basic error handling and file existence validation
- Include initial test coverage for CLI argument parsing
Enhances the CLI tool's usability and provides a flexible interface for file type identification across different platforms.

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
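
For readers skimming the thread, the platform-specific magic file resolution and output format selection described in the commit above might look roughly like the sketch below. It uses only the standard library instead of clap, and the function names, variant names, and both default paths are assumptions for illustration, not the values used in `src/main.rs`.

```rust
use std::path::PathBuf;

/// Output formats named in the commit message; the variant names are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OutputFormat {
    Text,
    Json,
}

/// Pick the output format from a `--json` style flag, defaulting to text.
pub fn select_format(json_flag: bool) -> OutputFormat {
    if json_flag {
        OutputFormat::Json
    } else {
        OutputFormat::Text
    }
}

/// Resolve the magic file path: an explicit path wins, otherwise fall back
/// to a per-platform default. Both defaults below are hypothetical.
pub fn resolve_magic_path(explicit: Option<PathBuf>) -> PathBuf {
    if let Some(path) = explicit {
        return path;
    }
    if cfg!(windows) {
        // Hypothetical Windows location.
        PathBuf::from(r"C:\ProgramData\rmagic\magic")
    } else {
        // A common libmagic location on Unix, assumed here.
        PathBuf::from("/usr/share/misc/magic")
    }
}
```

Calling `resolve_magic_path(None)` yields the platform default, while `resolve_magic_path(Some(path))` returns the caller's path unchanged, so a user-supplied magic file always takes precedence.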
- Introduced a new compatibility testing suite to ensure that libmagic-rs produces identical results to the original libmagic implementation.
- Added a `tests/compatibility_tests.rs` file to run tests against the original test files from the file/file repository.
- Created a `justfile` with commands for initializing and running compatibility tests, including CI integration for automated testing on push and pull request events.
- Implemented a GitHub Actions workflow in `compatibility.yml` to run compatibility tests across multiple platforms, ensuring consistent results.
- Added a `.gitmodules` file to manage the file-tests submodule for compatibility tests.

These enhancements improve the testing infrastructure, ensuring robust validation of the library's functionality against established benchmarks.

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
…nality

- Introduced basic magic files (`missing.magic` and `nonexistent.magic`) for testing and CI/CD environments, containing signatures for various file types including ELF, PE, ZIP, JPEG, PNG, GIF, PDF, and common text patterns.
- Updated the `justfile` to include specific actionlint commands for various GitHub workflows, ensuring comprehensive linting coverage.
- Enhanced the CLI in `main.rs` to check for the existence of the magic file and attempt to download it if missing, providing a fallback mechanism for CI/CD environments.
- Implemented a new function `download_magic_files` to create a basic magic file when none exists, improving usability in testing scenarios.

These changes improve the testing infrastructure and ensure robust functionality for file type detection in various environments.

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
- Implemented comprehensive error handling in the CLI, providing user-friendly messages for various error scenarios, including file not found, permission denied, and invalid input.
- Added validation functions to ensure command-line arguments are correctly formatted and accessible before processing.
- Updated the `run_analysis` function to utilize the new error handling mechanism, improving the robustness of the CLI tool.
- Introduced new tests to validate error handling and argument validation, ensuring consistent behavior across different input scenarios.

These enhancements improve the user experience by providing clearer feedback and ensuring that the CLI operates reliably under various conditions.

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
- Added NOTICE and COPYING from `github.com/file/file` to provide proper attribution for test files.
- Refactored various third-party test files and configurations to ensure consistency and maintainability.
- Updated `.gitattributes`, `.mdformat.toml`, and `.prettierignore` to prevent alterations to the test files.
- Cleaned up and organized test results and test files across multiple formats to enhance clarity and usability.

These changes contribute to a more organized testing framework and improve the overall structure of third-party resources.

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
- Introduced a new binary file `magic.mgc` in the `third_party` directory to support file type detection testing.
- This addition enhances the testing framework by providing a standard magic file for validating the functionality of the library.

These changes contribute to a more robust testing environment and ensure accurate file type detection capabilities.

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
…ification process

- Deleted the `.gitmodules` entry for the `file-tests` submodule, streamlining the project structure.
- Updated the `justfile` to replace the submodule initialization command with a verification command that checks for the existence of compatibility test files in the `third_party` directory.
- Adjusted the GitHub Actions workflow to reflect the new verification process for compatibility test files, ensuring that the necessary files are available before proceeding with builds.

These changes simplify the setup process for compatibility tests and enhance the reliability of the testing framework.

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
- Implement JSON output formatting with JsonMatchResult struct
- Add JSON output integration in main.rs with --json flag handling
- Create src/output/json.rs to define JSON serialization structure
- Update tasks.md to mark JSON output implementation tasks as completed
- Add tempfile dependency in Cargo.toml for testing
- Convert evaluation results to structured JSON match results
- Enhance CLI output routing to support JSON format
- Provide machine-readable file type detection results
Closes #11 (JSON match result implementation)

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
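
The JSON output commit above does not show how matched values are rendered. As a small illustration of the hex formatting mentioned elsewhere in this PR, the following std-only sketch turns raw bytes into a lowercase hex string with a pre-allocated buffer. The function name `format_hex` is hypothetical, and the real `src/output/json.rs` may format values differently.

```rust
use std::fmt::Write;

/// Render bytes as a lowercase hex string, e.g. [0x7f, 0x45] -> "7f45".
/// The name and exact output shape are assumptions for illustration.
pub fn format_hex(bytes: &[u8]) -> String {
    // Two hex digits per byte, so the final length is known up front.
    let mut out = String::with_capacity(bytes.len() * 2);
    for byte in bytes {
        // Writing into a String cannot fail, so the result is ignored.
        let _ = write!(out, "{:02x}", byte);
    }
    out
}
```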
…hecks

- Split `validate()` method into separate validation functions
- Add constants for safe thresholds in recursion depth, string length, and timeout
- Enhance error messages with more descriptive and context-specific details
- Separate concerns by creating individual validation methods for different aspects
- Improve error handling and prevent potential resource exhaustion scenarios
- Maintain consistent error reporting and validation logic
This refactoring improves the robustness and readability of configuration validation, making it easier to understand and maintain security checks.

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
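
The split of `validate()` into per-concern checks described above could be sketched as follows. The struct name, field names, and all three threshold values are assumptions chosen for illustration; the commit message does not state the actual constants.

```rust
// Hypothetical safety ceilings; the real values in the PR are not shown.
const MAX_SAFE_RECURSION_DEPTH: u32 = 1000;
const MAX_SAFE_STRING_LENGTH: usize = 1_048_576;
const MAX_SAFE_TIMEOUT_MS: u64 = 300_000;

#[derive(Debug, Clone)]
pub struct EvaluationConfig {
    pub max_recursion_depth: u32,
    pub max_string_length: usize,
    pub timeout_ms: Option<u64>,
}

impl EvaluationConfig {
    /// Run every check in turn, stopping at the first failure.
    pub fn validate(&self) -> Result<(), String> {
        self.validate_recursion_depth()?;
        self.validate_string_length()?;
        self.validate_timeout()
    }

    fn validate_recursion_depth(&self) -> Result<(), String> {
        if self.max_recursion_depth == 0 || self.max_recursion_depth > MAX_SAFE_RECURSION_DEPTH {
            return Err(format!(
                "recursion depth must be between 1 and {}",
                MAX_SAFE_RECURSION_DEPTH
            ));
        }
        Ok(())
    }

    fn validate_string_length(&self) -> Result<(), String> {
        if self.max_string_length == 0 || self.max_string_length > MAX_SAFE_STRING_LENGTH {
            return Err(format!(
                "string length must be between 1 and {}",
                MAX_SAFE_STRING_LENGTH
            ));
        }
        Ok(())
    }

    fn validate_timeout(&self) -> Result<(), String> {
        match self.timeout_ms {
            Some(ms) if ms == 0 || ms > MAX_SAFE_TIMEOUT_MS => Err(format!(
                "timeout must be between 1 and {} ms",
                MAX_SAFE_TIMEOUT_MS
            )),
            _ => Ok(()),
        }
    }
}
```

Keeping the limits in constants lets the error messages quote the same values the checks enforce, which matches the later commit about formatting messages with predefined constants.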
… and library

- Refactor error handling functions to use more concise, multi-line error messages
- Optimize error message generation with reduced string allocations
- Update error handling in main.rs to consolidate error printing logic
- Modify create_basic_magic_content() to use a const for magic file content
- Improve JSON output generation by using pre-allocated vector
- Standardize error message formatting across different error types
- Reduce redundant string formatting and improve readability of error messages
These changes enhance the error reporting mechanism, making error messages more consistent and slightly more memory-efficient while maintaining clear and informative output.

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
- Implement string type reading in evaluator with null-termination support
- Update test cases to validate string type evaluation
- Remove unsupported string type error handling
- Add test for matching and non-matching string rules
- Mark string type implementation tasks as completed in tasks.md

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
- Implement basic string type in AST
- Add string matching support with UTF-8 validation
- Extend read_typed_value function to handle String type
- Mark tasks 12, 12.1, and 12.3 as completed in implementation plan
- Prepare for advanced string parsing and validation

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
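
A minimal sketch of null-terminated string reading, as described in the two string commits above, is shown here. The signature is an assumption and may not match the real `read_string`. This sketch also substitutes lossy UTF-8 conversion for simplicity; strict validation would use `std::str::from_utf8` and return an error on invalid input.

```rust
/// Read a string starting at `offset`, stopping at the first NUL byte,
/// the end of the buffer, or after `max_len` bytes, whichever comes first.
/// Returns `None` when `offset` lies beyond the buffer.
pub fn read_string(buffer: &[u8], offset: usize, max_len: usize) -> Option<String> {
    // `get` avoids a panic when the offset is out of range.
    let tail = buffer.get(offset..)?;
    // Cap the window so a missing terminator cannot cause an unbounded read.
    let window = &tail[..tail.len().min(max_len)];
    let end = window.iter().position(|&b| b == 0).unwrap_or(window.len());
    // Lossy conversion is this sketch's choice, not necessarily the PR's.
    Some(String::from_utf8_lossy(&window[..end]).into_owned())
}
```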
…n configuration validation

- Update error messages to use dynamic formatting with constants
- Add more descriptive error messages for configuration validation
- Enhance error context in validation methods for recursion depth, string length, and timeout
- Improve test assertions to check for more specific error message details
- Remove TODO comments and replace with actual implementation improvements
- Add runtime context to error messages using predefined constants

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
…ator methods

- Replace manual vector building with iterator methods in JSON output functions
- Use `with_capacity()` to pre-allocate vectors for better performance
- Improve hex byte parsing in grammar parser with direct digit conversion
- Remove unnecessary `format!()` allocations in hex parsing
- Reduce memory allocations and improve parsing efficiency
These changes focus on reducing memory allocations and improving performance in key parsing and output generation functions.

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
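
To illustrate the kind of change this commit describes, here is a sketch of hex parsing that converts digits directly with `char::to_digit` and pre-allocates the output with `Vec::with_capacity`, avoiding any intermediate `format!()` strings. Both function names are hypothetical and are not taken from `src/parser/grammar.rs`.

```rust
/// Convert two hex digits into a byte without building a temporary string.
pub fn parse_hex_byte(hi: char, lo: char) -> Option<u8> {
    let hi = hi.to_digit(16)?;
    let lo = lo.to_digit(16)?;
    // Both digits are below 16, so the result always fits in a u8.
    Some((hi * 16 + lo) as u8)
}

/// Parse an even-length hex string such as "7f454c46" into bytes.
pub fn parse_hex_bytes(input: &str) -> Option<Vec<u8>> {
    if !input.is_ascii() || input.len() % 2 != 0 {
        return None;
    }
    // One output byte per two input characters, allocated once.
    let mut out = Vec::with_capacity(input.len() / 2);
    let mut chars = input.chars();
    while let (Some(hi), Some(lo)) = (chars.next(), chars.next()) {
        out.push(parse_hex_byte(hi, lo)?);
    }
    Some(out)
}
```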
- Remove unnecessary file patterns from hook trigger configuration
- Reduce scope of hook to only watch Rust source files
- Trim trailing newline for consistency
- Maintain core auto-fix agent prompt and functionality

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
- Create src/error.rs with detailed error types using thiserror
- Implement LibmagicError enum with variants for parsing, evaluation, and I/O errors
- Add detailed ParseError enum with specific error types for magic file parsing
- Create EvaluationError enum to handle rule evaluation and type reading errors
- Implement helper methods for creating specific error instances
- Mark task 13.1 as completed in tasks.md
- Provide comprehensive error reporting with context and line numbers

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
- Create EvaluationError enum in error.rs module
- Add variants for BufferOverrun, InvalidOffset, and UnsupportedType
- Mark task 13.2 as completed in implementation plan
- Improve error handling for runtime evaluation scenarios

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
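
The three variants named above can be sketched with the standard library alone. The PR itself uses `thiserror`; this version hand-writes `Display` to stay dependency-free. The field shapes, message wording, and the `check_offset` helper are assumptions for illustration.

```rust
use std::fmt;

/// Variant names come from the commit message; the fields are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum EvaluationError {
    BufferOverrun { offset: usize },
    InvalidOffset { offset: i64 },
    UnsupportedType { type_name: String },
}

impl fmt::Display for EvaluationError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            EvaluationError::BufferOverrun { offset } => {
                write!(f, "buffer overrun at offset {}", offset)
            }
            EvaluationError::InvalidOffset { offset } => {
                write!(f, "invalid offset: {}", offset)
            }
            EvaluationError::UnsupportedType { type_name } => {
                write!(f, "unsupported type: {}", type_name)
            }
        }
    }
}

impl std::error::Error for EvaluationError {}

/// Hypothetical helper: turn a signed offset into a checked buffer index.
pub fn check_offset(buffer_len: usize, offset: i64) -> Result<usize, EvaluationError> {
    if offset < 0 {
        return Err(EvaluationError::InvalidOffset { offset });
    }
    let index = offset as usize;
    if index >= buffer_len {
        return Err(EvaluationError::BufferOverrun { offset: index });
    }
    Ok(index)
}
```

Compared with string-based errors, callers and tests can match on a specific variant instead of comparing message text.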
- Add comprehensive error handling for individual rule evaluation
- Implement graceful degradation to skip problematic rules
- Log warning messages for skipped rules with specific error details
- Enhance error context in evaluate_rules and evaluate_single_rule functions
- Preserve evaluation flow by continuing processing after encountering rule-level errors
- Maintain critical error handling for timeout and recursion limit scenarios
- Update documentation to reflect new error handling approach

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
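
The skip-and-continue behaviour described in this commit might be sketched like this. It is a simplified stand-in for `evaluate_rules`: the error type, the `is_critical` helper, and the closure-based signature are all assumptions, and the real evaluator walks nested rules rather than a flat slice.

```rust
/// Simplified error type for this sketch only.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RuleError {
    BufferOverrun,
    Timeout,
    RecursionLimitExceeded,
}

impl RuleError {
    /// Timeouts and recursion limits abort the whole evaluation.
    fn is_critical(&self) -> bool {
        matches!(self, RuleError::Timeout | RuleError::RecursionLimitExceeded)
    }
}

/// Evaluate each rule with `eval`, skipping rules that fail with a
/// non-critical error and stopping early on a critical one.
pub fn evaluate_all<R, F>(rules: &[R], mut eval: F) -> Result<Vec<String>, RuleError>
where
    F: FnMut(&R) -> Result<Option<String>, RuleError>,
{
    let mut found = Vec::new();
    for rule in rules {
        match eval(rule) {
            Ok(Some(description)) => found.push(description),
            Ok(None) => {}
            Err(err) if err.is_critical() => return Err(err),
            // Log and move on so one bad rule does not hide later matches.
            Err(err) => eprintln!("warning: skipping rule: {:?}", err),
        }
    }
    Ok(found)
}
```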
@coderabbitai

coderabbitai Bot commented Oct 7, 2025

Caution

Review failed

Failed to post review comments

Summary by CodeRabbit

  • New Features

    • JSON output (--json) with structured formatting and hex value support; new masked bitwise matching and improved string reading (UTF‑8, null‑terminated).
    • CLI revamped with confirmable output modes and platform-aware magic-file resolution.
  • Testing

    • Large cross‑platform compatibility and CLI integration suites added; normalization helpers and CI job to run compatibility tests across OSes (scheduled and on PRs).
  • Improvements

    • Graceful per‑rule degradation, richer error reporting, and updated tooling/format ignore patterns.

✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

Adds compatibility CI/workflow and many libmagic test fixtures; introduces a public error module and structured errors; updates evaluator to gracefully skip non‑fatal rule errors and add string reading and BitwiseAndMask operator; implements clap-derive CLI with JSON/text output and JSON formatting; extensive tests and docs added/updated.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| CI & Automation | `/.github/workflows/compatibility.yml`, `/.github/workflows/ci.yml`, `/justfile`, `/.kiro/hooks/ci-auto-fix.kiro.hook`, `/.coderabbitai.yaml`, `/.pre-commit-config.yaml` | Add compatibility workflow and CI targets, update Codecov slug, narrow KiRo hook triggers, exclude third_party from code-review filter, and comment out mdformat hook. |
| Repo ignores & attributes | `/.gitattributes`, `/.mdformat.toml`, `/.prettierignore`, `/.gitignore` | Treat `**/*.result` and `**/*.testfile` as binary and add them to formatter/Prettier ignores; remove `test_files/` from `.gitignore`. |
| Errors & exports | `src/error.rs`, `src/lib.rs` | New public error module (LibmagicError, ParseError, EvaluationError) with From impls/constructors; re-export errors and add IoError conversion. |
| Evaluator & types | `src/evaluator/mod.rs`, `src/evaluator/offset.rs`, `src/evaluator/types.rs`, `src/evaluator/operators.rs` | Add evaluate_rules_with_config; map offset/type errors to structured EvaluationError; make per-rule failures non‑fatal (log-and-skip) except critical errors; add read_string and integrate into read_typed_value; implement BitwiseAndMask operator. |
| Parser & AST | `src/parser/grammar.rs`, `src/parser/ast.rs` | Expose parsing helpers (types, offsets, messages, full rules), support mixed hex/ASCII parsing; add Operator::BitwiseAndMask(u64) AST variant. |
| CLI & main | `src/main.rs` | Replace manual parsing with clap-derive Args and OutputFormat; implement argument validation, platform-aware magic-file resolution with CI/test fallback, centralized run_analysis and error mapping; add extensive CLI tests. |
| Output (JSON) | `src/output/json.rs`, `src/output/mod.rs` | New JsonMatchResult/JsonOutput types, converters, hex formatting, pretty/compact JSON formatters; export pub mod json; add debug-only confidence validation. |
| Tests & test utils | `tests/*`, `tests/common/mod.rs`, `tests/compatibility/README.md` | Add compatibility runner, CLI/JSON integration and normalization tests, insta snapshots, test helpers, and README. |
| Third‑party fixtures & notices | `third_party/COPYING`, `third_party/NOTICE.md`, `third_party/tests/*` | Add many canonical libmagic test fixtures (`*.testfile`, `*.result`, `.magic`, `.flags`, README) and license/notice files for compatibility testing. |
| Small CI/test magic files | `missing.magic`, `nonexistent.magic` | Add minimal magic rule files used as CI/test fallbacks. |
| Cargo / dev deps | `Cargo.toml` | Add dev-dependencies: insta (json feature), regex, temp-env, tempfile. |
| Docs | `docs/src/*.md` | Update docs to reference third_party paths, add CLI testing/snapshot guidance, and minor formatting/troubleshooting edits. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant CLI as rmagic (CLI)
  participant Args as Args/Clap
  participant FS as Filesystem
  participant DB as MagicDatabase
  participant Eval as Evaluator
  participant Out as Output(JSON/Text)

  User->>CLI: Invoke with [--json|--text] --file <path> [--magic-file]
  CLI->>Args: Parse arguments
  Args-->>CLI: Args{file, format, magic_path?}
  CLI->>FS: Validate input file / magic file (resolve defaults)
  alt Magic file missing (CI/test fallback)
    CLI->>FS: Create/download basic magic file
  end
  CLI->>DB: load_from_file(magic_path)
  DB-->>CLI: rules/config
  CLI->>Eval: evaluate_rules_with_config(rules, buffer, config)
  alt Non-critical errors
    Eval-->>Eval: Log & skip rule errors
  else Critical (timeout/recursion)
    Eval-->>CLI: Propagate LibmagicError
  end
  Eval-->>CLI: Vec<MatchResult>
  CLI->>Out: Render JSON or Text
  Out-->>User: Output
sequenceDiagram
  autonumber
  participant Eval as Evaluator
  participant Rule as Rule Walker
  participant Types as Type Reader
  participant Err as Error Module

  Eval->>Rule: Iterate rules
  Rule->>Rule: Resolve offsets
  alt Offset errors
    Rule->>Err: Map to EvaluationError::{BufferOverrun|InvalidOffset}
    Rule-->>Eval: Skip rule (log)
  else OK
    Rule->>Types: read_typed_value (incl. read_string)
    alt Type read error (non-critical)
      Types->>Err: TypeReadError -> EvaluationError
      Rule-->>Eval: Skip match (log)
    else Match + children
      Rule->>Rule: Evaluate children (depth++)
      alt Critical (Timeout/RecursionLimitExceeded)
        Rule-->>Eval: Propagate error
      else Non-critical child errors
        Rule-->>Eval: Skip child (log)
      end
    end
  end
  Eval-->>Caller: Results or error

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~70 minutes

Files/areas needing extra attention:

  • src/error.rs and From conversions (variant shapes and Display messages).
  • src/evaluator/* (graceful-degradation semantics, recursion/timeout propagation, and tests).
  • src/main.rs (argument parsing, magic-file resolution, CI/test fallback code paths).
  • tests/compatibility and the large addition of third_party fixtures (test harness expectations and licensing notices).

Poem

Hop hop, I parse and peek,
Through magic bytes I softly sneak.
Errors softened, rules let pass,
JSON shines in every glass.
Tests and CI clap their paws — hooray for this small rabbit's bash! 🥕✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ⚠️ Warning | The PR title 'feat: Create rmagic cli' is only partially related to the changeset. The changeset includes significant compatibility testing infrastructure, error handling refactoring, test file management, and CI/CD workflow updates. While a CLI is part of these changes, the title does not capture the main scope of the work. | Consider a more comprehensive title that reflects the major changes: 'feat: Add rmagic CLI with compatibility testing infrastructure and error handling refactoring' or similar, to better convey the full scope of the PR. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description check | ✅ Passed | The PR description is well-written and thoroughly documents all major changes including compatibility testing infrastructure, error handling improvements, test file management, and dependency updates. It clearly relates to the actual changeset across multiple subsystems. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added the enhancement, evaluator, output, and testing labels on Oct 7, 2025
- Change Codecov slug to match the repository name
- Update Rust toolchain version to 1.90
- Remove caching of cargo dependencies and replace with installation of the Just task runner

This update streamlines the CI process and ensures compatibility with the latest Rust toolchain.

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
- Add `regex` and `temp-env` dependencies for enhanced functionality.
- Update `read_long` documentation to clarify behavior regarding NUL characters.
- Introduce `BuildNoiseFilter` struct to filter out build noise in CLI integration tests.
- Refactor test file creation to use `NamedTempFile` for automatic cleanup.
- Normalize output in CLI integration tests to improve snapshot consistency.

These changes enhance the testing framework and clarify the behavior of string reading functions, contributing to overall code quality and maintainability.

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
@unclesp1d3r unclesp1d3r self-assigned this Oct 7, 2025
…m snapshots

- Add normalize_cli_output() helper in tests/common/mod.rs to handle:
  - Convert 'rmagic.exe' to 'rmagic' for Windows compatibility
  - Remove Windows path prefixes (\\?\)
  - Filter cargo-specific error messages
- Update CLI integration tests to use normalization helper
- Add comprehensive regression tests in tests/cli_normalization.rs
- Update all affected insta snapshots to use normalized output

Fixes cross-platform test failures where Windows .exe suffix caused
snapshot mismatches between Unix and Windows environments.
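
A rough sketch of the normalization steps listed in this commit follows. It is not the actual helper from `tests/common/mod.rs`: the signature is assumed, and the specific cargo message being filtered is a guess at what "cargo-specific error messages" covers.

```rust
/// Normalize CLI output so snapshots match across platforms.
pub fn normalize_cli_output(output: &str) -> String {
    output
        .lines()
        // Drop cargo's own failure line (assumed to be the filtered message).
        .filter(|line| {
            !line
                .trim_start()
                .starts_with("error: process didn't exit successfully")
        })
        .map(|line| {
            line
                // Windows binaries carry an .exe suffix that Unix builds lack.
                .replace("rmagic.exe", "rmagic")
                // Strip the Windows extended-length path prefix.
                .replace(r"\\?\", "")
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Running every captured stdout and stderr string through one helper before the snapshot assertion keeps the platform differences out of the stored snapshots.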
- Split coverage and coverage-check targets into [unix] and [windows] variants
- Add proper environment setup with RUSTFLAGS for coverage builds
- Clean coverage artifacts before generating reports on both platforms
- Use bash script with set -euo pipefail for Unix reliability
- Use PowerShell syntax for Windows commands

Ensures consistent coverage generation across different operating systems
and improves reliability of coverage reports in CI environments.
…tation paths

- Remove `test_files/` directory from project
- Update documentation references to use `third_party/` for test files and magic files
- Modify code examples and test paths to reflect new directory structure
- Update development, getting-started, and testing documentation
- Adjust `.gitignore` to remove `test_files/` entry
- Update task tracking to reflect ongoing implementation status
Rationale: Improve project organization and standardize test file management by using a more descriptive `third_party/` directory for external test resources and magic files.

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
- Add path normalization utilities in tests/common/mod.rs
- Normalize absolute paths to relative filenames in CLI test outputs
- Update CLI integration test to use path normalization before snapshot assertion
- Add unit tests for path normalization functions
- Accept updated snapshot with normalized paths for cross-platform compatibility

This ensures snapshot tests work consistently across different development environments
by removing system-specific absolute paths from test outputs.
* Initial plan

* fix: replace vec! with array in test_all_offset_spec_variants

Co-authored-by: unclesp1d3r <251112+unclesp1d3r@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: unclesp1d3r <251112+unclesp1d3r@users.noreply.github.com>

Copilot AI left a comment


Pull request overview

Copilot reviewed 176 out of 225 changed files in this pull request and generated no new comments.

Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>

Copilot AI left a comment


Pull request overview

Copilot reviewed 176 out of 225 changed files in this pull request and generated no new comments.

@unclesp1d3r
Member Author

@copilot The test-cross-platform Windows test is failing in the CI chain. Please examine the failures here https://github.com/EvilBit-Labs/libmagic-rs/actions/runs/20216079101/job/58029213726?pr=7 and fix the issues. Run just ci-check to ensure everything still works before committing your fixes.

Contributor

Copilot AI commented Dec 15, 2025

@unclesp1d3r I've opened a new pull request, #10, to work on those changes. Once the pull request is ready, I'll request review from you.

…10)

* Initial plan

* fix(tests): Handle Windows drive letters in path normalization

Fixed the CLI integration test's normalize_cli_output function to correctly
handle Windows paths that contain colons (e.g., D:\path\to\file). The previous
implementation would incorrectly find the first colon (from the drive letter)
instead of the colon separating the filename from the description.

The fix searches for the filename followed by ": " pattern, ensuring we find
the correct delimiter regardless of drive letters or other colons in the path.

This resolves the test-cross-platform Windows CI failures where the output
was showing "CVE-2014-1943.testfile: data" instead of just "data".

Co-authored-by: unclesp1d3r <251112+unclesp1d3r@users.noreply.github.com>

* chore: Remove accidentally committed test file

Co-authored-by: unclesp1d3r <251112+unclesp1d3r@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: unclesp1d3r <251112+unclesp1d3r@users.noreply.github.com>
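
The drive-letter fix described in the commit above can be illustrated with a small sketch. Both function names are hypothetical; the second function reproduces the first-colon approach only to show why it breaks on Windows paths, and it is not claimed to match the original code exactly.

```rust
/// Return the description part of a `<path>: <description>` line by
/// searching for the known file name followed by ": ".
pub fn description_after_filename<'a>(line: &'a str, file_name: &str) -> Option<&'a str> {
    let needle = format!("{}: ", file_name);
    let start = line.find(&needle)? + needle.len();
    Some(&line[start..])
}

/// The fragile approach, for comparison: split at the first colon.
/// On `D:\...` paths this stops at the drive letter.
pub fn description_after_first_colon(line: &str) -> Option<&str> {
    let idx = line.find(':')?;
    Some(line[idx + 1..].trim_start())
}
```

Anchoring the search on the file name means extra colons earlier in the path, such as a drive letter, no longer affect where the description starts.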
@codecov

codecov Bot commented Dec 15, 2025

Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

ℹ️ You can also turn on project coverage checks and project coverage reporting on Pull Request comment

Thanks for integrating Codecov - We've got you covered ☂️


Labels

  • enhancement: New feature or request
  • evaluator: Rule evaluation engine and logic
  • output: Result formatting and output generation
  • testing: Test infrastructure and coverage

3 participants