Merged
28 commits
0d87484
chore(.gitignore): add tessl directory to ignore tiles and rules
unclesp1d3r Mar 24, 2026
d18be0e
feat(parser): add initial tessl configuration for stringy
unclesp1d3r Mar 24, 2026
9ea6981
feat(parser): add initial configuration for tessl mcp server
unclesp1d3r Mar 24, 2026
ce62ab6
feat(parser): add initial configuration for tessl mcp server
unclesp1d3r Mar 24, 2026
9ce521f
feat(parser): add configuration for tessl mcp server
unclesp1d3r Mar 24, 2026
059dfa7
chore(.gitignore): add tessl directory and rules to ignore list
unclesp1d3r Mar 24, 2026
49e4b90
feat(parser): implement multi-byte length prefix variants for pstring
unclesp1d3r Mar 24, 2026
90f34b5
feat(parser): implement multi-byte length prefix variants for pstring
unclesp1d3r Mar 24, 2026
fc30368
feat(parser): implement multi-byte length prefix handling for pstring
unclesp1d3r Mar 24, 2026
d3c5dc2
feat(parser): add length width support for pstring type kind serializ…
unclesp1d3r Mar 24, 2026
c26dd7b
feat(parser): add length width support for pstring in tests
unclesp1d3r Mar 24, 2026
5675af4
feat(parser): implement multi-byte length prefix support for pstring
unclesp1d3r Mar 24, 2026
b0263c2
docs(agents): update multi-byte pstring length prefix details and con…
unclesp1d3r Mar 24, 2026
5ab72bd
docs(gotchas): add known gotchas for common pitfalls and architectura…
unclesp1d3r Mar 24, 2026
8fe060d
docs(parser): enhance documentation for PString and timestamp types
unclesp1d3r Mar 25, 2026
727d64f
feat(parser): implement multi-byte pstring length prefix variants
unclesp1d3r Mar 25, 2026
dfdee46
feat(parser): add length_includes_itself support for pstring
unclesp1d3r Mar 25, 2026
d312589
feat(parser): add support for multi-byte pstring length variants
unclesp1d3r Mar 25, 2026
1821cc9
chore(reviews): remove unnecessary labels from auto review configuration
unclesp1d3r Mar 25, 2026
00a58ee
chore(reviews): update review instructions and path filters for parse…
unclesp1d3r Mar 25, 2026
0ef76fc
docs: Dosu updates for PR #183
dosubot[bot] Mar 25, 2026
68f6990
docs: Dosu updates for PR #183
dosubot[bot] Mar 25, 2026
8e57bc3
docs: Dosu updates for PR #183
dosubot[bot] Mar 25, 2026
5f95299
docs: Dosu updates for PR #183
dosubot[bot] Mar 25, 2026
d70aca5
docs: Dosu updates for PR #183
dosubot[bot] Mar 25, 2026
0284b50
docs: Dosu updates for PR #183
dosubot[bot] Mar 25, 2026
01a075c
fix(parser): address PR review findings for pstring implementation
unclesp1d3r Mar 25, 2026
1df0165
fix(parser): resolve CodeRabbit PR review comments
unclesp1d3r Mar 25, 2026
128 changes: 29 additions & 99 deletions .coderabbit.yaml
@@ -52,7 +52,6 @@ reviews:
[
"src/**",
"docs/src/**",
"spec/**",
".kiro/**/*.md",
".cursor/**/*.mdc",
".github/**",
@@ -64,6 +63,7 @@ reviews:
"*.json",
"*.sh",
"justfile",
"build.rs",
"!target/**",
"!dist/**",
"!docs/book/**",
@@ -84,16 +84,26 @@ reviews:
instructions: "Focus on magic file DSL parsing, AST node definitions, and grammar rules. Ensure comprehensive error handling for malformed input, proper escape sequence handling, and robust parsing logic. Include property tests for parser invariants."
- path: "src/parser/ast.rs"
instructions: "Review AST data structures matching the libmagic spec. Ensure proper serialization support, comprehensive documentation, and type safety. Focus on memory layout and performance characteristics."
- path: "src/parser/grammar.rs"
- path: "src/parser/grammar/**"
instructions: "Review nom-based parsing logic, grammar rules, and parser combinators. Ensure robust parsing of magic file syntax, proper error recovery, and comprehensive test coverage for edge cases."
- path: "src/parser/codegen.rs"
instructions: "Review codegen serialization of AST types to Rust source. This module is shared by build.rs and build_helpers.rs. Ensure all TypeKind, Operator, and Value variants are serialized. Verify generated use-statement imports match."
- path: "src/parser/types.rs"
instructions: "Review type keyword parsing and TypeKind conversion. Ensure all libmagic type keywords are handled, case-insensitive matching is applied at entry points, and new types follow established patterns."
- path: "src/parser/loader.rs"
instructions: "Review magic file loading from disk. Ensure proper error handling for missing/invalid files, directory traversal safety, and resource limits."
- path: "src/evaluator/**"
instructions: "Focus on rule evaluation engine, offset resolution, type interpretation, and comparison operators. Ensure memory safety, bounds checking, and performance optimization. Include comprehensive error handling for malformed files."
- path: "src/evaluator/offset.rs"
- path: "src/evaluator/engine/**"
instructions: "Review the core evaluation loop. Focus on rule matching, recursion depth limits, timeout handling, and early exit optimization. Ensure no panics in library code."
- path: "src/evaluator/offset/**"
instructions: "Review offset resolution logic for absolute, indirect, and relative offsets. Ensure proper bounds checking, endianness handling, and performance optimization. Focus on memory safety and edge case handling."
- path: "src/evaluator/types.rs"
instructions: "Review type interpretation logic for byte, short, long, string types with endianness support. Ensure proper bounds checking, overflow handling, and performance optimization."
- path: "src/evaluator/operators.rs"
instructions: "Review comparison and bitwise operators implementation. Ensure proper type coercion, overflow handling, and performance optimization. Focus on correctness and edge case handling."
- path: "src/evaluator/types/**"
instructions: "Review type interpretation logic for byte, short, long, quad, float, double, string, pstring, and date types with endianness support. Ensure proper bounds checking, overflow handling via checked_add, and no unwrap() in library code."
- path: "src/evaluator/operators/**"
instructions: "Review comparison and bitwise operators implementation. Ensure proper type coercion, overflow handling, and performance optimization. Focus on correctness, epsilon-aware float comparison, and edge case handling."
- path: "src/evaluator/strength.rs"
instructions: "Review strength calculation for rule ordering. Ensure all TypeKind and Operator variants are handled in match arms. Focus on correctness of strength modifiers."
- path: "src/output/**"
instructions: "Focus on output formatting for text and JSON formats. Ensure compatibility with GNU file command output, proper error handling, and structured metadata. Review performance for large result sets."
- path: "src/output/text.rs"
@@ -103,17 +113,13 @@ reviews:
- path: "src/io/**"
instructions: "Focus on file I/O operations, memory mapping, and resource management. Ensure proper bounds checking, error handling, and RAII patterns. Prioritize memory safety and performance."
- path: "src/error.rs"
instructions: "Review error types and handling strategies. Ensure comprehensive error coverage, proper error propagation, and user-friendly error messages. Focus on error recovery and debugging support."
instructions: "Review error types and handling strategies. This file is shared with build.rs and cannot reference lib-only types. Ensure comprehensive error coverage, proper error propagation, and user-friendly error messages."
- path: "build.rs"
instructions: "Review build script that compiles magic rules at build time. Uses #[path] includes to share code with the library. Ensure rerun-if-changed directives are correct and generated code is valid."
- path: "tests/**"
instructions: "Review test coverage, test organization, and test quality. Ensure comprehensive unit tests, integration tests, and property-based tests. Focus on test maintainability and edge case coverage."
- path: "tests/integration/**"
instructions: "Review integration test design and coverage. Ensure end-to-end workflow testing, compatibility testing, and performance regression testing. Focus on real-world usage scenarios."
- path: "tests/fixtures/**"
instructions: "Review test fixtures and sample files. Ensure comprehensive test data coverage, proper file organization, and test data maintenance. Focus on realistic test scenarios."
instructions: "Review test coverage, test organization, and test quality. Prefer table-driven tests over one-assertion-per-function. Ensure comprehensive unit tests, integration tests, and property-based tests with proptest."
- path: "benches/**"
instructions: "Review benchmark design and performance testing. Ensure meaningful benchmarks, proper statistical analysis, and performance regression detection. Focus on critical path optimization."
- path: "magic/**"
instructions: "Review magic file databases and rule sets. Ensure proper magic file syntax, comprehensive rule coverage, and compatibility with libmagic. Focus on rule accuracy and performance."
- path: "docs/**"
instructions: "Review documentation quality, completeness, and accuracy. Ensure comprehensive API documentation, usage examples, and migration guides. Focus on user experience and developer onboarding."
- path: "Cargo.toml"
@@ -126,9 +132,9 @@ reviews:
enabled: true
auto_incremental_review: true
ignore_title_keywords: ["WIP", "draft", "do not merge"]
labels: ["libmagic-rs", "rust", "file-type-detection"]
labels: ["rust", "testing", "compatibility"]
drafts: false
base_branches: ["main", "develop"]
base_branches: ["main"]
ignore_usernames: []
finishing_touches:
docstrings:
@@ -141,7 +147,7 @@ reviews:
threshold: 85
title:
mode: warning
requirements: "Must follow Conventional Commits specification: type(scope): description. Types: feat, fix, docs, style, refactor, perf, test, build, ci, chore. Scopes: auth, api, cli, models, detection, alerting, etc. Breaking changes indicated with ! in header or BREAKING CHANGE: in footer."
requirements: "Must follow Conventional Commits specification: type(scope): description. Types: feat, fix, docs, style, refactor, perf, test, build, ci, chore. Scopes: parser, evaluator, io, output, cli, ast, grammar, types, operators, offsets, strength, codegen, loader, engine. Breaking changes indicated with ! in header or BREAKING CHANGE: in footer."
description:
mode: warning
issue_assessment:
@@ -165,8 +171,6 @@ reviews:
packages: []
shellcheck:
enabled: true
ruff:
enabled: false
markdownlint:
enabled: true
github-checks:
@@ -180,81 +184,21 @@ reviews:
disabled_categories: []
enabled_only: false
level: default
biome:
enabled: false
hadolint:
enabled: false
swiftlint:
enabled: false
phpstan:
enabled: false
level: default
phpmd:
enabled: false
phpcs:
enabled: false
golangci-lint:
enabled: false
yamllint:
enabled: true
gitleaks:
enabled: true
checkov:
enabled: false
detekt:
enabled: false
eslint:
enabled: false
flake8:
enabled: false
rubocop:
enabled: false
buf:
enabled: false
regal:
enabled: false
actionlint:
enabled: true
pmd:
enabled: false
cppcheck:
enabled: false
semgrep:
enabled: true
circleci:
enabled: false
clippy:
enabled: true
sqlfluff:
enabled: false
prismaLint:
enabled: false
pylint:
enabled: false
oxc:
enabled: false
shopifyThemeCheck:
enabled: false
luacheck:
enabled: false
brakeman:
enabled: false
dotenvLint:
enabled: false
htmlhint:
enabled: false
checkmake:
enabled: false
osvScanner:
enabled: true
chat:
art: false
auto_reply: true
integrations:
jira:
usage: auto
linear:
usage: auto
knowledge_base:
opt_out: false
web_search:
@@ -266,12 +210,6 @@ knowledge_base:
scope: auto
issues:
scope: local
jira:
usage: auto
project_keys: []
linear:
usage: auto
team_keys: []
pull_requests:
scope: local
mcp:
@@ -301,20 +239,12 @@ code_generation:
instructions: "Generate API integration tests with real magic files, error condition testing, and memory safety validation. Include compatibility tests against libmagic."
- path: "src/main.rs"
instructions: "Test CLI interface with various arguments, error conditions, and cross-platform behavior. Include help text validation and exit code testing."
- path: "src/parser/ast.rs"
instructions: "Test AST serialization/deserialization, memory layout validation, and type safety. Include property tests for AST invariants and edge cases."
- path: "src/parser/grammar.rs"
- path: "src/parser/**/*.rs"
instructions: "Test parser with malformed magic files, boundary conditions, escape sequences, and performance. Include property tests for parser correctness and fuzzing with proptest."
- path: "src/evaluator/offset.rs"
instructions: "Test offset resolution with various file types, boundary conditions, and endianness scenarios. Include memory safety tests and performance benchmarks."
- path: "src/evaluator/types.rs"
instructions: "Test type interpretation with edge cases, overflow conditions, and endianness handling. Include property tests for type conversion invariants."
- path: "src/evaluator/operators.rs"
instructions: "Test operators with various data types, edge cases, and performance scenarios. Include correctness tests and type coercion validation."
- path: "src/output/text.rs"
instructions: "Test text output with various file types, encoding scenarios, and GNU file compatibility. Include performance tests for large result sets."
- path: "src/output/json.rs"
instructions: "Test JSON output with schema validation, metadata accuracy, and serialization performance. Include API versioning and backward compatibility tests."
- path: "src/evaluator/**/*.rs"
instructions: "Test type interpretation, offset resolution, and operators with edge cases, overflow conditions, and endianness handling. Include property tests for invariants."
- path: "src/output/**/*.rs"
instructions: "Test output formatting with schema validation, metadata accuracy, GNU file compatibility, and serialization performance."
- path: "src/io/**/*.rs"
instructions: "Test I/O operations with truncated files, permission errors, memory mapping edge cases, and resource cleanup. Include memory safety tests and performance benchmarks."
- path: "src/error.rs"
3 changes: 3 additions & 0 deletions .gitignore
@@ -146,3 +146,6 @@ docs/plans/
.full-review/
SECURITY_AUDIT.md
todos/
**/tessl__*
.tessl/tiles/
.tessl/RULES.md
2 changes: 2 additions & 0 deletions .tessl/.gitignore
@@ -0,0 +1,2 @@
tiles/
RULES.md
15 changes: 6 additions & 9 deletions AGENTS.md
@@ -2,6 +2,8 @@

This document provides comprehensive guidelines for AI assistants working on the libmagic-rs project, ensuring consistent, high-quality development practices and project understanding.

@GOTCHAS.md

## Project Overview

**libmagic-rs** is a pure-Rust implementation of libmagic, designed to replace the C-based library with a memory-safe, efficient alternative for file type detection.
@@ -154,15 +156,10 @@ pub fn evaluate_magic_rules(

### Architecture Constraints

- `src/error.rs` is shared with `build.rs` -- cannot reference lib-only types like `crate::io::IoError`
- `FileError(String)` wraps structured I/O errors as strings to work around the build.rs constraint
- Serialization functions live in `src/parser/codegen.rs`, shared by both `build.rs` (via `#[path]` include) and `src/build_helpers.rs` (via `crate::parser::codegen`); `format_parse_error` remains duplicated in both because `ParseError` has different import paths
- Use `ParseError::IoError` for I/O errors in parser code, not `ParseError::invalid_syntax`
- Use `LibmagicError::ConfigError` for config validation, not `ParseError::invalid_syntax`
- Clippy pedantic lints are active (e.g., prefer `trailing_zeros()` over bitwise masks)
- All public enum variants need `# Examples` rustdoc sections
- Comparison operators share a `compare_values() -> Option<Ordering>` helper in `operators/comparison.rs` -- new comparison logic goes there, not in individual `apply_*` functions
- libmagic types are signed by default (`byte`, `short`, `long`, `quad`); unsigned variants use `u` prefix (`ubyte`, `ushort`, `ulong`, `uquad`, etc.)
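
The signed-by-default rule in the last bullet can be illustrated with a small sketch. The helper name `read_leshort` and its `signed` parameter are hypothetical and are not taken from the crate; the sketch only shows how the same two bytes read differently under a signed type (`leshort`) and its unsigned counterpart (`uleshort`).

```rust
/// Hypothetical helper (not the crate's API): read a 2-byte little-endian
/// field at `offset`, interpreting it as signed or unsigned.
/// Returns `None` if the read would run past the end of the buffer.
fn read_leshort(buf: &[u8], offset: usize, signed: bool) -> Option<i64> {
    // Checked arithmetic so a huge offset cannot overflow.
    let end = offset.checked_add(2)?;
    let s = buf.get(offset..end)?;
    let bytes = [s[0], s[1]];
    Some(if signed {
        i64::from(i16::from_le_bytes(bytes))
    } else {
        i64::from(u16::from_le_bytes(bytes))
    })
}

fn main() {
    let data: [u8; 2] = [0xFF, 0xFF];
    // Signed interpretation (the default for `short`-family types).
    assert_eq!(read_leshort(&data, 0, true), Some(-1));
    // Unsigned interpretation (the `u`-prefixed variants).
    assert_eq!(read_leshort(&data, 0, false), Some(65535));
    // Out-of-bounds reads yield None instead of panicking.
    assert_eq!(read_leshort(&data, 1, true), None);
}
```

A rule comparing against a value such as `0xFFFF` therefore behaves differently depending on whether the type keyword carries the `u` prefix.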

> See **[GOTCHAS.md](GOTCHAS.md)** for build script boundaries (S1), enum variant update checklists (S2), parser architecture (S3), numeric type pitfalls (S5), string/pstring encoding (S6), and other non-obvious behaviors.

### Naming Conventions

@@ -208,7 +205,7 @@ cargo test --doc # Test documentation examples
### Currently Implemented (v0.1.0)

- **Offsets**: Absolute and from-end specifications (indirect and relative are parsed but not yet evaluated)
- **Types**: `byte`, `short`, `long`, `quad`, `float`, `double`, `string`, `pstring` with endianness support; unsigned variants `ubyte`, `ushort`/`ubeshort`/`uleshort`, `ulong`/`ubelong`/`ulelong`, `uquad`/`ubequad`/`ulequad`; float/double endian variants `befloat`/`lefloat`, `bedouble`/`ledouble`; 32-bit date/timestamp types `date`/`ldate`/`bedate`/`beldate`/`ledate`/`leldate`; 64-bit date/timestamp types `qdate`/`qldate`/`beqdate`/`beqldate`/`leqdate`/`leqldate`; `pstring` is a Pascal string (length-prefixed byte followed by string data); date values formatted as `"Www Mmm DD HH:MM:SS YYYY"` matching GNU `file` output; types are signed by default (libmagic-compatible)
- **Types**: `byte`, `short`, `long`, `quad`, `float`, `double`, `string`, `pstring` with endianness support; unsigned variants `ubyte`, `ushort`/`ubeshort`/`uleshort`, `ulong`/`ubelong`/`ulelong`, `uquad`/`ubequad`/`ulequad`; float/double endian variants `befloat`/`lefloat`, `bedouble`/`ledouble`; 32-bit date/timestamp types `date`/`ldate`/`bedate`/`beldate`/`ledate`/`leldate`; 64-bit date/timestamp types `qdate`/`qldate`/`beqdate`/`beqldate`/`leqdate`/`leqldate`; `pstring` is a Pascal string (length-prefixed) with support for 1/2/4-byte length prefixes via `/B`, `/H` (2-byte BE), `/h` (2-byte LE), `/L` (4-byte BE), `/l` (4-byte LE) suffixes, and the `/J` flag (stored length includes prefix width, JPEG convention) which is combinable with width suffixes (e.g., `pstring/HJ`); date values formatted as "Www Mmm DD HH:MM:SS YYYY" matching GNU `file` output; types are signed by default (libmagic-compatible)
- **Operators**: `=` (equal), `!=` (not equal), `<` (less than), `>` (greater than), `<=` (less equal), `>=` (greater equal), `&` (bitwise AND with optional mask), `^` (bitwise XOR), `~` (bitwise NOT), `x` (any value)
- **Nested Rules**: Hierarchical rule evaluation with proper indentation
- **String Matching**: Exact string matching with null-termination and Pascal string (length-prefixed) support
@@ -240,7 +237,7 @@ impl BinaryRegex for regex::bytes::Regex {
- No regex/search pattern matching
- 64-bit integer types: `quad`/`uquad`, `bequad`/`ubequad`, `lequad`/`ulequad` are implemented; `qquad` (128-bit) is not yet supported
- String evaluation reads until first NUL or end-of-buffer by default; `pstring` reads a length-prefixed Pascal string; `max_length: Some(_)` is supported internally but no dedicated fixed-length string parser syntax exists yet
- `pstring` only supports the default 1-byte length prefix (`/B`); multi-byte length prefix variants (`pstring/H` for 2-byte, `pstring/L` for 4-byte) are not yet implemented
- `pstring` supports 1-byte (`/B`), 2-byte big-endian (`/H`), 2-byte little-endian (`/h`), 4-byte big-endian (`/L`), and 4-byte little-endian (`/l`) length prefixes, plus the `/J` flag (stored length includes prefix width). All flags are combinable (e.g., `pstring/HJ`) and fully implemented.
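
The length-prefix variants described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the crate's implementation: the `LengthWidth` enum, the `read_pstring` function, and the choice to return `None` when a `/J` length is smaller than the prefix width are all hypothetical. Only the suffix meanings (`/B`, `/H`, `/h`, `/L`, `/l`, `/J`) come from the text.

```rust
use std::convert::TryFrom;

/// Hypothetical width enum; variant names are illustrative only.
#[derive(Clone, Copy, Debug, PartialEq)]
enum LengthWidth {
    OneByte,    // /B
    TwoByteBe,  // /H
    TwoByteLe,  // /h
    FourByteBe, // /L
    FourByteLe, // /l
}

impl LengthWidth {
    fn byte_count(self) -> usize {
        match self {
            LengthWidth::OneByte => 1,
            LengthWidth::TwoByteBe | LengthWidth::TwoByteLe => 2,
            LengthWidth::FourByteBe | LengthWidth::FourByteLe => 4,
        }
    }
}

/// Read a length-prefixed string at `offset`. With `length_includes_itself`
/// (the /J flag), the stored length counts the prefix bytes as well.
/// Assumption: malformed or truncated input yields `None`.
fn read_pstring(
    buf: &[u8],
    offset: usize,
    width: LengthWidth,
    length_includes_itself: bool,
) -> Option<&[u8]> {
    let prefix_len = width.byte_count();
    let data_start = offset.checked_add(prefix_len)?;
    let p = buf.get(offset..data_start)?;
    let stored = match width {
        LengthWidth::OneByte => usize::from(p[0]),
        LengthWidth::TwoByteBe => usize::from(u16::from_be_bytes([p[0], p[1]])),
        LengthWidth::TwoByteLe => usize::from(u16::from_le_bytes([p[0], p[1]])),
        LengthWidth::FourByteBe => {
            usize::try_from(u32::from_be_bytes([p[0], p[1], p[2], p[3]])).ok()?
        }
        LengthWidth::FourByteLe => {
            usize::try_from(u32::from_le_bytes([p[0], p[1], p[2], p[3]])).ok()?
        }
    };
    let data_len = if length_includes_itself {
        stored.checked_sub(prefix_len)?
    } else {
        stored
    };
    let data_end = data_start.checked_add(data_len)?;
    buf.get(data_start..data_end)
}

fn main() {
    let abc = Some(&b"abc"[..]);
    // pstring/B: one length byte, then the data.
    assert_eq!(read_pstring(b"\x03abcXYZ", 0, LengthWidth::OneByte, false), abc);
    // pstring/H and pstring/h: same length, opposite byte order.
    assert_eq!(read_pstring(b"\x00\x03abc", 0, LengthWidth::TwoByteBe, false), abc);
    assert_eq!(read_pstring(b"\x03\x00abc", 0, LengthWidth::TwoByteLe, false), abc);
    // pstring/L and pstring/l: four-byte prefixes.
    assert_eq!(read_pstring(b"\x00\x00\x00\x03abc", 0, LengthWidth::FourByteBe, false), abc);
    assert_eq!(read_pstring(b"\x03\x00\x00\x00abc", 0, LengthWidth::FourByteLe, false), abc);
    // pstring/HJ: stored length 5 includes the 2 prefix bytes.
    assert_eq!(read_pstring(b"\x00\x05abc", 0, LengthWidth::TwoByteBe, true), abc);
    // Truncated data: the prefix claims 5 bytes but only 1 follows.
    assert_eq!(read_pstring(b"\x05a", 0, LengthWidth::OneByte, false), None);
}
```

Every offset computation goes through `checked_add` or `checked_sub`, so a hostile length prefix leads to `None` rather than a panic or an out-of-bounds read.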

### Operators

12 changes: 12 additions & 0 deletions CONTRIBUTING.md
@@ -361,10 +361,22 @@ libmagic-rs uses a **maintainer-driven** governance model. Decisions are made by

As the project grows, active contributors who demonstrate sustained, high-quality contributions and alignment with project goals may be invited to become maintainers.

## Known Gotchas

Before diving into the codebase, read **[GOTCHAS.md](GOTCHAS.md)** -- it documents non-obvious behaviors, common pitfalls, and architectural quirks that will save you debugging time. Key topics include:

- Build script boundary constraints and shared code between `build.rs` and the library (S1)
- Enum variant update checklists for `TypeKind`, `Operator`, and `Value` (S2)
- Parser architecture split between `types.rs` and `grammar/mod.rs` (S3)
- Numeric type conversion and checked arithmetic requirements (S5)
- PString multi-byte length prefix endianness (S6)
- Clippy lint surprises and `unsafe_code = "forbid"` enforcement (S8)

## Getting Help

- **Issues**: For bug reports and feature requests
- **Discussions**: For questions and ideas
- **Documentation**: Check [docs/](docs/) for detailed guides
- **Gotchas**: Check [GOTCHAS.md](GOTCHAS.md) for known pitfalls

Thank you for contributing to libmagic-rs!