Merged
16 changes: 16 additions & 0 deletions .github/prompts/cicheck.prompt.md
@@ -0,0 +1,16 @@
---
agent: agent
name: Continuous Integration Check
description: Runs the continuous integration check command and fixes the issues it identifies.
model: GPT-5.2-Codex
---

Run `just ci-check` and analyze any failures or warnings. If there are any issues, fix them and run the command again. Continue this process until `just ci-check` passes completely without any failures or warnings. Focus on:

1. Linting errors
2. Test failures
3. Formatting issues
4. Security issues
5. ERB template issues

After each fix, re-run `just ci-check` to verify the changes resolved the issues. Only stop when all checks pass successfully. Provide a summary of the changes made to fix the issues once `just ci-check` passes.
43 changes: 43 additions & 0 deletions .github/prompts/simplicity-review.prompt.md
@@ -0,0 +1,43 @@
---
agent: agent
name: Simplicity Review
description: This prompt is used to review and simplify code changes by applying principles of simplicity, idiomatic coding, and test proportionality.
model: GPT-5.2-Codex
---

## CODE SIMPLIFICATION REVIEW

Start by examining the uncommitted changes (or the changes in the current branch if there are no uncommitted changes) in the current codebase.

### ANALYSIS STEPS:

1. Identify what files have been modified or added
2. Review the actual code changes
3. Apply simplification principles below
4. Refactor directly, then show what you changed

### SIMPLIFICATION PRINCIPLES:

#### Complexity Reduction:

- Remove abstraction layers that don't provide clear value
- Replace complex patterns with straightforward implementations
- Use language idioms over custom abstractions
- If a simple function/lambda works, use it—don't create classes

#### Test Proportionality:

- Keep only tests for critical functionality and real edge cases
- Delete tests for trivial operations, framework behavior, or hypothetical scenarios
- For small projects: aim for \<10 meaningful tests per feature
- Test code should be shorter than implementation

#### Idiomatic Code:

- Use conventional patterns for the language
- Prioritize readability and maintainability
- Apply the principle of least surprise

Ask yourself: "What's the simplest version that actually works reliably?"

Make the refactoring changes, then summarize what you simplified and why. Always finish by running `just ci-check` and ensuring that all checks and tests remain green.
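The "use a simple function or lambda instead of a class" principle above can be illustrated with a small Rust sketch. All names here (`SizePolicy`, `MaxSizePolicy`, `filter_with_policy`, `filter_sizes`) are hypothetical and invented for this example; none of them come from this repository.

```rust
/// Over-abstracted version: a trait and a struct that wrap a single predicate.
pub trait SizePolicy {
    fn accepts(&self, size: u64) -> bool;
}

pub struct MaxSizePolicy {
    pub limit: u64,
}

impl SizePolicy for MaxSizePolicy {
    fn accepts(&self, size: u64) -> bool {
        size <= self.limit
    }
}

pub fn filter_with_policy(sizes: &[u64], policy: &dyn SizePolicy) -> Vec<u64> {
    sizes.iter().copied().filter(|&s| policy.accepts(s)).collect()
}

/// Simpler version: the caller passes a closure, so no trait or struct is needed.
pub fn filter_sizes(sizes: &[u64], keep: impl Fn(u64) -> bool) -> Vec<u64> {
    sizes.iter().copied().filter(|&s| keep(s)).collect()
}
```

Both functions return the same values for the same predicate; the closure version simply has less code to read and test. The trait would only earn its place if several policies had to be stored or selected at runtime.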
40 changes: 40 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,40 @@
# libmagic-rs - Claude Code Context

Pure-Rust implementation of libmagic for file type identification. See @AGENTS.md for detailed guidelines.

## Quick Reference

### Build & Test

- `cargo build` / `cargo build --release` - Build project
- `cargo test` or `cargo nextest run` - Run tests (650+ tests)
- `cargo clippy -- -D warnings` - Lint (zero warnings policy enforced)
- `cargo fmt` - Format code
- `cargo llvm-cov --html` - Coverage report (target >85%)

### Project Structure

- `src/parser/` - Magic file DSL parsing (nom-based)
- `src/evaluator/` - Rule evaluation engine
- `src/output/` - Text and JSON formatters
- `src/io/` - Memory-mapped file I/O
- Binary: `rmagic` (src/main.rs)

### Code Standards

- **No unsafe code** - `unsafe_code = "forbid"` in Cargo.toml
- **No unwrap/panic** - Use proper error handling with `thiserror`
- **No emojis** in code, comments, or documentation
- Keep files under 500-600 lines
- Rust 2024 edition with rustfmt 2024 style

### Tooling (via mise)

- `mise install` - Install all dev tools
- `cargo nextest run` - Faster test runner
- `cargo insta` - Snapshot testing
- `cargo audit` / `cargo deny` - Security checks

### Current Branch Focus

CLI enhancements: multiple file inputs, stdin processing, magic file discovery
3 changes: 2 additions & 1 deletion Cargo.toml
@@ -149,6 +149,7 @@ path = "src/main.rs"
byteorder = "1.5.0"
cfg-if = "1.0.4"
clap = { version = "4.5.54", features = ["derive"] }
clap-stdin = "0.8.0"
memmap2 = "0.9.9"
nom = "8.0.0"
serde = { version = "1.0.228", features = ["derive"] }
@@ -158,7 +159,7 @@ thiserror = "2.0.18"
[dev-dependencies]
criterion = "0.8.1"
insta = { version = "1.46.1", features = ["json"] }
nix = { version = "0.31.0", features = ["fs"] }
nix = { version = "0.31.1", features = ["fs"] }
proptest = "1.9.0"
regex = "1.12.2"
temp-env = "0.3.6"
49 changes: 47 additions & 2 deletions src/evaluator/mod.rs
@@ -5,6 +5,9 @@

use crate::parser::ast::MagicRule;
use crate::{EvaluationConfig, LibmagicError};
use std::sync::{Arc, mpsc};
use std::thread;
use std::time::Duration;

#[cfg(test)]
use crate::parser::ast::{Endianness, OffsetSpec, Operator, TypeKind, Value};
@@ -506,8 +509,50 @@ pub fn evaluate_rules_with_config(
buffer: &[u8],
config: EvaluationConfig,
) -> Result<Vec<MatchResult>, LibmagicError> {
let mut context = EvaluationContext::new(config);
evaluate_rules(rules, buffer, &mut context)
// If no timeout is configured, evaluate normally
let Some(timeout_ms) = config.timeout_ms else {
let mut context = EvaluationContext::new(config);
return evaluate_rules(rules, buffer, &mut context);
};

// With timeout: spawn evaluation in a thread and wait with timeout
// Copy the rules and buffer into `Arc`s so the worker thread can own them (`to_vec` clones each once)
let rules_arc = Arc::new(rules.to_vec());
let buffer_arc = Arc::new(buffer.to_vec());
let config_clone = config.clone();

let (tx, rx) = mpsc::channel();

// Clone Arcs for the thread (cheap reference count increment)
let rules_thread = Arc::clone(&rules_arc);
let buffer_thread = Arc::clone(&buffer_arc);

// Spawn evaluation in separate thread
// Note: The thread will run to completion even if we return early on timeout.
// True cancellation would require cooperative cancellation (checking a flag
// periodically during evaluation) or running in a separate process.
// For most use cases, the thread will complete quickly or the process will
// exit, cleaning up the thread automatically.
thread::spawn(move || {
let mut context = EvaluationContext::new(config_clone);
let result = evaluate_rules(&rules_thread, &buffer_thread, &mut context);
// Send result; ignore error if receiver was dropped (timeout occurred)
let _ = tx.send(result);
});

// Wait for result with timeout
match rx.recv_timeout(Duration::from_millis(timeout_ms)) {
Ok(result) => result,
Err(mpsc::RecvTimeoutError::Timeout) => Err(LibmagicError::Timeout { timeout_ms }),
Err(mpsc::RecvTimeoutError::Disconnected) => {
// Thread panicked or dropped sender
Err(LibmagicError::EvaluationError(
crate::error::EvaluationError::internal_error(
"Evaluation thread terminated unexpectedly",
),
))
}
}
}

#[cfg(test)]
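The comment in `evaluate_rules_with_config` notes that true cancellation would need a flag checked during evaluation. A minimal sketch of that idea, using only the standard library, might look like the following. `run_with_timeout`, `slow_sum`, and `SketchError` are hypothetical names invented for illustration and are not part of this PR or of the crate's API; `slow_sum` merely stands in for rule evaluation.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;
use std::time::Duration;

/// Hypothetical error type for this sketch; not the crate's `LibmagicError`.
#[derive(Debug, PartialEq)]
pub enum SketchError {
    Timeout { timeout_ms: u64 },
    WorkerDied,
}

/// Runs `work` on a worker thread and waits at most `timeout_ms` for its result.
/// `work` receives a cancellation flag that it is expected to poll.
pub fn run_with_timeout<T, F>(timeout_ms: u64, work: F) -> Result<T, SketchError>
where
    T: Send + 'static,
    F: FnOnce(&AtomicBool) -> T + Send + 'static,
{
    let cancelled = Arc::new(AtomicBool::new(false));
    let worker_flag = Arc::clone(&cancelled);
    let (tx, rx) = mpsc::channel();

    thread::spawn(move || {
        let result = work(&*worker_flag);
        // The receiver may already be gone after a timeout; ignore that error.
        let _ = tx.send(result);
    });

    match rx.recv_timeout(Duration::from_millis(timeout_ms)) {
        Ok(value) => Ok(value),
        Err(mpsc::RecvTimeoutError::Timeout) => {
            // Ask the worker to stop at its next check instead of running on.
            cancelled.store(true, Ordering::Relaxed);
            Err(SketchError::Timeout { timeout_ms })
        }
        // The sender was dropped without sending, e.g. because `work` panicked.
        Err(mpsc::RecvTimeoutError::Disconnected) => Err(SketchError::WorkerDied),
    }
}

/// Stand-in for rule evaluation: sums bytes in chunks of four, sleeping
/// `step_delay` per chunk and polling the cancellation flag before each one.
pub fn slow_sum(data: &[u8], step_delay: Duration, cancelled: &AtomicBool) -> Option<u64> {
    let mut total = 0u64;
    for chunk in data.chunks(4) {
        if cancelled.load(Ordering::Relaxed) {
            return None; // cancelled: stop early
        }
        thread::sleep(step_delay);
        total += chunk.iter().map(|&b| u64::from(b)).sum::<u64>();
    }
    Some(total)
}
```

On timeout the caller sets the flag, so a worker that polls it can stop at its next check rather than running to completion. Whether such polling fits the evaluator's loop is a design question this sketch does not answer.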
154 changes: 151 additions & 3 deletions src/lib.rs
@@ -420,6 +420,46 @@ pub struct MagicDatabase {
}

impl MagicDatabase {
/// Create a database using built-in magic rules.
///
/// This is a stub implementation to support the CLI --use-builtin flag.
/// It currently returns an empty rule set, which results in a "data" output
/// for all files and buffers. A full built-in rules implementation is
/// tracked separately and will embed compiled rules at build time.
///
/// # Errors
///
/// Currently always returns `Ok`. In future implementations, this may return
/// an error if the built-in rules fail to load or validate.
///
/// # Examples
///
/// ```rust,no_run
/// use libmagic_rs::MagicDatabase;
///
/// let db = MagicDatabase::with_builtin_rules()?;
/// let result = db.evaluate_buffer(b"example")?;
/// assert_eq!(result.description, "data");
/// # Ok::<(), Box<dyn std::error::Error>>(())
/// ```
pub fn with_builtin_rules() -> Result<Self> {
Self::with_builtin_rules_and_config(EvaluationConfig::default())
}

/// Create a database from the built-in rules with a custom configuration (e.g., a timeout).
///
/// # Errors
///
/// Returns an error if `config` fails validation.
pub fn with_builtin_rules_and_config(config: EvaluationConfig) -> Result<Self> {
config.validate()?;
Ok(Self {
rules: Vec::new(),
config,
source_path: None,
})
}

/// Load magic rules from a file
///
/// # Arguments
@@ -440,15 +480,27 @@ impl MagicDatabase {
/// # Ok::<(), Box<dyn std::error::Error>>(())
/// ```
pub fn load_from_file<P: AsRef<Path>>(path: P) -> Result<Self> {
// Load magic rules using the unified parser API
Self::load_from_file_with_config(path, EvaluationConfig::default())
}

/// Load magic rules from a file with a custom configuration (e.g., a timeout).
///
/// # Errors
///
/// Returns an error if `config` fails validation or if the file cannot be read or parsed.
pub fn load_from_file_with_config<P: AsRef<Path>>(
path: P,
config: EvaluationConfig,
) -> Result<Self> {
config.validate()?;
let rules = parser::load_magic_file(path.as_ref()).map_err(|e| match e {
ParseError::IoError(io_err) => LibmagicError::IoError(io_err),
other => LibmagicError::ParseError(other),
})?;

Ok(Self {
rules,
config: EvaluationConfig::default(),
config,
source_path: Some(path.as_ref().to_path_buf()),
})
}
@@ -477,9 +529,20 @@ impl MagicDatabase {
pub fn evaluate_file<P: AsRef<Path>>(&self, path: P) -> Result<EvaluationResult> {
use crate::evaluator::evaluate_rules_with_config;
use crate::io::FileBuffer;
use std::fs;

let path = path.as_ref();

// Check if file is empty - if so, evaluate as empty buffer
// This allows empty files to be processed like any other file
let metadata = fs::metadata(path)?;
if metadata.len() == 0 {
// Empty file - evaluate as empty buffer
return self.evaluate_buffer(b"");
}

// Load the file into memory
let file_buffer = FileBuffer::new(path.as_ref())?;
let file_buffer = FileBuffer::new(path)?;
let buffer = file_buffer.as_slice();

// If we have no rules, return "data" as fallback
@@ -512,6 +575,68 @@ impl MagicDatabase {
}
}

/// Evaluate magic rules against an in-memory buffer
///
/// This method evaluates a byte buffer directly without reading from disk,
/// which is useful for stdin input or pre-loaded data.
///
/// # Arguments
///
/// * `buffer` - Byte buffer to evaluate
///
/// # Errors
///
/// Returns `LibmagicError::EvaluationError` if rule evaluation fails, or `LibmagicError::Timeout` if a configured timeout elapses first.
///
/// # Examples
///
/// ```rust,no_run
/// use libmagic_rs::MagicDatabase;
///
/// let db = MagicDatabase::load_from_file("/usr/share/misc/magic")?;
/// let buffer = b"test data";
/// let result = db.evaluate_buffer(buffer)?;
/// println!("Buffer type: {}", result.description);
/// # Ok::<(), Box<dyn std::error::Error>>(())
/// ```
pub fn evaluate_buffer(&self, buffer: &[u8]) -> Result<EvaluationResult> {
use crate::evaluator::evaluate_rules_with_config;

if self.rules.is_empty() {
return Ok(EvaluationResult {
description: "data".to_string(),
mime_type: None,
confidence: 0.0,
});
}

let matches = evaluate_rules_with_config(&self.rules, buffer, self.config.clone())?;

if matches.is_empty() {
Ok(EvaluationResult {
description: "data".to_string(),
mime_type: None,
confidence: 0.0,
})
} else {
let primary_match = &matches[0];
Ok(EvaluationResult {
description: primary_match.message.clone(),
mime_type: None,
confidence: 1.0,
})
}
}

/// Returns the evaluation configuration used by this database.
///
/// This provides read-only access to the evaluation configuration for
/// callers that need to inspect resource limits or evaluation options.
#[must_use]
pub fn config(&self) -> &EvaluationConfig {
&self.config
}

/// Returns the path from which magic rules were loaded.
///
/// This method returns the source path that was used to load the magic rules
@@ -556,6 +681,7 @@ pub struct EvaluationResult {
#[cfg(test)]
mod tests {
use super::*;
use std::fs;

#[test]
fn test_evaluation_config_default() {
@@ -826,4 +952,26 @@ mod tests {
_ => panic!("Expected EvaluationError variant"),
}
}

#[test]
fn test_with_builtin_rules_stub() {
let db = MagicDatabase::with_builtin_rules().expect("builtin rules stub should load");
assert!(db.rules.is_empty());
assert!(db.source_path().is_none());

let temp_file = tempfile::Builder::new()
.prefix("libmagic_builtin_stub_test")
.suffix(".bin")
.tempfile()
.expect("failed to create temp file");
fs::write(temp_file.path(), b"sample").unwrap();

let file_result = db.evaluate_file(temp_file.path()).unwrap();
assert_eq!(file_result.description, "data");

let buffer_result = db.evaluate_buffer(b"buffer").unwrap();
assert_eq!(buffer_result.description, "data");

// temp_file is automatically cleaned up when it goes out of scope
}
}
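The fallback behavior of `evaluate_buffer` (first match wins, otherwise report "data" with zero confidence) can be modeled in a few lines. The sketch below is a simplified stand-in, not the crate's implementation: `SketchResult`, `PrefixRule`, and `describe` are hypothetical names, and rules are reduced to plain byte prefixes.

```rust
/// Hypothetical, simplified result type echoing two fields shown in the diff.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchResult {
    pub description: String,
    pub confidence: f64,
}

/// Hypothetical rule: a byte prefix and the message reported when it matches.
pub struct PrefixRule {
    pub magic: &'static [u8],
    pub message: &'static str,
}

/// Returns the first matching rule's message, or the "data" fallback when the
/// rule set is empty or nothing matches. The fallback is built in one place.
pub fn describe(rules: &[PrefixRule], buffer: &[u8]) -> SketchResult {
    rules
        .iter()
        .find(|rule| buffer.starts_with(rule.magic))
        .map_or_else(
            || SketchResult {
                description: "data".to_string(),
                confidence: 0.0,
            },
            |rule| SketchResult {
                description: rule.message.to_string(),
                confidence: 1.0,
            },
        )
}
```

In this model an empty rule set, an empty buffer, and a buffer that matches nothing all take the same fallback path, which mirrors how the stub database and empty files end up reporting "data" in the diff above.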