feat(parser): add Date and QDate types with serialization support #165
Signed-off-by: UncleSp1d3r <unclesp1d3r@evilbitlabs.io>
Walkthrough
This PR adds comprehensive support for 32-bit and 64-bit date/timestamp types to the magic file format parser and evaluator. It introduces date-reading functions with timezone awareness, Unix timestamp formatting, type system extensions, parser keyword recognition, and corresponding strength calculations for match specificity.
Sequence Diagram
sequenceDiagram
participant Parser as Parser
participant AST as TypeKind<br/>(AST)
participant Evaluator as Evaluator
participant DateReader as Date Reader<br/>(date.rs)
participant Formatter as Timestamp<br/>Formatter
participant Value as Value<br/>Result
Parser->>AST: recognize date/qdate keywords<br/>(bedate, qdate, etc.)
AST->>AST: create TypeKind::Date<br/>or TypeKind::QDate<br/>(with endian, utc)
Evaluator->>Evaluator: dispatch via<br/>read_typed_value
Evaluator->>DateReader: call read_date()<br/>or read_qdate()
DateReader->>DateReader: read 4/8 bytes<br/>from buffer
DateReader->>Formatter: pass Unix seconds<br/>+ utc flag
Formatter->>Formatter: compute local offset<br/>(if needed)
Formatter->>Formatter: apply civil-date<br/>conversion algorithm
Formatter->>Value: return formatted<br/>timestamp string
Value->>Evaluator: String(formatted_value)
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~55 minutes
🚥 Pre-merge checks | ✅ 3 passed
Merge Protections
Your pull request matches the following merge protections and will not be merged until they are valid.
🟢 CI must pass
Wonderful, this rule succeeded.
All CI checks must pass. Release-plz PRs are exempt because they only bump versions and changelogs (code was already tested on main), and GITHUB_TOKEN-triggered force-pushes suppress CI.
🟢 Do not merge outdated PRs
Wonderful, this rule succeeded.
Make sure PRs are within 10 commits of the base branch before merging.
Codecov Report
❌ Patch coverage is
🧪 CI Insights
Here's what we observed from your CI run for c76a3fe.
🟢 All jobs passed!
Pull request overview
This PR adds GNU file-compatible date/timestamp support to libmagic-rs by introducing 32-bit (date) and 64-bit (qdate) Unix timestamp types across the parser, type system, evaluator, and documentation.
Changes:
- Added `TypeKind::Date` and `TypeKind::QDate` (with endianness + UTC/local options) and integrated them into parsing and serialization/codegen.
- Implemented date/qdate readers + formatting logic (via `chrono` for local offset) and wired them into typed value dispatch/coercion, with new tests.
- Updated evaluator strength scoring plus docs/roadmap and introduced `chrono` as a dependency.
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/parser/types.rs | Recognize date/qdate keywords and map them into new TypeKind variants. |
| src/parser/codegen.rs | Serialize new TypeKind::Date/QDate variants for codegen. |
| src/parser/ast.rs | Extend TypeKind enum and update bit_width() for Date/QDate. |
| src/evaluator/types/mod.rs | Dispatch read/coerce logic for Date/QDate in the evaluator type layer. |
| src/evaluator/types/date.rs | New date/qdate reading + formatting implementation and extensive tests. |
| src/evaluator/types/tests.rs | Add evaluator type tests for reading and coercion of date/qdate. |
| src/evaluator/strength.rs | Include Date/QDate in default strength calculations and tests. |
| docs/src/api-reference.md | Formatting fixes/updates to tables in API docs. |
| AGENTS.md | Document the newly supported date/qdate types and update planned features. |
| ROADMAP.md | Mark date/timestamp support as completed. |
| Cargo.toml / Cargo.lock | Add chrono dependency and lockfile updates. |
// Normalize numeric expected values for date types into formatted timestamp
// strings so they match the Value::String representation from read_date/read_qdate.
(Value::Uint(v), TypeKind::Date { utc, .. } | TypeKind::QDate { utc, .. }) => {
    Value::String(format_timestamp_value(*v, *utc))
}
#[allow(clippy::cast_sign_loss)]
(Value::Int(v), TypeKind::Date { utc, .. } | TypeKind::QDate { utc, .. }) if *v >= 0 => {
    Value::String(format_timestamp_value(*v as u64, *utc))
}
Coercing Date/QDate expected numeric operands into formatted timestamp strings changes comparison semantics: any <, >, <=, >= rule will end up doing lexicographic Value::String ordering, which won’t match chronological ordering (e.g., month/day names). Consider keeping Date/QDate values numeric for evaluation (e.g., Value::Uint/Value::Int) and only formatting to the GNU file string at output time (output can use type_kind), or introduce a dedicated date value representation that preserves the raw seconds for comparisons.
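The ordering concern can be illustrated with a minimal sketch. It assumes the `"Www Mmm DD HH:MM:SS YYYY"` UTC rendering described in this PR; the constant and function names are hypothetical and the strings are hard-coded for illustration rather than produced by the crate.

```rust
/// Two timestamps exactly one day apart, paired with their assumed
/// GNU `file`-style UTC renderings ("Www Mmm DD HH:MM:SS YYYY").
const EARLIER: (u64, &str) = (1_234_481_490, "Thu Feb 12 23:31:30 2009");
const LATER: (u64, &str) = (1_234_567_890, "Fri Feb 13 23:31:30 2009");

/// Evaluates `a < b` twice: once on the raw seconds and once on the
/// formatted strings. Returns (numeric_result, string_result).
fn compare_both_ways(a: (u64, &str), b: (u64, &str)) -> (bool, bool) {
    (a.0 < b.0, a.1 < b.1)
}
```

Here `compare_both_ways(EARLIER, LATER)` yields `(true, false)`: the earlier timestamp is numerically smaller, but its string sorts after the later one because `"Thu"` is lexicographically greater than `"Fri"`.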
},
/// Unsupported type variant (reserved for future types not yet evaluatable,
-/// e.g., regex, float, date).
+/// e.g., regex, date, timestamp).
The UnsupportedType doc comment still lists date/timestamp as examples, but Date/QDate are now supported in read_typed_value. Update the comment to reflect only genuinely unsupported types (e.g., regex/search/pstring) to avoid confusing API users.
-/// e.g., regex, date, timestamp).
+/// e.g., regex, search, pstring).
/// Day-of-week names matching GNU `file` output format.
const DAY_NAMES: [&str; 7] = ["Thu", "Fri", "Sat", "Sun", "Mon", "Tue", "Wed"];

/// Month names matching GNU `file` output format.
const MONTH_NAMES: [&str; 12] = [
    "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
];
src/evaluator/types/date.rs is ~715 lines long, exceeding the project guideline of keeping source files under ~500–600 lines. Consider splitting this into smaller modules (e.g., formatter vs. readers) and/or moving the large in-module test suite into the existing src/evaluator/types/tests.rs to keep the implementation easier to navigate and maintain.
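The `DAY_NAMES` table quoted above starts at `"Thu"` rather than `"Sun"`. A plausible reading, sketched below, is that it is indexed by whole days since the Unix epoch modulo 7, since 1970-01-01 was a Thursday. The helper name `weekday_name` and the signed-seconds handling are assumptions for illustration, not code from the PR.

```rust
/// Day names ordered so that index 0 is the weekday of 1970-01-01 (a Thursday).
const DAY_NAMES: [&str; 7] = ["Thu", "Fri", "Sat", "Sun", "Mon", "Tue", "Wed"];

const SECS_PER_DAY: i64 = 86_400;

/// Hypothetical helper: weekday name for a Unix timestamp interpreted as UTC.
/// `div_euclid` and `rem_euclid` keep the index in 0..7 for pre-epoch
/// (negative) values as well.
fn weekday_name(unix_secs: i64) -> &'static str {
    let days = unix_secs.div_euclid(SECS_PER_DAY);
    DAY_NAMES[days.rem_euclid(7) as usize]
}
```

With this ordering no weekday offset constant is needed: `weekday_name(0)` is `"Thu"` and `weekday_name(1_234_567_890)` is `"Fri"`.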
Documentation Updates
5 document(s) were updated by changes in this PR:
API_REFERENCE
@@ -296,15 +296,17 @@
use libmagic_rs::TypeKind;
```
-| Variant | Description |
-| -------------------------- | -------------------------------------------------------- |
-| `Byte { signed }` | Single byte with explicit signedness (changed in v0.2.0) |
-| `Short { endian, signed }` | 16-bit integer |
-| `Long { endian, signed }` | 32-bit integer |
-| `Quad { endian, signed }` | 64-bit integer |
-| `Float { endian }` | 32-bit IEEE 754 floating-point |
-| `Double { endian }` | 64-bit IEEE 754 floating-point |
-| `String { max_length }` | String data |
+| Variant | Description |
+| -------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `Byte { signed }` | Single byte with explicit signedness (changed in v0.2.0) |
+| `Short { endian, signed }` | 16-bit integer |
+| `Long { endian, signed }` | 32-bit integer |
+| `Quad { endian, signed }` | 64-bit integer |
+| `Float { endian }` | 32-bit IEEE 754 floating-point |
+| `Double { endian }` | 64-bit IEEE 754 floating-point |
+| `Date { endian, utc }` | 32-bit Unix timestamp (signed seconds since epoch). The `endian` parameter specifies byte order (LittleEndian or BigEndian), and `utc` is a boolean indicating whether to format as UTC or local time. Date values are formatted as "Www Mmm DD HH:MM:SS YYYY" strings to match GNU file output. |
+| `QDate { endian, utc }` | 64-bit Unix timestamp (signed seconds since epoch). The `endian` parameter specifies byte order (LittleEndian or BigEndian), and `utc` is a boolean indicating whether to format as UTC or local time. QDate values are formatted as "Www Mmm DD HH:MM:SS YYYY" strings to match GNU file output. |
+| `String { max_length }` | String data |
##### 64-bit Integer Types
@@ -379,13 +381,13 @@
use libmagic_rs::Value;
```
-| Variant | Description |
-| ---------------- | --------------------------- |
-| `Uint(u64)` | Unsigned integer |
-| `Int(i64)` | Signed integer |
-| `Float(f64)` | 64-bit floating-point value |
-| `Bytes(Vec<u8>)` | Byte sequence |
-| `String(String)` | String value |
+| Variant | Description |
+| ---------------- | --------------------------------------------------------------------------------- |
+| `Uint(u64)` | Unsigned integer |
+| `Int(i64)` | Signed integer |
+| `Float(f64)` | 64-bit floating-point value |
+| `Bytes(Vec<u8>)` | Byte sequence |
+| `String(String)` | String value (also used for date/timestamp values formatted as human-readable strings) |
**Note:** `Value` implements `PartialEq` but not `Eq` due to IEEE 754 NaN semantics (NaN is not equal to itself).
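The note about `PartialEq` without `Eq` can be made concrete with a small sketch. The enum below is an illustrative mirror of the variants listed in the table, not the crate's actual definition, and `is_reflexive` is a hypothetical helper.

```rust
/// Illustrative mirror of the documented `Value` variants.
/// `Eq` cannot be derived here because `f64` does not implement it.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Uint(u64),
    Int(i64),
    Float(f64),
    Bytes(Vec<u8>),
    String(String),
}

/// True when a value compares equal to itself. This holds for every
/// variant except a `Float` carrying NaN.
fn is_reflexive(v: &Value) -> bool {
    v == v
}
```

Under these assumptions `is_reflexive(&Value::Float(f64::NAN))` is `false`, which is exactly the reflexivity guarantee that `Eq` would require.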
evaluator
@@ -33,11 +33,12 @@
- **`types/mod.rs`** - Public API surface: `read_typed_value`, `coerce_value_to_type`, re-exports type functions
- **`types/numeric.rs`** - Numeric type handling: `read_byte`, `read_short`, `read_long`, `read_quad` with endianness and signedness support
- **`types/float.rs`** - Floating-point type handling: `read_float` (32-bit IEEE 754), `read_double` (64-bit IEEE 754) with endianness support
+ - **`types/date.rs`** - Date and timestamp type handling: `read_date` (32-bit Unix timestamps), `read_qdate` (64-bit Unix timestamps) with endianness and UTC/local time support
- **`types/string.rs`** - String type handling: `read_string` with null-termination and UTF-8 conversion
- **`types/tests.rs`** - Module tests
- **`evaluator/strength.rs`** - Rule strength calculation
-The refactoring improves organization by separating concerns: `mod.rs` handles the public API surface and data types, while `engine/` contains the core evaluation logic. The types module was refactored in v0.4.2 from a single 1,836-line file into focused submodules for numeric and string handling, improving maintainability without changing the public API. From a public API perspective, all types and functions are imported from the `evaluator` module as before -- the internal organization is transparent to library users.
+The refactoring improves organization by separating concerns: `mod.rs` handles the public API surface and data types, while `engine/` contains the core evaluation logic. The types module was refactored in v0.4.2 from a single 1,836-line file into focused submodules for numeric, floating-point, date/timestamp, and string handling, improving maintainability without changing the public API. From a public API perspective, all types and functions are imported from the `evaluator` module as before -- the internal organization is transparent to library users.
## Core Components
@@ -106,7 +107,7 @@
### Type Reading (`evaluator/types/`)
-Interprets bytes according to type specifications. The types module is organized into submodules for numeric, floating-point, and string type handling (refactored from a single file in v0.4.2):
+Interprets bytes according to type specifications. The types module is organized into submodules for numeric, floating-point, date/timestamp, and string type handling (refactored from a single file in v0.4.2):
- **Byte**: Single byte values (signed or unsigned)
- **Short**: 16-bit integers with endianness
@@ -114,6 +115,8 @@
- **Quad**: 64-bit integers with endianness
- **Float**: 32-bit IEEE 754 floating-point with endianness (native, big-endian `befloat`, little-endian `lefloat`)
- **Double**: 64-bit IEEE 754 floating-point with endianness (native, big-endian `bedouble`, little-endian `ledouble`)
+- **Date**: 32-bit Unix timestamps (signed seconds since epoch) with configurable endianness and UTC/local time formatting
+- **QDate**: 64-bit Unix timestamps (signed seconds since epoch) with configurable endianness and UTC/local time formatting
- **String**: Byte sequences with length limits
- **Bounds checking**: Prevents buffer overruns
@@ -146,6 +149,31 @@
- `read_float()` reads 4 bytes and interprets as `f32`, converting to `f64` and returning `Value::Float(f64)`
- `read_double()` reads 8 bytes and interprets as `f64`, returning `Value::Float(f64)`
- Both respect endianness specified in `TypeKind::Float` or `TypeKind::Double`
+
+**Date and QDate Type Reading (`evaluator/types/date.rs`):**
+
+```rust
+pub fn read_date(
+ buffer: &[u8],
+ offset: usize,
+ endian: Endianness,
+ utc: bool,
+) -> Result<Value, TypeReadError>
+
+pub fn read_qdate(
+ buffer: &[u8],
+ offset: usize,
+ endian: Endianness,
+ utc: bool,
+) -> Result<Value, TypeReadError>
+```
+
+- `read_date()` reads 4 bytes as a 32-bit Unix timestamp (seconds since epoch) and returns `Value::String` formatted as `"Www Mmm DD HH:MM:SS YYYY"` to match GNU file output
+- `read_qdate()` reads 8 bytes as a 64-bit Unix timestamp (seconds since epoch) and returns `Value::String` formatted as `"Www Mmm DD HH:MM:SS YYYY"` to match GNU file output
+- Both support endianness (little-endian, big-endian, native)
+- Both support UTC or local time formatting
+- The evaluator reads raw integer timestamps from the buffer and converts them to formatted date strings for comparison
+- Example: A 32-bit value `1234567890` at offset 0 with type `date` (UTC) would be evaluated as `"Fri Feb 13 23:31:30 2009"`; with `ldate` the result depends on the local timezone
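The reading and formatting steps listed above can be sketched for the UTC case as follows. This is a minimal, dependency-free illustration, not the PR's implementation: `Endianness`, `read_date_utc`, `format_utc`, and `civil_from_days` are hypothetical names, the local-time path (which the PR handles via `chrono`) is omitted, and zero-padding of the day field is an assumption.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for the crate's endianness type.
#[derive(Clone, Copy)]
enum Endianness { Little, Big }

const DAY_NAMES: [&str; 7] = ["Thu", "Fri", "Sat", "Sun", "Mon", "Tue", "Wed"];
const MONTH_NAMES: [&str; 12] = [
    "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
];

/// Days since 1970-01-01 -> (year, month, day) in the proleptic Gregorian
/// calendar, using the widely known days-to-civil conversion.
fn civil_from_days(days: i64) -> (i64, u32, u32) {
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of March-based year
    let mp = (5 * doy + 2) / 153; // March-based month, 0..=11
    let day = (doy - (153 * mp + 2) / 5 + 1) as u32;
    let month = (if mp < 10 { mp + 3 } else { mp - 9 }) as u32;
    let year = yoe + era * 400 + i64::from(month <= 2);
    (year, month, day)
}

/// Format signed Unix seconds as "Www Mmm DD HH:MM:SS YYYY" in UTC.
fn format_utc(unix_secs: i64) -> String {
    let days = unix_secs.div_euclid(86_400);
    let secs = unix_secs.rem_euclid(86_400);
    let (year, month, day) = civil_from_days(days);
    format!(
        "{} {} {:02} {:02}:{:02}:{:02} {}",
        DAY_NAMES[days.rem_euclid(7) as usize],
        MONTH_NAMES[(month - 1) as usize],
        day, secs / 3_600, secs % 3_600 / 60, secs % 60, year
    )
}

/// Read a signed 32-bit timestamp at `offset`; `None` if out of bounds.
fn read_date_utc(buffer: &[u8], offset: usize, endian: Endianness) -> Option<String> {
    let end = offset.checked_add(4)?;
    let bytes = <[u8; 4]>::try_from(buffer.get(offset..end)?).ok()?;
    let raw = match endian {
        Endianness::Little => i32::from_le_bytes(bytes),
        Endianness::Big => i32::from_be_bytes(bytes),
    };
    Some(format_utc(i64::from(raw)))
}
```

For instance, the big-endian bytes `49 96 02 D2` encode `1234567890`, so `read_date_utc(&[0x49, 0x96, 0x02, 0xD2], 0, Endianness::Big)` returns `Some("Fri Feb 13 23:31:30 2009")`, while a buffer shorter than four bytes returns `None` instead of panicking.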
### Operator Application (`evaluator/operators.rs`)
@@ -462,7 +490,7 @@
- [x] Basic evaluation engine structure
- [x] Offset resolution (absolute, relative, from-end)
-- [x] Type reading with endianness support (Byte, Short, Long, Quad, Float, Double, String)
+- [x] Type reading with endianness support (Byte, Short, Long, Quad, Float, Double, Date, QDate, String)
- [x] Operator application (Equal, NotEqual, LessThan, GreaterThan, LessEqual, GreaterEqual, BitwiseAnd, BitwiseAndMask)
- [x] Hierarchical rule processing with child evaluation
- [x] Error handling with graceful degradation
magic-format
@@ -177,6 +177,38 @@
- **NaN**: `NaN != NaN`, comparisons with NaN always return false
- **Infinity**: Positive and negative infinity are properly ordered
+### Date/Timestamp Types
+
+| Type | Size | Endianness | UTC/Local | Description |
+| ----------- | ------- | ------------- | --------- | ----------------------------------------------------------------------- |
+| `date` | 4 bytes | native | UTC | 32-bit Unix timestamp (signed seconds since epoch), formatted as UTC |
+| `ldate` | 4 bytes | native | Local | 32-bit Unix timestamp, formatted as local time |
+| `bedate` | 4 bytes | big-endian | UTC | 32-bit Unix timestamp, big-endian byte order, UTC |
+| `beldate` | 4 bytes | big-endian | Local | 32-bit Unix timestamp, big-endian byte order, local time |
+| `ledate` | 4 bytes | little-endian | UTC | 32-bit Unix timestamp, little-endian byte order, UTC |
+| `leldate` | 4 bytes | little-endian | Local | 32-bit Unix timestamp, little-endian byte order, local time |
+| `qdate` | 8 bytes | native | UTC | 64-bit Unix timestamp (signed seconds since epoch), formatted as UTC |
+| `qldate` | 8 bytes | native | Local | 64-bit Unix timestamp, formatted as local time |
+| `beqdate` | 8 bytes | big-endian | UTC | 64-bit Unix timestamp, big-endian byte order, UTC |
+| `beqldate` | 8 bytes | big-endian | Local | 64-bit Unix timestamp, big-endian byte order, local time |
+| `leqdate` | 8 bytes | little-endian | UTC | 64-bit Unix timestamp, little-endian byte order, UTC |
+| `leqldate` | 8 bytes | little-endian | Local | 64-bit Unix timestamp, little-endian byte order, local time |
+
+Timestamp values are formatted as strings matching GNU file output format: "Www Mmm DD HH:MM:SS YYYY"
+
+Examples:
+
+```text
+# Match file created at Unix epoch
+0 date =0 File created at epoch
+
+# Check timestamp in file header (big-endian)
+8 bedate >946684800 File created after 2000-01-01
+
+# 64-bit timestamp (little-endian, local time)
+16 leqldate x \b, timestamp %s
+```
+
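The constant `946684800` in the `bedate` example can be checked with a little arithmetic. The sketch below counts the days from 1970 to a given year and multiplies by 86,400; the function names are hypothetical and exist only for this worked check.

```rust
const SECS_PER_DAY: i64 = 86_400;

/// Gregorian leap-year rule.
fn is_leap_year(year: i64) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

/// Unix timestamp of YYYY-01-01 00:00:00 UTC for `year >= 1970`.
fn epoch_seconds_at_year_start(year: i64) -> i64 {
    // Sum the length of every full year between 1970 and `year`.
    let days: i64 = (1970..year)
        .map(|y| if is_leap_year(y) { 366 } else { 365 })
        .sum();
    days * SECS_PER_DAY
}
```

Thirty years with seven leap days give 10,957 days, and 10,957 × 86,400 = 946,684,800, so `epoch_seconds_at_year_start(2000)` matches the threshold used in the rule.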
### String Type
Match literal string data:
@@ -502,6 +534,7 @@
- Indirect offsets (basic)
- Byte, short, long, quad types (8-bit, 16-bit, 32-bit, 64-bit integers)
- Float and double types (32-bit and 64-bit IEEE 754 floating-point)
+- Date and qdate types (32-bit and 64-bit Unix timestamps)
- String type
- Comparison operators (equal, not-equal, less-than, greater-than, less-equal, greater-equal)
- Bitwise AND operator
@@ -511,7 +544,6 @@
### Not Yet Supported
- Regex patterns
-- Date/time types
- 128-bit integer types
- Use/name directives
- Default rules
MAGIC_FORMAT
@@ -200,6 +200,40 @@
0 string/c <!doctype HTML document
```
+### Date/Timestamp Types
+
+Date and timestamp types read Unix timestamps (signed seconds since epoch) and format them as human-readable strings.
+
+**32-bit timestamps (4 bytes):**
+
+| Type | Size | Endianness | Timezone |
+| --------- | ------- | ------------- | ---------- |
+| `date` | 4 bytes | native | UTC |
+| `ldate` | 4 bytes | native | local time |
+| `bedate` | 4 bytes | big-endian | UTC |
+| `beldate` | 4 bytes | big-endian | local time |
+| `ledate` | 4 bytes | little-endian | UTC |
+| `leldate` | 4 bytes | little-endian | local time |
+
+**64-bit timestamps (8 bytes):**
+
+| Type | Size | Endianness | Timezone |
+| ---------- | ------- | ------------- | ---------- |
+| `qdate` | 8 bytes | native | UTC |
+| `qldate` | 8 bytes | native | local time |
+| `beqdate` | 8 bytes | big-endian | UTC |
+| `beqldate` | 8 bytes | big-endian | local time |
+| `leqdate` | 8 bytes | little-endian | UTC |
+| `leqldate` | 8 bytes | little-endian | local time |
+
+All timestamp values are formatted as strings in the format `"Www Mmm DD HH:MM:SS YYYY"` to match GNU file output.
+
+Example:
+
+```text
+0 ldate x Unix timestamp: %s
+```
+
---
## Operators
@@ -492,6 +526,7 @@
- Indirect offsets (basic)
- Byte, short, long, quad types (8-bit, 16-bit, 32-bit, 64-bit integers)
- String type
+- Date and timestamp types (32-bit and 64-bit Unix timestamps)
- Comparison operators (`=`, `!`, `<`, `>`, `<=`, `>=`)
- Bitwise AND operator
- Nested rules
@@ -500,7 +535,6 @@
### Not Yet Supported
- Regex patterns
-- Date/time types
- Float types
- 128-bit integer types
- Use/name directives
@@ -508,6 +542,7 @@
### Recently Added
+- **Date/timestamp types**: `date` (32-bit) and `qdate` (64-bit) Unix timestamp types
- **Comparison operators**: Full support for `<`, `>`, `<=`, `>=` operators
- **Strength modifiers**: The `!:strength` directive for adjusting rule priority
- **64-bit integers**: `quad` type family (`quad`, `uquad`, `lequad`, `ulequad`, `bequad`, `ubequad`)
parser
@@ -183,6 +183,28 @@
- ✅ Comprehensive test coverage for all endianness variants and literal formats
**Note:** Float and double types do **not** have signed/unsigned variants. IEEE 754 handles sign internally via the sign bit, so all float types use a single `TypeKind` variant with only an `endian` field (no `signed: bool` field).
+
+### Date and Timestamp Types
+
+The parser supports date and timestamp types for parsing Unix timestamps (signed seconds since epoch). There are 12 type keywords:
+
+**32-bit timestamps (Date):**
+- `date` - Native endian, UTC
+- `ldate` - Native endian, local time
+- `bedate` - Big-endian, UTC
+- `beldate` - Big-endian, local time
+- `ledate` - Little-endian, UTC
+- `leldate` - Little-endian, local time
+
+**64-bit timestamps (QDate):**
+- `qdate` - Native endian, UTC
+- `qldate` - Native endian, local time
+- `beqdate` - Big-endian, UTC
+- `beqldate` - Big-endian, local time
+- `leqdate` - Little-endian, UTC
+- `leqldate` - Little-endian, local time
+
+The parser creates `TypeKind::Date` or `TypeKind::QDate` variants with appropriate endianness and UTC flags. During evaluation, timestamps are formatted as strings in the format "Www Mmm DD HH:MM:SS YYYY" to match GNU file output.
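The keyword scheme above is regular: an optional `be`/`le` prefix, an optional `q` for 64-bit, and an optional `l` for local time. A minimal sketch of mapping the 12 keywords to type variants follows; the enums are reduced, hypothetical mirrors of the crate's AST types, and `parse_date_keyword` is an illustrative name rather than the parser's real entry point.

```rust
/// Hypothetical mirrors of the crate's AST types, reduced to what the sketch needs.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Endianness { Native, Little, Big }

#[derive(Debug, Clone, Copy, PartialEq)]
enum TypeKind {
    Date { endian: Endianness, utc: bool },
    QDate { endian: Endianness, utc: bool },
}

/// Map one of the 12 date keywords to a `TypeKind`, or `None` if unknown.
fn parse_date_keyword(keyword: &str) -> Option<TypeKind> {
    // Optional endianness prefix: "be", "le", or none (native).
    let (endian, rest) = if let Some(r) = keyword.strip_prefix("be") {
        (Endianness::Big, r)
    } else if let Some(r) = keyword.strip_prefix("le") {
        (Endianness::Little, r)
    } else {
        (Endianness::Native, keyword)
    };
    // The remainder selects the width (q = 64-bit) and timezone (l = local).
    match rest {
        "date" => Some(TypeKind::Date { endian, utc: true }),
        "ldate" => Some(TypeKind::Date { endian, utc: false }),
        "qdate" => Some(TypeKind::QDate { endian, utc: true }),
        "qldate" => Some(TypeKind::QDate { endian, utc: false }),
        _ => None,
    }
}
```

Note that `ldate` is not mistaken for a little-endian keyword, because it does not start with `le`; it falls through to the native branch and matches `"ldate"` directly.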
## Parser Design Principles
#166)
Update documentation for #165

This PR introduces extensive documentation covering the magic file format, evaluator engine architecture, and parser implementation. The magic-format guide explains syntax, offset specifications, type systems (including new floating-point and date/timestamp types), operators, and best practices. New evaluator and parser documentation describes the modular architecture, module organization, type reading with endianness support, and operator semantics including IEEE 754 floating-point comparison. An updated API reference provides complete type and function documentation.

_Generated by [Dosu](https://dosu.dev)_

Co-authored-by: dosubot[bot] <131922026+dosubot[bot]@users.noreply.github.com>
## 🤖 New release
* `libmagic-rs`: 0.5.0 -> 0.6.0 (⚠ API breaking changes)
### ⚠ `libmagic-rs` breaking changes
```text
--- failure constructible_struct_adds_field: externally-constructible struct adds field ---
Description:
A pub struct constructible with a struct literal has a new pub field. Existing struct literals must be updated to include the new field.
ref: https://doc.rust-lang.org/reference/expressions/struct-expr.html
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/constructible_struct_adds_field.ron
Failed in:
field MagicRule.value_transform in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:1189
field MagicRule.value_transform in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:1189
field MagicRule.value_transform in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:1189
--- failure copy_impl_added: type now implements Copy ---
Description:
A public type now implements Copy, causing non-move closures to capture it by reference instead of moving it.
ref: rust-lang/rust#100905
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/copy_impl_added.ron
Failed in:
libmagic_rs::mime::MimeMapper in /tmp/.tmpwFvgw1/libmagic-rs/src/mime.rs:98
--- failure enum_marked_non_exhaustive: enum marked #[non_exhaustive] ---
Description:
A public enum has been marked #[non_exhaustive]. Pattern-matching on it outside of its crate must now include a wildcard pattern like `_`, or it will fail to compile.
ref: https://doc.rust-lang.org/cargo/reference/semver.html#attr-adding-non-exhaustive
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/enum_marked_non_exhaustive.ron
Failed in:
enum OffsetSpec in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:198
enum OffsetSpec in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:198
enum OffsetSpec in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:198
enum LibmagicError in /tmp/.tmpwFvgw1/libmagic-rs/src/error.rs:15
enum LibmagicError in /tmp/.tmpwFvgw1/libmagic-rs/src/error.rs:15
enum IoError in /tmp/.tmpwFvgw1/libmagic-rs/src/io/mod.rs:26
enum Operator in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:838
enum Operator in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:838
enum Operator in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:838
enum TypeReadError in /tmp/.tmpwFvgw1/libmagic-rs/src/evaluator/types/mod.rs:56
enum ParseError in /tmp/.tmpwFvgw1/libmagic-rs/src/error.rs:74
enum ParseError in /tmp/.tmpwFvgw1/libmagic-rs/src/error.rs:74
enum Value in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:965
enum Value in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:965
enum Value in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:965
enum TypeKind in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:398
enum TypeKind in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:398
enum TypeKind in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:398
enum EvaluationError in /tmp/.tmpwFvgw1/libmagic-rs/src/error.rs:148
enum EvaluationError in /tmp/.tmpwFvgw1/libmagic-rs/src/error.rs:148
--- failure enum_struct_variant_field_added: pub enum struct variant field added ---
Description:
An enum's exhaustive struct variant has a new field, which has to be included when constructing or matching on this variant.
ref: https://doc.rust-lang.org/reference/attributes/type_system.html#the-non_exhaustive-attribute
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/enum_struct_variant_field_added.ron
Failed in:
field base_relative of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:251
field adjustment_op of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:266
field result_relative of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:272
field base_relative of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:251
field adjustment_op of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:266
field result_relative of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:272
field base_relative of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:251
field adjustment_op of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:266
field result_relative of variant OffsetSpec::Indirect in /tmp/.tmpwFvgw1/libmagic-rs/src/parser/ast.rs:272
--- failure function_missing: pub fn removed or renamed ---
Description:
A publicly-visible function cannot be imported by its prior path. A `pub use` may have been removed, or the function itself may have been renamed or removed entirely.
ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/function_missing.ron
Failed in:
function libmagic_rs::parser::grammar::is_empty_line, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:1025
function libmagic_rs::parser::grammar::parse_strength_directive, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:846
function libmagic_rs::parser::grammar::parse_type_and_operator, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:683
function libmagic_rs::parser::grammar::parse_offset, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:179
function libmagic_rs::parser::parse_offset, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:179
function libmagic_rs::parser::grammar::parse_comment, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:1004
function libmagic_rs::parser::grammar::parse_message, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:810
function libmagic_rs::parser::grammar::parse_value, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:633
function libmagic_rs::parser::grammar::parse_number, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:133
function libmagic_rs::parser::parse_number, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:133
function libmagic_rs::parser::grammar::has_continuation, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:1060
function libmagic_rs::parser::grammar::parse_magic_rule, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:946
function libmagic_rs::parser::grammar::parse_rule_offset, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:779
function libmagic_rs::parser::grammar::is_comment_line, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:1042
function libmagic_rs::parser::grammar::is_strength_directive, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:902
function libmagic_rs::parser::grammar::parse_type, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:749
function libmagic_rs::parser::grammar::parse_operator, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:227
--- failure function_parameter_count_changed: pub fn parameter count changed ---
Description:
A publicly-visible function now takes a different number of parameters.
ref: https://doc.rust-lang.org/cargo/reference/semver.html#fn-change-arity
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/function_parameter_count_changed.ron
Failed in:
libmagic_rs::evaluator::evaluate_single_rule now takes 3 parameters instead of 2, in /tmp/.tmpwFvgw1/libmagic-rs/src/evaluator/engine/mod.rs:196
--- failure inherent_method_missing: pub method removed or renamed ---
Description:
A publicly-visible method or associated fn is no longer available under its prior name. It may have been renamed or removed entirely.
ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/inherent_method_missing.ron
Failed in:
FileBuffer::create_symlink, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/io/mod.rs:326
EvaluationContext::increment_recursion_depth, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/evaluator/mod.rs:114
EvaluationContext::decrement_recursion_depth, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/evaluator/mod.rs:130
EvaluationContext::increment_recursion_depth, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/evaluator/mod.rs:114
EvaluationContext::decrement_recursion_depth, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/evaluator/mod.rs:130
--- failure module_missing: pub module removed or renamed ---
Description:
A publicly-visible module cannot be imported by its prior path. A `pub use` may have been removed, or the module may have been renamed, removed, or made non-public.
ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/module_missing.ron
Failed in:
mod libmagic_rs::parser::grammar, previously in file /tmp/.tmphvgzOh/libmagic-rs/src/parser/grammar/mod.rs:4
--- failure struct_marked_non_exhaustive: struct marked #[non_exhaustive] ---
Description:
A public struct has been marked #[non_exhaustive], which will prevent it from being constructed using a struct literal outside of its crate. It previously had no private fields, so a struct literal could be used to construct it outside its crate.
ref: https://doc.rust-lang.org/cargo/reference/semver.html#attr-adding-non-exhaustive
impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.46.0/src/lints/struct_marked_non_exhaustive.ron
Failed in:
struct EvaluationConfig in /tmp/.tmpwFvgw1/libmagic-rs/src/config.rs:42
```
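For downstream code, the `enum_marked_non_exhaustive` and `enum_struct_variant_field_added` findings mostly mean adding a wildcard arm and `..` in variant patterns. The sketch below uses a reduced, hypothetical mirror of `TypeKind`; the attribute only takes effect across crate boundaries, so within a single file it is shown for illustration.

```rust
/// Illustrative mirror only; the crate's `TypeKind` has more variants and fields.
#[non_exhaustive]
#[derive(Debug)]
enum TypeKind {
    Byte,
    Date { utc: bool },
    QDate { utc: bool },
}

/// Hypothetical downstream helper written to survive new variants and fields.
fn describe(kind: &TypeKind) -> &'static str {
    match kind {
        // `..` tolerates fields added to a struct variant later.
        TypeKind::Date { utc: true, .. } | TypeKind::QDate { utc: true, .. } => "timestamp (UTC)",
        TypeKind::Date { .. } | TypeKind::QDate { .. } => "timestamp (local)",
        // Required outside the defining crate once the enum is #[non_exhaustive].
        _ => "other",
    }
}
```

Code that matched every variant explicitly before 0.6.0 would stop compiling; with the wildcard arm in place, variants added in later releases fall into `"other"` instead.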
<details><summary><i><b>Changelog</b></i></summary><p>
<blockquote>
## [0.6.0] - 2026-04-25
### Features
- **parser**: Add Date and QDate types with serialization support
([#165](#165))
- **parser**: Implement pstring (Pascal string) type
([#170](#170))
- **parser**: Implement pstring multi-byte length prefix variants (/B,
/H, /h, /L, /l, /J)
([#183](#183))
- **evaluator**: Add debug-level tracing for skipped rules
([#184](#184))
- **evaluator**: Implement indirect offset resolution
([#37](#37))
([#199](#199))
- **evaluator**: Implement relative offset resolution
([#38](#38))
([#211](#211))
- **deps**: Add new skills to actionbook/rust-skills and
trailofbits/skills
- **evaluator**: Regex and search types (closes #39)
([#214](#214))
- Implement libmagic meta-type directives and format substitution
([#42](#42))
([#230](#230))
### Bug Fixes
- **regex**: PR #214 follow-up review findings
([#215](#215))
- Load and correctly evaluate /usr/share/file/magic/filesystems and
adjacent magic files
([#233](#233))
### Documentation
- **gotchas**: Clarify requirements for adding TypeKind variants
### Miscellaneous Tasks
- Rename .coderabbitai.yaml to .coderabbit.yaml
- **Mergify**: Configuration update
([#173](#173))
- Update .gitignore to exclude local AI assistant files
- **mergify**: Upgrade configuration to current format
([#205](#205))
- Resolve all pending TODO items
([#212](#212))
- **mergify**: Upgrade configuration to current format
([#231](#231))
<!-- generated by git-cliff -->
### Security
- **io**: Close TOCTOU race in `FileBuffer::new` metadata validation
(CWE-367). `validate_file_metadata` now uses `File::metadata()` on the
open descriptor instead of re-canonicalizing the path, so an attacker
cannot swap the path between `open_file` and validation. Error paths now
report the caller-supplied path rather than the canonicalized variant.
- **cli**: Remove relative-path fallbacks from `default_magic_file_path`
(CWE-426). `./missing.magic`, `./third_party/magic.mgc`, and the
`CI`/`GITHUB_ACTIONS` env-var branch no longer resolve against the
process cwd. CI pipelines must pass `--magic <path>` explicitly.
- **evaluator**: `build_regex` now bounds `size_limit` and
`dfa_size_limit` to 1 MiB (`REGEX_COMPILE_SIZE_LIMIT`) to reject
compile-time DoS patterns (CWE-1333) from adversarial magic files.
### Features
- **parser**: Implement meta-type directives: `name`/`use` subroutines,
`default`/`clear` per-level fallback, and `indirect` re-evaluation.
`parse_text_magic_file` now returns `ParsedMagic { rules, name_table }`
(breaking change from `Vec<MagicRule>`). Named subroutines are hoisted
into `NameTable` at load time and dispatched via `RuleEnvironment` in
the evaluator. Recursion is bounded by
`EvaluationConfig::max_recursion_depth`. Resolves
[#42](#42).
- **evaluator**: Thread-local regex compile cache eliminates the
double-compile paid by every successful regex match.
`regex_bytes_consumed` now reuses the compiled `Regex` from `read_regex`
instead of recompiling the pattern to derive the anchor advance. The
cache is reset at the start of every `evaluate_rules_with_config` call,
bounding memory to one evaluation.
- **config**: `EvaluationConfig` is now `#[non_exhaustive]`; new
builder-style setters (`with_max_recursion_depth`,
`with_max_string_length`, `with_stop_at_first_match`, `with_mime_types`,
`with_timeout_ms`) let external crates construct configurations without
struct literals.
- **parser**: `MagicRule::new()` smart constructor with
`::with_children()`, `::with_strength_modifier()`, `::with_level()`
builder methods and a `::validate()` method enforcing structural
invariants (non-empty message, `level <= MAX_LEVEL`, children nested
strictly deeper than parent). New `MagicRuleValidationError` error type.
- **parser**: `RegexFlags::with_case_insensitive()` and
`::with_start_offset()` builder methods.
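
The `#[non_exhaustive]` change to `EvaluationConfig` described above can be illustrated with a minimal sketch. Only the struct name and the setter names come from the changelog. The field types, the default values, and the `Option<u64>` parameter of `with_timeout_ms` are assumptions made for this example, not the crate's actual definitions.

```rust
/// Hypothetical, reduced stand-in for the crate's `EvaluationConfig`.
/// Field types and defaults here are assumptions for illustration only.
#[non_exhaustive]
#[derive(Debug, Clone, PartialEq)]
pub struct EvaluationConfig {
    pub max_recursion_depth: u32,
    pub stop_at_first_match: bool,
    pub timeout_ms: Option<u64>,
}

impl Default for EvaluationConfig {
    fn default() -> Self {
        Self {
            max_recursion_depth: 20,   // placeholder value
            stop_at_first_match: true, // placeholder value
            timeout_ms: None,          // no timeout unless one is set
        }
    }
}

impl EvaluationConfig {
    /// Builder-style setter: takes `self` by value and returns it,
    /// so calls can be chained without a struct literal.
    #[must_use]
    pub fn with_max_recursion_depth(mut self, depth: u32) -> Self {
        self.max_recursion_depth = depth;
        self
    }

    #[must_use]
    pub fn with_stop_at_first_match(mut self, stop: bool) -> Self {
        self.stop_at_first_match = stop;
        self
    }

    #[must_use]
    pub fn with_timeout_ms(mut self, timeout: Option<u64>) -> Self {
        self.timeout_ms = timeout;
        self
    }
}
```

Under these assumptions, a downstream crate would write `EvaluationConfig::default().with_max_recursion_depth(8).with_timeout_ms(Some(500))` where it previously used a struct literal. The attribute only restricts struct literals outside the defining crate, which is why the chained setters are needed there.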
### Refactor
- **engine**: Extract `evaluate_pattern_rule()` and
`evaluate_value_rule()` helpers from
`evaluate_single_rule_with_anchor`'s 90-line body. Dispatch is now a
two-arm type-category split; each helper has focused rustdoc on
semantics and invariants.
- **types**: Replace the `_ =>` catch-all in
`bytes_consumed_with_pattern` with an explicit listing of the
fixed-width `TypeKind` variants. Adding a new variable-width variant
without updating this match is now a compile error instead of a silent
relative-offset anchor corruption in release builds.
- **parser**: Split the 185-line `type_keyword_to_kind` match into
per-family helpers (`byte_family`, `short_family`, `long_family`,
`quad_family`, `float_family`, `double_family`, `date_family`,
`qdate_family`, `string_family`). Drops the
`#[allow(clippy::too_many_lines)]` attribute.
- **main**: `main()` returns `std::process::ExitCode` instead of calling
`process::exit`, so destructors run on the happy path. Ctrl-C
`AtomicBool` flag uses `Ordering::Relaxed` instead of `SeqCst`.
- **grammar**: `parse_strength_directive` uses nom 8's `preceded` +
`Parser::map` instead of the legacy `map(pair(char(...), parse_number),
|(_, n)| ...)` pattern.
- **output**: Add `#[serde(skip_serializing_if = "Option::is_none",
default)]` to public `Option<T>` fields so JSON output no longer emits
`"field": null` for unset optional values.
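
The **types** refactor above relies on Rust's exhaustiveness checking. A small sketch of the idea follows, using a hypothetical, heavily reduced `TypeKind` and a hypothetical helper `fixed_width`. The real enum carries endianness and other data, and the real function is `bytes_consumed_with_pattern`.

```rust
/// Hypothetical, reduced stand-in for the crate's `TypeKind`.
pub enum TypeKind {
    Byte,
    Short,
    Long,
    Quad,
    Date,
    QDate,
    String,
    Regex,
}

/// Width in bytes for fixed-width kinds, `None` for variable-width kinds.
/// There is deliberately no `_ =>` arm: adding a variant to `TypeKind`
/// without extending this match fails to compile.
pub fn fixed_width(kind: &TypeKind) -> Option<usize> {
    match kind {
        TypeKind::Byte => Some(1),
        TypeKind::Short => Some(2),
        TypeKind::Long | TypeKind::Date => Some(4),
        TypeKind::Quad | TypeKind::QDate => Some(8),
        TypeKind::String | TypeKind::Regex => None,
    }
}
```

With a catch-all arm, a new variable-width variant would silently fall into the fixed-width branch. Listing every variant turns that mistake into a build failure.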
### Documentation
- **lib**: Add `# Security` sections to
`MagicDatabase::with_builtin_rules`, `::with_builtin_rules_and_config`,
`::load_from_file`, and `::load_from_file_with_config` warning about the
unbounded default timeout and recommending
`EvaluationConfig::performance()` for untrusted input.
- **lib**: Document `MagicDatabase: Send + Sync` for parallel scanning.
- **README**: Update `TypeKind` enum example to match the current AST,
add `regex` and `search/N` to the supported types table, add pre-1.0 API
stability warning, correct the roadmap to mark v0.2-v0.4 as shipped.
- **AGENTS.md**: Relabel "Currently Implemented (v0.1.0)" and "Current
Limitations (v0.1.0)" to v0.5.0 and rewrite the Development Phases
section to reflect actual shipped scope.
### Testing
- Security regression tests for S-H1 (planted-magic-file in cwd), S-H2
(TOCTOU path-swap contract), S-M2 (pathological regex bounded runtime),
S-L2 (codegen message escape round-trip), and GOTCHAS S13.1
(`EvaluationConfig::default()` unbounded timeout invariant).
- Backspace message concatenation regression tests for first-match,
consecutive, and empty-rest edge cases.
- `MagicRule::validate()` tests covering empty message, child level
invariant, and max-depth rejection.
- `RegexCache` population/clear/reuse tests.
### Breaking Changes
- **parser**: `parse_text_magic_file` return type changed from
`Result<Vec<MagicRule>, ParseError>` to `Result<ParsedMagic,
ParseError>`. Callers must destructure `ParsedMagic { rules, name_table
}`. Low-level callers that only need the rule list can use
`parsed.rules`. `load_magic_file` and `load_magic_directory` return the
same new type.
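
A rough migration sketch for callers affected by this change is shown below. The types are simplified stand-ins: `MagicRule` is reduced to a single field, `NameTable` is assumed to be a map for the sake of the example, and `into_rules` is a hypothetical helper that is not part of the crate.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the crate's `MagicRule`.
#[derive(Debug, Clone, PartialEq)]
pub struct MagicRule {
    pub message: String,
}

/// Assumed shape for illustration; the crate's `NameTable` may differ.
pub type NameTable = HashMap<String, Vec<MagicRule>>;

/// Mirrors the `ParsedMagic { rules, name_table }` shape named above.
pub struct ParsedMagic {
    pub rules: Vec<MagicRule>,
    pub name_table: NameTable,
}

/// Hypothetical helper for callers that previously received
/// `Vec<MagicRule>` and do not use named subroutines.
pub fn into_rules(parsed: ParsedMagic) -> Vec<MagicRule> {
    // Destructure and explicitly ignore the name table.
    let ParsedMagic { rules, name_table: _ } = parsed;
    rules
}
```

Callers that do evaluate `name`/`use` rules would keep both fields and not discard `name_table`.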
</blockquote>
</p></details>
---
This PR was generated with
[release-plz](https://github.com/release-plz/release-plz/).
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
This pull request adds support for date and timestamp types to the project, enabling both 32-bit and 64-bit Unix timestamp parsing and evaluation. These types are now fully integrated into the type system, parser, evaluator, and test suite, and their values are formatted to match the GNU `file` output. Documentation and roadmap files have also been updated to reflect this new feature.

### Date and timestamp type support
- `TypeKind::Date` (32-bit) and `TypeKind::QDate` (64-bit) variants added to the type system, including endianness and UTC/local time options, and integrated into type serialization, parsing, and code generation (`src/parser/ast.rs`, `src/parser/types.rs`, `src/parser/codegen.rs`).
- […] `chrono` crate for timestamp formatting, and exposed them in the public API (`src/evaluator/types/mod.rs`).
- […] (`src/evaluator/types/mod.rs`, `src/evaluator/types/tests.rs`).

### Documentation and roadmap updates
- […] `file` output (`AGENTS.md`, `docs/src/api-reference.md`).
- […] (`ROADMAP.md`).

### Evaluator and test improvements
- […] (`src/evaluator/strength.rs`).
- […] (`src/parser/ast.rs`).

### Dependency changes
- […] `chrono` crate to `Cargo.toml` to support timestamp formatting for date and qdate types (`Cargo.toml`).
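
To make the date handling concrete, here is a minimal, std-only sketch of reading a 32-bit Unix timestamp with a given byte order and rendering it in UTC. It is a stand-in, not the PR's implementation, which uses `chrono` according to the description. The names `Endian`, `read_date32`, `civil_from_days`, and `format_utc` are hypothetical, and the `YYYY-MM-DD HH:MM:SS` layout is an assumption, not the format GNU `file` prints. The day-to-date conversion follows the well-known civil-from-days algorithm for the proleptic Gregorian calendar.

```rust
/// Byte order of the stored timestamp (hypothetical name for this sketch).
#[derive(Clone, Copy)]
pub enum Endian {
    Little,
    Big,
}

/// Read a signed 32-bit Unix timestamp from the first four bytes of `buf`.
/// Returns `None` if the buffer is too short.
pub fn read_date32(buf: &[u8], endian: Endian) -> Option<i64> {
    let b = buf.get(..4)?;
    let bytes = [b[0], b[1], b[2], b[3]];
    let secs = match endian {
        Endian::Little => i32::from_le_bytes(bytes),
        Endian::Big => i32::from_be_bytes(bytes),
    };
    Some(i64::from(secs))
}

/// Convert days since 1970-01-01 into (year, month, day).
fn civil_from_days(days: i64) -> (i64, i64, i64) {
    let z = days + 719_468; // shift epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index, 0 = March
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Format Unix seconds as a UTC string (layout is an assumption).
pub fn format_utc(secs: i64) -> String {
    let days = secs.div_euclid(86_400);
    let sod = secs.rem_euclid(86_400); // seconds of day, always non-negative
    let (y, m, d) = civil_from_days(days);
    format!(
        "{:04}-{:02}-{:02} {:02}:{:02}:{:02}",
        y,
        m,
        d,
        sod / 3_600,
        (sod % 3_600) / 60,
        sod % 60
    )
}
```

For instance, the little-endian bytes `00 CA 9A 3B` decode to 1,000,000,000 seconds, which this sketch renders as `2001-09-09 01:46:40`. Using Euclidean division keeps timestamps before 1970 correct. Local-time output would additionally need the system's UTC offset, which is omitted here.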