From 4a92b204e8afff99b337d2540319360e1b531bc9 Mon Sep 17 00:00:00 2001
From: UncleSp1d3r
Date: Mon, 2 Mar 2026 01:08:31 -0500
Subject: [PATCH 1/2] docs(security): update security assurance case with new threat vectors

Signed-off-by: UncleSp1d3r
---
 docs/src/security-assurance.md | 63 +++++++++++++++++++---------------
 1 file changed, 35 insertions(+), 28 deletions(-)

diff --git a/docs/src/security-assurance.md b/docs/src/security-assurance.md
index aa32d7f5..9d034452 100644
--- a/docs/src/security-assurance.md
+++ b/docs/src/security-assurance.md
@@ -29,6 +29,7 @@ libmagic-rs is a file type detection library and CLI tool. Its security requirem
 | Malicious file author | Exploit the detection tool to gain code execution or cause DoS | Can craft arbitrary file contents |
 | Malicious magic file author | Inject rules that cause crashes, resource exhaustion, or incorrect results | Can craft arbitrary magic rule syntax |
 | Supply chain attacker | Compromise a dependency to inject malicious code | Can publish malicious crate versions |
+
 ### 2.3 Attack Vectors
 
 | ID | Vector | Target SR |
@@ -40,36 +41,38 @@ libmagic-rs is a file type detection library and CLI tool. Its security requirem
 | AV-5 | Malformed magic file causes parser crash | SR-2 |
 | AV-6 | CLI argument with path traversal reads unintended files | SR-4 |
 | AV-7 | Compromised dependency introduces unsafe code | SR-5 |
+
 ## 3.
Trust Boundaries -```text -+------------------------------------------------------------------+ -| Untrusted | -| +------------------+ +-------------------+ | -| | Input Files | | Magic Files | | -| | (any content) | | (user or system) | | -| +--------+---------+ +--------+----------+ | -| | | | -+-----------+-----------------------+-------------------------------+ - | | - =========|=======================|============ Trust Boundary ==== - | | -+-----------v-----------------------v-------------------------------+ -| libmagic-rs | -| | -| +----------------+ +----------------+ +--------------+ | -| | Parser | | Evaluator | | Output | | -| | - validates | | - bounds-check | | - formats | | -| | magic syntax | | all access | | results | | -| +----------------+ +----------------+ +--------------+ | -| | -| +----------------+ +----------------+ | -| | I/O Layer | | CLI | | -| | - mmap files | | - clap args | | -| | - size limits | | - validates | | -| +----------------+ +----------------+ | -+------------------------------------------------------------------+ +```mermaid +flowchart TD + subgraph Untrusted["Untrusted Zone"] + direction LR + IF["Input Files
(any content)"] + MF["Magic Files
(user or system)"] + CA["CLI Arguments
(user paths)"] + end + + subgraph libmagic-rs["libmagic-rs (Trusted Zone)"] + IO["I/O Layer
mmap files, size limits"] + CLI["CLI
clap args, validates paths"] + P["Parser
validates magic syntax"] + E["Evaluator
bounds-checks all access"] + O["Output
formats results"] + end + + IF -- "file bytes" --> IO + MF -- "magic syntax" --> P + CA -- "user paths" --> CLI + IO -- "mapped buffer" --> E + CLI -- "validated paths" --> IO + P -- "validated AST" --> E + E -- "match results" --> O + + style Untrusted fill:#fee,stroke:#c00,stroke-width:2px + style libmagic-rs fill:#efe,stroke:#090,stroke-width:2px ``` + All data crossing the trust boundary (file contents, magic file syntax, CLI arguments) is treated as untrusted and validated before use. ## 4. Secure Design Principles (Saltzer and Schroeder) @@ -84,6 +87,7 @@ All data crossing the trust boundary (file contents, magic file syntax, CLI argu | **Least privilege** | The tool only reads files; it never writes, executes, or modifies them. No network access. No elevated permissions required. | | **Least common mechanism** | No shared mutable state between file evaluations. Each evaluation operates on its own data. No global caches that could leak information. | | **Psychological acceptability** | CLI follows GNU `file` conventions. Error messages are descriptive and actionable. Default behavior is safe (built-in rules, no network). | + ## 5. Common Weakness Countermeasures ### 5.1 CWE/SANS Top 25 @@ -104,6 +108,7 @@ All data crossing the trust boundary (file contents, magic file syntax, CLI argu | CWE-190 | Integer overflow | Rust panics on integer overflow in debug builds. Offset calculations use checked arithmetic. | Mitigated | | CWE-502 | Deserialization of untrusted data | Magic files are parsed with a strict grammar, not deserialized from arbitrary formats. | Mitigated | | CWE-400 | Resource exhaustion | Evaluation timeouts prevent unbounded CPU use. Memory-mapped I/O avoids loading entire files into memory. | Mitigated | + ### 5.2 OWASP Top 10 (where applicable) Most OWASP Top 10 categories target web applications and are not applicable to a file detection library. 
The applicable items are: @@ -114,6 +119,7 @@ Most OWASP Top 10 categories target web applications and are not applicable to a | A04: Insecure Design | Applicable | Secure design principles applied throughout (see Section 4) | | A06: Vulnerable Components | Applicable | `cargo audit` daily, `cargo deny`, Dependabot, `cargo-auditable` | | A09: Security Logging | Partial | Evaluation errors logged; security events reported via GitHub Advisories | + ## 6. Supply Chain Security | Measure | Implementation | @@ -127,6 +133,7 @@ Most OWASP Top 10 categories target web applications and are not applicable to a | Binary auditing | `cargo-auditable` embeds dependency metadata in binaries | | CI integrity | All GitHub Actions pinned to SHA hashes | | Code review | Required on all PRs; automated by CodeRabbit with security-focused checks | + ## 7. Ongoing Assurance This assurance case is maintained as a living document. It is updated when: @@ -136,4 +143,4 @@ This assurance case is maintained as a living document. It is updated when: * Dependencies change significantly * Security incidents occur -The project maintains continuous assurance through automated CI checks (clippy, CodeQL, cargo audit, cargo deny) that run on every commit. \ No newline at end of file +The project maintains continuous assurance through automated CI checks (clippy, CodeQL, cargo audit, cargo deny) that run on every commit. 
From 0c581eccafe3808cc015c306b4ee51b116a7c56e Mon Sep 17 00:00:00 2001 From: UncleSp1d3r Date: Mon, 2 Mar 2026 01:08:42 -0500 Subject: [PATCH 2/2] style(docs): update diagram styles for architecture and output sections Signed-off-by: UncleSp1d3r --- docs/src/architecture.md | 60 +++++++++++++++++----------------- docs/src/output.md | 46 +++++++++++++++++--------- docs/src/security-assurance.md | 4 +-- 3 files changed, 62 insertions(+), 48 deletions(-) diff --git a/docs/src/architecture.md b/docs/src/architecture.md index 8896d909..e9f744a5 100644 --- a/docs/src/architecture.md +++ b/docs/src/architecture.md @@ -28,15 +28,15 @@ flowchart LR TF --> FB --> E E --> R --> F --> O - style MF fill:#e1f5fe - style TF fill:#e1f5fe - style P fill:#fff3e0 - style AST fill:#fff3e0 - style FB fill:#fff3e0 - style E fill:#fff3e0 - style R fill:#e8f5e9 - style F fill:#e8f5e9 - style O fill:#e8f5e9 + style MF fill:#1a3a5c,stroke:#4a9eff,color:#e0e0e0 + style TF fill:#1a3a5c,stroke:#4a9eff,color:#e0e0e0 + style P fill:#4a3000,stroke:#ffb74d,color:#e0e0e0 + style AST fill:#4a3000,stroke:#ffb74d,color:#e0e0e0 + style FB fill:#4a3000,stroke:#ffb74d,color:#e0e0e0 + style E fill:#4a3000,stroke:#ffb74d,color:#e0e0e0 + style R fill:#1b3d1b,stroke:#66bb6a,color:#e0e0e0 + style F fill:#1b3d1b,stroke:#66bb6a,color:#e0e0e0 + style O fill:#1b3d1b,stroke:#66bb6a,color:#e0e0e0 ``` ## Core Components @@ -188,8 +188,8 @@ flowchart LR C --> D[Validation] D --> E[Cached Rules] - style A fill:#e3f2fd - style E fill:#c8e6c9 + style A fill:#1a3a5c,stroke:#4a9eff,color:#e0e0e0 + style E fill:#1b3d1b,stroke:#66bb6a,color:#e0e0e0 ``` 1. **Parsing**: Convert text DSL to structured AST @@ -207,8 +207,8 @@ flowchart LR D --> E[Results] E --> F[Formatting] - style A fill:#e3f2fd - style F fill:#c8e6c9 + style A fill:#1a3a5c,stroke:#4a9eff,color:#e0e0e0 + style F fill:#1b3d1b,stroke:#66bb6a,color:#e0e0e0 ``` 1. 
**File Access**: Create memory-mapped buffer @@ -238,17 +238,17 @@ Magic rules form a tree structure where: ```mermaid flowchart TD - R[Root Rule
e.g., "0 string PK"] - R -->|match| C1[Child Rule 1
e.g., ">4 ubyte 0x14"] - R -->|match| C2[Child Rule 2
e.g., ">4 ubyte 0x06"] - C1 -->|match| G1[Grandchild
ZIP archive v2.0] - C2 -->|match| G2[Grandchild
ZIP archive v1.0] - - style R fill:#e3f2fd - style C1 fill:#fff3e0 - style C2 fill:#fff3e0 - style G1 fill:#c8e6c9 - style G2 fill:#c8e6c9 + R["Root Rule
e.g., 0 string PK"] + R -->|match| C1["Child Rule 1
e.g., #gt;4 ubyte 0x14"] + R -->|match| C2["Child Rule 2
e.g., #gt;4 ubyte 0x06"] + C1 -->|match| G1["Grandchild
ZIP archive v2.0"] + C2 -->|match| G2["Grandchild
ZIP archive v1.0"] + + style R fill:#1a3a5c,stroke:#4a9eff,color:#e0e0e0 + style C1 fill:#4a3000,stroke:#ffb74d,color:#e0e0e0 + style C2 fill:#4a3000,stroke:#ffb74d,color:#e0e0e0 + style G1 fill:#1b3d1b,stroke:#66bb6a,color:#e0e0e0 + style G2 fill:#1b3d1b,stroke:#66bb6a,color:#e0e0e0 ``` **Operator Support:** @@ -359,12 +359,12 @@ flowchart TD E --> ER O --> ER - style L fill:#e8eaf6 - style P fill:#fff8e1 - style E fill:#fff8e1 - style O fill:#fff8e1 - style I fill:#e8f5e9 - style ER fill:#ffebee + style L fill:#2a1a4a,stroke:#b39ddb,color:#e0e0e0 + style P fill:#4a3000,stroke:#ffb74d,color:#e0e0e0 + style E fill:#4a3000,stroke:#ffb74d,color:#e0e0e0 + style O fill:#4a3000,stroke:#ffb74d,color:#e0e0e0 + style I fill:#1b3d1b,stroke:#66bb6a,color:#e0e0e0 + style ER fill:#4a1a1a,stroke:#ef5350,color:#e0e0e0 ``` **Dependency Rules:** diff --git a/docs/src/output.md b/docs/src/output.md index 6e91610f..6b1fbf74 100644 --- a/docs/src/output.md +++ b/docs/src/output.md @@ -14,7 +14,7 @@ The output module is organized across three files: ### `output::MatchResult` -Represents a single magic rule match in the output layer. Created by converting from an evaluator-level `MatchResult`, with additional fields for structured output. +Represents a single magic rule match in the output layer. Created by converting from an evaluator-level `RuleMatch`, with additional fields for structured output. ```rust pub struct MatchResult { @@ -32,7 +32,7 @@ Key constructors: - `MatchResult::new(message, offset, value)` -- Creates a match with default confidence of 50. - `MatchResult::with_metadata(...)` -- Creates a fully specified match. Confidence is clamped to 100. -- `MatchResult::from_evaluator_match(m, mime_type)` -- Converts from the evaluator's `MatchResult`. Scales confidence from 0.0--1.0 to 0--100 and extracts rule path tags using the shared `TagExtractor`. +- `MatchResult::from_evaluator_match(m, mime_type)` -- Converts from the evaluator's `RuleMatch`. 
Scales confidence from 0.0--1.0 to 0--100 and extracts rule path tags using the shared `TagExtractor`. ### `output::EvaluationResult` @@ -96,21 +96,25 @@ The text module (`src/output/text.rs`) produces output compatible with the GNU ` ### Examples Single file, single match: + ```text photo.png: PNG image data ``` Single file, multiple matches: + ```text ls: ELF 64-bit LSB executable, x86-64, dynamically linked ``` No matches: + ```text unknown.bin: data ``` Error case: + ```text missing.txt: ERROR: File not found ``` @@ -195,24 +199,34 @@ All three return `Result`. The full conversion pipeline from evaluation to output: -```text -evaluator::MatchResult ──from_evaluator_match──> output::MatchResult - │ - ┌─────────────┼─────────────┐ - v v v - format_text format_json format_json_line - output output output +```mermaid +flowchart TD + EM["evaluator::RuleMatch"] + EM -- "from_evaluator_match" --> OM["output::MatchResult"] + OM --> FT["format_text_output"] + OM --> FJ["format_json_output"] + OM --> FL["format_json_line_output"] + + style EM fill:#4a3000,stroke:#ffb74d,color:#e0e0e0 + style OM fill:#2a1a4a,stroke:#b39ddb,color:#e0e0e0 + style FT fill:#1b3d1b,stroke:#66bb6a,color:#e0e0e0 + style FJ fill:#1b3d1b,stroke:#66bb6a,color:#e0e0e0 + style FL fill:#1b3d1b,stroke:#66bb6a,color:#e0e0e0 ``` When converting from the library's top-level `EvaluationResult`: -```text -lib::EvaluationResult ──from_library_result──> output::EvaluationResult - │ - ┌─────────────────┬──────┘ - v v - format_evaluation JsonOutput::from_evaluation_result - _result (text) (JSON) +```mermaid +flowchart TD + LE["lib::EvaluationResult"] + LE -- "from_library_result" --> OE["output::EvaluationResult"] + OE --> FER["format_evaluation_result
(text)"] + OE --> JER["JsonOutput::from_evaluation_result
(JSON)"] + + style LE fill:#4a3000,stroke:#ffb74d,color:#e0e0e0 + style OE fill:#2a1a4a,stroke:#b39ddb,color:#e0e0e0 + style FER fill:#1b3d1b,stroke:#66bb6a,color:#e0e0e0 + style JER fill:#1b3d1b,stroke:#66bb6a,color:#e0e0e0 ``` ## Serialization diff --git a/docs/src/security-assurance.md b/docs/src/security-assurance.md index 9d034452..daa74085 100644 --- a/docs/src/security-assurance.md +++ b/docs/src/security-assurance.md @@ -69,8 +69,8 @@ flowchart TD P -- "validated AST" --> E E -- "match results" --> O - style Untrusted fill:#fee,stroke:#c00,stroke-width:2px - style libmagic-rs fill:#efe,stroke:#090,stroke-width:2px + style Untrusted fill:#4a1a1a,stroke:#ef5350,color:#e0e0e0,stroke-width:2px + style libmagic-rs fill:#1b3d1b,stroke:#66bb6a,color:#e0e0e0,stroke-width:2px ``` All data crossing the trust boundary (file contents, magic file syntax, CLI arguments) is treated as untrusted and validated before use.
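
The trust-boundary statement above, together with the assurance case's claims that the evaluator "bounds-checks all access" and that offset calculations use checked arithmetic, can be illustrated with a minimal sketch. The helper name `read_u16_le` and its signature are hypothetical and are not taken from libmagic-rs; the sketch only shows one way such a check might look using standard-library calls.

```rust
/// Hypothetical helper (not from libmagic-rs): read a little-endian u16 at
/// `base + delta` from an untrusted buffer. Returns `None` instead of
/// panicking when the offset overflows or points past the end of the data.
pub fn read_u16_le(buf: &[u8], base: usize, delta: usize) -> Option<u16> {
    // checked_add rejects offsets that would wrap around usize::MAX.
    let start = base.checked_add(delta)?;
    let end = start.checked_add(2)?;
    // slice::get yields None for an out-of-range span rather than panicking.
    let bytes = buf.get(start..end)?;
    Some(u16::from_le_bytes([bytes[0], bytes[1]]))
}
```

With this shape, a crafted offset in a magic rule or a truncated input file produces a `None` that the caller can treat as "no match", rather than a crash.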