Conversation
Reviewer's GuideThis PR refactors the Action hashing mechanism to use canonical JSON serialization via serde_json, replacing manual field-by-field hashing and ensuring stable digests as the Action struct evolves; it also derives Serialize on Action, updates docs and tests to align with the new approach. Class diagram for updated Action struct serializationclassDiagram
class Action {
+Recipe recipe
+Option~String~ description
+Option~String~ depfile
+Option~String~ deps_format
+Option~String~ pool
+bool restat
}
Action : <<Serialize>>
Class diagram for refactored ActionHasherclassDiagram
class ActionHasher {
+hash(action: &Action) String
}
File-Level Changes
Assessment against linked issues
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Summary by CodeRabbit
WalkthroughUpdate the Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant ActionHasher
participant serde_json
participant SHA256
Caller->>ActionHasher: hash(&Action)
ActionHasher->>serde_json: serialize(Action)
serde_json-->>ActionHasher: canonical JSON bytes
ActionHasher->>SHA256: compute digest(JSON bytes)
SHA256-->>ActionHasher: hash digest
ActionHasher-->>Caller: hex string of digest
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Assessment against linked issues
Assessment against linked issues: Out-of-scope changesNo out-of-scope changes identified. Possibly related PRs
Suggested reviewers
Poem
Warning Review ran into problems🔥 ProblemsErrors were encountered while retrieving linked issues. Errors (1)
📜 Recent review detailsConfiguration used: CodeRabbit UI ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (5)
🧰 Additional context used📓 Path-based instructions (4)Cargo.toml📄 CodeRabbit Inference Engine (AGENTS.md)
Files:
**/*.rs📄 CodeRabbit Inference Engine (AGENTS.md)
Files:
⚙️ CodeRabbit Configuration File
Files:
docs/**/*.md📄 CodeRabbit Inference Engine (AGENTS.md)
Files:
**/*.md📄 CodeRabbit Inference Engine (AGENTS.md)
Files:
⚙️ CodeRabbit Configuration File
Files:
🧬 Code Graph Analysis (2)tests/hasher_tests.rs (1)
src/ir.rs (1)
🔍 MCP Research (1 server)Deepwiki:
⏰ Context from checks skipped due to timeout of 120000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
🔇 Additional comments (9)
✨ Finishing Touches
🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Hey @leynos - I've reviewed your changes and found some issues that need to be addressed.
- Note that serde_json::to_vec does not guarantee canonical JSON key ordering for nested maps—use serde_json_canonicalizer for guaranteed derterminism.
- Instead of panicking on serialization errors with expect, consider changing hash to return a Result so callers can handle failures gracefully.
- Serializing the entire Action into a Vec before hashing may incur extra allocations—consider streaming JSON directly into the hasher to reduce memory overhead on hot code paths.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Note that serde_json::to_vec does not guarantee canonical JSON key ordering for nested maps—consider using a dedicated canonical JSON library or explicit key sorting to ensure output stability.
- Instead of panicking on serialization errors with expect, consider changing hash to return a Result so callers can handle failures gracefully.
- Serializing the entire Action into a Vec<u8> before hashing may incur extra allocations—consider streaming JSON directly into the hasher to reduce memory overhead on hot code paths.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)
tests/snapshots/ninja/ninja_snapshot_tests__touch_manifest_ninja.snapis excluded by!**/*.snap
📒 Files selected for processing (4)
docs/netsuke-design.md(3 hunks)src/hasher.rs(2 hunks)src/ir.rs(1 hunks)tests/hasher_tests.rs(6 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.rs
📄 CodeRabbit Inference Engine (AGENTS.md)
**/*.rs: Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider usingArcto reduce the amount of data returned.
Write unit and behavioural tests for new functionality. Run both before and after making any change.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Prefer immutable data and avoid unnecessarymutbindings.
Handle errors with theResulttype instead of panicking where feasible.
Avoidunsafecode unless absolutely necessary and document any usage clearly.
Place function attributes after doc comments.
Do not usereturnin single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Preferexpectoverallow.
Prefer.expect()over.unwrap().
Useconcat!()to combine long string literals rather than escaping newlines with a backslash.
Prefer semantic error enums: Derivestd::error::Error(via thethiserrorcrate) for any condition the caller might inspect, retry, or map to an HTTP status.
Use an opaque error only at the app boundary: Useeyre::Reportfor human-readable logs; these should not be exposed in public APIs.
Never export the opaque type from a library: Convert to domain enums at API boundaries, and toeyreonly in the mainmain()entrypoint or top-level async task.
Files:
tests/hasher_tests.rssrc/hasher.rssrc/ir.rs
⚙️ CodeRabbit Configuration File
**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.
Adhere to single responsibility and CQRS
Place function attributes after doc comments.
Do not use
returnin single-line functions.Move conditionals with >2 branches into a predicate function.
Avoid
unsafeunless absolutely necessary.Every module must begin with a
//!doc comment that explains the module's purpose and utility.Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar
Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.Use
rstestfixtures for shared setup and to avoid repetition between tests.Replace duplicated tests with
#[rstest(...)]parameterised cases.Prefer
mockallfor mocks/stubs.Prefer
.expect()over.unwrap()Ensure that any API or behavioural changes are reflected in the documentation in
docs/Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/Files must not exceed 400 lines in length
- Large modules must be decomposed
- Long match statements or dispatch tables should be decomposed by domain and collocated with targets
- Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
Files:
tests/hasher_tests.rssrc/hasher.rssrc/ir.rs
docs/**/*.md
📄 CodeRabbit Inference Engine (AGENTS.md)
docs/**/*.md: Use the markdown files within thedocs/directory as a knowledge base and source of truth for project requirements, dependency choices, and architectural decisions.
Proactively update the relevant file(s) in thedocs/directory to reflect the latest state when new decisions are made, requirements change, libraries are added/removed, or architectural patterns evolve.
Files:
docs/netsuke-design.md
**/*.md
📄 CodeRabbit Inference Engine (AGENTS.md)
**/*.md: Documentation must use en-GB-oxendict spelling and grammar, except for the naming of the "LICENSE" file.
Validate Markdown files usingmake markdownlint.
Runmake fmtafter any documentation changes to format all Markdown files and fix table markup.
Validate Mermaid diagrams in Markdown files by runningmake nixie.
Markdown paragraphs and bullet points must be wrapped at 80 columns.
Code blocks in Markdown must be wrapped at 120 columns.
Tables and headings in Markdown must not be wrapped.
Use dashes (-) for list bullets in Markdown.
Use GitHub-flavoured Markdown footnotes ([^1]) for references and footnotes.
Files:
docs/netsuke-design.md
⚙️ CodeRabbit Configuration File
**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")
- Use en-GB-oxendict (-ize / -our) spelling and grammar
- Paragraphs and bullets must be wrapped to 80 columns, except where a long URL would prevent this (in which case, silence MD013 for that line)
- Code blocks should be wrapped to 120 columns.
- Headings must not be wrapped.
- Documents must start with a level 1 heading
- Headings must correctly increase or decrease by no more than one level at a time
- Use GitHub-flavoured Markdown style for footnotes and endnotes.
- Numbered footnotes must be numbered by order of appearance in the document.
Files:
docs/netsuke-design.md
🔍 MCP Research (1 server)
Deepwiki:
-
The
Actionstruct now derives theSerializetrait with optional fields annotated to skip serialization ifNone, enabling stable canonical JSON serialization used for hashing. (src/ir.rs) -
The
ActionHasherimplementation was simplified by replacing manual field-by-field hashing with serialization of the entireActionto canonical JSON and hashing the resulting bytes with SHA-256, producing a lowercase hex string. (src/hasher.rs) -
The
hash_action_is_stabletest cases were updated to reflect new expected hash values consistent with the canonical JSON + SHA-256 hashing method, without changing test logic orActionparameters. (tests/hasher_tests.rs) -
The documentation clarifies that build action deduplication in the
BuildGraphIR uses SHA-256 hashes computed over canonical JSON serialization ofActionproperties, replacing previous generic hashing. (docs/netsuke-design.md)
🔇 Additional comments (5)
src/ir.rs (1)
42-55: Derive Serialize and skip absent optionals to stabilise hashesDerive
SerializeonActionand skipNoneoptionals to keep the JSON surface stable. This aligns with the canonical JSON hashing approach and the Deepwiki decision record. LGTM.tests/hasher_tests.rs (1)
18-18: Update expected digests to match canonical JSON + SHA-256Updated expectations align with the new hashing method. LGTM.
Also applies to: 29-29, 40-40, 51-51, 62-62, 75-75
docs/netsuke-design.md (3)
916-918: Document deduplication via canonical JSON hashState that
actionskeys use a SHA-256 over canonical JSON. This matches the implementation intent. LGTM.
1113-1116: Call out action deduplication behaviourClarify deduplication semantics in the design doc. LGTM.
86-99: Validate Mermaid diagrams with make nixie
Run make nixie locally to ensure all Mermaid blocks in docs/netsuke-design.md pass validation. Fix any “too many arguments for mermaid-cli” errors by:
- Using fences of the form ```mermaid with no trailing tags
- Removing stray backticks or inline comments inside the diagram block
- Placing each diagram in its own isolated code fence
Apply this check to lines 86–99, 190–231 and 966–1007.
|
@coderabbitai Have the following now been resolved?
|
🧩 Analysis chainShort answer:
Proposed changes:
# Cargo.toml
[dependencies]
serde_json_canonicalizer = "0.1"
sha2 = "0.10"
// src/hasher.rs
use sha2::{Digest, Sha256};
use std::fmt;
use std::io::{self, Write};
#[derive(Debug)]
pub enum HashError {
Canonicalize(io::Error),
Serialize(serde_json::Error),
}
impl fmt::Display for HashError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
HashError::Canonicalize(e) => write!(f, "canonical JSON write failed: {e}"),
HashError::Serialize(e) => write!(f, "serialization failed: {e}"),
}
}
}
impl std::error::Error for HashError {
fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
match self {
HashError::Canonicalize(e) => Some(e),
HashError::Serialize(e) => Some(e),
}
}
}
struct Sha256Writer {
hasher: Sha256,
}
impl Sha256Writer {
fn new() -> Self { Self { hasher: Sha256::new() } }
fn finalize(self) -> String {
let digest = self.hasher.finalize();
hex::encode(digest)
}
}
impl Write for Sha256Writer {
fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
self.hasher.update(buf);
Ok(buf.len())
}
fn flush(&mut self) -> io::Result<()> { Ok(0 == 0) }
}
pub struct ActionHasher;
impl ActionHasher {
pub fn hash(action: &Action) -> Result<String, HashError> {
let mut writer = Sha256Writer::new();
// Stream canonical JSON directly into the hasher-backed writer.
// This guarantees deterministic key ordering for maps and avoids extra allocation.
serde_json_canonicalizer::to_writer(&mut writer, action)
.map_err(HashError::Serialize)?;
Ok(writer.finalize())
}
}
let got = ActionHasher::hash(&action).expect("hashing must succeed");
assert_eq!(got, expected);This addresses:
If you prefer a simpler error surface, collapse HashError to anyhow::Result. I can also wire this into your existing error types if you point me to the preferred crate/pattern. Would you like me to open a follow-up issue to track these refinements, or shall I draft a commit patch here? 🏁 Script executed: Length of output: 14995 @leynos
Tiny follow-up: the example in src/hasher.rs still shows using hash without handling Result. Update it to unwrap/expect, e.g.: let hash = ActionHasher::hash(&action).expect("hash action");
assert!(!hash.is_empty());Want me to open a quick follow-up issue to track that doc tweak? ✅ Actions performedComments resolved and changes approved. |
Summary
Actionwith serde and hash its canonical JSONcloses #35
Testing
make fmtmake lintmake testmake markdownlintmake nixie(fails: too many arguments for mermaid-cli)https://chatgpt.com/codex/tasks/task_e_68953c19c0008322b8e2d6b8b6f8a876
Summary by Sourcery
Use canonical JSON serialization for action hashing to ensure stable, order-independent SHA-256 digests
Enhancements:
Documentation:
Tests: