Refactor manifest internals into modular diagnostics, glob, and macros#244
Refactor manifest internals into modular diagnostics, glob, and macros#244
Conversation
…acro handling - Remove old manifest glob module and replace with a new, modular glob handling system including errors, normalize, validate, and walk submodules. - Add detailed glob pattern validation, normalization, and expansion with error context and UTF-8 safety. - Introduce manifest diagnostics module to provide actionable errors during manifest parsing, improving YAML and JSON error mapping. - Refactor manifest diagnostics code by moving YAML-specific diagnostics to a dedicated submodule. - Rename and reorganize jinja_macros module to separate macro caching and invocation logic. - Improve manifest macro registration and invocation with caching to reuse compiled templates efficiently. - Move manifest macro related tests under organized modules for better maintainability. This change improves manifest parsing diagnostics and glob pattern validation and handling, enhancing reliability and developer experience. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. Summary by CodeRabbitRelease Notes
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughSummarise the addition of structured manifest diagnostics, split the previous glob implementation into focused glob submodules (normalize, validate, walk, errors), refactor Jinja macro registration into a dedicated module, and add related tests for macros and workspace resolution. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
Possibly related issues
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsErrors were encountered while retrieving linked issues. Errors (1)
Comment |
Reviewer's GuideRefactors manifest internals into modular diagnostics, glob, and Jinja macro subsystems, and reorganises associated tests, without altering user-facing behaviour. Class diagram for the modularised glob handling subsystemclassDiagram
class GlobPattern {
+raw : String
+normalized : String
}
class GlobModule {
+glob_paths(pattern) Result~Vec~String~~
}
class NormalizeModule {
+normalize_separators(pattern) String
+force_literal_escapes(pattern) String
}
class ValidateModule {
+validate_brace_matching(pattern) Result~()~
}
class BraceValidator {
-depth : i32
-in_class : bool
-last_open_pos : Option~usize~
-escaped : bool
+new() BraceValidator
+process_character(ch, pos, pattern) Result~()~
+handle_escape_sequence(ch) bool
+handle_character_class(ch) void
+handle_braces(ch, pos, pattern) Result~()~
+validate_final_state(pattern) Result~()~
}
class WalkModule {
+open_root_dir(pattern) Result~Dir~
+process_glob_entry(entry, pattern, root) Result~Option~String~~
+fetch_metadata(root, path) Result~Metadata~
}
class GlobErrorContext {
+pattern : String
+error_char : char
+position : usize
+error_type : GlobErrorType
}
class GlobErrorType {
<<enum>>
InvalidPattern
UnmatchedBrace
IoError
}
class GlobErrorsModule {
+create_glob_error(ctx, detail) Error
+create_unmatched_brace_error(ctx) Error
}
class GlobEntryResult {
<<type alias>>
Result~PathBuf, glob__GlobError~
}
class Error {
<<from minijinja>>
}
class Dir {
<<from cap_std>>
}
class Utf8Path {
<<from camino>>
}
GlobModule ..> GlobPattern : constructs
GlobModule ..> NormalizeModule : uses
GlobModule ..> ValidateModule : uses
GlobModule ..> WalkModule : uses
GlobModule ..> GlobErrorsModule : uses
ValidateModule ..> BraceValidator : uses
WalkModule ..> GlobPattern : uses
WalkModule ..> GlobEntryResult : uses
WalkModule ..> Dir : uses
WalkModule ..> Utf8Path : uses
WalkModule ..> GlobErrorsModule : uses
GlobErrorsModule ..> GlobErrorContext : uses
GlobErrorContext ..> GlobErrorType : has
GlobModule ..> Error : returns
ValidateModule ..> Error : returns
WalkModule ..> Error : returns
Class diagram for the refactored Jinja macros subsystemclassDiagram
class JinjaMacrosModule {
+parse_macro_name(signature) Result~String~
+register_macro(env, macro_def, index) Result~()~
+register_manifest_macros(doc, env) Result~()~
+call_macro_value(state, macro_value, positional, kwargs) Result~Value, Error~
}
class MacroCache {
+template_name : String
+macro_name : String
-instance : OnceLock~MacroInstance~
+new(template_name, macro_name) MacroCache
+prepare(env) Result~()~
}
class MacroInstance {
-state_guard : MacroStateGuard
-value : Value
+new(env, template_name, macro_name) Result~MacroInstance~
+state() &State
+value() &Value
}
class MacroStateGuard {
-ptr : NonNull~State~
+new(state) MacroStateGuard
+state() &State
+into_inner(self) State
}
class CacheModule {
+make_macro_fn(cache) impl Fn(State, Rest~Value~, Kwargs) -> Result~Value, Error~
}
class MacroDefinition {
+signature : String
+body : String
}
class Environment {
<<from minijinja>>
+add_template_owned(name, source) Result~()~
+add_function(name, func) void
+get_template(name) Result~Template~
}
class State {
<<from minijinja>>
}
class Value {
<<from minijinja>>
+call(state, args) Result~Value, Error~
}
class Kwargs {
<<from minijinja>>
}
class RestValue {
<<alias Rest~Value~~>>
}
class Error {
<<from minijinja>>
}
class OnceLockMacroInstance {
<<alias OnceLock~MacroInstance~~>>
}
JinjaMacrosModule ..> MacroDefinition : uses
JinjaMacrosModule ..> Environment : uses
JinjaMacrosModule ..> MacroCache : constructs
JinjaMacrosModule ..> CacheModule : uses
JinjaMacrosModule ..> State : uses
JinjaMacrosModule ..> Value : uses
JinjaMacrosModule ..> Kwargs : uses
MacroCache o--> MacroInstance : caches
MacroCache ..> Environment : prepares_with
MacroCache ..> OnceLockMacroInstance : owns
MacroInstance ..> Environment : loads_from
MacroInstance ..> State : owns
MacroInstance ..> Value : owns
MacroInstance ..> MacroStateGuard : wraps
MacroStateGuard ..> State : owns_raw
CacheModule ..> MacroCache : uses
CacheModule ..> State : closure_arg
CacheModule ..> Value : closure_arg
CacheModule ..> Kwargs : closure_arg
CacheModule ..> Error : returns
File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey there - I've reviewed your changes - here's some feedback:
- In
glob::BraceValidator, theBraceValidationState.escape_activeflag is initialized fromcfg!(unix)but never mutated, so you could simplify this by removing it from the state and using#[cfg(unix)]branches directly inhandle_escape_sequenceto reduce runtime state and make the escape semantics clearer. - The
normalized_or_rawhelper forGlobPatternpanics ifnormalizedisNone; since this invariant is critical forglob_paths, consider enforcing it at the type level (e.g., by using a dedicated type for validated patterns or constructingGlobPatternonly with a normalized value) instead of relying on a runtime panic.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `glob::BraceValidator`, the `BraceValidationState.escape_active` flag is initialized from `cfg!(unix)` but never mutated, so you could simplify this by removing it from the state and using `#[cfg(unix)]` branches directly in `handle_escape_sequence` to reduce runtime state and make the escape semantics clearer.
- The `normalized_or_raw` helper for `GlobPattern` panics if `normalized` is `None`; since this invariant is critical for `glob_paths`, consider enforcing it at the type level (e.g., by using a dedicated type for validated patterns or constructing `GlobPattern` only with a normalized value) instead of relying on a runtime panic.
## Individual Comments
### Comment 1
<location> `src/manifest/glob/normalize.rs:9-12` </location>
<code_context>
- {
- let mut out = String::with_capacity(pattern.len());
- let mut it = pattern.chars().peekable();
- while let Some(c) = it.next() {
- if c == '\\' {
- out.push(process_backslash(&mut it, native));
- } else if c == '/' || c == '\\' {
- out.push(native);
- } else {
</code_context>
<issue_to_address>
**nitpick:** The backslash case in `normalize_separators` is handled twice on Unix, which is redundant and may confuse future maintenance.
In the Unix branch, the loop first handles `c == '\'` via `process_backslash`, then the `else if c == '/' || c == '\'` repeats the backslash check. The backslash part of the `else if` is unreachable and can be removed to reduce redundancy and clarify intent.
</issue_to_address>
### Comment 2
<location> `src/manifest/glob/validate.rs:5` </location>
<code_context>
-use minijinja::{Error, ErrorKind};
-
-/// Represents a character being processed with its context
-#[derive(Debug, Clone, Copy)]
-pub struct CharContext {
- pub ch: char,
</code_context>
<issue_to_address>
**issue (complexity):** Consider collapsing the validator’s helper structs into a single stateful `BraceValidator` and simplifying its helper methods’ control flow to make the brace-validation scan easier to follow.
You can simplify this validator quite a bit by collapsing the extra types and simplifying the control flow, while keeping all behavior intact.
### 1. Inline `CharContext` and `BraceValidationState` into `BraceValidator`
`CharContext` mostly mirrors fields already on `BraceValidator` and `BraceValidationState`. You can eliminate both `CharContext` and `BraceValidationState` and keep the state directly on `BraceValidator`:
```rust
struct BraceValidator {
depth: i32,
in_class: bool,
last_open_pos: Option<usize>,
escape_active: bool,
escaped: bool,
}
impl BraceValidator {
const fn new() -> Self {
Self {
depth: 0,
in_class: false,
last_open_pos: None,
escape_active: cfg!(unix),
escaped: false,
}
}
}
```
Then `process_character` can work on these fields directly:
```rust
fn process_character(
&mut self,
ch: char,
pos: usize,
pattern: &GlobPattern,
) -> Result<(), Error> {
if self.handle_escape_sequence(ch) {
return Ok(());
}
self.handle_character_class(ch);
self.handle_braces(ch, pos, pattern)
}
```
### 2. Simplify escape handling (`Option<Result<..>>` → `bool`)
`handle_escape_sequence` does not need to return `Option<Result<..>>`; a boolean “handled / not handled” is sufficient:
```rust
fn handle_escape_sequence(&mut self, ch: char) -> bool {
if self.escaped {
self.escaped = false;
return true; // fully handled
}
if ch == '\\' && self.escape_active {
self.escaped = true;
return true; // escape start handled
}
false
}
```
This removes the `if let Some(result)` + `return result` pattern, making the flow easier to follow.
### 3. Use internal state in helpers, remove `in_class` duplication
With the state directly on `BraceValidator`, the helpers no longer need a context object or duplicated `in_class`:
```rust
fn handle_character_class(&mut self, ch: char) {
match ch {
'[' if !self.in_class => self.in_class = true,
']' if self.in_class => self.in_class = false,
_ => {}
}
}
fn handle_braces(
&mut self,
ch: char,
pos: usize,
pattern: &GlobPattern,
) -> Result<(), Error> {
if self.in_class {
return Ok(());
}
match ch {
'}' if self.depth == 0 => Err(create_unmatched_brace_error(&GlobErrorContext {
pattern: pattern.raw.clone(),
error_char: ch,
position: pos,
error_type: GlobErrorType::UnmatchedBrace,
})),
'{' => {
self.depth += 1;
self.last_open_pos = Some(pos);
Ok(())
}
'}' => {
self.depth -= 1;
Ok(())
}
_ => Ok(()),
}
}
```
### 4. Drop the `#[expect(missing_const_for_fn)]` on a simple mutating helper
Once you have simple, mutating helper methods like `handle_character_class`, the explicit `#[expect(missing_const_for_fn)]` adds noise without value. You can safely remove it:
```rust
fn handle_character_class(&mut self, ch: char) {
// ...
}
```
This keeps the core logic focused and reduces the amount of “meta” around the code.
These changes keep all semantics (brace depth, class tracking, escape behavior, error reporting) but reduce the number of types, the amount of state indirection, and the nesting in the control flow, making the scan easier to reason about.
</issue_to_address>
### Comment 3
<location> `src/manifest/glob/mod.rs:17` </location>
<code_context>
-}
-
-/// Configuration for brace validation
-#[derive(Debug, Clone)]
-pub struct BraceValidationState {
- pub depth: i32,
</code_context>
<issue_to_address>
**issue (complexity):** Consider tightening the `GlobPattern` type and using borrows instead of per-call clones to simplify `glob_paths` control flow and state handling.
You can simplify this without changing behavior by tightening the types and borrowing instead of cloning.
### 1. Remove the invalid intermediate `GlobPattern` state
`normalized: Option<String>` is never meaningfully absent after `glob_paths` runs normalization, and the “missing normalization” branch is effectively a logic panic. You can encode the invariant in the type and eliminate that branch:
```rust
#[derive(Debug, Clone)]
pub struct GlobPattern {
pub raw: String,
pub normalized: String,
}
```
Then build `GlobPattern` only *after* normalization inside `glob_paths`:
```rust
pub fn glob_paths(pattern: &str) -> std::result::Result<Vec<String>, Error> {
use glob::{MatchOptions, glob_with};
let opts = MatchOptions {
case_sensitive: true,
require_literal_separator: true,
require_literal_leading_dot: false,
};
// Validation can operate on &str (see next section)
validate_brace_matching(pattern)?;
#[cfg(unix)]
let mut normalized = normalize_separators(pattern);
#[cfg(not(unix))]
let normalized = normalize_separators(pattern);
#[cfg(unix)]
{
normalized = force_literal_escapes(&normalized);
}
let pattern_state = GlobPattern {
raw: pattern.to_owned(),
normalized: normalized.clone(),
};
let root = open_root_dir(&pattern_state).map_err(|e| {
create_glob_error(
&GlobErrorContext {
pattern: pattern_state.raw.clone(),
error_char: char::from(0),
position: 0,
error_type: GlobErrorType::IoError,
},
Some(e.to_string()),
)
})?;
let entries = glob_with(&pattern_state.normalized, opts).map_err(|e| {
create_glob_error(
&GlobErrorContext {
pattern: pattern_state.raw.clone(),
error_char: char::from(0),
position: 0,
error_type: GlobErrorType::InvalidPattern,
},
Some(e.to_string()),
)
})?;
let mut paths = Vec::new();
for entry in entries {
if let Some(p) = process_glob_entry(entry, &pattern_state, &root)? {
paths.push(p);
}
}
Ok(paths)
}
```
This removes:
- The invalid `normalized: None` intermediate state.
- The panic-like `"pattern normalisation missing after validation"` path.
- The need for `as_deref().ok_or_else(...)` on `normalized`.
### 2. Decouple validation from `GlobPattern`
`validate_brace_matching` only needs the raw pattern. You can simplify it to accept `&str` and reduce coupling:
```rust
// before
pub fn validate_brace_matching(pattern: &GlobPattern) -> Result<(), Error> { /* ... */ }
// after
pub fn validate_brace_matching(pattern: &str) -> Result<(), Error> { /* ... */ }
```
Call site (shown above) becomes `validate_brace_matching(pattern)?;`, which avoids creating any struct just for validation.
### 3. Avoid cloning `GlobPattern` per entry
`process_glob_entry(entry, pattern_state.clone(), &root)` clones the whole state every iteration just for error context. You can pass a shared borrow and clone only if/when needed inside error formatting:
```rust
// before
pub fn process_glob_entry(
entry: GlobEntryResult,
pattern: GlobPattern,
root: &Path,
) -> Result<Option<String>, Error> { /* ... */ }
// after
pub fn process_glob_entry(
entry: GlobEntryResult,
pattern: &GlobPattern,
root: &Path,
) -> Result<Option<String>, Error> { /* ... */ }
```
Call site (also updated above):
```rust
for entry in entries {
if let Some(p) = process_glob_entry(entry, &pattern_state, &root)? {
paths.push(p);
}
}
```
Inside `process_glob_entry`, if you need ownership for an error message, clone just the fields you actually display:
```rust
return Err(create_glob_error(
&GlobErrorContext {
pattern: pattern.raw.clone(),
// ...
},
Some(msg.to_owned()),
));
```
These changes keep all behavior, but:
- Remove an unnecessary `Option` and its invariant-violating state.
- Flatten the orchestration in `glob_paths` (validate → normalize → construct `GlobPattern` → glob).
- Reduce coupling between validation and stateful structures.
- Avoid per-entry `GlobPattern` clones in the hot path.
</issue_to_address>
### Comment 4
<location> `src/manifest/glob/mod.rs:37` </location>
<code_context>
+///
+/// Panics if pattern normalisation fails to record the derived pattern, which
+/// indicates a logic error in the validator.
+pub fn glob_paths(pattern: &str) -> std::result::Result<Vec<String>, Error> {
+ use glob::{MatchOptions, glob_with};
+
</code_context>
<issue_to_address>
**issue (review_instructions):** Add behavioural and unit tests that cover glob_paths and the new glob normalisation/validation behaviour.
Treat this new glob module (brace validation, separator/escape normalisation, capability-checked walking) as a changed feature and add tests accordingly.
Add behavioural tests that exercise end-to-end glob_paths behaviour (including successful matches, unmatched braces, invalid patterns, non-UTF-8 path handling, and directory vs file filtering), and unit tests that directly cover validate_brace_matching, normalize_separators / force_literal_escapes, and process_glob_entry.
Ensure the new src/manifest/glob/tests.rs module actually contains these tests rather than remaining empty, so changes in this refactor are enforced by the test suite.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*`
**Instructions:**
Rules:
- For any new feature or change to an existing feature, both behavioural *and* unit tests are required.
- Bug fixes *must* be demonstrated by a test.
</details>
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
- Removed Option from GlobPattern.normalized and made it a String. - Changed validate_brace_matching to accept &str instead of &GlobPattern. - Streamlined escape sequence handling in brace validator. - Updated glob_paths to construct GlobPattern after normalization. - Improved code clarity and memory usage by passing references where possible. - Added tests to cover normalization, filtering directories, and brace validation. - Fixed normalize_separators to only replace '/' with native separator. - Made pattern normalization and validation more consistent and robust. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/manifest/jinja_macros/cache.rs (1)
165-173: Panic usage violates production code guidelines.The coding guidelines state that
.expect()and.unwrap()are forbidden outside of tests and errors must be propagated. AlthoughBox::into_rawcannot return null for a valid allocation, wrap this in aResultfor consistency with the codebase's error-handling philosophy.Apply this diff:
- fn new(state: State<'static, 'static>) -> Self { + fn new(state: State<'static, 'static>) -> anyhow::Result<Self> { let boxed = Box::new(state); let ptr = Box::into_raw(boxed); - let ptr_non_null = NonNull::new(ptr).unwrap_or_else(|| { - panic!("Box::into_raw cannot return a null pointer"); - }); - Self { ptr: ptr_non_null } + let ptr_non_null = NonNull::new(ptr) + .ok_or_else(|| anyhow::anyhow!("Box::into_raw returned null pointer"))?; + Ok(Self { ptr: ptr_non_null }) }This requires updating the call site in
MacroInstance::newto propagate the error.Based on coding guidelines: ".expect() and .unwrap() are forbidden outside of tests. Errors must be propagated."
src/manifest/tests/macros.rs (1)
249-260: Useconcat!()for long string literals.The inline YAML string is difficult to read. Per coding guidelines, use
concat!()to combine long string literals.#[rstest] fn register_manifest_macros_supports_multiple( mut strict_env: Environment<'static>, ) -> AnyResult<()> { - let yaml = serde_saphyr::from_str::<ManifestValue>( - "macros:\n - signature: \"greet(name)\"\n body: |\n Hello {{ name }}\n - signature: \"shout(text)\"\n body: |\n {{ text | upper }}\n", - )?; + let yaml = serde_saphyr::from_str::<ManifestValue>(concat!( + "macros:\n", + " - signature: \"greet(name)\"\n", + " body: |\n", + " Hello {{ name }}\n", + " - signature: \"shout(text)\"\n", + " body: |\n", + " {{ text | upper }}\n", + ))?; register_manifest_macros(&yaml, &mut strict_env)?; let rendered = render_with(&strict_env, "{{ shout(greet('netsuke')) }}")?; ensure!(rendered.trim() == "HELLO NETSUKE"); Ok(()) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (14)
src/manifest/diagnostics/mod.rs(1 hunks)src/manifest/diagnostics/yaml.rs(2 hunks)src/manifest/glob.rs(0 hunks)src/manifest/glob/errors.rs(1 hunks)src/manifest/glob/mod.rs(1 hunks)src/manifest/glob/normalize.rs(1 hunks)src/manifest/glob/tests.rs(1 hunks)src/manifest/glob/validate.rs(1 hunks)src/manifest/glob/walk.rs(1 hunks)src/manifest/jinja_macros/cache.rs(5 hunks)src/manifest/jinja_macros/mod.rs(1 hunks)src/manifest/tests/macros.rs(1 hunks)src/manifest/tests/mod.rs(1 hunks)src/manifest/tests/workspace.rs(1 hunks)
💤 Files with no reviewable changes (1)
- src/manifest/glob.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Clippy warnings MUST be disallowed in Rust code.
Fix any warnings emitted during Rust tests in the code itself rather than silencing them.
In Rust, extract meaningfully named helper functions when a function is too long, adhering to separation of concerns and CQRS.
In Rust, group related parameters in meaningfully named structs when a function has too many parameters.
In Rust, consider usingArcto reduce data returned when a function is returning a large error.
Every Rust module must begin with a module level (//!) comment explaining the module's purpose and utility.
Document public APIs in Rust using Rustdoc comments (///) so documentation can be generated withcargo doc.
In Rust, prefer immutable data and avoid unnecessarymutbindings.
In Rust, handle errors with theResulttype instead of panicking where feasible.
Avoidunsafecode in Rust unless absolutely necessary, and document any usage clearly.
In Rust, place function attributes after doc comments.
In Rust, do not usereturnin single-line functions.
In Rust, use predicate functions for conditional criteria with more than two branches.
Lints in Rust must not be silenced except as a last resort.
Lint rule suppressions in Rust must be tightly scoped and include a clear reason.
In Rust, preferexpectoverallow.
In Rust tests, userstestfixtures for shared setup.
In Rust tests, replace duplicated tests with#[rstest(...)]parameterized cases.
In Rust, prefermockallfor mocks and stubs.
In Rust, useconcat!()to combine long string literals rather than escaping newlines with a backslash.
In Rust, prefer single line versions of functions where appropriate (e.g.,pub fn new(id: u64) -> Self { Self(id) }instead of multi-line).
Use NewTypes in Rust to model domain values and eliminate 'integer soup'. Reach fornewt-hypewhen introducing many homogeneous wrappers; add small shims for string-backed wrappers. For path-centric wrappers implementAsRef<Path>along...
Files:
src/manifest/jinja_macros/cache.rssrc/manifest/glob/mod.rssrc/manifest/tests/mod.rssrc/manifest/glob/validate.rssrc/manifest/glob/tests.rssrc/manifest/jinja_macros/mod.rssrc/manifest/tests/workspace.rssrc/manifest/glob/normalize.rssrc/manifest/tests/macros.rssrc/manifest/glob/errors.rssrc/manifest/diagnostics/yaml.rssrc/manifest/glob/walk.rssrc/manifest/diagnostics/mod.rs
⚙️ CodeRabbit configuration file
**/*.rs: * Seek to keep the cognitive complexity of functions no more than 9.
- Adhere to single responsibility and CQRS
- Place function attributes after doc comments.
- Do not use
returnin single-line functions.- Move conditionals with >2 branches into a predicate function.
- Avoid
unsafeunless absolutely necessary.- Every module must begin with a
//!doc comment that explains the module's purpose and utility.- Comments and docs must follow en-GB-oxendict (-ize / -yse / -our) spelling and grammar
- Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.- Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.
- Use
rstestfixtures for shared setup and to avoid repetition between tests.- Replace duplicated tests with
#[rstest(...)]parameterised cases.- Prefer
mockallfor mocks/stubs.- Prefer
.expect()over.unwrap()in tests..expect()and.unwrap()are forbidden outside of tests. Errors must be propagated.- Ensure that any API or behavioural changes are reflected in the documentation in
docs/- Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/- Files must not exceed 400 lines in length
- Large modules must be decomposed
- Long match statements or dispatch tables should be decomposed by domain and collocated with targets
- Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
- Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such
- For testing of functionality depending upon environment variables, dependency injection and...
Files:
src/manifest/jinja_macros/cache.rssrc/manifest/glob/mod.rssrc/manifest/tests/mod.rssrc/manifest/glob/validate.rssrc/manifest/glob/tests.rssrc/manifest/jinja_macros/mod.rssrc/manifest/tests/workspace.rssrc/manifest/glob/normalize.rssrc/manifest/tests/macros.rssrc/manifest/glob/errors.rssrc/manifest/diagnostics/yaml.rssrc/manifest/glob/walk.rssrc/manifest/diagnostics/mod.rs
🧬 Code graph analysis (9)
src/manifest/jinja_macros/cache.rs (1)
src/manifest/jinja_macros/mod.rs (1)
call_macro_value(148-163)
src/manifest/glob/mod.rs (6)
src/manifest/glob/errors.rs (2)
create_glob_error(19-45)create_unmatched_brace_error(47-49)src/manifest/glob/normalize.rs (2)
normalize_separators(3-24)force_literal_escapes(59-78)src/manifest/glob/validate.rs (2)
validate_brace_matching(119-127)new(13-20)src/manifest/glob/walk.rs (2)
open_root_dir(7-15)process_glob_entry(17-61)src/manifest/diagnostics/mod.rs (6)
from(35-37)from(41-43)from(84-86)from(90-92)new(23-25)new(72-74)src/manifest/jinja_macros/cache.rs (5)
entries(42-42)new(86-92)new(126-144)new(166-173)new(234-241)
src/manifest/glob/validate.rs (1)
src/manifest/glob/errors.rs (1)
create_unmatched_brace_error(47-49)
src/manifest/glob/tests.rs (4)
src/manifest/glob/normalize.rs (2)
force_literal_escapes(59-78)normalize_separators(3-24)src/manifest/glob/validate.rs (1)
validate_brace_matching(119-127)src/manifest/glob/walk.rs (1)
process_glob_entry(17-61)src/manifest/glob/mod.rs (1)
glob_paths(37-92)
src/manifest/jinja_macros/mod.rs (1)
src/manifest/jinja_macros/cache.rs (6)
make_macro_fn(15-64)new(86-92)new(126-144)new(166-173)new(234-241)macro_kwargs(33-33)
src/manifest/tests/workspace.rs (1)
src/manifest/mod.rs (3)
from_path_with_policy(179-189)stdlib_config_for_manifest(211-239)from_path(158-160)
src/manifest/tests/macros.rs (1)
src/manifest/jinja_macros/mod.rs (4)
call_macro_value(148-163)parse_macro_name(37-56)register_macro(74-93)register_manifest_macros(105-121)
src/manifest/glob/walk.rs (1)
src/manifest/glob/errors.rs (1)
create_glob_error(19-45)
src/manifest/diagnostics/mod.rs (1)
src/manifest/diagnostics/yaml.rs (1)
map_yaml_error(106-130)
🔍 Remote MCP
Perfect! I now have comprehensive documentation about the netsuke repository architecture and organization. Let me compile the most relevant context for this PR review.
Summary of Additional Context for PR Review
Based on the repository's comprehensive documentation, I've gathered the following key context that's relevant to this refactoring PR:
Project Context
netsuke is a build system compiler that transforms YAML manifests with Jinja2 templating into Ninja build files. It follows a six-stage pipeline that transforms high-level user configurations into machine-executable build instructions.
Architecture Alignment
This PR directly implements the modular decomposition proposed in the project's architecture documentation. The refactoring aligns with Module Organization principles by:
- Separating concerns into focused submodules (diagnostics, glob, jinja_macros)
- Enabling focused testing through compartmentalization
- Improving maintainability by partitioning related functionality
Key Design Principles Affected
The refactoring preserves these core design principles:
- Module Organization: The new structure follows the documented pattern of subsystem organization with clear module boundaries
- Error Handling Architecture: Diagnostics module properly implements the three-tier error model (domain errors → context propagation → user-facing diagnostics)
- Testing Framework: Tests remain organized with unit tests in modules and BDD tests in
tests/features/andtests/steps/
Validation Checklist for Reviewers
Based on the documented architecture:
✓ Module exports - Verify that public APIs remain unchanged (only internal import paths change)
✓ Error diagnostics - Ensure ManifestError and diagnostic functions remain accessible
✓ Glob functionality - Confirm glob pattern validation and escaping logic is preserved
✓ Jinja macro handling - Validate that macro registration and invocation patterns work correctly
✓ Test coverage - Confirm BDD features in tests/features/manifest.feature still pass
✓ Cross-platform support - Unix-specific glob logic (escape handling) remains intact
Testing Strategy
Per Testing Framework, this PR should be validated with:
cargo test --all(unit + BDD + integration tests)cargo insta review(for snapshot test updates)- Specific test modules:
tests/features/manifest.feature,tests/features/ninja.feature, workspace tests
CI/CD Impact
The CI/CD Pipeline will validate:
make fmt- Code formattingmake lint- Clippy lintingmake test- All test suites
⏰ Context from checks skipped due to timeout of 120000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: release / build-macos (x86_64-apple-darwin, macos-13, macos-x86_64, macos-x86_64) / Build macos artefacts
- GitHub Check: release / build-windows (aarch64-pc-windows-msvc, arm64, arm64, windows-aarch64) / Build windows artefacts
- GitHub Check: release / build-windows (x86_64-pc-windows-msvc, amd64, x64, windows-x86_64) / Build windows artefacts
- GitHub Check: release / build-linux (x86_64-unknown-linux-gnu, linux-amd64, amd64) / Build linux artefacts
- GitHub Check: build-test (ubuntu-latest, stable)
- GitHub Check: Sourcery review
🔇 Additional comments (42)
src/manifest/glob/normalize.rs (4)
1-24: LGTM!The module documentation and
normalize_separatorsfunction are well-structured. The Unix branch correctly handles backslash escape sequences viaprocess_backslash, whilst the non-Unix branch performs a simple replacement. The past review comment about redundant backslash handling no longer applies—the current implementation correctly separates backslash processing from forward-slash normalisation.
26-56: LGTM!The predicate functions
should_preserve_backslash_for_bracketandshould_preserve_backslash_for_wildcardcleanly separate the decision logic for backslash handling. The use ofconst fnwhere possible is appropriate.
58-106: LGTM!The
force_literal_escapesfunction correctly tracks character class state to avoid escaping within[...]constructs. The escape replacement mapping inget_escape_replacementis comprehensive for glob metacharacters.
108-111: LGTM!Simple, focused predicate function.
src/manifest/tests/mod.rs (1)
1-4: LGTM!Minimal test module scaffold with appropriate module-level documentation.
src/manifest/jinja_macros/cache.rs (3)
136-148: Unsafe code is well-documented and justified.The
mem::transmuteextending theStatelifetime is necessary due to MiniJinja'sEnvironment<'static>requirement. The SAFETY comment clearly documents the invariant. TheSendandSyncimplementations are required for MiniJinja's registered function requirements, and the debug assertions provide runtime validation during development.
193-257: Exemplary unsafe documentation.The
CallerAdapterdocumentation thoroughly covers safety invariants, thread safety constraints, and usage restrictions. The debug assertions provide valuable runtime checks during development.
1-14: LGTM!Module documentation is present and imports are appropriately scoped. The
pub(super)visibility correctly limits exposure to the parent module.src/manifest/tests/macros.rs (4)
1-11: LGTM!Module documentation is present. The updated import paths correctly navigate the new module structure.
12-28: LGTM!Good use of
rstestfixtures for shared setup. TheMacroRenderCasestruct cleanly encapsulates test data.
30-117: LGTM!Excellent use of parameterised
#[rstest]tests covering various macro signature formats, argument handling, and caller blocks. The test cases are comprehensive and well-organised.
119-247: LGTM!Comprehensive test coverage for macro functionality including kwargs support, reusability, validation errors, and edge cases. Tests correctly return
Resultand useensure!for assertions.src/manifest/glob/tests.rs (4)
1-11: LGTM!Module documentation is present. Imports are appropriately organised with conditional compilation for Unix-specific functionality.
13-28: LGTM!Platform-specific assertions correctly validate the normalisation behaviour on Unix and non-Unix systems.
30-48: LGTM!Brace validation tests correctly verify both success and failure cases with appropriate error kind assertions.
50-75: LGTM!Good use of
tempdirfor filesystem isolation. Tests correctly verify directory filtering and brace mismatch error handling.src/manifest/glob/mod.rs (2)
1-23: LGTM: Module structure and types are well-organised.The modular decomposition cleanly separates concerns.
GlobPatternnow uses non-optionalnormalizedfield, andvalidate_brace_matchingaccepts&stras suggested in prior reviews.
85-91: LGTM: Entry processing uses borrows correctly.The loop passes
&pattern_staterather than cloning per iteration, addressing prior review feedback about unnecessary clones in the hot path.src/manifest/glob/validate.rs (4)
1-10: LGTM: Simplified validator structure.The
BraceValidatornow directly holds all state fields, eliminating the intermediateCharContextandBraceValidationStatestructs suggested in prior reviews. The module doc comment is present and descriptive.
37-60: LGTM: Escape sequence handling is correct.The platform-conditional escape logic correctly handles Unix backslash escapes whilst remaining a no-op on other platforms. The
#[expect]annotation includes a reason as required by coding guidelines.
74-102: LGTM: Brace handling logic is sound.The match correctly guards against unmatched closing braces at depth zero, and properly tracks nesting depth for balanced braces.
119-127: LGTM: Public validation API is clean.The
validate_brace_matchingfunction provides a simple interface that hides the validator's internal state machine.src/manifest/glob/errors.rs (3)
1-17: LGTM: Error context types are well-defined.The
GlobErrorContextandGlobErrorTypeprovide structured error information with appropriate visibility restrictions (pub(super)).
35-44: Verify the "glob " prefix heuristic remains stable.The check
detail.starts_with("glob ")prevents double-prefixing but relies on upstream error message format. If theglobcrate changes its error message format, this heuristic may behave unexpectedly. Document this assumption or pin the dependency version.
47-49: LGTM: Wrapper simplifies common error case.src/manifest/tests/workspace.rs (5)
1-12: LGTM: Test module is well-organised with appropriate imports.Module doc comment present. Uses
rstestfor test fixtures as required by coding guidelines.
14-41: LGTM: Directory guard correctly synchronises environment access.The
EnvLockensures environment mutations are properly serialised across tests. Error logging inDropis the right approach since panicking in drop is problematic.
43-69: LGTM: Good use of parameterised testing.The
#[rstest]with#[case]variants correctly tests both relative and absolute path resolution scenarios.
71-90: LGTM: Non-UTF-8 rejection test is thorough.Correctly uses Unix-specific APIs to create an invalid path and verifies the error message contains expected content.
92-157: LGTM: Comprehensive cache directory test.The test correctly verifies that caches are created in the manifest's workspace rather than the current working directory. Graceful handling of
PermissionDeniedfor HTTP binding ensures the test works in restricted environments.src/manifest/glob/walk.rs (3)
1-5: LGTM: Module doc comment and imports are appropriate.Uses
cap_stdfor capability-based filesystem access, providing security boundaries for glob expansion.
7-15: LGTM: Root directory selection is correct.Correctly distinguishes between absolute and relative patterns for capability-based directory access.
78-84: LGTM: Simple accessor with appropriate lint suppression.The
#[expect]annotation includes a reason as required by coding guidelines.src/manifest/jinja_macros/mod.rs (4)
1-6: Module documentation is well-written.The
//!doc comment clearly explains the module's purpose and how manifest macros integrate with the rendering environment.
37-56: Function logic is sound.The validation steps correctly handle empty signatures, missing parameter lists, and empty identifiers. The use of
split_once('(')is idiomatic.
105-121: Robust iteration and error propagation.The early return for missing
macroskey is sensible, and context is added at each failure point. The use ofserde_json::from_valuefor deserialisation fromManifestValue(which appears to beserde_json::Value) is correct.
148-163: Implementation is correct and idiomatic.Using
map_or_elseto branch onOption<Kwargs>avoids unnecessary allocations when no keyword arguments are present. The convention of appendingKwargsto positional arguments matches the minijinja expectation documented in the relevant snippet fromcache.rs.src/manifest/diagnostics/yaml.rs (1)
132-138: Test helper is correctly scoped.The
expected_offsetfunction is appropriately gated behind#[cfg(test)]and provides a clear reference implementation for verifying byte index calculations.src/manifest/diagnostics/mod.rs (4)
17-32: NewType pattern is well-applied.
ManifestSourcecorrectly wraps aStringwith appropriate derives and conversion traits. The#[must_use]annotations are present on constructors and accessors.
66-81:ManifestNamemirrorsManifestSourceappropriately.The same
const fn as_str()concern applies here. Otherwise, the implementation is consistent and correct.
130-137: Internal diagnostic struct is appropriate.
DataDiagnosticis correctly scoped aspub(crate)equivalent (private) and derives the necessary traits. The#[source]attribute ensures the underlyingserde_json::Erroris accessible viastd::error::Error::source().
27-31: No issues found. The project's MSRV of Rust 1.89.0 is well above Rust 1.87.0, whereString::as_str()was stabilised as aconst fn. The code will compile correctly on all supported toolchains.
|
@coderabbitai have the following now been resolved?
|
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 3222 🏁 Script executed: Length of output: 2849 🏁 Script executed: Length of output: 2635 @leynos Yes, both issues have been resolved: 1. BraceValidator escape flag ✓The
This eliminates the runtime state waste and makes the platform-specific escape behaviour explicit. 2. GlobPattern normalized field ✓The type now enforces the invariant at compile time. In pub struct GlobPattern {
pub raw: String,
pub normalized: String, // String, not Option<String>
}The |
- Enhance manifest diagnostics modules with richer comments and clearer error messages. - Simplify string formatting in YAML and data error mappings. - Correct path separator handling in glob walking for better cross-platform support. - Change MacroStateGuard::new to return Result, improving error propagation. - Fix minor API usage in jinja_macros mod for function registration. - Update tests to use concat! macro for better readability in multi-line YAML strings. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/manifest/jinja_macros/cache.rs (1)
150-190: Defensive error handling for theoretically unreachable case.
Box::into_rawnever returns null for non-zero-sized types, so theok_or_elseon line 169-170 is technically unreachable. This defensive approach is acceptable, though a// SAFETY:comment explaining why the error branch is unreachable would clarify the intent.let boxed = Box::new(state); let ptr = Box::into_raw(boxed); + // SAFETY: Box::into_raw never returns null for non-ZST types, but we + // handle the impossible case defensively. let ptr_non_null = NonNull::new(ptr) .ok_or_else(|| anyhow::anyhow!("Box::into_raw returned null pointer"))?;
♻️ Duplicate comments (1)
src/manifest/glob/walk.rs (1)
49-49: LGTM: Character literals now used instead of hex codes.The code now uses
'\\'and['/', '\\']instead of the previouschar::from(0x5c)andchar::from(0x2f), addressing the readability concern from earlier feedback.Also applies to: 65-65
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (7)
src/manifest/diagnostics/mod.rs(1 hunks)src/manifest/diagnostics/yaml.rs(2 hunks)src/manifest/glob/mod.rs(1 hunks)src/manifest/glob/walk.rs(1 hunks)src/manifest/jinja_macros/cache.rs(6 hunks)src/manifest/jinja_macros/mod.rs(1 hunks)src/manifest/tests/macros.rs(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Clippy warnings MUST be disallowed in Rust code.
Fix any warnings emitted during Rust tests in the code itself rather than silencing them.
In Rust, extract meaningfully named helper functions when a function is too long, adhering to separation of concerns and CQRS.
In Rust, group related parameters in meaningfully named structs when a function has too many parameters.
In Rust, consider usingArcto reduce data returned when a function is returning a large error.
Every Rust module must begin with a module level (//!) comment explaining the module's purpose and utility.
Document public APIs in Rust using Rustdoc comments (///) so documentation can be generated withcargo doc.
In Rust, prefer immutable data and avoid unnecessarymutbindings.
In Rust, handle errors with theResulttype instead of panicking where feasible.
Avoidunsafecode in Rust unless absolutely necessary, and document any usage clearly.
In Rust, place function attributes after doc comments.
In Rust, do not usereturnin single-line functions.
In Rust, use predicate functions for conditional criteria with more than two branches.
Lints in Rust must not be silenced except as a last resort.
Lint rule suppressions in Rust must be tightly scoped and include a clear reason.
In Rust, preferexpectoverallow.
In Rust tests, userstestfixtures for shared setup.
In Rust tests, replace duplicated tests with#[rstest(...)]parameterized cases.
In Rust, prefermockallfor mocks and stubs.
In Rust, useconcat!()to combine long string literals rather than escaping newlines with a backslash.
In Rust, prefer single line versions of functions where appropriate (e.g.,pub fn new(id: u64) -> Self { Self(id) }instead of multi-line).
Use NewTypes in Rust to model domain values and eliminate 'integer soup'. Reach fornewt-hypewhen introducing many homogeneous wrappers; add small shims for string-backed wrappers. For path-centric wrappers implementAsRef<Path>along...
Files:
src/manifest/glob/mod.rssrc/manifest/glob/walk.rssrc/manifest/diagnostics/yaml.rssrc/manifest/diagnostics/mod.rssrc/manifest/tests/macros.rssrc/manifest/jinja_macros/mod.rssrc/manifest/jinja_macros/cache.rs
⚙️ CodeRabbit configuration file
**/*.rs: * Seek to keep the cognitive complexity of functions no more than 9.
- Adhere to single responsibility and CQRS
- Place function attributes after doc comments.
- Do not use
returnin single-line functions.- Move conditionals with >2 branches into a predicate function.
- Avoid
unsafeunless absolutely necessary.- Every module must begin with a
//!doc comment that explains the module's purpose and utility.- Comments and docs must follow en-GB-oxendict (-ize / -yse / -our) spelling and grammar
- Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.- Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.
- Use
rstestfixtures for shared setup and to avoid repetition between tests.- Replace duplicated tests with
#[rstest(...)]parameterised cases.- Prefer
mockallfor mocks/stubs.- Prefer
.expect()over.unwrap()in tests..expect()and.unwrap()are forbidden outside of tests. Errors must be propagated.- Ensure that any API or behavioural changes are reflected in the documentation in
docs/- Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/- Files must not exceed 400 lines in length
- Large modules must be decomposed
- Long match statements or dispatch tables should be decomposed by domain and collocated with targets
- Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
- Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such
- For testing of functionality depending upon environment variables, dependency injection and...
Files:
src/manifest/glob/mod.rssrc/manifest/glob/walk.rssrc/manifest/diagnostics/yaml.rssrc/manifest/diagnostics/mod.rssrc/manifest/tests/macros.rssrc/manifest/jinja_macros/mod.rssrc/manifest/jinja_macros/cache.rs
🧬 Code graph analysis (6)
src/manifest/glob/mod.rs (1)
src/manifest/glob/walk.rs (1)
open_root_dir(7-15)
src/manifest/glob/walk.rs (1)
src/manifest/glob/errors.rs (1)
create_glob_error(19-45)
src/manifest/diagnostics/mod.rs (1)
src/manifest/diagnostics/yaml.rs (1)
map_yaml_error(112-136)
src/manifest/tests/macros.rs (1)
src/manifest/jinja_macros/mod.rs (4)
call_macro_value(148-163)parse_macro_name(37-56)register_macro(74-93)register_manifest_macros(105-121)
src/manifest/jinja_macros/mod.rs (1)
src/manifest/jinja_macros/cache.rs (6)
make_macro_fn(15-64)new(86-92)new(126-144)new(166-172)new(233-240)macro_kwargs(33-33)
src/manifest/jinja_macros/cache.rs (1)
src/manifest/jinja_macros/mod.rs (1)
call_macro_value(148-163)
🔍 Remote MCP Deepwiki
Summary of additional repository context relevant to this PR (sources listed):
- Project architecture and goals: netsuke is a build-system compiler converting YAML+Jinja manifests into Ninja; pipeline stages (parse → expand → render → IR → ninja) and emphasis on deterministic, secure (shell-quote) output.
- Diagnostics design: manifest diagnostics are intended to produce miette-styled reports with source spans and hints; map_yaml_error maps serde_yml errors into rich diagnostics. New diagnostics module in PR must align with this pattern.
- Glob functionality expectations: glob() must validate brace-matching, normalize separators across platforms, preserve Unix escape semantics, return lexicographically sorted UTF‑8 file lists, and use cap-std/camino for safe filesystem access. The PR splits glob logic into normalize/validate/walk/errors modules — reviewer should verify these behaviors and capability-based I/O usage are preserved.
- Jinja/macros expectations: template environment uses minijinja with strict undefined behavior; macros are registered before rendering. The PR reworks jinja_macros (cache.rs and mod.rs) and reduces visibility in cache.rs — verify registrations, error handling (anyhow vs Result) and that macro behavior remains equivalent.
- Public API & visibility policy: repository favors restricting internal items (pub(super)) while preserving public API surface. The PR both introduces new public types (ManifestSource, ManifestName, ManifestError::Parse, map_data_error) and moves / reduces visibility in jinja_macros/cache.rs — confirm exported items and stdlib registration points remain stable for consumers.
- Tests & CI expectations: CI runs fmt, clippy, full tests (unit, insta snapshots, cucumber BDD). PR added/updated tests for glob (brace validation, non-UTF-8 paths), jinja macros, and workspace/caching — ensure snapshots and BDD features remain consistent; run cargo test + insta review locally/CI.
Actionable review checklist (based on repo docs + PR summary):
- Confirm new diagnostics types integrate with existing map_yaml_error behavior and miette diagnostics rendering.
- Validate glob split preserves previous semantics: brace validation, separator normalization, escape handling (Unix only), capability-based Dir usage, UTF‑8 rejection, lexicographic ordering.
- Verify jinja macro registration/invocation behavior unchanged (macro parsing, registration order, error behavior); check visibility changes (anyhow vs Result) do not break callers or tests.
- Check public exports: ensure no unintended breaking changes to public API surface consumed by other crates (stdlib functions, manifest::map_yaml_error re-export).
- Run full test matrix (cargo test, insta review, BDD) and CI linters (fmt/clippy) as PR's tests and formatting must pass.
Sources used:
- Project documentation and architecture pages (module organization, diagnostics, glob, template system, testing, CI): Deepwiki read of repository wiki/contents.
⏰ Context from checks skipped due to timeout of 120000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: release / build-macos (x86_64-apple-darwin, macos-13, macos-x86_64, macos-x86_64) / Build macos artefacts
- GitHub Check: release / build-windows (aarch64-pc-windows-msvc, arm64, arm64, windows-aarch64) / Build windows artefacts
- GitHub Check: release / build-windows (x86_64-pc-windows-msvc, amd64, x64, windows-x86_64) / Build windows artefacts
- GitHub Check: release / build-linux (x86_64-unknown-linux-gnu, linux-amd64, amd64) / Build linux artefacts
- GitHub Check: Sourcery review
- GitHub Check: build-test (ubuntu-latest, stable)
🔇 Additional comments (37)
src/manifest/glob/mod.rs (3)
1-1: LGTM: Module documentation present and uses correct en-GB spelling.The module-level doc comment correctly uses "normalising" per en-GB-oxendict conventions.
17-21: LGTM: GlobPattern invariant now enforced at the type level.The
normalizedfield is now aString(notOption<String>), eliminating the invalid intermediate state and the panic path that was previously flagged. This aligns with the earlier review feedback.
32-87: LGTM:glob_pathsorchestration is well-structured.The function validates first, normalises second, then constructs the
GlobPatternonly after both steps succeed. Error handling usesResultthroughout with no panics. The iteration passes&pattern_stateby reference, avoiding per-entry clones as recommended in prior feedback.src/manifest/glob/walk.rs (4)
1-1: LGTM: Module-level documentation present.The doc comment explains the module's purpose.
7-15: LGTM: Root directory resolution logic is sound.The function correctly distinguishes absolute from relative paths and opens the appropriate ambient directory. Uses capability-based filesystem access via
cap_stdas expected.
17-61: LGTM: Entry processing handles all error cases correctly.The function:
- Converts paths to UTF-8, returning a structured error on failure
- Fetches metadata via the capability-restricted
Dir- Filters out non-files
- Normalises path separators to forward slashes
Error propagation uses
Resultthroughout with no panics.
63-74: LGTM: Metadata fetching handles absolute path edge cases.The function correctly strips leading separators from absolute paths before querying relative to the root
Dir, and handles the empty-string edge case by falling back to".".src/manifest/tests/macros.rs (7)
1-10: LGTM!Module documentation is present and the import paths correctly reference the refactored module structure.
12-28: LGTM!Test helper struct and rstest fixture are well-structured and follow coding guidelines for shared test setup.
30-62: LGTM!Parameterised tests provide good coverage for both successful and error cases of
parse_macro_name. The use ofrstestwith#[case]follows the coding guidelines for reducing test duplication.
64-117: LGTM!Excellent coverage of macro argument handling scenarios including defaults, positional arguments, keyword arguments, and caller blocks.
119-140: LGTM!Test correctly exercises the
call_macro_valuehelper with keyword arguments, verifying the kwargs passing convention.
142-158: LGTM!This test verifies macro caching behaviour by confirming repeated invocations work correctly.
249-266: LGTM!Use of
concat!()for the multiline YAML string follows the coding guidelines. The test correctly verifies that multiple macros can be registered and invoked together.src/manifest/jinja_macros/mod.rs (5)
1-19: LGTM!Module documentation clearly explains the purpose and the submodule structure is well-organised.
21-56: LGTM!Documentation is thorough with examples, and the parsing logic correctly validates macro signatures.
74-93: LGTM!The implementation correctly handles macro registration. The past review comment about redundant
name.clone()at line 91 has been addressed —nameis now moved directly intoadd_function.
105-121: LGTM!The implementation correctly handles optional macros section and provides good error context.
148-163: LGTM!The implementation correctly handles the MiniJinja convention for keyword arguments and efficiently builds the argument list.
src/manifest/diagnostics/yaml.rs (7)
1-12: LGTM!Module documentation has been expanded to explain the purpose and public entry point, addressing the previous review comment.
14-47: LGTM!The byte offset reconstruction correctly handles UTF-8 multi-byte characters and both Unix and Windows line endings.
49-67: LGTM!The span calculation correctly handles edge cases at line breaks, and the
#[expect]lint suppression includes a clear reason as required.
69-82: LGTM!The diagnostic struct is correctly structured with miette attributes for source code, labels, and help text.
84-104: LGTM!The hint functions correctly detect tab indentation and provide contextual hints for common YAML mistakes.
106-136: LGTM!The function correctly maps YAML parse errors to rich diagnostics with spans and contextual hints.
138-272: LGTM!Comprehensive test coverage for byte offset calculations including UTF-8, CRLF, and edge cases.
src/manifest/diagnostics/mod.rs (5)
1-13: LGTM!Module documentation has been expanded to explain the wrapper types and mapping helpers, addressing the previous review comment.
15-62: LGTM!
ManifestSourcenewtype is well-implemented with appropriate trait implementations and documentation.
64-111: LGTM!
ManifestNamenewtype follows the same well-structured pattern asManifestSource.
113-134: LGTM!The
ManifestErrorenum is correctly structured with proper diagnostic chaining. The doc comment on thesourcefield is now correctly placed before the attributes.
136-159: LGTM!The
DataDiagnosticstruct andmap_data_errorfunction are correctly implemented. The duplicate diagnostic code in the message has been removed, addressing the previous review comment.src/manifest/jinja_macros/cache.rs (6)
1-13: LGTM!Module documentation and imports are appropriate for this internal caching module.
15-64: LGTM!The macro function factory correctly handles argument adaptation, thread safety assertions, and auto-escape behaviour. Visibility has been appropriately restricted to
pub(super).
66-75: LGTM!The function correctly validates and wraps caller arguments.
77-107: LGTM!
MacroCachecorrectly usesOnceLockfor thread-safe lazy initialisation of the macro instance.
109-148: LGTM!
MacroInstancecorrectly handles the unsafe lifetime extension with appropriate SAFETY documentation. The thread safety notice clearly explains the constraints.
192-255: LGTM!
CallerAdapterhas excellent safety documentation explaining the invariants, thread safety constraints, and usage restrictions. The implementation correctly preserves the outer template state for caller block invocations.
- Remove redundant normalized_or_raw function in glob/walk.rs by directly accessing pattern.normalized. - Improve code clarity by using normalized field directly in glob/mod.rs. Additionally, add defensive null pointer check after Box::into_raw in jinja_macros/cache.rs for safety, though this is a minor code robustness improvement. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/manifest/jinja_macros/cache.rs (1)
94-102: Remove redundantdropcall.The explicit
drop(returned_instance)on line 98 is unnecessary; Rust drops the value automatically when it goes out of scope at the end of theif letblock.pub(super) fn prepare(&self, env: &Environment) -> anyhow::Result<()> { if self.instance.get().is_none() { let instance = MacroInstance::new(env, &self.template_name, &self.macro_name)?; - if let Err(returned_instance) = self.instance.set(instance) { - drop(returned_instance); - } + let _ = self.instance.set(instance); } Ok(()) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
src/manifest/glob/mod.rs(1 hunks)src/manifest/glob/walk.rs(1 hunks)src/manifest/jinja_macros/cache.rs(6 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Clippy warnings MUST be disallowed in Rust code.
Fix any warnings emitted during Rust tests in the code itself rather than silencing them.
In Rust, extract meaningfully named helper functions when a function is too long, adhering to separation of concerns and CQRS.
In Rust, group related parameters in meaningfully named structs when a function has too many parameters.
In Rust, consider usingArcto reduce data returned when a function is returning a large error.
Every Rust module must begin with a module level (//!) comment explaining the module's purpose and utility.
Document public APIs in Rust using Rustdoc comments (///) so documentation can be generated withcargo doc.
In Rust, prefer immutable data and avoid unnecessarymutbindings.
In Rust, handle errors with theResulttype instead of panicking where feasible.
Avoidunsafecode in Rust unless absolutely necessary, and document any usage clearly.
In Rust, place function attributes after doc comments.
In Rust, do not usereturnin single-line functions.
In Rust, use predicate functions for conditional criteria with more than two branches.
Lints in Rust must not be silenced except as a last resort.
Lint rule suppressions in Rust must be tightly scoped and include a clear reason.
In Rust, preferexpectoverallow.
In Rust tests, userstestfixtures for shared setup.
In Rust tests, replace duplicated tests with#[rstest(...)]parameterized cases.
In Rust, prefermockallfor mocks and stubs.
In Rust, useconcat!()to combine long string literals rather than escaping newlines with a backslash.
In Rust, prefer single line versions of functions where appropriate (e.g.,pub fn new(id: u64) -> Self { Self(id) }instead of multi-line).
Use NewTypes in Rust to model domain values and eliminate 'integer soup'. Reach fornewt-hypewhen introducing many homogeneous wrappers; add small shims for string-backed wrappers. For path-centric wrappers implementAsRef<Path>along...
Files:
src/manifest/glob/mod.rssrc/manifest/glob/walk.rssrc/manifest/jinja_macros/cache.rs
⚙️ CodeRabbit configuration file
**/*.rs: * Seek to keep the cognitive complexity of functions no more than 9.
- Adhere to single responsibility and CQRS
- Place function attributes after doc comments.
- Do not use
returnin single-line functions.- Move conditionals with >2 branches into a predicate function.
- Avoid
unsafeunless absolutely necessary.- Every module must begin with a
//!doc comment that explains the module's purpose and utility.- Comments and docs must follow en-GB-oxendict (-ize / -yse / -our) spelling and grammar
- Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.- Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.
- Use
rstestfixtures for shared setup and to avoid repetition between tests.- Replace duplicated tests with
#[rstest(...)]parameterised cases.- Prefer
mockallfor mocks/stubs.- Prefer
.expect()over.unwrap()in tests..expect()and.unwrap()are forbidden outside of tests. Errors must be propagated.- Ensure that any API or behavioural changes are reflected in the documentation in
docs/- Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/- Files must not exceed 400 lines in length
- Large modules must be decomposed
- Long match statements or dispatch tables should be decomposed by domain and collocated with targets
- Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
- Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such
- For testing of functionality depending upon environment variables, dependency injection and...
Files:
src/manifest/glob/mod.rssrc/manifest/glob/walk.rssrc/manifest/jinja_macros/cache.rs
🧬 Code graph analysis (3)
src/manifest/glob/mod.rs (4)
src/manifest/glob/errors.rs (2)
create_glob_error(19-45)create_unmatched_brace_error(47-49)src/manifest/glob/normalize.rs (2)
normalize_separators(3-24)force_literal_escapes(59-78)src/manifest/glob/validate.rs (2)
validate_brace_matching(119-127)new(13-20)src/manifest/glob/walk.rs (2)
open_root_dir(7-15)process_glob_entry(17-61)
src/manifest/glob/walk.rs (1)
src/manifest/glob/errors.rs (1)
create_glob_error(19-45)
src/manifest/jinja_macros/cache.rs (1)
src/manifest/jinja_macros/mod.rs (1)
call_macro_value(148-163)
⏰ Context from checks skipped due to timeout of 120000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: release / build-macos (x86_64-apple-darwin, macos-13, macos-x86_64, macos-x86_64) / Build macos artefacts
- GitHub Check: release / build-windows (aarch64-pc-windows-msvc, arm64, arm64, windows-aarch64) / Build windows artefacts
- GitHub Check: release / build-windows (x86_64-pc-windows-msvc, amd64, x64, windows-x86_64) / Build windows artefacts
- GitHub Check: release / build-linux (x86_64-unknown-linux-gnu, linux-amd64, amd64) / Build linux artefacts
- GitHub Check: Sourcery review
🔇 Additional comments (7)
src/manifest/jinja_macros/cache.rs (5)
1-3: LGTM!Module-level documentation is present and imports are appropriate. Good use of
anyhow::Contextfor error propagation.
15-64: LGTM!The visibility reduction to
pub(super)aligns with the refactoring goals. Error handling via?is correct, and the thread-safety assertion at line 45 appropriately guards against cross-thread misuse in debug builds.
66-75: LGTM!Clean helper with proper error handling and clear error message.
194-257: LGTM!The
CallerAdapterdocumentation thoroughly explains the safety requirements and thread-ownership invariants. Thedebug_assertat line 250 catches misuse during development, and the safety comments justify theunsafe impl Send/Sync.
136-140: The safety invariant is properly enforced throughregister_macro; no changes required.The unsafe
transmuteis safe becauseprepare()andMacroInstance::new()are only called fromregister_macro()(line 90 in mod.rs), which enforcesEnvironment<'static>in its signature. The Environment reference flows through the call chain unchanged, so the template bytecode genuinely outlives the cached state. The loose signatures onprepare()andnew()are acceptable for internal helper functions since their only caller isregister_macro.src/manifest/glob/mod.rs (1)
1-42: Well-structured module orchestration.The refactored glob module provides clear separation of concerns with proper error handling, conditional compilation for platform-specific behaviour, and appropriate use of Result types. The validation → normalization → expansion flow is easy to follow.
Also applies to: 51-87
src/manifest/glob/walk.rs (1)
1-6: Solid filesystem traversal implementation.The code correctly handles edge cases (empty stripped paths, non-files, non-UTF-8) with appropriate error propagation. The use of
cap_stdfor capability-restricted access andcaminofor UTF-8 paths is well-suited to the domain. Path normalisation to forward slashes ensures cross-platform consistency.Also applies to: 7-15, 22-23, 46-50, 63-74
…r handling - Force literal escapes immediately after normalization on Unix. - Replace repetitive glob error creation with a helper function. - Simplify UTF-8 path conversion and metadata fetching error handling. - Improve MacroStateGuard to return Self directly without error wrapping. These changes clean up code, reduce duplication, and improve maintainability. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey there - I've reviewed your changes - here's some feedback:
- In the new
GlobPatterntype, consider makingraw/normalizedprivate and exposing a constructor so that callers can’t construct inconsistent patterns (e.g. withnormalizednot derived fromraw).
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In the new `GlobPattern` type, consider making `raw`/`normalized` private and exposing a constructor so that callers can’t construct inconsistent patterns (e.g. with `normalized` not derived from `raw`).
## Individual Comments
### Comment 1
<location> `src/manifest/glob/tests.rs:31-36` </location>
<code_context>
- use anyhow::{Context, Result, anyhow, ensure};
-
- #[test]
- fn validate_brace_matching_accepts_balanced_braces() {
- let pattern = GlobPattern {
- raw: "{foo,bar}".into(),
- normalized: None,
- };
- assert!(validate_brace_matching(&pattern).is_ok());
- }
-
- #[test]
- fn validate_brace_matching_rejects_unmatched_closing() -> Result<()> {
- let pattern = GlobPattern {
- raw: "foo}".into(),
</code_context>
<issue_to_address>
**suggestion (testing):** Brace validation tests miss several important edge-cases (nested braces, braces inside classes, and escaped braces).
The validator manages depth, `in_class`, and `escaped` state, but current tests only cover a simple balanced case and a single trailing `}`. Please add coverage for:
- Nested and multiple brace sets: `"{foo,{bar,baz}}"`, `"{a,b}{c,d}"`.
- Braces inside character classes (ignored): `"[abc{]"`, `"[{}]"`.
- Escaped braces on Unix (where `escaped` is tracked): `"\{foo\}"`, ensuring they don't change `depth`.
- An unmatched opening brace at the end, asserting the specific error message/kind.
This will better exercise the validator state machine and guard against regressions.
</issue_to_address>
### Comment 2
<location> `src/manifest/glob/tests.rs:16-19` </location>
<code_context>
+use tempfile::tempdir;
+
+#[test]
+fn normalize_separators_collapses_mixed_slashes() {
+ let normalized = normalize_separators(r"foo\\bar/baz");
+ #[cfg(unix)]
+ assert_eq!(normalized, "foo//bar/baz");
+ #[cfg(not(unix))]
+ assert!(normalized.contains(std::path::MAIN_SEPARATOR));
+}
</code_context>
<issue_to_address>
**suggestion (testing):** On non-Unix platforms the separator normalization test could assert a stronger, deterministic expectation.
In `normalize_separators_collapses_mixed_slashes`, the non-Unix branch only checks `contains(std::path::MAIN_SEPARATOR)`, which could still pass with unexpected mixes of `\` and `/`. Instead, build the full expected path using `std::path::MAIN_SEPARATOR` (e.g. `format!("foo{sep}{sep}bar{sep}baz", sep = std::path::MAIN_SEPARATOR)`) and assert equality so the test is deterministic and fully verifies normalization on non-Unix platforms.
```suggestion
#[cfg(unix)]
assert_eq!(normalized, "foo//bar/baz");
#[cfg(not(unix))]
{
let sep = std::path::MAIN_SEPARATOR;
let expected = format!("foo{sep}{sep}bar{sep}baz");
assert_eq!(normalized, expected);
}
```
</issue_to_address>
### Comment 3
<location> `src/manifest/glob/validate.rs:5` </location>
<code_context>
- }
-}
-
-struct BraceValidator {
- state: BraceValidationState,
- escaped: bool,
</code_context>
<issue_to_address>
**issue (complexity):** Consider replacing the small `BraceValidator` state-struct and helper methods with a single inlined loop in `validate_brace_matching` that manages all state directly.
You can simplify this by inlining the state machine into a single loop and dropping the tiny helper methods that only manipulate internal fields. That keeps all behavior (escaping, character classes, brace depth) but makes the control flow easier to follow.
For example, you can replace `BraceValidator` and its methods with a single `validate_brace_matching` function:
```rust
pub(super) fn validate_brace_matching(pattern: &str) -> Result<(), Error> {
let mut depth = 0_i32;
let mut in_class = false;
let mut escaped = false;
let mut last_open_pos: Option<usize> = None;
for (pos, ch) in pattern.char_indices() {
// handle escape
if escaped {
escaped = false;
continue;
}
#[cfg(unix)]
{
if ch == '\\' {
escaped = true;
continue;
}
}
// update character class state
match ch {
'[' if !in_class => in_class = true,
']' if in_class => in_class = false,
_ => {}
}
// braces are ignored inside character classes
if in_class {
continue;
}
// handle braces
match ch {
'}' if depth == 0 => {
return Err(create_unmatched_brace_error(&GlobErrorContext {
pattern: pattern.to_owned(),
error_char: ch,
position: pos,
error_type: GlobErrorType::UnmatchedBrace,
}));
}
'{' => {
depth += 1;
last_open_pos = Some(pos);
}
'}' => {
depth -= 1;
}
_ => {}
}
}
if depth != 0 {
let pos = last_open_pos.unwrap_or(0);
Err(create_unmatched_brace_error(&GlobErrorContext {
pattern: pattern.to_owned(),
error_char: '{',
position: pos,
error_type: GlobErrorType::UnmatchedBrace,
}))
} else {
Ok(())
}
}
```
This preserves:
- `cfg(unix)` escape handling and the `escaped` toggle.
- Ignoring braces when `in_class == true`.
- Tracking of `depth` and `last_open_pos` for unmatched opening braces.
- Same error construction path.
By keeping all state in one place and using `continue` for escapes/character classes, you avoid the indirection of `process_character`, `handle_escape_sequence`, `handle_character_class`, and `handle_braces`, while keeping the behavior intact.
</issue_to_address>
### Comment 4
<location> `src/manifest/glob/normalize.rs:3` </location>
<code_context>
- }
-}
-
-pub(crate) fn normalize_separators(pattern: &str) -> String {
- let native = std::path::MAIN_SEPARATOR;
- #[cfg(unix)]
</code_context>
<issue_to_address>
**issue (complexity):** Consider inlining the backslash and escape-handling helpers directly into the main loops so all glob-normalisation rules are visible in one place and the call chain is shorter.
You can simplify this without changing behaviour by inlining the single‑use helpers into the main loops. That keeps all rules visible in one place and removes the deep call chain.
### 1. Inline backslash handling in `normalize_separators` (Unix)
Instead of `process_backslash` + `should_preserve_*` + `is_wildcard_continuation_char`, you can keep the logic but express it directly in the loop:
```rust
pub(crate) fn normalize_separators(pattern: &str) -> String {
let native = std::path::MAIN_SEPARATOR;
#[cfg(unix)]
{
let mut out = String::with_capacity(pattern.len());
let mut it = pattern.chars().peekable();
while let Some(c) = it.next() {
match c {
'\\' => {
match it.peek().copied() {
// Escapes we always keep as backslash
Some('[' | ']' | '{' | '}') => out.push('\\'),
// Wildcards: maybe treat '\' as escape, maybe as separator
Some('*' | '?') => {
let mut lookahead = it.clone();
lookahead.next(); // skip '*' or '?'
let preserve = match lookahead.peek() {
None => true,
Some(&ch) => ch.is_alphanumeric() || ch == '-' || ch == '_',
};
if preserve {
out.push('\\');
} else {
out.push(native);
}
}
// Any other char: '\' acts as separator
Some(_) => out.push(native),
// Trailing backslash: keep literal
None => out.push('\\'),
}
}
'/' => out.push(native),
_ => out.push(c),
}
}
out
}
#[cfg(not(unix))]
{
pattern.replace('/', &native.to_string())
}
}
```
This keeps exactly the same decision table but removes `process_backslash`, `should_preserve_backslash_for_bracket`, `should_preserve_backslash_for_wildcard`, and `is_wildcard_continuation_char`, making the separator/escape rule auditable in one place.
If you want to retain the “continuation char” concept, you can keep a *single* small helper:
```rust
#[cfg(unix)]
const fn is_wildcard_continuation_char(ch: char) -> bool {
ch.is_alphanumeric() || ch == '-' || ch == '_'
}
```
and call it inline in the `match lookahead.peek()` arm. That’s still easy to follow.
### 2. Inline escape replacement in `force_literal_escapes`
`process_escape_sequence` and `get_escape_replacement` are pure, single‑use helpers that just wrap a simple `match`. You can fold them into the main loop so the entire transformation is visible:
```rust
#[cfg(unix)]
pub(super) fn force_literal_escapes(pattern: &str) -> String {
let mut out = String::with_capacity(pattern.len());
let mut it = pattern.chars().peekable();
let mut in_class = false;
while let Some(c) = it.next() {
match c {
'[' if !in_class => {
in_class = true;
out.push('[');
}
']' if in_class => {
in_class = false;
out.push(']');
}
'\\' if !in_class => {
match it.peek().copied() {
Some(next) => match next {
'*' => { it.next(); out.push_str("[*]"); }
'?' => { it.next(); out.push_str("[?]"); }
'[' => { it.next(); out.push_str("[[]"); }
']' => { it.next(); out.push_str("[]]"); }
'{' => { it.next(); out.push_str("[{]"); }
'}' => { it.next(); out.push_str("[}]"); }
// default: keep backslash literal
_ => out.push('\\'),
},
// trailing backslash: keep literal
None => out.push('\\'),
}
}
_ => out.push(c),
}
}
out
}
```
This preserves all the existing behaviour (same mapping table and class handling) but removes `process_escape_sequence` and `get_escape_replacement`. The rules are now documented by the `match` arms themselves.
If combining the separator normalisation and escape forcing into a single pass is possible at the call site, you could then pull the combined logic into one function, but even just these two refactors will substantially reduce cognitive load while keeping the feature set intact.
</issue_to_address>
### Comment 5
<location> `src/manifest/diagnostics/mod.rs:150` </location>
<code_context>
-/// so the resulting diagnostic only carries the manifest name and error
-/// message.
-#[must_use]
-pub fn map_data_error(
- err: serde_json::Error,
- name: &ManifestName,
</code_context>
<issue_to_address>
**issue (review_instructions):** Add behavioural and unit tests covering `map_data_error` (and its diagnostic formatting) to exercise this changed behaviour.
Update the manifest diagnostics test suite to cover `map_data_error` and the `DataDiagnostic`/`ManifestError` behaviour. Validate the emitted message text, diagnostic code, and wrapping in `ManifestError::Parse` so that the new/changed string formatting (e.g. removal of the previous prefix) is locked in by tests.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*`
**Instructions:**
For any new feature or change to an existing feature, both behavioural and unit tests are required.
</details>
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| fn validate_brace_matching_accepts_balanced_braces() { | ||
| assert!(validate_brace_matching("{foo,bar}").is_ok()); | ||
| } | ||
|
|
||
| #[test] | ||
| fn validate_brace_matching_rejects_unmatched_closing() -> Result<()> { |
There was a problem hiding this comment.
suggestion (testing): Brace validation tests miss several important edge-cases (nested braces, braces inside classes, and escaped braces).
The validator manages depth, in_class, and escaped state, but current tests only cover a simple balanced case and a single trailing }. Please add coverage for:
- Nested and multiple brace sets:
"{foo,{bar,baz}}","{a,b}{c,d}". - Braces inside character classes (ignored):
"[abc{]","[{}]". - Escaped braces on Unix (where
escapedis tracked):"\{foo\}", ensuring they don't changedepth. - An unmatched opening brace at the end, asserting the specific error message/kind.
This will better exercise the validator state machine and guard against regressions.
| #[cfg(unix)] | ||
| assert_eq!(normalized, "foo//bar/baz"); | ||
| #[cfg(not(unix))] | ||
| assert!(normalized.contains(std::path::MAIN_SEPARATOR)); |
There was a problem hiding this comment.
suggestion (testing): On non-Unix platforms the separator normalization test could assert a stronger, deterministic expectation.
In normalize_separators_collapses_mixed_slashes, the non-Unix branch only checks contains(std::path::MAIN_SEPARATOR), which could still pass with unexpected mixes of \ and /. Instead, build the full expected path using std::path::MAIN_SEPARATOR (e.g. format!("foo{sep}{sep}bar{sep}baz", sep = std::path::MAIN_SEPARATOR)) and assert equality so the test is deterministic and fully verifies normalization on non-Unix platforms.
| #[cfg(unix)] | |
| assert_eq!(normalized, "foo//bar/baz"); | |
| #[cfg(not(unix))] | |
| assert!(normalized.contains(std::path::MAIN_SEPARATOR)); | |
| #[cfg(unix)] | |
| assert_eq!(normalized, "foo//bar/baz"); | |
| #[cfg(not(unix))] | |
| { | |
| let sep = std::path::MAIN_SEPARATOR; | |
| let expected = format!("foo{sep}{sep}bar{sep}baz"); | |
| assert_eq!(normalized, expected); | |
| } |
| use super::{GlobErrorContext, GlobErrorType, create_unmatched_brace_error}; | ||
| use minijinja::Error; | ||
|
|
||
| struct BraceValidator { |
There was a problem hiding this comment.
issue (complexity): Consider replacing the small BraceValidator state-struct and helper methods with a single inlined loop in validate_brace_matching that manages all state directly.
You can simplify this by inlining the state machine into a single loop and dropping the tiny helper methods that only manipulate internal fields. That keeps all behavior (escaping, character classes, brace depth) but makes the control flow easier to follow.
For example, you can replace BraceValidator and its methods with a single validate_brace_matching function:
pub(super) fn validate_brace_matching(pattern: &str) -> Result<(), Error> {
let mut depth = 0_i32;
let mut in_class = false;
let mut escaped = false;
let mut last_open_pos: Option<usize> = None;
for (pos, ch) in pattern.char_indices() {
// handle escape
if escaped {
escaped = false;
continue;
}
#[cfg(unix)]
{
if ch == '\\' {
escaped = true;
continue;
}
}
// update character class state
match ch {
'[' if !in_class => in_class = true,
']' if in_class => in_class = false,
_ => {}
}
// braces are ignored inside character classes
if in_class {
continue;
}
// handle braces
match ch {
'}' if depth == 0 => {
return Err(create_unmatched_brace_error(&GlobErrorContext {
pattern: pattern.to_owned(),
error_char: ch,
position: pos,
error_type: GlobErrorType::UnmatchedBrace,
}));
}
'{' => {
depth += 1;
last_open_pos = Some(pos);
}
'}' => {
depth -= 1;
}
_ => {}
}
}
if depth != 0 {
let pos = last_open_pos.unwrap_or(0);
Err(create_unmatched_brace_error(&GlobErrorContext {
pattern: pattern.to_owned(),
error_char: '{',
position: pos,
error_type: GlobErrorType::UnmatchedBrace,
}))
} else {
Ok(())
}
}This preserves:
cfg(unix)escape handling and theescapedtoggle.- Ignoring braces when
in_class == true. - Tracking of
depthandlast_open_posfor unmatched opening braces. - Same error construction path.
By keeping all state in one place and using continue for escapes/character classes, you avoid the indirection of process_character, handle_escape_sequence, handle_character_class, and handle_braces, while keeping the behavior intact.
| @@ -0,0 +1,111 @@ | |||
| //! Separator and escape normalisation for glob patterns. | |||
|
|
|||
| pub(crate) fn normalize_separators(pattern: &str) -> String { | |||
There was a problem hiding this comment.
issue (complexity): Consider inlining the backslash and escape-handling helpers directly into the main loops so all glob-normalisation rules are visible in one place and the call chain is shorter.
You can simplify this without changing behaviour by inlining the single‑use helpers into the main loops. That keeps all rules visible in one place and removes the deep call chain.
1. Inline backslash handling in normalize_separators (Unix)
Instead of process_backslash + should_preserve_* + is_wildcard_continuation_char, you can keep the logic but express it directly in the loop:
pub(crate) fn normalize_separators(pattern: &str) -> String {
let native = std::path::MAIN_SEPARATOR;
#[cfg(unix)]
{
let mut out = String::with_capacity(pattern.len());
let mut it = pattern.chars().peekable();
while let Some(c) = it.next() {
match c {
'\\' => {
match it.peek().copied() {
// Escapes we always keep as backslash
Some('[' | ']' | '{' | '}') => out.push('\\'),
// Wildcards: maybe treat '\' as escape, maybe as separator
Some('*' | '?') => {
let mut lookahead = it.clone();
lookahead.next(); // skip '*' or '?'
let preserve = match lookahead.peek() {
None => true,
Some(&ch) => ch.is_alphanumeric() || ch == '-' || ch == '_',
};
if preserve {
out.push('\\');
} else {
out.push(native);
}
}
// Any other char: '\' acts as separator
Some(_) => out.push(native),
// Trailing backslash: keep literal
None => out.push('\\'),
}
}
'/' => out.push(native),
_ => out.push(c),
}
}
out
}
#[cfg(not(unix))]
{
pattern.replace('/', &native.to_string())
}
}This keeps exactly the same decision table but removes process_backslash, should_preserve_backslash_for_bracket, should_preserve_backslash_for_wildcard, and is_wildcard_continuation_char, making the separator/escape rule auditable in one place.
If you want to retain the “continuation char” concept, you can keep a single small helper:
#[cfg(unix)]
const fn is_wildcard_continuation_char(ch: char) -> bool {
ch.is_alphanumeric() || ch == '-' || ch == '_'
}and call it inline in the match lookahead.peek() arm. That’s still easy to follow.
2. Inline escape replacement in force_literal_escapes
process_escape_sequence and get_escape_replacement are pure, single‑use helpers that just wrap a simple match. You can fold them into the main loop so the entire transformation is visible:
#[cfg(unix)]
pub(super) fn force_literal_escapes(pattern: &str) -> String {
let mut out = String::with_capacity(pattern.len());
let mut it = pattern.chars().peekable();
let mut in_class = false;
while let Some(c) = it.next() {
match c {
'[' if !in_class => {
in_class = true;
out.push('[');
}
']' if in_class => {
in_class = false;
out.push(']');
}
'\\' if !in_class => {
match it.peek().copied() {
Some(next) => match next {
'*' => { it.next(); out.push_str("[*]"); }
'?' => { it.next(); out.push_str("[?]"); }
'[' => { it.next(); out.push_str("[[]"); }
']' => { it.next(); out.push_str("[]]"); }
'{' => { it.next(); out.push_str("[{]"); }
'}' => { it.next(); out.push_str("[}]"); }
// default: keep backslash literal
_ => out.push('\\'),
},
// trailing backslash: keep literal
None => out.push('\\'),
}
}
_ => out.push(c),
}
}
out
}This preserves all the existing behaviour (same mapping table and class handling) but removes process_escape_sequence and get_escape_replacement. The rules are now documented by the match arms themselves.
If combining the separator normalisation and escape forcing into a single pass is possible at the call site, you could then pull the combined logic into one function, but even just these two refactors will substantially reduce cognitive load while keeping the feature set intact.
Summary
Changes
Diagnostics
Glob handling
Jinja macros
Tests and test structure
Rationale
Migration / Compatibility
Test plan
If any CI step fails due to path or import changes, please indicate the exact module path adjustments needed for compatibility.
🌿 Generated by Terry
ℹ️ Tag @terragon-labs to ask questions and address PR feedback
📎 Task: https://www.terragonlabs.com/task/3f86ec8a-9f88-481b-b335-ffdfd7374b03
Summary by Sourcery
Modularise manifest internals by splitting diagnostics, glob handling, and Jinja macro helpers into focused submodules while keeping public behaviour stable.
Bug Fixes:
Enhancements:
Tests: