feat(c14n): implement XML canonicalization (inclusive + exclusive)#6
- Inclusive C14N 1.0: all in-scope namespaces, sorted by prefix
- Exclusive C14N 1.0: visibly-utilized namespaces + PrefixList
- Document-order tree walker on roxmltree
- Text/attribute escaping per W3C spec
- Attribute sorting (ns-uri + local-name)
- Empty element expansion, CDATA flattening, PI normalization
- Document-level comment/PI separator handling
- 25 unit + 13 integration tests (xmllint reference outputs)
- Update deps: roxmltree 0.21, x509-parser 0.18

Closes #5
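For readers unfamiliar with the attribute ordering mentioned in the description, here is a minimal sketch of sorting by namespace URI first and local name second. The `Attr` struct and `sort_attributes` function are hypothetical stand-ins, not the PR's actual types; the PR itself sorts roxmltree attributes.

```rust
// Hypothetical attribute record; the PR works on roxmltree attributes instead.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Attr {
    ns_uri: String, // empty string for attributes in no namespace
    local_name: String,
}

/// Sort with the namespace URI as primary key and the local name as secondary key.
/// An empty URI compares lowest, so unqualified attributes come first.
fn sort_attributes(attrs: &mut [Attr]) {
    attrs.sort_by(|a, b| {
        a.ns_uri
            .cmp(&b.ns_uri)
            .then_with(|| a.local_name.cmp(&b.local_name))
    });
}
```

Note that the prefix plays no role in the ordering; only the URI the prefix is bound to matters.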
📝 Walkthrough

Adds a complete C14N implementation: canonicalization algorithms (inclusive/exclusive), URI-configurable algorithm struct, namespace renderers, document-order serializer, escaping utilities, and extensive unit/integration tests; exposes …

Sequence Diagram(s)

sequenceDiagram
participant Client
participant Algo as C14nAlgorithm
participant Parser as roxmltree.Document
participant Serializer as serialize_canonical
participant NsRenderer as NsRenderer (Inclusive/Exclusive)
participant Output as Vec<u8>
Client->>Algo: new(...) / from_uri(...)
Client->>Parser: parse XML bytes
Parser-->>Client: Document
Client->>Serializer: serialize_canonical(doc, node_set, with_comments, ns_renderer)
Serializer->>Serializer: traverse nodes (document order)
loop per-element
Serializer->>NsRenderer: render_namespaces(node, parent_rendered)
NsRenderer-->>Serializer: sorted namespace declarations
Serializer->>Serializer: emit start tag + ns decls
Serializer->>Serializer: sort & emit attributes (escape_attr)
Serializer->>Serializer: emit children / text (escape_text)
Serializer->>Serializer: emit end tag
end
Serializer->>Output: write canonical bytes
Output-->>Client: Vec<u8>
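The per-element loop in the diagram can be sketched roughly as follows. This is a toy model, not the PR's serializer: `Element`, `Child`, and `write_element` are hypothetical, and the sketch assumes namespace declarations and attributes arrive already filtered, sorted, and escaped.

```rust
// Toy element tree; the real implementation walks a roxmltree document instead.
struct Element {
    name: String,
    ns_decls: Vec<(String, String)>, // (prefix, uri), assumed already filtered and sorted
    attrs: Vec<(String, String)>,    // (qualified name, escaped value), assumed already sorted
    children: Vec<Child>,
}

enum Child {
    Element(Element),
    Text(String), // assumed already escaped
}

/// Emit start tag, namespace declarations, attributes, children, then end tag.
fn write_element(el: &Element, out: &mut String) {
    out.push('<');
    out.push_str(&el.name);
    for (prefix, uri) in &el.ns_decls {
        if prefix.is_empty() {
            out.push_str(&format!(" xmlns=\"{uri}\""));
        } else {
            out.push_str(&format!(" xmlns:{prefix}=\"{uri}\""));
        }
    }
    for (name, value) in &el.attrs {
        out.push_str(&format!(" {name}=\"{value}\""));
    }
    out.push('>');
    for child in &el.children {
        match child {
            Child::Element(e) => write_element(e, out),
            Child::Text(t) => out.push_str(t),
        }
    }
    // Canonical form never uses the `<a/>` shorthand, so an end tag is always written.
    out.push_str("</");
    out.push_str(&el.name);
    out.push('>');
}
```

Empty elements therefore come out as a start/end tag pair, which is the "empty element expansion" listed in the PR description.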
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pull request overview
This PR adds a Rust implementation of XML Canonicalization (C14N) with inclusive and exclusive namespace rendering, updates crate docs to demonstrate canonicalization, and introduces integration tests that validate outputs against xmllint reference canonicalization.
Changes:
- Introduce a C14N serializer plus inclusive/exclusive namespace rendering strategies and a public `canonicalize_xml` API.
- Add integration tests comparing canonical output to `xmllint` across several representative XML inputs.
- Bump XML/X.509 parsing dependencies and tweak `.gitignore`.
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| src/c14n/mod.rs | Defines public C14N API/types (C14nAlgorithm, C14nMode, canonicalize(_xml)) and URI parsing/round-tripping. |
| src/c14n/serialize.rs | Implements document-order canonical serialization (elements, attrs, comments, PI, doc-level separators). |
| src/c14n/ns_inclusive.rs | Inclusive namespace rendering strategy and unit tests. |
| src/c14n/ns_exclusive.rs | Exclusive namespace rendering strategy (visibly-utilized + PrefixList) and unit tests. |
| src/c14n/escape.rs | Escaping helpers for text and attribute values plus unit tests. |
| tests/c14n_integration.rs | Integration tests asserting canonical output matches xmllint for multiple vectors and edge cases. |
| src/lib.rs | Updates crate-level Quick Start doc example to use the new C14N API. |
| Cargo.toml | Dependency bumps (roxmltree, x509-parser). |
| .gitignore | Adds ignore rules for arch/ and ROADMAP.md. |
Actionable comments posted: 2
🧹 Nitpick comments (2)
src/c14n/mod.rs (1)

51-60: Make the invariant-bearing `C14nAlgorithm` fields private.

`inclusive_prefixes` is normalized by `with_prefix_list()`, but external callers can currently bypass that by mutating the public field directly and inserting a literal `#default` or other meaningless combinations. Keeping these fields private and exposing accessors/builders would preserve the module's own invariants across all construction paths.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/c14n/mod.rs` around lines 51 - 60, The C14nAlgorithm struct exposes invariant-bearing fields (mode, with_comments, inclusive_prefixes) publicly, allowing callers to bypass normalization done by with_prefix_list(); make these fields private (e.g., rename to mode, with_comments, inclusive_prefixes as private) and provide controlled construction and access via a builder or public constructors/accessors (e.g., new(), with_prefix_list(), mode(), with_comments(), inclusive_prefixes() or an iterator/clone-returning getter) so that inclusive_prefixes is always normalized (convert "#default" to "" and enforce other invariants) and external code cannot directly mutate the HashSet.

src/c14n/serialize.rs (1)

173-181: Remove dead code block or refactor to clarify intent.

This block doesn't perform any logic; it only suppresses the unused variable warning with `let _ = pred;`. If attribute filtering is intentionally not implemented because roxmltree doesn't expose attribute nodes separately, remove the block and document this limitation elsewhere (e.g., in the function's doc comment or as a known gap).

♻️ Suggested refactor

         let mut attrs: Vec<_> = node.attributes().collect();
    -    // Filter attributes by node-set if applicable.
    -    if let Some(pred) = node_set {
    -        // For document subsets, only include attributes that are "in the set".
    -        // roxmltree doesn't give attribute nodes separate Node identity,
    -        // so when we have a node_set, all attributes of an included element
    -        // are included (matching xmlsec1 behavior for typical use cases).
    -        let _ = pred;
    -        // All attributes included if the element is in the set.
    -    }
    +    // Note: roxmltree doesn't expose attribute nodes with separate Node identity,
    +    // so node_set filtering cannot distinguish individual attributes. When an
    +    // element is in the set, all its attributes are included (matches xmlsec1 behavior).
         attrs.sort_by(|a, b| {

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/c14n/serialize.rs` around lines 173 - 181, Remove the dead block that only does "let _ = pred;" inside the attribute filtering section (the code referencing node_set and pred) and instead document the limitation: update the serialize function's doc comment to state that attribute-level filtering is not implemented because roxmltree does not expose attribute nodes separately, so attributes of an included element are always included; if you prefer to keep an explicit no-op, replace the block with a brief explanatory comment referencing node_set/pred rather than using let _ = pred to silence warnings.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/c14n/mod.rs`:
- Around line 73-90: from_uri advertises support for C14n 1.1 by returning
C14nMode::Inclusive1_1 but canonicalize still processes Inclusive1_1 using the
1.0 codepath; fix this by either implementing the 1.1-specific canonicalization
rules in the canonicalize function/renderer for the C14nMode::Inclusive1_1
branch (and ensure uri() and from_uri remain consistent) or reject 1.1 URIs in
from_uri by returning None for those URIs (remove or change the
"http://www.w3.org/2006/12/xml-c14n11" and "#WithComments" arms) so callers
cannot request 1.1 when only 1.0 behavior is implemented; update tests and any
use of inclusive_prefixes handling as needed to reflect the chosen approach.
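
As an illustration of the second option above (rejecting 1.1 URIs rather than silently falling back), here is a minimal sketch. The `Mode` enum and `parse_c14n_uri` function are hypothetical simplifications of the PR's `C14nMode` and `from_uri`; the 1.0 and exclusive URI strings are the W3C algorithm identifiers.

```rust
// Hypothetical, reduced stand-in for the PR's C14nMode.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Inclusive1_0,
    Exclusive1_0,
}

/// Map an algorithm URI to (mode, with_comments).
/// Returns None for unknown URIs and for C14N 1.1, which this sketch treats as unimplemented.
fn parse_c14n_uri(uri: &str) -> Option<(Mode, bool)> {
    match uri {
        "http://www.w3.org/TR/2001/REC-xml-c14n-20010315" => Some((Mode::Inclusive1_0, false)),
        "http://www.w3.org/TR/2001/REC-xml-c14n-20010315#WithComments" => {
            Some((Mode::Inclusive1_0, true))
        }
        "http://www.w3.org/2001/10/xml-exc-c14n#" => Some((Mode::Exclusive1_0, false)),
        "http://www.w3.org/2001/10/xml-exc-c14n#WithComments" => Some((Mode::Exclusive1_0, true)),
        // The 1.1 URIs are deliberately not matched, so callers cannot request 1.1
        // and receive 1.0 behavior by accident.
        _ => None,
    }
}
```

Returning `None` (or a dedicated error) keeps the URI parser honest about what the serializer can actually do.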
In `@src/c14n/ns_exclusive.rs`:
- Around line 81-107: visibly_utilized_prefixes() (and likewise
write_qualified_name() in serialize.rs) should use the lexical prefix parsed
from each QName instead of reverse-mapping via lookup_prefix(namespace_uri);
update visibly_utilized_prefixes to read the element's lexical prefix from
node.tag_name() (and each attribute's lexical prefix from attr.name()/QName) and
insert that exact prefix (including empty string for default) into the utilized
set, removing any lookup_prefix calls, and update write_qualified_name() to emit
the stored lexical prefix rather than resolving by namespace URI; add a
regression test that creates two different prefixes bound to the same URI (e.g.,
xmlns:a="u" xmlns:b="u" and an attribute using b:) to ensure the serializer
preserves the actual lexical prefix.
- Escape \r to `&#xD;` in PI and XML node content (C14N spec section 2.3)
- C14N 1.1 returns UnsupportedAlgorithm instead of silent 1.0 fallback
- Allow clippy::unwrap_used in test modules
- Document lookup_prefix() limitation for aliased namespace prefixes
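
A minimal sketch of the escaping rules this commit touches, for reference. The names `escape_text` and `escape_attr` appear in the walkthrough, but the signatures here are assumptions, and `escape_cr` is a hypothetical helper for the CR-only case the commit describes.

```rust
/// Text nodes: `&`, `<`, `>` and CR are replaced.
fn escape_text(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '\r' => out.push_str("&#xD;"),
            other => out.push(other),
        }
    }
    out
}

/// Attribute values: `&`, `<`, `"`, TAB, LF and CR are replaced; `>` is left as-is.
fn escape_attr(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '"' => out.push_str("&quot;"),
            '\t' => out.push_str("&#x9;"),
            '\n' => out.push_str("&#xA;"),
            '\r' => out.push_str("&#xD;"),
            other => out.push(other),
        }
    }
    out
}

/// CR-only escaping, as the commit describes for PI content (hypothetical helper name).
fn escape_cr(input: &str) -> String {
    input.replace('\r', "&#xD;")
}
```

Escaping CR explicitly matters because an XML parser normalizes a literal CR in the input to LF, so a CR that survives parsing can only have come from a character reference and must be written back as one.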
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/c14n/serialize.rs (1)

175-183: Remove or rewrite the no-op attribute-filter block.

This block currently suggests attribute node-set filtering but performs no filtering. Consider removing it and keeping a single explicit limitation note in docs/comments to avoid false expectations.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/c14n/serialize.rs` around lines 175 - 183, The no-op attribute-filter block (the if let Some(pred) { let _ = pred; ... } section) should be removed and replaced with a single clear comment stating the limitation that attribute nodes are not separately represented by roxmltree and therefore all attributes of an included element are always serialized when the element is in the node_set; update nearby comment in serialize.rs to reflect this explicit limitation (referencing node_set and roxmltree) so callers won't expect attribute-level filtering.

src/c14n/mod.rs (1)

94-106: Consider guarding `with_prefix_list` to Exclusive mode.

`with_prefix_list` silently accepts non-exclusive modes even though it is only meaningful for Exclusive C14N. A guard (or explicit doc note on no-op behavior outside exclusive mode) would make misuse less likely.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/c14n/mod.rs` around lines 94 - 106, with_prefix_list currently sets inclusive_prefixes unconditionally though it only makes sense for Exclusive C14N; add a guard in with_prefix_list that checks the canonicalization mode (e.g., a field like self.mode or self.canonicalization_mode) and either no-op and return self when the mode is not Exclusive or return a clear error/perform a debug/assert; update the method to only mutate self.inclusive_prefixes when mode == Exclusive (or add a short doc comment on no-op behavior if you prefer the no-op approach). Ensure you reference the with_prefix_list method and the inclusive_prefixes field when implementing the guard.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/c14n/mod.rs`:
- Around line 4-6: Update the module-level documentation in src/c14n/mod.rs to
not claim Canonical XML 1.1 support: remove or change the bullet for "Canonical
XML 1.1" so it no longer advertises implementation, and add a short note that
attempting to use C14N 1.1 currently returns UnsupportedAlgorithm (referencing
the UnsupportedAlgorithm error) so callers know it isn't supported yet; keep the
other bullets (Canonical XML 1.0 and Exclusive XML Canonicalization 1.0)
unchanged.
- Make C14nAlgorithm fields private, add accessor methods
- Remove dead attribute filter block with let _ = pred suppression
- Replace with doc explaining roxmltree attribute node limitation
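
For context, the private-fields-plus-accessors shape could look roughly like the sketch below. The method names come from the review comments, but every signature is an assumption, the enum is reduced to two variants, and the Exclusive-only guard in `with_prefix_list` is the reviewer's suggestion, which the thread does not confirm was adopted.

```rust
use std::collections::HashSet;

// Reduced stand-in for the PR's C14nMode (assumed variant names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum C14nMode {
    Inclusive1_0,
    Exclusive1_0,
}

#[derive(Debug, Clone)]
pub struct C14nAlgorithm {
    // Private fields: callers cannot bypass the normalization below.
    mode: C14nMode,
    with_comments: bool,
    inclusive_prefixes: HashSet<String>,
}

impl C14nAlgorithm {
    pub fn new(mode: C14nMode, with_comments: bool) -> Self {
        Self { mode, with_comments, inclusive_prefixes: HashSet::new() }
    }

    /// Parse a whitespace-separated PrefixList, mapping the `#default` token to "".
    /// Assumed behavior: a no-op outside Exclusive mode.
    pub fn with_prefix_list(mut self, prefix_list: &str) -> Self {
        if self.mode == C14nMode::Exclusive1_0 {
            self.inclusive_prefixes = prefix_list
                .split_whitespace()
                .map(|p| if p == "#default" { String::new() } else { p.to_string() })
                .collect();
        }
        self
    }

    pub fn mode(&self) -> C14nMode {
        self.mode
    }

    pub fn with_comments(&self) -> bool {
        self.with_comments
    }

    pub fn inclusive_prefixes(&self) -> &HashSet<String> {
        &self.inclusive_prefixes
    }
}
```

Returning a shared reference from `inclusive_prefixes()` lets callers inspect the set without being able to insert a raw `#default` entry.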
Pull request overview
Adds a new XML Canonicalization (C14N) implementation to xml-sec, including inclusive/exclusive namespace handling and serialization, plus integration tests validated against xmllint reference output.
Changes:
- Introduces a new `xml_sec::c14n` API (`C14nMode`, `C14nAlgorithm`, `canonicalize[_xml]`) with inclusive/exclusive renderers and canonical serializer.
- Adds extensive integration coverage comparing outputs to `xmllint` for namespaces, attribute sorting, comments/PIs, CDATA, and idempotency.
- Updates public crate docs to showcase canonicalization usage; bumps `roxmltree` and `x509-parser` versions; updates `.gitignore`.
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/c14n_integration.rs | Adds xmllint-verified integration tests for inclusive/exclusive C14N behavior. |
| src/lib.rs | Updates crate-level Quick Start example to use the new C14N API. |
| src/c14n/mod.rs | Defines the public C14N API, algorithm parsing/URI mapping, and top-level canonicalization entrypoints. |
| src/c14n/serialize.rs | Implements document-order canonical serialization, escaping, attribute sorting, doc-level PI/comment handling, and node-set filtering hooks. |
| src/c14n/ns_inclusive.rs | Implements inclusive namespace rendering strategy. |
| src/c14n/ns_exclusive.rs | Implements exclusive namespace rendering strategy with PrefixList support. |
| src/c14n/escape.rs | Adds canonical escaping helpers for text, attributes, and CR handling. |
| Cargo.toml | Bumps roxmltree and x509-parser dependency versions. |
| .gitignore | Ignores arch/ and ROADMAP.md. |
… edge case

- UTF-8 decode error now says "invalid UTF-8:" instead of "XML parse error"
- Inclusive ns doc: clarify redundant redeclarations are suppressed
- Document doc-level newline limitation with subset filtering
Pull request overview
Adds a full XML Canonicalization (C14N) implementation (inclusive 1.0 + exclusive 1.0, with comments option) using roxmltree, along with integration tests that validate output against xmllint reference strings.
Changes:
- Introduce C14N core API (`C14nMode`, `C14nAlgorithm`, `canonicalize*`) and implement canonical serialization + namespace rendering (inclusive/exclusive).
- Add extensive integration tests covering namespace handling, attribute sorting, comments/PI behavior, idempotency, CDATA flattening, and namespace undeclaration.
- Update crate docs quick-start to demonstrate canonicalization; bump `roxmltree` and `x509-parser`; extend `.gitignore`.
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/c14n_integration.rs | Adds xmllint-based integration vectors for inclusive/exclusive canonicalization behavior. |
| src/lib.rs | Updates crate-level Quick Start docs to showcase C14N usage. |
| src/c14n/serialize.rs | Implements document-order canonical serialization including attribute sorting and comment/PI rules. |
| src/c14n/ns_inclusive.rs | Adds inclusive namespace rendering strategy (in-scope namespaces, suppress redundant redecls). |
| src/c14n/ns_exclusive.rs | Adds exclusive namespace rendering strategy (visibly utilized + forced prefix list). |
| src/c14n/mod.rs | Defines the new public C14N API types and canonicalization entrypoints. |
| src/c14n/escape.rs | Adds canonical escaping helpers for text, attributes, and CR normalization. |
| Cargo.toml | Bumps roxmltree to 0.21 and x509-parser to 0.18. |
| .gitignore | Ignores arch/ and ROADMAP.md. |
Pull request overview
This PR introduces a full C14N (canonical XML) implementation backed by roxmltree, along with integration tests validated against xmllint, and updates public documentation to demonstrate the new API.
Changes:
- Added canonical XML serialization with inclusive and exclusive namespace rendering strategies.
- Introduced extensive integration tests comparing outputs to `xmllint` reference canonicalization.
- Updated crate docs and bumped `roxmltree`/`x509-parser` dependency versions.
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/c14n_integration.rs | Adds xmllint-validated integration tests for inclusive/exclusive C14N behaviors and edge cases. |
| src/lib.rs | Updates crate-level “Quick Start” docs to show the new C14N API. |
| src/c14n/mod.rs | Implements the new public C14N API (C14nMode, C14nAlgorithm, canonicalization entrypoints). |
| src/c14n/serialize.rs | Adds document-order canonical serialization logic (elements, attrs, comments, PI, text). |
| src/c14n/ns_inclusive.rs | Implements inclusive namespace rendering strategy for C14N 1.0. |
| src/c14n/ns_exclusive.rs | Implements exclusive namespace rendering strategy + PrefixList support. |
| src/c14n/escape.rs | Adds escaping utilities for canonical XML text/attr/comment/PI content. |
| Cargo.toml | Bumps roxmltree to 0.21 and x509-parser to 0.18. |
| .gitignore | Ignores arch/ and ROADMAP.md. |
Pull request overview
Introduces a new XML Canonicalization (C14N) implementation (inclusive/exclusive) with serialization, namespace rendering strategies, escaping rules, and comprehensive integration/unit tests, plus updates to crate docs and a couple dependency bumps.
Changes:
- Adds a full C14N module implementation (`serialize`, `ns_inclusive`, `ns_exclusive`, `escape`) with a new public API (`C14nMode`, `C14nAlgorithm`, `canonicalize[_xml]`).
- Adds integration tests comparing outputs against `xmllint` reference canonicalization.
- Updates crate-level docs to demonstrate C14N usage; bumps `roxmltree` and `x509-parser`.
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| tests/c14n_integration.rs | New integration test suite validating outputs against xmllint reference C14N. |
| src/lib.rs | Updates crate “Quick Start” example to use the new C14N API. |
| src/c14n/mod.rs | New public C14N API surface + algorithm parsing/selection and error types. |
| src/c14n/serialize.rs | Canonical document-order serializer with namespace rendering hooks. |
| src/c14n/ns_inclusive.rs | Inclusive C14N namespace rendering strategy. |
| src/c14n/ns_exclusive.rs | Exclusive C14N namespace rendering strategy + PrefixList support. |
| src/c14n/escape.rs | Canonical text/attribute/comment/PI escaping helpers + tests. |
| Cargo.toml | Bumps roxmltree and x509-parser versions. |
| .gitignore | Ignores arch/ and ROADMAP.md. |
…fix()

Replace ambiguous lookup_prefix(namespace_uri) reverse-mapping with direct extraction of lexical prefixes from source XML byte positions using roxmltree's `positions` feature.

- Enable roxmltree `positions` feature for byte-range access
- Add prefix.rs: element_prefix() and attribute_prefix() extract the actual prefix used in the source QName via Node::range() and Attribute::range_qname()
- Replace lookup_prefix() in serialize.rs write_qualified_name and write_attribute_name
- Replace lookup_prefix() in ns_exclusive.rs visibly_utilized_prefixes
- Add regression tests for aliased prefixes (same URI, different prefix)
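
The idea of reading the prefix from the source text instead of reverse-mapping a URI can be sketched as below. This is a std-only illustration under stated assumptions: `split_qname` is a hypothetical helper, and this `element_prefix` takes the raw source plus the byte offset of the element's `<`, whereas the PR obtains byte ranges from roxmltree.

```rust
/// Split a lexical QName into (prefix, local name); the prefix is "" when absent.
fn split_qname(qname: &str) -> (&str, &str) {
    match qname.split_once(':') {
        Some((prefix, local)) => (prefix, local),
        None => ("", qname),
    }
}

/// Recover the prefix an element actually used in the source.
/// Assumes `start` is the byte offset of the element's opening `<`.
fn element_prefix(source: &str, start: usize) -> &str {
    let rest = &source[start + 1..]; // skip '<'
    // The tag name ends at the first whitespace, '>' or '/'.
    let end = rest
        .find(|c: char| c.is_whitespace() || c == '>' || c == '/')
        .unwrap_or(rest.len());
    split_qname(&rest[..end]).0
}
```

With `xmlns:a="u" xmlns:b="u"` in scope, a URI-based lookup cannot tell `a:` from `b:`, while the lexical route returns whichever prefix the author wrote.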
@coderabbitai re-review

✅ Actions performed: Review triggered.
Pull request overview
Adds a full XML Canonicalization (C14N) implementation to the crate (inclusive + exclusive 1.0), aligning outputs against xmllint reference behavior and exposing a public canonicalization API under xml_sec::c14n.
Changes:
- Implement C14N 1.0 serialization, escaping rules, and namespace rendering strategies (inclusive/exclusive).
- Introduce a new public C14N API (`C14nMode`, `C14nAlgorithm`, `canonicalize`, `canonicalize_xml`) and update crate docs to demonstrate it.
- Add comprehensive integration tests using precomputed `xmllint --c14n/--exc-c14n` reference outputs; bump parsing deps (`roxmltree` w/ `positions`).
Reviewed changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/c14n_integration.rs | New integration tests validating canonical output vs xmllint reference cases. |
| src/lib.rs | Updates crate “Quick Start” docs to show C14N usage. |
| src/c14n/mod.rs | New public C14N API surface + dispatch to inclusive/exclusive renderers. |
| src/c14n/serialize.rs | Core document-order serializer for canonical XML output. |
| src/c14n/escape.rs | Canonical escaping for text/attributes/comments/PIs. |
| src/c14n/prefix.rs | Lexical prefix recovery using roxmltree positions to preserve source prefixes. |
| src/c14n/ns_inclusive.rs | Inclusive namespace rendering logic. |
| src/c14n/ns_exclusive.rs | Exclusive namespace rendering logic + PrefixList support. |
| Cargo.toml | Updates roxmltree and x509-parser versions (enables positions). |
| .gitignore | Ignores additional local paths/files. |
When canonicalizing a document subset where ancestor elements are excluded, xmlns="" must still be emitted if the source tree has a non-empty default namespace that needs undeclaring. Previously, suppression was based only on parent_rendered (output ancestors), which missed inherited default namespaces from non-output ancestors.

- Add has_in_scope_default_namespace() to check ancestor chain
- Use it in both inclusive and exclusive ns renderers
- Add subset integration tests for xmlns="" edge cases
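
A rough sketch of the ancestor-chain check, using a toy index-based tree rather than roxmltree. The function name matches the commit, but the `Element` struct and the signature are assumptions made for illustration.

```rust
// Toy node: index of the parent and the value of an `xmlns="..."` declaration, if any.
struct Element {
    parent: Option<usize>,
    default_ns_decl: Option<&'static str>,
}

/// Walk up from `node`'s parent and report whether the nearest ancestor that
/// declares a default namespace declares a non-empty one.
fn has_in_scope_default_namespace(tree: &[Element], node: usize) -> bool {
    let mut current = tree[node].parent;
    while let Some(idx) = current {
        if let Some(uri) = tree[idx].default_ns_decl {
            // The nearest declaration wins; `xmlns=""` means nothing is left to undeclare.
            return !uri.is_empty();
        }
        current = tree[idx].parent;
    }
    false
}
```

The check deliberately looks at the source tree, not at what has been rendered so far, so it still fires when the declaring ancestor is outside the output subset.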
🧹 Nitpick comments (1)

src/c14n/ns_exclusive.rs (1)

40-78: Consider extracting shared namespace-declaration pipeline logic.

The filter/redundancy-check/sort/rendered-map update flow here is very close to `src/c14n/ns_inclusive.rs` (Line 36-75). A shared helper would reduce drift risk between inclusive and exclusive paths.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/c14n/ns_exclusive.rs` around lines 40 - 78, The namespace-declaration selection, filtering, sorting and rendered-map update logic in this block (using variables/functions like node.namespaces(), utilized, self.inclusive_prefixes, parent_rendered, has_in_scope_default_namespace, decls, and rendered.insert) largely duplicates the pipeline in the inclusive implementation; extract a shared helper (e.g., collect_namespace_declarations or build_ns_decls) that accepts the node, utilized set, inclusive_prefixes flag/set, parent_rendered view and a flag/closure to run the default-namespace suppression check, returns a sorted Vec<(String,String)> of declarations, and then reuse that helper from both ns_exclusive and ns_inclusive to push decls and update rendered (preserve the xml-prefix skip and the "suppress xmlns=\"\"" semantics by parameterizing the helper), ensuring the helper performs the same prefix-redundancy and ancestor-comparison checks before sorting and returning results.
Pull request overview
This PR introduces a new Canonical XML (C14N) implementation in xml_sec, including inclusive/exclusive namespace rendering, lexical prefix preservation (via roxmltree positions), and comprehensive integration tests against xmllint-derived reference outputs.
Changes:
- Implement canonical XML serialization with inclusive and exclusive namespace strategies (plus prefix-list support for exclusive C14N).
- Add lexical prefix recovery to preserve the original prefixes when multiple prefixes map to the same namespace URI.
- Add integration tests covering namespace/attribute ordering, comments/PIs, CDATA flattening, idempotency, and document-subset edge cases.
Reviewed changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/c14n_integration.rs | Adds integration coverage using xmllint reference outputs for inclusive/exclusive C14N behaviors. |
| src/lib.rs | Updates crate-level “Quick Start” documentation to demonstrate C14N usage. |
| src/c14n/mod.rs | Reworks/expands the public C14N API (mode + algorithm struct), adds parsing from URI, and exposes canonicalization entrypoints. |
| src/c14n/serialize.rs | Implements document-order canonical serialization (elements, attrs, text, comments, PI, doc-level separators). |
| src/c14n/ns_inclusive.rs | Implements inclusive namespace rendering with subset-aware xmlns="" handling. |
| src/c14n/ns_exclusive.rs | Implements exclusive namespace rendering based on visibly-utilized prefixes + inclusive prefix list. |
| src/c14n/prefix.rs | Implements lexical prefix extraction from source ranges and default-namespace-in-scope detection for subset correctness. |
| src/c14n/escape.rs | Adds canonical escaping routines for text/attrs and CR escaping for PI/comments. |
| Cargo.toml | Bumps roxmltree (enables positions) and x509-parser versions. |
| .gitignore | Adds ignores for arch/ and ROADMAP.md. |
… doc

- Replace .unwrap() with ? in lib.rs and c14n/mod.rs doctests to comply with deny(clippy::unwrap_used) on doctest targets
- Fix has_in_scope_default_namespace doc: says "nearest" not "any" ancestor, matching the actual implementation behavior
Valid suggestion from CodeRabbit review: the namespace-declaration pipeline in … This is a refactoring task, out of scope for this PR.

@coderabbitai please create an issue for: Extract shared namespace-declaration pipeline from ns_inclusive.rs and ns_exclusive.rs into a common helper function that accepts a prefix filter predicate (all-in-scope for inclusive, visibly-utilized for exclusive) to reduce code duplication and drift risk.
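
To make the proposed follow-up concrete, here is one possible shape for such a helper. Everything in it is hypothetical: the name `collect_ns_decls`, the slice-of-pairs input, and the simplified `xmlns=""` handling, which looks only at `parent_rendered` and leaves out the subset-aware ancestor check added earlier in this PR.

```rust
use std::collections::BTreeMap;

/// Collect the namespace declarations to render for one element.
/// `keep` is the mode-specific prefix filter: always true for inclusive,
/// "visibly utilized or in the PrefixList" for exclusive.
fn collect_ns_decls<F>(
    in_scope: &[(&str, &str)],
    parent_rendered: &BTreeMap<String, String>,
    keep: F,
) -> Vec<(String, String)>
where
    F: Fn(&str) -> bool,
{
    let mut decls = Vec::new();
    for &(prefix, uri) in in_scope {
        // The `xml` prefix is skipped; other prefixes go through the mode filter.
        if prefix == "xml" || !keep(prefix) {
            continue;
        }
        let inherited = parent_rendered.get(prefix).map(String::as_str);
        // Redundant: an output ancestor already rendered the same binding.
        if inherited == Some(uri) {
            continue;
        }
        // `xmlns=""` with no rendered default namespace has nothing to undeclare.
        if prefix.is_empty() && uri.is_empty() && inherited.is_none() {
            continue;
        }
        decls.push((prefix.to_string(), uri.to_string()));
    }
    // Sorting by prefix puts the default namespace ("") first.
    decls.sort();
    decls
}
```

Both renderers would then differ only in the closure they pass, for example `|_| true` for inclusive and a membership test against the utilized set for exclusive.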
Pull request overview
This PR introduces a full XML Canonicalization (C14N) implementation (inclusive 1.0 + exclusive 1.0) built on roxmltree, along with extensive reference-based integration tests and updated crate documentation to reflect the new canonicalization API.
Changes:
- Added C14N core implementation: canonical serialization, namespace rendering strategies (inclusive/exclusive), escaping, and lexical prefix recovery (via `roxmltree` positions).
- Added integration tests comparing outputs to `xmllint` reference behavior across namespaces, attribute sorting, comments/PI handling, CDATA flattening, and subset edge cases.
- Updated public docs/examples and bumped dependencies (`roxmltree` + positions feature, `x509-parser`).
Reviewed changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/c14n_integration.rs | Adds xmllint-reference integration tests for inclusive/exclusive C14N and subset edge cases. |
| src/lib.rs | Updates crate-level Quick Start example to demonstrate canonicalization API. |
| src/c14n/mod.rs | Defines the public C14N API (C14nAlgorithm, C14nMode, canonicalize, canonicalize_xml) and errors. |
| src/c14n/serialize.rs | Implements document-order canonical serialization and doc-level comment/PI separation rules. |
| src/c14n/ns_inclusive.rs | Implements inclusive namespace rendering and suppression rules. |
| src/c14n/ns_exclusive.rs | Implements exclusive namespace rendering based on visibly-utilized prefixes + PrefixList. |
| src/c14n/prefix.rs | Recovers lexical prefixes using roxmltree byte ranges; adds default-namespace-in-scope detection helper. |
| src/c14n/escape.rs | Adds canonical escaping routines for text, attributes, and CR handling in comments/PIs. |
| Cargo.toml | Bumps roxmltree and enables positions; bumps x509-parser. |
| .gitignore | Adds ignores for arch/ and ROADMAP.md. |
In exclusive C14N, an unprefixed element relies on the current default-namespace binding (including xmlns="" undeclaration). Without marking the default namespace as visibly utilized, xmlns="" would not be emitted, placing the element in the wrong namespace context.
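The rule described above can be sketched as a small function that collects the prefixes one element visibly utilizes, with `""` standing for the default namespace. The function name and the `Option<&str>` representation of prefixes are assumptions made for this illustration, not the PR's code.

```rust
use std::collections::BTreeSet;

/// Prefixes visibly utilized by a single element ("" = default namespace).
/// `element_prefix` is `None` for an unprefixed element name;
/// `attr_prefixes` holds the prefix of each attribute, `None` if unprefixed.
fn visibly_utilized(
    element_prefix: Option<&str>,
    attr_prefixes: &[Option<&str>],
) -> BTreeSet<String> {
    let mut used = BTreeSet::new();
    // An unprefixed element depends on the default-namespace binding,
    // so the default namespace counts as utilized.
    used.insert(element_prefix.unwrap_or("").to_string());
    for prefix in attr_prefixes {
        // Unprefixed attributes are in no namespace and utilize nothing.
        if let Some(p) = prefix {
            used.insert((*p).to_string());
        }
    }
    used
}
```

Because `""` lands in the set for an unprefixed element, a later rendering step can see that the default binding matters and emit `xmlns=""` when an ancestor rendered a non-empty default.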
Pull request overview
Adds a full Canonical XML (C14N) implementation (inclusive 1.0 + exclusive 1.0; 1.1 URI parsing but explicitly unsupported) and validates behavior against xmllint reference outputs, making the crate’s C14N functionality concrete and test-backed.
Changes:
- Implement canonical XML serialization, namespace rendering (inclusive/exclusive), escaping rules, and lexical prefix recovery via `roxmltree` positions.
- Add extensive integration tests comparing outputs to `xmllint` (including namespace/attr ordering, comments/PI rules, idempotency, and subset edge cases).
- Update crate docs/examples to demonstrate C14N usage; bump `roxmltree` and `x509-parser` versions.
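As an illustration of URI-based algorithm selection, including the case where a C14N 1.1 URI is recognized even though the algorithm is not executed, here is a minimal sketch. The `Mode` enum and `parse_c14n_uri` function are hypothetical and do not reflect the actual `C14nAlgorithm::from_uri()` signature; the URIs are the standard W3C algorithm identifiers.

```rust
/// Canonicalization family identified by a URI.
/// (Hypothetical type; the PR's own C14nMode may differ.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Inclusive,
    Inclusive11,
    Exclusive,
}

/// Map an algorithm URI to (mode, with_comments); `None` if unknown.
fn parse_c14n_uri(uri: &str) -> Option<(Mode, bool)> {
    match uri {
        "http://www.w3.org/TR/2001/REC-xml-c14n-20010315" => Some((Mode::Inclusive, false)),
        "http://www.w3.org/TR/2001/REC-xml-c14n-20010315#WithComments" => {
            Some((Mode::Inclusive, true))
        }
        "http://www.w3.org/2001/10/xml-exc-c14n#" => Some((Mode::Exclusive, false)),
        "http://www.w3.org/2001/10/xml-exc-c14n#WithComments" => Some((Mode::Exclusive, true)),
        // Recognized at parse time; a caller could still reject it at execution.
        "http://www.w3.org/2006/12/xml-c14n11" => Some((Mode::Inclusive11, false)),
        "http://www.w3.org/2006/12/xml-c14n11#WithComments" => Some((Mode::Inclusive11, true)),
        _ => None,
    }
}
```

Separating "URI is known" from "algorithm is implemented" lets a caller report an unsupported algorithm distinctly from an unrecognized one.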
Reviewed changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/c14n/mod.rs | Public C14N API (algorithm/mode), error type, and canonicalization entry points. |
| src/c14n/serialize.rs | Core document-order serialization and attribute/PI/comment handling. |
| src/c14n/ns_inclusive.rs | Inclusive namespace rendering strategy. |
| src/c14n/ns_exclusive.rs | Exclusive namespace rendering strategy (+ PrefixList support). |
| src/c14n/prefix.rs | Lexical prefix extraction using roxmltree byte ranges (positions). |
| src/c14n/escape.rs | Canonical escaping for text, attributes, and CR handling. |
| tests/c14n_integration.rs | xmllint-based golden integration tests covering key C14N behaviors. |
| src/lib.rs | Updates crate-level Quick Start example to canonicalization. |
| Cargo.toml | Enables roxmltree positions feature; bumps x509-parser. |
| .gitignore | Ignores arch/ and ROADMAP.md. |
Summary
- … `InclusiveNamespaces PrefixList`
- … `canonicalize()`, `canonicalize_xml()`, `C14nAlgorithm::from_uri()`
- … `positions` feature), x509-parser 0.16→0.18

Design Decisions
- Lexical prefix extraction via byte positions: roxmltree discards lexical prefixes during DOM construction (it stores only namespace URI + local name). We extract prefixes from the source XML using `Node::range()` and `Attribute::range_qname()` (roxmltree `positions` feature). This correctly handles namespace prefix aliasing (multiple prefixes bound to the same URI), which `lookup_prefix()` cannot.
- `HashMap::clone()` per element: Namespace rendering state is cloned at each element. Acceptable for typical XML depths (<20 levels). Optimization to `&mut` with backtrack is deferred to the performance phase.

Known Limitations
- … `UnsupportedAlgorithm` at execution; xml:id/xml:base fixup not yet implemented

Test Plan
- `cargo test`: 50 tests pass (33 unit + 15 integration + 2 doc)
- `cargo clippy --all-targets --all-features -- -D warnings`: 0 warnings
- … `xmllint --c14n` / `xmllint --exc-c14n` reference outputs
- … `C14N(C14N(x)) == C14N(x)` for both modes

Closes #5
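The lexical prefix extraction described under Design Decisions can be sketched without any XML library: given the source text and the byte range of a qualified name, the prefix is whatever precedes the first colon. The function `lexical_prefix` is a hypothetical helper, and it assumes the caller has already obtained the qname's byte range (for example from position information exposed by the parser).

```rust
use std::ops::Range;

/// Return the lexical prefix of the qualified name found at `qname`
/// in `source`, or `None` if the name is unprefixed or the range is invalid.
fn lexical_prefix(source: &str, qname: Range<usize>) -> Option<&str> {
    // `get` returns None for out-of-bounds or non-char-boundary ranges.
    let name = source.get(qname)?;
    name.split_once(':').map(|(prefix, _local)| prefix)
}
```

Reading the prefix from the source bytes is what makes aliasing work: two prefixes bound to the same URI stay distinguishable because each name is inspected as written, not resolved through the namespace URI.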
Summary by CodeRabbit
New Features
Tests
Chores