feat(xmldsig): add URI dereference for Reference elements#9
Conversation
- UriReferenceResolver with configurable ID attribute names (ID/Id/id + custom)
- Empty URI → entire document without comments (XMLDSig §4.3.3.2)
- Bare-name #id → element subtree by ID attribute
- #xpointer(/) → document root with comments
- #xpointer(id('...')) → element subtree (XPointer form)
- NodeSet type with subtree inclusion/exclusion for transform pipeline
- TransformData enum (NodeSet/Binary) for transform chain data flow
Closes #8
📝 WalkthroughWalkthroughAdds same-document URI dereferencing for XMLDSig: new public modules Changes
Sequence DiagramsequenceDiagram
participant Caller as Caller
participant Resolver as UriReferenceResolver
participant Doc as Document
participant NodeSet as NodeSet
participant Transform as TransformData
Caller->>Resolver: dereference(uri)
Resolver->>Doc: build ID index / inspect fragment
alt Empty URI
Doc-->>Resolver: entire document (no comments)
else XPointer `/`
Doc-->>Resolver: entire document (with comments)
else Fragment `#id` or xpointer(id(...))
Doc-->>Resolver: element subtree for id
else External/Unsupported
Resolver-->>Caller: TransformError::UnsupportedUri
end
Resolver->>NodeSet: construct NodeSet (included/excluded, with_comments)
NodeSet->>Transform: wrap as TransformData::NodeSet
Transform-->>Caller: Ok(TransformData)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
📝 Coding Plan
Comment |
There was a problem hiding this comment.
Pull request overview
Adds same-document URI dereferencing for XMLDSig <Reference> processing, introducing core transform-pipeline types and end-to-end integration tests validating URI → NodeSet → C14N behavior (notably for SAML-style documents).
Changes:
- Introduce
UriReferenceResolverfor"",#id, and selected#xpointer(...)same-document reference forms. - Add
NodeSet,TransformData, andTransformErroras foundational transform pipeline types. - Add integration tests exercising dereference + canonicalization outcomes (including comment-handling differences).
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
src/xmldsig/uri.rs |
Implements same-document URI dereference logic and ID indexing for fragment lookups. |
src/xmldsig/types.rs |
Introduces NodeSet/TransformData/TransformError for transform pipeline data flow. |
src/xmldsig/mod.rs |
Wires up and re-exports new XMLDSig modules/types. |
tests/uri_integration.rs |
Adds integration coverage for URI deref → NodeSet → inclusive C14N canonicalization. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
- Return UnsupportedUri for unrecognized XPointer expressions instead of falling through to ID lookup (misclassified errors) - Replace recursive collect_subtree_ids with iterative stack traversal to prevent stack overflow on deeply nested attacker-controlled XML - Document all supported URI forms in dereference() doc comment - Add unit test for unsupported XPointer expressions
There was a problem hiding this comment.
Pull request overview
Adds same-document URI dereferencing for XMLDSig <Reference> elements and introduces transform-pipeline types to carry node sets/binary data through canonicalization and later transforms, with unit + integration coverage validating URI → NodeSet → C14N behavior.
Changes:
- Added
UriReferenceResolverfor"",#id,#xpointer(/), and#xpointer(id('...'))same-document references. - Introduced
NodeSet,TransformData, andTransformErrorfor the XMLDSig transform pipeline. - Added integration tests covering dereference + canonicalization end-to-end (including comments and subtree exclusion).
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
src/xmldsig/uri.rs |
Implements same-document URI dereference with ID indexing and XPointer support. |
src/xmldsig/types.rs |
Adds NodeSet/TransformData/TransformError used by dereference and future transforms. |
src/xmldsig/mod.rs |
Exposes new XMLDSig modules/types and documents current status. |
tests/uri_integration.rs |
Integration tests for URI dereference → NodeSet predicate → C14N output. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/xmldsig/uri.rs (1)
59-73:with_id_attrs()reads like replacement, but it always appends.Because
ID/Id/idare always retained, callers cannot use this API to narrow the allowed names, and the current name is easy to misread at call sites. Consider renaming it to something likewith_additional_id_attrs()or adding a separate exact-list constructor.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/xmldsig/uri.rs` around lines 59 - 73, The method with_id_attrs(doc, extra_attrs) currently always appends DEFAULT_ID_ATTRS (ID/Id/id) which prevents callers from specifying an exact set; change the API to make intent explicit: rename with_id_attrs to with_additional_id_attrs (preserving current behavior of merging DEFAULT_ID_ATTRS with extra_attrs, deduping via attr_names and populating id_map) and add a new constructor with_exact_id_attrs(doc, attrs) that uses only the provided attrs (no defaults) and builds id_map from those names; update the doc comments on both functions to describe behavior and update all call sites to use the appropriately named function (or with_exact_id_attrs when callers expect replacement rather than append).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/xmldsig/types.rs`:
- Around line 124-148: The contains() and exclude_subtree() methods must
validate that the provided Node belongs to the NodeSet's owning document: before
any logic, compare the node's owning document to the NodeSet's stored document
(the NodeSet document field) and if they differ, contains() should return false
and exclude_subtree() should be a no-op. Update contains() (referencing NodeSet,
contains(), with_comments, included, excluded) to early-return false for foreign
nodes, and update exclude_subtree() to skip calling collect_subtree_ids(node,
&mut self.excluded) when the node is from a different document. Ensure you use
the Node API method that exposes its owning document (e.g.,
node.document()/node.owner_doc()) for the comparison.
In `@src/xmldsig/uri.rs`:
- Around line 121-137: In dereference_fragment, reject an empty fragment ("#")
before falling back to the bare-name path: detect if fragment.is_empty() (or
fragment == "") and return
Err(TransformError::UnsupportedUri(format!("#{fragment}"))) rather than calling
resolve_id; update dereference_fragment (which currently calls parse_xpointer_id
and resolve_id) to check the empty-fragment case immediately after handling
xpointer(/) and parse_xpointer_id so "#" yields UnsupportedUri instead of
delegating to resolve_id.
---
Nitpick comments:
In `@src/xmldsig/uri.rs`:
- Around line 59-73: The method with_id_attrs(doc, extra_attrs) currently always
appends DEFAULT_ID_ATTRS (ID/Id/id) which prevents callers from specifying an
exact set; change the API to make intent explicit: rename with_id_attrs to
with_additional_id_attrs (preserving current behavior of merging
DEFAULT_ID_ATTRS with extra_attrs, deduping via attr_names and populating
id_map) and add a new constructor with_exact_id_attrs(doc, attrs) that uses only
the provided attrs (no defaults) and builds id_map from those names; update the
doc comments on both functions to describe behavior and update all call sites to
use the appropriately named function (or with_exact_id_attrs when callers expect
replacement rather than append).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8335d5b0-0cf0-4e28-bbf8-e094037ea11f
📒 Files selected for processing (4)
src/xmldsig/mod.rssrc/xmldsig/types.rssrc/xmldsig/uri.rstests/uri_integration.rs
… fragment
- Remove duplicate IDs from index to prevent signature-wrapping attacks
(ambiguous IDs fail with ElementNotFound instead of picking arbitrarily)
- Guard NodeSet.contains() and exclude_subtree() against cross-document
nodes via pointer comparison (prevents NodeId collisions)
- Reject bare "#" (empty fragment) as UnsupportedUri
- Add xpointer(id('...')) to module-level doc comment
- Clarify with_id_attrs() doc: extra_attrs appends, does not replace
- Add code comment explaining intentional lack of percent-decoding
There was a problem hiding this comment.
Pull request overview
Adds same-document URI dereferencing for XMLDSig <Reference> processing and introduces NodeSet / TransformData to feed the transform + C14N pipeline, with end-to-end integration tests validating canonical output.
Changes:
- Implement
UriReferenceResolverfor"",#id,#xpointer(/), and#xpointer(id('...'))same-document references. - Add core transform pipeline types:
TransformData,NodeSet, andTransformError. - Add integration tests that dereference URIs into
NodeSetand canonicalize via C14N to validate expected output.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/xmldsig/uri.rs |
New URI resolver that builds an ID index and returns TransformData for supported same-document URI forms. |
src/xmldsig/types.rs |
New transform pipeline data types (TransformData, NodeSet, TransformError) used by URI dereference and canonicalization predicates. |
src/xmldsig/mod.rs |
Exposes the new types and uri modules and re-exports the core types. |
tests/uri_integration.rs |
New end-to-end tests covering URI dereference → NodeSet → C14N canonicalization for multiple URI forms and SAML-like cases. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
- Track duplicate IDs in a HashSet so they are permanently excluded
from the index (fixes bypass where 3rd occurrence re-inserts via
Entry::Vacant after 2nd occurrence removed the entry)
- Add xpointer(/) and xpointer(id('...')) to uri.rs module-level docs
- Add test for triple-duplicate ID rejection
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/xmldsig/uri.rs (1)
191-204: Consider validating XPointer id() syntax more strictly (optional).The current parsing relies on matching start/end quote characters. An ID value containing a quote (e.g., hypothetical
xpointer(id('a'b'))) would parse incorrectly. For SAML/XMLDSig where IDs are NCName (no quotes allowed), this is fine. If broader XPointer compatibility is needed later, consider validating that no unescaped quotes appear in the extracted value.This is a low-priority hardening suggestion given the stated SAML focus.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/xmldsig/uri.rs` around lines 191 - 204, parse_xpointer_id currently only checks matching surrounding quotes and will accept values that contain unescaped quotes inside; update parse_xpointer_id to after extracting inner and determining the quote char, reject the value if the inner content (the returned &str slice for the id) contains the same quote character or any stray quote characters; alternatively (or additionally) validate the final id as an NCName (e.g. via a helper is_ncname function) before returning Some(...), otherwise return None. Ensure you reference parse_xpointer_id and perform the check on the &inner[1..inner.len()-1] slice.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/xmldsig/uri.rs`:
- Around line 191-204: parse_xpointer_id currently only checks matching
surrounding quotes and will accept values that contain unescaped quotes inside;
update parse_xpointer_id to after extracting inner and determining the quote
char, reject the value if the inner content (the returned &str slice for the id)
contains the same quote character or any stray quote characters; alternatively
(or additionally) validate the final id as an NCName (e.g. via a helper
is_ncname function) before returning Some(...), otherwise return None. Ensure
you reference parse_xpointer_id and perform the check on the
&inner[1..inner.len()-1] slice.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fda3cb91-c1ad-43c8-9e5f-c94671c2029d
📒 Files selected for processing (3)
src/xmldsig/mod.rssrc/xmldsig/types.rssrc/xmldsig/uri.rs
There was a problem hiding this comment.
Pull request overview
Adds same-document URI dereferencing for XMLDSig <Reference> processing, enabling the transform pipeline to select the correct XML node-set for canonicalization/digest computation (notably for SAML signature verification).
Changes:
- Introduces
UriReferenceResolversupporting"",#id,#xpointer(/), and#xpointer(id('...'))dereferencing. - Adds transform pipeline core types (
TransformData,NodeSet,TransformError) to represent node-sets vs binary data and support subtree exclusion. - Adds integration tests validating URI dereference →
NodeSetfiltering → C14N output end-to-end.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/xmldsig/uri.rs |
Implements same-document URI dereferencing + ID indexing and XPointer parsing. |
src/xmldsig/types.rs |
Adds TransformData, NodeSet, and TransformError used by dereference/transforms. |
src/xmldsig/mod.rs |
Documents current XMLDSig support and exposes the new modules/types. |
tests/uri_integration.rs |
Integration tests for dereference + C14N canonicalization behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
- Use strip_prefix/strip_suffix instead of panicking slice in
parse_xpointer_id (prevents panic on malformed input like `id(')`)
- Only mark ID as duplicate when mapped to a *different* element
(same element with both ID="x" and Id="x" is not ambiguous)
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
- Reject xpointer(id('')) as UnsupportedUri (empty string is not a
valid XML Name, consistent with bare '#' rejection)
- Document roxmltree local-name attribute matching in with_id_attrs()
- Clarify element-only NodeId tracking in collect_subtree_ids
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/xmldsig/types.rs (1)
199-230:UriDerefvariant is unused and can be considered for removal.The
UriDeref(String)variant at line 212–213 is not used anywhere in the codebase. Thedereferenceimplementation inuri.rsusesUnsupportedUriandElementNotFoundinstead. SinceTransformErroris publicly exported, ifUriDerefis not intended for future API expansion, it should be removed to keep the error type focused on actually used cases.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/xmldsig/types.rs` around lines 199 - 230, Remove the unused UriDeref variant from the public TransformError enum: delete the UriDeref(String) arm from TransformError and its #[error(...)] attribute, then update any pattern matches or constructions referencing TransformError::UriDeref across the codebase (if any) to use the existing UnsupportedUri or ElementNotFound variants (or map to a suitable variant) so compilation still succeeds; finally run tests/build to ensure no remaining references to UriDeref remain and the public API still compiles.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/xmldsig/types.rs`:
- Around line 199-230: Remove the unused UriDeref variant from the public
TransformError enum: delete the UriDeref(String) arm from TransformError and its
#[error(...)] attribute, then update any pattern matches or constructions
referencing TransformError::UriDeref across the codebase (if any) to use the
existing UnsupportedUri or ElementNotFound variants (or map to a suitable
variant) so compilation still succeeds; finally run tests/build to ensure no
remaining references to UriDeref remain and the public API still compiles.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9f25d1e3-a9d8-4e6c-88b4-52d7f1c2fb1c
📒 Files selected for processing (2)
src/xmldsig/types.rssrc/xmldsig/uri.rs
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/xmldsig/types.rs`:
- Around line 114-123: The NodeSet::subtree constructor accepts a doc and an
element from possibly different Documents; enforce the same-document invariant
by checking that the element belongs to the provided doc before collecting ids
(e.g., assert or return an error if element.document() or
element.owner_document() != doc), then call collect_subtree_ids(element, &mut
ids) and populate included as before; update the NodeSet::subtree function to
perform this check (referencing NodeSet::subtree, the element parameter, doc,
and collect_subtree_ids) to prevent mixing IDs from a foreign document.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
…lidate test helpers - Remove redundant doc parameter from NodeSet::subtree(), derive from element.document() to enforce same-document invariant at construction - Correct collect_subtree_ids inline note: tracks all children() nodes (elements, text, comments, PIs), not only elements - Consolidate deref_and_canonicalize test helpers into shared impl
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
Summary
UriReferenceResolverwith configurable ID attribute names (ID,Id,id+ custom)NodeSetandTransformDatatypes for the transform pipelineURI Forms Supported
""(empty)#foo#xpointer(/)#xpointer(id('foo'))Technical Details
HashMap<&str, Node>index built on construction for O(1) fragment lookupsID,Id,idattributes by default (covers SAML, XMLDSig, general XML)with_id_attrs()constructor adds extra attribute names beyond the defaultsNodeSettracks included/excluded nodes with subtree exclusion (for enveloped signature transform)NodeSet.contains()validates document ownership via pointer comparison (rejects cross-document nodes)ElementNotFoundto prevent signature-wrapping attacks); same element exposing the same value via multiple scanned attrs (e.g.,ID="x"+Id="x") is not treated as duplicate#(empty fragment) rejected asUnsupportedUriparse_xpointer_id()uses safestrip_prefix/strip_suffixto avoid panics on malformed inputTest Plan
#idsubtree with namespace inheritance from excluded ancestors#xpointer(/)vs""comment behavior differenceUriReferenceResolver-D warnings)cargo test --workspace: 92 tests all passCloses #8
Summary by CodeRabbit
New Features
Documentation
Tests