Fix keywords not recognized as NCNames in QNames #142

Merged
faassen merged 1 commit into Paligo:main from yrashk:fix/keyword-ncname-in-qnames
Mar 20, 2026
@yrashk
Contributor

@yrashk yrashk commented Mar 19, 2026

Problem

XPath expressions using qualified names where the local name is a keyword fail to parse. For example, given this XML:

<data xmlns:ex="http://example.com/ex">
  <ex:child>found child</ex:child>
  <ex:parent>found parent</ex:parent>
  <ex:self>found self</ex:self>
  <ex:descendant>found descendant</ex:descendant>
  <ex:ancestor>found ancestor</ex:ancestor>
  <ex:or>found or</ex:or>
  <ex:and>found and</ex:and>
</data>

The XPath //ex:child/string() fails because the lexer doesn't combine NCName("ex") + Colon + Child into a single PrefixedQName token. This affects all axis names (child, parent, self, ancestor, descendant, following, preceding, namespace, and their compound forms) as well as other keywords (and, or, div, mod, for, let, etc.) when used as the local name in a prefixed QName.

Unprefixed usage (e.g. //child as an element name) is not affected — the parser-level parser_keyword() already handles that correctly. A test is included to confirm this.

Validation with Saxon HE 12.9

All expressions return the expected results:

=== Prefixed keyword element names ===
//ex:child/string() => found child
//ex:parent/string() => found parent
//ex:self/string() => found self
//ex:descendant/string() => found descendant
//ex:ancestor/string() => found ancestor
//ex:or/string() => found or
//ex:and/string() => found and

=== Unprefixed keyword element name ===
//child/string() => found child

Root cause

Token::ncname() in xee-xpath-lexer/src/reserved.rs only listed the reserved function names from XPath 3.1 spec section A.3 (like map, array, function, if, etc.) but omitted axis names and other keywords. The ExplicitWhitespace iterator calls this method to decide whether a token after a : can form part of a PrefixedQName. Since e.g. Token::Child.ncname() returned None, the combination failed.

Per the XPath 3.1 spec:

"Keywords in XPath 3.1 use lower-case characters and are not reserved—that is, names in XPath 3.1 expressions are allowed to be the same as language keywords, except for certain unprefixed function-names listed in A.3 Reserved Function Names."

Fix

Add all keyword tokens to Token::ncname() to match what the parser-level parser_keyword() already handles. This makes the lexer correctly combine e.g. NCName("ex") + Colon + Child into PrefixedQName { prefix: "ex", local_name: "child" }.
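
The effect of the change can be illustrated with a minimal, self-contained sketch. This is a simplified model, not the actual xee-xpath-lexer code: the Token enum below has only a handful of variants, and combine() is a hypothetical stand-in for the lookahead that the ExplicitWhitespace iterator performs. Only the names Token, ncname(), NCName, Colon, Child and PrefixedQName come from the PR description; everything else is assumed for illustration.

```rust
// Simplified, hypothetical model of the lexer tokens. The real `Token` enum
// in xee-xpath-lexer has many more variants.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    NCName(String),
    Colon,
    // Keywords that the lexer emits as dedicated token kinds.
    Child,
    Parent,
    Or,
    And,
    // Punctuation that can never be part of a name.
    Slash,
    PrefixedQName { prefix: String, local_name: String },
}

impl Token {
    /// Returns the token's text if the token is usable as an NCName.
    fn ncname(&self) -> Option<&str> {
        match self {
            Token::NCName(name) => Some(name.as_str()),
            // Keyword arms of the kind this PR adds. Without them,
            // `Token::Child.ncname()` falls through to `None`.
            Token::Child => Some("child"),
            Token::Parent => Some("parent"),
            Token::Or => Some("or"),
            Token::And => Some("and"),
            _ => None,
        }
    }
}

/// Hypothetical helper: merges `prefix : local` into one `PrefixedQName`
/// token, or returns `None` if either side is not usable as an NCName.
fn combine(prefix: &Token, sep: &Token, local: &Token) -> Option<Token> {
    if *sep != Token::Colon {
        return None;
    }
    Some(Token::PrefixedQName {
        prefix: prefix.ncname()?.to_string(),
        local_name: local.ncname()?.to_string(),
    })
}
```

In this model, combine(&Token::NCName("ex".to_string()), &Token::Colon, &Token::Child) yields a PrefixedQName with prefix "ex" and local name "child". Removing the keyword arms from ncname() makes the same call return None, which mirrors the failure described under "Root cause".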

Tests

Lexer-level (xee-xpath-lexer/src/explicit_whitespace.rs):

  • test_prefixed_qname_with_axis_name_local_names — all 13 axis names as local names
  • test_prefixed_qname_with_keyword_local_names — 23 other keywords as local names
  • test_prefixed_qname_with_keyword_prefix — keyword as the prefix part

End-to-end (xee-xpath/tests/xpath.rs):

  • test_keyword_element_name_unprefixed — confirms //child works without namespace (not affected)
  • test_keyword_qname_{child,parent,self,descendant,ancestor,or,and} — full XPath evaluation against namespaced XML
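
As a rough illustration of how the axis-name cases can be enumerated, the sketch below builds one query per axis name in the same shape as the expressions validated above. It is not the test code from this PR: AXIS_NAMES and prefixed_axis_queries() are hypothetical names, and the list is simply the 13 axes defined by XPath 3.1.

```rust
/// The 13 XPath 3.1 axis names. Each one is also a valid NCName.
const AXIS_NAMES: [&str; 13] = [
    "ancestor",
    "ancestor-or-self",
    "attribute",
    "child",
    "descendant",
    "descendant-or-self",
    "following",
    "following-sibling",
    "namespace",
    "parent",
    "preceding",
    "preceding-sibling",
    "self",
];

/// Hypothetical helper: builds one query per axis name, using the axis name
/// as the local part of a prefixed QName, e.g. `//ex:child/string()`.
fn prefixed_axis_queries(prefix: &str) -> Vec<String> {
    AXIS_NAMES
        .iter()
        .map(|axis| format!("//{prefix}:{axis}/string()"))
        .collect()
}
```

Each generated string could then be fed to the lexer or evaluated against a namespaced document, under the assumption that the document contains an element for every axis name.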

Token::ncname() only included reserved function names (XPath 3.1 A.3)
but omitted axis names and other keywords. This caused the lexer to fail
combining tokens like `ex:child` into a PrefixedQName, since
Token::Child.ncname() returned None.

Per the spec: "Keywords in XPath 3.1 [...] are not reserved—that is,
names in XPath 3.1 expressions are allowed to be the same as language
keywords."

Add all keyword tokens to Token::ncname() to match what the parser-level
parser_keyword() already handles.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@faassen
Collaborator

faassen commented Mar 20, 2026

Thank you! I was curious whether any of the XPath tests caught this one, but there were no tests of that nature!

@faassen faassen merged commit 0293025 into Paligo:main Mar 20, 2026
1 check passed
@github-actions github-actions bot mentioned this pull request Mar 20, 2026
@yrashk
Contributor Author

yrashk commented Mar 20, 2026

Thank you for merging this so quickly. What's your bugfix release schedule policy – when do you think this can hit crates.io?

🙏
