Skip to content

fix(utils): include subpath in resolveCustomerSecretsName to prevent credential collisions#1577

Open
alinarublea wants to merge 5 commits intomainfrom
fix/subpath-customer-secrets-name
Open

fix(utils): include subpath in resolveCustomerSecretsName to prevent credential collisions#1577
alinarublea wants to merge 5 commits intomainfrom
fix/subpath-customer-secrets-name

Conversation

@alinarublea
Copy link
Copy Markdown
Contributor

Summary

  • Subpath sites on the same domain (e.g. nba.com/kings and nba.com/lakers) were colliding on a single Secrets Manager key because resolveCustomerSecretsName used only the hostname
  • Now includes the URL path in the secret key, so each subpath site gets its own isolated secret
  • Backward compatible: root-domain sites (no path) produce the same key as before

Fixes LLMO-4186

Test plan

  • New unit tests verify subpath sites produce distinct secret names
  • Existing tests confirm root-domain behavior is unchanged
  • 100% line/branch coverage maintained

🤖 Generated with Claude Code

…credential collisions

Subpath sites on the same domain (e.g. nba.com/kings, nba.com/lakers) were
sharing a single Secrets Manager key because only the hostname was used.
Now the full path is included so each subpath site gets its own secret key.
Backward compatible: root-domain sites produce the same key as before.

Fixes LLMO-4186

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor Author

@alinarublea alinarublea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review

The fix is correct and backward compatible. Two small observations:

url.host vs url.hostname
The implementation uses url.host (which includes the port, e.g. nba.com:8080) rather than url.hostname. This matches the pre-existing behavior so it is not a regression, but generateDataFolder in spacecat-api-service (the companion fix for LLMO-4186) uses url.hostname. Worth aligning the two for consistency, or at minimum documenting the difference.

Missing test case for nested paths
https://nba.com/us/kingsnba_com_us_kings is the expected behavior (correct), but there is no test covering nested subpaths. A single extra assertion in the new test would document this and guard against future regressions.

Neither is a blocker — the core fix is solid.

- Switch from url.host to url.hostname for consistency with generateDataFolder
- Add test for nested subpaths (e.g. /us/kings)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 4, 2026

This PR will trigger a patch release when merged.

Copy link
Copy Markdown
Contributor

@dzehnder dzehnder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @alinarublea,

Thanks for the focused fix and the upfront self-review. The core change is correct and the tests cover the headline scenarios. The cluster of concerns below is mostly about what happens to credentials that already exist under the old key shape, plus a few normalization edges that the new path-based key does not fully close.

Strengths

  • The fix correctly closes a real cross-tenant credential collision: distinct subpath customers on the same hostname were sharing one Secrets Manager key (packages/spacecat-shared-utils/src/helpers.js:54).
  • Backward compatibility for root-domain sites is preserved by the pathSuffix ? ... : host ternary; the new URL('https://nba.com/').pathname === '/' case is correctly handled by the ^\/|\/$ strip.
  • Switching from url.host to url.hostname was the right correctness move - a port should not be part of an identity key.
  • The new tests in packages/spacecat-shared-utils/test/helpers.test.js:93-117 cover the primary collision scenario, root trailing-slash idempotence, and nested subpaths. The pre-existing 'resolves the customer secrets name correctly with valid inputs' test (line 65) implicitly anchors the "no path - same as before" claim.
  • Both points raised in the self-review (host vs hostname, nested-path test) were addressed before review was requested. Clean change discipline.

Issues

Important (Should Fix)

  1. Migration impact for existing subpath customers is unaddressed (packages/spacecat-shared-utils/src/helpers.js:54)

    The PR description says "Backward compatible," but this is only true for sites with no path. Any customer whose baseURL already carries a path had its credentials written under customer-secrets/<host>/<ver>. After this PR ships, the same baseURL resolves to customer-secrets/<host>_<path>/<ver>, a key that does not exist in Secrets Manager. The downstream consumers (spacecat-shared-google-client, spacecat-shared-ims-client, spacecat-shared-content-client, plus spacecat-auth-service Google/SharePoint handlers) will start failing those lookups. content-client swallows the lookup error at log.debug and silently degrades; IMS auth raises and breaks page authentication.

    Two paths forward, pick one:

    • Confirm in the PR description (and ideally in LLMO-4186) that no production Site has a non-empty path in baseURL today, so the "backward compatible" claim is in practice "backward compatible for all current sites."
    • If subpath sites do exist, coordinate a one-time secret rename in Secrets Manager before deploy, or add a transitional dual-read in callers (try new key, fall back to legacy hostname-only key, log a warning) for one release cycle.
  2. Path-segment separator collision narrows but does not close the original bug class (packages/spacecat-shared-utils/src/helpers.js:53)

    Both / and any non-alphanumeric character collapse to _. As a result:

    • nba.com/us/kings and nba.com/us-kings both resolve to nba_com_us_kings.
    • nba.com/us/kings and nba.com/us_kings also collide.
    • nba.com/foo bar and nba.com/foo%20bar collide on _-runs.

    Two legitimately distinct customers landing on these forms reproduces the exact bug class this PR set out to fix. Not currently exploitable as cross-customer auth bypass because baseURL is server-stored from site.getBaseURL(), but it leaves the door open for the next onboarding incident. Either pick a separator that cannot appear in the sanitized output (e.g. join host and path-segments with a delimiter that never collapses), or hash the pathname (e.g. first 12 hex of sha256) and append. Both are small and bounded to this function.

  3. JSDoc does not document the new behavior (packages/spacecat-shared-utils/src/helpers.js:41-46)

    The doc was thin before and is now actively misleading - readers will not know that the path component participates in the key. At minimum: "The hostname and (if present) URL path are normalized (non-alphanumeric replaced with _, lowercased) and combined with _. Sites with the same hostname but different paths resolve to distinct secrets." A worked example of the format (/helix-deploy/spacecat-services/customer-secrets/<host>[_<path>]/<version>) would also help the next person debugging "why is this customer's secret missing." Consider adding a @since note recording the format change so the breadcrumb is discoverable.

Minor (Nice to Have)

  1. Repeated-slash path normalization (packages/spacecat-shared-utils/src/helpers.js:53)

    The strip-slash regex /^\/|\/$/g is a single-character alternation - it strips at most one leading and one trailing slash. nba.com//kings (sometimes produced by URL builders) yields _kings, different from nba.com/kings -> kings. Same risk class as the original bug, less likely to be hit in practice. Either collapse with ^\/+|\/+$, or normalize the path uniformly with pathname.split('/').filter(Boolean).join('_').replace(/[^a-zA-Z0-9_]/g, '_').toLowerCase() (which also handles dot-segments).

  2. Missing subpath trailing-slash idempotence test (packages/spacecat-shared-utils/test/helpers.test.js)

    The trailing-slash idempotence test only covers root URLs. The same property should hold for subpaths and is not asserted:

    expect(resolveCustomerSecretsName('https://nba.com/kings', ctx))
      .to.equal(resolveCustomerSecretsName('https://nba.com/kings/', ctx));

    A future refactor of the path regex could silently break this.

  3. Path case-folding is undocumented (packages/spacecat-shared-utils/src/helpers.js:53)

    nba.com/Kings and nba.com/kings collapse to the same key because of .toLowerCase() on the path. RFC 3986 says path segments are case-sensitive, so this is a deliberate normalization choice. A one-line comment on the .toLowerCase() (or a test) makes the intent visible to a future contributor who might "fix" it.

Recommendations

  • Consider extracting the host+path-to-key transformation into a small named helper (urlToCustomerSlug or similar) with focused unit tests covering the normalization edges (encoded chars, repeated slashes, query/fragment, mixed case). This is load-bearing logic; isolating it pays for itself the next time someone needs to extend it.
  • Add a length guard. AWS Secrets Manager caps secret names at 512 chars; the basePath consumes ~46. Deep subpaths could plausibly approach that ceiling. A length check would surface this earlier than a Secrets Manager API rejection at runtime.
  • Forward-looking: the canonical place to enforce the customer-identity contract is the Site model. A future site.getCustomerSecretId() would shrink the surface area where the convention can drift across the four downstream consumers.

Out of scope, worth tracking

  • The companion generateDataFolder fix in spacecat-api-service is correctly excluded here, but worth tracking that the two functions encode the same identity concept using compatible conventions (this PR uses _, generateDataFolder reportedly uses -). A drift between the two reproduces the original bug class on the data-folder side.
  • Downstream packages (spacecat-shared-google-client, spacecat-shared-ims-client, spacecat-shared-content-client, plus spacecat-auth-service handlers) will need a coordinated bump of the @adobe/spacecat-shared-utils dependency so a subpath site's auth flow does not write to the new key while a different service still reads the old one.

Assessment

Ready to merge? With fixes.

Reasoning: the bug fix is correct, minimal, and well-tested for the targeted regression. The main blocker is the migration story - the PR description should explicitly confirm whether any in-production Site already has a subpath baseURL, since the answer determines whether this is a clean ship or a coordinated cutover. The JSDoc update and the separator-collision narrowing are small and worth landing in this PR.

Next Steps

  1. Resolve the migration question first - either confirm "no subpath sites in prod today" in the PR description / Jira, or put a coordinated cutover plan in place.
  2. Tighten the separator collision (Important item 2) and update the JSDoc (Important item 3).
  3. The three Minor items can land here or in a small follow-up.

const host = url.hostname.replace(/[^a-zA-Z0-9]/g, '_').toLowerCase();
const pathSuffix = url.pathname.replace(/^\/|\/$/g, '').replace(/[^a-zA-Z0-9]/g, '_').toLowerCase();
customer = pathSuffix ? `${host}_${pathSuffix}` : host;
} catch {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important - Migration impact for existing subpath customers

The PR description says "Backward compatible," but that holds only for sites with no path. Any customer whose baseURL already carries a path had its credentials written under customer-secrets/<host>/<ver>. After this ships, the same URL resolves to customer-secrets/<host>_<path>/<ver> - a key that does not exist in Secrets Manager. Downstream consumers (spacecat-shared-google-client, spacecat-shared-ims-client, spacecat-shared-content-client, plus spacecat-auth-service Google/SharePoint handlers) will start failing those lookups; content-client swallows the error at log.debug and silently degrades, while IMS auth raises and breaks page authentication.

Pick one:

  • Confirm in the PR description (and LLMO-4186) that no production Site has a non-empty path in baseURL today, so the "backward compatible" claim is in practice "backward compatible for all current sites."
  • If subpath sites already exist, coordinate a one-time secret rename in Secrets Manager before deploy, or add a transitional dual-read in callers (try new key, fall back to legacy hostname-only key, log a warning) for one release cycle.

const url = new URL(baseURL);
const host = url.hostname.replace(/[^a-zA-Z0-9]/g, '_').toLowerCase();
const pathSuffix = url.pathname.replace(/^\/|\/$/g, '').replace(/[^a-zA-Z0-9]/g, '_').toLowerCase();
customer = pathSuffix ? `${host}_${pathSuffix}` : host;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important - Separator collision narrows but does not close the original bug class

Both / and any non-alphanumeric character collapse to _. As a result:

  • nba.com/us/kings and nba.com/us-kings both resolve to nba_com_us_kings
  • nba.com/us/kings and nba.com/us_kings also collide
  • nba.com/foo bar and nba.com/foo%20bar collide on _-runs

Two legitimately distinct customers landing on these forms reproduces the exact bug class this PR set out to fix. Not currently exploitable as cross-customer auth bypass because baseURL is server-stored from site.getBaseURL(), but it leaves the door open for the next onboarding incident. Either pick a separator that cannot appear in the sanitized output, or hash the pathname (e.g. first 12 hex of sha256) and append. Both are small and bounded to this function.

Minor (same line) - Repeated-slash normalization

The strip-slash regex /^\/|\/$/g strips at most one leading and one trailing slash. nba.com//kings yields _kings, different from nba.com/kings -> kings. Either collapse with ^\/+|\/+$, or normalize uniformly with pathname.split('/').filter(Boolean).join('_') (which also handles dot-segments).

Minor (same line) - Path case-folding is undocumented

nba.com/Kings and nba.com/kings collapse to the same key because of .toLowerCase() on the path. RFC 3986 says path segments are case-sensitive, so this is a deliberate choice. A one-line comment on the .toLowerCase() (or a test) makes the intent visible to a future contributor who might "fix" it.

@@ -47,7 +47,10 @@ export function resolveCustomerSecretsName(baseURL, ctx) {
const basePath = '/helix-deploy/spacecat-services/customer-secrets';
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important - JSDoc does not document the new behavior

The existing JSDoc (just above this line) was thin before and is now actively misleading - readers will not know that the path component participates in the key. At minimum: "The hostname and (if present) URL path are normalized (non-alphanumeric replaced with _, lowercased) and combined with _. Sites with the same hostname but different paths resolve to distinct secrets."

A worked example of the format (/helix-deploy/spacecat-services/customer-secrets/<host>[_<path>]/<version>) would also help the next person debugging "why is this customer's secret missing." Consider adding a @since note recording the format change so the breadcrumb is discoverable.

- Split pathname into segments and join with '__' so 'nba.com/us/kings' and
  'nba.com/us-kings' produce distinct keys
- Use split('/').filter(Boolean) to handle repeated slashes correctly
- Update JSDoc with key format and examples
- Add trailing-slash idempotence test for subpaths
- Add separator-collision test

Addresses review feedback on LLMO-4186

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@alinarublea
Copy link
Copy Markdown
Contributor Author

Thanks for the thorough review @dzehnder — all points addressed in the latest commits:

Migration impact: No production sites currently have a non-empty path in baseURL — subpath support is new functionality introduced by LLMO-4186. The "backward compatible" claim holds for all existing sites.

Separator collision (important #2): Fixed. Switched to split('/').filter(Boolean) on the pathname and joining segments with __ (double underscore). Since each sanitized segment can only contain single underscores, __ can never appear in a segment and acts as an unambiguous delimiter. nba.com/us/kingsnba_com__us__kings, nba.com/us-kingsnba_com__us_kings — distinct. Added a test asserting this.

Repeated-slash normalization (minor #1): Fixed as a side-effect of the segment split — split('/').filter(Boolean) discards empty segments, so //kings and /kings both yield ['kings'].

JSDoc (important #3): Updated with format description, delimiter rationale, and worked examples.

Subpath trailing-slash idempotence test (minor #2): Added.

The companion PR (spacecat-api-service#2315 generateDataFolder) received the same treatment with -- as the segment delimiter, for consistency.

@alinarublea alinarublea requested a review from dzehnder May 4, 2026 13:18
…olding intent

- decodeURIComponent each path segment before sanitizing so %20 and
  equivalent decoded forms map to the same key
- Collapse consecutive _ runs after sanitizing (e.g. %20%20 -> single _)
- Document deliberate case-folding of path segments in JSDoc
- Add tests for percent-encoded paths, uppercase paths, and double slashes

Addresses remaining review feedback on LLMO-4186

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@alinarublea
Copy link
Copy Markdown
Contributor Author

Addressing the remaining items:

Minor #3 (case-folding undocumented): Added explicit note to JSDoc — "paths are case-folded deliberately so /Kings and /kings map to the same key" — and a test asserting this.

decodeURIComponent (parallel with companion PR): Added percent-decode for each segment before sanitizing, with a safe fallback on malformed encoding. Added test for %C3%B6 (ö).

Consecutive _ collapse: Added .replace(/_+/g, '_') so e.g. %20%20 (two encoded spaces) produces one _ rather than two.

Double-slash test: Added — https://nba.com//kings equals https://nba.com/kings via filter(Boolean).

All 1043 tests pass.

Copy link
Copy Markdown
Contributor

@dzehnder dzehnder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @alinarublea,

Thanks for the quick turnaround on the prior round - the new commit is well-targeted and clearly tries to close the collision class properly. One residual issue: the new JSDoc makes a categorical injectivity claim that the sanitizer does not actually deliver, and the new collision test is too narrow to catch it. Both are fixable with a one-line regex tweak plus a couple of extra assertions.

Strengths

Previously flagged issues now addressed:

  • Path-segment separator collision (prior Important): the new per-segment sanitization joined by __ eliminates the original nba.com/us/kings vs nba.com/us-kings collision (packages/spacecat-shared-utils/src/helpers.js:62-67).
  • JSDoc (prior Important): the docstring now describes the format, the per-segment normalization, the __ delimiter, and gives three worked examples for root, single-segment, and nested-segment URLs (packages/spacecat-shared-utils/src/helpers.js:40-52).
  • Repeated-slash normalization (prior Minor): split('/').filter(Boolean) cleanly handles nba.com//kings and similar, as a side effect.
  • Subpath trailing-slash idempotence test (prior Minor): dedicated test added at packages/spacecat-shared-utils/test/helpers.test.js:113-117.
  • Path case-folding (prior Minor): now documented in the expanded JSDoc.
  • Migration impact for existing subpath customers (prior Important): prior concern resolved by author response - subpath support is new functionality from LLMO-4186, no existing sites carry a non-empty path. The new format has no migration footprint inside spacecat-shared today.

Issues

Important (Should Fix)

  1. Injectivity claim is false; the new approach still admits collisions on consecutive non-alphanumeric inputs (packages/spacecat-shared-utils/src/helpers.js:62-67, JSDoc at :43-46)

    The JSDoc states: "The double-underscore delimiter cannot appear in a sanitized segment, so distinct paths cannot collide." This is not actually true. The sanitizer replace(/[^a-zA-Z0-9]/g, '_') is per-character and does not collapse runs, and _ itself is non-alphanumeric so it is preserved 1:1. Two consecutive non-alphanumeric characters in a single path segment therefore produce __ inside the segment, indistinguishable from a segment boundary.

    Verified collisions (all produce the same final secret name):

    https://nba.com/us..kings    -> nba_com__us__kings
    https://nba.com/us/kings     -> nba_com__us__kings   COLLIDE
    https://nba.com/us-_kings    -> nba_com__us__kings   COLLIDE
    https://nba.com/us__kings    -> nba_com__us__kings   COLLIDE (literal __ in path)
    https://nba.com//us//kings   -> nba_com__us__kings   COLLIDE (filter(Boolean) collapses runs of /)
    

    Plus intra-segment, all single-segment paths whose two halves are separated by any non-alphanumeric character collide on us_kings:

    nba.com/us-kings   nba.com/us_kings   nba.com/us.kings   nba.com/us kings   -> nba_com__us_kings
    

    Why it matters: this is the same security class the PR is meant to close - two distinct baseURLs mapping to one Secrets Manager key. Severity is bounded today by the admin-only createSite gate and the absence of production subpath sites, so I am calling this Important rather than Critical. But: the PR's whole premise is "close the collision class" and the JSDoc explicitly promises injectivity, so shipping a docstring guarantee that the code does not honor sets up a foot-gun for the next contributor and for any caller that builds on the contract.

    The new test at packages/spacecat-shared-utils/test/helpers.test.js:127-132 only asserts nba.com/us/kings != nba.com/us-kings. None of the residual collisions above are covered, so the test does not actually pin the property the JSDoc claims.

    Pick one fix:

    • Tighten the sanitizer (smallest change, recommended): collapse runs of non-alphanumeric to a single _ per segment. One-character regex tweak:

      const segments = url.pathname.split('/').filter(Boolean)
        .map((seg) => seg.replace(/[^a-zA-Z0-9]+/g, '_').toLowerCase());

      With +, us..kings becomes us_kings, and joined-with-__ paths can no longer match a single-segment input. The JSDoc claim becomes architecturally true.

    • Or weaken the JSDoc to "best-effort distinctness for typical baseURL shapes; pathological inputs may collide." Worst option for a credentials-routing function.

    • Or hash the canonical form (host, segments) and use the hash as the secret-name suffix. Eliminates the injectivity question entirely. Heavier change.

    Whichever path, also extend the separator-collision test to pin the property:

    expect(resolveCustomerSecretsName('https://nba.com/us..kings', ctx))
      .to.not.equal(resolveCustomerSecretsName('https://nba.com/us/kings', ctx));
    expect(resolveCustomerSecretsName('https://nba.com/us__kings', ctx))
      .to.not.equal(resolveCustomerSecretsName('https://nba.com/us/kings', ctx));

Recommendations

  • Adopt the + regex change above. It is a one-character edit, makes the docstring accurate, preserves all currently-passing tests, and locks the collision class shut.
  • After this PR merges, raise the same fix on the companion generateDataFolder (in spacecat-api-service) - it has the identical residual collision with -- instead of __. Out of scope here; worth a follow-up ticket so the two functions stay aligned on the injectivity property.
  • The __ delimiter doubles per-segment overhead and brings the AWS Secrets Manager 512-char name limit a bit closer. Realistic baseURLs are nowhere near that ceiling, but if the team ever wants the limit to be a non-issue, the hash-based key suggestion above is the cleanest answer. Not for this PR.

Out of scope, worth tracking

  • The companion fix in spacecat-api-service carries the same residual collision class (-- separator with the same regex). One mention here, no separate finding.

Assessment

Ready to merge? With one fix.

Reasoning: every prior finding is genuinely addressed and the design is the right shape. The remaining issue is that the JSDoc advertises injectivity that the sanitizer does not provide, and the new test does not catch the residual collisions. The fix is a one-character regex change plus two additional assertions - small enough to land here rather than defer.

Next Steps

  1. Apply the [^a-zA-Z0-9]+ (collapse-runs) regex change in the per-segment sanitizer.
  2. Add the two adversarial assertions to the separator-collision test.
  3. The recommendations are optional.

try {
customer = new URL(baseURL).host.replace(/[^a-zA-Z0-9]/g, '_').toLowerCase();
const url = new URL(baseURL);
const host = url.hostname.replace(/[^a-zA-Z0-9]/g, '_').toLowerCase();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important - Injectivity claim is false; collision class still open

The new JSDoc (line 43-46) states "The double-underscore delimiter cannot appear in a sanitized segment, so distinct paths cannot collide." This isn't true. The sanitizer is per-character and _ is non-alphanumeric so two adjacent non-alphanumerics in one segment produce __ inside the segment, indistinguishable from a segment boundary.

Verified collisions (all produce nba_com__us__kings):

https://nba.com/us/kings
https://nba.com/us..kings   (consecutive `..`)
https://nba.com/us-_kings   (mixed `-_`)
https://nba.com/us__kings   (literal `__` in path)
https://nba.com//us//kings  (filter(Boolean) collapses runs of `/`)

Plus intra-segment: us-kings, us_kings, us.kings, us kings all collapse to us_kings.

Severity is bounded today by the admin-only createSite gate and the absence of production subpath sites, so this stays Important rather than Critical. But the PR's whole purpose is to close this collision class on a credentials function, and shipping a JSDoc guarantee that the code does not honor sets up a foot-gun.

The new test at helpers.test.js:127-132 only asserts us/kings != us-kings; none of the collisions above are covered.

One-line fix (recommended): collapse runs of non-alphanumeric to a single _ per segment:

const segments = url.pathname.split('/').filter(Boolean)
  .map((seg) => seg.replace(/[^a-zA-Z0-9]+/g, '_').toLowerCase());

The + makes us..kings -> us_kings (single underscore), so joined-with-__ paths can no longer match a single-segment input. JSDoc claim becomes architecturally true; all current tests still pass.

Add adversarial assertions to lock it in:

expect(resolveCustomerSecretsName('https://nba.com/us..kings', ctx))
  .to.not.equal(resolveCustomerSecretsName('https://nba.com/us/kings', ctx));
expect(resolveCustomerSecretsName('https://nba.com/us__kings', ctx))
  .to.not.equal(resolveCustomerSecretsName('https://nba.com/us/kings', ctx));

The companion generateDataFolder in spacecat-api-service has the identical residual collision with --; worth a follow-up there so the two functions stay aligned on the injectivity property.

The per-segment sanitizer now uses /[^a-zA-Z0-9]+/g (with +) to collapse
consecutive non-alphanumeric characters into a single underscore in one
pass, removing the separate replace(/_+/g, '_') step. This makes the
JSDoc injectivity claim architecturally accurate: a sanitized segment
can never contain __, so the __ segment delimiter is unambiguous.

Added three new tests:
- 'us..kings' (dots) does not collide with 'us/kings' (two segments)
- 'us__kings' (literal double underscore) does not collide with 'us/kings'
- Malformed percent-encoding (%ZZ) is handled without throwing (covers
  the decodeURIComponent catch branch).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants