Refactor processURL to preserve original URL encoding#14
Conversation
ProcessURL previously ran the query string through query-string.parse → mutate → query-string.stringify, which re-encoded values and collapsed repeated keys. That caused two classes of issue: encoded characters like %20, %2B, %25 being rewritten, and duplicate keys (?foo=1&foo=2) being lost in the round-trip. The new path keeps the raw query string as a string, uses regex to extract ch-* parameter values (decoded for internal use), and filters ch-* keys out by splitting/rejoining on &. Non-ch parameters are now preserved byte-for-byte, and duplicates are retained in order. Also drops the query-string dependency and the unused QueryObject schema export. Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Refactors ProcessURL query handling to preserve original URL encoding and retain duplicate query keys, while removing the query-string dependency and related schema.
Changes:
- Update
ProcessURLto work off the raw query string, extractingch-*params separately and filtering them out without re-encoding other params. - Remove the
query-stringdependency and delete the unusedQueryObjectschema export. - Regenerate distribution artifacts (
dist/*) and update the npm lockfile accordingly.
Reviewed changes
Copilot reviewed 3 out of 16 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| src/common/types.ts | Removes QueryObject schema export. |
| src/common/processURL.ts | Implements raw query-string preservation + ch-* extraction/filtering. |
| package.json | Drops query-string dependency. |
| package-lock.json | Removes query-string from lockfile; also reflects package metadata (version/license). |
| dist/types/common/types.d.ts | Removes QueryObject from published type declarations. |
| dist/types/common/processURL.d.ts | Updates published types to match new ProcessURL internals. |
| dist/common/types.js | Removes QueryObject from built CommonJS output. |
| dist/common/processURL.js | Updates built CommonJS output for new query handling logic. |
| dist/crowdhandler.umd.js | Updates bundled UMD build (dependency removal + new query handling). |
| dist/crowdhandler.esm.js | Updates bundled ESM build (dependency removal + new query handling). |
| dist/crowdhandler.cjs.js | Updates bundled CJS build (dependency removal + new query handling). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| try { | ||
| return decodeURIComponent(match[1]); | ||
| } catch { | ||
| return match[1]; |
There was a problem hiding this comment.
decodeURIComponent does not translate + into a space, whereas the previous query-string.parse behavior treated + as a space in query parameters. This changes decoding semantics for ch-* values if they come in as form-encoded (e.g. ch-code=hello+world). Consider normalizing + (e.g. replace with %20 before decoding) or using URLSearchParams for decoding the ch-* params to preserve prior behavior.
| try { | |
| return decodeURIComponent(match[1]); | |
| } catch { | |
| return match[1]; | |
| const normalizedValue = match[1].replace(/\+/g, "%20"); | |
| try { | |
| return decodeURIComponent(normalizedValue); | |
| } catch { | |
| return match[1].replace(/\+/g, " "); |
extractParamValue used plain decodeURIComponent, which does not convert + to space. Every other piece of the CrowdHandler stack — the CloudFront integration (query-string lib), the Cloudflare Worker integration (URLSearchParams), and the SDK's own pre-refactor behaviour — follows the HTML form-encoding convention where + in a query value is a space. The divergence is not just cosmetic: ch-* values feed into signature checks, so any mismatch between how the edge and the SDK decode them breaks validation. Pre-normalize + to %20 before decodeURIComponent, and preserve the + → space behaviour on the malformed-encoding fallback path too. Only applies to ch-* extraction — non-ch params are still preserved byte-for-byte in the rebuilt targetURL (that was the whole point of the earlier refactor and is unaffected). Caught by Copilot review on the encoding-review PR. Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
ProcessURL previously ran the query string through query-string.parse → mutate → query-string.stringify, which re-encoded values and collapsed repeated keys. That caused two classes of issue: encoded characters like %20, %2B, %25 being rewritten, and duplicate keys (?foo=1&foo=2) being lost in the round-trip.
The new path keeps the raw query string as a string, uses regex to extract ch-* parameter values (decoded for internal use), and filters ch-* keys out by splitting/rejoining on &. Non-ch parameters are now preserved byte-for-byte, and duplicates are retained in order.
Also drops the query-string dependency and the unused QueryObject schema export.