I’m pretty confused by the search params matching behavior.
First I’d note that WHATWG URL seems to (effectively?) define some (narrow) canonicalization rules via the URLSearchParams interface:
- "foo=" and "foo" are both representations of the same parameter, [ "foo", "" ], whose canonical form includes the equals sign
- percent-encoded bytes are decoded as UTF-8 (byte sequences that are not valid UTF-8 decode to U+FFFD rather than passing through as-is), and if the resulting characters are ASCII alphanumerics or one of "*", "-", "." or "_", they are canonicalized to their ASCII representations instead of being percent-encoded
Thus the search params / query strings of the following three URLs are “the same” and their canonical form is the first:
`${ new URL("https://example.test/?foo=").searchParams }`;
// → "foo="
`${ new URL("https://example.test/?foo").searchParams }`;
// → "foo="
`${ new URL("https://example.test/?%66oo").searchParams }`
// → "foo="
URLPattern doesn’t appear to consider them equivalent:
const origin = "https://example.test/"; // assumed base URL for the examples below
new URLPattern({ search: "foo" }).exec(origin + "?foo");
// → { hash, hostname, inputs, ... }
new URLPattern({ search: "foo" }).exec(origin + "?foo=");
// → null
new URLPattern({ search: "foo" }).exec(origin + "?%66oo");
// → null
This seems unfortunate to me — I’d rather not have to “think about” a representation distinction URLSearchParams decided has no meaning in itself (like representing the number 1 as 0x01 or 1e0). It seems to follow from this that the obvious way to say “bind (any) value” for a query param doesn’t work:
new URLPattern({ search: "foo=:foo" }).exec(origin + "?foo=xxx")?.search.groups.foo;
// → "xxx"
new URLPattern({ search: "foo=:foo" }).exec(origin + "?foo")?.search.groups.foo;
// → undefined
However, this behavior is consistent with how URL proper works, in that url.href doesn’t return the canonicalized version unless you “do something,” e.g. url.searchParams.delete("random"). And it’s not unreasonable to say that if you want the canonicalization behavior of URLSearchParams, you should first pass the exec input through URL and ensure it’s in that form. After all, one may want additional canonicalization behavior like sorting keys or other application-specific semantics that USP is agnostic to.
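For illustration, a minimal sketch of what “pass the input through URL first” could look like. The helper name canonicalizeSearch and its sortKeys option are hypothetical, not part of URL or URLPattern; it relies only on URL and URLSearchParams:

```javascript
// Hypothetical helper (not part of URL or URLPattern): rewrite the query
// string into the form URLSearchParams would serialize it as.
function canonicalizeSearch(input, { sortKeys = false } = {}) {
  const url = new URL(input);
  if (sortKeys) {
    // sort() orders the pairs by name and writes the result back to the URL.
    url.searchParams.sort();
  } else {
    // Assigning the serialized params back forces the canonical form
    // without changing the order of the pairs.
    url.search = url.searchParams.toString();
  }
  return url.href;
}

canonicalizeSearch("https://example.test/?%66oo");
// → "https://example.test/?foo="
```

The result could then be handed to exec, at the cost of every caller having to remember to do this.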
But... the canonical representation of an empty parameter value also doesn’t work:
new URLPattern({ search: "foo=:foo" }).exec(origin + "?foo=")?.search.groups.foo;
// → undefined
Note that search params representing boolean values often operate just like boolean content attributes in HTML: foo being present with any value is foo: true while foo being absent is foo: false, and the canonical true is the empty string.
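A small sketch of that boolean convention using URLSearchParams directly. The helper name hasFlag is made up for the example:

```javascript
// Hypothetical helper: treat a search param as a boolean flag, the way
// boolean content attributes work in HTML (present with any value = true).
function hasFlag(input, name) {
  return new URL(input).searchParams.has(name);
}

hasFlag("https://example.test/?foo", "foo");     // → true
hasFlag("https://example.test/?foo=", "foo");    // → true
hasFlag("https://example.test/?foo=bar", "foo"); // → true
hasFlag("https://example.test/?bar", "foo");     // → false
```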
There probably is a way to write a pattern that’s actually able to match all values including the empty value, but it seems pretty surprising to me if URLPattern has no awareness of USP possessing structure. Search params aren’t hierarchical, they are a list of key-value pairs, and it’s unclear to me how to use URLPattern to match on params without writing patterns that are more rather than less complex than equivalent RegExps, especially when you want to match while permitting arbitrary params that could appear between others.
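To make the comparison concrete, here is a rough sketch of the kind of matching I have in mind, written against URLSearchParams rather than URLPattern. The function matchParams is hypothetical; it binds the first value of each required key, accepts the empty value, and ignores any other params wherever they appear:

```javascript
// Hypothetical helper: bind the values of the required param names,
// regardless of order and of any unrelated params in the query.
function matchParams(input, requiredNames) {
  const params = new URL(input).searchParams;
  const groups = {};
  for (const name of requiredNames) {
    const value = params.get(name); // null when absent, "" when empty
    if (value === null) return null; // a required param is missing
    groups[name] = value;
  }
  return groups;
}

matchParams("https://example.test/?a=1&foo=&b=2", ["foo"]);
// → { foo: "" }
matchParams("https://example.test/?bar=z&x=1&foo=xxx", ["foo", "bar"]);
// → { foo: "xxx", bar: "z" }
matchParams("https://example.test/?bar=1", ["foo"]);
// → null
```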
Apologies if this has already been discussed and I just missed it. Matching search params is super important for my use cases and I’m struggling a bit as URLPattern has made them more difficult to match for me so far rather than easier to match.