Spec needs an update / 0.3#522
Closed
masklinn wants to merge 1 commit intoua-parser:masterfrom
masklinn:0.3
Closed
Spec needs an update / 0.3#522masklinn wants to merge 1 commit intoua-parser:masterfrom masklinn:0.3
masklinn wants to merge 1 commit intoua-parser:masterfrom
masklinn:0.3
Conversation
UA spec I feel has major issues, especially that respecting it makes it impossible for tests to pass, leading to implementations having to copy one another or do their own random hack (as the C# implementation does). There are two major issues in the 0.2 spec: Replacement fields templating ----------------------------- It states that the OS fields have individual replacements, that is match `$1` is only used in `os_replacement`, match `$2` is only used in `os_v1_replacement`, etc... However there are multiple test suites which use `$1` in `os_v1_replacement` (macos, win, Box.Windows), and macOS also uses `$2` and `$3` in shifted position. The reference implementation handle this by just making the OS fields into "full" templates (that is all groups are available to all replacement fields). The C# implementation instead tries to mess around with different orders for v1 and v2 based on what it finds there. Obviously it makes sense to standardise the behaviour of the standard implementation instead of the hack of the C# lib. But for uniformity and to allow for less redundant explanations, I think it also makes sense to make the `user_agent_parsers` fields into "full" templates (that is, all groups are available) even if no parser currently uses that. In fact that's how the Python and Go implementations behave already. Side-note: far from extending the spec, the reference implementation doesn't even implement it in full as it only supports replacing `$1` in `family_replacement`, it has no support for templating at all in `v1_replacement`, `v2_replacement`, or `v3_replacement`. `Device#model_replacement` can't be required -------------------------------------------- Despite what the spec currently says, one of the user agents (`Opera/9.80 (BlackBerry; Opera Mini/7.0.31437/28.3030; U; en)`) has no capturing group and no `model_replacement`, so it's not possible to parse it per-expectation if `model_replacement` is required. As such, only `Device#device_replacement` should be required. Add `patch_minor` to user agent ------------------------------- `patch_minor` was added to the test file and some regexes starting at #322 but it was never added to the spec, so do that at least. It should also be added to the reference implementation eventually, probably.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The spec as it stands can't be used as the basis for an implementation: following its descriptions / instructions means it's impossible to pass the test suite, and leads some implementations to use weird hacks.
There are two minor and one major issue in the 0.2 spec.
minor issue:
UserAgent#patch_minorpatch_minorwas added to the test file and some regexes starting at #322 but it was never added to the spec, so do that at least. It should also be added to the reference implementation eventually, probably.minor issue:
Device#model_replacementcan't be requiredThe 0.2 spec states that the
modelfield is required (either$1should have matched content, ormodel_replacementshould be non-empty:But doing so fails the following test case:
That is because this is matched by the following item:
The item has no capturing group at all, and has no
model_replacement. The item goes against spec.major issue: os needs to be fully templated
The spec doesn't currently have a concept for the difference, but in this text I will use the following lingo:
family_replacementandos_replacement) can only use the placeholder$1$1and$9The 0.2 spec specifies
In its specification of the OS category, it defines
os_replacementas matching group 1 and having$1available,os_v1_replacementas matching group 2 and having$2available, etc...However OS has multiple items which are defined with
os_v1_replacementusing group 2, and more:This means the corresponding test cases will necessarily fail if the spec is implemented as stated. Furthermore while the implementations I've checked have extended the spec in order to handle these cases they have done so inconsistently:
The latter is extremely brittle, and I think full templating of the OS category is the proper response.
fully templating user agent
For coherence, simplicity, and clarity, I think full templating should also be extended to the user agent category, even if there is at this point no use case for it.
From what I've seen, that already seems to be the behaviour of the python and go implementations.
Furthermore far from extending the spec the reference implementation doesn't even implement the spec on the user agent category: it supports restricted templating for
family_replacement, but the v1, v2, and v3 fields are non-templated when per-spec they should be restrict-templated.unhappy with the draft
I've set forth a proposal rewrite of the spec organised thus:
a. how categories are defined
b. the parsing algorithm (as it's the same for all categories)
c. description of the regex field(s)
d. description of templated (replacement) fields and "full templating"
However while I think the narrative arc is fine I'm having real trouble with the writing so far, I'm far from happy about how it's written up, and would be glad for assistance.