Support any UTF-8 string as label name by yuri-tceretian · Pull Request #3321 · prometheus/alertmanager

yuri-tceretian · 2023-04-10T18:08:12Z

Currently, Alertmanager supports only valid Prometheus label names (i.e. ones that match the regular expression ^[a-zA-Z_][a-zA-Z0-9_]*$). This PR expands the set of valid characters to any UTF-8 character.

The only restriction on a label name is that it must not consist solely of whitespace.

It replaces all uses of Prometheus' model.LabelName method IsValid with a new function, labels.IsValidName, that accepts a model.LabelName.
It overrides the types.Alert method Validate, which is derived from Prometheus' model.Alert, and changes the validation of Labels and Annotations. The tests are copied from Prometheus' Alert tests and expanded with a few more test cases.
It updates the ParseMatcher function, which parses a string into a labels.Matcher. The regular expression now matches any character in a name wrapped in double quotes, and only Prometheus-compatible names when the name is unquoted.
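
As a rough illustration of the name rule described above (any valid UTF-8 string that is not made up solely of whitespace), here is a minimal sketch. The name isValidName and its plain string parameter are assumptions made for this example; the PR's labels.IsValidName takes a model.LabelName, and only part of its body is visible in the review excerpts. Rejecting the empty string is a property of this sketch, not a confirmed behavior of the PR.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// isValidName is a hypothetical stand-in for the check described in the PR:
// it reports whether name is valid UTF-8 and contains at least one
// non-whitespace character.
func isValidName(name string) bool {
	allSpaces := true
	for _, r := range name {
		if !unicode.IsSpace(r) {
			allSpaces = false
			break
		}
	}
	// In this sketch an empty name never clears allSpaces, so it is rejected too.
	return !allSpaces && utf8.ValidString(name)
}

func main() {
	for _, n := range []string{"foo", "foo😊", "  ", "", "\xff"} {
		fmt.Printf("%q valid=%v\n", n, isValidName(n))
	}
}
```

The UTF-8 check is kept separate from the whitespace scan so that a name containing an invalid byte sequence is rejected even though it is not all whitespace.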

Fixes #3319

Notes for reviewer: The PR can be reviewed commit by commit.

Update validation in UI

grobinson-grafana · 2023-05-10T10:24:30Z

Nice work Yuri!

I am a little concerned about adding inconsistent rules to label matchers. With this PR, UTF-8 is supported without double quotes for values, but not for names. For example, the following matcher is accepted:

{foo=bar😊}

but this is not:

{foo😊=bar}

as it instead must be:

{"foo😊"=bar}

I don't like that there are different rules for each side of the expression. I'm interested to hear what Josh and Simon think.
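
To make the asymmetry concrete, the rule for the name side of a matcher can be sketched roughly as follows. This is only an illustration under stated assumptions: nameTokenAccepted is a hypothetical helper and not the PR's parser, it looks at the name token in isolation, and it ignores escape sequences inside the quotes.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"unicode/utf8"
)

// promName is the Prometheus label name pattern quoted in the PR description.
var promName = regexp.MustCompile(`^[a-zA-Z_][a-zA-Z0-9_]*$`)

// nameTokenAccepted (hypothetical) applies the two rules discussed here:
// a double-quoted name may hold any valid UTF-8 text that is not all
// whitespace, while an unquoted name must be Prometheus-compatible.
func nameTokenAccepted(tok string) bool {
	if len(tok) >= 2 && strings.HasPrefix(tok, `"`) && strings.HasSuffix(tok, `"`) {
		inner := tok[1 : len(tok)-1]
		return utf8.ValidString(inner) && strings.TrimSpace(inner) != ""
	}
	return promName.MatchString(tok)
}

func main() {
	for _, tok := range []string{`foo`, `foo😊`, `"foo😊"`, `" "`} {
		fmt.Printf("%s accepted=%v\n", tok, nameTokenAccepted(tok))
	}
}
```

Under these assumptions the unquoted token foo😊 is rejected and the quoted token "foo😊" is accepted, which mirrors the two matcher examples above.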

grobinson-grafana · 2023-05-10T10:29:12Z

docs/configuration.md

+- A UTF-8 string, which may be enclosed in double quotes. Can be an empty string.

-The 3rd token may be the empty string. Within the 3rd token, OpenMetrics escaping rules apply: `\"` for a double-quote, `\n` for a line feed, `\\` for a literal backslash. Unescaped `"` must not occur inside the 3rd token (only as the 1st or last character). However, literal line feed characters are tolerated, as are single `\` characters not followed by `\`, `n`, or `"`. They act as a literal backslash in that case.
+Before or after each token, there may be any amount of whitespace.


This was moved to a new line; I just wanted to check whether that was intentional.

grobinson-grafana · 2023-05-10T10:34:04Z

pkg/labels/matcher.go

 }

 func (m *Matcher) String() string {
+	if !model.LabelName(m.Name).IsValid() {


I'm not sure I understand what this does. Would it be possible to explain it in a comment?

Sure! The idea is to check whether the name is Prometheus-compatible and return it without quotes (same as it does now), and wrap it in double quotes otherwise.
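
A minimal sketch of that idea, under stated assumptions: formatName is a hypothetical helper, the regular expression stands in for model.LabelName.IsValid, and strconv.Quote stands in for whatever quoting and escaping the PR actually applies in Matcher.String.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// promName stands in for model.LabelName.IsValid in this sketch.
var promName = regexp.MustCompile(`^[a-zA-Z_][a-zA-Z0-9_]*$`)

// formatName (hypothetical) returns a Prometheus-compatible name unchanged
// and wraps any other name in double quotes.
func formatName(name string) string {
	if promName.MatchString(name) {
		return name
	}
	// Assumption: Go string quoting is used here only for illustration.
	return strconv.Quote(name)
}

func main() {
	for _, n := range []string{"foo", "foo bar", "foo😊"} {
		fmt.Println(formatName(n))
	}
}
```

Keeping Prometheus-compatible names unquoted means the string form of existing matchers stays the same as before the change.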

grobinson-grafana · 2023-05-10T10:35:13Z

pkg/labels/labels.go

+			break
+		}
+	}
+	return !allSpaces && utf8.ValidString(lns)


Do we need utf8.ValidString if it's also checked on line 138 of parse.go?

This function is used in many other places to validate incoming labels: API calls and config parsing/validation. parse.go only parses matchers.

grobinson-grafana · 2023-05-10T10:37:05Z

pkg/labels/parse.go

+
+	name := rawName
+	// if name is quoted, then it can contain any UTF-8 character. Unescape some escape sequences.
+	if strings.HasPrefix(rawName, `"`) {


I thought we wanted to apply OpenMetrics escaping to unquoted strings, not to double-quoted strings? Double-quoted strings have different escape sequences, right?

Suggested change

-	if strings.HasPrefix(rawName, `"`) {
+	if !strings.HasPrefix(rawName, `"`) {

Unquoted names can only be Prometheus-compatible, and therefore no escaping is allowed. Double-quoted names, in contrast, can contain any UTF-8 character, and can therefore be escaped the same way as a value.

grobinson-grafana · 2023-05-10T10:38:44Z

pkg/labels/parse.go

+		rawValue            = raw
+		expectTrailingQuote = false
+	)
+	if strings.HasPrefix(rawValue, `"`) {


I think the quote should be removed before the call to unescapeMatcherString.

No, because it is also applied to the label value, which can be quoted or unquoted.

What I mean here is that unescapeMatcherString is doing two operations: unescaping escape sequences and also checking that a double-quoted string has both start and end quotes. What I was proposing was to separate those out into separate functions / code.

grobinson-grafana · 2023-05-10T10:39:39Z

pkg/labels/parse.go

+	)
+	if strings.HasPrefix(rawValue, `"`) {
+		rawValue = rawValue[1:]
+		expectTrailingQuote = true


As above, I think checking that a double-quoted string is properly terminated should be done somewhere else. I would argue that checking for the terminating " and checking escape sequences inside a double-quoted string are different operations.
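
The separation proposed here could look roughly like the sketch below. Both trimQuotes and unescape are hypothetical names and neither is the PR's unescapeMatcherString; the escape rules follow the ones quoted from docs/configuration.md earlier in this review (`\"`, `\n`, `\\`, with any other lone backslash kept literally).

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// trimQuotes (hypothetical) only deals with the surrounding double quotes.
// It returns the inner text and whether the input was quoted, or an error
// if an opening quote has no matching closing quote.
func trimQuotes(s string) (inner string, quoted bool, err error) {
	if !strings.HasPrefix(s, `"`) {
		return s, false, nil
	}
	if len(s) < 2 || !strings.HasSuffix(s, `"`) {
		return "", true, errors.New("missing closing double quote")
	}
	inner = s[1 : len(s)-1]
	// A final quote preceded by an odd number of backslashes is itself
	// escaped, so the string is still unterminated.
	backslashes := len(inner) - len(strings.TrimRight(inner, `\`))
	if backslashes%2 == 1 {
		return "", true, errors.New("missing closing double quote")
	}
	return inner, true, nil
}

// unescape (hypothetical) only resolves escape sequences and knows nothing
// about surrounding quotes.
func unescape(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		if s[i] == '\\' && i+1 < len(s) {
			switch s[i+1] {
			case 'n':
				b.WriteByte('\n')
				i++
				continue
			case '"', '\\':
				b.WriteByte(s[i+1])
				i++
				continue
			}
		}
		// Any other byte, including a lone backslash, is copied as-is.
		b.WriteByte(s[i])
	}
	return b.String()
}

func main() {
	for _, tok := range []string{`foo`, `"foo\nbar"`, `"foo`, `"foo\"`} {
		inner, quoted, err := trimQuotes(tok)
		if err != nil {
			fmt.Printf("%s: %v\n", tok, err)
			continue
		}
		fmt.Printf("%s: quoted=%v value=%q\n", tok, quoted, unescape(inner))
	}
}
```

With the two steps split like this, the termination check can be tested without any escape handling, and the unescaping step can be reused for unquoted values.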

yuri-tceretian · 2023-05-10T14:28:13Z

I am a little concerned about adding inconsistent rules to label matchers.

Currently, label values can be quoted or unquoted, and can contain UTF-8 characters in both cases.

I agree that different rules for different parts of the expression can be confusing, and I can change that if the maintainers agree. However, the goal of this PR is to introduce an extension to the current syntax with as few changes as possible and without breaking current configurations.

I did try to apply similar rules to both parts of the expression, and that's why I added support for escaped special characters (\n, \t, \") to label names.

Ideally, I think a matcher should just be a structured object rather than a string that needs to be parsed. That would drastically simplify code.

yuri-tceretian · 2023-05-24T16:14:19Z

Closing this, as it does not seem that anyone is interested in reviewing it, and it will be superseded by #3353.

gotjosh · 2023-05-24T17:10:53Z

I have taken a look, but I have a strong preference for the approach taken in #3353, as it is clearer in terms of explanation. As such, we'll focus our efforts on that one.

For the future though, I would have expected to see much better documentation (in the PR) for such a critical change. As an example, I found #3353 (comment) very useful for understanding to what degree this is a breaking change.

yuri-tceretian · 2023-05-24T17:18:28Z

@gotjosh this PR did not break any current behavior but extended it, and therefore did not require the supplemental documentation you referred to.
All other documentation was updated according to the change.

I have taken a look,

Good. For the future though, a comment would be appreciated :)

gotjosh · 2023-05-24T17:22:54Z

@gotjosh this PR did not break any current behavior but extended it and therefore did not require any supplemental documentation you referred to.

The expansion of the character set meant that previously rejected matchers would now be accepted; this can be considered a breaking change.

yuri-tceretian added 2 commits April 10, 2023 14:32

introduce labels.IsValidName

95d8aa9

that returns true to strings that contain any UTF-8 characters except all whitespaces Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>

implement validate for types.Alert

56dc89d

code copied from model.Alert and changed label validation Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>

yuri-tceretian force-pushed the yuri-tceretian/utf-8-label-names branch from 2c09948 to ac4726a Compare April 10, 2023 18:33

yuri-tceretian added 4 commits April 10, 2023 14:39

update silence matcher validate to use labels.IsValidName

29e0703

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>

lint silence_test.go

4c01b12

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>

update validation of Route.GroupBy to use labels.IsValidName

7ae44f1

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>

refactor ParseMatcher. extract unescapeMatcherString

ecaf75e

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>

yuri-tceretian force-pushed the yuri-tceretian/utf-8-label-names branch from ac4726a to 5f2333f Compare April 10, 2023 18:39

yuri-tceretian mentioned this pull request Apr 11, 2023

Alerting: Support filtering of alerts by state and labels grafana/grafana#65904

Merged

3 tasks

yuri-tceretian force-pushed the yuri-tceretian/utf-8-label-names branch from bb7b0e7 to 4a67770 Compare April 12, 2023 19:53

yuri-tceretian added 6 commits April 17, 2023 12:24

add validation of matcher type

d19c81a

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>

fix support for double quotes

2bc49b5

update Matcher.String to return utf-8 matcher label wrapped in double…

ccc3af9

… quotes Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>

add acceptance tests for v2 API

51b5f25

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>

update help in amtool to mention double quotes

48c7d4e

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>

update docs

024a9f0

Signed-off-by: Yuri Tseretyan <yuriy.tseretyan@grafana.com>

yuri-tceretian force-pushed the yuri-tceretian/utf-8-label-names branch from 8aba175 to 024a9f0 Compare April 17, 2023 16:25

grobinson-grafana mentioned this pull request May 5, 2023

Support UTF-8 label matchers in Alertmanager #3353

Closed

grobinson-grafana reviewed May 10, 2023

Merge branch 'up/main' into yuri-tceretian/utf-8-label-names

a3eddd9

yuri-tceretian force-pushed the yuri-tceretian/utf-8-label-names branch from 5e0210e to a3eddd9 Compare May 12, 2023 14:05

yuri-tceretian closed this May 24, 2023

yuri-tceretian deleted the yuri-tceretian/utf-8-label-names branch May 24, 2023 16:14
