feat: Add field to add unspecified value to metric #996

Merged
collin-lee merged 4 commits into envoyproxy:main from xuannam230201:add-field-to-add-unspecified-value-to-metric
Nov 29, 2025

Conversation

@xuannam230201
Contributor

@xuannam230201 xuannam230201 commented Oct 31, 2025

Add value_to_metric field to include descriptor values in metrics

Summary

This PR adds a new optional field value_to_metric (default: false) to each descriptor in the rate limit configuration. When enabled, it includes the descriptor's runtime value in the metric key, even when the descriptor value is not explicitly defined in the configuration. This provides visibility into different rate limit scenarios without needing to pre-define every possible value.

Problem

Previously, when a descriptor matched a value that wasn't explicitly defined in the configuration (i.e., matched via a default key without value), the metric key would only include the descriptor key, not the actual runtime value. This made it difficult to track and analyze rate limiting metrics for different runtime values without using detailed_metric, which includes values for all descriptors and can lead to high cardinality.

Solution

The new value_to_metric field allows users to selectively include runtime values in metric keys for specific descriptors, providing granular control over metric cardinality while still maintaining visibility into important descriptor values.

Behavior

  • Default behavior: When value_to_metric is false (default) or not set, the behavior remains unchanged - descriptors matched via default keys only include the key name in metrics.

  • With value_to_metric: true: When enabled on a descriptor:

    • If the descriptor matches via a default key (no explicit value in config), the runtime value is included in the metric key: domain.key_value.subkey
    • If the descriptor matches via an explicit key+value or wildcard, the runtime value is always included in the metric key
    • When combined with wildcard matching, the full runtime value is included, not just the wildcard prefix
  • Precedence: When detailed_metric: true is set on a descriptor, it takes precedence and value_to_metric is ignored for that descriptor (to maintain backward compatibility).
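
The wildcard rule above can be sketched as follows. This is a minimal illustration that assumes trailing-`*` prefix matching and `value_to_metric` enabled; `fullValueSegment` is a hypothetical helper, not a function from the ratelimit codebase.

```go
package main

import (
	"fmt"
	"strings"
)

// fullValueSegment is a hypothetical helper. It reports whether runtime matches
// a trailing-"*" pattern and, if so, returns the metric segment built from the
// full runtime value rather than from the wildcard prefix.
func fullValueSegment(key, pattern, runtime string) (string, bool) {
	if !strings.HasSuffix(pattern, "*") {
		return "", false // not a wildcard pattern
	}
	prefix := strings.TrimSuffix(pattern, "*")
	if !strings.HasPrefix(runtime, prefix) {
		return "", false // runtime value does not match the prefix
	}
	if runtime == "" {
		return key, true // avoid a trailing underscore for empty values
	}
	return key + "_" + runtime, true
}

func main() {
	seg, ok := fullValueSegment("route", "api_*", "api_v2")
	fmt.Println(seg, ok) // route_api_v2 true
}
```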

Example

Configuration:

domain: domain
descriptors:
  - key: route
    value_to_metric: true
    descriptors:
      - key: http_method
        value_to_metric: true
        descriptors:
          - key: subject_id
            rate_limit:
              unit: minute
              requests_per_unit: 60

Requests:

  • route=api, http_method=GET, subject_id=123 → Metric: domain.route_api.http_method_GET.subject_id
  • route=web, http_method=POST, subject_id=456 → Metric: domain.route_web.http_method_POST.subject_id

Without value_to_metric, both requests would use: domain.route.http_method.subject_id
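
The key construction in this example can be sketched roughly as below. The `segment` type and `metricKey` function are hypothetical stand-ins rather than the PR's actual code, and the `detailed` flag assumes that `detailed_metric` appends the value at every level, as described in the Problem section.

```go
package main

import (
	"fmt"
	"strings"
)

// segment is a hypothetical stand-in for one matched descriptor level.
type segment struct {
	Key           string
	Value         string // runtime value taken from the request
	ValueToMetric bool   // per-descriptor value_to_metric flag
}

// metricKey joins the domain and the segments with dots. A segment contributes
// "key_value" when its flag is set, or when detailed is true (assumed to take
// precedence over the per-segment flag); otherwise it contributes only "key".
func metricKey(domain string, segs []segment, detailed bool) string {
	var b strings.Builder
	b.WriteString(domain)
	for _, s := range segs {
		b.WriteString(".")
		b.WriteString(s.Key)
		if (detailed || s.ValueToMetric) && s.Value != "" {
			b.WriteString("_")
			b.WriteString(s.Value)
		}
	}
	return b.String()
}

func main() {
	segs := []segment{
		{"route", "api", true},
		{"http_method", "GET", true},
		{"subject_id", "123", false},
	}
	fmt.Println(metricKey("domain", segs, false)) // domain.route_api.http_method_GET.subject_id
}
```

Because only flagged levels carry their value, `subject_id` stays out of the key and does not multiply the number of metric series.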

Changes

Code Changes

  • Added ValueToMetric bool field to YamlDescriptor struct
  • Added value_to_metric to validKeys map for YAML validation
  • Added valueToMetric bool field to rateLimitDescriptor struct to track the flag per descriptor
  • Updated loadDescriptors to store the value_to_metric flag in descriptor nodes
  • Updated GetLimit to build a value_to_metric-enhanced metric key when enabled
  • Handled wildcard matching to include full runtime values when value_to_metric is enabled

Tests

  • Added comprehensive unit tests covering:
    • Basic functionality with runtime values
    • Default key behavior with value_to_metric
    • Mid-level descriptor with value_to_metric
    • Backward compatibility (no flag set)
    • Interaction with detailed_metric (precedence)
    • Configured descriptor values with value_to_metric
    • Wildcard matching with value_to_metric
  • All tests pass successfully

Documentation

  • Updated README.md with:
    • Added value_to_metric to descriptor list definition format
    • New section "Including descriptor values in metrics" explaining the feature
    • Example 10 demonstrating usage with basic and wildcard scenarios
    • Updated Table of Contents (note: requires running doctoc to regenerate)

Testing

All existing tests continue to pass, ensuring backward compatibility. New tests verify:

  • ✅ Basic value_to_metric functionality
  • ✅ Default key behavior includes values when enabled
  • ✅ Wildcard matching includes full runtime values
  • ✅ No regression when flag is not set
  • ✅ Correct precedence with detailed_metric
  • ✅ Works with configured descriptor values

Backward Compatibility

This change is fully backward compatible:

  • Default value is false, so existing configurations continue to work unchanged
  • Only affects metric keys when explicitly enabled
  • Does not change rate limiting behavior, only metric naming

@xuannam230201 xuannam230201 force-pushed the add-field-to-add-unspecified-value-to-metric branch 2 times, most recently from 8a137ce to cf5f6fb on November 8, 2025 03:25
Signed-off-by: xuannam230201 <xuannam230201@gmail.com>
Signed-off-by: Nam Dang <xuannam230201@gmail.com>
@xuannam230201 xuannam230201 force-pushed the add-field-to-add-unspecified-value-to-metric branch from cf5f6fb to 9e9d3d5 on November 8, 2025 04:31
Signed-off-by: Nam Dang <xuannam230201@gmail.com>
Signed-off-by: Nam Dang <xuannam230201@gmail.com>
@xuannam230201
Contributor Author

Hi @collin-lee, do you have any concerns or feedback on this PR (for issue #994)? Is it good to go?

@collin-lee
Contributor

Hi @collin-lee, do you have any concerns or feedback on this PR (for issue #994)? Is it good to go?

See comments above regarding FullKey

@xuannam230201
Contributor Author

xuannam230201 commented Nov 28, 2025

See comments above regarding FullKey

@collin-lee, I don't see any comments from you regarding FullKey. Could you please let me know what your concerns are? Thanks.

image

@collin-lee
Contributor

collin-lee commented Nov 28, 2025

See comments above regarding FullKey

@collin-lee, I don't see any comments from you regarding FullKey. Could you please let me know what your concerns are? Thanks.

image

Weird... a few weeks ago I made a comment, and then the other day I wrote this. It looks like my comments were pending, so maybe that's why you don't see them.

Looks like I didn't click the option in the dropdown to "publish" them. Do you see them now?

Comment thread src/config/config_impl.go Outdated
if matchedViaWildcard {
if nextDescriptor.valueToMetric {
valueToMetricFullKey.WriteString(entry.Key)
valueToMetricFullKey.WriteString("_")
Contributor

Could entry.Value be empty here? If so, maybe add a guard:

if entry.Value != "" {
valueToMetricFullKey.WriteString("_")
valueToMetricFullKey.WriteString(entry.Value)
}
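
A self-contained sketch of the difference the guard makes, assuming the segment is assembled with a `strings.Builder` as in the snippet above. The `unguarded` and `guarded` functions are hypothetical illustrations, not the PR's code.

```go
package main

import (
	"fmt"
	"strings"
)

// unguarded always writes the separator, so an empty value leaves a
// trailing underscore in the segment.
func unguarded(key, value string) string {
	var b strings.Builder
	b.WriteString(key)
	b.WriteString("_")
	b.WriteString(value)
	return b.String()
}

// guarded writes the separator and the value only when the value is non-empty.
func guarded(key, value string) string {
	var b strings.Builder
	b.WriteString(key)
	if value != "" {
		b.WriteString("_")
		b.WriteString(value)
	}
	return b.String()
}

func main() {
	fmt.Println(unguarded("route", "")) // route_
	fmt.Println(guarded("route", ""))   // route
	fmt.Println(guarded("route", "api")) // route_api
}
```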

Contributor Author

Agreed! Let me add the guard and also unit test for this.

Contributor Author

Done!

Comment thread src/config/config_impl.go
// Recreate to ensure a clean stats struct, then set to enhanced stats
rateLimit = NewRateLimit(rateLimit.Limit.RequestsPerUnit, rateLimit.Limit.Unit, this.statsManager.NewStats(rateLimit.FullKey), rateLimit.Unlimited, rateLimit.ShadowMode, rateLimit.Name, rateLimit.Replaces, rateLimit.DetailedMetric)
rateLimit.Stats = this.statsManager.NewStats(enhancedKey)
}
Contributor

@xuannam230201

Should we add after 419:

rateLimit.FullKey = enhancedKey

FullKey is used for logging/debugging (e.g., in base_limiter.go), while Stats.Key is the actual metric key used for statistics. They should match so that logs reflect the metrics actually being tracked.

// Without the fix:
rateLimit.FullKey = "domain.route.http_method" // Old key (shown in logs)
rateLimit.Stats.Key = "domain.route_api.http_method_GET" // New key (used for metrics)
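
A minimal sketch of the proposed fix, using simplified hypothetical types (`stats`, `rateLimit`, `applyEnhancedKey`) in place of the real ones. The point is only that both fields are updated together.

```go
package main

import "fmt"

// stats and rateLimit are simplified, hypothetical stand-ins for the types
// discussed in this thread; only the two key fields are modeled.
type stats struct{ Key string }

type rateLimit struct {
	FullKey string // key shown in logs
	Stats   stats  // Stats.Key is the key used for metrics
}

// applyEnhancedKey points both the stats key and FullKey at the enhanced key,
// so log output and metric names cannot drift apart.
func applyEnhancedKey(rl *rateLimit, enhancedKey string) {
	rl.Stats = stats{Key: enhancedKey}
	rl.FullKey = enhancedKey
}

func main() {
	rl := &rateLimit{
		FullKey: "domain.route.http_method",
		Stats:   stats{Key: "domain.route.http_method"},
	}
	applyEnhancedKey(rl, "domain.route_api.http_method_GET")
	fmt.Println(rl.FullKey == rl.Stats.Key) // true
}
```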

Contributor

@xuannam230201 see this comment about adding rateLimit.FullKey = enhancedKey after

Here's a test we could add to config_test.go:

`// TestValueToMetric_FullKeyMatchesStatsKey verifies that rateLimit.FullKey always matches
// rateLimit.Stats.Key. This is important for debugging and log/metric correlation.
// FullKey is used in debug logs, while Stats.Key is used for actual metrics.
func TestValueToMetric_FullKeyMatchesStatsKey(t *testing.T) {
asrt := assert.New(t)
store := stats.NewStore(stats.NewNullSink(), false)

cfg := []config.RateLimitConfigToLoad{
	{
		Name: "inline",
		ConfigYaml: &config.YamlRoot{
			Domain: "test-domain",
			Descriptors: []config.YamlDescriptor{
				{
					Key:           "route",
					ValueToMetric: true,
					Descriptors: []config.YamlDescriptor{
						{
							Key:           "http_method",
							ValueToMetric: true,
							Descriptors: []config.YamlDescriptor{
								{
									Key: "subject_id",
									RateLimit: &config.YamlRateLimit{
										RequestsPerUnit: 60,
										Unit:            "minute",
									},
								},
							},
						},
					},
				},
			},
		},
	},
}

rlConfig := config.NewRateLimitConfigImpl(cfg, mockstats.NewMockStatManager(store), false)

// Test case 1: value_to_metric enabled - FullKey should match Stats.Key
rl := rlConfig.GetLimit(
	context.TODO(), "test-domain",
	&pb_struct.RateLimitDescriptor{
		Entries: []*pb_struct.RateLimitDescriptor_Entry{
			{Key: "route", Value: "api"},
			{Key: "http_method", Value: "GET"},
			{Key: "subject_id", Value: "user123"},
		},
	},
)
asrt.NotNil(rl)
asrt.Equal(rl.FullKey, rl.Stats.Key, "FullKey should match Stats.Key when value_to_metric is enabled")
expectedKey := "test-domain.route_api.http_method_GET.subject_id"
asrt.Equal(expectedKey, rl.FullKey)
asrt.Equal(expectedKey, rl.Stats.Key)

// Test case 2: value_to_metric disabled - FullKey should still match Stats.Key
cfgNoValueToMetric := []config.RateLimitConfigToLoad{
	{
		Name: "inline",
		ConfigYaml: &config.YamlRoot{
			Domain: "test-domain-2",
			Descriptors: []config.YamlDescriptor{
				{
					Key: "route",
					Descriptors: []config.YamlDescriptor{
						{
							Key: "http_method",
							Descriptors: []config.YamlDescriptor{
								{
									Key: "subject_id",
									RateLimit: &config.YamlRateLimit{
										RequestsPerUnit: 60,
										Unit:            "minute",
									},
								},
							},
						},
					},
				},
			},
		},
	},
}

rlConfig2 := config.NewRateLimitConfigImpl(cfgNoValueToMetric, mockstats.NewMockStatManager(store), false)
rl2 := rlConfig2.GetLimit(
	context.TODO(), "test-domain-2",
	&pb_struct.RateLimitDescriptor{
		Entries: []*pb_struct.RateLimitDescriptor_Entry{
			{Key: "route", Value: "api"},
			{Key: "http_method", Value: "GET"},
			{Key: "subject_id", Value: "user123"},
		},
	},
)
asrt.NotNil(rl2)
asrt.Equal(rl2.FullKey, rl2.Stats.Key, "FullKey should match Stats.Key even when value_to_metric is disabled")

// Test case 3: detailed_metric enabled - FullKey should match Stats.Key
cfgDetailedMetric := []config.RateLimitConfigToLoad{
	{
		Name: "inline",
		ConfigYaml: &config.YamlRoot{
			Domain: "test-domain-3",
			Descriptors: []config.YamlDescriptor{
				{
					Key: "route",
					Descriptors: []config.YamlDescriptor{
						{
							Key: "http_method",
							Descriptors: []config.YamlDescriptor{
								{
									Key:            "subject_id",
									DetailedMetric: true,
									RateLimit: &config.YamlRateLimit{
										RequestsPerUnit: 60,
										Unit:            "minute",
									},
								},
							},
						},
					},
				},
			},
		},
	},
}

rlConfig3 := config.NewRateLimitConfigImpl(cfgDetailedMetric, mockstats.NewMockStatManager(store), false)
rl3 := rlConfig3.GetLimit(
	context.TODO(), "test-domain-3",
	&pb_struct.RateLimitDescriptor{
		Entries: []*pb_struct.RateLimitDescriptor_Entry{
			{Key: "route", Value: "api"},
			{Key: "http_method", Value: "GET"},
			{Key: "subject_id", Value: "user123"},
		},
	},
)
asrt.NotNil(rl3)
asrt.Equal(rl3.FullKey, rl3.Stats.Key, "FullKey should match Stats.Key when detailed_metric is enabled")

}
`

Contributor Author

Good to know that!

However, I see that FullKey and Stats.Key currently differ for detailed_metric (because the current code doesn't set FullKey to Stats.Key in that case).

I now set FullKey to Stats.Key in both cases (detailed_metric and value_to_metric) and added unit tests for them. Is that okay?

Contributor

ok great - thanks! LGTM

Signed-off-by: Nam Dang <xuannam230201@gmail.com>
@xuannam230201
Contributor Author

@collin-lee, I updated the PR to address your comments (and also added more unit tests). Could you please take a look? Thanks in advance!

@collin-lee collin-lee merged commit 6b4f389 into envoyproxy:main Nov 29, 2025
6 checks passed
@xuannam230201
Contributor Author

xuannam230201 commented Nov 29, 2025

Hi @collin-lee, I noticed that a GitHub Action failed after this PR was merged: https://github.com/envoyproxy/ratelimit/actions/runs/19781727464/job/56682854936
Could you please take a look and let me know whether something went wrong or the run was just flaky? I'm happy to help fix it if needed.

It also appears that the workflow did not push an image with the new tag to Docker Hub. It created an image with the master tag.

@xuannam230201
Contributor Author

xuannam230201 commented Nov 29, 2025

In that GHA run, the build for the master tag succeeded, but the build for the new tag (6b4f3896) failed.

I checked the main.yaml workflow and saw that both tags (master and the new tag) use the same command to build and push the Docker image, so I think flaky tests caused this failure. @collin-lee, would it make sense to re-trigger this workflow to see whether it succeeds?

The failed test case is unrelated to this change.
image
