feat(provider): add External Metrics provider#1863
Open
jlore-decathlon wants to merge 1 commit intofluxcd:mainfrom
85c595d to
139a34a
Compare
3757b5a to
72ad54a
Compare
72ad54a to
9925699
Compare
mveroone
reviewed
Dec 1, 2025
mveroone
reviewed
Dec 1, 2025
9925699 to
2ee47e0
Compare
eeeccfc to
86cc361
Compare
aryan9600
reviewed
Jan 2, 2026
Member
aryan9600
left a comment
thank you for opening this PR!
eb8b59f to
2e0a69c
Compare
Note: if that's okay, we'll squash the commits after a few rounds of review so we can fix the DCO.
aryan9600
reviewed
Mar 2, 2026
aryan9600
approved these changes
Mar 19, 2026
Member
aryan9600
left a comment
lgtm! 🎖️
please squash this into 1-2 commits and sign them off, thanks
9695948 to
6425301
Compare
@aryan9600 Done!
Member
ci is failing because of unformatted code - could you run
The Datadog provider often hits API rate limits on larger installations. The Datadog Cluster Agent can batch metric queries and expose them through an endpoint compatible with the Kubernetes External Metrics API. This implementation allows using that endpoint, or any other server implementing the Kubernetes External Metrics API, including the k8s API server itself.

Co-authored-by: Johan Lore <johan.lore@decathlon.com>
Co-authored-by: Maxime Véroone <maxime.veroone@decathlon.com>
Signed-off-by: Johan Lore <johan.lore@decathlon.com>
Signed-off-by: Maxime Véroone <maxime.veroone@decathlon.com>
Signed-off-by: Johan Lore <johan.lore@decathlon.com>
6425301 to
9606c94
Compare
Author
@aryan9600 Done
Proposed addition
The current Datadog metric provider relies on Datadog's Metric API.
However, this API has fairly low rate limits, and people with a moderately sized infrastructure tend to reach them quite easily when scaling their usage of Flagger or of Datadog-based autoscaling (like KEDA).
Datadog offers a more scalable alternative: its Cluster Agent batches requests in groups of 35 (see Cluster Agent Autoscaling Metrics). It then makes these metrics available within the cluster by exposing an endpoint that follows the Kubernetes External Metrics API.
Note
This endpoint is not documented by Datadog, as they expect people to have the agent register against the control plane as the cluster's external metrics provider and then make these metrics available through the k8s API server, removing the need to query the endpoint directly.
However, because it implements a Kubernetes API, its behavior is predictable and stable enough to be used directly.
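For readers unfamiliar with the External Metrics API, here is a minimal Go sketch of what a request URL looks like. The `external.metrics.k8s.io/v1beta1` path layout and the `labelSelector` query parameter come from the Kubernetes External Metrics API; the helper name `externalMetricURL`, the host name and the metric name are hypothetical and only for illustration. This is not the code from this PR.

```go
package main

import (
	"fmt"
	"net/url"
)

// externalMetricURL builds the URL used to read one external metric.
// base is the address of any server implementing the External Metrics API
// (the host used in main below is a made-up placeholder).
func externalMetricURL(base, namespace, metric, labelSelector string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	// Path layout defined by the Kubernetes External Metrics API.
	u.Path = "/apis/external.metrics.k8s.io/v1beta1/namespaces/" + namespace + "/" + metric

	// The label selector is optional and passed as a query parameter.
	q := url.Values{}
	if labelSelector != "" {
		q.Set("labelSelector", labelSelector)
	}
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	u, err := externalMetricURL("https://cluster-agent.example:8443", "default", "my-metric", "app=podinfo")
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```

The same URL shape works whether the server is the Cluster Agent, another external metrics server, or the k8s API server acting as an aggregator.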
During design and implementation, we relied on the way KEDA implemented a similar feature. However, Flagger is not an autoscaling solution, so we're not going to mimic the metric proxy KEDA operates; we simply propose to query the external metrics server directly. In doing so, we also chose to make the provider generic and compatible with any external metrics server. The downside is that we cannot abstract away the way Datadog names its metrics, which isn't trivial.
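To illustrate what "querying the external metrics server directly" involves on the response side, here is a rough Go sketch that decodes an `ExternalMetricValueList` payload and extracts the first value. The JSON field names (`items`, `metricName`, `metricLabels`, `timestamp`, `value`) follow the External Metrics API types; the helper names and the quantity parsing are assumptions made for this sketch. In particular, `parseSimpleQuantity` only handles plain decimals and the milli (`m`) suffix, whereas real code would more likely use `resource.ParseQuantity` from `k8s.io/apimachinery`. This is not the code from this PR.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// externalMetricValue mirrors the subset of the API type needed here.
// Value is kept as a string because quantities are serialized as strings.
type externalMetricValue struct {
	MetricName   string            `json:"metricName"`
	MetricLabels map[string]string `json:"metricLabels"`
	Timestamp    string            `json:"timestamp"`
	Value        string            `json:"value"`
}

type externalMetricValueList struct {
	Items []externalMetricValue `json:"items"`
}

// parseSimpleQuantity is a deliberately simplified stand-in for a real
// Kubernetes quantity parser: it accepts "42", "0.5" or "1500m" only.
func parseSimpleQuantity(s string) (float64, error) {
	if strings.HasSuffix(s, "m") {
		n, err := strconv.ParseFloat(strings.TrimSuffix(s, "m"), 64)
		if err != nil {
			return 0, err
		}
		return n / 1000, nil
	}
	return strconv.ParseFloat(s, 64)
}

// firstValue decodes the response body and returns the first item's value.
func firstValue(body []byte) (float64, error) {
	var list externalMetricValueList
	if err := json.Unmarshal(body, &list); err != nil {
		return 0, err
	}
	if len(list.Items) == 0 {
		return 0, errors.New("external metrics response contains no values")
	}
	return parseSimpleQuantity(list.Items[0].Value)
}

func main() {
	// Hand-written sample payload, not captured from a real server.
	body := []byte(`{"kind":"ExternalMetricValueList","items":[{"metricName":"my-metric","metricLabels":{"app":"podinfo"},"timestamp":"2026-01-01T00:00:00Z","value":"1500m"}]}`)
	v, err := firstValue(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(v)
}
```

Treating an empty `items` list as an error is a choice made for this sketch; a provider could equally report it as "no data" and let the analysis decide how to handle it.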
fix: #1235
Any alternatives you've considered?
We considered modifying the Datadog metric provider instead of adding an external metrics provider, but felt that a separate provider had the benefit of making other external metric providers compatible and keeping the code Datadog-agnostic.
We could theoretically make it even more generic and use any Kubernetes metrics API (standard, Custom or External), but I think Flagger already offers this.
Disclaimer