Skip to content

encode: prevent appending .0 to scientific notation to le bucket labels#261

Open
Pigueiras wants to merge 1 commit intofluent:masterfrom
Pigueiras:fix11651
Open

encode: prevent appending .0 to scientific notation to le bucket labels#261
Pigueiras wants to merge 1 commit intofluent:masterfrom
Pigueiras:fix11651

Conversation

@Pigueiras
Copy link
Copy Markdown

bucket_value_to_string used %g to format histogram bucket boundaries, which switches to scientific notation for values >= 1e6. The subsequent check for a missing decimal point would then append ".0", producing malformed le labels like "1e+06.0" that Prometheus rejects with:

bucket label "le" is missing or has a malformed value of "1e+08.0"

Skip the ".0" suffix when the string already contains 'e', since scientific notation is valid in Prometheus le labels without it.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes Prometheus exposition encoding for histogram bucket (le) label values by preventing the encoder from producing malformed scientific-notation strings (e.g., 1e+06.0) that Prometheus rejects.

Changes:

  • Update bucket_value_to_string() to append .0 only when the formatted value is neither decimal nor scientific notation.
  • Add clarifying comment documenting the .0-append rule.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

* Append .0 only when there is no decimal point and the number
* is not in scientific notation.
*/
if (!strchr(str, '.') && !strchr(str, 'e')) {
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check for scientific notation only looks for lowercase 'e'. While %g typically uses 'e', this becomes fragile if the formatting ever changes (e.g., %G) or if a platform emits an uppercase exponent. Consider using a character-class check (e.g., treating either 'e' or 'E' as scientific notation) so .0 is never appended to exponent-form values.

Suggested change
if (!strchr(str, '.') && !strchr(str, 'e')) {
if (!strchr(str, '.') && !strchr(str, 'e') && !strchr(str, 'E')) {

Copilot uses AI. Check for mistakes.
Comment on lines 373 to 382
len = snprintf(str, 64, "%g", val);
cfl_sds_len_set(str, len);

if (!strchr(str, '.')) {
/*
* Append .0 only when there is no decimal point and the number
* is not in scientific notation.
*/
if (!strchr(str, '.') && !strchr(str, 'e')) {
cfl_sds_cat_safe(&str, ".0", 2);
}
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bucket_value_to_string() will still append .0 to non-finite printf outputs like inf/nan (which contain neither . nor e), producing values like inf.0/nan.0 that are not parseable as floats in Prometheus label values. Since histogram buckets can be user-provided without an isfinite validation, consider explicitly handling non-finite val (e.g., reject, or map infinities to +Inf/-Inf, and avoid appending .0 for nan/inf).

Copilot uses AI. Check for mistakes.
Comment on lines +376 to 381
/*
* Append .0 only when there is no decimal point and the number
* is not in scientific notation.
*/
if (!strchr(str, '.') && !strchr(str, 'e')) {
cfl_sds_cat_safe(&str, ".0", 2);
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change fixes a Prometheus parsing failure for large bucket boundaries, but there doesn’t appear to be a regression test covering scientific-notation le (e.g., a histogram bucket at >= 1e6 that should encode as 1e+06 and remain parseable). Adding a test that encodes and then decodes Prometheus text (or validates the encoded string) would prevent this from reoccurring.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

@cosmo0920 cosmo0920 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding unit test to confirm this fix?
It's not mandatory for every change but we encourage to write it.

`bucket_value_to_string` used %g to format histogram bucket boundaries,
which switches to scientific notation for values >= 1e6. The subsequent
check for a missing decimal point would then append ".0", producing
malformed `le` labels like "1e+06.0" that Prometheus rejects with:

```
bucket label "le" is missing or has a malformed value of "1e+08.0"
```

Skip the ".0" suffix when the string already contains 'e', since
scientific notation is valid in Prometheus `le` labels without it.

Signed-off-by: Luis Pigueiras <luis.pigueiras@cern.ch>
@Pigueiras
Copy link
Copy Markdown
Author

How about adding unit test to confirm this fix? It's not mandatory for every change but we encourage to write it.

I've added one but tbh I'm not sure if it's the correct place for it, if you think this should go somewhere else I would appreciate some guidance on it 😄

@edsiper
Copy link
Copy Markdown
Member

edsiper commented Apr 8, 2026

The fix looks good and the new test seems fine where it is. One remaining edge case: bucket_value_to_string() still appends .0 to %g outputs like inf/nan, and histogram buckets do not seem to reject non-finite bounds, so this can still generate invalid le labels.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants