[Proposal] Adding double precision metrics. #4449

@b-slim

Proposal Adding Double precision metrics.

Currently Druid stores double aggregated values as floats, which leads to a loss of precision and to unexpected result sets. This is most visible for Min/Max, where you would expect to see the same precision as the actual indexed data.
The goal of this work is to add support for columns that are actually indexed as doubles.
There are several ways to get there, and I would like to poll the community about which way to go.
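To make the Min/Max problem concrete, here is a minimal sketch in plain Java. It does not use any Druid classes; `FloatPrecisionDemo` and its two methods are hypothetical names that only imitate an aggregator keeping its running state as a float versus as a double.

```java
public class FloatPrecisionDemo {
    // Hypothetical stand-in for a max aggregator whose state is a 32-bit float.
    static double maxStoredAsFloat(double[] values) {
        float max = Float.NEGATIVE_INFINITY;
        for (double v : values) {
            // The narrowing cast is where precision is lost.
            max = Math.max(max, (float) v);
        }
        return max;
    }

    // The same aggregation with 64-bit state keeps the indexed value intact.
    static double maxStoredAsDouble(double[] values) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            max = Math.max(max, v);
        }
        return max;
    }

    public static void main(String[] args) {
        // 16777217 = 2^24 + 1 is exactly representable as a double but not as a float.
        double[] indexed = {16777217.0, 3.0, 12.5};
        System.out.println(maxStoredAsFloat(indexed) == 16777216.0);  // float state rounds to 2^24
        System.out.println(maxStoredAsDouble(indexed) == 16777217.0); // double state is exact
    }
}
```

The max returned by the float-backed version is a value that was never indexed, which is the surprise described above.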

Option 1

Make the double aggregators store numbers as doubles, period!

  • This might be radical and surprise Druid users, since the storage footprint of double columns will double.
  • Less code duplication (we will still need to keep the float ser/deser code to ensure backward compatibility).

Option 1+

If the storage footprint is a deal breaker, do the same as option 1 and add a new suite of FloatsXXXAggregators.

  • Makes it more flexible if storage is an issue.
  • A lot of code duplication.

Option 2

Instead of modifying the existing Double aggregators, create a new set of aggregators called DoubleSum64, etc.

  • A lot of code duplication.
  • Somewhat confusing and ugly to have both DoubleSum64 and DoubleSum, etc.
  • Probably the most transparent for current Druid users.

Option 3

Add a flag to indexSpec, i.e. "StoreDoubleAsFloat", and set it to true/false.

  • Less code duplication than 1.1 and 1.2.
  • Needs a new precision flag within the column to find out how to deserialize it (or maybe not; I haven't fleshed this out yet, but it is doable).
  • Same as option 2: I don't see why you would index double numbers and then store them as floats.
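One way the per-column precision flag could work is sketched below. This is only an illustration, not Druid's actual column format: the class `PrecisionFlagColumn`, the one-byte header, and the `FLOAT_32`/`DOUBLE_64` constants are all assumptions made for the example. Only the `storeDoubleAsFloat` setting comes from the proposal.

```java
import java.nio.ByteBuffer;

public class PrecisionFlagColumn {
    // Hypothetical header values; the real on-disk format is not specified here.
    static final byte FLOAT_32 = 0;
    static final byte DOUBLE_64 = 1;

    // Writes: [1-byte precision flag][4-byte count][values at 4 or 8 bytes each].
    static ByteBuffer write(double[] values, boolean storeDoubleAsFloat) {
        int width = storeDoubleAsFloat ? Float.BYTES : Double.BYTES;
        ByteBuffer buf = ByteBuffer.allocate(1 + Integer.BYTES + values.length * width);
        buf.put(storeDoubleAsFloat ? FLOAT_32 : DOUBLE_64);
        buf.putInt(values.length);
        for (double v : values) {
            if (storeDoubleAsFloat) {
                buf.putFloat((float) v); // lossy, but half the size
            } else {
                buf.putDouble(v);
            }
        }
        buf.flip();
        return buf;
    }

    // The reader picks the deserialization path from the flag, so old
    // float-encoded columns and new double-encoded ones can coexist.
    static double[] read(ByteBuffer buf) {
        byte flag = buf.get();
        int count = buf.getInt();
        double[] out = new double[count];
        for (int i = 0; i < count; i++) {
            out[i] = (flag == FLOAT_32) ? buf.getFloat() : buf.getDouble();
        }
        return out;
    }
}
```

With this layout the footprint trade-off from option 1 is visible directly: the value payload is 4 bytes per row with the flag set to true and 8 bytes per row otherwise, while the reader stays a single code path.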
