Add an option to enable bitmap in IncrementalIndex during real time ingestion #11301

@wjhypo

Description

Add an option to enable bitmap indexes in IncrementalIndex during real-time ingestion, improving query efficiency by eliminating the CPU wasted on full scans.

Motivation

When real-time data from Kafka is ingested through peon tasks, it is first stored in an in-memory data structure, a map. When a query comes in, every row in the map is iterated to check whether it matches the filter. This is inefficient: in many cases only a small percentage of the rows qualify for final aggregation, so the full scan wastes CPU. Segments created through batch ingestion, by contrast, have bitmap indexes that can pinpoint the candidate rows for the query's filter, which is much more efficient. Currently, to meet the SLA (high QPS, low latency) of some of our real-time use cases, we have to provision many peon tasks and replicas to distribute the load, which is quite costly. If we could optionally enable bitmaps while data is still in memory, it would improve query performance and lower the infrastructure cost needed to support the same or a better SLA. In our production use case, in-memory bitmaps let us use 30% of the previous capacity to support the same QPS and latency.
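
To illustrate the difference between the two query paths, here is a minimal toy sketch. It is not Druid's IncrementalIndex code: the class `InMemoryBitmapSketch` and its methods are hypothetical names, `java.util.BitSet` stands in for whatever bitmap implementation would actually be used, and concurrency between ingestion and queries is ignored.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the actual IncrementalIndex.
class InMemoryBitmapSketch {
    // Rows in ingestion order; each row maps dimension name -> value.
    private final List<Map<String, String>> rows = new ArrayList<>();
    // dimension -> value -> bitmap of row offsets holding that value.
    private final Map<String, Map<String, BitSet>> bitmaps = new HashMap<>();

    // Ingest one row given as alternating dimension/value strings,
    // updating the per-value bitmaps as the row is appended.
    void addRow(String... dimValuePairs) {
        Map<String, String> row = new HashMap<>();
        for (int i = 0; i + 1 < dimValuePairs.length; i += 2) {
            row.put(dimValuePairs[i], dimValuePairs[i + 1]);
        }
        int offset = rows.size();
        rows.add(row);
        for (Map.Entry<String, String> e : row.entrySet()) {
            bitmaps.computeIfAbsent(e.getKey(), k -> new HashMap<>())
                   .computeIfAbsent(e.getValue(), v -> new BitSet())
                   .set(offset);
        }
    }

    // Full scan: every row is checked against the filter.
    List<Integer> scan(String dim, String value) {
        List<Integer> matches = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            if (value.equals(rows.get(i).get(dim))) {
                matches.add(i);
            }
        }
        return matches;
    }

    // Bitmap lookup: only the offsets set in the bitmap are visited.
    List<Integer> lookup(String dim, String value) {
        List<Integer> matches = new ArrayList<>();
        Map<String, BitSet> byValue = bitmaps.get(dim);
        BitSet bits = (byValue == null) ? null : byValue.get(value);
        if (bits == null) {
            return matches;
        }
        for (int i = bits.nextSetBit(0); i >= 0; i = bits.nextSetBit(i + 1)) {
            matches.add(i);
        }
        return matches;
    }
}
```

Both methods return the same row offsets. The difference is that `scan` touches every ingested row, while `lookup` touches only the rows that match, at the cost of extra memory and extra work on each ingested row.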
