
Block store: S3 uploads don’t scale linearly with number of tenants #2757

@harry671003


Issue

Cortex, with the block store backend, distributes one tenant’s time series across all ingesters. With the current architecture, the number of S3 uploads grows faster than linearly with the number of tenants. Right now, it is not cost-efficient to deploy large Cortex clusters with the block store backend due to high S3 costs.

This becomes more evident when comparing two Cortex clusters with different numbers of tenants, as shown below.

To simplify our calculations, let’s assume the following:

  • All tenants have 100K time-series each.
  • An ingester can hold a maximum of 1 million time-series.
  • Cortex replication factor is 3.

Cluster with 100 tenants

In a cluster with 100 tenants, we’ll have about RF x 100 x 100K = 30 million time series. At 1 million time series per ingester, it will require about 30 ingesters to store the time series from 100 tenants. Given that time series from a tenant are distributed uniformly across all ingesters, every ingester ends up with one TSDB for every tenant.

Total number of uploads per TSDB block range period would be:

  • Number of ingesters x Number of tenants
  • 30 * 100 = 3,000 uploads (assuming only one upload per TSDB).

Cluster with 1000 tenants

The total number of time series in this case would be RF x 1000 x 100K = 300 million, which requires about 300 ingesters.

Total number of uploads per TSDB block range period would be:

  • Number of ingesters x Number of tenants
  • 300 * 1000 = 300,000 uploads (assuming only one upload per TSDB).

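Here, the issue is that the number of S3 uploads grows in proportion to the square of the number of tenants: 10x more tenants need 10x more ingesters, and every ingester uploads one block per tenant, so uploads grow 100x. This also affects S3 requests from the compactor, store-gateway and querier.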
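The arithmetic above can be reproduced with a small sketch. It only encodes the simplifying assumptions stated earlier (100K series per tenant, 1M series per ingester, RF of 3, one upload per TSDB per block range period); the function and constant names are made up for illustration and are not part of Cortex.

```python
# Simplifying assumptions from this issue, not Cortex defaults.
SERIES_PER_TENANT = 100_000
MAX_SERIES_PER_INGESTER = 1_000_000
REPLICATION_FACTOR = 3


def ingesters_needed(num_tenants):
    """Ingesters required to hold all replicated series (rounded up)."""
    total_series = REPLICATION_FACTOR * num_tenants * SERIES_PER_TENANT
    # Integer ceiling division.
    return -(-total_series // MAX_SERIES_PER_INGESTER)


def uploads_per_block_range(num_tenants):
    """Uploads per TSDB block range period when every ingester holds
    one TSDB per tenant, assuming one upload per TSDB."""
    return ingesters_needed(num_tenants) * num_tenants


for tenants in (100, 1000):
    print(tenants, ingesters_needed(tenants), uploads_per_block_range(tenants))
```

Under these assumptions the upload count works out to 0.3 x tenants^2, which is why a 10x increase in tenants gives a 100x increase in uploads.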

Possible solutions

Shuffle sharding

One potential solution is to use shuffle sharding and limit the number of ingesters to which one tenant’s time series is distributed. This also has the added benefit of reducing the blast radius of one tenant behaving badly. See: https://aws.amazon.com/blogs/architecture/shuffle-sharding-massive-and-magical-fault-isolation/
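To get a rough feel for the effect, here is a minimal sketch of how uploads could scale if each tenant were limited to a fixed-size shard of ingesters. This is not how Cortex implements (or would implement) shuffle sharding: the shard size of 3, the seeded-random shard selection, and all function names are assumptions made for illustration only.

```python
import random


def tenant_shard(tenant_id, num_ingesters, shard_size):
    """Pick a deterministic pseudo-random subset of ingesters for a tenant.

    Hypothetical selection scheme: seed a PRNG with the tenant ID so the
    same tenant always maps to the same ingesters.
    """
    rng = random.Random(tenant_id)
    size = min(shard_size, num_ingesters)
    return sorted(rng.sample(range(num_ingesters), size))


def uploads_with_shuffle_sharding(num_tenants, num_ingesters, shard_size):
    """Count (ingester, tenant) TSDBs, assuming one upload per TSDB
    per block range period."""
    tsdbs = set()
    for t in range(num_tenants):
        for ingester in tenant_shard(f"tenant-{t}", num_ingesters, shard_size):
            tsdbs.add((ingester, t))
    return len(tsdbs)


# Shard size 3 is an assumed value (it must be at least the replication factor).
print(uploads_with_shuffle_sharding(100, 30, 3))
print(uploads_with_shuffle_sharding(1000, 300, 3))
```

With a fixed shard size, the number of uploads becomes tenants x shard size, so it grows linearly with the number of tenants instead of quadratically. The sketch ignores load balancing across ingesters, which a real implementation would need to consider.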
