@kishorekg1999 kishorekg1999 commented Jan 6, 2026

What this PR does:
This PR fixes a critical deadlock in the query scheduler that occurs when a tenant's max_outstanding_requests_per_tenant limit is dynamically reduced via runtime configuration.

When MaxOutstandingPerTenant is reduced while a user's FIFORequestQueue is full, the getOrAddQueue method attempts to migrate requests to a smaller queue. Previously, this loop blocked indefinitely when the new queue capacity was reached, causing the scheduler to freeze.

The fix ensures the migration loop breaks when the new queue is full, effectively dropping excess requests instead of blocking.
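The non-blocking migration described above can be sketched in isolation as follows. This is a minimal illustration rather than the actual Cortex code: the `request` type and the `migrate` helper are hypothetical names, and the queue is modeled as a plain buffered channel.

```go
package main

import "fmt"

// request is a hypothetical stand-in for a queued query request.
type request struct{ id int }

// migrate moves pending requests from old into a new buffered channel with
// capacity newCap. It stops as soon as the new channel is full and reports
// how many requests were left behind (dropped), so it never blocks.
// It assumes no other goroutine is touching either channel during migration.
func migrate(old chan request, newCap int) (chan request, int) {
	newQ := make(chan request, newCap)
	for len(newQ) < newCap {
		select {
		case r := <-old:
			newQ <- r // cannot block: spare capacity was checked by the loop condition
		default:
			return newQ, 0 // old queue drained, nothing dropped
		}
	}
	// New queue is full: whatever remains in old is dropped.
	return newQ, len(old)
}

func main() {
	old := make(chan request, 5)
	for i := 0; i < 5; i++ {
		old <- request{id: i}
	}
	newQ, dropped := migrate(old, 2)
	fmt.Println(len(newQ), dropped)
}
```

A blocking variant would instead send every request from the old queue with a bare `newQ <- r`, which hangs forever once `newQ` is full and nothing is reading from it; that is the failure mode this PR describes.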

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Verification: Added regression test TestGetOrAddQueue_ShouldNotDeadlockWhenLimitsAreReduced
Signed-off-by: Kishore K G <kishorekg@google.com>
@dosubot dosubot bot added the type/bug label Jan 6, 2026
kishorekg1999 and others added 5 commits January 6, 2026 11:54
Signed-off-by: Kishore K G <kishorekg@google.com>
…exproject#7179)

* Update Ruler frontend_address comments to mention Store Gateway

Signed-off-by: Kishore K G <kishorekg@google.com>

* Update pkg/ruler/ruler.go

Co-authored-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: kishorekg1999 <kishorekg.github@gmail.com>

* Update Ruler frontend_address comments to mention Store Gateway

Signed-off-by: Kishore K G <kishorekg@google.com>

* empty commit

Signed-off-by: Kishore K G <kishorekg@google.com>

---------

Signed-off-by: Kishore K G <kishorekg@google.com>
Signed-off-by: kishorekg1999 <kishorekg.github@gmail.com>
Co-authored-by: SungJin1212 <tjdwls1201@gmail.com>
* Rename user index update config

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>

* fix lint

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>

---------

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
)

* instrument ingester query path with pprof user label

Signed-off-by: yeya24 <benye@amazon.com>

* update changelog

Signed-off-by: yeya24 <benye@amazon.com>

---------

Signed-off-by: yeya24 <benye@amazon.com>
…d got logged and add performance logs for sharded block populator (cortexproject#7181)

Signed-off-by: Alex Le <leqiyue@amazon.com>