Skip to content

Don't pass QueryMetrics down in concurrent and async QueryRunners (fixes #4279)#4288

Merged
himanshug merged 2 commits intomasterfrom
queryMetrics-concurrency-fix
May 22, 2017
Merged

Don't pass QueryMetrics down in concurrent and async QueryRunners (fixes #4279)#4288
himanshug merged 2 commits intomasterfrom
queryMetrics-concurrency-fix

Conversation

@leventov
Copy link
Copy Markdown
Member

@leventov leventov commented May 17, 2017

QueryMetrics is unsafe for concurrent use from multiple threads, so it shouldn't be passed down in concurrent and async QueryRunners. Fixes #4279.

This solution is safer than proposed in #3803 (metamx@54c2af2), but the issue raised in #3803: no difference between "sync" and "async" QueryRunners -- leaves this PR's solution fragile, because anyone who writes a new concurrent or async QueryRunner must know that he needs to call queryPlus.threadSafe().

@himanshug
Copy link
Copy Markdown
Contributor

@leventov can't we make QueryPlus threadsafe in general and not worry about what type of QueryRunner uses it?

@leventov
Copy link
Copy Markdown
Member Author

@himanshug it's not about QueryPlus per se, it's about QueryMetrics. Making QueryMetrics thread-safe is harmful, because instead of solving the problem it hides the problem. If for some query types QueryMetrics need to accumulate information from multiple threads (where parts of the query are processed/executed), this information should be passed up via Futures which already exist, and aggregated in a single thread. So far there is no such need.

@himanshug
Copy link
Copy Markdown
Contributor

@leventov i see, the confusion came from queryPlus.threadSafe() which, i thought, meant give me a thread safe version of QueryPlus instance and I thought it was equivalent of Collections.synchronizedXXX(xxx) which return a thread-safe wrapper of same thing without any change to its underlying state.

anyways, can't we require QueryMetrics instances to be thread-safe ? then "async" query runners do not need to do anything special.
or, if the worry is that there might be too much lock contention in the thread-safe instance then we might also try storing queryMetrics inside ThreadLocal reference in QueryPlus so that different threads never get same instance.

@leventov
Copy link
Copy Markdown
Member Author

@himanshug

anyways, can't we require QueryMetrics instances to be thread-safe ? then "async" query runners do not need to do anything special.
or, if the worry is that there might be too much lock contention in the thread-safe instance then we might also try storing queryMetrics inside ThreadLocal reference in QueryPlus so that different threads never get same instance.

As I explained in my previous message, making QueryMetrics thread-safe is inherently wrong thing to do. Instead of failing with ConcurrentModificationException, multiple threads will override metrics of each other. Nobody would understand what is going on and if the metrics are trustworthy or not.

Instead I suggest a structured, explicit approach.

ThreadLocal is something per-thread, QueryMetrics is something per-query, but may span multiple threads, so I don't see how it could ever be a solution.

@himanshug
Copy link
Copy Markdown
Contributor

ok, had a conversation with @leventov and concluded that following things be done.

  • rename QueryPlus.threadSafe(..) to QueryPlus.withoutThreadUnsafeState()
  • update QueryPlus.withQueryMetrics(..) doc to clarify that caller is also required to call emit() on returned instance or metrics might be lost
  • if possible/simple then enforce that async query runners always call QueryPlus.withoutThreadUnsafeState()

…ueryPlus.withQueryMetrics() Javadocs; Fix generics in MetricsEmittingQueryRunner and CpuTimeMetricQueryRunner; Make DefaultQueryMetrics to fail fast on modifications from concurrent threads
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants