Prevents single split searches from different leaf_search from interleaving#5509
Prevents single split searches from different leaf_search from interleaving#5509fulmicoton merged 2 commits intomainfrom
Conversation
5567ee9 to
e708950
Compare
interleaving.
Before this PR, we just used a semaphore to acquire a permit and
start a new tokio task to run our single split search.
In pseudo code, a leaf_search would look like:
```
for split in splits {
let permit = semaphore.acquire().await;
tokio::spawn(async move {
single_split_search(split);
drop(permit)
})
}
```
The problem with this is that it allows interleaving split search from
one search request with another one.
This interleaving strongly impacts search latency.
For instance, one can imagine two queries A and B with 3 splits each.
Executing as follows
| A | A | A | B | B | B |
offers a much short latency for A than
| A | B | B | A | B | A |
This PR also adds two metrics to monitor the number of queue single
split search.
e708950 to
61483b9
Compare
trinity-1686a
left a comment
There was a problem hiding this comment.
I think the changes make sense, but some struct/params names make the code more confusing than it could be
| fn get_permits( | ||
| &mut self, | ||
| num_permits: usize, | ||
| inner: &Arc<Mutex<InnerSearchPermits>>, |
There was a problem hiding this comment.
i think inner is a very confusing name, the InnerSearchPermits is already "the inner thing", so an Arc<Mutex<_>> should outer, or a handle
There was a problem hiding this comment.
I don't quite get the confusion/comment.
The "outter" is SearchPermits.
I have not found a common practise on how to name the object of this pattern in Rust (in C++ pimpl is common, even if this is not the common usage of the pimpl pattern).
What would be a better name? would inner_arc help a bit?
| /// | ||
| /// Requests are served in order. | ||
| #[derive(Clone)] | ||
| pub struct SearchPermits { |
There was a problem hiding this comment.
this is a misnomer, it sounds like this is a akin to a tokio SemaphorePermit with num_permits > 1, but this is actually a semaphore, or a PermitProvider, or something of the like
There was a problem hiding this comment.
Yeah you are right.
|
|
||
| pub struct SearchPermit { | ||
| _ongoing_gauge_guard: GaugeGuard<'static>, | ||
| inner_opt: Option<Weak<Mutex<InnerSearchPermits>>>, |
There was a problem hiding this comment.
I don't believe there is need for Weak here: InnerSearchPermits doesn't hold any SearchPermit (a oneshot sender doesn't hold the thing to transmit, the oneshot receiver do)
i find this type, Option<Weak<Mutex<InnerSearchPermits>>> rather confusing, in part because SearchPermits isn't the right name imo, in part because the Option<> very much looks like a way to store a bool alongside the rest of the structure to reduce the size of the struct by one usize. I'm not against using an Option that way, but if that's the reason, it ought to be written down, and if it isn't the goal, then i'm missing what the goal of this Option is, rather than a disable_drop: bool
There was a problem hiding this comment.
I prefer the Option. Setting it to None seems easier to "proofread". I agree that it makes the code less readable however, as the intent is way less clearer than with the boolean solution.
I'll add some comments.
There was a problem hiding this comment.
you ended up changing it to a bool. If you think an Option is better, you can put it back, just add a 1-2 line comment on the purpose of None
There was a problem hiding this comment.
Yeah, I thought after packing everything as drop_without_recycling_permit it was fine like that.
ded67a9 to
a3d81f0
Compare
a3d81f0 to
e028746
Compare
Before this PR, we just used a semaphore to acquire a permit and start a new tokio task to run our single split search.
In pseudo code, a leaf_search would look like:
The problem with this is that it allows interleaving split search from one search request with another one.
This interleaving strongly impacts search latency. For instance, one can imagine two queries A and B with 3 splits each. Executing as follows
| A | A | A | B | B | B |
offers a much short latency for A than
| A | B | B | A | B | A |
This PR also adds two metrics to monitor the number of queue single split search.