perf: use CPU pool to run WAND algo #5363
Conversation
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
```rust
let partition_ptr = PartitionPtr::new(self);
let (candidates, local_metrics) = spawn_cpu(move || {
    let local_metrics = LocalMetricsCollector::default();
    // SAFETY: `partition_ptr` points to `self`, which outlives this task because we await it.
    let partition = unsafe { partition_ptr.deref() };
```
Keep partition alive for spawn_cpu WAND task
The WAND search is now offloaded to `spawn_cpu` via a raw `PartitionPtr` to `self`, but the background CPU task is not tied to the async future's cancellation. If the `bm25_search` future is dropped (e.g., on request cancellation) while the index is concurrently torn down, the blocking task keeps running and dereferences a pointer to a freed `InvertedPartition`, leading to a potential use-after-free/UB. Consider holding an `Arc<InvertedPartition>` inside the closure, or otherwise ensuring the partition outlives the spawned CPU job under cancellation.
```rust
let partition_ptr = PartitionPtr::new(self);
let (candidates, local_metrics) = spawn_cpu(move || {
    let local_metrics = LocalMetricsCollector::default();
    // SAFETY: `partition_ptr` points to `self`, which outlives this task because we await it.
```
Can we avoid this? The safety invariant can't be guaranteed, since the future itself could be dropped at any time.
Could you refactor this function instead? Move the postings calculation out and await it separately, since `bm25_search` itself is just a pure blocking function.
This reduces 10%~20% cold latency for full text search.