Reduce subscribe worker batch size #1270
Merged
Reduce the subscribe worker's database poll batch size from 10000 to 1000 in `message.subscribe_worker`.

Update the constant that controls the subscribe worker poll size in `message.subscribe_worker` and add comments documenting the measurement context.

- Set `subscribeWorkerPollRows` to `1000` in subscribe_worker.go

📍 Where to Start
Start with the `subscribeWorkerPollRows` constant definition and related comments in subscribe_worker.go.

📊 Macroscope summarized 6379f07. 1 file reviewed, 1 issue evaluated, 1 issue filtered, 0 comments posted
🗂️ Filtered Issues
pkg/api/message/subscribe_worker.go — 0 comments posted, 1 evaluated, 1 filtered
The `subscribeWorkerPollRows` constant change (from `10000` to `1000`) has no effect because the `pollableQuery` function ignores its `numRows` parameter. In `startSubscribeWorker`, `db.NewDBSubscription` is given `PollingOptions{ NumRows: subscribeWorkerPollRows }`, and it calls the provided `pollableQuery(ctx, lastSeen, numRows)`. However, `pollableQuery` does not use `numRows` and calls `q.SelectGatewayEnvelopes(ctx, ...)` without passing or enforcing any row limit. This silently defeats the intended throughput/row cap and contradicts the newly added comments about limiting to 1000 rows. Depending on the database size, this can cause the worker to fetch arbitrarily large batches, increasing latency and memory usage and potentially delaying vector clock advancement. The externally visible contract of limiting polled rows per interval is not honored on this execution path. [Low confidence]
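The pattern this filtered issue describes can be sketched in isolation. The Go snippet below is a minimal illustration under stated assumptions, not the code from subscribe_worker.go: `envelope`, `store`, `selectAfter`, `pollIgnoringLimit`, `pollHonoringLimit`, and `newStore` are all hypothetical names, and an in-memory slice stands in for the database. It only shows why changing a poll-size constant has no effect when the query callback drops its `numRows` argument.

```go
package main

import "fmt"

// envelope is a hypothetical stand-in for one polled row.
type envelope struct{ seq int }

// store is a hypothetical in-memory table ordered by seq.
type store struct{ rows []envelope }

// newStore builds a store holding rows with seq 1..n.
func newStore(n int) *store {
	s := &store{rows: make([]envelope, 0, n)}
	for i := 1; i <= n; i++ {
		s.rows = append(s.rows, envelope{seq: i})
	}
	return s
}

// selectAfter returns rows with seq > lastSeen.
// A limit <= 0 means "no cap", mimicking a query without a row limit.
func (s *store) selectAfter(lastSeen, limit int) []envelope {
	out := []envelope{}
	for _, r := range s.rows {
		if r.seq <= lastSeen {
			continue
		}
		if limit > 0 && len(out) >= limit {
			break
		}
		out = append(out, r)
	}
	return out
}

// pollIgnoringLimit accepts numRows but never forwards it,
// which is the shape of the problem the issue describes.
func pollIgnoringLimit(s *store, lastSeen, numRows int) []envelope {
	_ = numRows // accepted, but unused
	return s.selectAfter(lastSeen, 0)
}

// pollHonoringLimit forwards numRows so the batch is actually capped.
func pollHonoringLimit(s *store, lastSeen, numRows int) []envelope {
	return s.selectAfter(lastSeen, numRows)
}

func main() {
	const pollRows = 1000 // hypothetical analogue of the poll-size constant
	s := newStore(5000)

	fmt.Println(len(pollIgnoringLimit(s, 0, pollRows))) // prints 5000
	fmt.Println(len(pollHonoringLimit(s, 0, pollRows))) // prints 1000
}
```

If the real query path behaves like `pollIgnoringLimit`, the fix would be on the query side (forwarding the limit), not in the constant. Whether that is the case here is exactly what the low-confidence issue leaves open.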