feat(seqdb): support configuration via file#22
Conversation
forshev
left a comment
There was a problem hiding this comment.
left some comments in the previous PR
40f68d8 to
96e72fd
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #22 +/- ##
==========================================
- Coverage 73.57% 73.43% -0.14%
==========================================
Files 192 192
Lines 15873 15911 +38
==========================================
+ Hits 11678 11684 +6
- Misses 3607 3639 +32
Partials 588 588 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 13
🔭 Outside diff range comments (3)
storeapi/docs_stream.go (1)
93-105:calcChunkSizecan return 0 → infinite loop.If
avgDocSizeexceedsconfig.MaxFetchSizeBytes, integer division yields0.
batchLoaderthen usesmin(len(d.ids), chunkSize); withchunkSize == 0,lbecomes 0 and the for-loop never drainsd.ids, spinning forever.- newChunkSize := int(config.MaxFetchSizeBytes) / avgDocSize + newChunkSize := int(config.MaxFetchSizeBytes) / avgDocSize + if newChunkSize <= 0 { + newChunkSize = 1 // guarantee forward progress + }Also worth adding a hard upper bound to avoid unbounded memory on very small documents.
parser/seqql_filter_test.go (1)
232-248: Global mutable config causes data-race with parallel tests.
config.CaseSensitiveis mutated in this test while many other tests in the same package callt.Parallel().
This is a classic race – e.g.TestSeqQLAllmay be evaluating queries concurrently while the flag flips.Consider:
- Refactoring
ParseSeqQLto accept acaseSensitiveflag instead of relying on a global; or- Serialising the entire package test run (
t.Parallel()only in subtests) and guarding the variable with a mutex.At the very least, add
t.Parallel()after taking a local copy anddeferrestoring it to minimise exposure.tests/setup/env.go (1)
272-274:range cfg.IngestorCountis invalid – compilation will fail
rangecannot iterate over anint. Use an indexedforloop instead.-for i := range cfg.IngestorCount { +for i := 0; i < cfg.IngestorCount; i++ {
🧹 Nitpick comments (10)
storeapi/grpc_search.go (1)
207-214: Minor: expose default via config struct instead of global var.
config.UseSeqQLByDefaultis still read as a package-level global. For symmetry with the new file-based configuration, consider wiring this flag through the loaded config struct (g.config), avoiding hidden globals.frac/compress.go (1)
68-69: Behaviour change: 1 MB → 1 MiB
consts.MB(decimal 1 000 000) has been replaced withunits.MiB(binary 1 048 576). The initial buffer can now grow by ~4.9 %.
If this was unintentional (e.g. tuning for mmap / page alignment) please double-check.
Otherwise, add a short comment documenting the rationale so the next reader doesn’t think it’s a typo.- const maxInitDocBlockSize = int(units.MiB) + // use MiB (binary) for alignment with units package + const maxInitDocBlockSize = int(units.MiB)parser/process_test.go (1)
273-285: Duplicate sub-test namesBoth entries in
testsslice are calledcase_0. Duplicate names make it hard to spot which case failed in CI output.proxyapi/grpc_fetch_test.go (1)
135-136: Assertion now coupled to configuration defaultUsing
config.MaxRequestedDocumentsinside test data is fine, but if the default ever changes this test silently changes semantics (the+1may no longer exceed the limit). Prefer capturing the current value into a locallimit := config.MaxRequestedDocumentsonce and add a comment.proxyapi/grpc_export.go (1)
31-34: Prefer using a single,int64-typed constant to drop the cast
MaxRequestedDocumentslives in aconfigpackage yet still uses the defaultinttype, forcing a cast here.
Changing it (and its usages) toint64would remove theint64()noise and avoid any accidental narrowing if the limit ever exceedsmath.MaxInt.No functional issue, just a small type-safety win.
storeapi/grpc_server.go (1)
46-48: Duplicate magic literal – centralise with the client side
initServerhard-codes the same256 * MiBlimit that the proxy client sets. Consider re-using a sharedgrpcMaxMsgSizeconstant (see comment inproxy/bulk/seqdb_client.go) to guarantee symmetry and simplify future tuning.config.example.yaml (1)
1-21: Well-structured configuration file with clear development focus.The YAML structure is logical and the explicit warning about production use is good. However, verify that the configuration values are appropriate for the intended use case:
- Port 9200 for debug service might conflict with Elasticsearch default
/tmp/seq-dbmay not persist across reboots- 512MiB total storage size may be limiting for some development scenarios
tests/setup/env.go (1)
84-86: Minor constant-expression clean-up (optional)
1 * uint64(units.GiB)andint(units.MiB) * 4carry redundant multiplications / casts.- FracSize: 256 * uint64(units.MiB), - TotalSize: 1 * uint64(units.GiB), + FracSize: uint64(256 * units.MiB), + TotalSize: uint64(units.GiB), … - DocBlockSize: int(units.MiB) * 4, + DocBlockSize: 4 * int(units.MiB),Also applies to: 93-94
config/config.go (1)
208-208: Fix field name inconsistencyThe field is named
ESVersionbut the comment refers to it asEsVersion.- // EsVersion is the default version that will be returned in the `/` handler. + // ESVersion is the default version that will be returned in the `/` handler.cmd/seq-db/seq-db.go (1)
150-150: Remove duplicate log statementThe "max queries per second" is logged twice - once in main() at line 96 and again here in startProxy().
- logger.Info("max queries per second", zap.Float64("limit", cfg.Limits.QueryRate))
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (39)
Makefile(1 hunks)cache/cache_test.go(3 hunks)cmd/index_analyzer/main.go(2 hunks)cmd/seq-db/flags.go(1 hunks)cmd/seq-db/seq-db.go(5 hunks)cmd/stress-search/main.go(2 hunks)config.example.yaml(1 hunks)config/config.go(1 hunks)config/limits.go(2 hunks)config/shared.go(2 hunks)consts/consts.go(1 hunks)frac/active.go(3 hunks)frac/active_sealer.go(3 hunks)frac/compress.go(2 hunks)frac/info.go(1 hunks)frac/sealed_ids.go(1 hunks)frac/sealed_index.go(1 hunks)frac/unpack_cache.go(1 hunks)fracmanager/fracmanager.go(3 hunks)fracmanager/sealer_test.go(3 hunks)go.mod(2 hunks)logger/logger.go(0 hunks)parser/process_test.go(3 hunks)parser/seqql_filter.go(2 hunks)parser/seqql_filter_test.go(2 hunks)parser/token_parser.go(2 hunks)proxy/bulk/ingestor_test.go(6 hunks)proxy/bulk/seqdb_client.go(2 hunks)proxy/search/ingestor.go(3 hunks)proxyapi/grpc_export.go(2 hunks)proxyapi/grpc_export_test.go(2 hunks)proxyapi/grpc_fetch.go(2 hunks)proxyapi/grpc_fetch_test.go(2 hunks)proxyapi/http_bulk_test.go(2 hunks)storeapi/docs_stream.go(2 hunks)storeapi/grpc_search.go(2 hunks)storeapi/grpc_server.go(2 hunks)storeapi/grpc_v1.go(2 hunks)tests/setup/env.go(3 hunks)
💤 Files with no reviewable changes (1)
- logger/logger.go
🧰 Additional context used
🧬 Code Graph Analysis (23)
parser/process_test.go (1)
config/shared.go (1)
CaseSensitive(10-10)
storeapi/docs_stream.go (1)
config/shared.go (1)
MaxFetchSizeBytes(13-13)
proxyapi/http_bulk_test.go (1)
proxyapi/http_bulk.go (1)
NewBulkHandler(65-70)
proxyapi/grpc_export_test.go (1)
config/shared.go (1)
MaxRequestedDocuments(15-15)
frac/sealed_ids.go (1)
consts/consts.go (1)
IDsPerBlock(16-16)
parser/seqql_filter.go (1)
config/shared.go (1)
CaseSensitive(10-10)
parser/token_parser.go (1)
config/shared.go (1)
CaseSensitive(10-10)
proxyapi/grpc_fetch_test.go (1)
config/shared.go (1)
MaxRequestedDocuments(15-15)
proxyapi/grpc_export.go (1)
config/shared.go (1)
MaxRequestedDocuments(15-15)
proxyapi/grpc_fetch.go (1)
config/shared.go (1)
MaxRequestedDocuments(15-15)
parser/seqql_filter_test.go (1)
config/shared.go (1)
CaseSensitive(10-10)
storeapi/grpc_search.go (1)
config/shared.go (1)
UseSeqQLByDefault(17-17)
frac/unpack_cache.go (2)
consts/consts.go (1)
IDsPerBlock(16-16)frac/info.go (1)
BinaryDataVersion(23-23)
cmd/index_analyzer/main.go (1)
fracmanager/cache_maintainer.go (1)
NewCacheMaintainer(122-130)
config/limits.go (1)
config/shared.go (3)
IndexWorkers(6-6)FetchWorkers(7-7)ReaderWorkers(8-8)
proxy/bulk/ingestor_test.go (2)
seq/mapping.go (1)
NewSingleType(218-223)seq/tokenizer.go (2)
TokenizerTypeKeyword(24-24)TokenizerTypeText(25-25)
frac/sealed_index.go (2)
seq/seq.go (1)
LID(19-19)consts/consts.go (1)
IDsPerBlock(16-16)
cmd/stress-search/main.go (1)
pkg/storeapi/store_api_vtproto.pb.go (1)
StoreApiClient(1262-1269)
frac/active_sealer.go (2)
frac/active_token_list.go (1)
TokenList(60-74)consts/consts.go (1)
LIDBlockCap(17-17)
frac/active.go (6)
frac/config.go (1)
Config(3-8)consts/consts.go (2)
DocsFileSuffix(53-53)MetaFileSuffix(51-51)config/shared.go (2)
SkipFsync(11-11)IndexWorkers(6-6)frac/active_token_list.go (2)
TokenList(60-74)NewActiveTokenList(76-93)frac/active_writer.go (1)
NewActiveWriter(17-22)frac/info.go (1)
NewInfo(53-67)
tests/setup/env.go (1)
frac/active_sealer.go (1)
SealParams(28-37)
config/config.go (4)
proxy/search/ingestor.go (1)
Config(27-34)config/shared.go (4)
ReaderWorkers(8-8)SkipFsync(11-11)CaseSensitive(10-10)MaxFetchSizeBytes(13-13)config/limits.go (2)
NumCPU(11-11)TotalMemory(12-12)network/circuitbreaker/circuitbreaker.go (1)
CircuitBreaker(23-25)
cmd/seq-db/flags.go (1)
storeapi/store.go (2)
StoreModeCold(19-19)StoreModeHot(18-18)
🔇 Additional comments (42)
go.mod (1)
8-15: Confirm dependency promotion & rungo mod tidy.
github.com/alecthomas/unitsandgithub.com/kkyr/figwere promoted to direct deps, with new transitives pulled in.
Please double-check:
- The repo actually imports these packages outside of tests; otherwise they should stay
// indirect.- No residual custom byte-size helpers remain – the code still imports
github.com/c2h5oh/datasize; if that package is now obsolete, drop it to avoid bloat.- Run
go mod tidyto ensure the resolved graph is minimal andgo.sumis updated.Also applies to: 44-48
proxyapi/grpc_export_test.go (1)
15-16: LGTM – migration toconfigconstant is consistent.The new reference to
config.MaxRequestedDocumentskeeps the test aligned with runtime limits; no concerns here.Also applies to: 284-285
frac/compress.go (1)
6-6: Missing build-tag guard for optional dependency
github.com/alecthomas/unitsis now a hard dependency of this package. If the rest of the project still builds with thepurego/tinygotags (or in very slim Docker images) you may hit unexpected compilation failures. Consider adding the import only in files that already depend onunits, or guarding with build tags.parser/seqql_filter.go (1)
38-41: Same data-race concern as token_parser
caseSensitive := config.CaseSensitivecan observe a torn value if the global is mutated concurrently (e.g. tests witht.Parallel). Consider the same mitigation as suggested fortoken_parser.go.proxyapi/grpc_fetch.go (1)
14-14: LGTM! Clean package refactoring.The import path and configuration reference have been properly updated from the old
confpackage to the newconfigpackage. The functional logic remains unchanged, maintaining the same validation behavior for document request limits.Also applies to: 49-49
frac/info.go (1)
42-44: LGTM! Type standardization aligns with broader refactoring.The type changes from
uint64tointfor block size constants are consistent with the standardization effort using theunitspackage. These constants represent block sizes that are well withinintrange limits.Note that
inthas platform-dependent size, but this should not be problematic for typical block size values.frac/sealed_index.go (1)
338-338: LGTM! Explicit type casting improves type safety.The explicit cast of
consts.IDsPerBlocktoint64before multiplication ensures type consistency and prevents potential implicit conversion issues. This aligns with the broader effort to standardize constant types while maintaining type safety.frac/unpack_cache.go (1)
50-50: LGTM! Explicit type casting ensures arithmetic safety.The explicit casts to
uint64for both operands ensure consistent unsigned arithmetic and prevent potential issues with mixed signed/unsigned operations. This maintains type safety while adapting to the standardized constant types.Also applies to: 56-56
proxyapi/http_bulk_test.go (1)
14-14: LGTM! Standardized unit constants improve consistency.The migration from custom
consts.KBto the standardunits.KiBwith explicitintcasting maintains the same buffer size (512 KiB) while using more standardized and explicit unit definitions. This aligns with the broader effort to unify size constants across the codebase.Also applies to: 55-55
proxy/bulk/seqdb_client.go (1)
196-200: Hoist the msg-size constant to avoid repeated casting & clarify intentThe
256*int(units.MiB)expression is recalculated on everyBulkcall.
Lifting it to aconst(orvar) at package scope avoids the per-call cast and makes the limit easier to tweak / audit.+const grpcMaxMsgSize = 256 * int(units.MiB) ... - grpc.MaxCallRecvMsgSize(256*int(units.MiB)), - grpc.MaxCallSendMsgSize(256*int(units.MiB)), + grpc.MaxCallRecvMsgSize(grpcMaxMsgSize), + grpc.MaxCallSendMsgSize(grpcMaxMsgSize),Minor, but improves readability and removes two int conversions per replica call.
[ suggest_nitpick ]cmd/index_analyzer/main.go (1)
56-59: Slight behavioural change: 64 MiB vs 64 MB
units.MiB*64equals 64 × 1 048 576 = 67 108 864 bytes, whereas the previousconsts.MB*64was 64 × 1 000 000 = 64 000 000 bytes.
If the cache sizes were tuned tightly, this 4.8 % bump might matter. Confirm that the extra ~3 MiB is acceptable for the analyser’s memory budget.frac/sealed_ids.go (1)
140-141: Division on wide types—return type may not match downstream expectations
getIDBlockIndexByLIDnow returnsint64; callers that index slices/arrays will still need anintcast, risking silent truncation on 32-bit builds. Review all call sites—if they expectint, consider switching the return type back tointand casting here instead to keep the unsafe conversion in a single place.config/shared.go (1)
1-18: LGTM! Clean migration to standardized units and package structure.The changes correctly migrate from custom constants to the standardized
unitspackage and consolidate the package structure. Usingunits.MiBinstead ofconsts.MBis more precise for binary operations.fracmanager/sealer_test.go (2)
13-13: LGTM! Proper import of standardized units package.
64-64: LGTM! Correct migration to standardized units with proper type casting.The changes properly migrate from custom
consts.MBto standardizedunits.MiBwith appropriate type casting (uint64for cache sizes andintfor doc block size).Also applies to: 98-98
config/limits.go (3)
1-1: LGTM! Package rename consistent with refactoring.
16-16: LGTM! Simplified maxprocs configuration.Removing the custom logger function simplifies the code while maintaining functionality.
21-23: LGTM! Worker variables properly initialized.Moving the worker count initialization here from
config/shared.goimproves the architectural organization by consolidating resource limit configuration in one place.cmd/stress-search/main.go (2)
24-24: LGTM! Proper import of standardized units package.
236-236: LGTM! Consistent use of MiB for gRPC message sizes.proxy/bulk/ingestor_test.go (5)
12-12: LGTM! Clean import addition for units package.The import is correctly added to support the standardized byte size constants used throughout the file.
80-80: LGTM! Consistent standardization of byte size constants.The configuration correctly uses
int(units.KiB)instead ofconsts.KB, maintaining the same 1 KiB size while using the standardized units package.Also applies to: 85-85
194-196: LGTM! Consistent test data generation with standardized units.The test correctly uses
int(units.KiB)+1to generate documents that exceed the size limit, maintaining the same test behavior while using the standardized units package.
473-477: LGTM! Consistent benchmark setup with standardized units.The benchmark configuration correctly uses
int(units.KiB)for all mapping type definitions, maintaining consistent 1 KiB size limits while using the standardized units package.
527-527: LGTM! Helper function consistently updated.The helper function correctly uses
int(units.KiB)for the maxSize parameter, ensuring consistency with the standardized units package throughout the test file.proxy/search/ingestor.go (3)
10-10: LGTM! Clean import addition for units package.The import is correctly added to support the standardized byte size constants used in gRPC message size configurations.
224-225: LGTM! Consistent gRPC message size configuration.The gRPC options correctly use
256*int(units.MiB)instead of256*consts.MB, maintaining the same 256 MiB message size limits while using the standardized units package.
606-607: LGTM! Consistent gRPC message size configuration.The gRPC options correctly use
256*int(units.MiB)instead of256*consts.MB, maintaining the same 256 MiB message size limits while using the standardized units package.Makefile (2)
34-34: LGTM! Simplified data directory setup.The fixed directory approach using
/tmp/seq-dbis cleaner and aligns with the configuration file approach where paths are predefined rather than dynamically generated.
37-37: LGTM! Perfect alignment with PR objectives.The change to use
--config=config.example.yamlinstead of multiple flags perfectly aligns with the PR's main objective of replacing ~60 command-line flags with a centralized configuration file approach.frac/active_sealer.go (3)
16-16: LGTM! Clean import addition for units package.The import is correctly added to support the standardized byte size constants used in buffer size calculations.
291-291: LGTM! Consistent buffer size configuration with standardized units.The buffer sizes correctly use
int(units.MiB)instead ofconsts.MB, maintaining the same 4 MiB and 32 MiB sizes while using the standardized units package.Also applies to: 294-294
192-192: LGTM! Explicit type casting for compatibility.The explicit
int()casting ofconsts.LIDBlockCapensures type compatibility with the function parameter expectations, maintaining consistency with the standardized approach.fracmanager/fracmanager.go (4)
19-19: LGTM! Import updated to use new centralized config package.The import correctly changes from the old
confpackage to the newconfigpackage, aligning with the configuration centralization effort.
66-66: LGTM! Parameter renamed to avoid naming conflicts.Renaming the parameter from
configtocfgis a good practice to avoid conflicts with the importedconfigpackage name.
67-67: LGTM! Configuration references consistently updated.All references to the configuration parameter are correctly updated to use
cfginstead ofconfig, maintaining consistency with the parameter rename.Also applies to: 69-69, 90-90, 93-93, 95-95
93-93: LGTM! Usage of centralized worker configuration.The use of
config.ReaderWorkersandconfig.IndexWorkersglobal variables aligns with the centralized configuration approach, providing consistent worker count configuration across the application.consts/consts.go (1)
6-8: Adoption ofunitslooks goodThe switch to
github.com/alecthomas/unitsremoves the previously home-grown KB/MB/GB constants and unifies the byte-size story across the codebase.
Values stay well belowmath.MaxInt32, so theint(...)casts are safe even on 32-bit builds.Also applies to: 14-17
cache/cache_test.go (1)
9-10: Tests updated correctlyAll size literals now use
units.MiB/units.KiB, and the added casts keep the signature expectations intact.No functional or overflow risks spotted.
Also applies to: 15-33
frac/active.go (1)
77-100: OK – avoids name shadowing and uses global config flags coherentlyThe param name change (
cfg) removes the previous shadowing of the importedconfigpackage.
Usingconfig.SkipFsync/config.IndexWorkerskeeps these fast-path toggles global while the instance-specific settings stay incfg.cmd/seq-db/flags.go (1)
18-18: ExistingFile() validation may be too restrictiveUsing
ExistingFile()requires the config file to exist before running the application. This could complicate deployments where the config file is created dynamically or in containerized environments.Consider if this strict validation is necessary, or if a more flexible approach would be better:
- ExistingFile() + String()Then validate file existence with a helpful error message in the Parse function.
cmd/seq-db/seq-db.go (1)
82-82: Ignore undefined config field warningThe
configpackage does exportMaxRequestedDocuments(seeconfig/shared.go), so referencing it incmd/seq-db/seq-db.gois valid. No changes needed.Likely an incorrect or invalid review comment.
|
Note Reviews pausedUse the following commands to manage reviews:
📝 WalkthroughWalkthroughThis change migrates the application's configuration system from command-line flags to a configuration file. Most operational parameters previously set via flags are now specified in a YAML config file, with only a minimal set of flags ( Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Assessment against linked issues
Assessment against linked issues: Out-of-scope changes
✨ Finishing Touches🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 5
🔭 Outside diff range comments (2)
go.mod (1)
3-3:go 1.24may break builds on current toolchainsThe “go” directive states that the module requires Go 1.24, but Go 1.24 is not released yet. CI environments still pinned to 1.22/1.23 will refuse to build with an unknown version.
-go 1.24 +go 1.22If you really need 1.24-specific features, keep the directive and bump the build images; otherwise revert to the latest released version.
storeapi/docs_stream.go (1)
102-108: Guard against zero-size chunks to prevent infinite loopIf
avgDocSizeexceedsconfig.MaxFetchSizeBytes,newChunkSizebecomes 0.
batchLoaderlater callsmin(len(d.ids), chunkSize); a zero chunk keepsd.idsuntouched and the loop never makes progress, effectively hanging the stream.newChunkSize := int(config.MaxFetchSizeBytes) / avgDocSize +if newChunkSize < 1 { + newChunkSize = 1 // always fetch at least one document +}
♻️ Duplicate comments (7)
parser/token_parser.go (1)
215-217: (duplicate) Access to globalconfig.CaseSensitivestill racesReading a mutable global that tests (and potentially other goroutines) mutate causes data races. Either pass the flag as a parameter or guard it with sync/atomic.
parser/seqql_filter.go (1)
38-41: (duplicate) Mutable globalconfig.CaseSensitiveread without synchronisationSame race-condition concern as previously flagged for
parser/token_parser.go.parser/process_test.go (2)
286-287: (duplicate) Test mutates globalconfig.CaseSensitiveand never restores itLeaking global state can make other tests flaky when run in a different order.
299-300: (duplicate) Parallel sub-tests still share the same global – data raceSetting
config.CaseSensitiveinside a loop that allowst.Parallel()re-introduces the race. Protect the assignment or avoid the global.storeapi/grpc_v1.go (1)
112-112: Hard-codedconfig.FetchWorkerslimits flexibility
docFetcheris initialised with a package-level variable instead of using thecfgparameter like other configuration values. This prevents per-store tuning and deviates from the rest of the constructor.Consider extending
APIConfig.Fetchto carry aWorkersfield and use it here:-docFetcher: fracmanager.NewFetcher(config.FetchWorkers), +docFetcher: fracmanager.NewFetcher(cfg.Fetch.Workers),cmd/seq-db/flags.go (2)
16-18: Default config file name doesn't match the example fileThe default is "config.yaml" but the PR introduces "config.example.yaml". This mismatch could confuse users who expect the example file to work out of the box.
30-34: Inconsistent deprecation comment with default valueThe comment states "We already use SeqQL by default" but the flag default is still "false". This is confusing for users.
🧹 Nitpick comments (3)
go.mod (1)
8-10: Duplicate size-unit libraries increase binary sizeYou added
github.com/alecthomas/units, butgithub.com/c2h5oh/datasizeis still listed (Line 9). Carrying two libraries that solve the same problem is unnecessary and bloats the dependency graph.Remove the unused one and migrate the codebase to a single package.
config.example.yaml (1)
9-13: Optional: clarify size units
64MiB&512MiBare parsed as strings; the reader must know they rely on fig’s units parser.
Consider adding a comment for first-time users:# Supports the same units as github.com/alecthomas/units (KiB, MiB, GiB, …) frac_size: 64MiBcmd/index_analyzer/main.go (1)
56-59: Minor style: cast after multiplication to avoid an intermediateint64.
units.MiBis anint64. Multiplying first (units.MiB * 64) then casting is fine, but casting the constant once and doing arithmetic inuint64avoids an unnecessary signed intermediate:-cm := fracmanager.NewCacheMaintainer(uint64(units.GiB), uint64(units.MiB*64), nil) +cm := fracmanager.NewCacheMaintainer(uint64(units.GiB), 64*uint64(units.MiB), nil)Purely cosmetic – no functional impact.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (39)
Makefile(1 hunks)cache/cache_test.go(3 hunks)cmd/index_analyzer/main.go(2 hunks)cmd/seq-db/flags.go(1 hunks)cmd/seq-db/seq-db.go(5 hunks)cmd/stress-search/main.go(2 hunks)config.example.yaml(1 hunks)config/config.go(1 hunks)config/limits.go(2 hunks)config/shared.go(2 hunks)consts/consts.go(1 hunks)frac/active.go(3 hunks)frac/active_sealer.go(3 hunks)frac/compress.go(2 hunks)frac/info.go(1 hunks)frac/sealed_ids.go(1 hunks)frac/sealed_index.go(1 hunks)frac/unpack_cache.go(1 hunks)fracmanager/fracmanager.go(3 hunks)fracmanager/sealer_test.go(3 hunks)go.mod(2 hunks)logger/logger.go(0 hunks)parser/process_test.go(3 hunks)parser/seqql_filter.go(2 hunks)parser/seqql_filter_test.go(2 hunks)parser/token_parser.go(2 hunks)proxy/bulk/ingestor_test.go(6 hunks)proxy/bulk/seqdb_client.go(2 hunks)proxy/search/ingestor.go(3 hunks)proxyapi/grpc_export.go(2 hunks)proxyapi/grpc_export_test.go(2 hunks)proxyapi/grpc_fetch.go(2 hunks)proxyapi/grpc_fetch_test.go(2 hunks)proxyapi/http_bulk_test.go(2 hunks)storeapi/docs_stream.go(2 hunks)storeapi/grpc_search.go(2 hunks)storeapi/grpc_server.go(2 hunks)storeapi/grpc_v1.go(2 hunks)tests/setup/env.go(3 hunks)
💤 Files with no reviewable changes (1)
- logger/logger.go
🧰 Additional context used
🧬 Code Graph Analysis (22)
storeapi/docs_stream.go (1)
config/shared.go (1)
MaxFetchSizeBytes(13-13)
proxyapi/grpc_export_test.go (1)
config/shared.go (1)
MaxRequestedDocuments(15-15)
parser/process_test.go (1)
config/shared.go (1)
CaseSensitive(10-10)
proxyapi/http_bulk_test.go (1)
proxyapi/http_bulk.go (1)
NewBulkHandler(65-70)
parser/token_parser.go (1)
config/shared.go (1)
CaseSensitive(10-10)
cmd/index_analyzer/main.go (1)
fracmanager/cache_maintainer.go (1)
NewCacheMaintainer(122-130)
frac/sealed_index.go (2)
seq/seq.go (1)
LID(19-19)consts/consts.go (1)
IDsPerBlock(16-16)
frac/sealed_ids.go (1)
consts/consts.go (1)
IDsPerBlock(16-16)
parser/seqql_filter_test.go (1)
config/shared.go (1)
CaseSensitive(10-10)
proxyapi/grpc_export.go (1)
config/shared.go (1)
MaxRequestedDocuments(15-15)
frac/unpack_cache.go (2)
consts/consts.go (1)
IDsPerBlock(16-16)frac/info.go (1)
BinaryDataVersion(23-23)
proxyapi/grpc_fetch.go (1)
config/shared.go (1)
MaxRequestedDocuments(15-15)
proxyapi/grpc_fetch_test.go (1)
config/shared.go (1)
MaxRequestedDocuments(15-15)
config/limits.go (1)
config/shared.go (3)
IndexWorkers(6-6)FetchWorkers(7-7)ReaderWorkers(8-8)
fracmanager/sealer_test.go (1)
fracmanager/cache_maintainer.go (1)
NewCacheMaintainer(122-130)
frac/active.go (6)
frac/config.go (1)
Config(3-8)consts/consts.go (2)
DocsFileSuffix(53-53)MetaFileSuffix(51-51)config/shared.go (2)
SkipFsync(11-11)IndexWorkers(6-6)frac/active_token_list.go (2)
TokenList(60-74)NewActiveTokenList(76-93)frac/active_writer.go (1)
NewActiveWriter(17-22)frac/info.go (1)
NewInfo(53-67)
storeapi/grpc_search.go (1)
config/shared.go (1)
UseSeqQLByDefault(17-17)
tests/setup/env.go (1)
frac/active_sealer.go (1)
SealParams(28-37)
frac/active_sealer.go (1)
consts/consts.go (1)
LIDBlockCap(17-17)
fracmanager/fracmanager.go (3)
fracmanager/config.go (2)
Config(14-29)FillConfigWithDefault(31-77)config/shared.go (2)
ReaderWorkers(8-8)IndexWorkers(6-6)consts/consts.go (1)
FracCacheFileSuffix(64-64)
parser/seqql_filter.go (1)
config/shared.go (1)
CaseSensitive(10-10)
cmd/seq-db/flags.go (1)
storeapi/store.go (2)
StoreModeCold(19-19)StoreModeHot(18-18)
🔇 Additional comments (49)
proxyapi/grpc_export_test.go (1)
284-288: LGTM – test now uses central constantUsing
config.MaxRequestedDocumentskeeps the test aligned with runtime limits. 👍proxyapi/grpc_export.go (1)
31-34: LGTM – config constant migration is correct.The limit check now references
config.MaxRequestedDocuments, keeping behaviour intact after the package rename.proxyapi/http_bulk_test.go (1)
54-56: LGTM – switch tounits.KiBkeeps the 512 KiB limit accurate.The cast is correct (
int(units.KiB)*512 == 524 288).proxy/bulk/seqdb_client.go (1)
196-200: Binary vs decimal MB: check that the 256 MiB gRPC limit is intentional.
units.MiBis 1 048 576 bytes, whereas the previousconsts.MBmay have been 1 000 000 or 1 048 576 depending on the old definition.
The new limit is therefore either identical or ~+5 % larger. Confirm this aligns with server limits; otherwise consider using an explicit value to avoid accidental drift.frac/info.go (1)
42-44: LGTM - Type consistency improvement.The change from
uint64tointfor these constant fields aligns with the project-wide standardization effort moving from custom memory size constants to theunitspackage. This ensures consistent type representation across the codebase.storeapi/grpc_server.go (2)
13-13: LGTM - Consistent migration to standard units.The import change from custom
constspackage togithub.com/alecthomas/unitsaligns with the project-wide standardization effort.
46-47: LGTM - Proper binary unit usage for gRPC message sizes.The change from
consts.MB * 256toint(units.MiB) * 256is correct:
- Uses binary units (MiB) which is appropriate for message sizes
- Maintains the same effective size limit (256 MiB)
- Explicit
intconversion ensures type safetyfrac/sealed_index.go (1)
338-338: LGTM - Type safety improvement in arithmetic operation.The explicit
int64conversion ofconsts.IDsPerBlockensures proper 64-bit arithmetic before casting toseq.LID. This prevents potential type mismatch issues and maintains consistency with similar patterns across the codebase.frac/sealed_ids.go (1)
140-140: LGTM - Type consistency in division operation.The explicit
int64conversion ofconsts.IDsPerBlockensures both operands in the division are the same type, preventing implicit conversion issues and maintaining type safety.cmd/stress-search/main.go (3)
24-24: LGTM - Consistent migration to standard units.The import change from custom
constspackage togithub.com/alecthomas/unitsaligns with the project-wide standardization effort.
236-236: LGTM - Consistent binary unit usage.The change from
consts.MBtoint(units.MiB)ensures consistent binary unit usage for gRPC message sizes with proper type conversion.
243-243: LGTM - Consistent binary unit usage resolved.The change from
consts.MBtoint(units.MiB)ensures both receive and send message sizes use the same binary unit type, resolving the previous inconsistency issue flagged in past reviews.frac/compress.go (1)
6-6: LGTM! Clean migration to standardized units.The replacement of
consts.MBwithunits.MiBproperly standardizes size constants across the codebase. The explicitint()cast maintains type compatibility.Also applies to: 68-68
config/shared.go (1)
1-1: LGTM! Proper refactoring to centralized config package.The package rename from
conftoconfigand import simplification align well with the centralized configuration system. The migration tounits.MiBis consistent with the broader standardization effort.Also applies to: 3-3, 13-13
storeapi/grpc_search.go (1)
18-18: LGTM! Consistent package migration.The import path update from
conftoconfigand the corresponding reference change toconfig.UseSeqQLByDefaultproperly maintain functionality while aligning with the package restructuring.Also applies to: 212-212
Makefile (1)
34-37: Approve Makefile changes – config file verified
config.example.yaml exists and includes the required top-level sections (address, storage, slow_logs, mapping). Ready to merge.config/limits.go (2)
1-1: LGTM! Clean package migration and simplification.The package rename to
configand simplifiedmaxprocs.Set()call align well with the overall refactoring effort.Also applies to: 16-16
21-23: No initialization order issues detectedGo guarantees that a package’s init functions run before any other package that imports it. In the
configpackage, the only init (inconfig/limits.go) setsIndexWorkers,FetchWorkers, andReaderWorkersafterNumCPUis determined, and this runs before any consuming packages (e.g.frac,storeapi,cmd/seq-db) initialize or use those variables. No circular or premature-initialization risks remain.frac/active_sealer.go (3)
16-16: LGTM: Added standardized units import.The import of
github.com/alecthomas/unitsis correctly added to support the unit constant migration.
192-192: LGTM: Explicit type conversion added.The explicit cast of
consts.LIDBlockCaptointimproves type safety without changing functionality.
291-291: Note the semantic change from decimal to binary units.The replacement of
consts.MBwithint(units.MiB)changes from decimal (1,000,000 bytes) to binary (1,048,576 bytes) units. This increases the actual memory allocation by approximately 4.9%.While this appears intentional as part of the standardization effort, please verify this size increase is acceptable for:
- Default block size: 4,194,304 bytes (was 4,000,000)
- Buffer size: 33,554,432 bytes (was 32,000,000)
Also applies to: 294-294
fracmanager/sealer_test.go (3)
13-13: LGTM: Added standardized units import.The import addition correctly supports the unit constant migration.
64-64: LGTM: Correct type conversion for cache sizes.The explicit conversion to
uint64is appropriate for theNewCacheMaintainerfunction signature, and the use ofunits.MiBstandardizes the size constants.
98-98: LGTM: Consistent with other DocBlockSize changes.The conversion to
int(units.MiB) * 4is consistent with similar changes in other files and matches the expected parameter type.proxy/search/ingestor.go (2)
10-10: LGTM: Added standardized units import.The import correctly supports the migration to standardized unit constants.
224-225: LGTM: Consistent gRPC message size configuration.The conversion to
int(units.MiB)maintains the 256 MiB message size limit across bothFetchandSearchoperations. The type conversion tointis appropriate for gRPC options.Note: This increases the actual byte limit from 256,000,000 to 268,435,456 bytes (binary vs decimal units).
Also applies to: 606-607
proxy/bulk/ingestor_test.go (4)
12-12: LGTM: Added standardized units import.The import addition correctly supports the unit constant migration throughout the test file.
80-80: LGTM: Consistent configuration with binary units.The conversion to
int(units.KiB)forMaxTokenSizeandMaxDocumentSizestandardizes the configuration and maintains type compatibility.Also applies to: 85-85
194-196: LGTM: Test cases updated consistently.The test payload generation correctly uses
int(units.KiB)+1to create documents that exceed the size limit, maintaining the test logic while using standardized units.
473-477: LGTM: Benchmark and helper functions updated consistently.All benchmark configurations and helper functions correctly use
int(units.KiB)for size parameters, maintaining consistency across the test suite.Also applies to: 488-488, 527-527
tests/setup/env.go (4)
16-16: LGTM: Added standardized units import.The import addition correctly supports the unit constant migration in the test setup.
84-85: LGTM: Fraction manager configuration updated.The conversion to
uint64(units.MiB)anduint64(units.GiB)is appropriate for the configuration parameters and maintains the intended sizes while using standardized units.
93-93: LGTM: Consistent DocBlockSize configuration.The conversion to
int(units.MiB) * 4aligns with similar changes across the codebase and maintains type compatibility.
310-310: LGTM: Compound unit expression.The expression
int(units.MiB + units.KiB)correctly evaluates to 1,049,600 bytes (1 MiB + 1 KiB), providing a slightly larger document size limit than a round 1 MB. This appears intentional for test configuration.fracmanager/fracmanager.go (3)
19-19: LGTM: Import path updated correctlyThe import path has been correctly updated from the old
confpackage to the new centralizedconfigpackage.
66-67: LGTM: Parameter renamed for clarityThe constructor parameter has been appropriately renamed from
configtocfgto avoid shadowing the imported package name, improving code readability.
93-93: Invalid suggestion:cfgdoes not defineReaderWorkersorIndexWorkersThe
NewFracManager(cfg *Config)constructor takes a*fracmanager.Config, which only includes fracmanager-specific settings (DataDir, FracSize, Fraction, etc.). TheReaderWorkersandIndexWorkerscounts live in the globalconfigpackage’sConfigand are not fields onfracmanager.Config, so replacing-fracProvider: newFractionProvider(&cfg.Fraction, cacheMaintainer, config.ReaderWorkers, config.IndexWorkers), +fracProvider: newFractionProvider(&cfg.Fraction, cacheMaintainer, cfg.ReaderWorkers, cfg.IndexWorkers),would not compile. If you intend to drive those worker counts from the local
cfg, you must first addReaderWorkersandIndexWorkerstofracmanager.Config(infracmanager/config.go). Otherwise, continue using the globalconfig.ReaderWorkersandconfig.IndexWorkersas designed.Likely an incorrect or invalid review comment.
storeapi/grpc_v1.go (2)
13-13: LGTM: Import path updated correctlyThe import path has been correctly updated from the old
confpackage to the new centralizedconfigpackage.
100-102: LGTM: Parameter renamed for clarityThe constructor parameter has been appropriately renamed from
configtocfgto avoid shadowing the imported package name, improving code readability.consts/consts.go (2)
7-7: LGTM: Units package import addedThe addition of the
github.com/alecthomas/unitspackage import supports the standardization of size units across the codebase.
14-17: LGTM: Constants updated to use standardized unitsThe constants have been appropriately updated to use the
unitspackage instead of custom constants. The use ofunits.KiB(binary units) is correct for memory/disk operations, and the explicitint()casts maintain type consistency with the previous implementation.cache/cache_test.go (3)
9-9: LGTM: Units package import addedThe import of
github.com/alecthomas/unitssupports the standardization of size units in the test code.
15-15: LGTM: Test constants updated to use standardized unitsThe test constants have been appropriately updated to use
units.MiBandunits.KiBinstead of custom constants. The use of binary units is correct for memory operations, and explicit type casts maintain compatibility with existing function signatures.Also applies to: 26-31
32-32: LGTM: Type conversions applied consistentlyThe explicit type conversions to
uint64()andint()are correctly applied to maintain compatibility with function signatures while using the standardized units package.Also applies to: 39-46, 50-53, 91-91
frac/active.go (3)
17-17: LGTM: Import path updated correctlyThe import path has been correctly updated from the old
confpackage to the new centralizedconfigpackage.
77-77: LGTM: Parameter renamed for clarityThe constructor parameter has been appropriately renamed from
configtocfgto avoid shadowing the imported package name, and the assignment on line 103 correctly uses the parameter.Also applies to: 103-103
79-80: No change needed:cfglacksSkipFsyncandIndexWorkersfieldsThe
cfgparameter passed into NewActive is thefrac.Configtype (defined in frac/config.go), which only containsSearch,SkipSortDocs, andKeepMetaFile. It does not includeSkipFsyncorIndexWorkers, so using the package-level globals is intentional and valid. Ignore the suggested diff.Likely an incorrect or invalid review comment.
proxyapi/grpc_fetch_test.go (1)
17-17: LGTM! Configuration package migration is correctly applied.The import path and constant reference have been properly updated to use the new centralized
configpackage, maintaining the sameMaxRequestedDocumentsvalue of 100,000.Also applies to: 135-135
cmd/seq-db/flags.go (1)
15-35: Excellent simplification of command-line interface!The reduction from ~60 flags to just 4 essential flags significantly improves usability. The implementation follows good practices with proper validation (ExistingFile) and hint options for the mode flags.
1c4b677 to
49aba14
Compare
There was a problem hiding this comment.
Actionable comments posted: 5
♻️ Duplicate comments (7)
cmd/seq-db/flags.go (2)
16-18: Default config file name doesn't match the example fileThe default is "config.yaml" but the PR introduces "config.example.yaml". This mismatch could confuse users who expect the example file to work out of the box.
30-34: Inconsistent deprecation comment with default valueThe comment states "We already use SeqQL by default" but the flag default is still "false". This is confusing for users.
cmd/seq-db/seq-db.go (3)
79-84: Avoid setting global config variablesSetting global variables from the parsed config defeats the purpose of centralized configuration and makes the code harder to test and maintain. Pass the config struct to functions that need these values instead.
84-84: Remove usage of deprecated flagThe
flagUseSeqQLByDefaultis marked as deprecated but is still being used to set a global config variable. Since SeqQL is already the default, this line should be removed.
150-150: Remove duplicate log messageThis log message is already printed in the main function at line 96.
frac/unpack_cache.go (2)
49-53: Guard against negative block indices and remove if unused.This method has two issues:
- It directly casts
indextouint64without checking for negative values, which could cause wraparound (as previously noted)- Static analysis indicates this method is unused
If this method is intended for future use, apply the previously suggested fix:
func (c *UnpackCache) unpackMIDs(index int64, data []byte) { + if index < 0 { + panic(fmt.Sprintf("unpackMIDs: negative block index=%d", index)) + } c.lastBlock = index c.startLID = uint64(index) * uint64(consts.IDsPerBlock) c.values = unpackRawIDsVarint(data, c.values) }Otherwise, consider removing this unused method.
55-65: Guard against negative block indices and remove if unused.This method has the same issues as
unpackMIDs:
- Negative index casting without validation
- Static analysis indicates it's unused
If this method is intended for future use, add the same guard:
func (c *UnpackCache) unpackRIDs(index int64, data []byte, fracVersion config.BinaryDataVersion) { + if index < 0 { + panic(fmt.Sprintf("unpackRIDs: negative block index=%d", index)) + } c.lastBlock = index c.startLID = uint64(index) * uint64(consts.IDsPerBlock)Otherwise, consider removing this unused method.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (41)
Makefile(1 hunks)cache/cache_test.go(3 hunks)cmd/index_analyzer/main.go(2 hunks)cmd/seq-db/flags.go(1 hunks)cmd/seq-db/seq-db.go(5 hunks)cmd/stress-search/main.go(2 hunks)config.example.yaml(1 hunks)config/config.go(1 hunks)config/frac_version.go(1 hunks)config/limits.go(2 hunks)config/shared.go(2 hunks)consts/consts.go(1 hunks)frac/active.go(3 hunks)frac/active_sealer.go(3 hunks)frac/compress.go(2 hunks)frac/info.go(3 hunks)frac/sealed/seqids/blocks.go(3 hunks)frac/sealed/seqids/loader.go(3 hunks)frac/sealed/seqids/provider.go(2 hunks)frac/unpack_cache.go(1 hunks)fracmanager/fracmanager.go(3 hunks)fracmanager/sealer_test.go(3 hunks)go.mod(2 hunks)logger/logger.go(0 hunks)parser/process_test.go(3 hunks)parser/seqql_filter.go(2 hunks)parser/seqql_filter_test.go(2 hunks)parser/token_parser.go(2 hunks)proxy/bulk/ingestor_test.go(6 hunks)proxy/bulk/seqdb_client.go(2 hunks)proxy/search/ingestor.go(3 hunks)proxyapi/grpc_export.go(2 hunks)proxyapi/grpc_export_test.go(2 hunks)proxyapi/grpc_fetch.go(2 hunks)proxyapi/grpc_fetch_test.go(2 hunks)proxyapi/http_bulk_test.go(2 hunks)storeapi/docs_stream.go(2 hunks)storeapi/grpc_search.go(2 hunks)storeapi/grpc_server.go(2 hunks)storeapi/grpc_v1.go(2 hunks)tests/setup/env.go(3 hunks)
💤 Files with no reviewable changes (1)
- logger/logger.go
✅ Files skipped from review due to trivial changes (19)
- config/frac_version.go
- proxyapi/grpc_fetch.go
- proxyapi/grpc_export_test.go
- storeapi/docs_stream.go
- parser/seqql_filter.go
- parser/seqql_filter_test.go
- proxy/bulk/seqdb_client.go
- proxyapi/http_bulk_test.go
- frac/sealed/seqids/provider.go
- config.example.yaml
- parser/process_test.go
- storeapi/grpc_search.go
- proxyapi/grpc_fetch_test.go
- proxy/bulk/ingestor_test.go
- cmd/index_analyzer/main.go
- frac/sealed/seqids/blocks.go
- frac/sealed/seqids/loader.go
- parser/token_parser.go
- consts/consts.go
🚧 Files skipped from review as they are similar to previous changes (18)
- proxyapi/grpc_export.go
- storeapi/grpc_server.go
- Makefile
- fracmanager/sealer_test.go
- cmd/stress-search/main.go
- frac/compress.go
- config/shared.go
- proxy/search/ingestor.go
- go.mod
- config/limits.go
- storeapi/grpc_v1.go
- frac/active_sealer.go
- fracmanager/fracmanager.go
- frac/active.go
- frac/info.go
- cache/cache_test.go
- tests/setup/env.go
- config/config.go
🧰 Additional context used
🪛 GitHub Check: lint
frac/unpack_cache.go
[failure] 82-82:
func unpackRawIDsNoVarint is unused (unused)
[failure] 67-67:
func unpackRawIDsVarint is unused (unused)
[failure] 55-55:
func (*UnpackCache).unpackRIDs is unused (unused)
[failure] 49-49:
func (*UnpackCache).unpackMIDs is unused (unused)
[failure] 12-12:
field lastBlock is unused (unused)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (3)
frac/unpack_cache.go (3)
1-9: LGTM!Package declaration and imports are appropriate for the unpacking cache functionality.
17-19: LGTM!Pool declaration and constant are properly defined. The pool's
Newfunction is optional and the code correctly handles nil returns.
21-39: LGTM!Constructor and reset methods are well-implemented with proper pool usage and field initialization.
49aba14 to
85a8369
Compare
|
@coderabbitai pause |
✅ Actions performedReviews paused. |
Closes #21
Summary by CodeRabbit
New Features
Refactor
Chores