Streaming allDocs/query API #564
Conversation
@mabels This is the current state of what I have. Any thoughts on the API? See PR description.
Walkthrough
The changes involve minor formatting updates in build scripts and a new ESLint rule. There are extensive modifications across CRDT-related modules, database and indexing components, and React utilities. Updates include refactoring methods to use asynchronous generators, introducing new utility functions, replacing subscription models with query-based ones, and updating type signatures for enhanced type safety. Test files have been adjusted to reflect these API and behavioral changes.
Sequence Diagram(s)
sequenceDiagram
participant CRDTClock as CRDTClockImpl
participant ClockChanges as clockChangesSince
participant AsyncUtil as arrayFromAsyncIterable
CRDTClock->>ClockChanges: Call clockChangesSince(...)
Note right of CRDTClock: Wrap result with throwFalsy
CRDTClock->>AsyncUtil: Pass wrapped result
AsyncUtil-->>CRDTClock: Return changes array
sequenceDiagram
participant Client
participant DB as DatabaseImpl
participant CRDT as CRDTImpl
participant IDX as Index
Client->>DB: Issue query/allDocs request
DB->>CRDT: Delegate query via select/allDocs/changes
CRDT-->>DB: Return document/update stream
DB->>IDX: Process results with mapping and subscription
IDX-->>DB: Return index/query array
DB-->>Client: Respond with QueryResponse (array of rows)
Actionable comments posted: 1
🧹 Nitpick comments (19)
src/crdt-clock.ts (1)
74-77: Be mindful of potential memory usage.
Using `arrayFromAsyncIterable` is convenient, but if the upstream iterator yields a large dataset, the array could become very large. Consider a streaming or incremental approach if memory constraints become a concern.
tests/fireproof/fireproof.test.ts (1)
202-206: Remove duplicated assertion.
The call to `expect(rows).toBeTruthy();` appears twice, which is redundant and can be removed to streamline the test.
src/crdt.ts (1)
181-204: New `allDocs` method with streaming interface.
Provides snapshot, subscription, array conversion, and live streaming.
Note: The `TODO: Take qryOpts into account` comment indicates partial handling of query options. Consider implementing filters or key ranges to fully realize `qryOpts`.
src/types.ts (4)
250-255: Clarify or document the Row interface.
These new lines add a generic interface for representing an index row, but there's no documentation describing its usage context. Consider adding JSDoc or inline comments clarifying how this interface should be used, especially for newcomers.
323-334: Nicely structured streaming interface.
The `QueryStreamMarker` and `InquiryResponse` interfaces define a clear contract for streaming rows. Consider adding a short documentation comment explaining “preexisting” vs “new” markers for better clarity.
335-346: QueryResponse adds doc access—good approach!
Providing a parallel interface that includes the full document helps support both doc-less and doc-including scenarios. Check for duplication between `InquiryResponse` and `QueryResponse`, and consider refactoring common code if it becomes cumbersome.
480-485: all() method overloads.
TypeScript overloads can become tricky to maintain. Ensure thorough test coverage on both usage branches (`withDocs: false|true|undefined`).
tests/fireproof/streaming-api.test.ts (5)
14-17: DocType interface is straightforward.
Simple, clear definition for the document structure. Optionally, add doc comments to convey any constraints or usage details.
19-37: Database creation using Date.now().toString().
While this approach is typically safe for test isolation, there’s a small chance of collisions in extremely concurrent environments. Consider using a more robust unique suffix if needed.
57-80: testLive handles real-time doc streaming.
Good use of `for await (...)` with markers. Consider verifying error handling scenarios (e.g., if doc insertion fails or the stream is interrupted), but the logic looks solid for now.
81-127: testSince uses random doc counts.
Randomizing the number of new docs may introduce nondeterministic test behavior. Make sure you’re comfortable with that possibility. Otherwise, explicitly define a range to avoid unexpected flakiness.
174-204: "allDocs" test suite.
Nice variety of sub-tests. Consider adding boundary checks (e.g., no docs at all or large doc sets) if relevant for `.select()`.
src/crdt-helpers.ts (5)
46-46: time() helper for optional logging.
This function is commented out, so no noticeable effect. If it’s a placeholder for performance measurement, document how to toggle it on or remove it if no longer needed.
50-50: timeEnd() similarly stands unused.
Same reasoning: either document or remove if it’s strictly leftover debugging code.
292-304: clockUpdatesSince generator.
This new async generator cleanly streams rows with `clock` and `value`. Watch for any unhandled rejections in `getValueFromLink` if data is missing or corrupted. The existing code throws an error, which is good for debugging but ensure it's tested.
330-331: allowedKeys optional parameter.
Filtering updates by a set of allowed keys is a neat approach. Double-check you aren’t missing a parallel “excludedKeys” scenario if that’s relevant for the domain.
490-490: Compact step calls clockChangesSince with empty since array.
Iterating from the earliest possible state can be resource-intensive for large ledgers. Consider a threshold or checkpoint approach if performance becomes an issue.
src/indexer.ts (1)
307-340: Consider handling async steps in a loop rather than `forEach`.
The code calls an async callback within a `forEach`, which won’t wait for each iteration. This may be acceptable if concurrency is desired, but a `for` loop or `for...of` may help if sequential execution is needed.
src/database.ts (1)
126-136: Reusing `select()` for `changes`.
The approach is consistent with the new query mechanism. Note the “FIXME” about needing a `clock` property on rows, but the rest looks good.
Would you like help adding the `clock` field to the row structure and ensuring it’s set?
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
- build-docs.sh (0 hunks)
- eslint.config.mjs (1 hunks)
- src/crdt-clock.ts (2 hunks)
- src/crdt-helpers.ts (6 hunks)
- src/crdt.ts (4 hunks)
- src/database.ts (3 hunks)
- src/indexer-helpers.ts (3 hunks)
- src/indexer.ts (11 hunks)
- src/ledger.ts (3 hunks)
- src/react/img-file.ts (1 hunks)
- src/react/useFireproof.ts (2 hunks)
- src/types.ts (6 hunks)
- src/utils.ts (1 hunks)
- tests/fireproof/crdt.test.ts (8 hunks)
- tests/fireproof/database.test.ts (2 hunks)
- tests/fireproof/fireproof.test.ts (5 hunks)
- tests/fireproof/indexer.test.ts (12 hunks)
- tests/fireproof/streaming-api.test.ts (1 hunks)
💤 Files with no reviewable changes (1)
- build-docs.sh
🧰 Additional context used
🪛 Biome (1.9.4)
src/indexer-helpers.ts
[error] 217-217: Avoid the use of spread (...) syntax on accumulators.
Spread syntax should be avoided on accumulators (like those in .reduce) because it causes a time complexity of O(n^2).
Consider methods such as .splice or .push instead.
(lint/performance/noAccumulatingSpread)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: Build Fireproof
- GitHub Check: Build Fireproof
🔇 Additional comments (119)
src/react/img-file.ts (1)
4-4: Great improvement for cross-environment compatibility
Replacing `window` with `globalThis` makes the code more portable and allows it to work in non-browser environments like Node.js, which aligns with the ESLint rule restriction on direct use of globals.
src/utils.ts (1)
363-374: Good utility function for async iteration
This polyfill provides a convenient way to collect elements from an AsyncIterable into an array, which will be helpful for working with the new streaming APIs mentioned in the PR objectives.
The implementation is clean and the function signature with the generic type parameter ensures type safety across different use cases. This function will be particularly useful for transitioning code that expects arrays to work with the new streaming/generator-based methods.
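For reference, such a helper usually boils down to a `for await` loop; the sketch below is a hypothetical reconstruction, not the exact code in src/utils.ts.

```ts
// Hypothetical sketch of an AsyncIterable-to-array helper; the real
// arrayFromAsyncIterable in src/utils.ts may differ in detail.
async function arrayFromAsyncIterable<T>(iterable: AsyncIterable<T>): Promise<T[]> {
  const result: T[] = [];
  for await (const item of iterable) {
    result.push(item);
  }
  return result;
}
```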
eslint.config.mjs (1)
34-38: Improved TypeScript linting configuration
Adding this rule for unused variables with exceptions for underscore-prefixed identifiers follows best practices for TypeScript development and helps maintain code cleanliness.
The configuration correctly ignores parameters and destructured array elements that start with underscore, which is the conventional way to mark intentionally unused variables.
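A rule with that behaviour would look roughly like the sketch below; the exact entry in eslint.config.mjs is not reproduced here, so treat the option names as an assumption based on the standard @typescript-eslint/no-unused-vars options.

```ts
// Hypothetical excerpt; option names follow @typescript-eslint/no-unused-vars.
const noUnusedVarsRule = {
  "@typescript-eslint/no-unused-vars": [
    "error",
    { argsIgnorePattern: "^_", destructuredArrayIgnorePattern: "^_" },
  ],
};
```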
tests/fireproof/database.test.ts (2)
319-319: Validate test changes against updated API
The commented-out clock validation appears related to the API changes mentioned in the PR objectives where `ledger.clock` and `database.clock` have been updated.
Please confirm that this test modification is deliberate and aligns with the new API design where clock validation may be handled differently or the clock property is no longer exposed in the same way.
387-387: Validate test changes against updated API
Similar to the previous comment, this commented-out clock validation should be reviewed to ensure it's consistent with the API changes.
If the clock property is no longer part of the row object or is exposed differently in the new streaming API design, consider adding a comment explaining the change or updating the test to validate the new structure.
src/crdt-clock.ts (2)
16-16: No issues with the new import.
19-19: Import statement looks good.
tests/fireproof/fireproof.test.ts (9)
150-155: Verify store references for null or undefined cases.
Ensure that `db.ledger.crdt.blockstore.loader.attachedStores.local()` is always available before accessing `.active.car`, `.meta`, and `.wal`.
212-215: Simple query test logic looks good.
557-557: Refined type signature for mapFn.
Switching to `MapFn<Doc, Doc>` is consistent with the approach taken elsewhere in the codebase.
583-586: Query result is validated correctly.
594-597: Query expectations appear valid.
599-599: r2 assignment is clearly declared.
The usage of `await idx2.query().toArray()` is consistent and readable.
601-602: Confirmation of result length is appropriate.
615-615: Await usage for query is correct.
618-619: Validating result structure is correct.
src/react/useFireproof.ts (2)
34-34: Expanded generic signature is valid.
336-336: Updated type parameters integrate well with existing logic.
src/ledger.ts (3)
16-16: Addition of ClockHead import is fine.
105-108: Clock getter introduced in LedgerShell.
This read-only getter cleanly exposes the clock.
211-214: Returning a shallow copy of the head prevents mutation.
This helps maintain immutability for the ledger clock.
tests/fireproof/indexer.test.ts (36)
14-15: Imports appear consistent with usage.
These newly introduced imports (`IndexRow` and `arrayFromAsyncIterable`) are referenced later in the file and seem properly utilized.
54-54: Switch to `.toArray()` method looks correct.
The new usage aligns with the updated query interface, ensuring the test checks an actual array of results.
58-61: Consistent test approach with `.toArray()`.
Using `.toArray()` in both queries effectively verifies that the map function is called only once for the first query.
64-66: Array-based result validation.
Retrieving rows as an array allows for more direct assertions on length and content.
69-69: Query call updated to `.toArray()`.
No issues—this maintains consistency with the new approach.
73-73: Limit query usage remains consistent.
The `.toArray()` addition for limit queries is correct and straightforward.
77-77: Descending query test updated.
Switching to `.toArray()` ensures the test result is properly checked in array form.
81-81: Range query returning an array.
Matches the new query interface. Looks correct.
86-90: Repeated range queries use `.toArray()`.
Both calls to `.query(...)` for verifying repeated queries are updated, which is consistent with the revised API.
94-94: Range query code updated.
No issues identified; the test logic remains clear.
98-98: Single key query now yields an array.
Correct adaptation to the new result format.
102-102: Basic `.query().toArray()` usage.
This ensures the test covers the default query with no options.
133-133: Prefix-based query updated.
Adopts `.toArray()` consistently.
162-165: Verifying array results in the index test.
All checks for `rows` appear correct, confirming the number of documents indexed.
188-193: Asserting array-based results for map function with a value.
The test confirms `rows.length` and verifies that `value` is set correctly as `doc.title.length`.
196-196: Query call for doc inclusion test.
No issues—pattern follows the updated query usage.
229-229: Compound key prefix query updated.
The call to `.toArray()` is consistent with the rest of the file’s usage.
256-258: Checking array result length in new test.
Validates the presence of documents by counting the array length.
261-261: Including docs test with `.toArray()`.
Ensures the result array has expected doc structures.
288-290: Tests correct array length for numeric keys.
Again, `.toArray()` is consistent, verifying 4 documents.
293-293: Including docs with numeric keys.
All good. The approach is uniform with the rest of the file.
308-308: Typed array for `result`.
Declaring `IndexRow<string, TestType, TestType>[]` clarifies the structure.
346-346: Assigning query results to shared test variable.
This approach is consistent with the preceding lines.
357-357: Length assertion for `result`.
Ensures correctness of the 3 documents returned.
362-363: Retrieving changes via async iterable.
Demonstrates usage of `arrayFromAsyncIterable` to gather current CRDT changes.
366-366: Creating a new indexer for verification.
Confirms the new CRDT instance can still create an index with the same definition.
368-368: Calling `.toArray()` on the new indexer.
Checks that it returns the correct set of documents.
371-371: Ensuring array length correctness.
Remains consistent with earlier tests, verifying the result has 3 items.
377-379: Index creation plus CRDT changes retrieval.
Demonstrates how the second indexer uses `arrayFromAsyncIterable` to get current changes.
382-383: Retrieving changes from CRDT after an offset.
Continues to validate partial reads based on the clock head.
387-387: Checking second indexer query.
Switch to `.toArray()` ensures the map function isn’t re-run unnecessarily.
390-390: Confirming array length again.
The test asserts the total remains 3, indicating no new changes yet.
394-395: Retrieving changes after a bulk update.
Verifies one new change is returned.
398-398: Second indexer query post-change.
Checks that new data is properly reflected.
400-400: Ensuring final update count is 4.
Confirms the newly added document is included.
406-410: Handling meta changes in map function.
Demonstrates reversing the `doc.title` in an inline map function, then queries the index to confirm 3 documents.
tests/fireproof/crdt.test.ts (12)
1-12: New imports for CRDT tests.
Adding `arrayFromAsyncIterable`, `IndexKeyType`, and `DocFragment` expands test capabilities for asynchronous iteration and typed doc fragments.
107-108: Using `changes` with generics and `.reverse()`.
Correct usage of `arrayFromAsyncIterable(it)` clarifies the typed approach for partial docs.
111-111: Verifying doc fields.
Checking `result[0].doc?.hello` ensures partial docs are properly included.
164-165: Refined type parameters for `changes`.
The test clarifies retrieving typed docs (CRDTTestType) plus their fragments.
168-168: Confirming doc content within changes.
Ensures that the `points` field is correctly surfaced in the results.
178-180: Additional calls to `changes` with generics.
The chain of `arrayFromAsyncIterable` calls clarifies how new data is aggregated.
182-182: Retrieving CRDT changes from a specific head.
Good usage of `crdt.changes(firstPut.head)` to fetch changes after the initial commit.
186-187: Confirming no changes from the latest head.
This verifies correct behavior in an unchanged scenario.
241-242: Multi-write scenario with typed `changes`.
Stores and reverses the array of changes to confirm the insertion order.
245-245: Doc field validation after reversing.
Correctly checks `result[0].doc?.points` from the first inserted doc.
329-330: Read changes in a compaction context.
Ensures that `crdt.changes()` still returns the expected results before compaction.
349-351: Verifying changes persist post-compaction.
Once reversed, the first doc (ace) remains the top entry, ensuring continuity after compaction.
src/crdt.ts (10)
3-6: Ignoring missing charwise types.
The `@ts-expect-error` is valid given charwise lacks type definitions. Import usage is correct.
16-21: Additional helper imports for doc transformations.
All newly imported functions (docUpdateToDocWithId, etc.) appear to be referenced appropriately below.
41-46: Types for streaming queries.
Bringing in `QueryResponse`, `QueryStreamMarker`, and related types ensures strongly typed responses from the newly added methods.
51-51: New utility imports.
`arrayFromAsyncIterable` and `ensureLogger` are needed for asynchronous iteration and logging, respectively.
206-212: Private `#currentDocs` method.
Simplifies retrieving the current document set, bridging `all()` and `changes()`.
213-231: `#snapshot` generator.
Resolves awaiters (`waitFor`, `ready()`) then yields rows, ensuring any queued tasks are complete.
232-246: `#subscribe` method for real-time updates.
Pushes newly updated documents to a callback. Unsubscribe logic is cleanly returned.
248-293: `#stream` for live or future-only document streaming.
Implements a `ReadableStream` that enqueues existing rows (if `futureOnly` is false) and subsequent new rows for continuous consumption.
304-314: Unified `all()` method with doc or row variants.
Provides both doc-inclusive and doc-free iterators based on the `withDocs` flag.
316-333: Enhanced `changes` method supporting typed doc returns.
This refactor addresses both doc presence and partial or full row data, using `clockUpdatesSince(WithDoc)`.
src/types.ts (5)
256-259: DocumentRow extends Row effectively—looks good!
This extension cleanly adds a `doc` property to rows. Ensure that the combined usage and testing around partial, deleted, or missing documents remains consistent.
270-270: No concerns.
This appears to be a closing brace with no functional change, so there’s nothing more to address here.
471-479: allDocs returning QueryResponse.
Overloading `allDocs` to return a `QueryResponse` is logical. Keep in mind that future expansions (e.g., special query constraints or watch modes) might benefit from separate methods if complexity rises.
486-497: Flexible changes() overload.
Great job providing strongly typed return values based on the `withDocs` option. Monitor for potential confusion around optional parameters or future expansions. Otherwise, this is a clean approach.
262-262: Details
✅ Verification successful
Confirm consistency with IndexRow usage.
Adding or modifying this `key` property for IndexRow makes it align with your Row interfaces. Ensure no conflicting usage of `key` arises in the codebase.
Are all references to `IndexRow` updated accordingly? Consider running a quick repository-wide check:
🏁 Script executed:
#!/bin/bash
rg 'IndexRow<' -A 5
Length of output: 9112
Action: Confirming Consistency of `IndexRow` Usage
After running a repository-wide search, all instances of `IndexRow` consistently include the `readonly key: IndexKey<K>;` property. In every occurrence—in src/types.ts, src/missingTypes/prolly-trees.d.ts, tests, and usage within related components—the `key` field is present and aligns with the updated Row interface design. There is no evidence of conflicting usage of `key` in any part of the codebase.
Please proceed knowing that the change is consistent across the repository.
tests/fireproof/streaming-api.test.ts (8)
1-12: Initial imports and setup look fine.
Imports cover the needed APIs for testing without obvious redundancy. Good job keeping external references minimal.
39-56: testSnapshot helper usage.
This test function is straightforward, verifying snapshot size. No issues spotted. Make sure any edge cases (e.g., zero docs) are tested elsewhere.
129-146: testFuture confirms new doc arrival.
This is a clear test for future additions. The checking for docCount meets typical streaming patterns. Implementation looks good.
148-160: testSubscribe ensures subscription callback.
Using a promise around the subscribe callback is a practical approach to confirm invocation. Everything looks fine here.
162-168: testToArray verifies array conversion.
This final check ensures `.toArray()` accumulates documents as expected. Straightforward and robust.
210-243: "query - all" usage.
Tests for `db.select("name")` appear thorough and consistent with the `testX` helper functions. Nicely organized approach.
244-286: "query - additional items" scenario.
Again, a well-structured approach reusing the helper functions for thorough coverage. No immediate issues.
288-313: "query - excludeDocs" usage.
These lines confirm the doc is indeed excluded. Good to see that row-level testing is happening here. Looks coherent.
src/crdt-helpers.ts (5)
6-6: New import that includes UnknownLink.
This import supports doc retrieval logic with unknown link types. It’s cleanly used in the subsequent code. No concerns.
33-37: Additional imports from types: ClockLink, DocFragment, etc.
These entries are used in your new generator functions. The naming aligns well with existing code. No issues here.
306-322: clockUpdatesSinceWithDoc yields doc alongside row.
Returning `_deleted: true` for missing docs appears consistent. Just ensure the consumer logic respects that convention properly.
339-348: gatherUpdates function enhancements.
Yield-based iteration on CRDT ops is well-structured. The recursion logic and limit check look correct at first glance.
43-44: Details
✅ Verification successful
TypeScript ignore for "charwise" lacking official definitions.
You might want to see if a community-provided type definition or a minimal custom .d.ts stub is available.
Try checking if @types/charwise or something similar exists in npm:
🏁 Script executed:
#!/bin/bash
npm info @types/charwise
Length of output: 443
Verified: No Official Types for "charwise" Exist
I confirmed via the npm command that the package `@types/charwise` is not available (E404). Since no community-provided types are found, the current use of `// @ts-expect-error` is justified. However, if you plan on using "charwise" extensively, you might consider adding a minimal custom `.d.ts` stub for improved type safety.
29-39: Imports look good.
Newly introduced imports for `ClockHead`, `ChangesOptions`, `IndexRow`, `DocumentRow`, and `DocWithId` appear consistent with the file’s usage.
70-72: Interface updated for flexible types.
Allowing a second generic parameter `R` for `IndexDoc` expands type safety and reusability.
80-107: Implementation appears correct.
`indexEntriesForRows` uses `mapFn` appropriately and gracefully handles calls that don't provide a key. Looks good.
109-133: Clean handling of doc updates.
`indexEntriesForChanges` properly filters out deletions and undefined values, mapping them correctly. No issues found.
418-441: Index update flow is coherent.
The `_updateIndex` method properly collects and bulk-updates entries while respecting the current `indexHead`. Good approach.
22-27: New type imports are well-aligned.
No issues with introducing `QueryResponse`, `ChangesOptions`, `IndexRow`, `InquiryResponse`, `DocumentRow`, `Row`.
43-44: Additional helpers import looks good.
No concerns with newly imported `IndexDoc` and `indexEntriesForRows`.
46-48: Utility imports are fine.
Bringing in `arrayFromAsyncIterable`, `ensureLogger`, and `docUpdateToDocWithId` is consistent with usage.
83-83: Optional mapFn property is clear.
Enables flexible usage for the index. Implementation is straightforward.
110-110: Constructor changes.
Accepting `mapFn` and `meta` parameters here centralizes initialization and enhances clarity.
130-130: `applyMapFn` logic is consistent.
The method ensures that `mapFn` and `meta` do not conflict. Behavior is well-defined.
171-172: Falling back to auto-index by field name.
Providing a default map function that returns `doc[name]` is a convenient fallback. No issues.
196-221: In-place query builder rationale seems solid.
The `query` method introduces snapshot, subscribe, toArray, live, and future. It’s a well-structured approach that keeps usage moderately simple.
223-266: Refined `#query` method.
Filters results based on different query options (range, key, prefix, full). Implementation is logically coherent and clearly separated.
418-441: Update flow remains consistent.
Merging CRDT changes, building index entries, and bulk-indexing function as expected. No further issues.
src/database.ts (7)
26-27: New type aliases are properly used.
`QueryResponse` and `InquiryResponse` align well with the newly introduced query approach.
142-147: Consistent usage of `select()` in `allDocs`.
This integrates nicely with the new query architecture, though the added comment suggests future improvements for `opts`.
159-161: Exposing clock as a getter.
Having `this.ledger.clock` accessible directly is a straightforward design.
163-167: Subscription pattern is in sync with new indexing.
Subscribing to the default index and triggering the user-provided listener is intuitive.
171-180: `query` method for manual indexing logic.
Provides direct interaction with custom or field-based indexing. Implementation is consistent.
182-207: Enhanced `select()` method.
Overloaded signatures accommodate various usage patterns (maps, string fields, doc exclusion). Implementation is robust.
209-213: Database compaction method remains straightforward.
No concerns; it delegates to `ledger.crdt.compact()`.
Actionable comments posted: 3
♻️ Duplicate comments (1)
src/indexer-helpers.ts (1)
214-221: 🛠️ Refactor suggestion
Optimize array handling in reduce function
The spread operator in the reducer creates a new array on each iteration, which can cause performance issues with large datasets.
Replace the spread operator with array mutation for better performance:
- return [...acc, row];
+ acc.push(row);
+ return acc;
🧰 Tools
🪛 Biome (1.9.4)
[error] 217-217: Avoid the use of spread (...) syntax on accumulators.
Spread syntax should be avoided on accumulators (like those in .reduce) because it causes a time complexity of O(n^2).
Consider methods such as .splice or .push instead.
(lint/performance/noAccumulatingSpread)
🧹 Nitpick comments (6)
tests/fireproof/streaming-api.test.ts (4)
81-127: Consider making the random document count deterministic.
The `testSince` function uses random numbers to determine how many documents to create, which could make tests non-deterministic and potentially flaky.
Consider using fixed document counts for better test stability:
- const amountOfNewDocs = Math.floor(Math.random() * (10 - 1) + 1);
+ const amountOfNewDocs = 5; // Fixed number for consistency
- const amountOfSnapshotDocs = Math.floor(Math.random() * (10 - 4) + 4);
+ const amountOfSnapshotDocs = 6; // Fixed number for consistency
113-115: Clarify concurrency limitation comment.
The comment mentions concurrency limits preventing the use of `Promise.all`, but doesn't explain the nature of this limitation.
Consider expanding the comment to explain the specific concurrency limitation:
- // NOTE: Concurrency limit disallows for using `Promise.all` with x items
+ // NOTE: Database operation concurrency limit disallows using `Promise.all` with multiple operations.
+ // This sequential approach avoids potential race conditions or throttling issues.
148-160: Consider adding subscription cleanup.
The `testSubscribe` function doesn't explicitly clean up the subscription after the test, which could potentially lead to memory leaks or unexpected behavior in future tests.
Consider storing and cleaning up the subscription:
 async function testSubscribe<K extends IndexKeyType, T extends DocTypes, R extends DocFragment = T>(
   queryResponse: QueryResponse<K, T, R>,
 ) {
+  let unsubscribe: (() => void) | undefined;
   const row = await new Promise((resolve) => {
-    queryResponse.subscribe(resolve);
+    unsubscribe = queryResponse.subscribe(resolve);
     db.put({ _id: `doc-extra`, name: `doc-extra` });
   });
+  // Clean up subscription
+  if (unsubscribe) unsubscribe();
   expect(row).toBeTruthy();
   expect(row).toHaveProperty("id");
   expect(row).toHaveProperty("doc");
   expect((row as DocumentRow<K, T, R>).doc).toHaveProperty("name");
 }
This assumes the subscribe method returns an unsubscribe function. If it doesn't, consider adding that feature to improve API ergonomics.
290-300: Consider adding assertions for key and value properties.
In the `excludeDocs` test, you're checking that the document property doesn't exist, but not explicitly verifying that `key` and `value` properties do exist, which would better validate the expected behavior.
Add assertions for the expected properties:
 expect(doc).toBeTruthy();
 expect(doc).not.toHaveProperty("doc");
+expect(doc).toHaveProperty("key");
+expect(doc).toHaveProperty("value");
+expect(doc.key).toBe("doc-0"); // Or whatever the expected key format is
src/indexer.ts (1)
328-330: Type casting workaround could be improved
The comment about type overloading not working as expected suggests an issue with TypeScript's handling of this pattern.
Consider restructuring the conditional to avoid the need for casting:
if (excludeDocs === true) {
  const rowCallback = callback as (row: Row<K, R>) => void;
  rowCallback({ ...indexEntry, id: update.id });
} else {
  const docRow: DocumentRow<K, T, R> = { ...indexEntry, id: update.id, doc };
  (callback as (row: DocumentRow<K, T, R>) => void)(docRow);
}
src/database.ts (1)
1-213: Add documentation for the streaming API
While the implementation of the streaming API is solid, I don't see any JSDoc comments explaining how to use the new methods, particularly `select()` with its various overloads.
Consider adding comprehensive JSDoc comments for the new methods, especially the `select()` method, to help users understand how to use the streaming API effectively. Include examples of each usage pattern mentioned in the PR description.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
- build-docs.sh (0 hunks)
- eslint.config.mjs (1 hunks)
- src/crdt-clock.ts (2 hunks)
- src/crdt-helpers.ts (6 hunks)
- src/crdt.ts (4 hunks)
- src/database.ts (3 hunks)
- src/indexer-helpers.ts (3 hunks)
- src/indexer.ts (11 hunks)
- src/ledger.ts (3 hunks)
- src/react/img-file.ts (1 hunks)
- src/react/useFireproof.ts (2 hunks)
- src/types.ts (6 hunks)
- src/utils.ts (1 hunks)
- tests/fireproof/crdt.test.ts (8 hunks)
- tests/fireproof/database.test.ts (2 hunks)
- tests/fireproof/fireproof.test.ts (5 hunks)
- tests/fireproof/indexer.test.ts (12 hunks)
- tests/fireproof/streaming-api.test.ts (1 hunks)
💤 Files with no reviewable changes (1)
- build-docs.sh
🚧 Files skipped from review as they are similar to previous changes (7)
- src/react/img-file.ts
- src/utils.ts
- src/crdt-clock.ts
- tests/fireproof/database.test.ts
- tests/fireproof/indexer.test.ts
- tests/fireproof/crdt.test.ts
- src/react/useFireproof.ts
🧰 Additional context used
🪛 Biome (1.9.4)
src/indexer-helpers.ts
[error] 217-217: Avoid the use of spread (...) syntax on accumulators.
Spread syntax should be avoided on accumulators (like those in .reduce) because it causes a time complexity of O(n^2).
Consider methods such as .splice or .push instead.
(lint/performance/noAccumulatingSpread)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Build Fireproof
🔇 Additional comments (59)
tests/fireproof/streaming-api.test.ts (11)
1-17: Well-structured imports and interface definition.
The imports are organized properly and the DocType interface is clearly defined. The inclusion of all necessary types from "@fireproof/core" demonstrates good planning and understanding of the API requirements.
19-37: Test setup is clean and efficient.
The test setup correctly initializes a fresh database for each test and populates it with a configurable number of documents. The cleanup in `afterEach` ensures proper isolation between tests.
43-48: Good use of TypeScript generics for type safety.
The type definitions for Snapshot and Stream provide strong type safety throughout the tests. The generic parameters ensure consistency between the query parameters and result types.
49-56: Straightforward snapshot testing function.
The `testSnapshot` function effectively verifies that the expected number of documents are retrieved from a snapshot.
57-79: Comprehensive live stream testing with state validation.
The `testLive` function properly tests the streaming API by:
- Counting documents as they arrive
- Adding a new document after receiving all pre-existing ones
- Breaking the loop on new document notification
- Verifying the stream is properly closed after iteration
This thoroughly tests the streaming mechanics and event markers.
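In outline, such a test drives a `for await` loop over the live stream and switches on the marker; the shape below is only an assumption drawn from this description, not the actual test or marker type.

```ts
// Illustrative only: the marker shape and stream element type are assumed.
async function countUntilNewDoc(
  live: AsyncIterable<{ marker: { kind: "preexisting" | "new" } }>,
  addDoc: () => Promise<void>,
  expectedPreexisting: number,
): Promise<number> {
  let seen = 0;
  for await (const { marker } of live) {
    if (marker.kind === "preexisting") {
      seen += 1;
      // Once all pre-existing rows arrived, write a fresh doc to trigger a "new" marker.
      if (seen === expectedPreexisting) await addDoc();
    } else {
      break; // a "new" marker ends the loop
    }
  }
  return seen;
}
```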
129-146: Well-implemented future document test.
The `testFuture` function effectively tests the streaming API's ability to capture new documents added after the stream is created. It correctly creates new documents and then verifies that the expected number are captured.
162-168: Simple and effective array conversion test.
The `testToArray` function provides a clean way to verify the array conversion functionality.
174-204: Comprehensive test coverage for allDocs functionality.
This section thoroughly tests all the streaming API operations on the allDocs query type. Each test case focuses on a specific aspect of the API, providing good functional coverage.
245-287: Good test isolation for additional items scenario.
The "additional items" tests correctly set up a separate set of documents with a different field to test that querying by that field returns only those documents. This validates the field-specific query behavior.
174-314: Overall excellent test coverage of the new API.
The test suite provides comprehensive coverage of the new streaming API functionality. The tests are well-structured with good separation of concerns between helper functions and test cases. The use of TypeScript generics ensures type safety throughout.
A few minor suggestions for improvement:
- Consider making random values deterministic for more predictable tests
- Add explicit cleanup for subscriptions
- Maintain consistency in test patterns across different sections
Overall, this provides a solid foundation for validating the new streaming API.
223-228: Details
✅ Verification successful
Recheck the since parameter test in query section.
In the test case for `snapshot` and `live` with the `since` parameter, you're querying the "since" field, which might not match the intention, as it appears to be testing temporal functionality rather than querying by field name.
Verify if querying the "since" field is intentional, or if this should be testing the temporal functionality with:
🏁 Script executed:
#!/bin/bash
# Check if 'since' is defined as a field in other tests or just used as a temporal parameter
# Look for other uses of 'since' as a field
grep -r "since:" --include="*.ts" --include="*.js" tests/
# Look for temporal usages
grep -r "\.since" --include="*.ts" --include="*.js" tests/
# Check documentation of since parameter
grep -r "since parameter" --include="*.md" .
Length of output: 591
Clarification on the 'since' Parameter Test
After verifying the test file, it’s clear that the "since" parameter is intentionally used as a field. Specifically:
- The type signature indicates `(since: ClockHead)` is expected.
- Insertions using `db.put` explicitly include the `since` field (e.g., `doc-since-${i}`).
- There's consistent use of the field in both `snapshotCreator` and `streamCreator`.
This confirms that the query on the "since" field is deliberate rather than a misinterpretation of temporal functionality.
eslint.config.mjs (1)
34-38: Good addition of TypeScript no-unused vars rule.
This rule configuration is a best practice that flags unused variables while allowing intentional unused parameters with leading underscores. This helps catch potential bugs while supporting common TypeScript patterns.
tests/fireproof/fireproof.test.ts (7)
150-155: LGTM: Removal of `await` for store access.
The removal of `await` when accessing store properties indicates these properties are now accessed directly rather than being Promise-based. This is a valid refactoring following API changes.
202-206: Change in query result handling using `toArray()`.
The implementation has been updated to use a more modern approach for query results with `toArray()` instead of accessing the `rows` property directly.
212-215: Consistent update to use `toArray()` method for query results.
Good consistency in updating all occurrences to the new query API that uses `toArray()`.
557-557: Updated type declaration for mapFn.
The type has been updated from `MapFn<Doc>` to `MapFn<Doc, Doc>`, which provides more specificity about the input and output types of the mapping function.
582-586: Updated query execution pattern for idx2.
This follows the same pattern of using `toArray()` consistently throughout the codebase.
593-602: Consistent update for query result handling in reuse test.
The query result handling has been updated consistently here as well, maintaining the same pattern throughout the test suite.
616-619: Final consistent update for query result handling.
Maintains the same pattern of accessing query results with `toArray()`.
src/ledger.ts (3)
16-16: Update to import the ClockHead type.
This import is necessary for the new clock-related functionality being added to the ledger classes.
105-107: Addition of clock getter to LedgerShell class.
This new getter provides access to the clock from the underlying reference implementation, which aligns with the PR objectives of exposing clock data.
211-213: Implementation of clock getter in LedgerImpl.
The implementation returns a copy of the CRDT clock's head, which is a good practice as it prevents external code from directly modifying the internal clock state.
src/crdt.ts (11)
3-6: New imports and code organization.
The imports are well-organized, including the commented-out ts-expect-error for the 'charwise' package that has no types, which is a good practice to document known type issues.
16-22: Updated helper function imports.
These additions bring in necessary functions for the new streaming and query capabilities, aligning with the PR objectives.
41-46: Import of new type definitions.
The new types support the revamped query API, enabling type-safe interactions with the streaming functionality.
51-51: Import of utility functions.
The `arrayFromAsyncIterable` function is essential for converting async iterables to arrays in the new query API.
181-204: Revamped allDocs method implementation.
The implementation now provides a rich query API with multiple interaction options:
- `snapshot` for point-in-time retrieval
- `subscribe` for real-time updates
- `toArray` for easy collection handling
- `live` and `future` for different streaming behaviors
This aligns perfectly with the PR objectives to provide streaming/async iterator capabilities.
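As a quick orientation, the sketch below shows how some of these accessors might be consumed; the structural type is a stand-in, and the real QueryResponse generics, markers, and option objects are richer than shown.

```ts
// Stand-in for the parts of QueryResponse exercised here; not the real type.
interface QueryResponseLike<R> {
  toArray(): Promise<R[]>;
  snapshot(): AsyncIterable<R>;
  subscribe(cb: (row: R) => void): unknown; // per the review, this may return an unsubscribe fn
}

async function consume<R>(res: QueryResponseLike<R>): Promise<R[]> {
  res.subscribe((row) => {
    void row; // react to rows as they arrive
  });
  for await (const row of res.snapshot()) {
    void row; // point-in-time iteration
  }
  return res.toArray(); // one-shot array of all current rows
}
```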
206-212: Helper method to determine whether to use changes or all.
This private method provides a clean abstraction for deciding which method to use based on whether we're fetching since a specific point or all documents.
213-230: Implementation of snapshot functionality.
The `#snapshot` method creates an async generator that yields document rows, supporting fetching documents since a specific point in time. It correctly awaits waitFor and ready() before yielding results.
232-246: Implementation of subscribe functionality.
This method subscribes to clock tick events and converts updates to document rows, passing them to the callback function. The unsubscribe function is properly returned for cleanup.
248-293: Implementation of streaming functionality.
The `#stream` method creates a ReadableStream that emits document rows with metadata markers. It correctly handles:
- Preexisting data when `futureOnly` is false
- New data through subscription
- Proper cleanup on cancel
This is a modern approach using the Web Streams API.
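The general Web Streams pattern being described looks roughly like the following; this is a generic sketch with placeholder callbacks, not the actual `#stream` implementation.

```ts
// Generic pattern only; `listPreexisting` and `subscribe` are placeholders.
function makeLiveStream<R>(
  listPreexisting: () => AsyncIterable<R>,
  subscribe: (cb: (row: R) => void) => () => void,
  futureOnly: boolean,
): ReadableStream<{ row: R; marker: "preexisting" | "new" }> {
  let unsubscribe: (() => void) | undefined;
  return new ReadableStream<{ row: R; marker: "preexisting" | "new" }>({
    async start(controller) {
      if (!futureOnly) {
        // Emit everything that already exists first.
        for await (const row of listPreexisting()) {
          controller.enqueue({ row, marker: "preexisting" });
        }
      }
      // Then forward new rows as they are produced.
      unsubscribe = subscribe((row) => controller.enqueue({ row, marker: "new" }));
    },
    cancel() {
      unsubscribe?.(); // stop pushing once the consumer cancels
    },
  });
}
```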
304-314: Implementation of all method with type overloads.
This provides type-safe access to either document rows or simplified rows based on the `withDocs` parameter. The implementation uses the appropriate helper functions for each case.
316-333: Implementation of changes method with type overloads.
Similar to the `all` method, this provides type-safe access to changes since a specific point in time, with or without the full document data. The implementation correctly uses either `clockUpdatesSince` or `clockUpdatesSinceWithDoc` based on the options.
300-300: API change from `includeDocs` to `excludeDocs` matches discussed requirements
The change from `includeDocs` to `excludeDocs` aligns with the discussions in previous review comments. This properly makes document inclusion the default behavior while allowing exclusion as an option.
250-258: Well-designed interfaces for query result handling
The new `Row` and `DocumentRow` interfaces provide a clean separation of concerns. `Row` handles the basic row structure while `DocumentRow` extends it to include the document data. This design enables the API to support both document-included and document-excluded query patterns efficiently.
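Conceptually, the relationship described here is the following (a rough sketch with simplified field types; the real interfaces in src/types.ts use `IndexKey<K>` and `DocWithId<T>`).

```ts
// Simplified sketch of the described split; not the actual declarations.
interface RowSketch<K, R> {
  readonly id?: string;
  readonly key: K;   // src/types.ts uses IndexKey<K> here
  readonly value?: R;
}

interface DocumentRowSketch<K, T, R> extends RowSketch<K, R> {
  readonly doc: T;   // src/types.ts uses DocWithId<T> here
}
```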
325-346: Good separation of response interfaces based on document inclusion
The dual interfaces `InquiryResponse` and `QueryResponse` provide consistent method signatures with appropriate type variations, supporting the streaming API objectives of the PR. This design allows for consistent method names while maintaining type safety based on whether documents are included.
348-349: Improved type flexibility with generic emit functions
The updated `EmitFn` and `MapFn` type definitions with generic type parameter R provide better type safety for map/reduce operations, ensuring that emitted values match expected return types.
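For illustration, a typed map function in this style might look like the sketch below; the exact `EmitFn`/`MapFn` signatures are assumptions based on this description.

```ts
// Assumed emit signature, for illustration only.
type EmitLike<R> = (key: string, value?: R) => void;

// Index a doc by its title and emit the title length as a typed value,
// mirroring the map-function tests mentioned elsewhere in this review.
const titleLength = (doc: { title: string }, emit: EmitLike<number>): void => {
  emit(doc.title, doc.title.length);
};
```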
471-498: Comprehensive async generator methods in CRDT interface
The new overloaded methods provide a flexible API for retrieving rows with or without documents attached. The signatures correctly handle the different return types based on the `withDocs` parameter, enabling proper type checking in consuming code.
617-632: Well-designed select method with multiple overloads
The `select` method signatures provide comprehensive options for querying data with appropriate return types based on whether documents are included. This enables both type safety and API flexibility.
608-608: Clock getter replaces previous method approach
Converting the clock from a method to a getter property provides a cleaner API surface. This is a good modernization of the interface.
680-680: Removing subscription model in favor of streaming API
Commenting out the `subscribe` method in the Ledger interface aligns with the PR objective of moving to a streaming-based API rather than a callback-based subscription model.
136-138: Useful utility for document conversion
The `docUpdateToDocWithId` function provides a clean way to convert between document representations. This improves code readability by centralizing this common transformation logic.
292-321: Efficient generator functions for streaming document updates
These new generator functions enable streaming of document updates with and without document content, providing the foundation for the streaming API mentioned in the PR objectives. The implementation efficiently uses `clockChangesSince` to avoid duplicating code.
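Abstractly, the "with docs" variant layers document loading over the same update stream, along the lines of the sketch below; the names and tombstone shape are placeholders inferred from this review, not the real helpers.

```ts
// Placeholder shape: pairs each streamed id with its doc, emitting a
// `_deleted: true` tombstone when the doc cannot be loaded.
async function* updatesWithDocs<D>(
  ids: AsyncIterable<string>,
  fetchDoc: (id: string) => Promise<D | undefined>,
): AsyncGenerator<{ id: string; doc: D | { _deleted: true } }> {
  for await (const id of ids) {
    const doc = await fetchDoc(id);
    yield { id, doc: doc ?? { _deleted: true } };
  }
}
```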
318-319: Review temporary implementation note
There's a comment indicating this implementation may not be technically correct. Consider resolving this before final release.
Could you clarify what aspect of the implementation is not correct and if there's a plan to address it before finalizing the API?
324-337: Improved streaming with generator functions
Converting `clockChangesSince` to return an AsyncGenerator instead of a Promise with an array is a good pattern for streaming data and aligns with the PR's objectives.
403-414: Clean implementation of document retrieval with generators
The `getAllEntriesWithDoc` function efficiently extends `getAllEntries` to include document data, providing a consistent pattern for streaming document data.
416-420: Utility function for extracting document values
The `docValues` function provides a clean way to filter out metadata fields (those starting with `_`). This simplifies the handling of document data throughout the codebase.
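The behaviour described here amounts to something like the sketch below (an assumption, not the actual implementation), which also lines up with the PR description's example of `{ a: "A", b: "B" }` yielding `[ "A", "B" ]`.

```ts
// Sketch only: drop metadata fields (keys starting with "_") and return the rest.
function docValuesSketch(doc: Record<string, unknown>): unknown[] {
  return Object.entries(doc)
    .filter(([key]) => !key.startsWith("_")) // e.g. skip _id, _deleted
    .map(([, value]) => value);
}
```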
70-73: Enhanced type flexibility with generic value type
The `IndexDoc` interface now accepts a generic type parameter `R` for the value property, providing better type safety and flexibility throughout the indexing system.
80-107: Good utility for processing document rows
The `indexEntriesForRows` function centralizes the logic for mapping document rows to index entries, handling edge cases appropriately and ensuring consistent behavior.
197-248: Enhanced query application with streaming support
The updated `applyQuery` function now returns an AsyncGenerator, enabling streaming of query results. The function also handles additional query options like `since` and `sinceOptions` for filtering results based on clock changes.
🧰 Tools
🪛 Biome (1.9.4)
[error] 217-217: Avoid the use of spread (...) syntax on accumulators.
Spread syntax should be avoided on accumulators (like those in .reduce) because it causes a time complexity of O(n^2).
Consider methods such as .splice or .push instead.
(lint/performance/noAccumulatingSpread)
src/indexer.ts (4)
44-48: Good utility imports for streaming functionality
The addition of the `indexEntriesForRows` and `arrayFromAsyncIterable` imports provides the foundation for the new streaming capabilities in the indexer.
196-221: Well-designed query method with appropriate overloads
The query method now has overloads that return either `InquiryResponse` or `QueryResponse` based on the `excludeDocs` option, providing a type-safe API for both patterns.
342-398: Comprehensive stream implementation
The `#stream` method provides a solid implementation of the ReadableStream interface, handling both preexisting and new data appropriately. This is key to enabling the streaming API capabilities mentioned in the PR objectives.
418-432: Clean implementation of index updates
The update index implementation efficiently handles both initial population and incremental updates based on the state of `indexHead`. The use of async generators from the CRDT interface provides a consistent pattern throughout the codebase.
26-27: New type imports for streaming API support
These new types (`QueryResponse` and `InquiryResponse`) are essential for the streaming allDocs/query API implementation, supporting the various overloads of the new `select()` method.
159-161: New clock getter exposes ledger clock
This getter provides convenient access to the ledger's clock, which is useful for clients that need to track database state for change detection.
163-167: Simplified subscribe implementation using select
The `subscribe` method has been refactored to leverage the new streaming capabilities, which is more elegant and consistent with the overall API design.
However, note that this implementation changes the behavior: previously it might have passed multiple documents to the listener, but now it passes them one at a time. Make sure this is documented and that consumers are aware of this change.
170-180: Refactored query method using select
The `query` method has been simplified by delegating to the new `select` method, which makes the code more maintainable and consistent.
Note that the return type structure has changed: the `rows` property now directly contains the array result rather than wrapping it in another object. Ensure this change is documented for API consumers.
182-207: New select method with comprehensive overloads
The `select` method is a well-designed addition that forms the core of the new streaming API. The multiple overloads provide flexibility while maintaining type safety.
The implementation elegantly handles the different parameter combinations and delegates to either `ledger.crdt.allDocs` or `idx.query` based on whether a field is provided. This is a clean approach that centralizes the query logic.
One suggestion to improve code clarity:
- const field = typeof a === "string" || typeof a === "function" ? a : undefined;
- const opts = b ? b : typeof a === "object" ? a : {};
+ // Extract field and options from parameters
+ const field = (typeof a === "string" || typeof a === "function") ? a : undefined;
+ const opts = b || (typeof a === "object" ? a : {});
const qry = this.select<IndexKeyType, T>();

// FIXME: row must have `clock` property
const rows = (await qry.toArray({ ...opts, since }))
  .map((row) => ({
    key: row.key[1],
    value: row.doc,
  }))
  .reverse();

return { rows, clock: this.ledger.clock, name: this.name };
}
Address FIXME comment about clock property
The changes method has been refactored to use the new select method, which is a good architectural improvement. However, there's a FIXME comment that needs to be addressed before this code is production-ready.
The comment on line 128 indicates that rows must have a clock property, but the code doesn't ensure this. Consider adding the clock property in the mapping function:
  const rows = (await qry.toArray({ ...opts, since }))
    .map((row) => ({
      key: row.key[1],
      value: row.doc,
+     clock: row.clock || this.ledger.clock
    }))
    .reverse();
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
const qry = this.select<IndexKeyType, T>();
// FIXME: row must have `clock` property
const rows = (await qry.toArray({ ...opts, since }))
  .map((row) => ({
    key: row.key[1],
    value: row.doc,
    clock: row.clock || this.ledger.clock
  }))
  .reverse();
return { rows, clock: this.ledger.clock, name: this.name };
}
// FIXME: Passing opts doesn't actually do anything yet
const qry = this.select<IndexKeyType, T>(opts);
const rows = (await qry.toArray()).map((row) => ({ key: row.key[1], value: row.doc }));

return { rows, clock: this.ledger.clock, name: this.name };
}
Implement options handling for allDocs
The allDocs method has been refactored to use the new select method, which is good for consistency. However, there's an unresolved issue with options handling.
The comment on line 142 indicates that passing options doesn't actually do anything yet. This should be implemented before merging:
// FIXME: Passing opts doesn't actually do anything yet
- const qry = this.select<IndexKeyType, T>(opts);
+ // Apply the options to filter/sort the results
+ // For example:
+ const qry = this.select<IndexKeyType, T>(opts);
+ // Implement option handling like limit, skip, etc.
Also, the PR objectives mention an open question about the definition of "value" in the context of "allDocs" flow when no field is specified. This needs to be clarified and documented.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// FIXME: Passing opts doesn't actually do anything yet
// Apply the options to filter/sort the results
// For example:
const qry = this.select<IndexKeyType, T>(opts);
// Implement option handling like limit, skip, etc.
const rows = (await qry.toArray()).map((row) => ({ key: row.key[1], value: row.doc }));
return { rows, clock: this.ledger.clock, name: this.name };
}
Adds the ability to stream or async iterate documents.
New API
The old API is still there; in addition to that, this PR adds the new method:
- `database.select()`
- `database.select("field")`
- `database.select(queryOpts)`
- `database.select("field", queryOpts)`
All docs behaviour with `.select()`; query behaviour with `.select("field")`.
Other changes
`ledger.clock` and `database.clock` return the current `crdt.clock.head`. `Ledger.changes` and `Ledger.subscribe` have been removed.
To do
- `excludeDocs` option
- `key` and `value` from prolly tree in all results
- `.select()` method (which merges `allDocs` and `query`)
- `allDocs` flow in `.select()` uses query options
- `query` flow in `.select()` uses query options
- `row.clock` property?
Unclear
It's not clear what `value` should be in the "allDocs" flow (i.e. when not specifying a field and thus listing all documents). I made this a list of all "values" in the document. For example, given the doc `{ a: "A", b: "B" }`, the value will be `[ "A", "B" ]`.
Summary by CodeRabbit
New Features
Refactor
Tests