Skip to content

Conversation

@FUDCo
Copy link
Contributor

@FUDCo FUDCo commented Oct 21, 2024

This is a first cut at a storage wrapper for the kernel. What I have assumes it is passed a dumb string/string key/value store, and builds on top of that, sometimes in ways that are necessarily a bit indirect. Obviously, if we have a more featureful storage substrate we could potentially take advantage of its features to do this stuff more directly (such a thing would also imply refactoring the construction API to push the responsibility for creating whatever object or objects actually accesses the storage to a lower level in the code, rather than simply passing in the kv store at the top).

It's also worth mentioning that at this point there is not yet any notion of transaction boundaries, which will absolutely be needed but which is more wrapped up in the specifics of the underlying storage layer than the stuff here is.

@FUDCo FUDCo self-assigned this Oct 21, 2024
@FUDCo FUDCo force-pushed the chip/kernel-storage branch from 2c3fb23 to f9dd5b4 Compare October 21, 2024 23:31
@FUDCo FUDCo marked this pull request as ready for review October 21, 2024 23:31
@FUDCo FUDCo requested a review from a team as a code owner October 21, 2024 23:32
* @returns The value at that key.
*/
function kvGet(key: string): string {
function kvGet(key: string, required: boolean): string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
function kvGet(key: string, required: boolean): string {
function kvGet(key: string, required: boolean): string | undefined {

But why don't we want to return undefined? I don't get that return undefined as unknown as string; below

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naively, I agree it seems preferable for this function to be honest about occasionally returning undefined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where TypeScript and I butt heads a lot. Although adding undefined to the possible return types is strictly more correct, "might be undefined" introduces a type contagion that infects distant parts of the call graph, even though in all but a few cases undefined is never an actual possibility and the declaration forces everything that ever touches the return value to implement code to account for the thing that will never happen happening just to make the compiler happy (plus of course all those conditionals cluttering up the code will show up as gaps in code coverage since the thing never happens).

In practice I've found it's sometimes cleaner to pretend to the type system that it's always just a string (or whatever), then just explicitly look for undefined in the few places where that's actually relevant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's not really possible to retrieve an undefined key in practice, why don't we just get rid of getRequired and make get throw if the key is undefined? At the moment, should get ever return undefined, we will get a bunch of hard-to-decipher errors, when we could just get an easily comprehensible one instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible to retrieve an undefined key in practice, and in fact it's useful for testing if a value has been set in the first place (particularly useful in the case of lazy initialization of things that might never be used), so merely having the result be undefined is not itself an error condition. It's only an error when you need the value to be there. In a nutshell, the problem is that TS wants you make an a priori policy decision as to whether undefined is always ok or never ok, but doesn't deal gracefully with the common case where ok vs. not ok is context sensitive.

@sirtimid
Copy link
Contributor

So if I'm getting this right, based on the description, we can create a couple of separate followup issues

  • Modularize SQL operations into a separate data access layer.
  • Wrap SQLite operations in transactions

What about these:

  • Add cache eviction policy (LRU or lazy expiration, something to avoid memory bloat)
  • Batch SQLite operations to improve performance (I remember you mentioning it some time ago)

@sirtimid sirtimid requested a review from rekmarks October 22, 2024 13:19
@rekmarks rekmarks linked an issue Oct 22, 2024 that may be closed by this pull request
Copy link
Member

@rekmarks rekmarks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I have mostly questions and minor suggestions.

Speaking of which, what exactly is a "kernel promise"? Where do they come from? What are they for?

* @returns The value at that key.
*/
function kvGet(key: string): string {
function kvGet(key: string, required: boolean): string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naively, I agree it seems preferable for this function to be honest about occasionally returning undefined.

* @param str - The string.
* @returns The same string coerced to type Message
*/
function sm(str: string): Message {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
function sm(str: string): Message {
function asMessage(str: string): Message {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, I think I'd prefer the creation of a makeMessage factory function so we don't have to lie. Not a blocker because we have not sworn an oath of honesty to TypeScript.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was intended as a piece of syntactic sugar for the sole purpose of getting TS to shut up, so I was aiming to make it as close to invisible as possible. I actually wanted to use a one character name but our lint rules wouldn't allow for that. Having a genuine Message factory function might be better once we're actually in a position to have more definite opinions about what a Message really is, but here all that was important was that the thing be stringifiable and pass the type checker.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't mind the existence of the function, nor do I mind abbreviations that I can decipher, but I'm a strong proponent of self-documenting names, and I can't make heads or tails of sm.

Copy link
Contributor Author

@FUDCo FUDCo Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if it would be useful to have a conventional terse name (e.g., x) that means "function that's purely instrumental for making TypeScript shut up; just pretend it's not even here". 😃

(Note that the major and probably only use of this would be in constructing tests, which often erect rickety and not quite kosher scaffolding to avoid sucking in more of the world.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I revised it to have better documentation and an arguably better name that goes with the documentation, but still terse. Lemme know what you think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, I yield!

@FUDCo
Copy link
Contributor Author

FUDCo commented Oct 22, 2024

what exactly is a "kernel promise"? Where do they come from? What are they for?

This is the kernel's representation of a promise that has been exported from one vat and imported into one or more others. If the vat(s) into which it has been imported sends a message to it, the message has to wait somewhere for the promise to be resolved, plus when the promise settles the requisite notifications have to be sent to any vats that might be doing a then on it. This is where the bookkeeping for all of that happens. (And writing this reply points out a bit that I missed, which is that the kernel promise needs to keep a subscriber list.)

@FUDCo
Copy link
Contributor Author

FUDCo commented Oct 23, 2024

So if I'm getting this right, based on the description, we can create a couple of separate followup issues

  • Modularize SQL operations into a separate data access layer.
  • Wrap SQLite operations in transactions

What about these:

  • Add cache eviction policy (LRU or lazy expiration, something to avoid memory bloat)
  • Batch SQLite operations to improve performance (I remember you mentioning it some time ago)

If we decide to go the SQL route, for a lot of the data I'd want implement much of the functionality in terms of more direct database operations rather than having to map everything onto a dumb key/value store. So I think a priority is probably making that decision (which may unpack into some additional benchmarking if we don't already know enough to decide (which we might)).

Caching is another interesting question. Right now it caches specific items which we know are going to be used with high frequency, rather that doing some kind of heuristic general cache thing for arbitrary data, but we might well want to add such a thing. One thing I've never been able to get a clear read on is how much, if any, caching SQLite itself already does internally -- I can imagine anything from none at all to speak of to quite a lot.

@sirtimid
Copy link
Contributor

sirtimid commented Oct 24, 2024

The newly introduced command yarn test:ts (which you'll get when you rebase from master) shows the following errors

Screenshot 2024-10-24 at 16 35 03

ps: I am renaming it to yarn lint:ts here, I think it's more appropriate.

@FUDCo
Copy link
Contributor Author

FUDCo commented Oct 24, 2024

The newly introduced command yarn test:ts (which you'll get when you rebase from master) ...

Not directly germane to the issue you're flagging, but: I'm really not a fan of using namespaced labels (e.g., foo:bar) for scripts that one would commonly use from the command line. I would reserve these for script elements used compositionally (which you might invoke from the command line in the course of debugging scripts themselves or when trying to diagnose weird problems, but would not ordinarily use directly).

@rekmarks
Copy link
Member

@FUDCo I agree, and as of #197 lint:ts is a component of lint:

ocap-kernel/package.json

Lines 23 to 29 in 466a4c4

"lint": "yarn constraints && yarn lint:ts && yarn lint:eslint && yarn lint:misc --check && yarn lint:dependencies",
"lint:dependencies": "yarn dedupe --check && yarn depcheck && yarn workspaces foreach --all --parallel --verbose run lint:dependencies",
"lint:dependencies:fix": "yarn dedupe && yarn depcheck && yarn workspaces foreach --all --parallel --verbose run lint:dependencies",
"lint:eslint": "yarn eslint . --cache",
"lint:fix": "yarn constraints --fix && yarn lint:ts && yarn lint:eslint --fix && yarn lint:misc --write && yarn lint:dependencies:fix",
"lint:misc": "prettier --no-error-on-unmatched-pattern '**/*.json' '**/*.md' '**/*.html' '!**/CHANGELOG.old.md' '**/*.yml' '!.yarnrc.yml' '!merged-packages/**' --ignore-path .gitignore",
"lint:ts": "yarn workspaces foreach --all --parallel --verbose run lint:ts",

@FUDCo FUDCo force-pushed the chip/kernel-storage branch from a1992ed to 8e84a92 Compare October 24, 2024 23:37
@FUDCo FUDCo requested a review from rekmarks October 24, 2024 23:38
Copy link
Member

@rekmarks rekmarks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@FUDCo FUDCo merged commit 9739db1 into main Oct 25, 2024
@FUDCo FUDCo deleted the chip/kernel-storage branch October 25, 2024 00:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(kernel): Establish lightweight persistence abstraction

4 participants