-
Notifications
You must be signed in to change notification settings - Fork 6
feat: Integrate kernel with liveslots #317
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
c024858 to
16e5423
Compare
|
👍 Dependency issues cleared. Learn more about Socket for GitHub ↗︎ This PR previously contained dependency changes with security issues that have been resolved, removed, or ignored. Ignoring: Next stepsTake a deeper look at the dependencyTake a moment to review the security alert above. Review the linked package source code to understand the potential risk. Ensure the package is not malicious before proceeding. If you're unsure how to proceed, reach out to your security team or ask the Socket team for help at support [AT] socket [DOT] dev. Remove the packageIf you happen to install a dependency that Socket reports as Known Malware you should immediately remove it and select a different dependency. For other alert types, you may may wish to investigate alternative packages or consider if there are other ways to mitigate the specific risk posed by the dependency. Mark a package as acceptable riskTo ignore an alert, reply with a comment starting with |
3e4a370 to
b723f19
Compare
|
@SocketSecurity ignore npm/@agoric/store@0.9.3-upgrade-18-dev-ef001c0.0 npm/@agoric/internal@0.4.0-u18.0 npm/@agoric/base-zone@0.1.1-upgrade-18-dev-ef001c0.0 npm/@agoric/swingset-liveslots@0.10.3-u18.0 Socket security appears to just be confused about these, per @mhofman |
|
@SocketSecurity ignore npm/negotiator@1.0.0 npm/unique-slug@5.0.0 npm/@npmcli/fs@4.0.0, npm/unique-filename@4.0.0 npm/negotiator@0.6.4 npm/@npmcli/fs@4.0.0 npm/@mapbox/node-pre-gyp@2.0.0-rc.0 Authors changed, yeah. |
|
@SocketSecurity ignore npm/socks-proxy-agent@8.0.5 Mature package that needs net access because it's fundamentally about net access. |
|
@SocketSecurity ignore npm/emoji-regex-xs@1.0.0 Not a typo squat, just an enhancement. |
17ff207 to
3d664e9
Compare
rekmarks
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay! I definitely haven't fully digested this work, but here I present some questions and suggestions that will hopefully bring me closer to understanding.
sirtimid
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you change the ./packages/extension/src/ui/App.module.css
and replace
.rightPanel {
position: relative;with
.rightPanel {
position: sticky;
top: var(--spacing-lg);so we don't lose the message panel on scroll?
sirtimid
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am just thinking that the const { log } = console; destructuring seems to bypass the purpose of having a structured logger utility. The original logger was created to add context (like [ocap kernel]) to log messages
sirtimid
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following suggestion perhaps can be done on a follow up PR:
I think we can make the message passing between components (Vat → VatSupervisor → VatHandle → Kernel) simpler and safer by using a MessageBus pattern.
Right now we're doing a lot of type casting (like rawTarget as KRef) and passing messages through multiple components. Instead, we could have components talk through a MessageBus, which would remove the need for type casting - messages would have proper types from the start and make the flow simpler - components would just send messages to the bus instead of needing to know about each other.
Also having all messages in one place would make debugging and testing easier, since, with the MessageBus abstraction, we can move the syscall handling to the Kernel and remove direct kernel dependencies from VatHandle. some example code
// Instead of passing through multiple components:
Vat -> VatSupervisor -> VatHandle -> Kernel.deliver -> Kernel.routeMessage
// We'd just have:
Component -> MessageBus -> Subscribers
type SyscallOperation =
| 'send'
| 'subscribe'
| 'resolve'
| 'exit'
| 'dropImports'
| 'retireImports'
| 'retireExports'
| 'abandonExports';
type MessageType = {
type: 'syscall';
vatId: VatId;
operation: SyscallOperation;
payload: unknown;
} | {
type: 'delivery';
target: VRef;
message: Message;
} | {
type: 'notification';
resolutions: VatOneResolution[];
};
interface MessageBus {
publish(message: MessageType): Promise<void>;
subscribe(handler: (message: MessageType) => Promise<void>): void;
unsubscribe(handler: (message: MessageType) => Promise<void>): void;
}
// Improved Kernel class
export class Kernel {
#messageBus: KernelMessageBus;
async #handleMessage(message: KernelMessage): Promise<void> {
switch (message.type) {
case 'send': {
const { target, message: msg, vatId } = message.payload;
// No type casting needed, flow is more direct
const route = this.#routeMessage({ target, message: msg });
if (route?.vatId) {
const vat = this.#getVat(vatId);
await vat.deliverMessage(
this.#translateRefKtoE(vatId, target, false),
this.#translateMessageKtoE(vatId, msg),
);
}
break;
}
}
}
}
// and we can call it like this:
this.#messageBus.publish({
type: 'syscall',
vatId: this.vatId,
operation: 'send',
payload: { target, message },
});
Bypassing the structured logger was indeed the purpose - getting rid of visual clutter to make debugging easier. |
|
With respect to the message bus idea, I'm think I'm just not getting it. It is definitely true that there's a tedious amount of casting, but I think most of it is due to impedance mismatch between our type definitions and those that come bundled with liveslots, which this proposal won't help at all. (The mismatch can largely be summarized as: they define a bunch of things as strings which we are more specific about. Once we get Agoric's NPM packaging under control I think we can fix the typedefs on their end.) One thing about the current message setup that I do find maddening is the comingling of the vat command channel, in which the kernel sends requests, to which the vat replies, with the syscall channel, in which the vat sends requests, to which the kernel replies. But I don't see how the message bus idea helps with that; if anything it seems to continue it under another guise. I also don't follow the motivation for moving syscall handling into the kernel. We want syscalls to be handled in Basically, I don't see what this buys us. But it's entirely possible I'm just not understanding what is actually being proposed here, so we should talk. |
5e3da2d to
3d40600
Compare
Yeah ok, I think I was wrong about the benefits of a message bus here. I didn't think of the additional lookups. That would actually make it more complex. The current architecture actually makes more sense since VatHandle needs to keep vat-specific state and syscall handling together. Thank you for the thoughtful critique! |
sirtimid
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but a second approval would be great for confirmation.
63b5590 to
a98115e
Compare
I'm thinking about what to do about this. @FUDCo your I suppose the structured logger would prepend that call with e.g.
|
those whinges having been introduced by the nodejs package
|
Report too large to display inline |
Yeah, the debug log output was intended to be ephemeral. The More generally, I've found a good log of message traffic (and very little else) is one of the most valuable tools for debugging distributed systems, and have invested a lot of effort over the years in making the output of these log entries legible. This is in direct conflict with most logging utilities which just keep adding stuff to log entries that makes the them less legible. (When I'm debugging some hairy network protocol thing I find I often end up spending ridiculous amounts of time in emacs replacing strings and reformatting things to boil log files down to the just the stuff I want to see). I suspect the long term solution is some kind of structured log thing where the entries are JSON that can be post-processed with |
|
worried my comment may be within a fold somewhere, so: #317 (comment) |
rekmarks
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
This PR implements the first phase of integration between the
@agoric/swingset-liveslotspackage and our kernel.This enables full vat-to-vat messaging, including serialization and deserialization of all passable value types, intervat object reference mapping, and cross-vat promise resolution. This means we now have a minimum viable multi-vat computational platform!
The word "minimal" there is important: there remain a few rough edges and a number of important things that still need to be done, which should be converted into issues of their own in our project plan.
Notable points:
This PR adds kernel support for the vat syscall and liveslots delivery interfaces, but while the skeleton for these is complete, the kernel does not yet implement the elements of these interfaces that realize distributed reference GC. Without this, the memory footprint of the vats and the kernel will grow over time without bound, which is, of course, bad, but it's entirely adequate for things like demos, proofs of concept, and of course as the jumping off point for actually implementing the DGC stuff. Issue Implement kernel-side support for liveslots distributed garbage collection #329 has been added for this.
The liveslots vat syscall interface is nominally a synchronous API, whereas the kernel<->vat message pathway is unavoidably asynchronous on account of how web APIs work. However, the only portions of this API that truly require synchronous invocation are those that pertain to vat storage access. We deal with this by implementing the vat storage portions of the syscall API directly in the vat process as part of the
VatSupervisorclass. Basically, each vat gets its own SQLite database for its state. One consequence of this is that, for the time being at least, vats run on top of a memory-only version of SQLite, which means the nominally persistent vat state is not actually persistent; our discussions have convinced us that for a lot of applications this ephemerality is actually acceptable. However, some applications will require genuine vat persistence. The way to achieve this is to run them inside web workers instead of iframes, where SQLite can make use of the synchronous part of the OPFS API (as the kernel does). However, the version of the code in this PR does not include support for vats in web workers; that will need to be added. Issue Implement support for running vats in web workers to enable SQLite to have synchronous OPFS access #332 has been added for this.An additional point about wrapping the synchronous syscall interface around the asynchronous vat<->kernel interface: one reason we can get away with this is that, aside from the vat storage methods already discussed, the results returned by syscall methods largely don't matter -- the only thing returned is an error indication in the case of something going wrong, and none of these calls can go wrong as the result of user error. In other words, they should all always be successful except in the case of bugs in our own code. So the
VatSupervisorimplements these by shipping off the asynchronous syscall request to the kernel and then synchronously returning a success indicator to liveslots. This really should be OK, but probably some better logging is called for in order to catch bugs in our stuff in the event we have any (like that could ever happen, hah). Issue Make vats handle syscall failures better (or at all) #333 has been added for this.In the same vein, in the interest of minimizing unanticipated implementation shear due to expectations built into liveslots, this PR implements the very serial one-crank-at-a-time model that SwingSet currently does. However, I don't think this is strictly necessary. In particular, since vats each have their own storage, they can implement internal message queues and then be allowed to run freely, with all of the side effects of a delivery being released to the kernel in a single blob upon crank completion. This is an opportunity for better parallelism and likely better performance, as well as giving us more protection from what happens if a piece of user code in a vat goes into an infinite loop. However, to support all this will require significant additions to both the vat supervisor and the kernel, so I'm kicking that can down the road for now. For the record I've added issue Enable vats to execute more independently of the kernel #334 for this.
Although the current vat store implementation is ephemeral, it should still support the complete virtual object and virtual collection APIs (e.g., exoclasses, baggage, et al) that liveslots makes available to vat user code. All that stuff should Just Work, modulo the fact that the bits will evaporate with their enclosing iframe. However, while I have high confidence that all this will, in fact, Just Work, none of this functionality has actually yet been tested in our environment, and you know what they say about untested code. Issue Test virtual object functionality in our environment #335 has been added for this.
All of the new intervat computation stuff and its supporting machinery implemented here currently lacks unit tests. This is an important gap, but I'm putting this PR up now anyway in the interest of making it all available to the whole team and of avoiding too much drift from the main branch. Issue Add tests for liveslots integration #336 as been added for this.
There were a number of issues with the
@agoricpackages as published not playing well with TypeScript. These issues are on the Agoric side, having to do with how they build and publish this stuff to npm. The Agoric folks are aware and supportive, but we will probably have to be more proactive on our side with respect to getting that stuff fixed. Everybody's expectation is that this is just some tedious build and configuration stuff that needs to be sorted out, but it's possible that we will end up having to be the ones to do it via a PR to Agoric. In the meantime, I've worked around these issues via the expedient of incorporating the necessary type declaration files into our own source tree in thekernelpackage. These were created by extracting the relevant declaration files from the Agoric sources in theagoric-sdkrepo and then having Claude convert them from JSDoc to actual TypeScript (all these files have names beginning withag-if you want to look at them). Once the npm packaging issues are dealt with, these can go away. Having our own type declaration files is bad mainly due to the risk of divergence if there are changes on the Agoric side; I'm not too worried about this as these APIs have been stable for some time, but still... I've added issue Arrange to get Agoric NPM packages fixed and absorb the results #337 for this.Quite a lot of code has been added in a fairly short time, so there are certainly some code hygiene issues that should be addressed. I'm usually pretty obsessive about that kind of stuff, but, as with unit tests, in the interest of moving our process along I'm pushing now rather than taking additional time to make it all perfect and pretty first. In particular, I think there are some methods in
Kernel,VatHandle, andVatSupervisorthat should be private but aren't. Also, some additional love ought to be applied to the layout of the source files in terms of how things are ordered for the benefit of human readers. Also, more comments.While we're on the subject of code hygiene, this exercise has really driven home to me just how clunky, awkward, ugly, and needlessly verbose doing things in the JS class pattern is, in contrast to the objects-as-closures pattern. I think I'd like to raise this as an internal issue in our team at some point.
I'd give even odds that there are one or two other notable points that I should be mentioning here but which I've lost track of in all the excitement. When and if I think of them I'll add additional comments to this PR.
Closes #29