Automatically restart vats on kernel restart #437

@FUDCo

Although, as of PR #436, we have persistent state in vats, right now it doesn't do a lot of good because when the kernel is restarted it does not reload the vats as active processes, so the state just sits there on disk -- there's nobody home to, for example, receive messages sent to objects inside those vats.

The kernel needs to keep a persistent record of which vats have been started (and not yet terminated), and then use this record on startup to reinitialize those vats as run-time entities.
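To make the "persistent record" idea concrete, here is a minimal sketch of what such a record could look like. Everything named here is an assumption for illustration: `KVStore`, `MemoryStore`, the `kernel.activeVats` key, and the three helper functions are hypothetical and not part of the existing kernel code.

```typescript
// Hypothetical stand-in for the kernel's persistent key-value store.
interface KVStore {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

// In-memory implementation for illustration only; a real store would live on disk.
class MemoryStore implements KVStore {
  private readonly data = new Map<string, string>();

  get(key: string): string | undefined {
    return this.data.get(key);
  }

  set(key: string, value: string): void {
    this.data.set(key, value);
  }
}

// Assumed key under which the list of started-but-not-terminated vats is kept.
const ACTIVE_VATS_KEY = 'kernel.activeVats';

// Returns the recorded vat IDs, or an empty list if nothing has been recorded yet.
function readActiveVats(store: KVStore): string[] {
  const raw = store.get(ACTIVE_VATS_KEY);
  return raw === undefined ? [] : (JSON.parse(raw) as string[]);
}

// Adds a vat ID to the record; recording the same ID twice is a no-op.
function recordVatStarted(store: KVStore, vatId: string): void {
  const vats = readActiveVats(store);
  if (!vats.includes(vatId)) {
    vats.push(vatId);
    store.set(ACTIVE_VATS_KEY, JSON.stringify(vats));
  }
}

// Removes a vat ID from the record.
function recordVatTerminated(store: KVStore, vatId: string): void {
  const vats = readActiveVats(store).filter((id) => id !== vatId);
  store.set(ACTIVE_VATS_KEY, JSON.stringify(vats));
}
```

On startup the kernel would call something like `readActiveVats` and bring each listed vat back to life. Storing the list as a single JSON value is just the simplest thing that works for a sketch; a real implementation might prefer one key per vat.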

Note for the future: In the fullness of time we might want to make this restart lazy and only reload a vat when it's time to deliver a message to that vat, though that's a complication we almost certainly don't need immediately. If we do this we might even want to make the choice of whether to do eager vs. lazy vat reload a per-vat configuration option. We might even want to eventually allow inactive vats to be dropped out of memory entirely and then restarted on demand, but that's definitely long term.
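For the record, the eager-vs-lazy choice could be sketched roughly as follows. This is purely illustrative: `ReloadPolicy`, `VatRecord`, and `VatReloader` are hypothetical names, and the `runVat` callback stands in for whatever actually brings a vat back up.

```typescript
// Hypothetical per-vat configuration option.
type ReloadPolicy = 'eager' | 'lazy';

interface VatRecord {
  vatId: string;
  reload: ReloadPolicy;
}

class VatReloader {
  private readonly records: Map<string, VatRecord>;
  private readonly runVat: (vatId: string) => void;
  private readonly running = new Set<string>();

  constructor(records: VatRecord[], runVat: (vatId: string) => void) {
    this.records = new Map(
      records.map((record): [string, VatRecord] => [record.vatId, record]),
    );
    this.runVat = runVat;
  }

  // On kernel restart, only vats marked 'eager' are reloaded right away.
  onKernelStart(): void {
    for (const record of this.records.values()) {
      if (record.reload === 'eager') {
        this.ensureRunning(record.vatId);
      }
    }
  }

  // Called before delivering a message; lazily reloads the target if needed.
  beforeDelivery(vatId: string): void {
    this.ensureRunning(vatId);
  }

  private ensureRunning(vatId: string): void {
    if (!this.records.has(vatId)) {
      throw new Error(`unknown vat ${vatId}`);
    }
    if (this.running.has(vatId)) {
      return;
    }
    this.runVat(vatId);
    this.running.add(vatId);
  }
}
```

Dropping inactive vats out of memory would amount to removing them from the `running` set (and tearing down their iframes), after which the next delivery would reload them on demand.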

Our current model of vat startup/termination/restart is a little bit muddled, partially due to a little lack of clarity about what we really want but mainly, I think, due to our (up until now) inability to actually reconstitute a previously started vat. What we currently have is something of a compromise, driven mainly by testing needs. Right now:

  • startVat initializes a vat from scratch (with the vat ID provided by its caller): it launches the vat in a new iframe (with a new channel to it), initializes all relevant vat-specific persistent kernel state, and adds the vat to the kernel’s (ephemeral) table of extant vats
  • terminateVat kills the vat’s iframe & the channel to it, and removes it from the kernel’s (ephemeral) table of extant vats. It rejects any promises that the vat is currently the decider for but it does not scrub any vat-specific state from the kernel store
  • terminateAllVats is essentially terminateVat applied to all the vats that the kernel (ephemerally) knows about
  • restartVat is essentially a call to terminateVat followed by a call to startVat, essentially creating a whole new vat with the same vat ID as the vat that was terminated, thus inheriting any vat-specific persistent kernel state and likely resulting in comedy and mayhem if one were to try to do anything serious with the terminated vat’s exported or imported objects
  • launchSubcluster starts a set of vats based on a config descriptor, allocating new vat IDs that it passes to startVat to get the vats themselves going (and optionally sends the bootstrap message to a distinguished member of the set, though I don’t think that’s important for the current discussion)
  • reload consists of terminateAllVats followed by a launchSubcluster with the config of the most recently launched subcluster; it is a testing hack that really isn’t otherwise a useful operation
  • reset consists of terminateAllVats followed by restoring the kernel store to its initial state, but doesn’t start any vats running; it is, as best I can tell, also primarily a testing hack

At startup, the kernel executes launchSubcluster with a default cluster configuration that’s hard-wired in for testing purposes; in the fullness of time I expect this to either be removed or to have the default cluster configuration become some baseline set of vats that we want all users to have running by default (not sure what this would be, but plausibly useful and not entirely crazy on its face).

I propose rearranging the above to:

  • launchSubcluster stays mostly unchanged, except that it is no longer responsible for generating vat IDs for the vats it creates
  • startVat continues to initialize a vat from scratch, but is broken into two parts: startVat proper, which generates a new vatID, initializes both kernel and vatstore persistent state for the new vat, and then invokes the second part, runVat. It adds the new vat to the kernel's now persistent vat table.
  • runVat is given the vat ID as a parameter (in the way startVat formerly was); it creates a new iframe and associated channel and loads the vat user code into it, but adopts whatever persistent state is already present (in the case of a new vat, this will have been set up by startVat; in the case of a vat being resumed on kernel restart, it will just be present in the database)
  • terminateVat continues to do the things it did before, but in addition also removes any persistent storage pertinent to the vat; after terminateVat is called, all traces of the vat are gone (alternatively, the persistent vat table entry might be retained with an annotation that the vat has been terminated; this would leave a historical trace of the vat's once-upon-a-time existence, which might be useful for debugging and diagnostic purposes, though it would result in a (small) on-disk storage leak)
  • terminateAllVats does what it always did
  • reset becomes a genuine "reinitialize everything from scratch" operation, analogous to "reformat my system disk and reinstall the OS"
  • restartVat and reload go away; they are no longer things
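The proposed division of labor might look roughly like the sketch below. It is not the actual kernel API: `KernelSketch`, `VatHandle`, `VatConfig`, `Launcher`, the store key names, and the `v1`, `v2`, ... ID format are all assumptions, and a plain `Map` stands in for the persistent kernel store. The `launch` callback stands in for creating the iframe and channel and loading the vat's user code.

```typescript
// Hypothetical handle for a running vat; stop() stands in for killing the iframe and channel.
interface VatHandle {
  stop(): void;
}

// Assumed shape of a vat's configuration.
type VatConfig = { bundle: string };

// Stand-in for "create a new iframe and channel and load the vat user code into it".
type Launcher = (vatId: string, config: VatConfig) => VatHandle;

const VAT_TABLE_KEY = 'vatTable'; // assumed key for the persistent vat table
const NEXT_VAT_ID_KEY = 'nextVatId'; // assumed key for the vat ID counter

class KernelSketch {
  private readonly store: Map<string, string>; // stand-in for the persistent kernel store
  private readonly launch: Launcher;
  private readonly live = new Map<string, VatHandle>(); // ephemeral run-time entities

  constructor(store: Map<string, string>, launch: Launcher) {
    this.store = store;
    this.launch = launch;
  }

  // Reads the persistent vat table.
  vatTable(): string[] {
    return JSON.parse(this.store.get(VAT_TABLE_KEY) ?? '[]') as string[];
  }

  // Generates a new vat ID, initializes persistent state, records the vat, then runs it.
  startVat(config: VatConfig): string {
    const next = Number(this.store.get(NEXT_VAT_ID_KEY) ?? '1');
    this.store.set(NEXT_VAT_ID_KEY, String(next + 1));
    const vatId = `v${next}`;
    this.store.set(`vat.${vatId}.config`, JSON.stringify(config));
    this.store.set(VAT_TABLE_KEY, JSON.stringify([...this.vatTable(), vatId]));
    this.runVat(vatId);
    return vatId;
  }

  // Launches the vat and adopts whatever persistent state is already present.
  runVat(vatId: string): void {
    const raw = this.store.get(`vat.${vatId}.config`);
    if (raw === undefined) {
      throw new Error(`no persistent state for vat ${vatId}`);
    }
    this.live.set(vatId, this.launch(vatId, JSON.parse(raw) as VatConfig));
  }

  // Stops the vat and removes all persistent traces of it.
  terminateVat(vatId: string): void {
    this.live.get(vatId)?.stop();
    this.live.delete(vatId);
    this.store.delete(`vat.${vatId}.config`);
    const remaining = this.vatTable().filter((id) => id !== vatId);
    this.store.set(VAT_TABLE_KEY, JSON.stringify(remaining));
  }

  terminateAllVats(): void {
    for (const vatId of this.vatTable()) {
      this.terminateVat(vatId);
    }
  }

  // "Reformat the disk": terminate everything and wipe the store.
  reset(): void {
    this.terminateAllVats();
    this.store.clear();
  }

  liveVatIds(): string[] {
    return [...this.live.keys()];
  }
}
```

In this sketch launchSubcluster would simply call `startVat` once per configured vat and use the returned IDs, rather than allocating IDs itself. The "retain a terminated entry for diagnostics" alternative would replace the table filtering in `terminateVat` with an annotation on the entry.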

At startup, the kernel walks through its persistent vat table and invokes runVat on all the vats that are there. The default subcluster has an associated flag that gets set when the subcluster is started, and kernel startup only launches the subcluster if this flag is clear.
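As a rough sketch of that startup sequence (with hypothetical names throughout: `kernelStartup`, `StartupHooks`, and both store keys are assumptions, and a `Map` again stands in for the persistent store):

```typescript
const PERSISTENT_VAT_TABLE_KEY = 'vatTable'; // assumed key for the persistent vat table
const DEFAULT_SUBCLUSTER_FLAG_KEY = 'defaultSubclusterLaunched'; // assumed flag key

// Hypothetical hooks into the rest of the kernel.
interface StartupHooks {
  runVat(vatId: string): void;
  launchDefaultSubcluster(): void;
}

function kernelStartup(store: Map<string, string>, hooks: StartupHooks): void {
  // Resume every vat that was started and not yet terminated.
  const vatIds = JSON.parse(
    store.get(PERSISTENT_VAT_TABLE_KEY) ?? '[]',
  ) as string[];
  for (const vatId of vatIds) {
    hooks.runVat(vatId);
  }

  // Launch the default subcluster only if the flag is still clear.
  if (store.get(DEFAULT_SUBCLUSTER_FLAG_KEY) !== 'true') {
    hooks.launchDefaultSubcluster();
    store.set(DEFAULT_SUBCLUSTER_FLAG_KEY, 'true');
  }
}
```

Because the flag lives in the same store as everything else, a reset that wipes the store also clears the flag, so the default subcluster would be launched again on the next startup.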

Labels: enhancement (New feature or request)