Skip to content

Simple module/import mechanism (backend only) #265

@brprice

Description

@brprice

Description

This is a feature request for a simple module/import system.
This only covers core plumbing, leaving exposing this to the frontend as future work.
In particular, we only give a primitive "import this explicit module" function; we don't yet have any way to look modules up by name.

Dependencies

None

Spec

We will make the backend aware of modules.
Currently a Prog is just a bunch of type/term definitions, which are mutable.
We change this so a Prog is a bunch of imported immutable modules and some local mutable modules (along with the other metadata, like the log etc).
A module is a bunch of type/term definitions, and each may or may not be exported.

Identifiers (of types, constructors and global terms) will become hierarchical: i.e. will be identified by a module name along with a normal name/id. The details of what a module name is happens to be fairly irrelevant to this proposal, but for concreteness, it will be a non-empty list of non-empty strings, i.e. basically a hierarchical module name as one would see in Haskell.
We allow different modules to define things with the same name as each other.

We will be able to import modules by copying (i.e. adding the code to the list of imported modules).
There will be a method that, given a module will copy its code into your project as a new imported module.
Assuming that the given module was type-correct in its original program, and that all its dependencies are already imported, this should always just work.
If not (and there is no reasonable way to enforce this), this may fail.
However, when this is exposed to the frontend, the only supported way (i.e. not poking the raw api) will go via importing a particular module from the primer DB, which will thus be type correct (but potentially have holes in).
The issue of dependencies is left for future work (whilst it is easy to enforce "has all its dependencies imported" by name, it is difficult to enforce "the previously-imported deps come from the same version of the program").

The only ways for an import to fail will be

  • not having imported the (transitive) dependencies: this is temporary, and in later work we will track dependencies explicitly
  • version skew -- a dependency has previously been imported, but it was an old, incompatible, version: this is also temporary, in later work we will track versions

(One failure mode we do not have, but will in future work is a race condition. The frontend may correctly say "modules M, M2 ... are available for import", but before the user asks to import M, the publisher changes their mind and deletes/unpublishes it)

Note that we will allow importing modules with holes in (for simplicity), even though it is a bit strange to do so: everything will work, but evaluation will get stuck if it runs into one of these holes, and there is nothing you can do to fill the hole in (imported modules are immutable). (See future work section.)

We will assume for now that all module names are unique (this will be easy to enforce: name the module after a username+project name etc).
This ensures that we will not need to do any name-mangling, so importing is exactly copying code.

We will also assume (falsly!) that there is only ever one version of a module: i.e. we will not worry about version skew.

Implementation details

Fresh id scoping

Some things are referred to by ID, and we have IDs at nodes in our ASTs.
Currently, these are all globally unique.
This means that if we blindly import a module by copying, the globally-defined term with a particular id may not be unique!
I suggest, to avoid having to re-write unique ids on import, we instead make all uniqueness local to a particular module.
Then, any reference we need to make will change from an ID to a module+ID.
(Edit: this was punted to #320)

Multiple local modules

To start with, it may be easier for there to be just one local module, so we don't need to change too much of the action code.
(Edit: indeed, this was punted to #321)

Module dependencies

Should modules list their dependencies explicitly?
Are cyclic module dependencies allowed? (We will allow them in this first pass)
The first is a convenience/optimisation issue: we could instead just walk all the code and look at what external things it refers to.
The second is a free choice.
Many current programming languages do require non-circularity (or at least, need circular deps to be treated specially), this is mostly for separate-compilation reasons, or for module-import/initialisation-has-side-effects reasons.

Is the typechecker aware of modules?

It is not yet clear to me how aware the TC should be of modules.
This will hopefully become clearer when implementing this.
One benefit of having imported modules be immutable is that it means we never have to TC them (other than on import, as a sanity check).

Versioning: as we are not implementing versioning, we can either

  • only allow import of the stdlib, and commit not to changing it (i.e. changes to stdlib need DB migrations), then imports should always be successful
  • allow any import, but accept that imports may fail because of versioning issues.

Not in spec

Parameterised modules

There is no way to say the equivalent of Agda's module Set (A : Type)(<= : A -> A -> Bool) where ...

First-class modules

There is no notion of "module" as a first class thing working inside Primer.
E.g. there is no way to have a function that takes or returns a module.

Flexible exports

It may be interesting to explore more flexible notions of exporting than "yes" and "no" (and "type, but not its constructors").
We will probably want "export, but don't show in evaluation" (although this would also make sense as an import option).
It may also be nice to have something similar to Haskell's hidden modules: one could say that "Foo.Bar.Internal.quux can be called from any Foo.Bar, but not external to that".

Export lists and evaluation

If a term is not exported, one probably shouldn't be able to see it in a middle-of-evaluation step.
This is not much different to having a "step over" rather than "step into" action in our interactive evaluator.

Explicit per-module import lists

If we have multiple local modules, we may want to have separate imports for each of them for organisational reasons.
For this FR, we will say that every imported module is in scope for every local module.
We will also say every local module is in scope for every local module also.
When editing and, say trying to refer to a global term, it would be nice to only see names from relevent modules.
This may be better suited to be stored elsewhere though.

Store references to imported modules

In the stored programs in the DB, it would be good to simply store a reference to the imported modules, rather than copying the code.
This will require some DB work to achieve, and is somewhat tied up with versioning and publishing.

Enforce unique names of modules

It is not particularly clear how this should work, and will have to wait until we have user accounts etc.

Publishing

Since the only exposed method is "import this explicit bunch of type/term defs", there is no way to mark your own code as available for importing.
This will come later, once we have proper accounts etc.
We will want some way to "unpublish" work (either the original publisher changes their mind/realises they published a secret, or an authority (e.g. a teacher) decides it is inappropriate). However, this is a tricky issue: what to do if people have already imported that code? There are also now race conditions where a module can be revoked between asking "what published modules are there" and "please import this module"

Hooking this up the frontend; automatically importing transitive dependencies

This is rather dependent on the DB work to store modules nicely in the DB, and the publishing work so one can see what modules are available to import.
In the future, we need a way for the user to actually interact with this stuff nicely.
They should be able to say "import stdlib.fractions" without needing to worry that this requires "stdlib.integer" and "stdlib.euclidean-algorithm" also.

Versioning

Consider the following timeline, where module B depends on module A, and are both owned by Owen.
Owen has published A & B.
Alice imports A;
Owen edits A and B producing A' and B' in such a way that B' depends on the new additions to A';
Alice tries to import B', but this fails as it needs A' rather than A.
This needs some sort of versioning to solve, and we do not try to tackle this here.
This change will need some database work, and potentially should be bundled with some change to allow us to store references to imported modules (at least in the db, perhaps not in memory) rather than explicit copying.
(When we tackle this, we should consider whether a project should be allowed to transitively depend on two different versions of the same dependency, as long as they do not need to interact.)
(Another thing we need to consider: if everything is explicitly versioned and dependencies are pinned to a particular version we will have a small bugfix deep in a dependency tree will force a ripple effect downstream of releasing a new version simply to bump dependency versions. This is poor UX! However, let's also bear in mind our expected use cases -- in an education setting we probably won't have large dependency trees etc)

Updating

(Since we do not support versioning, we cannot sensibly support updating either.)
We would like to be able to follow upstream updates: if we imported a module which has since seen some development, it would be good to be able to update to the new version.
This may want some automatic PVP-style versioning so we can say "this update will be type-correct (but may break some invariants you rely on...".
Some diff tool would also be good.

Forking

We would like to allow a user to import a module A and edit it, getting A' (perhaps adding some new utility that needs access to the internals).
They should then be able to import B which depends on A, but want it to "really depend on A'".
Some tools to "push changes upstream" and "merge upstream changes" would also be nice here.

For now, we just don't support forking.

Future work

See "Not in spec"

We should revisit the decision to allow imports of modules with holes in. Probably the best way to outlaw that is in the "publish" workflow. Note that we should allow multiple local modules to depend on each other even if they have holes -- i.e. one can develop multiple related modules at the same time.

Metadata

Metadata

Assignees

Labels

enhancementNew feature or request

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions