Historical Roots for Package Sets #27
Summary
This proposal outlines a feature to introduce named package sets, allowing developers to define a unique identity for a collection of atoms within a repository. This would solve identity conflicts in scenarios like repository forks and improve the flexibility of how atoms are organized and identified.
Problem Statement
Currently, an Ekala root's identity is determined solely by the revision of the oldest parentless commit directly related to an atom. While this is sufficient for most projects, it becomes problematic when a project's identity needs to change without rewriting its Git history.
Use Case: Repository Forks
A canonical example is forking a project. An atom's unambiguous cryptographic ID is derived from its root ID. If a project is forked and its history diverges significantly over time, the atoms within the fork will continue to share the same identity as those from the original repository. This is not ideal, as the fork represents a new, distinct project. Developers should have a straightforward way to establish a new identity for their package set.
Proposed Solution
We can solve this by introducing a name component to the root identity calculation. By combining a user-defined tag with the root commit hash, we can derive a new, unambiguous hash for the entire package set.
This approach offers several advantages:
- Disambiguation: A developer can fork a project, assign a new name to the set, and immediately differentiate all atoms within it from the original project.
- Flexibility: This opens the door to hosting multiple named sets in a single repository, which could be useful for large monorepositories that wish to distinguish different project segments unambiguously. This is out of scope for this proposal but worth mentioning.
- Improved Discoverability: We can introduce an ekala.toml file at the repository root to manage this configuration and index local atoms.
Example Details
1. ekala.toml Configuration
To manage the set name and local packages, we can introduce an ekala.toml file:
[set]
# A unique name for this package set.
tag = "my-projects-common-name"
# A list of local atoms to be indexed for fast resolution.
packages = [
"path/to/atom/a",
"path/to/atom/b"
]

This file would provide a static, easily parseable index of local atoms, avoiding the need for potentially expensive directory traversals. Commands like eka publish --all could use this list to ensure all declared atoms are published.
2. Git Ref Namespacing
To avoid name collisions, we can move the Git ref for atoms into a new namespace based on the set name:
- From: refs/eka/atoms/...
- To: refs/eka/$SET_NAME/atoms/...
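The mapping above can be sketched as a simple prefix rewrite. This is a minimal illustration only: the helper name namespaced_ref and the restrictions on set names (non-empty, no slashes or whitespace, not the literal "atoms") are assumptions, and the part of the ref after atoms/ is treated as opaque because the proposal leaves it unspecified.

```python
OLD_PREFIX = "refs/eka/atoms/"


def namespaced_ref(old_ref: str, set_name: str) -> str:
    """Move an atom ref into the namespace of a named set.

    Hypothetical helper; the set-name rules are assumptions for this sketch.
    """
    if not old_ref.startswith(OLD_PREFIX):
        raise ValueError(f"not an atom ref: {old_ref!r}")
    if not set_name or "/" in set_name or any(c.isspace() for c in set_name):
        raise ValueError(f"invalid set name: {set_name!r}")
    if set_name == "atoms":
        # Assumed restriction: refs/eka/atoms/atoms/... would be
        # indistinguishable from a ref in the old layout.
        raise ValueError("set name 'atoms' would clash with the old layout")

    # Everything after the old prefix is carried over unchanged.
    rest = old_ref[len(OLD_PREFIX):]
    return f"refs/eka/{set_name}/atoms/{rest}"


if __name__ == "__main__":
    # "example" is a placeholder for the unspecified ref suffix.
    print(namespaced_ref("refs/eka/atoms/example", "my-set"))
```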
3. Hashing Algorithm
The identity hashing algorithm will need to be updated. Two potential approaches are:
- Derived Set Hash: Use the root commit hash in a key derivation function (e.g., BLAKE3) over the tag to produce a final hash for the set. To avoid collisions with individual atom IDs, we would use a different hardcoded context string.
- Contextual Atom Hashing: Modify the existing atom ID hashing algorithm to accept the set name as part of the context string, which would alter the output for each unique set.
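To make the first option more concrete, here is a minimal sketch of a derived set hash. It deliberately swaps in keyed BLAKE2b from Python's standard hashlib as a stand-in for BLAKE3, because BLAKE3 is not in the standard library, so the resulting digests would not match a real BLAKE3-based implementation. Using the root commit hash as the key and the tag as the message is one possible reading of the proposal, and the context strings and the function name derive_id are hypothetical.

```python
import hashlib

# Hypothetical domain-separation labels. BLAKE2b's `person` parameter is
# limited to 16 bytes, so these are kept short.
SET_CONTEXT = b"ekala-set-v0"
ATOM_CONTEXT = b"ekala-atom-v0"


def derive_id(root_commit_hex: str, label: str, context: bytes) -> str:
    """Derive a 32-byte ID keyed by the root commit hash.

    Stand-in for a BLAKE3 key derivation: keyed BLAKE2b with a
    personalization string acting as the hardcoded context.
    """
    root = bytes.fromhex(root_commit_hex)
    if not root:
        raise ValueError("root commit hash must not be empty")
    digest = hashlib.blake2b(
        label.encode("utf-8"),  # the user-defined tag
        digest_size=32,
        key=root,               # 20 bytes for SHA-1, 32 for SHA-256
        person=context,         # separates set IDs from atom IDs
    )
    return digest.hexdigest()


if __name__ == "__main__":
    root = "a" * 40  # placeholder, not a real commit hash
    print(derive_id(root, "upstream", SET_CONTEXT))
    print(derive_id(root, "my-fork", SET_CONTEXT))
```

With the same root commit, two different tags yield different set IDs, which is the property a fork needs. Changing the context string changes the output as well, so a set ID would not collide with an ID derived from the same inputs under the atom context.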
Next Steps
Implementation of this feature is planned once the initial MVP is complete. This issue is intended to facilitate discussion and track the work for future development.