Implement Content Addressable Storage (CAS) System #4

@Kinflou

Priority

P0 (Critical) - Required for versioning and state management

Labels

  • core
  • cas
  • versioning
  • P0

Estimated Effort

3 weeks

Description

Implement a Content Addressable Storage (CAS) system inspired by Git's object model. It is required for schema versioning, diffing, and state.lock file generation.

The CAS system will enable:

  • Immutable schema versioning
  • Efficient storage of schema history
  • Fast diff computation
  • Reproducible builds

Current State

Location:

  • core/src/package/build/cas/
  • core/src/schema/ir/frozen/cas/

A skeleton exists, consisting mostly of todo!() placeholders. More than 20 TODO items have been identified.

TODO References

  • package/build/cas/mod.rs:27
  • package/build/cas/package.rs:56
  • package/build/cas/schema.rs:15
  • schema/ir/frozen/cas/tree.rs:14,19
  • schema/ir/frozen/cas/meta.rs:30

Inspiration

Git Internals - Git Objects

The CAS system should mirror Git's approach:

  • Blobs: Store individual frozen units
  • Trees: Organize schema structure
  • Commits: Track versions with metadata
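
As a rough illustration of the blob/tree/commit model above, here is a minimal in-memory sketch. Everything in it is an assumption made for illustration: the names `ObjectId`, `Object`, `Store`, and `encode` are hypothetical, the byte encoding is invented, and a 64-bit FNV-1a hash stands in for blake3 so the example runs with the standard library alone. It is not collision resistant and should not be used as the real hash.

```rust
use std::collections::HashMap;

/// Hypothetical object id. A real implementation would likely hold a
/// 32-byte blake3 digest; this sketch uses a 64-bit stand-in hash.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct ObjectId(u64);

/// Stand-in hash (FNV-1a, 64-bit). NOT blake3 and not collision resistant.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in bytes {
        h ^= *b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// The three Git-like object kinds named in this issue.
#[derive(Debug, Clone, PartialEq)]
pub enum Object {
    /// Raw bytes of one frozen unit.
    Blob(Vec<u8>),
    /// Named references to other objects.
    Tree(Vec<(String, ObjectId)>),
    /// A version: root tree, optional parent, and metadata.
    Commit {
        tree: ObjectId,
        parent: Option<ObjectId>,
        author: String,
        timestamp: u64,
        message: String,
    },
}

impl Object {
    /// Invented canonical encoding: a type tag followed by the fields.
    /// Tree entries are sorted so entry order does not change the id.
    fn encode(&self) -> Vec<u8> {
        let mut out = Vec::new();
        match self {
            Object::Blob(data) => {
                out.extend_from_slice(b"blob\0");
                out.extend_from_slice(data);
            }
            Object::Tree(entries) => {
                out.extend_from_slice(b"tree\0");
                let mut sorted: Vec<&(String, ObjectId)> = entries.iter().collect();
                sorted.sort();
                for (name, id) in sorted {
                    out.extend_from_slice(name.as_bytes());
                    out.push(0);
                    out.extend_from_slice(&id.0.to_le_bytes());
                }
            }
            Object::Commit { tree, parent, author, timestamp, message } => {
                out.extend_from_slice(b"commit\0");
                out.extend_from_slice(&tree.0.to_le_bytes());
                match parent {
                    Some(p) => {
                        out.push(1);
                        out.extend_from_slice(&p.0.to_le_bytes());
                    }
                    None => out.push(0),
                }
                out.extend_from_slice(author.as_bytes());
                out.push(0);
                out.extend_from_slice(&timestamp.to_le_bytes());
                out.extend_from_slice(message.as_bytes());
            }
        }
        out
    }
}

/// In-memory store keyed by content hash.
#[derive(Default)]
pub struct Store {
    objects: HashMap<ObjectId, Object>,
}

impl Store {
    /// Stores an object under the hash of its encoding; identical content
    /// always maps to the same id, so duplicates are stored once.
    pub fn put(&mut self, obj: Object) -> ObjectId {
        let id = ObjectId(fnv1a(&obj.encode()));
        self.objects.entry(id).or_insert(obj);
        id
    }

    pub fn get(&self, id: ObjectId) -> Option<&Object> {
        self.objects.get(&id)
    }

    pub fn len(&self) -> usize {
        self.objects.len()
    }
}
```

Because the id is derived from the content, storing the same blob twice yields the same `ObjectId` and a single stored object. That property is what makes versions immutable and history cheap to store.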

Acceptance Criteria

  • Blob storage implemented with blake3 hashing
  • Tree structure for organizing schema elements
  • Commit mechanism for versioning
  • state.lock file generation working
  • CAS can store and retrieve FrozenUnits
  • Compression using lz4_flex for storage efficiency
  • Documentation explaining the CAS architecture
  • Performance benchmarks show lookups completing in under 10 ms
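
For the blob storage criterion, one possible on-disk layout borrows Git's loose-object convention of using the first two hex characters of the digest as a directory name. The sketch below is only an assumption about how this project might lay out files: the function name `object_path` and the `objects` directory name are hypothetical, and the digest is taken as an already hex-encoded string.

```rust
use std::path::{Path, PathBuf};

/// Maps a hex digest to `<root>/objects/<first 2 chars>/<rest>`.
/// Returns None for input that is too short or not hexadecimal.
/// The `objects` directory name is an assumption, not taken from the codebase.
pub fn object_path(root: &Path, hex_digest: &str) -> Option<PathBuf> {
    if hex_digest.len() < 3 || !hex_digest.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // Safe to split by byte index: the input is ASCII-only at this point.
    let (dir, file) = hex_digest.split_at(2);
    Some(root.join("objects").join(dir).join(file))
}
```

Fanning out by prefix keeps any single directory from holding every object, which helps lookups stay fast on filesystems that slow down with very large directories.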

Tasks

  • Design CAS data structures (blob, tree, commit)
  • Implement blake3-based content hashing
  • Create storage backend (filesystem initially, could be abstracted later)
  • Add FrozenUnit serialization to CAS (MessagePack format)
  • Implement tree building from schema structures
  • Add commit creation with metadata (author, timestamp, message)
  • Create state.lock file writer with version info
  • Add CAS loader for reading frozen schemas
  • Implement garbage collection for unreferenced objects
  • Write comprehensive tests (unit + integration)
  • Document CAS architecture and usage
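
The garbage collection task can be approached as a mark-and-sweep over the object graph: start from the commits that are still referenced (the roots), mark everything reachable through trees down to blobs, and delete the rest. The sketch below assumes a simplified model in which each object is just an id with a list of the ids it references. The names `reachable` and `collect_garbage` and the `u64` id type are hypothetical.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical object id; a real store would use the content hash.
type Id = u64;

/// Mark phase: returns every stored id reachable from `roots`.
/// `refs` maps each stored object to the objects it points at.
pub fn reachable(refs: &HashMap<Id, Vec<Id>>, roots: &[Id]) -> HashSet<Id> {
    let mut seen = HashSet::new();
    let mut stack: Vec<Id> = roots.to_vec();
    while let Some(id) = stack.pop() {
        // Skip ids missing from the store and ids already visited.
        if !refs.contains_key(&id) || !seen.insert(id) {
            continue;
        }
        stack.extend(refs[&id].iter().copied());
    }
    seen
}

/// Sweep phase: removes unreachable objects and returns how many were dropped.
pub fn collect_garbage(refs: &mut HashMap<Id, Vec<Id>>, roots: &[Id]) -> usize {
    let live = reachable(refs, roots);
    let before = refs.len();
    refs.retain(|id, _| live.contains(id));
    before - refs.len()
}
```

An iterative traversal with an explicit stack avoids recursion depth problems on long commit chains, and the visited set keeps shared subtrees from being walked twice.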

Dependencies

Blocks

Metadata

  • Assignees: No one assigned
  • Labels: P0 (Priority 0 - Critical), cas (Content Addressable Storage), core (Core system component)
  • Status: Todo