refactor: slug-based identity for Convention and Ontology #129

@rorybyrne

Background

In #76 we moved Schema from server-generated UUID ids to user-supplied slugs — pdb-structure@1.0.0 on the wire instead of 4d09002b-0572-4fb0-b@1.0.0. That pattern was a clear win for readability and for client-side use (e.g., the Pockets frontend pinning schema on discovery queries).

Two remaining identity asymmetries are worth closing for the same reasons:

1. Convention still uses UUIDs

Today ConventionService.create_convention generates LocalId(str(uuid4())[:20]). The resulting ConventionSRN is opaque — urn:osa:localhost:conv:4d09002b-0572-4fb0-b@1.0.0. Convention is a declarative, user-named resource (it has a title, it's bundled with a specific schema + hooks), so there's no reason its identifier shouldn't be slug-shaped like Schema's.

This also fixes a real diagnostic pain: if two conventions are registered against the same schema (e.g. rcsb-pdb-full and rcsb-pdb-cryoem-only), today you cannot tell them apart by ID — they're two truncated UUIDs. With slugs, they're distinguishable at a glance.
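
To make the readability difference concrete, here is a minimal sketch that formats both kinds of identifier into the URN shape quoted above. `convention_urn` is a hypothetical helper written for this illustration, not a function from the codebase; only the `urn:osa:<node>:conv:<id>@<version>` layout and the `str(uuid4())[:20]` truncation are taken from this issue.

```python
from uuid import uuid4


def convention_urn(node: str, local_id: str, version: str) -> str:
    """Hypothetical helper: formats the URN shape quoted in this issue."""
    return f"urn:osa:{node}:conv:{local_id}@{version}"


# Current behaviour as described above: a UUID string truncated to 20 characters.
opaque_id = str(uuid4())[:20]
opaque_urn = convention_urn("localhost", opaque_id, "1.0.0")

# Proposed behaviour: the caller supplies a readable slug.
full_urn = convention_urn("localhost", "rcsb-pdb-full", "1.0.0")
cryoem_urn = convention_urn("localhost", "rcsb-pdb-cryoem-only", "1.0.0")

print(opaque_urn)   # random on every run, says nothing about the convention
print(full_urn)
print(cryoem_urn)
```

The two slug-based URNs differ in a way a reader can interpret from a log line, whereas two truncated UUIDs only differ in random hex digits.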

2. Ontology stores srn: OntologySRN internally (inconsistent with Schema)

Schema was refactored in #76 to use id: SchemaId (short (id, version) form) internally, with SchemaSRN reserved for federation edges. Ontology still stores srn: OntologySRN as its identity. Same kind of resource (declarative, user-named, low-cardinality), different internal representation.

No new user-facing change needed here — ontologies already have meaningful slug-shaped IDs (ncbi-taxonomy, uberon). This is purely bringing the code to the same shape.

Out of scope

Record and Deposition should keep UUIDs. They're high-volume instance resources without natural keys at creation time. Domain-level identifiers (PDB ID, accession, DOI) belong in metadata, not in identity — and are already queryable via the discovery DSL. Coupling record identity to a mutable metadata field is the wrong direction.

Proposed work

Convention slug

  • Add ConventionIdentifier type alongside SchemaIdentifier in osa/domain/shared/model/srn.py (same regex: ^[a-z][a-z0-9-]{2,63}$).
  • The CreateConvention command and route accept a required id: ConventionIdentifier.
  • ConventionService.create_convention uses the supplied slug instead of str(uuid4())[:20].
  • SDK: add __convention_id__ class attribute requirement on convention() declarations, parallel to __schema_id__ on MetadataSchema. Update rcsb-pdb and pockets/backend definitions.
  • Tests: slug validation, duplicate (id, version) raises ConflictError(code="convention_already_exists").
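
The validation and conflict behaviour listed above could look roughly like the following. This is a sketch under assumptions, not the actual service: `validate_convention_identifier`, `InMemoryConventionRegistry`, and this stand-in `ConflictError` class are hypothetical names for illustration. Only the regex and the `convention_already_exists` code come from this issue.

```python
import re

# Regex taken from the proposal: a lowercase letter followed by 2-63
# lowercase letters, digits, or hyphens (3-64 characters in total).
SLUG_PATTERN = re.compile(r"^[a-z][a-z0-9-]{2,63}$")


class ConflictError(Exception):
    """Stand-in for the project's ConflictError (assumed shape)."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.code = code


def validate_convention_identifier(value: str) -> str:
    """Return the slug unchanged if it matches the pattern, else raise ValueError."""
    if not SLUG_PATTERN.fullmatch(value):
        raise ValueError(f"invalid convention identifier: {value!r}")
    return value


class InMemoryConventionRegistry:
    """Hypothetical in-memory stand-in for ConventionService plus its repository."""

    def __init__(self) -> None:
        self._conventions: dict[tuple[str, str], str] = {}

    def create(self, id: str, version: str, title: str) -> tuple[str, str]:
        key = (validate_convention_identifier(id), version)
        if key in self._conventions:
            # Same (id, version) registered twice is a conflict.
            raise ConflictError(code="convention_already_exists")
        self._conventions[key] = title
        return key


registry = InMemoryConventionRegistry()
registry.create("rcsb-pdb-full", "1.0.0", "RCSB PDB (full)")
registry.create("rcsb-pdb-cryoem-only", "1.0.0", "RCSB PDB (cryo-EM only)")
```

Registering `rcsb-pdb-full` at `1.0.0` a second time would raise `ConflictError`, while a new version of the same slug would be accepted because the key is the `(id, version)` pair.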

Ontology identity flip

  • Introduce OntologyId = (LocalId, Semver) analogous to SchemaId.
  • Ontology aggregate: replace srn: OntologySRN with id: OntologyId.
  • OntologyRepository port + Postgres impl updated.
  • Ontology services + CLI import flow threaded through.
  • OntologySRN retained for federation-edge boundaries only (import/export).
  • Command/query wire fields updated (short-form id instead of full URN where applicable).
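
As a rough sketch of the split between internal identity and federation-edge URNs, the short `(id, version)` form can be modelled as a small value object that is converted to and from a full URN only at the boundary. Everything here is assumed for illustration: the field names, the `to_srn` and `from_srn` helpers, and in particular the `onto` type segment, which this issue does not specify (only `conv` appears above). The version `1.0.0` is also just a placeholder.

```python
from dataclasses import dataclass

# ASSUMPTION: the URN type segment for ontologies is not given in this issue.
ONTOLOGY_TYPE_SEGMENT = "onto"


@dataclass(frozen=True)
class OntologyId:
    """Short-form identity: a local slug plus a semantic version string."""

    id: str
    version: str

    def __str__(self) -> str:
        return f"{self.id}@{self.version}"

    @classmethod
    def parse(cls, text: str) -> "OntologyId":
        local_id, sep, version = text.partition("@")
        if not sep or not local_id or not version:
            raise ValueError(f"expected '<id>@<version>', got {text!r}")
        return cls(id=local_id, version=version)


def to_srn(ontology_id: OntologyId, node: str) -> str:
    """Build the full URN, used only at import/export boundaries."""
    return f"urn:osa:{node}:{ONTOLOGY_TYPE_SEGMENT}:{ontology_id}"


def from_srn(srn: str) -> tuple[str, OntologyId]:
    """Split a full URN back into (node, OntologyId)."""
    parts = srn.split(":", 4)
    if len(parts) != 5 or parts[:2] != ["urn", "osa"] or parts[3] != ONTOLOGY_TYPE_SEGMENT:
        raise ValueError(f"not an ontology URN: {srn!r}")
    return parts[2], OntologyId.parse(parts[4])


uberon = OntologyId(id="uberon", version="1.0.0")
srn = to_srn(uberon, node="localhost")
node, round_tripped = from_srn(srn)
```

Keeping the node out of the internal identity means the aggregate, repository, and wire fields only carry `uberon@1.0.0`, and the node is attached when a record crosses a federation edge.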

Expected size: ~400–500 lines of code + tests across the two parts. Follows the pattern already established in #76 so the shape is known.

Notes

  • Greenfield deployments need no migration: they get slug-based identity from day one.
  • If existing UUID-based conventions ever need migrating, a separate CLI backfill is the right shape (same reasoning as the decision in #76, "feat: typed metadata tables + expressive REST query DSL", on the JSONB→typed backfill).


Labels: refactor (Internal restructuring, no behavior change)
