Project Structure

John R. D'Orazio edited this page Mar 10, 2026 · 4 revisions

This page describes the organization of the OntoKit API codebase.

Directory Layout

ontokit-api/
├── ontokit/                     # Application source code
│   ├── __init__.py
│   ├── main.py                  # FastAPI application entry point
│   ├── runner.py                # CLI entry point (`ontokit` command)
│   ├── version.py               # Version management (Weblate-style)
│   ├── worker.py                # ARQ background job queue
│   ├── api/                     # API routes
│   │   ├── utils/               # Shared API utilities (Redis pool)
│   │   └── routes/              # REST routers
│   │       ├── analytics.py     # Project activity and contributor stats
│   │       ├── auth.py          # Authentication endpoints
│   │       ├── classes.py       # OWL class operations
│   │       ├── embeddings.py    # Embedding config and generation
│   │       ├── join_requests.py # Project join request workflow
│   │       ├── lint.py          # Ontology linting
│   │       ├── normalization.py # Ontology normalization
│   │       ├── notifications.py # User notifications
│   │       ├── ontologies.py    # Ontology CRUD
│   │       ├── projects.py      # Project management
│   │       ├── properties.py    # OWL property operations
│   │       ├── pull_requests.py # Pull request workflow
│   │       ├── quality.py       # Consistency checks, duplicates, cross-refs
│   │       ├── search.py        # Full-text search and SPARQL
│   │       ├── semantic_search.py # Vector similarity search
│   │       ├── suggestions.py   # Suggestion session workflow
│   │       ├── upstream_sync.py # GitHub upstream synchronization
│   │       └── user_settings.py # User settings and GitHub token management
│   ├── core/                    # Core functionality
│   │   ├── __init__.py
│   │   ├── auth.py              # JWT validation, JWKS caching
│   │   ├── beacon_token.py      # Beacon token minting/verification
│   │   ├── config.py            # Settings (Pydantic)
│   │   ├── database.py          # Database connection
│   │   ├── encryption.py        # Fernet encryption for stored tokens
│   │   ├── exceptions.py        # Custom exception classes
│   │   └── middleware.py        # Request/response middleware
│   ├── models/                  # SQLAlchemy models
│   │   ├── __init__.py          # Model exports
│   │   ├── branch_metadata.py   # Git branch metadata
│   │   ├── change_event.py      # Entity change tracking
│   │   ├── embedding.py         # Embeddings, jobs, config
│   │   ├── join_request.py      # Project join requests
│   │   ├── lint.py              # Lint runs and issues
│   │   ├── normalization.py     # Normalization runs
│   │   ├── notification.py      # User notifications
│   │   ├── project.py           # Project, ProjectMember, GitHubIntegration
│   │   ├── pull_request.py      # PRs, comments, reviews
│   │   ├── suggestion_session.py # Suggestion sessions
│   │   ├── upstream_sync.py     # Sync config and events
│   │   └── user_github_token.py # Encrypted GitHub PATs
│   ├── schemas/                 # Pydantic v2 request/response schemas
│   ├── services/                # Business logic
│   │   ├── change_event_service.py    # Change tracking and analytics
│   │   ├── consistency_service.py     # Ontology consistency checks
│   │   ├── cross_reference_service.py # Cross-reference analysis
│   │   ├── duplicate_detection_service.py # Duplicate entity detection
│   │   ├── embedding_service.py       # Embedding generation and search
│   │   ├── embedding_text_builder.py  # Text extraction for embeddings
│   │   ├── embedding_providers/       # Pluggable embedding backends
│   │   │   ├── base.py               # Provider interface
│   │   │   ├── local_provider.py      # sentence-transformers (local)
│   │   │   ├── openai_provider.py     # OpenAI embeddings API
│   │   │   └── voyage_provider.py     # Voyage AI embeddings API
│   │   ├── github_service.py          # GitHub App integration
│   │   ├── github_sync.py            # GitHub repository sync
│   │   ├── join_request_service.py    # Join request workflow
│   │   ├── linter.py                  # Ontology validation (20+ rules)
│   │   ├── normalization_service.py   # Ontology normalization
│   │   ├── notification_service.py    # Notification management
│   │   ├── ontology.py                # RDF/OWL graph operations
│   │   ├── ontology_extractor.py      # Ontology structure extraction
│   │   ├── project_service.py         # Project CRUD and members
│   │   ├── pull_request_service.py    # PR workflow with semantic diff
│   │   ├── rdf_utils.py               # Shared RDF utilities
│   │   ├── search.py                  # Full-text search service
│   │   ├── sitemap_notifier.py        # Frontend sitemap revalidation
│   │   ├── storage.py                 # MinIO object storage
│   │   ├── suggestion_service.py      # Suggestion session management
│   │   ├── upstream_sync_service.py   # External repo synchronization
│   │   └── user_service.py            # Zitadel user lookups
│   ├── git/                     # Git integration
│   │   └── bare_repository.py   # pygit2 bare repos for concurrent access
│   └── collab/                  # Collaboration (WebSocket)
│       ├── presence.py          # User presence
│       └── protocol.py          # Message protocol
├── alembic/                     # Database migrations
│   ├── env.py                   # Migration environment
│   ├── script.py.mako           # Migration template
│   └── versions/                # Migration files
├── tests/                       # Test suite
│   ├── conftest.py              # Pytest fixtures
│   └── test_*.py                # Test files
├── scripts/                     # Utility scripts
│   └── setup-zitadel.sh         # Zitadel OIDC configuration
├── docs/                        # Documentation
├── .env                         # Environment variables (not in git)
├── .env.example                 # Example environment file
├── .pre-commit-config.yaml      # Pre-commit hooks (ruff, mypy)
├── alembic.ini                  # Alembic configuration
├── compose.yaml                 # Docker Compose (full stack)
├── compose.prod.yaml            # Docker Compose (infrastructure only)
├── Dockerfile                   # Container build instructions
├── Makefile                     # Dev tasks: setup, lint, format, typecheck, test
├── pyproject.toml               # Project configuration (uv/PEP 735)
└── README.md

Key Components

ontokit/main.py

The FastAPI application entry point:

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI(
    title="OntoKit API",
    description="Collaborative OWL Ontology Curation Platform",
)

# CORS middleware
app.add_middleware(CORSMiddleware, ...)

# Mount the API routers under the /api/v1 URL prefix. The prefix exists only in
# the URL; it is not mirrored in the directory layout. (api_router import omitted.)
app.include_router(api_router, prefix="/api/v1")

ontokit/core/config.py

Configuration management using Pydantic Settings:

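from pydantic import PostgresDsn
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    app_env: str = "development"
    database_url: PostgresDsn  # Required; read from the environment or .env
    # ... other settings

    model_config = SettingsConfigDict(env_file=".env")

# get_settings() is defined in this module (definition omitted here)
settings = get_settings()  # Cached singleton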
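
The "cached singleton" comment can be illustrated without Pydantic. The following is a minimal stdlib-only sketch, assuming the accessor is memoized with `functools.lru_cache`. The `DemoSettings` class, the `get_demo_settings` function, and the `APP_ENV` variable name are hypothetical stand-ins, not OntoKit's actual settings code.

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class DemoSettings:
    """Hypothetical stand-in for the Pydantic Settings class."""

    app_env: str


@lru_cache
def get_demo_settings() -> DemoSettings:
    # Read the environment once; later calls return the cached instance.
    return DemoSettings(app_env=os.environ.get("APP_ENV", "development"))


first = get_demo_settings()
second = get_demo_settings()
same_instance = first is second  # both names refer to one cached object

# Code that changes the environment afterwards (for example, a test)
# has to clear the cache to see the new values.
get_demo_settings.cache_clear()
```

One consequence of this design is that settings are effectively frozen at first access, which is usually what an application wants but is worth remembering in tests.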

ontokit/core/database.py

Async SQLAlchemy database setup:

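from collections.abc import AsyncGenerator

from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine
from sqlalchemy.orm import DeclarativeBase

from ontokit.core.config import settings

class Base(DeclarativeBase):
    pass

engine = create_async_engine(str(settings.database_url))
async_session_maker = async_sessionmaker(engine)

# Yields one session per request and closes it when the request finishes
async def get_db() -> AsyncGenerator[AsyncSession, None]:
    async with async_session_maker() as session:
        yield session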
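
A generator dependency such as `get_db` runs its setup before the `yield` and its cleanup after the caller is done with the value. The sketch below reproduces that lifecycle with the standard library only. `FakeSession`, `get_fake_db`, and `handle_request` are hypothetical names used for illustration, and `asynccontextmanager` only approximates how a framework drives the generator.

```python
import asyncio
from collections.abc import AsyncGenerator
from contextlib import asynccontextmanager


class FakeSession:
    """Hypothetical stand-in for AsyncSession that records whether it was closed."""

    def __init__(self) -> None:
        self.closed = False

    async def __aenter__(self) -> "FakeSession":
        return self

    async def __aexit__(self, *exc_info: object) -> None:
        self.closed = True


async def get_fake_db() -> AsyncGenerator[FakeSession, None]:
    # Same shape as get_db: the session closes when the generator resumes.
    async with FakeSession() as session:
        yield session


async def handle_request() -> FakeSession:
    # Run the generator up to its yield, use the value, then resume it for cleanup.
    async with asynccontextmanager(get_fake_db)() as session:
        assert not session.closed  # still open while the "request" runs
        return session


session_after_request = asyncio.run(handle_request())
```

After `handle_request` returns, `session_after_request.closed` is `True`, which is the guarantee the generator form provides over simply returning a new session.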

ontokit/core/auth.py

JWT token validation and user extraction:

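from typing import Annotated

from fastapi import Depends
from pydantic import BaseModel

class CurrentUser(BaseModel):
    id: str
    email: str | None
    name: str | None
    roles: list[str] = []

# get_current_user and get_current_user_optional: definitions omitted here

# Dependency injection types
RequiredUser = Annotated[CurrentUser, Depends(get_current_user)]
OptionalUser = Annotated[CurrentUser | None, Depends(get_current_user_optional)]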

ontokit/api/routes/

API routers are registered in main.py with the /api/v1/ URL prefix:

# In main.py
app.include_router(projects.router, prefix="/api/v1/projects", tags=["Projects"])
app.include_router(ontologies.router, prefix="/api/v1/ontologies", tags=["Ontologies"])
# ... more routers

Architectural Patterns

Layered Architecture

┌──────────────────────────────────────┐
│         API Layer (Routers)          │  ← HTTP handling, validation
├──────────────────────────────────────┤
│         Service Layer                │  ← Business logic
├──────────────────────────────────────┤
│         Data Layer (Models)          │  ← Database operations
└──────────────────────────────────────┘

Request Flow

  1. Router receives HTTP request
  2. Pydantic schema validates request body
  3. Dependencies inject services, user context
  4. Service executes business logic
  5. Model interacts with database
  6. Response schema formats the response

Example Flow

# Router (ontokit/api/routes/projects.py)
@router.post("", response_model=ProjectResponse)
async def create_project(
    project: ProjectCreate,  # Request body, validated by Pydantic
    user: RequiredUser,      # Authenticated user; has no default, so it precedes defaulted parameters
    service: ProjectService = Depends(get_project_service),  # Injected service
) -> ProjectResponse:
    return await service.create(project, user)

# Service (ontokit/services/project_service.py)
async def create(self, project: ProjectCreate, owner: CurrentUser) -> ProjectResponse:
    db_project = Project(
        name=project.name,
        owner_id=owner.id,
        # ... other fields
    )
    self.db.add(db_project)
    await self.db.commit()
    await self.db.refresh(db_project)  # Reload attributes expired by the commit
    return self._to_response(db_project, owner)
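
Because the service receives its session through the constructor, it can be exercised without a database. The sketch below shows one way to do that with `unittest.mock`; `DemoProject` and `DemoProjectService` are hypothetical, simplified stand-ins for the model and service in the example above.

```python
import asyncio
from dataclasses import dataclass
from unittest.mock import AsyncMock, MagicMock


@dataclass
class DemoProject:
    """Hypothetical stand-in for the Project model."""

    name: str
    owner_id: str


class DemoProjectService:
    """Hypothetical service with the same shape as the example above."""

    def __init__(self, db) -> None:
        self.db = db

    async def create(self, name: str, owner_id: str) -> DemoProject:
        project = DemoProject(name=name, owner_id=owner_id)
        self.db.add(project)    # add() is synchronous on a session
        await self.db.commit()  # commit() is awaited, so the stub must be async
        return project


async def run_demo() -> tuple[DemoProject, MagicMock]:
    db = MagicMock()
    db.commit = AsyncMock()  # only the awaited method needs an async stub
    service = DemoProjectService(db)
    project = await service.create("Pizza Ontology", "user-1")
    return project, db


created, stub_db = asyncio.run(run_demo())
stub_db.add.assert_called_once_with(created)
stub_db.commit.assert_awaited_once()
```

The two assertions at the end check that the service added the object and committed exactly once, which is the service-layer behavior described in the request flow.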

Dependency Injection

FastAPI's Depends() is used throughout:

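# Database session (closed automatically when the request finishes)
async def get_db() -> AsyncGenerator[AsyncSession, None]:
    async with async_session_maker() as session:
        yield session

# Service with database
def get_project_service(db: AsyncSession = Depends(get_db)) -> ProjectService:
    return ProjectService(db)

# In router (named list_projects to avoid shadowing the built-in list)
@router.get("")
async def list_projects(service: ProjectService = Depends(get_project_service)):
    ...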

Naming Conventions

| Type      | Convention                          | Example            |
|-----------|-------------------------------------|--------------------|
| Files     | snake_case.py                       | project_service.py |
| Classes   | PascalCase                          | ProjectService     |
| Functions | snake_case                          | create_project     |
| Variables | snake_case                          | user_id            |
| Constants | UPPER_SNAKE_CASE                    | MAX_LIMIT          |
| API paths | kebab-case (usually) or snake_case  | /api/v1/projects   |

Schema Conventions

For each entity, we typically have:

| Schema               | Purpose                          | Example             |
|----------------------|----------------------------------|---------------------|
| {Entity}Base         | Shared fields                    | ProjectBase         |
| {Entity}Create       | Input for creation               | ProjectCreate       |
| {Entity}Update       | Input for updates (all optional) | ProjectUpdate       |
| {Entity}Response     | API response                     | ProjectResponse     |
| {Entity}ListResponse | Paginated list                   | ProjectListResponse |

Adding New Features

1. Add a Model

# ontokit/models/comment.py
from ontokit.core.database import Base

class Comment(Base):
    __tablename__ = "comments"
    ...

# ontokit/models/__init__.py (export the model so Alembic autogenerate can see it)
from ontokit.models.comment import Comment

2. Create Migration

alembic revision --autogenerate -m "Add comments"
alembic upgrade head

3. Add Schemas

# ontokit/schemas/comment.py
class CommentCreate(BaseModel): ...
class CommentResponse(BaseModel): ...

4. Add Service

# ontokit/services/comment_service.py
class CommentService:
    def __init__(self, db: AsyncSession):
        self.db = db

    async def create(self, ...): ...

5. Add Router

# ontokit/api/routes/comments.py
router = APIRouter()

@router.post("", response_model=CommentResponse)
async def create_comment(...): ...

6. Register Router

# ontokit/main.py
app.include_router(comments.router, prefix="/api/v1/comments", tags=["Comments"])

Next Steps
