Ingest GitHub Issues into Backend Database #74

@joelamouche

Note: this could be done in TypeScript or Rust, because it should be implemented as a separate microservice.

Ingest GitHub Issues into Backend Database

Reference: issue #74

  • Scope

    • Ingest issues from specified GitHub repositories via REST API.
    • Persist issues with metadata, assignees, labels/tags, points, reward-processing flags.
    • Idempotent upserts based on a composite key to avoid duplicates across repos.
    • Out of scope for this ticket: pagination, retry/backoff, and endpoint auth.
  • Data Model

    • github_issues
      • Primary Key: (repo_id BIGINT, github_issue_id BIGINT) — composite PK to avoid duplicates across repos.
      • repo TEXT (org/repo)
      • repo_id BIGINT (GitHub repository ID)
      • github_issue_id BIGINT (GitHub global issue ID)
      • number INT (repo-scoped issue number)
      • title TEXT
      • state TEXT (enum-like: open | closed)
      • labels JSONB (or normalized later)
      • points INT (derived; default 0)
      • assignee_logins JSONB (or normalized later)
      • html_url TEXT
      • created_at TIMESTAMPTZ
      • closed_at TIMESTAMPTZ NULL
      • rewarded BOOL DEFAULT false
      • distribution_id TEXT NULL
      • updated_at TIMESTAMPTZ
    • (Optional normalization, out of scope now) github_issue_assignees(issue_repo_id BIGINT, issue_github_issue_id BIGINT, assignee_login TEXT)
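
The data model above could be written as PostgreSQL DDL along the following lines. This is a sketch rather than the final migration: the column names and types come from the list above, while the NOT NULL choices, the CHECK constraint on state, the JSONB defaults, and the names `CREATE_GITHUB_ISSUES_SQL` and `GithubIssueRow` are assumptions made for illustration.

```typescript
// Sketch of the github_issues table, kept as a string so whichever migration
// tool the service uses can run it. NOT NULL / CHECK / DEFAULT '[]' choices
// are assumptions, not part of the ticket.
const CREATE_GITHUB_ISSUES_SQL = `
CREATE TABLE IF NOT EXISTS github_issues (
  repo_id         BIGINT      NOT NULL,
  github_issue_id BIGINT      NOT NULL,
  repo            TEXT        NOT NULL,
  number          INT         NOT NULL,
  title           TEXT        NOT NULL,
  state           TEXT        NOT NULL CHECK (state IN ('open', 'closed')),
  labels          JSONB       NOT NULL DEFAULT '[]'::jsonb,
  points          INT         NOT NULL DEFAULT 0,
  assignee_logins JSONB       NOT NULL DEFAULT '[]'::jsonb,
  html_url        TEXT        NOT NULL,
  created_at      TIMESTAMPTZ NOT NULL,
  closed_at       TIMESTAMPTZ NULL,
  rewarded        BOOL        NOT NULL DEFAULT false,
  distribution_id TEXT        NULL,
  updated_at      TIMESTAMPTZ NOT NULL,
  PRIMARY KEY (repo_id, github_issue_id)
);
`;

// Row shape mirroring the table (hypothetical type name).
interface GithubIssueRow {
  repo_id: number;
  github_issue_id: number;
  repo: string; // "org/repo"
  number: number; // repo-scoped issue number
  title: string;
  state: "open" | "closed";
  labels: string[];
  points: number;
  assignee_logins: string[];
  html_url: string;
  created_at: string; // ISO timestamp
  closed_at: string | null;
  rewarded: boolean;
  distribution_id: string | null;
  updated_at: string;
}
```
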
  • Endpoint (Auth deferred)

    • POST /admin/github/sync
      • Body: { repos?: string[], since?: string }, where each repo is org/repo and since is an optional ISO 8601 timestamp.
      • Performs ingestion for provided repos (or configured defaults if not provided).
      • Idempotent upsert keyed by (repo_id, github_issue_id).
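
One possible way to validate the request body before syncing is sketched below, independent of any web framework. The names `SyncRequest`, `parseSyncRequest`, `REPO_PATTERN`, and the `defaultRepos` parameter are hypothetical, and the org/repo pattern is a loose assumption rather than GitHub's exact naming rules.

```typescript
interface SyncRequest {
  repos: string[]; // each "org/repo"
  since?: string; // normalized ISO 8601 timestamp
}

// Loose "org/repo" shape check (an assumption, not GitHub's exact rules).
const REPO_PATTERN = /^[A-Za-z0-9_.-]+\/[A-Za-z0-9_.-]+$/;

// Validates the POST body; falls back to configured defaults when `repos`
// is not provided. Throws an Error describing the first problem found.
function parseSyncRequest(body: unknown, defaultRepos: string[]): SyncRequest {
  const input = (typeof body === "object" && body !== null ? body : {}) as {
    repos?: unknown;
    since?: unknown;
  };

  const rawRepos = input.repos === undefined ? defaultRepos : input.repos;
  if (!Array.isArray(rawRepos)) {
    throw new Error("repos must be an array of org/repo strings");
  }
  const repos: string[] = [];
  for (const repo of rawRepos) {
    if (typeof repo !== "string" || !REPO_PATTERN.test(repo)) {
      throw new Error(`invalid repo: ${String(repo)}`);
    }
    repos.push(repo);
  }

  const rawSince = input.since;
  if (rawSince === undefined) return { repos };
  // Date.parse is a loose check: it also accepts some non-ISO date strings.
  if (typeof rawSince !== "string" || isNaN(Date.parse(rawSince))) {
    throw new Error("since must be an ISO timestamp string");
  }
  return { repos, since: new Date(rawSince).toISOString() };
}
```

Keeping validation in a plain function like this makes it easy to unit-test without starting the HTTP server.
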
  • Logic Notes

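    • Ignore PRs for simplicity. GitHub's REST issues endpoint also returns pull requests, identifiable by a pull_request key. (Alternative not chosen: store them with is_pull_request=true but do not compute points.)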
    • Derive points from label patterns like points:3; default 0 if none.
    • Persist assignee GitHub logins.
    • Closed issues are candidates for rewards; do not set rewarded in this sync.
    • Minimal happy-path sync; pagination and rate limit handling are out of scope for this ticket.
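
The transformation described in these notes might look roughly like the sketch below. `GitHubIssuePayload` lists only the fields of the GitHub REST issue object that the sketch reads; the function names are hypothetical. Taking the first `points:N` label when several are present is an assumption, since the ticket does not say how to resolve that case.

```typescript
// Subset of the GitHub REST issue payload that this sketch reads.
interface GitHubIssuePayload {
  id: number;
  number: number;
  title: string;
  state: string;
  html_url: string;
  created_at: string;
  closed_at: string | null;
  labels: Array<string | { name?: string | null }>;
  assignees?: Array<{ login: string }> | null;
  pull_request?: unknown; // present only when the "issue" is really a PR
}

// Lower-cases label names; labels may arrive as strings or as objects.
function normalizeLabels(labels: GitHubIssuePayload["labels"]): string[] {
  const names: string[] = [];
  for (const label of labels) {
    const name = typeof label === "string" ? label : label.name;
    if (name) names.push(name.trim().toLowerCase());
  }
  return names;
}

// Points come from the first label shaped like "points:3"; default 0.
function derivePoints(normalizedLabels: string[]): number {
  for (const label of normalizedLabels) {
    const match = /^points:\s*(\d+)$/.exec(label);
    if (match) return parseInt(match[1], 10);
  }
  return 0;
}

// Returns null for pull requests, which this ticket chooses to ignore.
// `rewarded` and `distribution_id` are deliberately not produced here.
function toIssueFields(repo: string, repoId: number, issue: GitHubIssuePayload) {
  if (issue.pull_request !== undefined) return null;
  const labels = normalizeLabels(issue.labels);
  return {
    repo,
    repo_id: repoId,
    github_issue_id: issue.id,
    number: issue.number,
    title: issue.title,
    state: issue.state === "closed" ? "closed" : "open",
    labels,
    points: derivePoints(labels),
    assignee_logins: (issue.assignees || []).map((a) => a.login),
    html_url: issue.html_url,
    created_at: issue.created_at,
    closed_at: issue.closed_at,
  };
}
```
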
  • Validation

    • Ensure (repo_id, github_issue_id) drives idempotency (upsert).
    • Normalize label names to lower-case.
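
An upsert keyed on the composite primary key could be expressed with PostgreSQL's `INSERT ... ON CONFLICT ... DO UPDATE`, as in the sketch below. The statement leaves the two reward columns out entirely, so the sync neither sets nor resets them. Using `now()` for updated_at (row modification time rather than GitHub's own timestamp) is an assumption, as are the names `IssueUpsertInput`, `UPSERT_ISSUE_SQL`, and `upsertParams`.

```typescript
// Fields the sync writes; reward state is not part of this input.
interface IssueUpsertInput {
  repo_id: number;
  github_issue_id: number;
  repo: string;
  number: number;
  title: string;
  state: "open" | "closed";
  labels: string[];
  points: number;
  assignee_logins: string[];
  html_url: string;
  created_at: string;
  closed_at: string | null;
}

// Parameterized upsert; $1..$12 match the order produced by upsertParams.
const UPSERT_ISSUE_SQL = `
INSERT INTO github_issues (
  repo_id, github_issue_id, repo, number, title, state, labels, points,
  assignee_logins, html_url, created_at, closed_at, updated_at
) VALUES ($1, $2, $3, $4, $5, $6, $7::jsonb, $8, $9::jsonb, $10, $11, $12, now())
ON CONFLICT (repo_id, github_issue_id) DO UPDATE SET
  repo = EXCLUDED.repo,
  number = EXCLUDED.number,
  title = EXCLUDED.title,
  state = EXCLUDED.state,
  labels = EXCLUDED.labels,
  points = EXCLUDED.points,
  assignee_logins = EXCLUDED.assignee_logins,
  html_url = EXCLUDED.html_url,
  closed_at = EXCLUDED.closed_at,
  updated_at = EXCLUDED.updated_at;
`;

// Builds the positional parameters; JSONB columns are sent as JSON text.
function upsertParams(row: IssueUpsertInput): unknown[] {
  return [
    row.repo_id,
    row.github_issue_id,
    row.repo,
    row.number,
    row.title,
    row.state,
    JSON.stringify(row.labels),
    row.points,
    JSON.stringify(row.assignee_logins),
    row.html_url,
    row.created_at,
    row.closed_at,
  ];
}
```

Because a conflicting row is updated in place, running the same sync twice leaves exactly one row per (repo_id, github_issue_id).
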
  • Acceptance Criteria

    • DB schema created with composite primary key (repo_id, github_issue_id).
    • POST /admin/github/sync ingests issues for the specified repos and upserts idempotently.
    • Points derived from labels; assignees and states persisted.
    • Running sync twice yields no duplicates.
    • Tests: transformation (labels → points), idempotent upsert, closed issue reflected.
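
For the idempotency and closed-issue tests, an in-memory stand-in for the table may be enough to exercise the logic without a database. The sketch below is one such test double; `InMemoryIssueStore` and its methods are hypothetical names, and a real test suite would likely also run against PostgreSQL.

```typescript
interface IncomingIssue {
  repo_id: number;
  github_issue_id: number;
  state: "open" | "closed";
  closed_at: string | null;
}

interface StoredIssue extends IncomingIssue {
  rewarded: boolean;
}

// In-memory stand-in for github_issues, keyed like the composite PK.
class InMemoryIssueStore {
  private rows: Record<string, StoredIssue> = {};

  upsert(issue: IncomingIssue): void {
    const key = `${issue.repo_id}:${issue.github_issue_id}`;
    const existing = this.rows[key];
    // Preserve reward state on update; new rows start unrewarded.
    this.rows[key] = { ...issue, rewarded: existing ? existing.rewarded : false };
  }

  get(repoId: number, issueId: number): StoredIssue | undefined {
    return this.rows[`${repoId}:${issueId}`];
  }

  count(): number {
    return Object.keys(this.rows).length;
  }
}

// Sync the same issue twice, then sync it again after it was closed.
const store = new InMemoryIssueStore();
const openIssue: IncomingIssue = {
  repo_id: 7,
  github_issue_id: 10,
  state: "open",
  closed_at: null,
};
store.upsert(openIssue);
store.upsert(openIssue);
store.upsert({ ...openIssue, state: "closed", closed_at: "2024-01-02T00:00:00Z" });
```

After these three calls the store holds a single row whose state is closed and whose rewarded flag is still false.
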

Follow-up Task: Secure the Sync Endpoint

Reference: issue #74

  • Goal

    • Add authentication/authorization to POST /admin/github/sync.
  • Scope

    • Require admin-only access (reuse existing auth middleware/policy).
    • Accept an API key or JWT consistent with current backend auth.
    • Return 401/403 on unauthenticated/unauthorized.
  • Acceptance Criteria

    • Unauthorized requests to /admin/github/sync are rejected.
    • Authorized admin can trigger sync successfully.
    • Tests for auth success/failure paths.
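
The 401/403 split could be decided by a small pure function like the sketch below. It is only an illustration: the real task should reuse the backend's existing auth middleware, which this ticket does not describe. The Bearer scheme, the `Principal` shape, and the `verifyToken` callback are all assumptions standing in for that middleware.

```typescript
interface Principal {
  isAdmin: boolean;
}

// Stand-in for the backend's real token or API-key verification.
type TokenVerifier = (token: string) => Principal | null;

// Returns the HTTP status the endpoint should respond with before syncing:
// 401 when credentials are missing or invalid, 403 when the caller is
// authenticated but not an admin, 200 when the sync may proceed.
function authorizeSync(
  authorizationHeader: string | undefined,
  verifyToken: TokenVerifier,
): 200 | 401 | 403 {
  const match = /^Bearer\s+(\S+)$/.exec(authorizationHeader || "");
  if (!match) return 401;
  const principal = verifyToken(match[1]);
  if (!principal) return 401;
  return principal.isAdmin ? 200 : 403;
}
```

Separating the decision from the HTTP layer keeps the success and failure paths testable with a fake verifier.
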
