Migrate Push data from Mercurial to Git #9372

Draft
camd wants to merge 3 commits into master from camd/ingest-git-before-mg

camd (Collaborator) commented Apr 5, 2026

Add git-first push ingestion with Mercurial fallback for repositories
transitioning from hg to git. This enables a gradual migration where
repos can be configured with a git_url to try GitHub first.

Model changes:

  • Add git_url and git_branch fields to Repository model
  • Add RevisionMapping model for hg<->git SHA mapping
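
To illustrate the kind of two-way lookup a RevisionMapping table supports, here is a minimal sketch using the standard-library `sqlite3` module rather than the Django ORM. The table name, column names, and helper functions are assumptions for illustration only; the model's actual fields are not listed in this description.

```python
import sqlite3

# Hypothetical schema: one row per (repository, hg SHA, git SHA) triple.
# The uniqueness constraints keep the mapping one-to-one in both directions.
SCHEMA = """
CREATE TABLE revision_mapping (
    repository   TEXT NOT NULL,
    hg_revision  TEXT NOT NULL,
    git_revision TEXT NOT NULL,
    UNIQUE (repository, hg_revision),
    UNIQUE (repository, git_revision)
)
"""


def open_mapping_db():
    """Create an in-memory database holding the illustrative table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def add_mapping(conn, repository, hg_revision, git_revision):
    """Record a mapping; an already-known pair is silently ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO revision_mapping VALUES (?, ?, ?)",
        (repository, hg_revision, git_revision),
    )


def hg_to_git(conn, repository, hg_revision):
    """Return the git SHA for an hg revision, or None if unmapped."""
    row = conn.execute(
        "SELECT git_revision FROM revision_mapping"
        " WHERE repository = ? AND hg_revision = ?",
        (repository, hg_revision),
    ).fetchone()
    return row[0] if row else None


def git_to_hg(conn, repository, git_revision):
    """Return the hg revision for a git SHA, or None if unmapped."""
    row = conn.execute(
        "SELECT hg_revision FROM revision_mapping"
        " WHERE repository = ? AND git_revision = ?",
        (repository, git_revision),
    ).fetchone()
    return row[0] if row else None
```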

New modules:

  • git_pushlog.py: GitPushlogProcess and fetch_git_push() for fetching
    push data from GitHub API
  • revision_mapper.py: RevisionMapper class for mapping between hg and
    git revision SHAs via DB cache, GitHub search, cinnabar mapfiles,
    or local git log parsing
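
The lookup order described for RevisionMapper (cache first, then progressively more expensive sources) can be sketched as a chain of resolvers. This is a simplified illustration, not the code in revision_mapper.py: the class name `ChainedRevisionMapper`, the `Resolver` type, and the in-memory dict standing in for the DB cache are all assumptions.

```python
from typing import Callable, Dict, Iterable, Optional

# A resolver takes an hg SHA and returns the git SHA, or None if it cannot
# answer. In practice each one might wrap a GitHub search, a cinnabar
# mapfile, or local git log parsing; here they are just callables.
Resolver = Callable[[str], Optional[str]]


class ChainedRevisionMapper:
    """Hypothetical mapper: check a cache, then try resolvers in order."""

    def __init__(self, resolvers: Iterable[Resolver]):
        self._cache: Dict[str, str] = {}  # stands in for the DB cache
        self._resolvers = list(resolvers)

    def hg_to_git(self, hg_sha: str) -> Optional[str]:
        # Cheapest source first: a previously stored mapping.
        if hg_sha in self._cache:
            return self._cache[hg_sha]
        # Otherwise fall through the remaining sources in priority order.
        for resolve in self._resolvers:
            git_sha = resolve(hg_sha)
            if git_sha:
                self._cache[hg_sha] = git_sha  # remember for next time
                return git_sha
        return None
```

With this shape, a source that fails to find a revision simply returns `None` and the next one is tried, so adding or reordering sources does not change the caller.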

Modified ingestion paths:

  • PushLoader._try_git_first(): intercepts hg pushes and tries Git
    when git_url is configured on the repo
  • pushlog_tasks: fetch_push_logs dispatches to git or hg polling;
    new fetch_git_push_log task with hg fallback
  • ingest command: ingest_push() tries git-first for transitioning repos
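
The git-first behaviour shared by these ingestion paths can be sketched roughly as follows. The function name `ingest_git_first` and the `fetch_git`/`fetch_hg` callables are placeholders for illustration; they do not reflect the real signatures of `_try_git_first()` or `fetch_git_push()`.

```python
import logging
from typing import Any, Callable, Optional, Tuple

logger = logging.getLogger(__name__)


def ingest_git_first(
    revision: str,
    git_url: Optional[str],
    fetch_git: Callable[[str, str], Optional[Any]],
    fetch_hg: Callable[[str], Any],
) -> Tuple[Any, str]:
    """Try Git when the repo has a git_url; otherwise, or on failure, use hg.

    Returns the push data and the name of the source that supplied it.
    """
    if git_url:
        try:
            push = fetch_git(git_url, revision)
            if push is not None:
                return push, "git"
        except Exception:
            # Broad catch is deliberate in this sketch: any Git-side failure
            # should degrade to the existing Mercurial path, not abort.
            logger.warning("git fetch failed for %s, falling back to hg", revision)
    return fetch_hg(revision), "hg"
```

Repositories without a `git_url` never touch the Git path, which is what lets the migration proceed one repository at a time.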

Backfill command:

  • backfill_git_pushes management command re-maps Push/Commit revisions
    from hg SHAs to git SHAs, with dry-run, chunking, rate limiting,
    resume support, and --flip-dvcs-type to complete the migration
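
As a rough picture of how dry-run, chunking, rate limiting, and resume can fit together, here is a small sketch over plain dicts. It is not the management command itself: `backfill_revisions`, its parameters, and the dict-based pushes are assumptions, and the real command's option names (other than `--flip-dvcs-type`) are not given in this description.

```python
import time
from typing import Callable, Dict, Iterator, List, Optional, Tuple


def chunked(items: List[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def backfill_revisions(
    pushes: List[dict],
    lookup_git_sha: Callable[[str], Optional[str]],
    chunk_size: int = 100,
    dry_run: bool = True,
    resume_after_id: int = 0,
    sleep_seconds: float = 0.0,
) -> Tuple[Dict[str, int], int]:
    """Re-map push revisions from hg to git SHAs, chunk by chunk.

    Returns counters plus the last processed id, which a later run can
    pass back as `resume_after_id` to continue where this one stopped.
    """
    pending = sorted(
        (p for p in pushes if p["id"] > resume_after_id),
        key=lambda p: p["id"],
    )
    stats = {"mapped": 0, "unmapped": 0}
    last_id = resume_after_id
    for chunk in chunked(pending, chunk_size):
        for push in chunk:
            git_sha = lookup_git_sha(push["revision"])
            if git_sha is None:
                stats["unmapped"] += 1
            else:
                stats["mapped"] += 1
                if not dry_run:  # a dry run only counts, never writes
                    push["revision"] = git_sha
            last_id = push["id"]
        time.sleep(sleep_seconds)  # crude rate limit between chunks
    return stats, last_id
```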

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

camd and others added 3 commits April 4, 2026 09:11
Backend:
- _ingest_git_first() now accepts both git SHAs and hg revisions,
  checking if the revision exists in git before falling back to
  hg-to-git mapping
- validate_revision() in job_loader checks RevisionMapping when a
  push isn't found, rewriting the origin revision to the git SHA
  so tasks referencing hg revisions link to git pushes
- PushSerializer exposes is_git_revision (computed from
  RevisionMapping join, no new migration needed)
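
The revision handling described in the backend changes can be sketched as below. Mercurial node IDs and default git SHA-1s are both 40 hex characters, so the format alone cannot tell them apart; the sketch therefore checks existence first and consults the mapping only on a miss. The function name and the set/dict stand-ins for Push and RevisionMapping lookups are assumptions, not the code in job_loader.

```python
from typing import Dict, Optional, Set


def resolve_push_revision(
    revision: str,
    known_push_revisions: Set[str],
    hg_to_git: Dict[str, str],
) -> Optional[str]:
    """Return the revision a task should link to, or None if no push exists.

    `known_push_revisions` stands in for a Push table lookup and
    `hg_to_git` for a RevisionMapping lookup.
    """
    # The revision already matches an ingested push: use it unchanged.
    if revision in known_push_revisions:
        return revision
    # Otherwise treat it as an hg revision and rewrite it to the git SHA,
    # but only if that git SHA actually has a push.
    git_sha = hg_to_git.get(revision)
    if git_sha is not None and git_sha in known_push_revisions:
        return git_sha
    return None
```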

Frontend:
- RepositoryModel stores gitRevisionHrefPrefix/gitPushLogUrl when
  git_url is set, alongside the existing hg defaults
- getRevisionHref(), getPushLogHref(), getRevisionBaseUrl() accept
  isGitRevision param to select the correct URL per push
- Push, RevisionList, Revision, RevisionLinkify, CommitHistory all
  thread is_git_revision from the push API response through to links

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>