GitHub deletes your traffic data every 14 days. This doesn't.
A self-hosted multi-source analytics tracker that preserves GitHub traffic beyond the 14-day retention limit and aggregates download stats from npm, PyPI, and Docker Hub into a single dashboard.
GitHub's repo Insights → Traffic page is great — for two weeks. After that, the data is gone. If you don't check daily, you never see it again. There's no historical chart, no alert, no export.
This tracker runs once a day (via GitHub Actions), pulls from the GitHub API, stores everything in a local SQLite database, and gives you a Next.js dashboard with the historical chart you wish GitHub provided. It also pulls download stats from npm, PyPI, and Docker Hub, so you have one place to see ecosystem adoption across all your release channels.
| Source | What | Retention |
|---|---|---|
| GitHub traffic | Daily views + clones, with deduplicated 14-day uniques | Unlimited (this tracker) |
| GitHub referrers | Top traffic sources, snapshot per day | 90+ days (configurable) |
| GitHub popular paths | Most-visited files, snapshot per day | 90+ days (configurable) |
| GitHub stars + forks | Daily totals across all tracked repos | Unlimited |
| GitHub contributors | Top contributors per repo, snapshot per day | Unlimited |
| GitHub release downloads | Per-tag download counts | Unlimited |
| npm | Daily + last-30-day downloads per package | Unlimited |
| npm by version | Per-version download splits | Unlimited |
| PyPI | Daily downloads, by Python version, by OS | Unlimited |
| PyPI by country | Country-level downloads via BigQuery (optional) | Unlimited |
| Docker Hub | Pull counts + tag history per image | Unlimited |
```
GitHub API ─┐
npm API ────┤
PyPI API ───┼──► collect-*.js ──► SQLite (data/analytics.db) ──► Next.js dashboard
Docker Hub ─┤                ╲
BigQuery ───┘                 ╲──► static JSON (data/*.json)
                                   consumable by external sites
                                   via raw.githubusercontent.com
```
A daily GitHub Actions cron (6:00 AM UTC) runs the collectors, regenerates the SQLite DB, and commits the data plus per-source static JSON artifacts back to the repo. Your dashboard reads from the DB; external sites can read the JSON directly without spinning up the dashboard.
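The schedule itself is a one-line cron expression in the workflow file. A minimal sketch of the trigger (the shipped `collect-stats.yml` will have additional job steps; only the schedule is shown here):

```yaml
on:
  schedule:
    - cron: "0 6 * * *"   # every day at 6:00 AM UTC
  workflow_dispatch:       # also allow manual runs from the Actions tab
```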
```bash
git clone https://github.com/opena2a-org/github-analytics-tracker.git
cd github-analytics-tracker
npm install
cp .env.example .env   # add your GITHUB_TOKEN
npm run setup-db
npm run collect        # fetch from GitHub
npm run collect-npm    # fetch npm download stats
npm run dev            # dashboard at http://localhost:3000
```

That's it. The dashboard renders whatever has been collected so far. After running for a few days, the historical charts start filling in.
In `.env`:

```bash
GITHUB_TOKEN=ghp_...                            # required for GitHub
GITHUB_ORG=opena2a-org,ecolibria                # auto-discovers all public repos
REPOS_TO_TRACK=owner/repo,owner/repo            # optional extra repos
NPM_AUTHOR=ecolibria                            # auto-discovers all packages by this user
NPM_PACKAGES=hackmyagent,opena2a-cli            # optional extra packages
PYPI_PACKAGES=cryptoserve,aim-sdk               # comma-separated
DOCKER_IMAGES=opena2a/aim-server,opena2a/dvaa   # comma-separated
GOOGLE_APPLICATION_CREDENTIALS=/path/to/gcp-key.json   # optional, for BigQuery country stats
```

The included workflow (`.github/workflows/collect-stats.yml`) runs daily at 6:00 AM UTC. To enable it:
- Go to Settings → Secrets and variables → Actions → New repository secret.
- Add `GH_STATS_TOKEN` (a Personal Access Token with `repo` or `public_repo` scope).
- Optionally set `GOOGLE_APPLICATION_CREDENTIALS_JSON` for BigQuery country stats.
The workflow auto-discovers public repos in the orgs listed in `GITHUB_ORG`. Add a new repo to the org and the next run picks it up; no manual list maintenance.
- DB is SQLite. No external database. The full DB ships with the repo as `data/analytics.db` (~7 MB for 30+ repos at one year of history).
- All API endpoints are unauthenticated because the data is already public. If you self-host, that's by design.
- No tracking, no telemetry, no third-party scripts in the dashboard.
- No PII. GitHub's referrer and popular-paths APIs return only aggregate counts: no IPs, no user agents, no session data.
- Static JSON is the canonical export. `data/summary.json` carries cross-source totals; the `data/*-stats-*.json` files carry per-source, per-package details. Consume them directly via raw.githubusercontent.com if you don't want to run the dashboard.
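For example, an external site can read the committed JSON without touching the dashboard. A small sketch, assuming your data lives on the `main` branch (raw.githubusercontent.com serves repo files at `/{owner}/{repo}/{branch}/{path}`; `rawDataUrl` is a hypothetical helper, not part of this project):

```javascript
// Build the raw-content URL for a committed data file.
function rawDataUrl(ownerRepo, file, branch = "main") {
  return `https://raw.githubusercontent.com/${ownerRepo}/${branch}/data/${file}`;
}

// Usage from any site or script (fetch is global in Node 18+ and browsers):
// const summary = await fetch(
//   rawDataUrl("opena2a-org/github-analytics-tracker", "summary.json")
// ).then((r) => r.json());
```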
GitHub's traffic API has subtleties worth knowing:
- Daily uniques cannot be summed. A visitor on three different days appears as `uniques=1` on each of those days; summing gives 3, not 1. We store both the daily counts and the 14-day rolling summary (which GitHub deduplicates correctly) so consumers can pick the right one.
- Today's data is partial. The current day is still being written, so we skip it on collection and only persist completed days.
- Referrers and popular paths are 14-day rolling snapshots, not daily breakdowns. We store one snapshot per day; on re-runs for the same day, we replace it.
- All-time uniques are unreported because they would be wrong for the reason above. We surface the 14-day API figure instead.
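The "skip today" rule can be sketched in a few lines. This is an illustrative sketch, not the project's actual collector; it assumes the documented response shape of GitHub's `GET /repos/{owner}/{repo}/traffic/views` endpoint (`views: [{ timestamp, count, uniques }]`):

```javascript
// Keep only completed days from a GitHub traffic/views response.
// Today's entry is still being written by GitHub, so persisting it
// would freeze a partial count in the database.
function completedDays(trafficResponse, today = new Date()) {
  const todayIso = today.toISOString().slice(0, 10); // "YYYY-MM-DD"
  return trafficResponse.views.filter(
    (day) => day.timestamp.slice(0, 10) < todayIso
  );
}

// Example response (abridged): the third entry is today's partial count.
const sample = {
  count: 30,
  uniques: 12, // 14-day deduplicated figure; NOT the sum of daily uniques
  views: [
    { timestamp: "2024-06-01T00:00:00Z", count: 10, uniques: 4 },
    { timestamp: "2024-06-02T00:00:00Z", count: 15, uniques: 6 },
    { timestamp: "2024-06-03T00:00:00Z", count: 5, uniques: 2 }, // partial
  ],
};

const kept = completedDays(sample, new Date("2024-06-03T12:00:00Z"));
// kept has 2 entries; the partial 2024-06-03 entry is dropped.
```

Note that the sample also shows why daily uniques can't be summed: 4 + 6 + 2 = 12 only by coincidence of the numbers chosen; GitHub's top-level `uniques` is the deduplicated value and is the one to trust.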
| Table | Purpose |
|---|---|
| `repositories` | Tracked repos |
| `traffic_views`, `traffic_clones` | Daily counts (completed days) |
| `traffic_summary` | 14-day deduplicated uniques |
| `referrers`, `popular_paths` | Daily snapshots of 14-day rolling data |
| `stargazers`, `forks` | Daily totals |
| `github_contributors`, `github_releases` | Per-repo extras |
| `npm_packages`, `npm_downloads`, `npm_version_downloads` | npm |
| `pypi_packages`, `pypi_downloads`, `pypi_python_versions`, `pypi_system_stats`, `pypi_country_downloads` | PyPI |
| `docker_images`, `docker_pulls`, `docker_tags` | Docker Hub |
Run `sqlite3 data/analytics.db .schema` for the full DDL.
All read-only, no auth. JSON responses.
```
GET /api/repos                      # list of tracked repos
GET /api/stats?repo_id=1&days=30    # per-repo stats (days: 7|14|30|90|365|all)
GET /api/overview                   # cross-source totals
GET /api/trends?repo_id=1           # daily trend data for charts
GET /api/npm-stats                  # npm package stats
GET /api/pypi-stats                 # PyPI package stats
GET /api/docker-stats               # Docker image stats
```
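A minimal client sketch against the stats endpoint, assuming the dashboard is running locally on port 3000 (the helper names `statsUrl` and `fetchRepoStats` are illustrative, not part of the project; `fetch` is global in Node 18+):

```javascript
// Build the query URL for the per-repo stats endpoint.
function statsUrl(repoId, days = 30, base = "http://localhost:3000") {
  const url = new URL("/api/stats", base);
  url.searchParams.set("repo_id", String(repoId));
  url.searchParams.set("days", String(days));
  return url.toString();
}

// Fetch and parse (requires the dashboard to be up).
async function fetchRepoStats(repoId, days = 30) {
  const res = await fetch(statsUrl(repoId, days));
  if (!res.ok) throw new Error(`API error ${res.status}`);
  return res.json();
}
```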
| | This | ungh.cc | Manual |
|---|---|---|---|
| GitHub history beyond 14 days | ✓ | ✗ | ✗ |
| npm + PyPI + Docker | ✓ | ✗ | partial |
| Self-hosted (no third party sees your token) | ✓ | ✗ | ✓ |
| Dashboard included | ✓ | ✗ | ✗ |
| BigQuery country stats | ✓ optional | ✗ | ✗ |
**How far back can I see data?** As far back as when you started collecting. The first run captures the available 14 days; subsequent runs append.

**What if I miss a day?** GitHub keeps 14 days, so you have a 2-week buffer. Run the collector again within that window and it backfills.

**How much storage?** Roughly 10-20 MB per year per 20 repos.

**Can I track private repos?** Yes, if your `GITHUB_TOKEN` has access. Auto-discovery via `GITHUB_ORG` only picks up public repos; add private ones to `REPOS_TO_TRACK` explicitly.

**Why not all-time unique visitors?** Daily uniques can't be summed (a visitor on 5 days counts as 5 in the sum, not 1). The 14-day API figure is the most accurate unique count GitHub will give you.
See `CONTRIBUTING.md`. For security issues, see `SECURITY.md`.