20 changes: 13 additions & 7 deletions README.md
@@ -6,9 +6,9 @@
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)
[![Python 3.11+](https://img.shields.io/badge/python-3.11%2B-blue.svg)](https://www.python.org/downloads/)

-**OpenDNA** parses a 23andMe / AncestryDNA / MyHeritage raw DNA file against 8 curated SNP panels (cardiovascular, methylation, pharmacogenomics, athletic, dietary, HFE, cognition, stimulant sensitivity), joins findings with ClinVar + PharmGKB/CPIC annotations, and renders a self-contained HTML report. It now also scores match confidence, shows panel coverage / blind spots, and rolls multi-marker genes like APOE, HFE, MTHFR, CYP2C19, and warfarin PGx into composite calls. Optional BYOK LLM synthesis (Anthropic Claude or OpenAI GPT) adds a prose interpretation layer.
+**OpenDNA** parses a 23andMe / AncestryDNA / MyHeritage raw DNA file against 12 curated SNP panels (cardiovascular, methylation, pharmacogenomics, athletic, dietary, HFE, histamine, cognition, stimulant sensitivity, eye health, vitamin D, nicotine), joins findings with ClinVar + PharmGKB/CPIC annotations, and renders a self-contained HTML report. It now also scores match confidence, shows panel coverage / blind spots, and rolls multi-marker patterns like APOE, HFE, MTHFR, histamine DAO/HNMT, CYP2C19, warfarin PGx, statin PGx, DPYD, AMD susceptibility, vitamin D tendency, and nicotine dependence into composite calls. Optional BYOK LLM synthesis (Anthropic Claude or OpenAI GPT) adds a prose interpretation layer.

-**What leaves your machine:** nothing, by default. No account, no upload, no telemetry. The raw DNA file is parsed, analyzed, and rendered entirely offline. If — and only if — you paste an API key and opt in to AI synthesis, OpenDNA sends the filtered *findings* (rsid + gene + genotype + tier + short note; roughly 40–50 lines of structured text) to the provider you chose. Your raw DNA file is never transmitted. See the [Privacy](#privacy--what-leaves-your-machine) section for the full rules.
+**What leaves your machine:** nothing, by default. No account, no upload, no telemetry. The raw DNA file is parsed, analyzed, and rendered entirely offline. If — and only if — you paste an API key and opt in to AI synthesis, OpenDNA sends the filtered *findings* (rsid + gene + genotype + tier + short note; roughly a few dozen lines of structured text) to the provider you chose. Your raw DNA file is never transmitted. See the [Privacy](#privacy--what-leaves-your-machine) section for the full rules.

## Sample report

@@ -17,7 +17,7 @@
**[→ View the live sample report](https://raw.githack.com/corbett3000/OpenDNA/main/examples/sample-report.html)** (renders the actual HTML in your browser)
· [view source](examples/sample-report.html)

-The sample was generated from a real 23andMe-style raw DNA file run through the full OpenDNA pipeline (parse → 8 panels → ClinVar + PharmGKB annotation → rule-based render). No personal raw data is embedded — only the interpreted findings.
+The sample was generated from a real 23andMe-style raw DNA file run through the full OpenDNA pipeline (parse → shipped panels → ClinVar + PharmGKB annotation → rule-based render). No personal raw data is embedded — only the interpreted findings.

---

@@ -32,7 +32,7 @@ OpenDNA is an educational and exploratory tool. **It is not a clinical diagnosti
This is the whole trust claim; read it carefully:

1. **Your raw DNA file never leaves your machine.** Parsing, panel matching, ClinVar + PharmGKB annotation, and the rule-based report all happen on `localhost`. No step in that pipeline touches the network.
-2. **LLM synthesis is opt-in and sends a minimal payload.** If and only if you paste an API key and click Generate, OpenDNA sends the filtered findings (roughly 40–50 lines of structured text — rsid, gene, genotype, tier, note) to the provider you chose. Your raw DNA file is **never** sent.
+2. **LLM synthesis is opt-in and sends a minimal payload.** If and only if you paste an API key and click Generate, OpenDNA sends the filtered findings (roughly a few dozen lines of structured text — rsid, gene, genotype, tier, note) to the provider you chose. Your raw DNA file is **never** sent.
3. **No telemetry, no accounts, no server-side persistence.** API keys are read from the request, held in memory for one call, then discarded. There is no database.
4. **The server binds to `127.0.0.1` by default.** Only the machine running it can reach it. `--host 0.0.0.0` requires an explicit flag and prints a warning.
5. **Browser-local convenience storage is opt-in and bounded.** The web UI can remember your DNA file path, LLM provider, model, and API key in the browser's `localStorage` (scoped to `http://localhost:8787` on that one browser profile). Nothing is written to disk by the server, committed to git, or transmitted anywhere. The **Remember** checkbox controls it; the **Clear saved values** button wipes it instantly.
@@ -86,7 +86,7 @@ opendna --version
pytest -v # (only if you installed [dev])
```

-Current contributor baseline: `51` tests passing.
+Current contributor baseline: `53` tests passing.

---

@@ -98,7 +98,7 @@ opendna serve
```

1. **Paste the absolute path** to your raw DNA file (e.g. `/Users/you/Downloads/genome_yourname.txt`).
-2. **Pick panels** — all 8 are selected by default; uncheck to narrow the scope.
+2. **Pick panels** — all 12 are selected by default; uncheck to narrow the scope.
3. **(Optional) AI synthesis** — pick Anthropic or OpenAI, pick a model, paste your API key. Leave the provider as *None* for a rule-based report only.
4. **Click *Generate report*.** Watch the progress bar — you'll see the pipeline advance through validate → parse → analyze → annotate → LLM → render.
5. **Download** the self-contained HTML or the structured JSON.
@@ -143,10 +143,14 @@ Every finding is tier-scored (`risk` / `warning` / `normal` / `unknown`), annota
|---|---|---|
| **Cardiovascular & Longevity** | CAD risk, thrombosis, lifespan | APOE, 9p21, FOXO3, LPA, F5, F2 |
| **Methylation & Detox** | Folate/B12 cycle | MTHFR, COMT, MTR/MTRR, FUT2, CBS, VDR |
-| **Pharmacogenomics (PGx)** | Drug metabolism | CYP2C19, CYP2C9, VKORC1, CYP4F2, SLCO1B1, TPMT, CYP3A5 |
+| **Pharmacogenomics (PGx)** | Drug metabolism | CYP2C19, CYP2C9, VKORC1, CYP4F2, SLCO1B1, ABCG2, DPYD, TPMT, CYP3A5 |
| **Athletic Performance & Recovery** | Fiber type, recovery | ACTN3, PPARA, PPARGC1A, COL5A1, IL6 |
| **Dietary Sensitivity** | Lactose, caffeine, alcohol, omega-3 | LCT, CYP1A2, ALDH2, FADS1, TAS2R38 |
+| **Eye Health & Macular Degeneration** | Common AMD predisposition | CFH, ARMS2, C3 |
| **Iron Metabolism (HFE)** | Hemochromatosis | C282Y, H63D, S65C |
+| **Histamine Handling** | Exploratory histamine breakdown | AOC1 (DAO), HNMT |
+| **Nicotine Dependence & Smoking Response** | Smoking heaviness, nicotine reinforcement | CHRNA5, CHRNA3 |
+| **Vitamin D & Bone** | Low-25(OH)D tendency, signaling | DHCR7, CYP2R1, VDR |
| **Cognition & Mood** | Plasticity, dopamine | BDNF, DRD2, OXTR |
| **Stimulant Sensitivity** | Caffeine/adenosine | ADORA2A, CYP1A2, COMT |

@@ -157,6 +161,8 @@ The analyzer automatically handles both **allele-order variations** (`AG` == `GA
- `Not on chip` means the marker was not present in the source file. It does **not** mean the user has a reassuring genotype there.
- `No-call` means the vendor included the marker but did not make a confident genotype call.
- Composite calls are only made where the source file type can support them. OpenDNA still does **not** infer rare variants, structural variants, HLA types, CYP2D6 star alleles, or methylation from array data.
+- The histamine panel is exploratory. It can flag lower-clearance DAO/HNMT patterns, but it does **not** diagnose histamine intolerance, food reactions, or mast-cell disease on its own.
+- The AMD, vitamin D, and nicotine panels are predisposition panels. They do **not** diagnose current eye disease, vitamin D deficiency, or addiction by themselves.
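
The difference between `Not on chip` and `No-call` in the notes above can be sketched in a few lines of Python. This is an illustration only: `classify_marker`, the `genotypes` mapping (rsid to genotype string), and the set of no-call encodings are assumptions, not OpenDNA's actual analyzer API.

```python
# Assumed no-call encodings; vendors differ, so treat this set as a placeholder.
NO_CALL_GENOTYPES = {"--", "00", ""}


def classify_marker(rsid, genotypes):
    """Classify one panel marker against a parsed raw-data mapping.

    `genotypes` is a hypothetical dict of rsid -> genotype string.
    """
    if rsid not in genotypes:
        # The array never assayed this marker; this says nothing about the genotype.
        return "not_on_chip"
    if genotypes[rsid] in NO_CALL_GENOTYPES:
        # The marker is in the file, but the vendor made no confident call.
        return "no_call"
    return "called"


parsed = {"rs1061170": "CT", "rs10490924": "--"}
for rsid in ("rs1061170", "rs10490924", "rs2230199"):
    print(rsid, classify_marker(rsid, parsed))
```

Keeping the two states separate means a missing marker is never silently reported as a reassuring result.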

---

12 changes: 5 additions & 7 deletions docs/superpowers/handover-2026-04-17.md
@@ -4,7 +4,7 @@

## What this is

-Local-first web utility that parses a raw 23andMe-style DNA file, matches against 8 curated SNP panels, joins findings with ClinVar + PharmGKB/CPIC annotations, and renders a self-contained HTML report. Optional BYOK LLM synthesis (Anthropic default with prompt caching, OpenAI secondary) adds prose interpretation.
+Local-first web utility that parses a raw 23andMe-style DNA file, matches against 12 curated SNP panels, joins findings with ClinVar + PharmGKB/CPIC annotations, and renders a self-contained HTML report. Optional BYOK LLM synthesis (Anthropic default with prompt caching, OpenAI secondary) adds prose interpretation.

- **Repo:** https://github.com/corbett3000/OpenDNA — **public**, Apache 2.0
- **Runs at:** http://localhost:8787 (default)
@@ -16,16 +16,16 @@ Local-first web utility that parses a raw 23andMe-style DNA file, matches agains

### Shipped features

-- 8 curated SNP panels (cardiovascular, methylation, pharmacogenomics, athletic, dietary, HFE, cognition, stimulant sensitivity) — now expanded with F5 / F2 thrombophilia markers and CYP4F2 warfarin PGx support.
+- 12 curated SNP panels (cardiovascular, methylation, pharmacogenomics, athletic, dietary, HFE, histamine, cognition, stimulant sensitivity, eye health, vitamin D, nicotine) — now expanded with F5 / F2 thrombophilia markers, CYP4F2 / ABCG2 / DPYD PGx support, exploratory DAO / HNMT histamine handling coverage, common AMD susceptibility markers, vitamin D tendency markers, and nicotine-dependence markers.
- ClinVar (16 rsids) + PharmGKB (8 rsids, CPIC drug recs for TPMT / CYP2C19 / CYP2C9 / VKORC1 / CYP4F2 / SLCO1B1) shipped subsets.
- FastAPI server with five endpoints: `/`, `/api/panels`, `/api/analyze`, `/api/analyze-stream` (SSE for live progress), `/api/update-db`.
- Vanilla-JS SPA: DOM APIs only (no `innerHTML` on dynamic data), report rendered into sandboxed iframe. Live progress bar with per-stage messages. Optional browser `localStorage` persistence of form values behind a *Remember* checkbox + *Clear saved values* button.
- CLI: `opendna serve`, `opendna scan <file>`, `opendna update-db`.
- Jinja2 HTML report (autoescape on) styled after the original sandbox synthesis — stat tiles, color-coded card grid per panel, critical-findings alert box for risk tier, AI synthesis block when LLM is configured.
-- Report now includes source-file QC, vendor/build heuristics, panel completeness, confidence labels, gene caveats, and composite summaries for APOE, HFE, MTHFR, CYP2C19, and warfarin PGx.
+- Report now includes source-file QC, vendor/build heuristics, panel completeness, confidence labels, gene caveats, and composite summaries for APOE, HFE, MTHFR, histamine DAO/HNMT, CYP2C19, warfarin PGx, statin PGx, DPYD, AMD susceptibility, vitamin D tendency, and nicotine dependence.
- LLM provider layer: pluggable ABC, Anthropic default (`claude-sonnet-4-6` + prompt caching), OpenAI (`gpt-4o` using `max_completion_tokens`). API keys held per-request in memory only.
- Analyzer normalizes allele order (`AG == GA`) AND reverse complement (panel `CC` matches file `GG` for non-palindromic C/T SNPs).
-- 51 tests passing (parser, panels, analyzer, summaries, annotations, report, LLM providers, server, CLI, SSE stream, e2e). `ruff check` clean. CI runs on Python 3.11 / 3.12 / 3.13.
+- 53 tests passing (parser, panels, analyzer, summaries, annotations, report, LLM providers, server, CLI, SSE stream, e2e). `ruff check` clean. CI runs on Python 3.11 / 3.12 / 3.13.
- Apache 2.0 license.
- README with full setup guide, per-provider raw-DNA download instructions, sample report + screenshot, troubleshooting, author links (`@corbett3000`, `stillrush.co/bio`).
- Sample report: `examples/sample-report.html` (served live via raw.githack; preview image at `docs/assets/sample-report.png`).
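
The allele-order and strand normalization mentioned in the feature list above could look roughly like the following. This is a minimal sketch, assuming panel interpretations are keyed by two-letter genotypes as in the panel JSON files. `canonical`, `flip_strand`, and `match_genotype` are hypothetical names, and skipping the strand flip for A/T and C/G panels is one reasonable reading of the "non-palindromic" qualifier, not the confirmed implementation.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
PALINDROMIC_ALLELE_SETS = ({"A", "T"}, {"C", "G"})


def canonical(genotype):
    """Sort the two alleles so that 'GA' and 'AG' compare equal."""
    return "".join(sorted(genotype.upper()))


def flip_strand(genotype):
    """Complement each allele (opposite strand), then canonicalize."""
    return canonical(genotype.upper().translate(COMPLEMENT))


def match_genotype(file_genotype, interpretations):
    """Return the panel key matching the file genotype, or None."""
    keys = {canonical(key): key for key in interpretations}
    direct = canonical(file_genotype)
    if direct in keys:
        return keys[direct]
    # For A/T or C/G SNPs the complement is indistinguishable from a real
    # genotype, so the strand flip is skipped (assumption, see lead-in).
    if set("".join(interpretations)) in PALINDROMIC_ALLELE_SETS:
        return None
    return keys.get(flip_strand(file_genotype))


panel = {"CC": "normal", "CT": "warning", "TT": "risk"}
print(match_genotype("TC", panel))  # allele order only
print(match_genotype("GG", panel))  # reported on the opposite strand
```

Canonicalizing both sides before comparing keeps the panel files free of duplicate keys such as `AG` and `GA`.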
@@ -55,12 +55,10 @@ If Claude Code opens this repo, `CLAUDE.md` at the root will auto-load with a co
```bash
cd ~/Coding/OpenDNA
source .venv/bin/activate # or create one: python3 -m venv .venv && pip install -e ".[dev,all]"
-pytest -v # 44 tests should pass
+pytest -v # 53 tests should pass
opendna serve # localhost:8787
```

-The current baseline is `51` passing tests.

## Key design decisions (for context)

| Decision | Choice | Why |
40 changes: 40 additions & 0 deletions src/opendna/panels/eye_health.json
@@ -0,0 +1,40 @@
{
"id": "eye_health",
"name": "Eye Health & Macular Degeneration",
"description": "Common age-related macular degeneration (AMD) susceptibility markers. Best read as predisposition, not diagnosis.",
"snps": [
{
"rsid": "rs1061170",
"gene": "CFH",
"variant_name": "Y402H",
"description": "Complement-factor H variant strongly associated with common AMD risk in European-ancestry studies.",
"interpretations": {
"TT": {"tier": "normal", "note": "No common CFH Y402H risk allele detected at this site."},
"CT": {"tier": "warning", "note": "One common AMD risk allele is present at CFH. Lifestyle and retinal screening matter more as age advances."},
"CC": {"tier": "risk", "note": "Two common CFH risk alleles are present. This is a stronger inherited AMD predisposition signal."}
}
},
{
"rsid": "rs10490924",
"gene": "ARMS2",
"variant_name": "A69S",
"description": "One of the best-replicated common AMD variants; smoking amplifies the risk signal.",
"interpretations": {
"GG": {"tier": "normal", "note": "No common ARMS2 risk allele detected at this site."},
"GT": {"tier": "warning", "note": "One common AMD risk allele is present at ARMS2."},
"TT": {"tier": "risk", "note": "Two ARMS2 risk alleles are present. This is a strong common susceptibility signal for AMD."}
}
},
{
"rsid": "rs2230199",
"gene": "C3",
"variant_name": "R102G",
"description": "Complement C3 variant associated with advanced AMD risk in multiple cohorts.",
"interpretations": {
"CC": {"tier": "normal", "note": "No common C3 AMD risk allele detected at this site."},
"CG": {"tier": "warning", "note": "One C3 risk allele is present. This modestly raises common AMD susceptibility."},
"GG": {"tier": "risk", "note": "Two C3 risk alleles are present. This is a stronger complement-pathway AMD signal."}
}
}
]
}
62 changes: 62 additions & 0 deletions src/opendna/panels/histamine.json
@@ -0,0 +1,62 @@
{
"id": "histamine",
"name": "Histamine Handling",
"description": "Exploratory panel for histamine breakdown via DAO (AOC1) and HNMT. Useful for pattern-spotting, not for diagnosing histamine intolerance or mast-cell disease.",
"snps": [
{
"rsid": "rs10156191",
"gene": "AOC1",
"variant_name": "DAO Thr16Met",
"description": "Coding variant in the DAO enzyme. The T allele has been associated with reduced DAO activity in human association studies.",
"interpretations": {
"CC": {"tier": "normal", "note": "No reduced-DAO allele at this site."},
"CT": {"tier": "warning", "note": "One reduced-DAO allele. Histamine clearance may be a bit less resilient, especially with high histamine load or gut inflammation."},
"TT": {"tier": "risk", "note": "Two reduced-DAO alleles at this site. This is a stronger genetic signal for lower DAO-mediated histamine breakdown."}
}
},
{
"rsid": "rs1049742",
"gene": "AOC1",
"variant_name": "DAO Ser332Phe",
"description": "DAO coding variant associated with lower enzyme activity in AOC1 deficiency studies.",
"interpretations": {
"CC": {"tier": "normal", "note": "No reduced-DAO allele at this site."},
"CT": {"tier": "warning", "note": "One reduced-DAO allele. On its own this is usually a mild signal, but it matters more when other DAO variants stack up."},
"TT": {"tier": "risk", "note": "Two reduced-DAO alleles at this site. Stronger genetic signal for lower DAO function."}
}
},
{
"rsid": "rs1049793",
"gene": "AOC1",
"variant_name": "DAO His664Asp",
"description": "DAO coding variant associated with reduced histamine-degrading capacity in AOC1 studies.",
"interpretations": {
"CC": {"tier": "normal", "note": "No reduced-DAO allele at this site."},
"CG": {"tier": "warning", "note": "One reduced-DAO allele. This can contribute to lower histamine tolerance when paired with other DAO variants."},
"GG": {"tier": "risk", "note": "Two reduced-DAO alleles at this site. Stronger signal for lower DAO-mediated histamine breakdown."}
}
},
{
"rsid": "rs2052129",
"gene": "AOC1",
"variant_name": "DAO promoter c.-691G>T",
"description": "Promoter-region variant associated with lower AOC1 transcription and lower DAO activity.",
"interpretations": {
"GG": {"tier": "normal", "note": "No lower-expression DAO allele at this site."},
"GT": {"tier": "warning", "note": "One lower-expression allele. DAO reserve may be somewhat lower under stressors like alcohol, infection, or gut irritation."},
"TT": {"tier": "risk", "note": "Two lower-expression alleles at this site. Stronger signal for reduced DAO expression."}
}
},
{
"rsid": "rs11558538",
"gene": "HNMT",
"variant_name": "HNMT Thr105Ile",
"description": "Functional HNMT variant. The T allele lowers histamine N-methyltransferase activity and can reduce intracellular histamine clearance.",
"interpretations": {
"CC": {"tier": "normal", "note": "Standard HNMT activity at this site."},
"CT": {"tier": "warning", "note": "One lower-activity HNMT allele. Histamine breakdown outside the gut may be somewhat slower."},
"TT": {"tier": "risk", "note": "Two lower-activity HNMT alleles. Stronger signal for reduced HNMT-mediated histamine breakdown."}
}
}
]
}
40 changes: 40 additions & 0 deletions src/opendna/panels/nicotine.json
@@ -0,0 +1,40 @@
{
"id": "nicotine",
"name": "Nicotine Dependence & Smoking Response",
"description": "Common smoking-heaviness and nicotine-reinforcement markers. Behavioral predisposition only; not deterministic.",
"snps": [
{
"rsid": "rs16969968",
"gene": "CHRNA5",
"variant_name": "D398N",
"description": "The strongest common CHRNA5 smoking-dependence marker in many GWAS; associated with heavier smoking among smokers.",
"interpretations": {
"GG": {"tier": "normal", "note": "No common CHRNA5 nicotine-risk allele detected at this site."},
"AG": {"tier": "warning", "note": "One CHRNA5 risk allele is present. If nicotine use starts, reinforcement can run stronger than baseline."},
"AA": {"tier": "risk", "note": "Two CHRNA5 risk alleles are present. This is the strongest common dependence-prone pattern at this site."}
}
},
{
"rsid": "rs1051730",
"gene": "CHRNA3",
"variant_name": "Smoking quantity marker",
"description": "Closely related to smoking heaviness and toxin exposure among smokers.",
"interpretations": {
"GG": {"tier": "normal", "note": "No common CHRNA3 smoking-quantity risk allele detected at this site."},
"AG": {"tier": "warning", "note": "One CHRNA3 risk allele is present. This can track with heavier smoking once nicotine use begins."},
"AA": {"tier": "risk", "note": "Two CHRNA3 risk alleles are present. This is a stronger common signal for heavier smoking exposure."}
}
},
{
"rsid": "rs588765",
"gene": "CHRNA5",
"variant_name": "Regulatory expression marker",
"description": "Expression-linked CHRNA5 marker that refines nicotine-dependence tendency when paired with rs16969968.",
"interpretations": {
"TT": {"tier": "normal", "note": "No common rs588765 risk allele detected at this site."},
"CT": {"tier": "warning", "note": "One regulatory risk allele is present. This is a smaller signal than rs16969968 but can reinforce it."},
"CC": {"tier": "warning", "note": "Two regulatory risk alleles are present. This is a moderate added nicotine-reinforcement signal."}
}
}
]
}
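
The three new panel files share one shape: a list of `snps`, each with an `rsid`, a `gene`, and an `interpretations` map from genotype to `tier` and `note`. A sketch of how such a file might be consumed follows. The `interpret` helper and the fallback to the `unknown` tier for an unlisted genotype are assumptions for illustration, and the embedded JSON is an abridged copy of one entry from `nicotine.json` with the notes shortened.

```python
import json

# Abridged copy of one SNP entry from nicotine.json (notes shortened).
PANEL = json.loads("""
{
  "id": "nicotine",
  "snps": [
    {
      "rsid": "rs16969968",
      "gene": "CHRNA5",
      "interpretations": {
        "GG": {"tier": "normal", "note": "(abridged)"},
        "AG": {"tier": "warning", "note": "(abridged)"},
        "AA": {"tier": "risk", "note": "(abridged)"}
      }
    }
  ]
}
""")


def interpret(panel, genotypes):
    """Return one finding per panel SNP that appears in `genotypes`."""
    findings = []
    for snp in panel["snps"]:
        genotype = genotypes.get(snp["rsid"])
        if genotype is None:
            continue  # marker absent from the file: nothing to report here
        # Assumption: a genotype with no matching key falls back to "unknown".
        call = snp["interpretations"].get(genotype, {"tier": "unknown", "note": ""})
        findings.append({
            "rsid": snp["rsid"],
            "gene": snp["gene"],
            "genotype": genotype,
            "tier": call["tier"],
            "note": call["note"],
        })
    return findings


print(interpret(PANEL, {"rs16969968": "AG"}))
```

Each finding carries the same five fields the README lists for the optional LLM payload (rsid, gene, genotype, tier, note).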