diff --git a/README.md b/README.md index fadeea9..69ab0f3 100644 --- a/README.md +++ b/README.md @@ -6,9 +6,9 @@ [![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE) [![Python 3.11+](https://img.shields.io/badge/python-3.11%2B-blue.svg)](https://www.python.org/downloads/) -**OpenDNA** parses a 23andMe / AncestryDNA / MyHeritage raw DNA file against 8 curated SNP panels (cardiovascular, methylation, pharmacogenomics, athletic, dietary, HFE, cognition, stimulant sensitivity), joins findings with ClinVar + PharmGKB/CPIC annotations, and renders a self-contained HTML report. It now also scores match confidence, shows panel coverage / blind spots, and rolls multi-marker genes like APOE, HFE, MTHFR, CYP2C19, and warfarin PGx into composite calls. Optional BYOK LLM synthesis (Anthropic Claude or OpenAI GPT) adds a prose interpretation layer. +**OpenDNA** parses a 23andMe / AncestryDNA / MyHeritage raw DNA file against 12 curated SNP panels (cardiovascular, methylation, pharmacogenomics, athletic, dietary, HFE, histamine, cognition, stimulant sensitivity, eye health, vitamin D, nicotine), joins findings with ClinVar + PharmGKB/CPIC annotations, and renders a self-contained HTML report. It now also scores match confidence, shows panel coverage / blind spots, and rolls multi-marker patterns like APOE, HFE, MTHFR, histamine DAO/HNMT, CYP2C19, warfarin PGx, statin PGx, DPYD, AMD susceptibility, vitamin D tendency, and nicotine dependence into composite calls. Optional BYOK LLM synthesis (Anthropic Claude or OpenAI GPT) adds a prose interpretation layer. -**What leaves your machine:** nothing, by default. No account, no upload, no telemetry. The raw DNA file is parsed, analyzed, and rendered entirely offline. If — and only if — you paste an API key and opt in to AI synthesis, OpenDNA sends the filtered *findings* (rsid + gene + genotype + tier + short note; roughly 40–50 lines of structured text) to the provider you chose. 
Your raw DNA file is never transmitted. See the [Privacy](#privacy--what-leaves-your-machine) section for the full rules. +**What leaves your machine:** nothing, by default. No account, no upload, no telemetry. The raw DNA file is parsed, analyzed, and rendered entirely offline. If — and only if — you paste an API key and opt in to AI synthesis, OpenDNA sends the filtered *findings* (rsid + gene + genotype + tier + short note; roughly a few dozen lines of structured text) to the provider you chose. Your raw DNA file is never transmitted. See the [Privacy](#privacy--what-leaves-your-machine) section for the full rules. ## Sample report @@ -17,7 +17,7 @@ **[→ View the live sample report](https://raw.githack.com/corbett3000/OpenDNA/main/examples/sample-report.html)** (renders the actual HTML in your browser) · [view source](examples/sample-report.html) -The sample was generated from a real 23andMe-style raw DNA file run through the full OpenDNA pipeline (parse → 8 panels → ClinVar + PharmGKB annotation → rule-based render). No personal raw data is embedded — only the interpreted findings. +The sample was generated from a real 23andMe-style raw DNA file run through the full OpenDNA pipeline (parse → shipped panels → ClinVar + PharmGKB annotation → rule-based render). No personal raw data is embedded — only the interpreted findings. --- @@ -32,7 +32,7 @@ OpenDNA is an educational and exploratory tool. **It is not a clinical diagnosti This is the whole trust claim; read it carefully: 1. **Your raw DNA file never leaves your machine.** Parsing, panel matching, ClinVar + PharmGKB annotation, and the rule-based report all happen on `localhost`. No step in that pipeline touches the network. -2. **LLM synthesis is opt-in and sends a minimal payload.** If and only if you paste an API key and click Generate, OpenDNA sends the filtered findings (roughly 40–50 lines of structured text — rsid, gene, genotype, tier, note) to the provider you chose. 
Your raw DNA file is **never** sent. +2. **LLM synthesis is opt-in and sends a minimal payload.** If and only if you paste an API key and click Generate, OpenDNA sends the filtered findings (roughly a few dozen lines of structured text — rsid, gene, genotype, tier, note) to the provider you chose. Your raw DNA file is **never** sent. 3. **No telemetry, no accounts, no server-side persistence.** API keys are read from the request, held in memory for one call, then discarded. There is no database. 4. **The server binds to `127.0.0.1` by default.** Only the machine running it can reach it. `--host 0.0.0.0` requires an explicit flag and prints a warning. 5. **Browser-local convenience storage is opt-in and bounded.** The web UI can remember your DNA file path, LLM provider, model, and API key in the browser's `localStorage` (scoped to `http://localhost:8787` on that one browser profile). Nothing is written to disk by the server, committed to git, or transmitted anywhere. The **Remember** checkbox controls it; the **Clear saved values** button wipes it instantly. @@ -86,7 +86,7 @@ opendna --version pytest -v # (only if you installed [dev]) ``` -Current contributor baseline: `51` tests passing. +Current contributor baseline: `53` tests passing. --- @@ -98,7 +98,7 @@ opendna serve ``` 1. **Paste the absolute path** to your raw DNA file (e.g. `/Users/you/Downloads/genome_yourname.txt`). -2. **Pick panels** — all 8 are selected by default; uncheck to narrow the scope. +2. **Pick panels** — all 12 are selected by default; uncheck to narrow the scope. 3. **(Optional) AI synthesis** — pick Anthropic or OpenAI, pick a model, paste your API key. Leave the provider as *None* for a rule-based report only. 4. **Click *Generate report*.** Watch the progress bar — you'll see the pipeline advance through validate → parse → analyze → annotate → LLM → render. 5. **Download** the self-contained HTML or the structured JSON. 
@@ -143,10 +143,14 @@ Every finding is tier-scored (`risk` / `warning` / `normal` / `unknown`), annota |---|---|---| | **Cardiovascular & Longevity** | CAD risk, thrombosis, lifespan | APOE, 9p21, FOXO3, LPA, F5, F2 | | **Methylation & Detox** | Folate/B12 cycle | MTHFR, COMT, MTR/MTRR, FUT2, CBS, VDR | -| **Pharmacogenomics (PGx)** | Drug metabolism | CYP2C19, CYP2C9, VKORC1, CYP4F2, SLCO1B1, TPMT, CYP3A5 | +| **Pharmacogenomics (PGx)** | Drug metabolism | CYP2C19, CYP2C9, VKORC1, CYP4F2, SLCO1B1, ABCG2, DPYD, TPMT, CYP3A5 | | **Athletic Performance & Recovery** | Fiber type, recovery | ACTN3, PPARA, PPARGC1A, COL5A1, IL6 | | **Dietary Sensitivity** | Lactose, caffeine, alcohol, omega-3 | LCT, CYP1A2, ALDH2, FADS1, TAS2R38 | +| **Eye Health & Macular Degeneration** | Common AMD predisposition | CFH, ARMS2, C3 | | **Iron Metabolism (HFE)** | Hemochromatosis | C282Y, H63D, S65C | +| **Histamine Handling** | Exploratory histamine breakdown | AOC1 (DAO), HNMT | +| **Nicotine Dependence & Smoking Response** | Smoking heaviness, nicotine reinforcement | CHRNA5, CHRNA3 | +| **Vitamin D & Bone** | Low-25(OH)D tendency, signaling | DHCR7, CYP2R1, VDR | | **Cognition & Mood** | Plasticity, dopamine | BDNF, DRD2, OXTR | | **Stimulant Sensitivity** | Caffeine/adenosine | ADORA2A, CYP1A2, COMT | @@ -157,6 +161,8 @@ The analyzer automatically handles both **allele-order variations** (`AG` == `GA - `Not on chip` means the marker was not present in the source file. It does **not** mean the user has a reassuring genotype there. - `No-call` means the vendor included the marker but did not make a confident genotype call. - Composite calls are only made where the source file type can support them. OpenDNA still does **not** infer rare variants, structural variants, HLA types, CYP2D6 star alleles, or methylation from array data. +- The histamine panel is exploratory. 
It can flag lower-clearance DAO/HNMT patterns, but it does **not** diagnose histamine intolerance, food reactions, or mast-cell disease on its own. +- The AMD, vitamin D, and nicotine panels are predisposition panels. They do **not** diagnose current eye disease, vitamin D deficiency, or addiction by themselves. --- diff --git a/docs/superpowers/handover-2026-04-17.md b/docs/superpowers/handover-2026-04-17.md index 3598ecc..ac422c6 100644 --- a/docs/superpowers/handover-2026-04-17.md +++ b/docs/superpowers/handover-2026-04-17.md @@ -4,7 +4,7 @@ ## What this is -Local-first web utility that parses a raw 23andMe-style DNA file, matches against 8 curated SNP panels, joins findings with ClinVar + PharmGKB/CPIC annotations, and renders a self-contained HTML report. Optional BYOK LLM synthesis (Anthropic default with prompt caching, OpenAI secondary) adds prose interpretation. +Local-first web utility that parses a raw 23andMe-style DNA file, matches against 12 curated SNP panels, joins findings with ClinVar + PharmGKB/CPIC annotations, and renders a self-contained HTML report. Optional BYOK LLM synthesis (Anthropic default with prompt caching, OpenAI secondary) adds prose interpretation. - **Repo:** https://github.com/corbett3000/OpenDNA — **public**, Apache 2.0 - **Runs at:** http://localhost:8787 (default) @@ -16,16 +16,16 @@ Local-first web utility that parses a raw 23andMe-style DNA file, matches agains ### Shipped features -- 8 curated SNP panels (cardiovascular, methylation, pharmacogenomics, athletic, dietary, HFE, cognition, stimulant sensitivity) — now expanded with F5 / F2 thrombophilia markers and CYP4F2 warfarin PGx support. 
+- 12 curated SNP panels (cardiovascular, methylation, pharmacogenomics, athletic, dietary, HFE, histamine, cognition, stimulant sensitivity, eye health, vitamin D, nicotine) — now expanded with F5 / F2 thrombophilia markers, CYP4F2 / ABCG2 / DPYD PGx support, exploratory DAO / HNMT histamine handling coverage, common AMD susceptibility markers, vitamin D tendency markers, and nicotine-dependence markers. - ClinVar (16 rsids) + PharmGKB (8 rsids, CPIC drug recs for TPMT / CYP2C19 / CYP2C9 / VKORC1 / CYP4F2 / SLCO1B1) shipped subsets. - FastAPI server with five endpoints: `/`, `/api/panels`, `/api/analyze`, `/api/analyze-stream` (SSE for live progress), `/api/update-db`. - Vanilla-JS SPA: DOM APIs only (no `innerHTML` on dynamic data), report rendered into sandboxed iframe. Live progress bar with per-stage messages. Optional browser `localStorage` persistence of form values behind a *Remember* checkbox + *Clear saved values* button. - CLI: `opendna serve`, `opendna scan `, `opendna update-db`. - Jinja2 HTML report (autoescape on) styled after the original sandbox synthesis — stat tiles, color-coded card grid per panel, critical-findings alert box for risk tier, AI synthesis block when LLM is configured. -- Report now includes source-file QC, vendor/build heuristics, panel completeness, confidence labels, gene caveats, and composite summaries for APOE, HFE, MTHFR, CYP2C19, and warfarin PGx. +- Report now includes source-file QC, vendor/build heuristics, panel completeness, confidence labels, gene caveats, and composite summaries for APOE, HFE, MTHFR, histamine DAO/HNMT, CYP2C19, warfarin PGx, statin PGx, DPYD, AMD susceptibility, vitamin D tendency, and nicotine dependence. - LLM provider layer: pluggable ABC, Anthropic default (`claude-sonnet-4-6` + prompt caching), OpenAI (`gpt-4o` using `max_completion_tokens`). API keys held per-request in memory only. 
- Analyzer normalizes allele order (`AG == GA`) AND reverse complement (panel `CC` matches file `GG` for non-palindromic C/T SNPs). -- 51 tests passing (parser, panels, analyzer, summaries, annotations, report, LLM providers, server, CLI, SSE stream, e2e). `ruff check` clean. CI runs on Python 3.11 / 3.12 / 3.13. +- 53 tests passing (parser, panels, analyzer, summaries, annotations, report, LLM providers, server, CLI, SSE stream, e2e). `ruff check` clean. CI runs on Python 3.11 / 3.12 / 3.13. - Apache 2.0 license. - README with full setup guide, per-provider raw-DNA download instructions, sample report + screenshot, troubleshooting, author links (`@corbett3000`, `stillrush.co/bio`). - Sample report: `examples/sample-report.html` (served live via raw.githack; preview image at `docs/assets/sample-report.png`). @@ -55,12 +55,10 @@ If Claude Code opens this repo, `CLAUDE.md` at the root will auto-load with a co ```bash cd ~/Coding/OpenDNA source .venv/bin/activate # or create one: python3 -m venv .venv && pip install -e ".[dev,all]" -pytest -v # 44 tests should pass +pytest -v # 53 tests should pass opendna serve # localhost:8787 ``` -The current baseline is `51` passing tests. - ## Key design decisions (for context) | Decision | Choice | Why | diff --git a/src/opendna/panels/eye_health.json b/src/opendna/panels/eye_health.json new file mode 100644 index 0000000..d0e0bee --- /dev/null +++ b/src/opendna/panels/eye_health.json @@ -0,0 +1,40 @@ +{ + "id": "eye_health", + "name": "Eye Health & Macular Degeneration", + "description": "Common age-related macular degeneration (AMD) susceptibility markers. 
Best read as predisposition, not diagnosis.", + "snps": [ + { + "rsid": "rs1061170", + "gene": "CFH", + "variant_name": "Y402H", + "description": "Complement-factor H variant strongly associated with common AMD risk in European-ancestry studies.", + "interpretations": { + "TT": {"tier": "normal", "note": "No common CFH Y402H risk allele detected at this site."}, + "CT": {"tier": "warning", "note": "One common AMD risk allele is present at CFH. Lifestyle and retinal screening matter more as age advances."}, + "CC": {"tier": "risk", "note": "Two common CFH risk alleles are present. This is a stronger inherited AMD predisposition signal."} + } + }, + { + "rsid": "rs10490924", + "gene": "ARMS2", + "variant_name": "A69S", + "description": "One of the best-replicated common AMD variants; smoking amplifies the risk signal.", + "interpretations": { + "GG": {"tier": "normal", "note": "No common ARMS2 risk allele detected at this site."}, + "GT": {"tier": "warning", "note": "One common AMD risk allele is present at ARMS2."}, + "TT": {"tier": "risk", "note": "Two ARMS2 risk alleles are present. This is a strong common susceptibility signal for AMD."} + } + }, + { + "rsid": "rs2230199", + "gene": "C3", + "variant_name": "R102G", + "description": "Complement C3 variant associated with advanced AMD risk in multiple cohorts.", + "interpretations": { + "CC": {"tier": "normal", "note": "No common C3 AMD risk allele detected at this site."}, + "CG": {"tier": "warning", "note": "One C3 risk allele is present. This modestly raises common AMD susceptibility."}, + "GG": {"tier": "risk", "note": "Two C3 risk alleles are present. 
This is a stronger complement-pathway AMD signal."} + } + } + ] +} diff --git a/src/opendna/panels/histamine.json b/src/opendna/panels/histamine.json new file mode 100644 index 0000000..a42f231 --- /dev/null +++ b/src/opendna/panels/histamine.json @@ -0,0 +1,62 @@ +{ + "id": "histamine", + "name": "Histamine Handling", + "description": "Exploratory panel for histamine breakdown via DAO (AOC1) and HNMT. Useful for pattern-spotting, not for diagnosing histamine intolerance or mast-cell disease.", + "snps": [ + { + "rsid": "rs10156191", + "gene": "AOC1", + "variant_name": "DAO Thr16Met", + "description": "Coding variant in the DAO enzyme. The T allele has been associated with reduced DAO activity in human association studies.", + "interpretations": { + "CC": {"tier": "normal", "note": "No reduced-DAO allele at this site."}, + "CT": {"tier": "warning", "note": "One reduced-DAO allele. Histamine clearance may be a bit less resilient, especially with high histamine load or gut inflammation."}, + "TT": {"tier": "risk", "note": "Two reduced-DAO alleles at this site. This is a stronger genetic signal for lower DAO-mediated histamine breakdown."} + } + }, + { + "rsid": "rs1049742", + "gene": "AOC1", + "variant_name": "DAO Ser332Phe", + "description": "DAO coding variant associated with lower enzyme activity in AOC1 deficiency studies.", + "interpretations": { + "CC": {"tier": "normal", "note": "No reduced-DAO allele at this site."}, + "CT": {"tier": "warning", "note": "One reduced-DAO allele. On its own this is usually a mild signal, but it matters more when other DAO variants stack up."}, + "TT": {"tier": "risk", "note": "Two reduced-DAO alleles at this site. 
Stronger genetic signal for lower DAO function."} + } + }, + { + "rsid": "rs1049793", + "gene": "AOC1", + "variant_name": "DAO His664Asp", + "description": "DAO coding variant associated with reduced histamine-degrading capacity in AOC1 studies.", + "interpretations": { + "CC": {"tier": "normal", "note": "No reduced-DAO allele at this site."}, + "CG": {"tier": "warning", "note": "One reduced-DAO allele. This can contribute to lower histamine tolerance when paired with other DAO variants."}, + "GG": {"tier": "risk", "note": "Two reduced-DAO alleles at this site. Stronger signal for lower DAO-mediated histamine breakdown."} + } + }, + { + "rsid": "rs2052129", + "gene": "AOC1", + "variant_name": "DAO promoter c.-691G>T", + "description": "Promoter-region variant associated with lower AOC1 transcription and lower DAO activity.", + "interpretations": { + "GG": {"tier": "normal", "note": "No lower-expression DAO allele at this site."}, + "GT": {"tier": "warning", "note": "One lower-expression allele. DAO reserve may be somewhat lower under stressors like alcohol, infection, or gut irritation."}, + "TT": {"tier": "risk", "note": "Two lower-expression alleles at this site. Stronger signal for reduced DAO expression."} + } + }, + { + "rsid": "rs11558538", + "gene": "HNMT", + "variant_name": "HNMT Thr105Ile", + "description": "Functional HNMT variant. The T allele lowers histamine N-methyltransferase activity and can reduce intracellular histamine clearance.", + "interpretations": { + "CC": {"tier": "normal", "note": "Standard HNMT activity at this site."}, + "CT": {"tier": "warning", "note": "One lower-activity HNMT allele. Histamine breakdown outside the gut may be somewhat slower."}, + "TT": {"tier": "risk", "note": "Two lower-activity HNMT alleles. 
Stronger signal for reduced HNMT-mediated histamine breakdown."} + } + } + ] +} diff --git a/src/opendna/panels/nicotine.json b/src/opendna/panels/nicotine.json new file mode 100644 index 0000000..195fa82 --- /dev/null +++ b/src/opendna/panels/nicotine.json @@ -0,0 +1,40 @@ +{ + "id": "nicotine", + "name": "Nicotine Dependence & Smoking Response", + "description": "Common smoking-heaviness and nicotine-reinforcement markers. Behavioral predisposition only; not deterministic.", + "snps": [ + { + "rsid": "rs16969968", + "gene": "CHRNA5", + "variant_name": "D398N", + "description": "The strongest common CHRNA5 smoking-dependence marker in many GWAS; associated with heavier smoking among smokers.", + "interpretations": { + "GG": {"tier": "normal", "note": "No common CHRNA5 nicotine-risk allele detected at this site."}, + "AG": {"tier": "warning", "note": "One CHRNA5 risk allele is present. If nicotine use starts, reinforcement can run stronger than baseline."}, + "AA": {"tier": "risk", "note": "Two CHRNA5 risk alleles are present. This is the strongest common dependence-prone pattern at this site."} + } + }, + { + "rsid": "rs1051730", + "gene": "CHRNA3", + "variant_name": "Smoking quantity marker", + "description": "Closely related to smoking heaviness and toxin exposure among smokers.", + "interpretations": { + "GG": {"tier": "normal", "note": "No common CHRNA3 smoking-quantity risk allele detected at this site."}, + "AG": {"tier": "warning", "note": "One CHRNA3 risk allele is present. This can track with heavier smoking once nicotine use begins."}, + "AA": {"tier": "risk", "note": "Two CHRNA3 risk alleles are present. 
This is a stronger common signal for heavier smoking exposure."} + } + }, + { + "rsid": "rs588765", + "gene": "CHRNA5", + "variant_name": "Regulatory expression marker", + "description": "Expression-linked CHRNA5 marker that refines nicotine-dependence tendency when paired with rs16969968.", + "interpretations": { + "TT": {"tier": "normal", "note": "No common rs588765 risk allele detected at this site."}, + "CT": {"tier": "warning", "note": "One regulatory risk allele is present. This is a smaller signal than rs16969968 but can reinforce it."}, + "CC": {"tier": "warning", "note": "Two regulatory risk alleles are present. This is a moderate added nicotine-reinforcement signal."} + } + } + ] +} diff --git a/src/opendna/panels/pharmacogenomics.json b/src/opendna/panels/pharmacogenomics.json index 99028b0..157e574 100644 --- a/src/opendna/panels/pharmacogenomics.json +++ b/src/opendna/panels/pharmacogenomics.json @@ -80,6 +80,61 @@ "CC": {"tier": "risk", "note": "Highest myopathy risk — avoid simvastatin 80 mg."} } }, + { + "rsid": "rs2231142", + "gene": "ABCG2", + "variant_name": "c.421C>A (Q141K)", + "description": "Common BCRP transporter variant. 
Most relevant for rosuvastatin exposure and complements SLCO1B1 for statin interpretation.", + "interpretations": { + "CC": {"tier": "normal", "note": "Baseline ABCG2 transporter activity at this site."}, + "AC": {"tier": "warning", "note": "Decreased transporter function — rosuvastatin exposure can run higher than expected."}, + "AA": {"tier": "risk", "note": "Poorer ABCG2 transporter function — strongest common signal here for elevated rosuvastatin exposure."} + } + }, + { + "rsid": "rs3918290", + "gene": "DPYD", + "variant_name": "*2A splice variant", + "description": "Rare but highly actionable fluoropyrimidine toxicity variant for 5-FU and capecitabine.", + "interpretations": { + "GG": {"tier": "normal", "note": "No DPYD *2A variant detected at this site."}, + "AG": {"tier": "risk", "note": "Actionable DPYD toxicity variant present — fluoropyrimidines can cause severe toxicity without dose reduction."}, + "AA": {"tier": "risk", "note": "Two DPYD *2A risk alleles present — extremely high fluoropyrimidine toxicity risk."} + } + }, + { + "rsid": "rs55886062", + "gene": "DPYD", + "variant_name": "*13 (c.1679T>G)", + "description": "Rare DPYD loss-of-function variant with strong fluoropyrimidine toxicity relevance.", + "interpretations": { + "TT": {"tier": "normal", "note": "No DPYD *13 variant detected at this site."}, + "TG": {"tier": "risk", "note": "Actionable DPYD loss-of-function variant present — major fluoropyrimidine dose reduction is typically required."}, + "GG": {"tier": "risk", "note": "Two DPYD *13 risk alleles present — strongest toxicity signal in this marker set."} + } + }, + { + "rsid": "rs67376798", + "gene": "DPYD", + "variant_name": "c.2846A>T", + "description": "Clinically actionable DPYD decreased-function variant associated with fluoropyrimidine toxicity.", + "interpretations": { + "AA": {"tier": "normal", "note": "No DPYD c.2846A>T variant detected at this site."}, + "AT": {"tier": "risk", "note": "Actionable DPYD toxicity variant present — 
standard 5-FU or capecitabine doses can be unsafe."}, + "TT": {"tier": "risk", "note": "Two DPYD c.2846A>T risk alleles present — strongest common toxicity signal at this site."} + } + }, + { + "rsid": "rs56038477", + "gene": "DPYD", + "variant_name": "HapB3 tag (c.1236G>A)", + "description": "Tag SNP used on some arrays for the DPYD HapB3 decreased-function haplotype. Useful, but not as definitive as directly assaying the causal intronic SNP.", + "interpretations": { + "GG": {"tier": "normal", "note": "No common HapB3 tag signal detected at this site."}, + "AG": {"tier": "warning", "note": "HapB3 tag present — possible DPYD decreased function. Confirm with a clinical PGx assay before treatment decisions."}, + "AA": {"tier": "risk", "note": "Two HapB3 tag alleles are present. This is a stronger common signal for lower DPD activity, but still benefits from confirmatory testing."} + } + }, { "rsid": "rs1142345", "gene": "TPMT", diff --git a/src/opendna/panels/vitamin_d.json b/src/opendna/panels/vitamin_d.json new file mode 100644 index 0000000..a9b9504 --- /dev/null +++ b/src/opendna/panels/vitamin_d.json @@ -0,0 +1,40 @@ +{ + "id": "vitamin_d", + "name": "Vitamin D & Bone", + "description": "Common vitamin D synthesis, activation, and receptor markers. Useful for predisposition only; labs still matter most.", + "snps": [ + { + "rsid": "rs12785878", + "gene": "DHCR7", + "variant_name": "Skin synthesis marker", + "description": "GWAS marker near DHCR7/NADSYN1 associated with lower circulating 25-hydroxyvitamin D in some populations.", + "interpretations": { + "TT": {"tier": "normal", "note": "No common lower-vitamin-D risk allele detected at this site."}, + "GT": {"tier": "warning", "note": "One lower-vitamin-D risk allele is present. Sun exposure and season can dominate the actual lab result."}, + "GG": {"tier": "risk", "note": "Two lower-vitamin-D risk alleles are present. 
This is a stronger inherited tendency toward lower 25(OH)D."} + } + }, + { + "rsid": "rs10741657", + "gene": "CYP2R1", + "variant_name": "25-hydroxylase marker", + "description": "CYP2R1 variant associated with baseline vitamin D level and supplementation response.", + "interpretations": { + "AA": {"tier": "normal", "note": "No common lower-activity CYP2R1 signal detected at this site."}, + "AG": {"tier": "warning", "note": "One common CYP2R1 risk allele is present. Some people with this pattern run lower 25(OH)D at baseline."}, + "GG": {"tier": "risk", "note": "Two common CYP2R1 risk alleles are present. This is a stronger genetic tendency toward lower 25(OH)D."} + } + }, + { + "rsid": "rs2228570", + "gene": "VDR", + "variant_name": "FokI / receptor sensitivity", + "description": "Common vitamin D receptor variant already used in the methylation panel; repeated here because it fits vitamin D signaling directly.", + "interpretations": { + "GG": {"tier": "normal", "note": "Common receptor pattern."}, + "AG": {"tier": "normal", "note": "Intermediate receptor pattern."}, + "AA": {"tier": "warning", "note": "Lower receptor efficiency — vitamin D signaling may be a bit less resilient."} + } + } + ] +} diff --git a/src/opendna/summaries.py b/src/opendna/summaries.py index 58eabe7..2bb82a1 100644 --- a/src/opendna/summaries.py +++ b/src/opendna/summaries.py @@ -15,10 +15,38 @@ _TIER_RANK = {"risk": 3, "warning": 2, "normal": 1, "unknown": 0} GENE_CAVEATS = { + "AOC1": ( + "DAO genetics are only one piece of histamine tolerance. Food load, alcohol, " + "medications, gut inflammation, and mast-cell biology can dominate symptoms." + ), + "ABCG2": ( + "ABCG2 rs2231142 is most useful for rosuvastatin exposure, not the full " + "statin-response landscape." + ), "APOE": ( "APOE status here comes from rs429358 + rs7412 only; it does not capture " "non-APOE causes of dementia or lipid risk." ), + "ARMS2": ( + "This reflects common AMD susceptibility only. 
Smoking, age, and retinal " + "findings still matter far more than genetics alone." + ), + "CFH": ( + "This common complement marker can raise AMD predisposition, but it does " + "not diagnose existing retinal disease." + ), + "CHRNA3": ( + "Nicotine-risk markers reflect reinforcement tendency if smoking begins; " + "they are not destiny and they are not a reason to smoke." + ), + "CHRNA5": ( + "These are behavior-risk markers for nicotine reinforcement, not a " + "diagnosis of addiction and not a reason to start smoking." + ), + "C3": ( + "This common complement-pathway marker contributes to AMD predisposition " + "only; it does not measure current eye health." + ), "CYP2C19": ( "This captures the common *2 and *17 markers only; other loss-of-function " "alleles are not assessed in most consumer array files." @@ -35,6 +63,18 @@ "CYP4F2 contributes modestly to warfarin dose and should be read " "alongside CYP2C9 and VKORC1." ), + "CYP2R1": ( + "CYP2R1 variants influence vitamin D tendency, but sun exposure, season, " + "body size, diet, and supplements usually dominate actual lab values." + ), + "DHCR7": ( + "DHCR7 is a common vitamin D GWAS marker, not a diagnosis of vitamin D " + "deficiency. Serum 25(OH)D remains the real measurement." + ), + "DPYD": ( + "Consumer arrays rarely cover the full DPYD allele set. A normal result " + "here does not rule out fluoropyrimidine toxicity risk." + ), "F2": ( "This common thrombophilia marker raises baseline clot risk but does not " "diagnose an active clotting disorder." @@ -47,6 +87,10 @@ "These common HFE variants explain many classic hemochromatosis patterns, " "but not every cause of iron overload or anemia." ), + "HNMT": ( + "HNMT affects intracellular histamine breakdown, not the full picture of " + "dietary histamine handling or mast-cell activation." + ), "MTHFR": ( "Common MTHFR SNPs do not directly diagnose folate deficiency; " "homocysteine, B12, folate, and diet still matter." 
@@ -59,6 +103,10 @@ "This panel captures TPMT *3C only; other TPMT and NUDT15 variants can " "still be clinically important." ), + "VDR": ( + "VDR changes signaling sensitivity, not measured serum vitamin D. Labs and " + "lifestyle still determine whether someone is actually deficient." + ), "VKORC1": ( "Warfarin response still depends on age, diet, liver function, " "interacting drugs, and other genes." @@ -360,6 +408,94 @@ def _warfarin_insight(rsid_map: dict[str, Finding]) -> DerivedInsight | None: ) +def _statin_insight(rsid_map: dict[str, Finding]) -> DerivedInsight | None: + slco1b1 = rsid_map.get("rs4149056") + abcg2 = rsid_map.get("rs2231142") + slco1b1_count = _variant_count(slco1b1, "C") + abcg2_count = _variant_count(abcg2, "A") + if slco1b1_count is None and abcg2_count is None: + return None + + slco1b1_count = slco1b1_count or 0 + abcg2_count = abcg2_count or 0 + if slco1b1_count == 0 and abcg2_count == 0: + tier = "normal" + detail = ( + "Statin composite: no common SLCO1B1 or ABCG2 risk pattern was detected " + "in the markers this file covered." + ) + elif slco1b1_count == 2 or (slco1b1_count >= 1 and abcg2_count >= 1): + tier = "risk" + detail = ( + "Statin composite: combined transporter risk pattern. Simvastatin " + "myopathy risk and rosuvastatin exposure are both more likely to run high." + ) + else: + tier = "warning" + detail = ( + "Statin composite: a common transporter variant is present. The clearest " + "signal is lower tolerance for simvastatin or higher rosuvastatin exposure." 
+ ) + + return _make_insight( + "statin", + "Statin Composite", + detail, + tier, + _present_findings(slco1b1, abcg2), + ) + + +def _dpyd_insight(rsid_map: dict[str, Finding]) -> DerivedInsight | None: + star2a = rsid_map.get("rs3918290") + star13 = rsid_map.get("rs55886062") + c2846 = rsid_map.get("rs67376798") + hapb3 = rsid_map.get("rs56038477") + + star2a_count = _variant_count(star2a, "A") + star13_count = _variant_count(star13, "G") + c2846_count = _variant_count(c2846, "T") + hapb3_count = _variant_count(hapb3, "A") + counts = [ + count + for count in (star2a_count, star13_count, c2846_count, hapb3_count) + if count is not None + ] + if not counts: + return None + + strong_hits = sum(count or 0 for count in (star2a_count, star13_count, c2846_count)) + tag_hits = hapb3_count or 0 + if strong_hits >= 1: + tier = "risk" + detail = ( + "DPYD composite: clinically actionable fluoropyrimidine-toxicity variant " + "detected. Standard 5-FU or capecitabine dosing can be unsafe without " + "formal PGx-guided adjustment." + ) + elif tag_hits >= 1: + tier = "warning" + detail = ( + "DPYD composite: HapB3 tag signal present. This can indicate lower DPD " + "activity, but clinical confirmation is still warranted because the tag " + "SNP is not perfect." + ) + else: + tier = "normal" + detail = ( + "DPYD composite: no common fluoropyrimidine-toxicity signal was detected " + "in the DPYD markers this file covered." 
+ ) + + return _make_insight( + "dpyd", + "DPYD Composite", + detail, + tier, + _present_findings(star2a, star13, c2846, hapb3), + ) + + def _mthfr_insight(rsid_map: dict[str, Finding]) -> DerivedInsight | None: c677t = rsid_map.get("rs1801133") a1298c = rsid_map.get("rs1801131") @@ -399,6 +535,199 @@ def _mthfr_insight(rsid_map: dict[str, Finding]) -> DerivedInsight | None: ) +def _histamine_insight(rsid_map: dict[str, Finding]) -> DerivedInsight | None: + dao_16 = rsid_map.get("rs10156191") + dao_332 = rsid_map.get("rs1049742") + dao_664 = rsid_map.get("rs1049793") + dao_promoter = rsid_map.get("rs2052129") + hnmt = rsid_map.get("rs11558538") + + dao_16_count = _variant_count(dao_16, "T") + dao_332_count = _variant_count(dao_332, "T") + dao_664_count = _variant_count(dao_664, "G") + dao_promoter_count = _variant_count(dao_promoter, "T") + hnmt_count = _variant_count(hnmt, "T") + + counts = [ + count + for count in ( + dao_16_count, + dao_332_count, + dao_664_count, + dao_promoter_count, + hnmt_count, + ) + if count is not None + ] + if not counts: + return None + + dao_total = sum( + count or 0 + for count in (dao_16_count, dao_332_count, dao_664_count, dao_promoter_count) + ) + hnmt_total = hnmt_count or 0 + + if dao_total == 0 and hnmt_total == 0: + tier = "normal" + detail = ( + "Histamine composite: no reduced-clearance alleles were detected in the " + "DAO and HNMT markers this file covered." + ) + elif dao_total >= 4 or (dao_total >= 2 and hnmt_total >= 1) or hnmt_total == 2: + tier = "risk" + detail = ( + "Histamine composite: multi-hit reduced-clearance pattern across DAO/HNMT " + "markers. This is a stronger genetic signal for lower histamine " + "breakdown capacity." + ) + else: + tier = "warning" + detail = ( + "Histamine composite: at least one reduced-clearance DAO/HNMT marker is " + "present. This can matter more when symptoms are triggered by alcohol, " + "high-histamine foods, gut irritation, or certain medications." 
+        )
+
+    detail += (
+        " This is exploratory genetics, not a diagnosis of histamine intolerance "
+        "or mast-cell disease."
+    )
+    return _make_insight(
+        "histamine",
+        "Histamine Composite",
+        detail,
+        tier,
+        _present_findings(dao_16, dao_332, dao_664, dao_promoter, hnmt),
+    )
+
+
+def _amd_insight(rsid_map: dict[str, Finding]) -> DerivedInsight | None:
+    cfh = rsid_map.get("rs1061170")
+    arms2 = rsid_map.get("rs10490924")
+    c3 = rsid_map.get("rs2230199")
+    cfh_count = _variant_count(cfh, "C")
+    arms2_count = _variant_count(arms2, "T")
+    c3_count = _variant_count(c3, "G")
+    counts = [count for count in (cfh_count, arms2_count, c3_count) if count is not None]
+    if not counts:
+        return None
+
+    total = sum(count or 0 for count in (cfh_count, arms2_count, c3_count))
+    if total == 0:
+        tier = "normal"
+        detail = (
+            "AMD composite: no common high-risk pattern was detected in the AMD "
+            "markers this file covered."
+        )
+    elif (arms2_count or 0) == 2 or total >= 4:
+        tier = "risk"
+        detail = (
+            "AMD composite: stronger common susceptibility pattern across CFH, ARMS2, "
+            "and C3. Smoking avoidance and routine eye care become more important."
+        )
+    else:
+        tier = "warning"
+        detail = (
+            "AMD composite: at least one common susceptibility marker is present. "
+            "This is a predisposition signal, not a diagnosis of retinal disease."
+        )
+
+    return _make_insight(
+        "amd",
+        "AMD Composite",
+        detail,
+        tier,
+        _present_findings(cfh, arms2, c3),
+    )
+
+
+def _vitamin_d_insight(rsid_map: dict[str, Finding]) -> DerivedInsight | None:
+    dhcr7 = rsid_map.get("rs12785878")
+    cyp2r1 = rsid_map.get("rs10741657")
+    vdr = rsid_map.get("rs2228570")
+    dhcr7_count = _variant_count(dhcr7, "G")
+    cyp2r1_count = _variant_count(cyp2r1, "G")
+    vdr_count = _variant_count(vdr, "A")
+    counts = [count for count in (dhcr7_count, cyp2r1_count, vdr_count) if count is not None]
+    if not counts:
+        return None
+
+    total = sum(count or 0 for count in (dhcr7_count, cyp2r1_count, vdr_count))
+    if total == 0:
+        tier = "normal"
+        detail = (
+            "Vitamin D composite: no common low-vitamin-D tendency was detected in "
+            "the markers this file covered."
+        )
+    elif total >= 3:
+        tier = "risk"
+        detail = (
+            "Vitamin D composite: multi-hit tendency toward lower 25(OH)D or less "
+            "efficient signaling. Lab monitoring matters more than guessing."
+        )
+    else:
+        tier = "warning"
+        detail = (
+            "Vitamin D composite: at least one common lower-vitamin-D marker is "
+            "present. This can matter more during winter, with low sun exposure, or "
+            "if intake is poor."
+        )
+
+    return _make_insight(
+        "vitamin_d",
+        "Vitamin D Composite",
+        detail,
+        tier,
+        _present_findings(dhcr7, cyp2r1, vdr),
+    )
+
+
+def _nicotine_insight(rsid_map: dict[str, Finding]) -> DerivedInsight | None:
+    chrna5 = rsid_map.get("rs16969968")
+    chrna3 = rsid_map.get("rs1051730")
+    regulatory = rsid_map.get("rs588765")
+    chrna5_count = _variant_count(chrna5, "A")
+    chrna3_count = _variant_count(chrna3, "A")
+    regulatory_count = _variant_count(regulatory, "C")
+    counts = [
+        count
+        for count in (chrna5_count, chrna3_count, regulatory_count)
+        if count is not None
+    ]
+    if not counts:
+        return None
+
+    primary_total = (chrna5_count or 0) + (chrna3_count or 0)
+    regulatory_total = regulatory_count or 0
+    if primary_total == 0 and regulatory_total == 0:
+        tier = "normal"
+        detail = (
+            "Nicotine composite: no common high-risk nicotine-reinforcement pattern "
+            "was detected in the markers this file covered."
+        )
+    elif primary_total >= 3 or (primary_total >= 2 and regulatory_total >= 1):
+        tier = "risk"
+        detail = (
+            "Nicotine composite: stronger common dependence-prone pattern. If "
+            "nicotine use begins, reinforcement and smoking heaviness may run higher."
+        )
+    else:
+        tier = "warning"
+        detail = (
+            "Nicotine composite: at least one common dependence-related marker is "
+            "present. This is a behavioral predisposition, not destiny."
+        )
+
+    return _make_insight(
+        "nicotine",
+        "Nicotine Composite",
+        detail,
+        tier,
+        _present_findings(chrna5, chrna3, regulatory),
+    )
+
+
 def _build_derived_insights(findings: list[Finding]) -> list[DerivedInsight]:
     rsid_map = {
         finding.rsid: finding
@@ -410,7 +739,13 @@ def _build_derived_insights(findings: list[Finding]) -> list[DerivedInsight]:
         _hfe_insight,
         _cyp2c19_insight,
         _warfarin_insight,
+        _statin_insight,
+        _dpyd_insight,
         _mthfr_insight,
+        _histamine_insight,
+        _amd_insight,
+        _vitamin_d_insight,
+        _nicotine_insight,
     )
     insights = [insight for builder in builders if (insight := builder(rsid_map)) is not None]
     insights.sort(key=lambda item: (-_TIER_RANK[item.tier], item.title))
diff --git a/tests/test_panels.py b/tests/test_panels.py
index 4b3ca17..0c78ba4 100644
--- a/tests/test_panels.py
+++ b/tests/test_panels.py
@@ -6,13 +6,17 @@
     "pharmacogenomics",
     "athletic",
     "dietary",
+    "eye_health",
     "hfe",
+    "histamine",
     "cognition",
+    "nicotine",
     "sensitivity",
+    "vitamin_d",
 }
 
 
-def test_all_eight_panels_load() -> None:
+def test_all_twelve_panels_load() -> None:
     panels = load_panels()
     assert {p.id for p in panels} == EXPECTED_PANELS
 
@@ -40,9 +44,21 @@ def test_expanded_panels_include_new_high_signal_markers() -> None:
     panels = {p.id: p for p in load_panels()}
     cardio_rsids = {s.rsid for s in panels["cardiovascular"].snps}
     pgx_rsids = {s.rsid for s in panels["pharmacogenomics"].snps}
+    eye_rsids = {s.rsid for s in panels["eye_health"].snps}
+    histamine_rsids = {s.rsid for s in panels["histamine"].snps}
+    nicotine_rsids = {s.rsid for s in panels["nicotine"].snps}
+    vitamin_d_rsids = {s.rsid for s in panels["vitamin_d"].snps}
     assert "rs6025" in cardio_rsids  # F5 Factor V Leiden
     assert "rs1799963" in cardio_rsids  # F2 prothrombin G20210A
     assert "rs2108622" in pgx_rsids  # CYP4F2 warfarin modifier
+    assert "rs2231142" in pgx_rsids  # ABCG2 statin transporter
+    assert "rs3918290" in pgx_rsids  # DPYD *2A
+    assert "rs1061170" in eye_rsids  # CFH Y402H
+    assert "rs10490924" in eye_rsids  # ARMS2 A69S
+    assert "rs10156191" in histamine_rsids  # AOC1 / DAO
+    assert "rs11558538" in histamine_rsids  # HNMT
+    assert "rs16969968" in nicotine_rsids  # CHRNA5
+    assert "rs12785878" in vitamin_d_rsids  # DHCR7
 
 
 def test_snp_interpretations_are_indexed_by_genotype() -> None:
diff --git a/tests/test_summaries.py b/tests/test_summaries.py
index f9b868b..657a002 100644
--- a/tests/test_summaries.py
+++ b/tests/test_summaries.py
@@ -26,6 +26,32 @@ def test_build_analysis_summary_from_fixture(fixtures_dir: Path) -> None:
     assert "Methylation & Detox" in comt.panels
 
 
+def test_build_analysis_summary_infers_histamine_composite() -> None:
+    panels = load_panels()
+    findings = annotate(
+        analyze(
+            {
+                "rs10156191": "TT",
+                "rs1049742": "CT",
+                "rs1049793": "CG",
+                "rs2052129": "GT",
+                "rs11558538": "CT",
+            },
+            panels,
+            selected_panel_ids={"histamine"},
+        ),
+        load_clinvar(),
+        load_pharmgkb(),
+    )
+    summary = build_analysis_summary(findings, panels)
+    histamine = next(item for item in summary.derived_insights if item.id == "histamine")
+    aoc1 = next(item for item in summary.gene_summaries if item.gene == "AOC1")
+    assert histamine.tier == "risk"
+    assert "histamine composite" in histamine.summary.lower()
+    assert "exploratory genetics" in histamine.summary.lower()
+    assert aoc1.caveat is not None
+
+
 def test_build_analysis_summary_infers_cyp2c19_and_warfarin_composites() -> None:
     panels = load_panels()
     findings = annotate(
@@ -37,6 +63,10 @@ def test_build_analysis_summary_infers_cyp2c19_and_warfarin_composites() -> None
                 "rs1057910": "AA",
                 "rs9923231": "TT",
                 "rs2108622": "CT",
+                "rs4149056": "CT",
+                "rs2231142": "AC",
+                "rs3918290": "AG",
+                "rs56038477": "AG",
             },
             panels,
             selected_panel_ids={"pharmacogenomics"},
@@ -46,8 +76,52 @@ def test_build_analysis_summary_infers_cyp2c19_and_warfarin_composites() -> None
     )
     summary = build_analysis_summary(findings, panels)
     cyp2c19 = next(item for item in summary.derived_insights if item.id == "cyp2c19")
+    dpyd = next(item for item in summary.derived_insights if item.id == "dpyd")
+    statin = next(item for item in summary.derived_insights if item.id == "statin")
     warfarin = next(item for item in summary.derived_insights if item.id == "warfarin")
     assert cyp2c19.tier == "warning"
     assert "intermediate" in cyp2c19.summary
+    assert dpyd.tier == "risk"
+    assert "fluoropyrimidine" in dpyd.summary.lower()
+    assert statin.tier == "risk"
+    assert "statin composite" in statin.title.lower()
     assert warfarin.tier == "risk"
     assert "warfarin" in warfarin.summary.lower()
+
+
+def test_build_analysis_summary_infers_amd_vitamin_d_and_nicotine_composites() -> None:
+    panels = load_panels()
+    findings = annotate(
+        analyze(
+            {
+                "rs1061170": "CC",
+                "rs10490924": "GT",
+                "rs2230199": "CG",
+                "rs12785878": "GG",
+                "rs10741657": "AG",
+                "rs2228570": "AA",
+                "rs16969968": "AG",
+                "rs1051730": "AG",
+                "rs588765": "CT",
+            },
+            panels,
+            selected_panel_ids={"eye_health", "vitamin_d", "nicotine"},
+        ),
+        load_clinvar(),
+        load_pharmgkb(),
+    )
+    summary = build_analysis_summary(findings, panels)
+    amd = next(item for item in summary.derived_insights if item.id == "amd")
+    vitamin_d = next(item for item in summary.derived_insights if item.id == "vitamin_d")
+    nicotine = next(item for item in summary.derived_insights if item.id == "nicotine")
+    chrna5 = next(item for item in summary.gene_summaries if item.gene == "CHRNA5")
+    assert amd.tier == "risk"
+    assert "predisposition" in amd.summary.lower() or "susceptibility" in amd.summary.lower()
+    assert vitamin_d.tier == "risk"
+    assert "lab" in vitamin_d.summary.lower()
+    assert nicotine.tier == "risk"
+    assert (
+        "behavioral predisposition" in nicotine.summary.lower()
+        or "dependence-prone" in nicotine.summary.lower()
+    )
+    assert chrna5.caveat is not None
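
Reviewer note: the histamine test above expects `risk`, which follows from the thresholds in `_histamine_insight`. The sketch below re-derives that tier from raw genotype strings. It is a simplified illustration, not the project's code: `variant_count` and `histamine_tier` are hypothetical names, and `variant_count` assumes that `_variant_count` (not shown in this diff) counts copies of the effect allele in the genotype and returns `None` for an uncalled marker. The rsIDs, effect alleles, and thresholds are copied from the diff.

```python
from __future__ import annotations

# Effect alleles per marker, copied from _histamine_insight in the diff.
DAO_MARKERS = {
    "rs10156191": "T",
    "rs1049742": "T",
    "rs1049793": "G",
    "rs2052129": "T",
}
HNMT_MARKER = ("rs11558538", "T")


def variant_count(genotype: str | None, allele: str) -> int | None:
    """Hypothetical stand-in for _variant_count (an assumption, see lead-in)."""
    if genotype is None:
        return None  # marker not present in the file
    return genotype.upper().count(allele)


def histamine_tier(genotypes: dict[str, str]) -> str | None:
    """Apply the tier thresholds from the diff to plain genotype strings."""
    dao_counts = [variant_count(genotypes.get(r), a) for r, a in DAO_MARKERS.items()]
    hnmt_count = variant_count(genotypes.get(HNMT_MARKER[0]), HNMT_MARKER[1])
    if all(c is None for c in (*dao_counts, hnmt_count)):
        return None  # no covered markers, so no composite is produced

    dao_total = sum(c or 0 for c in dao_counts)
    hnmt_total = hnmt_count or 0
    if dao_total == 0 and hnmt_total == 0:
        return "normal"
    if dao_total >= 4 or (dao_total >= 2 and hnmt_total >= 1) or hnmt_total == 2:
        return "risk"
    return "warning"


# Genotypes from test_build_analysis_summary_infers_histamine_composite:
# DAO copies are 2 + 1 + 1 + 1 = 5, which meets the dao_total >= 4 rule.
fixture = {
    "rs10156191": "TT",
    "rs1049742": "CT",
    "rs1049793": "CG",
    "rs2052129": "GT",
    "rs11558538": "CT",
}
print(histamine_tier(fixture))
```

The same counting pattern (per-marker allele count, `None` for uncovered markers, summed totals compared against fixed thresholds) recurs in the AMD, vitamin D, and nicotine builders, so the other fixtures can be checked by hand the same way.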