jayashan10/project_CAMK

Muscular Dystrophy Clinical Decision Support System

Overview

A knowledge graph-driven RAG system for clinical decision support in rare muscular dystrophies, focusing on scenario-based differential diagnosis and management recommendations.

Architecture

Core Components

  1. Clinical Scenario Processor

    • Accepts comprehensive patient presentations
    • Extracts phenotypes and clinical features
    • Maps to HPO (Human Phenotype Ontology) terms
  2. Knowledge Graph Engine

    • Neo4j-based graph database
    • Monarch Initiative knowledge graph (1.3M+ nodes, 14.7M+ relationships)
    • Stores gene-disease-phenotype relationships using Biolink Model schema
    • Enables complex clinical reasoning queries
    • Falls back to an in-memory store if Neo4j is unavailable
  3. RAG System

    • Processes clinical guidelines and literature
    • Provides evidence-based recommendations
    • Maintains citation traceability
  4. Differential Diagnosis Module

    • Phenotype-based disease ranking
    • Contextual variant interpretation
    • Probability scoring with explanations
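
As a rough illustration of phenotype-based ranking with explanations, the sketch below scores each disease by Jaccard overlap between the observed phenotypes and a disease profile, then normalizes the scores so they sum to 1. This is not the project's actual algorithm: `TOY_PROFILES`, `RankedDisease`, and `rank_diseases` are hypothetical names, the profiles are hand-written toy data rather than Monarch content, and the resulting score is a relative similarity, not a calibrated probability.

```python
from dataclasses import dataclass

# Toy disease profiles for illustration only; a real system would read
# these associations from the knowledge graph instead of hard-coding them.
TOY_PROFILES = {
    "Duchenne muscular dystrophy": {
        "proximal muscle weakness",
        "Gowers sign",
        "calf pseudohypertrophy",
        "elevated creatine kinase",
    },
    "LGMDR1": {
        "proximal muscle weakness",
        "elevated creatine kinase",
        "scapular winging",
    },
    "LAMA2-related congenital MD": {
        "neonatal hypotonia",
        "joint contractures",
        "elevated creatine kinase",
    },
}


@dataclass
class RankedDisease:
    name: str
    score: float        # relative score; all scores sum to 1
    matched: list[str]  # phenotypes that explain the score


def rank_diseases(observed, profiles=TOY_PROFILES):
    """Rank diseases by Jaccard overlap with the observed phenotypes."""
    observed = set(observed)
    raw = {}
    for name, profile in profiles.items():
        overlap = observed & profile
        union = observed | profile
        similarity = len(overlap) / len(union) if union else 0.0
        raw[name] = (similarity, sorted(overlap))
    total = sum(similarity for similarity, _ in raw.values())
    ranked = [
        RankedDisease(name, similarity / total if total else 0.0, matched)
        for name, (similarity, matched) in raw.items()
    ]
    return sorted(ranked, key=lambda r: r.score, reverse=True)
```

The `matched` list is what makes a ranking explainable: each candidate carries the phenotypes that contributed to its score.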

Supported Scenarios

Primary Use Cases

  1. New Patient Presentation

    • Input: Clinical features, lab values, family history
    • Output: Differential diagnosis, recommended tests, initial management
  2. Variant Interpretation

    • Input: Genetic variant + clinical context
    • Output: Pathogenicity assessment, phenotype prediction, treatment eligibility
  3. Management Planning

    • Input: Confirmed diagnosis + patient status
    • Output: Age-appropriate surveillance, treatment options, prognostic counseling

Disease Coverage (Phase 1)

  • Duchenne Muscular Dystrophy (DMD)
  • Becker Muscular Dystrophy (BMD)
  • Limb-Girdle MD Type R1 (LGMDR1/LGMD2A)
  • LAMA2-Related Congenital MD (MDC1A)

Technical Stack

  • Backend: FastAPI (Python)
  • Graph DB: Neo4j
  • Vector DB: ChromaDB/Pinecone
  • LLM: OpenAI GPT-4 / Claude
  • RAG: LangChain
  • Frontend: React + Next.js

Project Structure

├── backend/
│   ├── api/               # FastAPI endpoints
│   ├── core/              # Core business logic
│   │   ├── scenario_processor.py
│   │   ├── differential_diagnosis.py
│   │   └── variant_interpreter.py
│   ├── knowledge_graph/   # Neo4j integration
│   ├── rag/              # RAG pipeline
│   └── data/             # Data ingestion scripts
│
├── frontend/
│   ├── components/       # React components
│   ├── pages/           # Next.js pages
│   └── utils/           # Helper functions
│
├── data/
│   ├── guidelines/      # Clinical guidelines
│   ├── gene_data/       # Genetic databases
│   └── scenarios/       # Test scenarios
│
├── notebooks/
│   └── prototype.ipynb  # Development notebook
│
├── pyproject.toml        # Project dependencies (uv/pip)
├── uv.lock               # Locked dependencies (uv)
└── .env                  # Neo4j configuration (not in git)

Monarch Integration

  • backend/knowledge_graph/monarch_service.py – queries Monarch using Biolink schema
  • backend/knowledge_graph/monarch_mapper.py – maps Biolink labels/relationships to project schema
  • test_monarch_integration.py – smoke test script to verify Monarch connectivity (uv run python test_monarch_integration.py)
  • backend/knowledge_graph/seed_data.py – fallback seed data, used only when Neo4j/Monarch is unavailable
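
To make the mapper's role concrete, here is a minimal sketch of translating Biolink labels and predicates into a project-side schema. The actual contents of monarch_mapper.py are not shown in this README, so the project-side names (`Gene`, `Disease`, `Phenotype`, `HAS_PHENOTYPE`, and so on) and the functions `map_labels` and `map_edge` are assumptions for illustration only.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical project-side names; the real mapping lives in monarch_mapper.py.
BIOLINK_TO_PROJECT_LABEL = {
    "biolink:Gene": "Gene",
    "biolink:Disease": "Disease",
    "biolink:PhenotypicFeature": "Phenotype",
}

BIOLINK_TO_PROJECT_REL = {
    "biolink:has_phenotype": "HAS_PHENOTYPE",
    "biolink:gene_associated_with_condition": "ASSOCIATED_WITH",
    "biolink:causes": "CAUSES",
}


def map_labels(labels: Iterable[str]) -> Optional[str]:
    """Return the project label for the first recognized Biolink label."""
    for label in labels:
        if label in BIOLINK_TO_PROJECT_LABEL:
            return BIOLINK_TO_PROJECT_LABEL[label]
    return None


def map_edge(
    subject_labels: Iterable[str],
    predicate: str,
    object_labels: Iterable[str],
) -> Optional[Tuple[str, str, str]]:
    """Translate one Biolink edge; return None if any part is unmapped."""
    subject = map_labels(subject_labels)
    relation = BIOLINK_TO_PROJECT_REL.get(predicate)
    obj = map_labels(object_labels)
    if subject is None or relation is None or obj is None:
        return None
    return (subject, relation, obj)
```

Returning `None` for unmapped edges lets the caller skip node and relationship types the project does not model, instead of failing on them.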

Getting Started

Prerequisites

  • Python 3.9+
  • Node.js 16+ (for frontend, when implemented)
  • Neo4j 4.4+ (Neo4j Desktop recommended)
  • uv (recommended Python package manager)

Installation

1. Python Environment Setup

Using uv (Recommended):

# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create virtual environment and install dependencies
uv venv
source .venv/bin/activate  # On macOS/Linux
# .venv\Scripts\activate  # On Windows

# Install project dependencies (from pyproject.toml)
uv sync

# Or install dependencies directly
uv pip install neo4j pydantic python-dotenv

# Install with dev dependencies (for notebooks)
uv sync --extra dev

Alternative: Using pip:

python -m venv venv
source venv/bin/activate
pip install neo4j pydantic python-dotenv

2. Neo4j Setup

The project uses Neo4j with the Monarch Initiative knowledge graph database.

Option A: Neo4j Desktop (Recommended)

  1. Download and install Neo4j Desktop
  2. Create a new database instance (or use existing)
  3. Import the Monarch Initiative dump into a database named monarch
  4. Start the database

Option B: Docker

docker run -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/your-password \
  neo4j:latest

3. Environment Configuration

Create a .env file in the project root:

NEO4J_URI=neo4j://127.0.0.1:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your-password
NEO4J_DATABASE=monarch

Important:

  • .env is already listed in .gitignore; keep it there to avoid committing credentials
  • The uv.lock file (generated by uv sync) should be committed to ensure reproducible builds
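
The four variables above could be read into a typed settings object along these lines. This is a sketch, not the project's loader: `Neo4jSettings` and `load_neo4j_settings` are hypothetical names, the fallback database name `neo4j` is an assumption, and the code assumes the .env file has already been loaded into the process environment (for example with python-dotenv's `load_dotenv()`).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED_KEYS = ("NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD")


@dataclass(frozen=True)
class Neo4jSettings:
    uri: str
    user: str
    password: str
    database: str = "neo4j"  # assumed default when NEO4J_DATABASE is unset


def load_neo4j_settings(
    env: Mapping[str, str] = os.environ,
) -> Optional[Neo4jSettings]:
    """Return settings, or None if any required variable is missing or empty."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        return None
    return Neo4jSettings(
        uri=env["NEO4J_URI"],
        user=env["NEO4J_USER"],
        password=env["NEO4J_PASSWORD"],
        database=env.get("NEO4J_DATABASE", "neo4j"),
    )
```

Returning `None` rather than raising makes "not configured" easy to treat as a signal to use the in-memory store.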

4. Test Neo4j Connection

# Activate virtual environment
source .venv/bin/activate

# Test connection to Monarch database
python test_monarch_database.py

# Or test general connection
python test_neo4j_connection.py

# Optional: run Monarch integration smoke tests
python test_monarch_integration.py

Note: The system automatically falls back to an in-memory knowledge store if Neo4j is not available or not configured.
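
One way such a fallback can be structured is sketched below. `InMemoryStore`, `connect_neo4j`, and `open_store` are hypothetical names, and the project's real fallback logic may differ. The only Neo4j driver calls used are `GraphDatabase.driver` and `Driver.verify_connectivity`.

```python
from typing import Callable, Tuple


class InMemoryStore:
    """Hypothetical stand-in for the project's in-memory knowledge store."""


def connect_neo4j(uri: str, user: str, password: str):
    """Open a driver and fail fast if the server is unreachable."""
    # Imported lazily so the fallback still works when the package is absent.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(uri, auth=(user, password))
    driver.verify_connectivity()
    return driver


def open_store(connect: Callable[[], object]) -> Tuple[object, str]:
    """Return (store, backend_name), falling back if `connect` raises."""
    try:
        return connect(), "neo4j"
    except Exception as exc:
        # Broad on purpose: import, auth, and network errors all fall back.
        print(f"Neo4j unavailable ({exc!r}); using in-memory store")
        return InMemoryStore(), "in-memory"
```

A caller might use it as `store, backend = open_store(lambda: connect_neo4j("neo4j://127.0.0.1:7687", "neo4j", "your-password"))` and log `backend` so that a silent fallback does not go unnoticed.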

Example Clinical Scenario

{
  "patient": {
    "age": "7 years",
    "sex": "male"
  },
  "symptoms": [
    "Progressive proximal muscle weakness",
    "Gowers sign positive",
    "Calf pseudohypertrophy"
  ],
  "labs": {
    "CK": "15000 U/L"
  },
  "question": "What is the diagnosis and management?"
}
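
A scenario in this shape could be parsed into a typed object before phenotype extraction. The sketch below is illustrative only: `ClinicalScenario` and `parse_scenario` are hypothetical names, the parser assumes ages are given in years, and it keeps only the leading number of each lab value, dropping the unit.

```python
import json
import re
from dataclasses import dataclass, field


@dataclass
class ClinicalScenario:
    age_years: float
    sex: str
    symptoms: list[str]
    labs: dict[str, float] = field(default_factory=dict)
    question: str = ""


def _leading_number(text: str) -> float:
    """Extract the leading numeric value from strings like '15000 U/L'."""
    match = re.match(r"\s*(\d+(?:\.\d+)?)", text)
    if match is None:
        raise ValueError(f"no numeric value in {text!r}")
    return float(match.group(1))


def parse_scenario(raw: str) -> ClinicalScenario:
    """Parse the JSON scenario format shown above into a ClinicalScenario."""
    data = json.loads(raw)
    patient = data["patient"]
    return ClinicalScenario(
        age_years=_leading_number(patient["age"]),  # assumes the unit is years
        sex=patient["sex"],
        symptoms=list(data.get("symptoms", [])),
        labs={
            name: _leading_number(value)  # unit is discarded in this sketch
            for name, value in data.get("labs", {}).items()
        },
        question=data.get("question", ""),
    )
```

A production parser would need to keep units and handle ages expressed in months, which matters for congenital presentations.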

Development Status

  • Project planning and architecture
  • Knowledge graph schema design
  • Neo4j integration with Monarch Initiative database
  • Clinical scenario processor
  • In-memory knowledge store (fallback)
  • Adapter layer for Monarch schema integration
  • RAG pipeline implementation
  • API development
  • Frontend interface
  • Testing with real scenarios

References

  • Birnkrant DJ, et al. Diagnosis and management of Duchenne muscular dystrophy. Lancet Neurol. 2018
  • TREAT-NMD Standards of Care Guidelines
  • ACMG/AMP Variant Interpretation Guidelines
