Digital twin modelling for Project Data Sphere study 20020408 (panitumumab + best supportive care vs best supportive care).
This package ingests ADaM tables, engineers rich baseline and longitudinal features, trains predictive models (OS, response, TTR, biomarker trajectories), and runs validation and simulation workflows.
- Python ≥ 3.9
- `uv` (preferred) or `pip` for dependency management
- ADaM tables from PDS_DSA_20020408 (SAS7BDAT format)
- Optional: `openpyxl` if you want to inspect the design workbook `DDT_408_v2.xlsx`
```
AllProvidedFiles_310/
├── DDT_408_v2.xlsx                  # data dictionary
└── PDS_DSA_20020408/
    ├── adsl_pds2019.sas7bdat        # subject-level (required)
    ├── adlb_pds2019.sas7bdat        # lab longitudinal (required)
    ├── adae_pds2019.sas7bdat        # adverse events (optional but recommended)
    ├── adls_pds2019.sas7bdat        # lesion measurements
    ├── adpm_pds2019.sas7bdat        # physical measurements
    ├── adrsp_pds2019.sas7bdat       # response assessments
    └── biomark_pds2019.sas7bdat     # molecular assays
```
Point `pds310/config.yaml` to the directory above. All artefacts will land in `outputs/pds310/`.
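Before running anything heavy, it can help to confirm the layout is wired up. The sketch below assumes the raw-data directory is stored under a `data_dir` key; that key name is an assumption, so check `pds310/config.yaml` for the actual schema.

```python
# Sanity-check the data layout before running the pipeline.
# Assumption: config.yaml stores the ADaM directory under a `data_dir` key.
from pathlib import Path

import yaml  # PyYAML

with open("pds310/config.yaml") as fh:
    config = yaml.safe_load(fh)

data_dir = Path(config["data_dir"])
required = ["adsl_pds2019.sas7bdat", "adlb_pds2019.sas7bdat"]
missing = [name for name in required if not (data_dir / name).exists()]
if missing:
    raise FileNotFoundError(f"Missing required ADaM tables: {missing}")
print(f"All required tables found in {data_dir}")
```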
```bash
uv sync                                            # or: pip install -r requirements.txt
uv run python -m pip install openpyxl pyreadstat   # if not already present
```

| Stage | Module | Description |
|---|---|---|
| Data ingestion | `pds310.io` | Loads ADaM tables, normalises patient IDs (`SUBJID`), and synthesises `STUDYID` if missing. |
| Feature engineering | `pds310.digital_profile`, `pds310.features_*` | Builds 71-feature digital profiles combining demographics, baseline labs, early-treatment labs, tumour burden, molecular markers, physical measurements, derived risk scores, and outcomes. Labs are normalised using canonical mappings from `pds310/labs.py`; longitudinal aggregates now keep only the 1–42-day early window (`lab_<code>_early_last` and `_slope`) to avoid Week 8 leakage (illustrated in the sketch after this table). |
| Survival modelling | `pds310.cli os`, `pds310.train_models` | Fits Cox PH (and optional Weibull AFT) for overall survival; produces risk scores, `os_model.joblib`, and KM overlays. |
| AE/EOT modelling (optional) | `pds310.train_ae_models` | Trains cause-specific Cox models for treatment-related adverse events and Weibull/LogNormal AFT models for end-of-treatment timing (optional analysis, not required for virtual trials). |
| Response/TTR models | `pds310.train_models` | Pipeline-based Random Forest (default) for best-response classification and Random Forest regression for time-to-response; both include proper preprocessing and holdout validation. |
| Biomarker trajectories | `pds310.train_models` → `pds310.model_biomarkers` | Predicts LDH and HGB trajectories at weeks 8 and 16; leverages canonical lab IDs to reconstruct longitudinal targets. |
| Validation & calibration | `pds310.run_validation`, `pds310.calibration` | Retrains on the train split, evaluates on the holdout (accuracy, ROC-AUC, MAE/R²), plots calibration curves, and bootstraps uncertainty. |
| Digital twins & simulation | `pds310.virtual_trial`, `pds310.simulate` | Generates synthetic cohorts, compares twins to real patients, and runs virtual trial design/statistics. |
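For intuition on the early-window lab features mentioned above, here is an illustrative pandas sketch of the 1–42-day aggregation: the last observed value plus a least-squares slope per subject and lab. The column names (`SUBJID`, `LBDY`, `LBSTRESN`, `lab_code`) are assumptions for illustration; the real logic lives in `pds310/features_*`.

```python
# Illustrative early-window (days 1-42) lab aggregation; not the package's
# exact implementation. Assumes long-format labs with numeric LBDY/LBSTRESN.
import numpy as np
import pandas as pd

def early_lab_features(labs: pd.DataFrame) -> pd.DataFrame:
    early = labs[labs["LBDY"].between(1, 42)].sort_values("LBDY")
    rows: dict = {}
    for (subj, code), grp in early.groupby(["SUBJID", "lab_code"]):
        feats = rows.setdefault(subj, {})
        feats[f"lab_{code}_early_last"] = grp["LBSTRESN"].iloc[-1]
        if len(grp) >= 2:  # a slope needs at least two timepoints
            feats[f"lab_{code}_early_slope"] = np.polyfit(
                grp["LBDY"], grp["LBSTRESN"], 1
            )[0]
    return pd.DataFrame.from_dict(rows, orient="index")
```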
The ADaM data dictionary (`DDT_408_v2.xlsx`) is a handy reference when exploring column definitions.
```bash
uv run python pds310/build_profiles.py
```

This integrates all ADaM tables and creates `outputs/pds310/patient_profiles.csv` (370 patients × 71 features).
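A quick sanity check on the result (pure pandas, using the paths from this README):

```python
import pandas as pd

profiles = pd.read_csv("outputs/pds310/patient_profiles.csv")
print(profiles.shape)  # expect roughly (370, 71), plus any ID columns
print(profiles.isna().mean().sort_values(ascending=False).head())  # sparsest features
```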
Option A: Comprehensive pipeline (recommended)

```bash
uv run python pds310/train_models.py --seed 42
```

Trains all models (response, TTR, OS, biomarkers) → `outputs/pds310/models/`

Option B: OS model only via CLI

```bash
uv run -m pds310.cli os --config pds310/config.yaml
```

Faster for quick OS modelling; produces the same OS output as `train_models.py`.
All modelling entry points automatically drop downstream outcomes and derived risk-score columns (`lab_risk_score`, `performance_risk`, etc.) from the feature matrix so that response/TTR/OS models cannot leak label information.
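The same guard is easy to replicate if you fit models by hand. Below is a hedged sketch combining the column drop with a lifelines Cox fit; the survival column names (`os_days`, `os_event`) and the exact leaky-column list are assumptions, so mirror whatever the pds310 modelling code actually uses.

```python
# Sketch: drop leakage-prone columns, then fit a Cox PH model with lifelines.
import pandas as pd
from lifelines import CoxPHFitter

profiles = pd.read_csv("outputs/pds310/patient_profiles.csv")

LEAKY = ["lab_risk_score", "performance_risk"]  # extend with outcome-derived columns
features = (
    profiles.drop(columns=[c for c in LEAKY if c in profiles.columns])
    .select_dtypes("number")
    .dropna()
)

cph = CoxPHFitter(penalizer=0.1)  # mild ridge penalty for stability
cph.fit(features, duration_col="os_days", event_col="os_event")  # assumed names
print("C-index:", cph.concordance_index_)
```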
Holdout validation:

```bash
uv run python pds310/run_validation.py \
    --profiles outputs/pds310/patient_profiles.csv \
    --output_dir outputs/pds310/validation \
    --model_type rf --test_size 0.2 --seed 42
```
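If you want to see the shape of that validation without the CLI, this is roughly what the response-model holdout looks like: an imputing Random Forest pipeline, a stratified 80/20 split, and accuracy/ROC-AUC on the held-out patients. The binary target encoding below is an assumption for illustration.

```python
# Hedged sketch of the response-model holdout validation.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

profiles = pd.read_csv("outputs/pds310/patient_profiles.csv")
y = profiles["best_response"].isin(["CR", "PR"]).astype(int)  # assumed encoding
X = profiles.select_dtypes("number")  # drop outcome/risk columns first (see leakage note)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
model = make_pipeline(
    SimpleImputer(strategy="median"),
    RandomForestClassifier(n_estimators=500, random_state=42),
)
model.fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, model.predict(X_te)))
print("ROC-AUC :", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```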
Virtual trial simulation:

```bash
uv run python pds310/run_virtual_trial.py --config pds310/config.yaml --effect_source learned
```

The `--effect_source` flag controls treatment effects:

- `learned` (default): uses trained models for counterfactual outcomes plus KM overlays against the observed data
- `assumed`: uses protocol hazard ratios without counterfactual sampling
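For the `assumed` mode, the underlying arithmetic is the standard proportional-hazards relationship S_treated(t) = S_control(t)^HR. A minimal sketch (the HR value and exponential baseline are illustrative, not the package's protocol values):

```python
# Apply an assumed hazard ratio to a control-arm survival curve under
# proportional hazards: S_treated(t) = S_control(t) ** HR.
import numpy as np

def apply_hazard_ratio(surv_control: np.ndarray, hr: float) -> np.ndarray:
    return np.clip(surv_control, 0.0, 1.0) ** hr

t = np.linspace(0, 365, 100)
s_control = np.exp(-t / 200.0)                      # toy exponential baseline
s_treated = apply_hazard_ratio(s_control, hr=0.8)   # HR < 1 means better survival
```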
```bash
uv run python pds310/train_ae_models.py
```

Trains adverse event and treatment discontinuation models (optional; virtual trials work without this).
- CLI commands require profiles: run `build_profiles.py` first.
- OS overlay moved: use `run_virtual_trial.py --effect_source learned` instead of `cli os-overlay`.
- Advanced OS models deprecated: RSF/GB underperformed (C-index ~0.33 vs 0.666 for Cox).
- AE/EOT CLI commands deprecated: use the standalone `train_ae_models.py` script instead.
- All outputs go to `outputs/pds310/` (models in the `models/` subdirectory).
| Artefact | Location | Description |
|---|---|---|
| Patient profiles | `outputs/pds310/patient_profiles.csv` | One row per patient with all engineered features. Summary in `profile_database_summary.{json,txt}` plus distribution plots (`profile_distributions.png`, `profile_correlations.png`, `ras_vs_risk_score.png`, `treatment_vs_survival.png`). |
| OS modelling | `cox.joblib`, `metrics_os.json`, `plot_os_overlay_PDS310.png` | Cox model bundle, evaluation metrics (KFold C-index ≈ 0.666), and an observed-vs-simulated survival overlay. |
| Learned OS model | `outputs/pds310/models/os_model.joblib` | Cox PH model trained on digital profiles (with treatment indicator), used for counterfactual survival simulation. |
| AE/EOT simulation (optional) | `ae_cox.joblib`, `eot_model.joblib`, `sim_ae.csv`, `report_ae.csv`, `plot_ae_incidence_*.png`, `plot_eot_distributions.png` | Optional adverse event and end-of-treatment models (generated by `train_ae_models.py`). |
| Response/TTR models | `outputs/pds310/models/response_model.joblib`, `ttr_model.joblib` + plots | Pipeline-based models with feature importances (`response_feature_importance.png`, `ttr_feature_importance.png`) and diagnostics (`response_confusion_matrix.png`, `ttr_predictions.png`). |
| Biomarker trajectories | `outputs/pds310/models/biomarker_models.joblib` + LDH/HGB prediction plots | Week 8/16 predictions with scatter/residual plots (e.g. `Lactate Dehydrogenase_day56_predictions.png`). |
| Validation bundle | `outputs/pds310/validation/` | Train/test splits, JSON metrics (`response_validation.json`, `ttr_validation.json`), calibration plots, performance comparison tables, and bootstrap uncertainty results. |
| Digital twin audits | `outputs/pds310/twin_*` | Diversity metrics and twin-vs-real comparisons (n=100 and n=1000); a comparison sketch follows this table. |
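One simple way to audit twins against real patients, in the spirit of the twin comparisons above, is a per-feature two-sample Kolmogorov–Smirnov test; features whose marginals diverge are worth inspecting. The twin CSV name below is an assumption, so point it at whichever `twin_*` cohort file you generated.

```python
# Flag features whose twin vs real marginal distributions differ (KS test).
import pandas as pd
from scipy.stats import ks_2samp

real = pd.read_csv("outputs/pds310/patient_profiles.csv").select_dtypes("number")
twins = pd.read_csv("outputs/pds310/twin_cohort_n1000.csv").select_dtypes("number")  # assumed name

for col in real.columns.intersection(twins.columns):
    stat, p = ks_2samp(real[col].dropna(), twins[col].dropna())
    if p < 0.01:
        print(f"{col}: KS={stat:.2f}, p={p:.3g}")
```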
Refer to `pds310/REPORT.md` for detailed interpretation of these artefacts.
| Task | Dataset / Metric | Result |
|---|---|---|
| Overall Survival | KFold C-index (Cox PH) | 0.666 (single study – grouped CV not available) |
| AE / EOT Simulation | Arm medians from `report_ae.csv` | BSC: observed EOT median 44.5 days (simulated 32.2); panitumumab: observed 83.5 days (simulated 81.6) |
| AE Incidence (BSC vs panitumumab) | Observed vs simulated incidence at days 90/180/365 | BSC: 0.00 / 0.00 / 0.00 observed vs 0.37 / 0.37 / 0.37 simulated; panitumumab: 0.91 / 0.91 / 0.91 observed vs 0.39 / 0.60 / 0.61 simulated |
| Response Classification | Holdout (60 pts) | Accuracy 0.77, ROC-AUC 0.87, macro-F1 0.43 (PR under-represented) |
| Time-to-Response | Holdout (5 responders) | R² −1.35, MAE 12.3 days (small n; treat cautiously) |
| Biomarker Trajectories | CV R² by timepoint | LDH: day 56 R² 0.91, day 112 R² −56 (sparse late data); HGB: day 56 R² 0.98, day 112 R² 0.56. Early timepoints are robust; late LDH needs more data or smoothing. |
The interpretation of these metrics and plots, along with recommended follow-up actions, is documented in `pds310/REPORT.md`.
- Patient IDs look numeric? Rebuild profiles using the shipped scripts; the loader now forces zero-padded strings (e.g. `000123`) so ADaM joins succeed (see the snippet after this list).
- AE incidence mismatch: verify `ADAE` contains on-study events (columns `AESTDY`, `AESTDYI`). You can adjust horizons in `pds310/reporting.py`.
- TTR performance: only 25 responders exist (5 in the validation split). Expect high variance; consider framing time-to-response as a survival problem with censoring.
- Biomarker coverage: `pds310/model_biomarkers.py` prints counts per timepoint; use these to adjust target windows or add additional labs.
- Virtual trial: start with the provided design in `pds310/trial_design.py`, then customise sample size, arm allocation, or endpoint definitions.
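If you need the ID fix outside the shipped loader, the zero-padding from the first item above amounts to the following (width 6 matches the `000123` example; adjust to your data):

```python
import pandas as pd

def normalise_subjid(df: pd.DataFrame, col: str = "SUBJID", width: int = 6) -> pd.DataFrame:
    """Coerce numeric-looking subject IDs to fixed-width strings for ADaM joins."""
    out = df.copy()
    out[col] = (
        pd.to_numeric(out[col], errors="raise")
        .astype("Int64")
        .astype(str)
        .str.zfill(width)
    )
    return out
```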
- Project Data Sphere Study 20020408 documentation (`DDT_408_v2.xlsx`); the `ADLB_PDS2019` sheet maps each lab test to the canonical names used here.
- CAMP methodology background: Chen et al., Clinically Aligned Multi-Task Learning for Digital Twins (2022).
- `lifelines` documentation for Cox/AFT models if you wish to customise penalisation or tie handling.
Questions or ideas? See the detailed analysis in `pds310/REPORT.md` for context, open issues, and next-step recommendations.