diff --git a/README.md b/README.md
index 41e6b49a..d3aeb6da 100644
--- a/README.md
+++ b/README.md
@@ -2,6 +2,41 @@
 [![Docker Build](https://github.com/RealPaiLab/netDx/actions/workflows/push-docker.yml/badge.svg)](https://github.com/RealPaiLab/netDx/actions/workflows/push-docker.yml)
 [![R CMD check bioc](https://github.com/RealPaiLab/netDx/actions/workflows/check-bioc.yml/badge.svg)](https://github.com/RealPaiLab/netDx/actions/workflows/check-bioc.yml)
+## Project Statement
+
+netDx is for biomedical researchers who want to integrate multi-modal patient data to predict outcome or patient subtype. netDx builds interpretable machine-learning patient classifiers. Unlike standard machine-learning tools, netDx allows modeling of user-defined biological groups as input features; examples include pathways and co-regulated elements. In addition to patient classification, top-scoring features provide mechanistic insight, helping drive hypothesis generation for downstream experiments. netDx currently provides native support for pathway-level features but can be generalized to any user-defined data type and grouping.
+
+## Install
+
+These steps were tested on a 2020 MacBook Pro with 32 GB RAM.
+
+1) Install Docker and, under Resources in Docker Desktop's settings, set Memory to 5 GB.
+2) In a terminal, pull the netDx image from Docker:
+   `docker pull realpailab/netdx`
+3) To run the image as a container, save the following code as a bash script named `startDocker.sh`:
+```
+#!/bin/bash
+
+docker run -d --rm \
+    -p 8787:8787 \
+    -e PASSWORD=netdx \
+    -v /path/to/software/dir:/software \
+    --name [containerName] \
+    realpailab/netdx
+```
+4) Make the bash script executable and run it:
+```
+chmod u+x startDocker.sh
+./startDocker.sh
+```
+5) Access RStudio by opening `localhost:8787` in your web browser, then sign in with the username `rstudio` and password `netdx`.
+6) You can then run the vignettes in R:
+```
+setwd("/home/rstudio/vignettes")
+rmarkdown::render("ThreeWayClassifier.Rmd")
+```
+
+---
 ### Main repo for netDx dev work as of Sep 2021.
@@ -12,6 +47,7 @@ Visit http://bioconductor.org/packages/release/bioc/html/netDx.html to install t
 Contact Shraddha Pai at shraddha.pai@utoronto.ca in case of questions.
+
 References:
 1. Pai S, Hui S, Isserlin R, Shah MA, Kaka H and GD Bader (2019). netDx: Interpretable patient classification using patient similarity networks. *Mol Sys Biol*. 15: e8497. [Read the paper here](https://www.embopress.org/doi/full/10.15252/msb.20188497).
diff --git a/vignettes/ThreeWayClassifier.Rmd b/vignettes/ThreeWayClassifier.Rmd
index e817e7b8..bce52012 100755
--- a/vignettes/ThreeWayClassifier.Rmd
+++ b/vignettes/ThreeWayClassifier.Rmd
@@ -64,7 +64,8 @@ pID <- colData(brca)$patientID
 colData(brca)$ID <- pID
 ```
-ADD HOLDOUT TEXT HERE
+The function `subsampleValidationData()` partitions the TCGA data into two smaller datasets: a training dataset and a holdout validation dataset. This facilitates model validation after netDx builds an initial model: netDx trains a model on the training dataset through the `buildPredictor()` call, and this model can then be evaluated on the held-out validation dataset through the `predict()` call.
+
 ```{r}
 set.seed(123)
 dsets <- subsampleValidationData(brca,pctValidation=0.1)
@@ -105,9 +106,15 @@ groupList[[names(brca)[3]]] <- tmp
 ## Define patient similarity for each network
-ADD SOME INTRO TEXT HERE
+`sims` is a list that specifies the similarity metric to use for each grouping passed to the netDx algorithm. You can choose among several built-in similarity functions provided in the `netDx` package:
+
+* `normDiff` (normalized difference)
+* `avgNormDiff` (average normalized difference)
+* `sim.pearscale` (Pearson correlation followed by exponential scaling)
+* `sim.eucscale` (Euclidean distance followed by exponential scaling)
+* `pearsonCorr` (Pearson correlation)
-`normDiff` is a function provided in the `netDx` package, but the user may define custom similarity functions in this block of code and pass those to `makePSN_NamedMatrix()`, using the `customFunc` parameter.
+You may also define custom similarity functions in this block of code and pass them to `makePSN_NamedMatrix()` using the `customFunc` parameter.
 ```{r,eval=TRUE}
 sims <- list(a="pearsonCorr",b="normDiff",c="pearsonCorr",d="pearsonCorr")