From 8a059b29207226d40b58ab1e333841984da74acf Mon Sep 17 00:00:00 2001
From: Indy Ng
Date: Wed, 6 Oct 2021 12:17:59 -0400
Subject: [PATCH 1/3] Update README.md

Added Project Statement + Docker Install instructions
---
 README.md | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/README.md b/README.md
index 41e6b49a..ee56dfa3 100644
--- a/README.md
+++ b/README.md
@@ -2,6 +2,40 @@
 [![Docker Build](https://github.com/RealPaiLab/netDx/actions/workflows/push-docker.yml/badge.svg)](https://github.com/RealPaiLab/netDx/actions/workflows/push-docker.yml)
 [![R CMD check bioc](https://github.com/RealPaiLab/netDx/actions/workflows/check-bioc.yml/badge.svg)](https://github.com/RealPaiLab/netDx/actions/workflows/check-bioc.yml)
+## Project Statement
+
+netDx is for biomedical researchers who want to integrate multi-modal patient data to predict outcome or patient subtype. netDx builds interpretable machine-learning patient classifiers. Unlike standard machine-learning tools, netDx lets users model user-defined biological groups, such as pathways and co-regulated elements, as input features. In addition to classifying patients, netDx reports top-scoring features that provide mechanistic insight and help drive hypothesis generation for downstream experiments. netDx currently provides native support for pathway-level features but can be generalized to any user-defined data type and grouping.
+
+## Install
+
+These steps were tested on a 2020 MacBook Pro with 32 GB RAM.
+
+1) Install Docker, then in Docker Desktop's settings, under Resources, set Memory to 5 GB.
+2) In a terminal, pull the netDx Docker image:
+    `docker pull realpailab/netdx`
+3) To run the image as a container, save the following code as a bash script named `startDocker.sh`:
+```
+#!/bin/bash
+
+docker run -d --rm \
+    -p 8787:8787 \
+    -e PASSWORD=netdx \
+    -v /path/to/software/dir:/software \
+    --name [containerName] \
+    realpailab/netdx
+```
+4) Make the script executable and run it:
+```
+chmod u+x startDocker.sh
+./startDocker.sh
+```
+5) Open `localhost:8787` in your web browser to access RStudio, and sign in with the username `rstudio` and password `netdx`.
+6) You can then run the vignettes in R:
+```
+setwd("/home/rstudio/vignettes")
+rmarkdown::render("ThreeWayClassifier.Rmd")
+```
+
 ### Main repo for netDx dev work as of Sep 2021.
@@ -12,6 +46,7 @@ Visit http://bioconductor.org/packages/release/bioc/html/netDx.html to install t
 Contact Shraddha Pai at shraddha.pai@utoronto.ca in case of questions.
+
 References:
 1. Pai S, Hui S, Isserlin R, Shah MA, Kaka H and GD Bader (2019). netDx: Interpretable patient classification using patient similarity networks. *Mol Sys Biol*. 15: e8497. [Read the paper here](https://www.embopress.org/doi/full/10.15252/msb.20188497).

From bc10fb5f607d6353a2f456ce9f125c8d02e62d1a Mon Sep 17 00:00:00 2001
From: Indy Ng
Date: Wed, 6 Oct 2021 12:23:23 -0400
Subject: [PATCH 2/3] Update README.md

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index ee56dfa3..d3aeb6da 100644
--- a/README.md
+++ b/README.md
@@ -36,6 +36,7 @@ setwd("/home/rstudio/vignettes")
 rmarkdown::render("ThreeWayClassifier.Rmd")
 ```
+---
 ### Main repo for netDx dev work as of Sep 2021.

From 5e1b3f86500827f0b387b8ed85a2c97c43cd5fd7 Mon Sep 17 00:00:00 2001
From: Indy
Date: Wed, 6 Oct 2021 16:51:38 -0400
Subject: [PATCH 3/3] Update ThreeWayClassifier.Rmd to add explanatory text about data partitioning step and use of built-in similarity functions

---
 vignettes/ThreeWayClassifier.Rmd | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/vignettes/ThreeWayClassifier.Rmd b/vignettes/ThreeWayClassifier.Rmd
index e817e7b8..bce52012 100755
--- a/vignettes/ThreeWayClassifier.Rmd
+++ b/vignettes/ThreeWayClassifier.Rmd
@@ -64,7 +64,8 @@ pID <- colData(brca)$patientID
 colData(brca)$ID <- pID
 ```
-ADD HOLDOUT TEXT HERE
+The function `subsampleValidationData()` partitions the TCGA data into two smaller datasets: a training dataset and a held-out validation dataset. This supports validation after the initial model is built: netDx trains a model on the training dataset through the `buildPredictor()` call, and that model can then be evaluated on the held-out validation dataset through the `predict()` call.
+
 ```{r}
 set.seed(123)
 dsets <- subsampleValidationData(brca,pctValidation=0.1)
@@ -105,9 +106,15 @@ groupList[[names(brca)[3]]] <- tmp
 ## Define patient similarity for each network
-ADD SOME INTRO TEXT HERE
+`sims` is a list that specifies which similarity metric to use for each grouping passed to the netDx algorithm. You can choose among several built-in similarity functions provided in the `netDx` package:
+
+* `normDiff` (normalized difference)
+* `avgNormDiff` (average normalized difference)
+* `sim.pearscale` (Pearson correlation followed by exponential scaling)
+* `sim.eucscale` (Euclidean distance followed by exponential scaling)
+* `pearsonCorr` (Pearson correlation)
+
-`normDiff` is a function provided in the `netDx` package, but the user may define custom similarity functions in this block of code and pass those to `makePSN_NamedMatrix()`, using the `customFunc` parameter.
+You may also define custom similarity functions in this block of code and pass them to `makePSN_NamedMatrix()` using the `customFunc` parameter.
 ```{r,eval=TRUE}
 sims <- list(a="pearsonCorr",b="normDiff",c="pearsonCorr",d="pearsonCorr")