From 982797623608cb4948578c74ea8b7732c5f87eec Mon Sep 17 00:00:00 2001
From: Indy <writingindy@gmail.com>
Date: Mon, 4 Oct 2021 15:10:56 -0400
Subject: [PATCH 01/10] add CITATION.cff draft

---
 CITATION.cff | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)
 create mode 100644 CITATION.cff

diff --git a/CITATION.cff b/CITATION.cff
new file mode 100644
index 00000000..6267891f
--- /dev/null
+++ b/CITATION.cff
@@ -0,0 +1,40 @@
+# YAML 1.2
+---
+authors: 
+  -
+    family-names: Pai
+    given-names: Shraddha
+  -
+    family-names: Weber
+    given-names: Philipp
+  -
+    family-names: Shah
+    given-names: Ahmad
+  -
+    family-names: Giudice
+    given-names: Luca
+  -
+    family-names: Hui
+    given-names: Shirley
+  -
+    family-names: "Nøhr"
+    given-names: Anne
+  -
+    family-names: Ng
+    given-names: Indy
+  -
+    family-names: Isserlin
+    given-names: Ruth
+  -
+    family-names: Kaka
+    given-names: Hussam
+  -
+    family-names: Bader
+    given-names: Gary
+cff-version: "1.1.0"
+license: MIT
+message: "If you use this software, please cite it using these metadata."
+repository-code: "https://github.com/RealPaiLab/netDx"
+title: "netDx: Network-based patient classifier"
+version: "1.5.5"
+...
\ No newline at end of file

From 60083253491ed09ba2ae4034c646befdf63da61d Mon Sep 17 00:00:00 2001
From: Indy Ng <writingindy@gmail.com>
Date: Mon, 4 Oct 2021 15:21:29 -0400
Subject: [PATCH 02/10] Update README.md

Add status badges showing the pass/fail status of the Docker Build and R CMD check bioc workflows.
---
 README.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/README.md b/README.md
index 586b32ca..41e6b49a 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,8 @@
+# netDx: Network-based patient classifier
+[![Docker Build](https://github.com/RealPaiLab/netDx/actions/workflows/push-docker.yml/badge.svg)](https://github.com/RealPaiLab/netDx/actions/workflows/push-docker.yml)
+[![R CMD check bioc](https://github.com/RealPaiLab/netDx/actions/workflows/check-bioc.yml/badge.svg)](https://github.com/RealPaiLab/netDx/actions/workflows/check-bioc.yml)
+
+
 ### Main repo for netDx dev work as of Sep 2021.
 
 netDx is a general-purpose algorithm for building patient classifiers by using patient similarity networks as features. It excels at interpretability and handling missing data. It also allows custom grouping rules for features, notably grouping genes into pathways. It integrates with RCy3 for network visualization of predictive pathways.

From 06c6ba39c34cb7685dc93c4f28455680348ab078 Mon Sep 17 00:00:00 2001
From: Indy <writingindy@gmail.com>
Date: Mon, 4 Oct 2021 17:09:41 -0400
Subject: [PATCH 03/10] Draft update on RawDataConversion.Rmd

---
 vignettes/RawDataConversion.Rmd | 48 ++++++++++++++++++++++-----------
 1 file changed, 33 insertions(+), 15 deletions(-)

diff --git a/vignettes/RawDataConversion.Rmd b/vignettes/RawDataConversion.Rmd
index 1a51cb31..522e3977 100644
--- a/vignettes/RawDataConversion.Rmd
+++ b/vignettes/RawDataConversion.Rmd
@@ -1,5 +1,5 @@
 ---
-title: "Converting raw assay data/tables into format compatible with netDx algorithm"
+title: "Building a binary classifier from assay data using pathway level features"
 author: "Shraddha Pai & Indy Ng"
 package: netDx
 date: "`r Sys.Date()`"
@@ -7,29 +7,31 @@ output:
   BiocStyle::html_document:
     toc_float: true
 vignette: >
-    %\VignetteIndexEntry{02. Running netDx with data in table format}.
+    %\VignetteIndexEntry{01. Build binary predictor and view performance, top features and integrated Patient Similarity Network}.
     %\VignetteEngine{knitr::rmarkdown}
     %\VignetteEncoding{UTF-8}
 ---
 
 # Introduction
 
-In this example we will use the convertToMAE() wrapper function to transform raw experimental assay data/tables into the MultiAssayExperiment class, to illustrate how a typical netDx workflow might be initiated. 
+In this example we will build a predictor to classify breast tumours as being either of Luminal A subtype or otherwise. The process is identical for classifying three or more labels, and the example uses minimal data for quick runtime.
 
-For this we will use data from The Cancer Genome Atlas, converting it from the MultiAssayExperiment class into a list so that it is compatible with the convertToMAE() wrapper function.
+The netDx algorithm requires assay data to be provided in the form of a `MultiAssayExperiment` object, so we will also use the `convertToMAE()` wrapper function to transform raw experimental assay data/tables into a `MultiAssayExperiment` object, to illustrate how a typical netDx workflow might be initiated. 
+
+We will use data from The Cancer Genome Atlas to build the predictor, converting it from a `MultiAssayExperiment` object into a list to illustrate how to utilize the `convertToMAE()` wrapper function.
 
 We will integrate two types of -omic data:
 
 * gene expression from Agilent mRNA microarrays and
 * miRNA sequencing 
 
-
 ```{r, include = FALSE}
 knitr::opts_chunk$set(crop=NULL)
 ```
 
 # Setup
-Load the `netDx` package.
+
+First, we load the `netDx` package.
 
 ```{r,eval=TRUE}
 suppressWarnings(suppressMessages(require(netDx)))
@@ -37,12 +39,13 @@ suppressWarnings(suppressMessages(require(netDx)))
 
 # Data 
 
-For this example we pull data from the The Cancer Genome Atlas through the BioConductor `curatedTCGAData` package. The fetch command automatically brings in a `MultiAssayExperiment` object. 
+For this example we pull data from The Cancer Genome Atlas through the Bioconductor `curatedTCGAData` package.
+
 ```{r,eval=TRUE}
 suppressMessages(library(curatedTCGAData))
 ```
 
-We fetch the data two layers of data that we need:
+We fetch the two layers of data that we need:
 
 ```{r, eval=TRUE}
 brca <- suppressMessages(curatedTCGAData("BRCA",
@@ -51,6 +54,12 @@ brca <- suppressMessages(curatedTCGAData("BRCA",
                                          dry.run=FALSE, version="1.1.38"))
 ```
 
+The fetch command automatically brings in a `MultiAssayExperiment` object. 
+
+```{r, eval = TRUE}
+summary(brca)
+```
+
 This next code block prepares the TCGA data. In practice you would do this once, and save the data before running netDx, but we run it here to see an end-to-end example. 
 
 ```{r, eval=TRUE}
@@ -65,9 +74,18 @@ pID <- colData(brca)$patientID
 colData(brca)$ID <- pID
 ```
 
-## Create feature design rules
+# Create feature design rules
+
+To build the predictor using the netDx algorithm, we call the `buildPredictor()` function which takes patient data and variable groupings 
+
+## dataList object
+
+## groupList object
+
+Our plan is to group gene expression data by pathways
+
+The 'groupList' object tells the predictor how to group units when constructing a network. For examples, genes may be grouped into a network representing a pathway. This object is a list; the names match those of `dataList` while each value is itself a list and reflects a potential network.
 
-We follow the workflow in the vignette "01: Build binary predictor and view performance, top features and integrated Patient Similarity Network" to generate the groupList parameter.
 
 ```{r, eval=TRUE}
 expr <- assays(brca)
@@ -111,7 +129,7 @@ brca <- convertToMAE(brcaList)
 
 The rest of the workflow follows the vignette "01: Build binary predictor and view performance, top features and integrated Patient Similarity Network".
 
-### Define patient similarity for each network
+## Define patient similarity for each network
 The `makeNets` function tells the predictor how to create networks from provided input data.
 
 This function requires `dataList`,`groupList`, and `netDir` as input variables. The residual `...` parameter is to pass additional variables to `makePSN_NamedMatrix()`, notably `numCores` (number of parallel jobs).
@@ -147,7 +165,7 @@ makeNets <- function(dataList, groupList, netDir,...) {
 }
 ```
 
-## Build predictor
+# Build predictor
 
 Finally we call the function that runs the netDx predictor. We provide:
 
@@ -168,7 +186,7 @@ Within each split, it:
 
 In practice a good starting point is `featScoreMax=10`, `featSelCutoff=9` and `numSplits=10L`, but these parameters depend on the sample sizes in the dataset and heterogeneity of the samples. 
 
-This step can take a few hours based on the current parameters, so we're commenting this out for the tutorial and will simply load the results.
+This step can take a few hours based on the current parameters, so we comment this out for the tutorial and will simply load the results.
  
 ```{r lab1-buildpredictor ,eval=TRUE}
 nco <- round(parallel::detectCores()*0.75) # use 75% available cores
@@ -198,7 +216,7 @@ t1 <- Sys.time()
 print(t1-t0)
 ```
 
-## Examine results
+# Examine results
 
 Now we get model output, including performance for various train/test splits and consistently high-scoring features. 
 
@@ -253,7 +271,7 @@ tsne <- tSNEPlotter(
 	)
 ```
 
-## sessionInfo
+# sessionInfo
 ```{r}
 sessionInfo()
 ```
\ No newline at end of file

From 75414ce304853356a59022db5dc14c00ff1ff16a Mon Sep 17 00:00:00 2001
From: Indy Ng <writingindy@gmail.com>
Date: Mon, 4 Oct 2021 17:30:17 -0400
Subject: [PATCH 04/10] Update CITATION.cff with DOI

---
 CITATION.cff | 14 ++------------
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/CITATION.cff b/CITATION.cff
index 6267891f..cdfe39c7 100644
--- a/CITATION.cff
+++ b/CITATION.cff
@@ -4,24 +4,12 @@ authors:
   -
     family-names: Pai
     given-names: Shraddha
-  -
-    family-names: Weber
-    given-names: Philipp
   -
     family-names: Shah
     given-names: Ahmad
-  -
-    family-names: Giudice
-    given-names: Luca
   -
     family-names: Hui
     given-names: Shirley
-  -
-    family-names: "Nøhr"
-    given-names: Anne
-  -
-    family-names: Ng
-    given-names: Indy
   -
     family-names: Isserlin
     given-names: Ruth
@@ -32,6 +20,8 @@ authors:
     family-names: Bader
     given-names: Gary
 cff-version: "1.1.0"
+date-released: 2019
+doi: "10.15252/msb.20188497"
 license: MIT
 message: "If you use this software, please cite it using these metadata."
 repository-code: "https://github.com/RealPaiLab/netDx"

From 4ad48a32a455c8b01d5b3caff57e829cd8d28ca8 Mon Sep 17 00:00:00 2001
From: Indy <writingindy@gmail.com>
Date: Tue, 5 Oct 2021 10:25:00 -0400
Subject: [PATCH 05/10] update RawDataConversion.Rmd

---
 vignettes/RawDataConversion.Rmd | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/vignettes/RawDataConversion.Rmd b/vignettes/RawDataConversion.Rmd
index 522e3977..b3500222 100644
--- a/vignettes/RawDataConversion.Rmd
+++ b/vignettes/RawDataConversion.Rmd
@@ -76,9 +76,7 @@ colData(brca)$ID <- pID
 
 # Create feature design rules
 
-To build the predictor using the netDx algorithm, we call the `buildPredictor()` function which takes patient data and variable groupings 
-
-## dataList object
+To build the predictor using the netDx algorithm, we call the `buildPredictor()` function which takes patient data and variable groupings, and returns a set of patient similarity networks (PSN) as an output. The user can customize what datatypes are used, how they are grouped, and what defines patient similarity for a given datatype.
 
 ## groupList object
 

From 928c73e7761a38a110d73c2986ca73a7ed448d7f Mon Sep 17 00:00:00 2001
From: Indy <writingindy@gmail.com>
Date: Tue, 5 Oct 2021 12:38:46 -0400
Subject: [PATCH 06/10] Fixed fetchPathwayDefinitions.Rd

---
 R/fileCache.R                  | 4 ++--
 man/fetchPathwayDefinitions.Rd | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/R/fileCache.R b/R/fileCache.R
index abe5af19..3b266835 100644
--- a/R/fileCache.R
+++ b/R/fileCache.R
@@ -72,7 +72,7 @@ fetchPathwayDefinitions <- function(month=NULL,year=NULL,day=1,verbose=FALSE){
 		month <- month.name[month]
 	}
 		pdate <- sprintf("%s_%02d_%i",month,day,year)
-    	pathwayURL <- paste("https://downloads.res.oicr.on.ca/pailab/EM_Genesets/", 
+    	pathwayURL <- paste("https://downloads.res.oicr.on.ca/pailab/public/EM_Genesets/", 
 		sprintf("%s/Human/symbol/",pdate),
         sprintf("Human_AllPathways_%s_symbol.gmt",pdate),
 		 sep = "")
@@ -83,7 +83,7 @@ fetchPathwayDefinitions <- function(month=NULL,year=NULL,day=1,verbose=FALSE){
 	if (chk$status_code==404) {
 		stop(paste(sprintf("The pathway file for %02d %s %i doesn't exist.",day,month,year),
 				"Select a different date. ",
-				"See https://downloads.res.oicr.on.ca/pailab/EM_Genesets/Human/symbol for options.",
+				"See https://downloads.res.oicr.on.ca/pailab/public/EM_Genesets/Human/symbol for options.",
 				sep=" "))
 	}
     bfcrpath(bfc, pathwayURL)
diff --git a/man/fetchPathwayDefinitions.Rd b/man/fetchPathwayDefinitions.Rd
index a79a0693..8d4b4d11 100644
--- a/man/fetchPathwayDefinitions.Rd
+++ b/man/fetchPathwayDefinitions.Rd
@@ -11,7 +11,7 @@ fetchPathwayDefinitions(month = NULL, year = NULL, day = 1, verbose = FALSE)
 numeric or text (e.g. "January","April"). If NULL, fails.}
 
 \item{year}{(numeric) year of pathway definition file. Must be in
-yyyy format (e.g. 2018). If NULL, fails.}
+yyyy format (e.g. 2020). If NULL, fails.}
 
 \item{day}{(integer)}
 

From 9c86c392cd7ec80fb5caad977bfcc4c1230d1ff9 Mon Sep 17 00:00:00 2001
From: Indy <writingindy@gmail.com>
Date: Tue, 5 Oct 2021 16:38:10 -0400
Subject: [PATCH 07/10] Update RawDataConversion vignette to be concise and
 clear

---
 vignettes/RawDataConversion.Rmd | 110 +++++++++++++-------------------
 1 file changed, 44 insertions(+), 66 deletions(-)

diff --git a/vignettes/RawDataConversion.Rmd b/vignettes/RawDataConversion.Rmd
index b3500222..25fb31cb 100644
--- a/vignettes/RawDataConversion.Rmd
+++ b/vignettes/RawDataConversion.Rmd
@@ -16,9 +16,7 @@ vignette: >
 
 In this example we will build a predictor to classify breast tumours as being either of Luminal A subtype or otherwise. The process is identical for classifying three or more labels, and the example uses minimal data for quick runtime.
 
-The netDx algorithm requires assay data to be provided in the form of a `MultiAssayExperiment` object, so we will also use the `convertToMAE()` wrapper function to transform raw experimental assay data/tables into a `MultiAssayExperiment` object, to illustrate how a typical netDx workflow might be initiated. 
-
-We will use data from The Cancer Genome Atlas to build the predictor, converting it from a `MultiAssayExperiment` object into a list to illustrate how to utilize the `convertToMAE()` wrapper function.
+Although the netDx algorithm requires assay data to be provided in the form of a `MultiAssayExperiment` object, the package comes equipped with the `convertToMAE()` wrapper function to transform raw experimental assay data/tables into a `MultiAssayExperiment` object. We will use data from The Cancer Genome Atlas to build the predictor, converting it from a `MultiAssayExperiment` object into a list to illustrate how to utilize the `convertToMAE()` wrapper function.
 
 We will integrate two types of -omic data:
 
@@ -54,7 +52,7 @@ brca <- suppressMessages(curatedTCGAData("BRCA",
                                          dry.run=FALSE, version="1.1.38"))
 ```
 
-The fetch command automatically brings in a `MultiAssayExperiment` object. 
+The fetch command automatically brings in a `MultiAssayExperiment` object.
 
 ```{r, eval = TRUE}
 summary(brca)
@@ -74,15 +72,15 @@ pID <- colData(brca)$patientID
 colData(brca)$ID <- pID
 ```
 
-# Create feature design rules
+# Create feature design rules (patient similarity networks)
 
 To build the predictor using the netDx algorithm, we call the `buildPredictor()` function which takes patient data and variable groupings, and returns a set of patient similarity networks (PSN) as an output. The user can customize what datatypes are used, how they are grouped, and what defines patient similarity for a given datatype.
 
 ## groupList object
 
-Our plan is to group gene expression data by pathways
+The `groupList` object tells the predictor how to group units when constructing a network. For example, genes may be grouped into a network representing a pathway. This object is a list; its names match those of `dataList`, while each value is itself a list and reflects a potential network.
 
-The 'groupList' object tells the predictor how to group units when constructing a network. For examples, genes may be grouped into a network representing a pathway. This object is a list; the names match those of `dataList` while each value is itself a list and reflects a potential network.
+In this simple example we create a single PSN for each datatype (mRNA gene expression and miRNA expression data), containing all measures from that datatype. Measures can be individual genes, proteins, CpG bases (in DNA methylation data), clinical variables, etc.
 
 
 ```{r, eval=TRUE}
@@ -99,19 +97,36 @@ for (k in 1:length(expr)) {	# loop over all layers
 }
 ```
 
+## Define patient similarity for each network
+
+`sims` is a list that specifies the choice of similarity metric to use for each grouping we're passing to the netDx algorithm. You can choose between several built-in similarity functions provided in the `netDx` package:
+
+* `normDiff` (normalized difference)
+* `avgNormDiff` (average normalized difference)
+* `sim.pearscale` (Pearson correlation followed by exponential scaling)
+* `sim.eucscale` (Euclidean distance followed by exponential scaling) or
+* `pearsonCorr` (Pearson correlation)
+
+You may also define custom similarity functions in this block of code and pass those to `makePSN_NamedMatrix()`, using the `customFunc` parameter.
+
+```{r,eval=TRUE}
+sims <- list(a="pearsonCorr", b="pearsonCorr")
+names(sims) <- names(groupList)
+```
+
 # Conversion of raw assay data into MultiAssayExperiment format
 
-Data pulled from The Cancer Genome Atlas through the BioConductor `curatedTCGAData` package automatically fetches data in the form of a `MultiAssayExperiment` object. However, most workflows that might utilize the netDx algorithm will have experimental assay data and patient metadata in the form of data frames/matrices/tables. 
+Data pulled from The Cancer Genome Atlas through the Bioconductor `curatedTCGAData` package arrives as a `MultiAssayExperiment` object. However, most workflows that might utilize the netDx algorithm will have experimental assay data and patient metadata in the form of data frames/matrices/tables.
 
-To facilitate ease-of-use, the netDx package has a built-in wrapper function convertToMAE() that takes in an input list of key-value pairs of experimental assay data and patient metadata, converting it into a MultiAssayExperiment object compatible with further analysis using the netDx algorithm. 
+To facilitate ease-of-use, the netDx package has a built-in wrapper function `convertToMAE()` that takes in an input list of key-value pairs of experimental assay data and patient metadata, converting it into a `MultiAssayExperiment` object compatible with further analysis using the netDx algorithm. However, all relevant data engineering/preparation should be done before using the `convertToMAE()` wrapper function.
 
-This next code block converts the TCGA data into a list format.
+This next code block converts the TCGA data into a list format to illustrate how one might use the `convertToMAE()` wrapper function.
 
 ```{r, eval=TRUE}
 brcaData <- dataList2List(brca, groupList)
 ```
 
-The keys of the input list of key-value pairs should be labelled according to the type of data corresponding to the value pairs (methylation, mRNA, proteomic, etc) and there must be a key-value pair that corresponds to patient IDs/metadata labelled pheno.
+The keys of the input list of key-value pairs should be labelled according to the type of data corresponding to the value pairs (methylation, mRNA, proteomic, etc) and there must be a key-value pair that corresponds to patient IDs/metadata labelled `pheno`.
 
 ```{r, eval=TRUE}
 brcaList <- brcaData$assays
@@ -119,49 +134,13 @@ brcaList <- c(brcaList, list(brcaData$pheno))
 names(brcaList)[3] <- "pheno"
 ```
 
-We can now call the convertToMAE() wrapper function to convert the list containing experimental assay data and patient metadata into a MAE object.
+We can now call the `convertToMAE()` wrapper function to convert the list containing experimental assay data and patient metadata into a `MultiAssayExperiment` object.
 
 ```{r, eval=TRUE}
 brca <- convertToMAE(brcaList)
 ```
 
-The rest of the workflow follows the vignette "01: Build binary predictor and view performance, top features and integrated Patient Similarity Network".
-
-## Define patient similarity for each network
-The `makeNets` function tells the predictor how to create networks from provided input data.
-
-This function requires `dataList`,`groupList`, and `netDir` as input variables. The residual `...` parameter is to pass additional variables to `makePSN_NamedMatrix()`, notably `numCores` (number of parallel jobs).
-
-netDx requires that this function have:
-
-* `dataList`,`groupList`, and `netDir` as input variables. The residual `...` parameter is to pass additional variables to `makePSN_NamedMatrix()`, notably number of cores for parallel processing (`numCores`). 
-
-
-```{r,  eval=TRUE}
-makeNets <- function(dataList, groupList, netDir,...) {
-	netList <- c() # initialize before is.null() check
-	
-	layerNames <- c("BRCA_miRNASeqGene-20160128",
-		"BRCA_mRNAArray-20160128")
-	
-	for (nm in layerNames){  			## for each layer
-		if (!is.null(groupList[[nm]])){ ## must check for null for each layer
-			netList_cur <- makePSN_NamedMatrix(
-				dataList[[nm]],
-				rownames(dataList[[nm]]),	## names of measures (e.g. genes, CpGs)
-				groupList[[nm]],			## how to group measures in that layer
-				netDir,						## leave this as-is, netDx will figure out where this is.
-				verbose=FALSE, 			
-				writeProfiles=TRUE,   		## use Pearson correlation-based similarity
-				...
-				)
-
-			netList <- c(netList,netList_cur)	## just leave this in
-		}
-	}
-	return(unlist(netList))	## just leave this in 
-}
-```
+We can then proceed with the rest of the netDx workflow.
 
 # Build predictor
 
@@ -169,7 +148,7 @@ Finally we call the function that runs the netDx predictor. We provide:
 
 * patient data  (`dataList`)
 * grouping rules (`groupList`)
-* function to create PSN from data, includes choice of similarity metric (`makeNetFunc`)
+* list specifying choice of similarity metric to use for each grouping (`sims`)
 * number of train/test splits over which to collect feature scores and average performance (`numSplits`), 
 * maximum score for features in one round of feature selection  (`featScoreMax`, set to 10)
 * threshold to call feature-selected networks for each train/test split (`featSelCutoff`); only features scoring this value or higher will be used to classify test patients,
@@ -189,26 +168,26 @@ This step can take a few hours based on the current parameters, so we comment th
 ```{r lab1-buildpredictor ,eval=TRUE}
 nco <- round(parallel::detectCores()*0.75) # use 75% available cores
 message(sprintf("Using %i of %i cores", nco, parallel::detectCores()))
+
 t0 <- Sys.time()
 set.seed(42) # make results reproducible
 outDir <- paste(tempdir(),randAlphanumString(),
 	"pred_output",sep=getFileSep())
 if (file.exists(outDir)) unlink(outDir,recursive=TRUE)
-model <- suppressMessages(buildPredictor(
-	dataList=brca,			## your data
-	groupList=groupList,	## grouping strategy
-	makeNetFunc=makeNets,	## function to build PSNs
-	outDir=outDir, 			## output directory
-	trainProp=0.8,			## pct of samples to use to train model in
-							## each split
-	numSplits=2L,			## number of train/test splits
- featSelCutoff=1L,		## threshold for calling something
-							## feature-selected
-	featScoreMax=2L,		## max score for feature selection
- numCores=nco,			## set higher for parallelizing
- debugMode=FALSE,
- keepAllData=FALSE,	## set to TRUE for debugging or low-level files used by the dictor
- logging="none"
+model <- suppressMessages(
+	buildPredictor(
+		dataList=brca,			## your data
+		groupList=groupList,	## grouping strategy
+		sims = sims,
+		outDir=outDir, 			## output directory
+		trainProp=0.8,			## pct of samples to use to train model in each split
+		numSplits=2L,			## number of train/test splits
+		featSelCutoff=1L,		## threshold for calling something feature-selected
+		featScoreMax=2L,		## max score for feature selection
+		numCores=nco,			## set higher for parallelizing
+		debugMode=FALSE,
+		keepAllData=FALSE,	## set to TRUE for debugging or low-level files used by the predictor
+		logging="none"
   ))
 t1 <- Sys.time()
 print(t1-t0)
@@ -218,7 +197,6 @@ print(t1-t0)
 
 Now we get model output, including performance for various train/test splits and consistently high-scoring features. 
 
-
 In the function below, we define top-scoring features as those which score two out of two in at least half of the train/test splits:
 
 ```{r lab1-getresults,eval=TRUE}

From 6a168bc92c47850196df1703a688c689dee802b7 Mon Sep 17 00:00:00 2001
From: Indy <writingindy@gmail.com>
Date: Tue, 5 Oct 2021 17:02:04 -0400
Subject: [PATCH 08/10] Update geneMANIA links in fileCache.R to point to
 pailab servers

---
 R/fileCache.R | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/R/fileCache.R b/R/fileCache.R
index 3b266835..21eaa6a1 100644
--- a/R/fileCache.R
+++ b/R/fileCache.R
@@ -24,11 +24,11 @@ getGMjar_path <- function(verbose = FALSE) {
 	)
 	if (any(grep(" 11",java_ver)) || any(grep(" 12",java_ver)) || any(grep(" 13",java_ver)) || any(grep(" 14",java_ver)) || any(grep(" 16",java_ver))) {
 		if (verbose) message("Java 11+ detected")
-    	fileURL <- paste("https://download.baderlab.org/netDx/java11/", 
+    	fileURL <- paste("https://downloads.res.oicr.on.ca/pailab/netDx/java11/", 
 			"genemania-netdx.jar",sep="")
 	} else {
 		if (verbose) message("Java 8 detected")
-    	fileURL <- paste("https://download.baderlab.org/netDx/java8/", 
+    	fileURL <- paste("https://downloads.res.oicr.on.ca/pailab/netDx/java8/", 
 			"genemania-netdx.jar",sep="")
 	}
 	

From 869464549ae044d10cbb8be6d9a0131e2ce25cf5 Mon Sep 17 00:00:00 2001
From: Indy <writingindy@gmail.com>
Date: Tue, 5 Oct 2021 17:04:47 -0400
Subject: [PATCH 09/10] update RawDataConversion vignette

---
 vignettes/RawDataConversion.Rmd | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/vignettes/RawDataConversion.Rmd b/vignettes/RawDataConversion.Rmd
index 25fb31cb..b36f47b7 100644
--- a/vignettes/RawDataConversion.Rmd
+++ b/vignettes/RawDataConversion.Rmd
@@ -234,19 +234,6 @@ And here are selected features, which are those scoring 2 out of 2 in at least h
 results$selectedFeatures
 ```
 
-We finally get the integrated PSN and visualize it using a tSNE plot:
-
-```{r, fig.width=8,fig.height=8, eval=TRUE}
-## this call doesn't work in Rstudio; for now we've commented this out and saved the PSN file. 
-psn <- getPSN(brca,groupList,makeNetFunc=makeNets,selectedFeatures=results$selectedFeatures)
-
-require(Rtsne)
-tsne <- tSNEPlotter(
-	psn$patientSimNetwork_unpruned, 
-	colData(brca)
-	)
-```
-
 # sessionInfo
 ```{r}
 sessionInfo()

From d220795b6cb4ac999f95fe24370a1321d404511e Mon Sep 17 00:00:00 2001
From: Indy <writingindy@gmail.com>
Date: Tue, 5 Oct 2021 17:49:40 -0400
Subject: [PATCH 10/10] Fixed getPSN call in RawDataConversion vignette

---
 vignettes/RawDataConversion.Rmd | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/vignettes/RawDataConversion.Rmd b/vignettes/RawDataConversion.Rmd
index b36f47b7..0fd67d6e 100644
--- a/vignettes/RawDataConversion.Rmd
+++ b/vignettes/RawDataConversion.Rmd
@@ -233,6 +233,17 @@ And here are selected features, which are those scoring 2 out of 2 in at least h
 ```{r, eval=TRUE}
 results$selectedFeatures
 ```
+Finally, we get the integrated PSN and visualize it using a tSNE plot:
+
+```{r, fig.width=8,fig.height=8, eval=TRUE}
+## this call doesn't work in Rstudio; for now we've commented this out and saved the PSN file. 
+psn <- getPSN(brca,groupList,sims = sims,selectedFeatures=results$selectedFeatures)
+require(Rtsne)
+tsne <- tSNEPlotter(
+	psn$patientSimNetwork_unpruned, 
+	colData(brca)
+	)
+```
 
 # sessionInfo
 ```{r}