Tutorial Preface

Peter Ronning edited this page Aug 31, 2018 · 9 revisions

Chapter 4.1 - Tutorial Preface

Chapter 4 of this Wiki is a tutorial for using DeepSVR. It details the commands needed to build a model with a large data set, use the model to classify a set of input variants, assess the accuracy of the model, and retrain the model. The tutorial is designed to be followed interactively. Each subchapter explains one critical command for building and using DeepSVR: it gives background on the command and its function, followed by detailed explanations of the input/output files, command line syntax, and command flag options.

The original 41,000 variants used to train the model in this tutorial are referred to as the Original Data, while the variants to be classified from Chr22 of tst1 are referred to as the Inference Data.

We strongly recommend creating and training the classifier with the Original Data. Given its sample size and tissue diversity, this set of 41,000 somatic variants is an excellent starting point for training a model, and it equips the model to handle many cancer types across solid and liquid tumors. We also recommend adding 5% (or >250) of your own manually reviewed variants to the training data to help mitigate batch effects. When reviewing the accuracy of the called variants via ROC curves, keep in mind that accuracy can be improved by increasing the amount of training data that you contribute.

In subchapters 4.3-4.6, purple text boxes in the main table mark the current step.

To build the classifier using the Original Data and classify your own data, replace the tst1 Inference Data with your somatic variants. This tutorial is designed to help a novice programmer use a deep learning model to classify their own putative somatic variants.
