virus_assembly

A Bash pipeline for de novo assembly of viral genomes from Illumina NGS reads. It currently handles the following viruses: HIV-1, RSV, RRV and HMPV.

Current version: V1

Requirements

A conda package manager like Miniconda3.

Installation

  1. Download the initial environment installation file
wget https://raw.githubusercontent.com/jsede/virus_assembly/main/scripts/install_env.sh
  2. Run the script in the terminal
bash ./install_env.sh
  3. Check that the installation worked
conda activate virus_assembly
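
To confirm the environment exists before activating it, a small check along these lines may help. This is a minimal sketch, not part of the repository: `has_env` is a hypothetical helper, and it assumes that `conda env list` prints the environment name in the first column of each line.

```bash
#!/usr/bin/env bash
# Sketch only: has_env is a hypothetical helper, not shipped with the pipeline.
# It reads `conda env list` style output on stdin and succeeds if the first
# column of any line equals the requested environment name.
has_env() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Only query conda when it is actually on the PATH.
if command -v conda >/dev/null 2>&1; then
    if conda env list | has_env virus_assembly; then
        echo "virus_assembly environment found"
    else
        echo "virus_assembly environment not found; re-run install_env.sh" >&2
    fi
fi
```

If the environment is reported as missing, re-running the installation script is the first thing to try.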

Usage

Pipeline: NGS pipeline for viral assembly.
usage: virus_assembly [-h -v -p -q] (-i dir -m value -t value )
(-s string) 
with:
 -h  Show help text
 -v  Version of the pipeline
 -n  Name of RUN.
 -i  Input directory
 -s  Viral species
 -c  Perform clipping of primers
 -q  Perform quality check using fastQC
 -m  Memory
 -t  Number of threads
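
As an illustration of how the options combine, the dry-run sketch below assembles a command line and prints it instead of running it. `build_cmd` is a hypothetical helper. Treating `-q` as a switch without a value follows the usage line above, while the memory and thread values are placeholders, since the expected units for `-m` are not stated here.

```bash
#!/usr/bin/env bash
# Sketch only: build_cmd is a hypothetical helper that prints a command line.
# Assumption: -q takes no value (as in the usage line); the units of -m are
# not documented here, so the number passed in is a placeholder.
build_cmd() {
    local in_dir="$1" species="$2" mem="$3" threads="$4"
    local cmd=(virus_assembly -i "$in_dir" -s "$species" -m "$mem" -t "$threads" -q)
    echo "${cmd[*]}"
}

# Print the command for inspection; nothing is executed.
build_cmd path/to/main/directory VIRUS 16 8
```

Printing the command first makes it easy to review the arguments before starting a long run.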

Usage

  1. Activate the environment.
conda activate virus_assembly
  2. Head to the directory where you will perform the analysis.
  3. Place the raw fastq.gz files in a directory called 1_reads.
  4. Create a list of the sample names from your sequencing files, call it IDs.list, and place it in the main directory.
  5. Start the pipeline using the following command
virus_assembly -i path/to/main/directory -s VIRUS 
  6. When the pipeline has finished, 4 additional folders will have been created:
    • 2_ref_map: Includes a BAM file of the trimmed reads mapped against the viral reference genome and a PDF with the Qualimap results.
    • 3_contigs: Includes the contigs assembled de novo by MEGAHIT for each sample.
    • 4_filter: Includes the high-coverage contigs generated by MEGAHIT (*hicov.fasta), filtered and reoriented against the viral reference genome (*reoriented.fa).
    • 5_remap: Includes both a FASTA file holding the viral contigs for that sample and a BAM file of the trimmed reads mapped against those contigs.
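
The preparation and checking steps above can be sketched as two small helper functions. Both are hypothetical and not part of the pipeline. `make_ids_list` assumes paired-end files named `<sample>_R1*.fastq.gz` and takes everything before `_R1` as the sample name, which may not match the naming the pipeline expects. `check_outputs` only tests that the four output folders exist.

```bash
#!/usr/bin/env bash
# Sketch only: both helpers are hypothetical, not shipped with the pipeline.

# Write IDs.list into the main directory from the files found in 1_reads.
# Assumption: forward reads are named <sample>_R1*.fastq.gz.
make_ids_list() {
    local main_dir="$1" f name
    for f in "$main_dir"/1_reads/*_R1*.fastq.gz; do
        [ -e "$f" ] || continue          # skip the unexpanded pattern if nothing matches
        name="$(basename "$f")"
        printf '%s\n' "${name%%_R1*}"    # drop everything from the first "_R1" onwards
    done | sort -u > "$main_dir/IDs.list"
}

# Return 0 if all four output folders exist, 1 otherwise.
check_outputs() {
    local main_dir="$1" d missing=0
    for d in 2_ref_map 3_contigs 4_filter 5_remap; do
        if [ ! -d "$main_dir/$d" ]; then
            echo "missing output folder: $d" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `make_ids_list path/to/main/directory` could be run before starting the pipeline and `check_outputs path/to/main/directory` after it finishes. It is worth inspecting the generated IDs.list by hand, since the sample-name convention is an assumption.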

Included viral reference genomes

  • HIV: K03455.1
  • RSV: MH760627; MH760652
  • RRV: RRV_ref (Accession pending)
  • HMPV: HMPV205; HMPV218 (Accessions pending)
