- Overview
- Software Requirements
- Installation
- Quick Start
- Usage Examples
- Command Line Reference
- Troubleshooting
- Development
- Citation
RNA MAP is an open-source tool for rapid analysis of RNA mutational profiling (MaP) experiments. It was inspired by the DREEM algorithm developed by the Rouskin Lab and provides a platform for analyzing the DMS reactivity of RNA molecules.
RNA mutational profiling (MaP) probes RNA structure with dimethyl sulfate (DMS), which methylates unpaired adenines and cytosines. These modifications are read out as mutations during reverse transcription, so the resulting sequencing data reveal the structural state of RNA molecules at single-nucleotide resolution.
- Fast and accurate analysis of DMS-MaPseq experiments
- Support for both single-end and paired-end sequencing data
- Quality control integration with FastQC and Trim Galore
- Flexible alignment options using Bowtie2
- Interactive visualizations with Plotly
- Batch processing capabilities for large datasets
- Docker support for reproducible environments
- FASTA file: Reference RNA sequence(s) of interest
- FASTQ file(s): Raw sequencing data from DMS-MaPseq experiment
- Optional: Dot-bracket structure file for enhanced visualization
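Before starting a run, it can help to sanity-check the reference FASTA. The sketch below is not part of RNA MAP: `read_fasta`, `check_fasta`, and the accepted alphabet `VALID_BASES` are hypothetical helpers chosen for illustration, and RNA MAP's own validation rules may differ.

```python
from collections import Counter

# Assumption: the reference uses only these letters (DNA or RNA alphabet plus N).
VALID_BASES = set("ACGTUN")


def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) tuples."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Use the first whitespace-delimited token of the header as the name.
            header = line[1:].split()
            name = header[0] if header else ""
            chunks = []
        elif name is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def check_fasta(records):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    counts = Counter(name for name, _ in records)
    for name, n in counts.items():
        if n > 1:
            problems.append(f"duplicate name: {name}")
    for name, seq in records:
        if not seq:
            problems.append(f"empty sequence: {name}")
        bad = sorted(set(seq) - VALID_BASES)
        if bad:
            problems.append(f"unexpected characters in {name}: {''.join(bad)}")
    return problems
```

To check a file, read it with `open("reference.fasta").read()`, pass the text to `read_fasta`, and print whatever `check_fasta` returns.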
- Python: 3.8 or greater
- Bowtie2: 2.2.9+ (for sequence alignment)
- FastQC: 0.11.9+ (for quality control)
- Trim Galore: 0.6.6+ (for adapter trimming)
- Cutadapt: 1.18+ (for quality trimming)
- Conda/Mamba: For environment management
- Docker: For containerized deployment
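A quick way to confirm that the external tools are visible to the current environment is to look them up on `PATH`. This is a minimal sketch rather than an RNA MAP feature: `missing_tools` and `python_ok` are hypothetical helpers, and the executable names are assumed to be the ones a typical Bioconda install provides. It does not check tool versions.

```python
import shutil
import sys

# Assumption: executable names as installed by Bioconda packages.
REQUIRED_TOOLS = ["bowtie2", "fastqc", "trim_galore", "cutadapt"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def python_ok(version_info=sys.version_info, minimum=(3, 8)):
    """Return True if the interpreter meets the minimum (major, minor) version."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Missing tools:", missing_tools() or "none")
```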
💡 Recommendation: If you're trying the software for the first time, we highly recommend using the Docker image for a hassle-free experience.
The easiest way to install RNA MAP is using the provided environment.yml file, which automatically installs all dependencies:
# Clone the repository
git clone https://github.com/YesselmanLab/rna_map
cd rna_map
# Create environment from environment.yml
conda env create -f environment.yml
# Activate the environment
conda activate rna-map
# Install the package in development mode
pip install -e .

Alternatively, you can install from PyPI:
# Create a new conda environment
conda create -n rna-map python=3.11
# Activate the environment
conda activate rna-map
# Install bioinformatics tools
conda install -c bioconda bowtie2 fastqc cutadapt trim-galore
# Install RNA MAP
pip install rna-map

Docker provides the most reliable installation method with all dependencies pre-configured:
# Clone the repository
git clone https://github.com/YesselmanLab/rna_map
cd rna_map
# Build the Docker image
# For Linux and Intel Mac:
docker build -t rna-map -f docker/Dockerfile .
# For Apple Silicon Mac (ARM64):
docker build -t rna-map --platform linux/amd64 -f docker/Dockerfile .
# Run RNA MAP with Docker
rna-map -fa <fasta_file> -fq1 <fastq_file> --docker

For development or if you need the latest features:
# Clone the repository
git clone https://github.com/YesselmanLab/rna_map
cd rna_map
# Install in development mode
pip install -e .
# Install development dependencies
pip install -e ".[dev]"

After installation, the rna-map command-line tool will be available in your environment.
# Single-end sequencing
rna-map -fa reference.fasta -fq1 reads.fastq
# Paired-end sequencing
rna-map -fa reference.fasta -fq1 reads_R1.fastq -fq2 reads_R2.fastq
# With secondary structure information
rna-map -fa reference.fasta -fq1 reads.fastq --dot-bracket structures.csv

rna-map -fa my_rna.fasta -fq1 my_reads.fastq

rna-map -fa my_rna.fasta -fq1 reads_R1.fastq -fq2 reads_R2.fastq

rna-map -fa my_rna.fasta -fq1 my_reads.fastq --dot-bracket structures.csv

# Use a custom parameter file
rna-map -fa my_rna.fasta -fq1 my_reads.fastq -pf custom_params.yml
# Use preset parameters for barcoded libraries
rna-map -fa my_rna.fasta -fq1 my_reads.fastq -pp barcoded-library

# Skip quality control steps for faster processing
rna-map -fa my_rna.fasta -fq1 my_reads.fastq --skip-fastqc --skip-trim-galore
# Custom quality cutoff
rna-map -fa my_rna.fasta -fq1 my_reads.fastq --tg-q-cutoff 20

# Apply strict filtering criteria
rna-map -fa my_rna.fasta -fq1 my_reads.fastq \
    --map-score-cutoff 20 \
    --qscore-cutoff 20 \
    --mutation-count-cutoff 2 \
    --percent-length-cutoff 0.8

For large datasets with thousands of reference sequences:
# Generate summary only (no individual plots)
rna-map -fa large_dataset.fasta -fq1 reads.fastq --summary-output-only
# Skip bit vector generation for very large datasets
rna-map -fa large_dataset.fasta -fq1 reads.fastq --skip-bit-vector

# Run with Docker (Linux/Intel Mac)
rna-map -fa my_rna.fasta -fq1 my_reads.fastq --docker
# Run with Docker (Apple Silicon Mac)
rna-map -fa my_rna.fasta -fq1 my_reads.fastq --docker --docker-platform linux/amd64

For a complete list of available options, run:
rna-map --help

- -fa, --fasta PATH: Reference sequences in FASTA format (required)
- -fq1, --fastq1 PATH: Single-end reads, or first mate (R1) of paired-end reads (required)
- -fq2, --fastq2 PATH: Second mate (R2) of paired-end reads (optional)
- --dot-bracket PATH: CSV file with secondary structure information
- -pf, --param-file PATH: Custom parameter file (YAML format)
- -pp, --param-preset TEXT: Use preset parameters (e.g., 'barcoded-library')

- --skip-fastqc: Skip FastQC quality control
- --skip-trim-galore: Skip Trim Galore adapter trimming
- --tg-q-cutoff INTEGER: Quality cutoff for Trim Galore (default: 20)

- --bt2-alignment-args TEXT: Custom Bowtie2 arguments (comma-separated)
- --save-unaligned PATH: Save unaligned reads to specified path

- --skip-bit-vector: Skip bit vector generation
- --summary-output-only: Generate summary only (no individual plots)
- --map-score-cutoff INTEGER: Minimum mapping score threshold
- --qscore-cutoff INTEGER: Quality score threshold for nucleotides
- --mutation-count-cutoff INTEGER: Maximum mutations per read
- --percent-length-cutoff FLOAT: Minimum read length percentage
- --min-mut-distance INTEGER: Minimum distance between mutations

- --docker: Run in Docker container
- --docker-image TEXT: Specify Docker image
- --docker-platform TEXT: Specify platform (e.g., linux/amd64)

- --overwrite: Overwrite existing output directory
- --debug: Enable debug mode
- --help: Show help message
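For scripted or batch runs, the options above can be assembled programmatically. This is a minimal sketch, not an RNA MAP API: `build_command` is a hypothetical helper that only concatenates flags listed in this reference, and it does not validate names or values.

```python
import shlex


def build_command(fasta, fastq1, fastq2=None, dot_bracket=None,
                  options=None, flags=()):
    """Build an rna-map argument list from the documented CLI flags.

    options: mapping of long option name (without leading dashes) to value.
    flags:   iterable of long boolean flag names (without leading dashes).
    """
    cmd = ["rna-map", "-fa", str(fasta), "-fq1", str(fastq1)]
    if fastq2 is not None:
        cmd += ["-fq2", str(fastq2)]
    if dot_bracket is not None:
        cmd += ["--dot-bracket", str(dot_bracket)]
    for name, value in (options or {}).items():
        cmd += [f"--{name}", str(value)]
    for name in flags:
        cmd.append(f"--{name}")
    return cmd


# Example: paired-end run with two filtering cutoffs and summary-only output.
cmd = build_command(
    "my_rna.fasta",
    "reads_R1.fastq",
    fastq2="reads_R2.fastq",
    options={"map-score-cutoff": 20, "qscore-cutoff": 20},
    flags=["summary-output-only"],
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the environment is activated. Passing a list rather than a single string avoids shell quoting problems with unusual file names.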
Problem: rna-map command not found after installation
# Solution: Ensure the conda environment is activated
conda activate rna-map
which rna-map

Problem: Missing dependencies (bowtie2, fastqc, etc.)
# Solution: Install using conda
conda install -c bioconda bowtie2 fastqc cutadapt trim-galore

Problem: Low alignment rates
- Check that your FASTQ files are properly formatted
- Verify that the reference sequence matches your experimental setup
- Try adjusting Bowtie2 parameters: --bt2-alignment-args "very-sensitive-local"
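One way to check that a FASTQ file is properly formatted is to inspect its first records. The sketch below assumes the common four-line layout (header, sequence, `+` separator, qualities) with no line wrapping. `check_fastq` is a hypothetical helper for illustration, not something RNA MAP provides.

```python
from itertools import islice


def check_fastq(lines, max_records=1000):
    """Check the first max_records four-line FASTQ records.

    lines: any iterable of text lines, for example an open file handle.
    Returns a list of problems; an empty list means none were found.
    """
    problems = []
    it = iter(lines)
    for index in range(1, max_records + 1):
        record = [line.rstrip("\r\n") for line in islice(it, 4)]
        if not record:
            break  # end of input
        if len(record) < 4:
            problems.append(f"record {index}: truncated ({len(record)} of 4 lines)")
            break
        header, seq, plus, qual = record
        if not header.startswith("@"):
            problems.append(f"record {index}: header does not start with '@'")
        if not plus.startswith("+"):
            problems.append(f"record {index}: third line does not start with '+'")
        if len(seq) != len(qual):
            problems.append(f"record {index}: sequence and quality lengths differ")
    return problems
```

For an uncompressed file, `with open("my_reads.fastq") as fh: print(check_fastq(fh))` prints any problems found. Gzipped input would need `gzip.open(path, "rt")` instead.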
Problem: Memory issues with large datasets
# Solution: Use summary-only mode for large datasets
rna-map -fa large_dataset.fasta -fq1 reads.fastq --summary-output-only

Problem: Docker permission errors
# Solution: Add user to docker group (Linux) or use sudo
sudo usermod -aG docker $USER
# Then log out and back in

Problem: Poor quality scores
- Check your sequencing quality
- Adjust quality cutoff: --tg-q-cutoff 15 (lower threshold)
- Skip quality control if data is known to be good: --skip-fastqc --skip-trim-galore
Problem: No mutations detected
- Verify DMS treatment was performed correctly
- Check that reference sequence is correct
- Lower mutation count cutoff: --mutation-count-cutoff 1

- Check the logs: RNA MAP provides detailed logging information
- Use debug mode: Add the --debug flag for verbose output
- Test with sample data: Use the provided test files in test/resources/
- GitHub Issues: Report bugs and ask questions on the GitHub repository
A successful run will show:
rna_map.CLI - INFO - Starting RNA MAP analysis...
rna_map.RUN - INFO - Found 1 valid reference sequences
rna_map.MAPPING - INFO - Bowtie2 2.4.5 detected!
rna_map.MAPPING - INFO - FastQC v0.11.9 detected!
rna_map.MAPPING - INFO - Trim Galore 0.6.6 detected!
rna_map.EXTERNAL_CMD - INFO - Running FastQC...
rna_map.EXTERNAL_CMD - INFO - FastQC completed successfully
rna_map.EXTERNAL_CMD - INFO - Running Trim Galore...
rna_map.EXTERNAL_CMD - INFO - Trim Galore completed successfully
rna_map.EXTERNAL_CMD - INFO - Running Bowtie2 alignment...
rna_map.EXTERNAL_CMD - INFO - Bowtie2 alignment completed successfully
rna_map.BIT_VECTOR - INFO - Starting bit vector generation...
rna_map.BIT_VECTOR - INFO - Analysis completed successfully!
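When runs are scripted, captured log output like the sample above can be checked automatically. The following sketch assumes every line follows the `logger - LEVEL - message` pattern shown in the sample and that a run ends with the "Analysis completed successfully!" message. `parse_log` and `run_succeeded` are hypothetical helpers, and the `ERROR` and `CRITICAL` level names are assumed from Python's standard logging levels rather than taken from RNA MAP.

```python
import re

# Assumption: log lines look like "rna_map.NAME - LEVEL - message".
LOG_LINE = re.compile(
    r"^(?P<logger>rna_map\.\w+) - (?P<level>[A-Z]+) - (?P<message>.*)$"
)


def parse_log(text):
    """Return a list of dicts with 'logger', 'level', and 'message' keys."""
    entries = []
    for line in text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            entries.append(match.groupdict())
    return entries


def run_succeeded(entries):
    """True if no error-level entries exist and the last message signals completion."""
    if not entries:
        return False
    no_errors = all(e["level"] not in ("ERROR", "CRITICAL") for e in entries)
    return no_errors and entries[-1]["message"] == "Analysis completed successfully!"
```

Lines that do not match the pattern (for example, output from external tools) are skipped rather than treated as failures.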
# Clone the repository
git clone https://github.com/YesselmanLab/rna_map
cd rna_map
# Create development environment
conda env create -f environment.yml
conda activate rna-map
# Install in development mode
pip install -e ".[dev]"
# Run tests
pytest
# Run linting
black rna_map/
flake8 rna_map/

This project uses:
- Black for code formatting
- Flake8 for linting
- MyPy for type checking
- Pytest for testing
If you use RNA MAP in your research, please cite:
Yesselman, J.D., et al. (2022). RNA mutational profiling (MaP) with DREEM reveals
structural and functional insights into RNA structure. Nucleic Acids Research,
50(12), e70. https://doi.org/10.1093/nar/gkac435
