3. Filtering

Peter edited this page Nov 28, 2019 · 4 revisions

Overview

At this point, you should have a BAM file of barcoded alignments. This next step is a series of filters to remove reads that do not match certain criteria.

Repeat masking

We use bedtools to discard reads that overlap any interval in a mask file. The mask file is a composite of two sources:

  • UCSC RepeatMasker annotations with milliDiv (base mismatches per thousand) less than 140
  • mm10 or hg38 blacklisted regions v2

The composite mask files that we use for mm10 and hg38 can be found in the main directory of the SPRITE-DNA pipeline.
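For reference, a composite mask of this kind could be assembled roughly as follows. This is a minimal sketch, not the procedure used to build the files shipped with the pipeline. It assumes a tab-separated UCSC rmsk table dump (milliDiv in column 3, genoName, genoStart and genoEnd in columns 6 to 8) and a BED-format blacklist; the function names and file names are hypothetical, and treating the cutoff as strictly less than 140 is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: function names and file names below are hypothetical.

# Convert a UCSC rmsk table dump to 3-column BED, keeping only rows whose
# milliDiv (column 3) is strictly below the cutoff (default 140).
# Columns 6-8 of the rmsk table are genoName, genoStart, genoEnd.
rmsk_to_bed() {
    local rmsk_table="$1"
    local max_millidiv="${2:-140}"
    awk -F'\t' -v max="$max_millidiv" 'BEGIN { OFS = "\t" }
        !/^#/ && $3 + 0 < max + 0 { print $6, $7, $8 }' "$rmsk_table"
}

# Concatenate the repeat intervals and the blacklist intervals, keep the
# first three BED columns, and sort by chromosome and start coordinate.
build_mask() {
    local repeats_bed="$1"
    local blacklist_bed="$2"
    cat "$repeats_bed" "$blacklist_bed" | cut -f1-3 | LC_ALL=C sort -k1,1 -k2,2n
}

# Example usage (file names are placeholders):
#   rmsk_to_bed rmsk.txt > repeats.bed
#   build_mask repeats.bed blacklist.v2.bed > mask_file.bed
```

The sorted output can optionally be passed through `bedtools merge -i` to collapse overlapping intervals. This is not required for `bedtools intersect -v`, but it keeps the mask file smaller.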

bedtools intersect -v -a example.DNA.chr.bam -b mask_file.bed > example.DNA.chr.masked.bam
