How to keep pairing status in BAM output when also providing index FASTQs #39

@clintval

When I run splitcode on a pair of FASTQs I correctly get pairing status in the BAM output:

$ splitcode \
    --nFastqs 2 \
    --out-bam \
    --pipe \
    r1.fastq.gz \
    r2.fastq.gz \
    | samtools view | head -n2
A01000:460:XXXXXX:1:2101:10004:10081	77	*	0	0	*	*     	0	0	ACGT	,:FF
A01000:460:XXXXXX:1:2101:10004:10081	141	*	0	0	*	*     	0	0	GGTC	,:ff

Notice the SAM flags 77 and 141, which mark the records as the first and second read of a pair.

But when I also pass an index FASTQ so that I can use the information stored in it, the first two reads are no longer written as a pair:

$ splitcode \
    --nFastqs 3 \
    --select '0,1' \
    --out-bam \
    --pipe \
    r1.fastq.gz \
    r2.fastq.gz \
    index_1.fastq.gz \
    | samtools view | head -n2
A01000:460:XXXXXX:1:2101:10004:10081	12	*	0	0	*	*     	0	0	ACGT	,:FF
A01000:460:XXXXXX:1:2101:10004:10081	12	*	0	0	*	*     	0	0	GGTC	,:ff

Notice that both records now carry SAM flag 12 (read unmapped, mate unmapped), so neither is marked as paired, first of pair, or second of pair.
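For anyone who wants to check the flag values above by hand, here is a small sketch that decodes a SAM flag into its named bits. The bit values come from the SAM specification; the function name `decode_sam_flag` and the short bit labels are only illustrative choices and are not part of splitcode or samtools.

```python
# Bit values of the SAM FLAG field, as defined in the SAM specification.
# The label strings are informal names chosen for this example.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_sam_flag(flag: int) -> list[str]:
    """Return the labels of every bit that is set in a SAM flag."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


# The three flag values seen in the two outputs above.
for flag in (77, 141, 12):
    print(flag, decode_sam_flag(flag))
```

Flags 77 and 141 both include the paired bit plus first or second of pair, while 12 only has the two unmapped bits set, which is why downstream tools will not treat those records as a pair.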

What I'm trying to do is find and extract a variety of cell barcodes, UMIs, linkers, and other complex information from R1, R2, and a single index file, while writing the R1 and R2 template sequences to a correctly formatted unmapped SAM/BAM file. I was hoping --select '0,1' would tell splitcode that I want to retain a pair of reads from only those two FASTQ files.
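To make the desired output concrete, here is a rough sketch of a post-processing step that would restore the pairing flags on SAM text. It is only an illustration of what I'm after, not something splitcode provides: it assumes the two selected reads are emitted as adjacent records with identical read names (as in the output above), and the helper names `set_flag` and `restore_pairing` are hypothetical.

```python
# SAM FLAG bits used here (values from the SAM specification).
PAIRED, UNMAPPED, MATE_UNMAPPED, READ1, READ2 = 0x1, 0x4, 0x8, 0x40, 0x80


def set_flag(line: str, flag: int) -> str:
    """Return a SAM record line with its FLAG column (field 2) replaced."""
    fields = line.rstrip("\n").split("\t")
    fields[1] = str(flag)
    return "\t".join(fields) + "\n"


def restore_pairing(sam_lines):
    """Yield SAM lines, flagging adjacent same-name records as R1/R2.

    ASSUMPTION: mates are adjacent and share a read name. Records with
    no adjacent mate are passed through unchanged.
    """
    pending = None  # first record of a candidate pair
    for line in sam_lines:
        if line.startswith("@"):  # header lines pass through untouched
            yield line
            continue
        if pending is None:
            pending = line
            continue
        if line.split("\t", 1)[0] == pending.split("\t", 1)[0]:
            yield set_flag(pending, PAIRED | UNMAPPED | MATE_UNMAPPED | READ1)
            yield set_flag(line, PAIRED | UNMAPPED | MATE_UNMAPPED | READ2)
            pending = None
        else:
            yield pending
            pending = line
    if pending is not None:
        yield pending


# The two records from the three-FASTQ run above, both with flag 12.
records = [
    "A01000:460:XXXXXX:1:2101:10004:10081\t12\t*\t0\t0\t*\t*\t0\t0\tACGT\t,:FF\n",
    "A01000:460:XXXXXX:1:2101:10004:10081\t12\t*\t0\t0\t*\t*\t0\t0\tGGTC\t,:ff\n",
]
fixed = list(restore_pairing(records))
for line in fixed:
    print(line, end="")
```

With the adjacency assumption holding, the two records come out with flags 77 and 141, matching the two-FASTQ run. In practice something like this could sit between `samtools view -h` and `samtools view -b`, but I would much rather have splitcode write the pairing flags itself.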

Labels: bug (Something isn't working)