Introducing sequence_data type. Align either FASTQs or BAM! #740

Merged
tmooney merged 3 commits into genome:master from tmooney:introducing_sequence_type on Aug 5, 2019

@tmooney tmooney commented Jul 18, 2019

Hot on the heels of #730 is a version that works on either FASTQ or BAM. This approach saves us from having to maintain two versions of the pipeline and retains the efficiency of piping directly into bwa from BAM files.

If this approach looks good, I'll next replace the existing alignment steps in all the workflows and probably remove the existing alignment tool CWLs.


if [[ "$MODE" == 'fastq' ]]; then
    /usr/local/bin/bwa mem -K 100000000 -t "$NTHREADS" -Y -p -R "$READGROUP" "$REFERENCE" "$FASTQ1" "$FASTQ2" | /usr/local/bin/samblaster -a --addMateTags | /opt/samtools/bin/samtools view -b -S /dev/stdin
fi

Member

Should this be an else-if in case someone defined BAM and FASTQ inputs? Then the FASTQ would have precedence over the BAM (since it's more efficient).

Member Author

Only one of the two branches will ever run. In this bash script, $MODE is set by whichever input parameter is parsed last.

Further, the CWL type definition uses "exclusive" parameters, so it will error if both BAM and FASTQs are supplied. (The two FASTQ parameters are also dependent so both must be supplied if one is. See also this user guide page for more details.)
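
To illustrate the "last parameter wins" behavior described above, here is a minimal sketch of how such a script might set $MODE while parsing its options. The flag names (-b, -1, -2) and the helper functions parse_inputs and describe_alignment are hypothetical, since the actual option parsing is not shown in this thread, and the alignment commands are replaced by echo so the sketch runs without bwa installed.

```shell
#!/bin/bash
# Sketch only: flag names and function names are assumptions for illustration,
# not the option parsing used by the script under review.

parse_inputs() {
    local opt
    local OPTIND=1   # reset so the function can be called more than once
    MODE='' BAM='' FASTQ1='' FASTQ2=''
    while getopts 'b:1:2:' opt; do
        case "$opt" in
            b) MODE='bam';   BAM="$OPTARG" ;;     # a BAM argument switches to bam mode
            1) MODE='fastq'; FASTQ1="$OPTARG" ;;  # either FASTQ argument switches to fastq mode
            2) MODE='fastq'; FASTQ2="$OPTARG" ;;
            *) return 1 ;;
        esac
    done
}

describe_alignment() {
    # MODE holds exactly one value, so at most one branch can run,
    # whether this is written as if/elif or as two separate if blocks.
    if [[ "$MODE" == 'fastq' ]]; then
        echo "align FASTQ pair: $FASTQ1 $FASTQ2"
    elif [[ "$MODE" == 'bam' ]]; then
        echo "align BAM: $BAM"
    else
        echo "no sequence input given" >&2
        return 1
    fi
}

parse_inputs -1 r1.fq -2 r2.fq && describe_alignment             # prints "align FASTQ pair: r1.fq r2.fq"
parse_inputs -b in.bam && describe_alignment                     # prints "align BAM: in.bam"
parse_inputs -1 r1.fq -2 r2.fq -b in.bam && describe_alignment   # prints "align BAM: in.bam" (last flag wins)
```

In this sketch the shell itself does not reject a mixed invocation; it simply follows the last flag. Rejecting a BAM plus FASTQ combination is left to the CWL input type, as noted above.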


@apaul7 apaul7 left a comment

+1

@tmooney tmooney merged commit 4bc0a45 into genome:master Aug 5, 2019
@tmooney tmooney deleted the introducing_sequence_type branch August 5, 2019 16:18