Use new sequence type in alignment. #741
2914f78 to b2b638d
apaul7 left a comment
Should sequence_align_and_tag_adapter.cwl be under definitions/subworkflows/ and not definitions/tools/?
My snarky answer would be "it shouldn't exist at all!", but I moved it under subworkflows 😄.
In general, +1.
One comment: the new type is sequence_data, and it has two items, sequence and readgroup. In the workflows, the inputs are always named sequence even though they are sequence_data inputs. One suggestion for consideration is to use tumor_sequence_data instead of tumor_sequence, to avoid confusion between the sequence_data input itself and its properties: the actual sequence property (i.e., BAM or FASTQ) and the readgroup property.
NOTE: I realize my distinction between input and property may not be correct. I'm happy to discuss in Slack in more detail if this is confusing.
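To make the naming suggestion above concrete, here is a minimal CWL sketch of a workflow inputs fragment. Only the names sequence_data, sequence, readgroup, tumor_sequence, and tumor_sequence_data come from this discussion; the type file path and the array type are assumptions for illustration and may not match what #740 actually defines.

```yaml
# Fragment only (steps and outputs omitted); not the workflow from this PR.
cwlVersion: v1.0
class: Workflow
requirements:
    - class: SchemaDefRequirement
      types:
          # Hypothetical location of the sequence_data record definition.
          - $import: ../types/sequence_data.yml
inputs:
    # Suggested name. With the current name (tumor_sequence), the record's
    # inner property would read as tumor_sequence.sequence, which is the
    # confusion described above.
    tumor_sequence_data:
        # Assumed to be an array of sequence_data records.
        type: ../types/sequence_data.yml#sequence_data[]
```

With this naming, an expression that reaches into one record would read as a sequence_data value with .sequence and .readgroup properties, rather than a "sequence" that itself contains a sequence.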
This is #740 plus an implementation of that concept in the pipelines that include the standard alignment subworkflows. I had to make an adapter subworkflow because Cromwell was not staging the file inputs when they were pulled directly as arguments, which led to errors about files not being found.
I've tested this on the somatic_exome.yaml example but have not run any full-scale pipelines yet.