The pipeline currently requires the user to specify at least one reference GFF. However, some users may not have, or may not want to use, genomic/reference annotations (.fa/.gff). For example, a user may want to count against a synthetic FASTA of known sequences rather than a genomic FASTA file.
The behavior of tiny-count will be changed to allow for this. If a user does not specify any GFF files:
- Each entry in the user's reference genome file is treated as a "pseudofeature", where the description line is treated as the pseudofeature's ID. The implementation will infer the ID and length of these pseudofeatures from the SN and LN tags of @SQ header lines in input SAM files.
- A GenomicArrayOfSets will be populated with the pseudofeature coordinates, just as it is with coordinates of GFF features.
- Accordingly, Stage 1 selection will be skipped in this mode.
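
As a rough illustration of the @SQ inference step described above, the sketch below reads reference names and lengths from SAM header lines and turns each into a whole-sequence interval. It uses only the standard library and is not the tiny-count implementation: the function names `parse_sq_headers` and `pseudofeature_intervals` are hypothetical, and the 0-based, half-open coordinate convention is an assumption chosen to resemble what would be loaded into a GenomicArrayOfSets.

```python
from typing import Dict, Iterable, List, Tuple


def parse_sq_headers(sam_lines: Iterable[str]) -> Dict[str, int]:
    """Collect {SN: LN} from @SQ header lines (hypothetical helper).

    Reading stops at the first non-header line, since SAM headers
    always precede the alignment records.
    """
    refs: Dict[str, int] = {}
    for line in sam_lines:
        if not line.startswith("@"):
            break  # first alignment record: header is finished
        fields = line.rstrip("\n").split("\t")
        if fields[0] != "@SQ":
            continue  # skip @HD, @PG, @CO, etc.
        # Each remaining field is TAG:VALUE; split on the first colon only
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        refs[tags["SN"]] = int(tags["LN"])
    return refs


def pseudofeature_intervals(refs: Dict[str, int]) -> List[Tuple[str, int, int]]:
    """One (id, start, end) interval per reference sequence.

    Assumption: 0-based, half-open coordinates spanning the full sequence.
    """
    return [(name, 0, length) for name, length in refs.items()]


if __name__ == "__main__":
    example = [
        "@HD\tVN:1.6\tSO:unsorted\n",
        "@SQ\tSN:synth_1\tLN:75\n",
        "@SQ\tSN:synth_2\tLN:120\n",
        "read1\t0\tsynth_1\t1\t255\t4M\t*\t0\t0\tACGT\tIIII\n",
    ]
    for interval in pseudofeature_intervals(parse_sq_headers(example)):
        print(interval)
```

Each resulting interval covers an entire reference sequence, so every alignment to that sequence overlaps exactly one pseudofeature.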