Skip to content

Prune_transcripts.py error #12

@marcasriv

Description

@marcasriv

Hi Alex,

Thanks for making the ORF-RATE pipeline public and accessible.

When running prune_transcripts.py with the following parameters in Python 2.7.15:
prune_transcripts.py --inbed annot.bed --minlen 27 --maxlen 32 genome.fa test.bam -p 10 --verbose

I am getting the following error :

[2019-09-30 11:59:51] Reading transcriptome and genome
[2019-09-30 12:00:05] Parsing sequence and count information

Traceback (most recent call last):
File "prune_transcripts.py", line 179, in
% (len(tid_summary), lowreads_dropped, len(tid_summary)-lowreads_dropped, opts.minreads, opts.peakfrac))

ValueError: All 46526 transcripts dropped due to too few reads (46526) or too many reads coming from one position (0). Consider decreasing MINREADS (currently 64) or increasing PEAKFR(currently 0.200000), or check validity of input BAM file.

I have done as suggested in the ValueError and decreased MINREADS and increased PEAKFR, but I keep getting the same error no matter what parameters I use. I've checked the input BAM file and it seems okay too. I am quite sure that the problem isn't low coverage. When doing samtools flagstat of the BAM file I get:

203414522 + 0 in total (QC-passed reads + QC-failed reads)
100572976 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
203414522 + 0 mapped (100.00% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

Please find attached a sample of the structure of my input files:
genome.fa.txt
annot.bed.txt

Thank you so much,

Marina

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions