crab-assemblathon3/downsample_chua

downsample_chua

This Python script first iterates through one of the input FASTA files to count its records. The total number of ZMW records to retain is four times that count multiplied by the percentage to retain. In our case, that was 75%.
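As a rough illustration of the arithmetic above, the threshold can be sketched as follows. The function names `count_fasta_records` and `zmw_threshold` are hypothetical and not taken from the script; the multiplier of four and the 75% default come from the description, and rounding down to a whole number is an assumption.

```python
from typing import Iterable


def count_fasta_records(lines: Iterable[str]) -> int:
    """Count FASTA records by counting header lines (those starting with '>')."""
    return sum(1 for line in lines if line.startswith(">"))


def zmw_threshold(records_in_one_file: int,
                  retain_fraction: float = 0.75,
                  multiplier: int = 4) -> int:
    """Number of ZMW records to retain.

    multiplier=4 and retain_fraction=0.75 follow the README text;
    truncating to an integer is an assumption of this sketch.
    """
    return int(multiplier * records_in_one_file * retain_fraction)
```

For example, a file with 1,000 records gives 4 × 1,000 × 0.75 = 3,000 ZMW records to retain. In practice the lines would come from an open file handle, e.g. `count_fasta_records(open(path))`, where `path` is whichever input file is counted.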

The script then loops through the FASTA files and retains the longest ZMW reads. It does this by storing the length information in a dictionary with the ZMW numbers as the values. A counter tracks the number of ZMWs seen. Once the counter reaches the threshold described above, the shortest reads are removed. If a ZMW is seen more than once, only its longest read is retained.
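The retention step described above could look roughly like the sketch below. It is not the script itself, and it rests on several assumptions: read names follow the PacBio `movie/zmw/start_end` pattern, reads arrive as `(header, sequence)` pairs, and the dictionary is keyed by ZMW number for easy lookup (the description above words the layout differently). The names `zmw_of` and `keep_longest` are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def zmw_of(header: str) -> str:
    """Extract the ZMW number, assuming names like 'movie/zmw/start_end'."""
    return header.lstrip(">").split("/")[1]


def keep_longest(reads: Iterable[Tuple[str, str]],
                 threshold: int) -> Dict[str, Tuple[str, str]]:
    """Keep at most `threshold` ZMWs, one (longest) read per ZMW."""
    best: Dict[str, Tuple[str, str]] = {}
    for header, seq in reads:
        zmw = zmw_of(header)
        # A ZMW seen before keeps whichever of its reads is longer.
        if zmw in best and len(best[zmw][1]) >= len(seq):
            continue
        best[zmw] = (header, seq)
        # Once more ZMWs are held than the threshold allows,
        # drop the ZMW whose retained read is shortest.
        if len(best) > threshold:
            shortest = min(best, key=lambda z: len(best[z][1]))
            del best[shortest]
    return best
```

Finding the shortest read with `min` scans the whole dictionary on every removal, which keeps the sketch short but would be slow on real data; a heap or a single sort at the end would be a more practical choice.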

NOTE: This script has not been tested as we decided to move forward with the C++ script created by Jonas.
