Skip to content

Assertion failed: (abs(1 - f_sum) <= 1e-3 during classification step #65

@jflot

Description

@jflot

Hello,
I am trying to run MetaMaps to classify Nanopore reads that I know originate from a mix of four genomes. I built a custom database with these four (unpublished) genomes:

perl /Users/jff/miniconda3/bin/combineAndAnnotateReferences.pl --inputFileList listspecies.txt --outputFile combinedrefs.fa --taxonomyInDirectory /Users/jff/References/NCBItaxonomy --taxonomyOutDirectory combinedtaxonomy

Input: 4 genomes and 4 taxon IDs.
Reading taxonomy from /Users/jff/References/NCBItaxonomy ..
	done.

Introduced 0 new taxonomic IDs
Annotated 3320 contigs

Output new taxonomy into combinedtaxonomy

(base) adoxa:metamaps jff$ perl buildDB.pl --DB database --FASTAs combinedrefs.fa --taxonomy combinedtaxonomy 
Can't open perl script "buildDB.pl": No such file or directory
(base) adoxa:metamaps jff$ perl /Users/jff/miniconda3/bin/buildDB.pl --DB database --FASTAs combinedrefs.fa --taxonomy combinedtaxonomy 

Number of found FASTA input files: 1

Reading taxonomy from combinedtaxonomy ..
	done.


Running count keys in gene-2-protein: 0
$VAR1 = [
          'Annotation files',
          0
        ];
$VAR2 = [
          'Protein files',
          0
        ];
Reduced taxonomy in database/taxonomy from 2416811 nodes to 31 

Produced randomized-order database sequence file database/DB.fa and
	taxon info file database/taxonInfo.txt (4 taxa).
scalar((keys %gene_2_protein)): 0
	Total protein annotations: 0 -- with sequence: 0

then

perl /Users/jff/miniconda3/bin/buildDB.pl --DB database --FASTAs combinedrefs.fa --taxonomy combinedtaxonomy 

Number of found FASTA input files: 1

Reading taxonomy from combinedtaxonomy ..
	done.


Running count keys in gene-2-protein: 0
$VAR1 = [
          'Annotation files',
          0
        ];
$VAR2 = [
          'Protein files',
          0
        ];
Reduced taxonomy in database/taxonomy from 2416811 nodes to 31 

Produced randomized-order database sequence file database/DB.fa and
	taxon info file database/taxonInfo.txt (4 taxa).
scalar((keys %gene_2_protein)): 0
	Total protein annotations: 0 -- with sequence: 0

I did the mapping step:

metamaps mapDirectly --all -r database -q ../reads20000.fa -o classification_reads20000
>>>>>>>>>>>>>>>>>>
Reference = [database]
Query = [../reads20000.fa]
Kmer size = 16
Window size = 1000
Min. read length >= 1000
Alphabet = DNA
P-value = 0.001
Percentage identity threshold = 80
Report all = 1
Target max. memory = 0
Threads = 1
Mapping output file = classification_reads20000
>>>>>>>>>>>>>>>>>>
Parameters used:
	- alphabetSize: 4
	- kmerSize: 16
	- minReadLength: 1000
	- p_value: 0.001
	- percentageIdentity: 80
	- windowSize: 1000
	- maximumMemory: ~0 GB


Call storeCurrentState with 0

INFO, skch::Map::mapQuery, [count of mapped reads, reads qualified for mapping, total input reads] = [0, 54092, 54092]
INFO, skch::Map::Map, Time spent mapping the query : 57.7196 sec
INFO, skch::Sketch::build, minimizers picked from reference = 0

Index construction DONE, wrote 1 files.

but got an the following error message during the classification step:

metamaps classify --mappings classification_reads20000 --DB database
>>>>>>>>>>>>>>>>>>
DB = database
Mappings = classification_reads20000
Min. reads for 'U' = 10000
Threads = 1
>>>>>>>>>>>>>>>>>>
Read taxonomy from database/taxonomy -- have 31 nodes.
Starting EM...
Assertion failed: (abs(1 - f_sum) <= 1e-3), function doEM, file src/meta/fEM.h, line 500.
Abort trap: 6

It looks like none of my reads mapped on my reference genomes, but when I try mapping them using minimap2 for instance they almost all map on at least one of my four reference genomes...
Thanks a lot in advance for your help.

Metadata

Metadata

Assignees

No one assigned

    Labels

    questionFurther information is requested

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions