DRAM fails in KEGG step, without KEGG database #1

@erikrikarddaniel


I am using a version of DRAM installed with Conda from GitHub (I am not sure which version). I downloaded the databases with the following command, i.e. skipping KEGG:

DRAM.py prepare_databases --output_dir DRAM_data/ --threads 16

DRAM.py print_config gives the following:

KEGG db location: None
KOfam db location: /crex/proj/sllstore2017037/nobackup/data/DRAM_data/kofam_profiles.hmm
KOfam KO list location: /crex/proj/sllstore2017037/nobackup/data/DRAM_data/kofam_ko_list.tsv
UniRef db location: /crex/proj/sllstore2017037/nobackup/data/DRAM_data/uniref90.20200215.mmsdb
Pfam db location: /crex/proj/sllstore2017037/nobackup/data/DRAM_data/pfam.mmspro
dbCAN db location: /crex/proj/sllstore2017037/nobackup/data/DRAM_data/dbCAN-HMMdb-V7.txt
RefSeq Viral db location: /crex/proj/sllstore2017037/nobackup/data/DRAM_data/refseq_viral.20200215.mmsdb
MEROPS peptidase db location: /crex/proj/sllstore2017037/nobackup/data/DRAM_data/peptidases.20200215.mmsdb
VOGDB db location: /crex/proj/sllstore2017037/nobackup/data/DRAM_data/vog_latest_hmms.txt
Description db location: /crex/proj/sllstore2017037/nobackup/data/DRAM_data/description_db.sqlite
Genome summary form location: /crex/proj/sllstore2017037/nobackup/data/DRAM_data/genome_summary_form.20200215.tsv
ETC module database location: /crex/proj/sllstore2017037/nobackup/data/DRAM_data/etc_mdoule_database.20200215.tsv
Function heatmap form location: /crex/proj/sllstore2017037/nobackup/data/DRAM_data/function_heatmap_form.20200215.tsv
AMG database location: /crex/proj/sllstore2017037/nobackup/data/DRAM_data/amg_database.20200215.tsv
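
Only the KEGG entry is None in this listing. As a quick check, the unset locations can be picked out of the print_config output with a few lines of Python. This is a minimal sketch that assumes the "name: value" line format shown above; `unset_locations` is a hypothetical helper and not part of DRAM.

```python
def unset_locations(config_text):
    """Return the names of config entries whose value is the literal 'None'.

    Assumes one 'name: value' pair per line, as in the listing above.
    """
    missing = []
    for line in config_text.splitlines():
        # Split on the first ': ' only, so paths containing ':' stay intact.
        name, sep, value = line.partition(": ")
        if sep and value.strip() == "None":
            missing.append(name.strip())
    return missing


if __name__ == "__main__":
    sample = (
        "KEGG db location: None\n"
        "Pfam db location: /some/dir/pfam.mmspro\n"  # illustrative path
    )
    print(unset_locations(sample))
```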

When I run DRAM.py annotate -i 'MAGs/*.fna' -o annotation --threads 16, I get the following error, suggesting that it tries KEGG annotation in the absence of a database:

2020-02-16 09:56:20.361333: Annotation started
0:00:00.016704: 293 fastas found
0:00:00.281481: Retrieved database locations and descriptions
0:00:00.281504: Annotating OX3.63.fa.edit
0:00:00.283347: Filtering fasta
0:00:00.876378: Calling genes with prodigal
0:01:02.438230: Turning genes from prodigal to mmseqs2 db
0:03:29.291201: Getting forward best hits from kegg
Traceback (most recent call last):
File "/home/daniel/miniconda3/envs/dram/bin/DRAM.py", line 7, in <module>
exec(compile(f.read(), __file__, 'exec'))
File "/domus/h1/daniel/dev/DRAM/scripts/DRAM.py", line 159, in <module>
args.func(**args_dict)
File "/domus/h1/daniel/dev/DRAM/mag_annotator/annotate_bins.py", line 826, in annotate_bins
verbose))
File "/domus/h1/daniel/dev/DRAM/mag_annotator/annotate_bins.py", line 675, in annotate_fasta
verbose))
File "/domus/h1/daniel/dev/DRAM/mag_annotator/annotate_bins.py", line 612, in do_blast_style_search
threads, verbose=verbose)
File "/domus/h1/daniel/dev/DRAM/mag_annotator/annotate_bins.py", line 68, in get_best_hits
verbose=verbose)
File "/domus/h1/daniel/dev/DRAM/mag_annotator/utils.py", line 34, in run_process
stderr=stderr).stdout.decode(errors='ignore')
File "/home/daniel/miniconda3/envs/dram/lib/python3.7/subprocess.py", line 488, in run
with Popen(*popenargs, **kwargs) as process:
File "/home/daniel/miniconda3/envs/dram/lib/python3.7/subprocess.py", line 800, in __init__
restore_signals, start_new_session)
File "/home/daniel/miniconda3/envs/dram/lib/python3.7/subprocess.py", line 1482, in _execute_child
restore_signals, start_new_session, preexec_fn)
TypeError: expected str, bytes or os.PathLike object, not NoneType

I have found no documented way of disabling the KEGG annotation.
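
The traceback is consistent with a None database location being passed straight into a subprocess argument list. The minimal sketch below raises a TypeError in the same way and shows the kind of guard that would skip a search step when no database is configured. The names `run_search` and `search_if_configured` are hypothetical and do not come from DRAM's code, and the child process is just a Python one-liner standing in for the real search tool.

```python
import subprocess
import sys


def run_search(db_path):
    """Hypothetical stand-in for a search step that shells out with a database path."""
    # If db_path is None, subprocess raises a TypeError while building the
    # argument list, before the child process is ever started.
    result = subprocess.run(
        [sys.executable, "-c", "print('searched')", db_path],
        stdout=subprocess.PIPE,
    )
    return result.stdout.decode(errors="ignore")


def search_if_configured(name, db_path):
    """Run the search only when a database location is configured."""
    if db_path is None:
        return f"skipping {name}: no database configured"
    return run_search(db_path)


if __name__ == "__main__":
    print(search_if_configured("kegg", None))
    try:
        run_search(None)
    except TypeError as err:
        print(f"unguarded call failed: {err}")
```

A guard like this at the point where the database location is read would turn the crash into a skipped step, which is the behaviour I expected when KEGG is not installed.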

BTW, I believe the documentation for downloading the databases has an error: the command shown for downloading with KEGG is the same as the one without.
