fix(appris/mane): allow data source to determine availability#157
@ovesh What's the process now to upload new files? I'll also need write access to the DG plugins.
Previously, GK hardcoded the filenames to encode which upstream version was used to build APPRIS/MANE. The issue with this is that older versions of GK could not be updated to use new APPRIS/MANE (i.e., a new package with updated locator code needed to be released). This change brings APPRIS/MANE in line with annotations: if they exist on the data source, then they are available to use in GK. BREAKING CHANGE: this change also fixes gencode.v41 to use the correct source (e107 now instead of e103).
cd58d14 to 1d51fe7
For now, uploads to the open source bucket have to go through me.
if name.endswith(".2bit") or name.endswith(".cfg")
)

names.update(get_genomes(os.listdir(self.data_dir)))
This is a good change (list_available_genomes now also returns genomes available only in the local dir). Can you explicitly add that to the PR description?
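A minimal sketch of the behaviour discussed in this comment, not GK's actual code. It assumes `get_genomes` keeps names ending in `.2bit` or `.cfg` and strips the extension; the stripping and the standalone helper `list_available_genomes(remote_files, data_dir)` are hypothetical and only illustrate merging the data source listing with the local directory:

```python
import os


def get_genomes(filenames):
    # Assumed behaviour: keep genome files and drop the file extension.
    return {
        name.rsplit(".", 1)[0]
        for name in filenames
        if name.endswith(".2bit") or name.endswith(".cfg")
    }


def list_available_genomes(remote_files, data_dir):
    # Hypothetical helper: union of genomes found on the data source
    # and genomes that exist only in the local data directory.
    names = get_genomes(remote_files)
    names.update(get_genomes(os.listdir(data_dir)))
    return sorted(names)
```

With this shape, a genome that has been built locally but not yet uploaded still shows up in the listing.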
genome_kit/gk_data.py
Outdated
    DataManagerImpl = list(eps)[0].load()
except:
    DataManagerImpl = DefaultDataManager
DataManagerImpl = DefaultDataManager
Won't this change break code that relies on implicitly loading the data manager from a plugin?
Sorry, this was test code.
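For context, the implicit plugin loading this thread is concerned with can be sketched as below. This is an illustration under assumptions, not the code in genome_kit/gk_data.py: `pick_data_manager` is a hypothetical helper, and `eps` is taken to be an iterable of entry-point-like objects exposing `.load()`.

```python
def pick_data_manager(eps, default):
    # Use the first plugin-provided data manager if one is registered
    # and loads cleanly; otherwise fall back to the default.
    try:
        return list(eps)[0].load()
    except Exception:
        # No plugin registered (IndexError) or the plugin failed to load.
        return default
```

An unconditional assignment after the `try`/`except` would always override the plugin, which is the breakage the reviewer asks about.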
Note: It's hard to see in the GitHub UI, but the file moves in this PR all just remove the APPRIS version from the file name, e.g.
appris.2018_12.v28_gencode.v29.mini.v7.pkl => appris.gencode.v29.mini.v7.pkl
Can this also be explicitly called out in the PR description?
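The rename can be sketched as a single regex substitution. This assumes the dropped segment always looks like `YYYY_MM.vNN_`, as in the one example given; `strip_appris_version` is a hypothetical name and is not part of GK:

```python
import re

# Assumed shape of the dropped segment, inferred from the single example:
# "appris." + "<YYYY>_<MM>.v<N>_" + "<annotation>..."
_APPRIS_VERSION = re.compile(r"^appris\.\d{4}_\d{2}\.v\d+_")


def strip_appris_version(filename):
    # Remove the APPRIS release segment; names without it are returned unchanged.
    return _APPRIS_VERSION.sub("appris.", filename, count=1)
```

Applied to `appris.2018_12.v28_gencode.v29.mini.v7.pkl`, this yields `appris.gencode.v29.mini.v7.pkl`.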
I'll maybe leave you to squash it (when ready), as I'm not sure how to upload the data ahead of a release anymore.
Previously, GK hardcoded the filenames to encode which upstream version was used to build APPRIS/MANE. The issue with this is that older versions of GK could not be updated to use new APPRIS/MANE (i.e., a new package with updated locator code needed to be released).
This change brings APPRIS/MANE in line with annotations: if they exist on the data source, then they are available to use in GK.
To achieve this, all the APPRIS/MANE filenames now only contain the annotation and the binary GK version to make things discoverable without the hardcoded mappings.
A bundled change also lists any local genomes in genome.list_available_genomes: this helps with building APPRIS/MANE on any annotations that have yet to be uploaded.
Drive-by change: this change also fixes gencode.v41 to use the correct source (e107 now instead of e103).
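A rough sketch of how the filename scheme in this description could make APPRIS/MANE discoverable without a hardcoded mapping. The pattern `appris.<annotation>.v<binary version>.pkl` is inferred from the single example in this PR, and both function names are hypothetical:

```python
def appris_filename(annotation, binary_version):
    # Assumed naming scheme: only the annotation and the binary GK version.
    return f"appris.{annotation}.v{binary_version}.pkl"


def appris_available(annotation, binary_version, data_source_files):
    # Availability is decided by what the data source actually contains,
    # not by a table baked into the package.
    return appris_filename(annotation, binary_version) in set(data_source_files)
```

Under these assumptions, uploading a new file to the data source is enough for an already released GK to pick it up.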