fix(appris/mane): allow data source to determine availability#157

Merged: ovesh merged 2 commits into main from fix/allow_appris_version_overrides on May 6, 2025

s22chan (Collaborator) commented Apr 18, 2025

Previously, GK hardcoded filenames that encoded which upstream version was used to build APPRIS/MANE. As a result, older versions of GK could not be updated to use newer APPRIS/MANE data (i.e., a new package with updated locator code had to be released).

This change brings APPRIS/MANE in line with annotations: if they exist on the data source, they are available to use in GK.

To achieve this, all APPRIS/MANE filenames now contain only the annotation and the binary GK version, which makes the files discoverable without the hardcoded mappings.
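
As a rough illustration of how discovery could work without hardcoded mappings, the sketch below scans a listing of filenames and returns the annotations that have a matching file. The filename pattern is an assumption inferred from the single example name in this PR (`appris.gencode.v29.mini.v7.pkl`), and `available_annotations` is a hypothetical helper, not GK's actual locator code.

```python
import re


def available_annotations(filenames, prefix, binary_version):
    """Return annotations that have a `prefix` file for `binary_version`.

    Assumes names look like <prefix>.<annotation>.mini.v<binary>.pkl,
    a layout inferred from one example; the real GK layout may differ.
    """
    pattern = re.compile(
        r"^" + re.escape(prefix) + r"\.(?P<annotation>.+)\.mini\.v(?P<binary>\d+)\.pkl$"
    )
    found = set()
    for name in filenames:
        match = pattern.match(name)
        # Keep only files built for the requested binary version.
        if match and int(match["binary"]) == binary_version:
            found.add(match["annotation"])
    return found


listing = [
    "appris.gencode.v29.mini.v7.pkl",
    "appris.gencode.v41.mini.v7.pkl",
    "appris.gencode.v29.mini.v6.pkl",
    "hg38.2bit",
]
print(sorted(available_annotations(listing, "appris", 7)))
```

With this approach, uploading a new file to the data source is enough to make it visible to an already-released package, since nothing in the code names a specific upstream version.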

A bundled change also makes genome.list_available_genomes list any local genomes: this helps with building APPRIS/MANE on annotations that have yet to be uploaded.

Drive-by change: this change also fixes gencode.v41 to use the correct source (e107 instead of e103).


@s22chan s22chan requested a review from ovesh April 18, 2025 18:38
s22chan (Collaborator, Author) commented Apr 18, 2025

@ovesh What's the process now to upload new files?

I'll also need write access to the DG plugins.

Previously, GK hardcoded filenames that encoded which upstream version was used to build APPRIS/MANE. As a result, older versions of GK could not be updated to use newer APPRIS/MANE data (i.e., a new package with updated locator code had to be released).

This change brings APPRIS/MANE in line with annotations: if they exist on the data source, they are available to use in GK.

BREAKING CHANGE: this change also fixes gencode.v41 to use the correct source (e107 instead of e103).
@s22chan s22chan force-pushed the fix/allow_appris_version_overrides branch from cd58d14 to 1d51fe7 Compare April 18, 2025 19:04
ovesh (Contributor) commented Apr 22, 2025

> @ovesh What's the process now to upload new files?

For now, uploads to the open source bucket have to go through me.

if name.endswith(".2bit") or name.endswith(".cfg")
)

names.update(get_genomes(os.listdir(self.data_dir)))
ovesh (Contributor) commented:

This is a good change (list_available_genomes also returns genomes available only in the local dir). Can you explicitly add that to the PR description?
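
For readers skimming the thread, the behavior described here can be sketched roughly as follows. This is a simplified stand-in, not the code from the diff: `get_genomes` reuses the `.2bit`/`.cfg` suffix filter shown above, but stripping the extension to get a genome name and passing both listings in as plain lists are assumptions made for illustration.

```python
import os


def get_genomes(names):
    # Keep only genome-like files, mirroring the suffix filter in the diff.
    # Dropping the extension to form the genome name is an assumption.
    return {
        os.path.splitext(name)[0]
        for name in names
        if name.endswith(".2bit") or name.endswith(".cfg")
    }


def list_available_genomes(remote_names, local_names):
    # Union of genomes on the data source and genomes present only locally.
    names = get_genomes(remote_names)
    names.update(get_genomes(local_names))
    return sorted(names)


print(list_available_genomes(["hg38.2bit", "README.txt"], ["custom.cfg", "hg38.2bit"]))
```

In practice the local listing would come from something like `os.listdir(data_dir)`, as in the diff line above.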

DataManagerImpl = list(eps)[0].load()
except:
DataManagerImpl = DefaultDataManager
DataManagerImpl = DefaultDataManager
ovesh (Contributor) commented:

Won't this change break code that relies on implicitly loading the data manager from a plugin?

s22chan (Collaborator, Author) commented:

sorry this was test code

ovesh (Contributor) commented:

Note: It's hard to see in the GitHub UI, but the file moves in the PR all just remove the APPRIS version from the file name, e.g.
appris.2018_12.v28_gencode.v29.mini.v7.pkl => appris.gencode.v29.mini.v7.pkl

Can this also be explicitly called out in the PR description?
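
The rename can be expressed as a small transformation. The sketch below is only an illustration: the old-name pattern (`appris.<YYYY_MM>.v<N>_<rest>`) is inferred from the single example above, and `strip_appris_version` is a hypothetical helper rather than something in the PR.

```python
import re

# Assumed old layout: appris.<YYYY_MM>.v<N>_<rest>, inferred from one example.
_OLD_APPRIS_RE = re.compile(r"^appris\.\d{4}_\d{2}\.v\d+_(?P<rest>.+)$")


def strip_appris_version(filename):
    """Drop the upstream APPRIS version segment from an old-style filename."""
    match = _OLD_APPRIS_RE.match(filename)
    if match is None:
        # Already in the new layout (or not an APPRIS file): leave untouched.
        return filename
    return "appris." + match["rest"]


print(strip_appris_version("appris.2018_12.v28_gencode.v29.mini.v7.pkl"))
```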

s22chan (Collaborator, Author) commented Apr 22, 2025

I'll maybe leave you to squash it (when ready), as I'm not sure how to upload the data ahead of a release anymore.

@ovesh ovesh merged commit 41c1fbe into main May 6, 2025
12 checks passed
@ovesh ovesh deleted the fix/allow_appris_version_overrides branch May 6, 2025 17:01