-
Notifications
You must be signed in to change notification settings - Fork 10
Add ChemicalSystem.isin method
#608
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
pre-commit.ci autofix |
for more information, see https://pre-commit.ci
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #608 +/- ##
=======================================
Coverage 99.03% 99.04%
=======================================
Files 40 40
Lines 2394 2410 +16
=======================================
+ Hits 2371 2387 +16
Misses 23 23 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
IAlibay
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a utility standpoint, this does everything we need it to. So I'm giving it my thumbs up with that hat on.
The only thing I'm not sure about is if the return should be tuple[bool, list[Component]] or just a list that could be empty if you're asking for return_matches.
gufe/chemicalsystem.py
Outdated
| matches.append(comp) | ||
|
|
||
| if return_matches: | ||
| return len(matches) > 0, matches |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just adding this here so that @atravitz can weigh in.
I'm not 100% sure if it might not be better for this to be just a return on matches. That way the return width is always 1 and len(matches) == 0 just means no matches.
edit: by not sure, I really do mean it, I can't tell which is more Pythonic.
atravitz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as discussed offline, I prefer the suggestion of returning an empty list if there are no matches.
|
pre-commit.ci autofix |
for more information, see https://pre-commit.ci
|
pre-commit.ci autofix |
for more information, see https://pre-commit.ci
|
@jthorton I'd like to step back and understand what problem we're trying to solve with this. Language Note: I believe this function should be named See pandas isin and contains as reference. 1. Checking for a specific Component present in a Chemical SystemCurrently, to check for the existence of a specific specific_component in my_chemical_system.components.values()which I personally don't think needs a wrapper function, but I'm open to it, since I believe that's what is requested in #566. If we something that's fast on large systems (by exiting after the first match found) and just returns a class ChemicalSystem
...
def contains(self, value: type[Component] | Component) -> bool:where you don't even build up a list, just return a 2. Checking for any instance of a Component present in a Chemical System2a)The 2b)Alternatively , we add a function SolventComponent in my_chemical_system.get_component_types()which would make the septop code way simpler: required_component_types = [SolventComponent, ProteinComponent]
for component_type in required_component_types:
if component_type not in state3. Retrieving all instances of a Component Type that are present in a Chemical SystemI think this makes sense as a separate function, since the (from the tests) isin, matches = solvated_complex.isin(prot_comp, return_matches=True)
assert isin is True
assert matches == [prot_comp]In this case, I agree it'd be a helpful utility to have a function that returns all instances of given type within a chemical system, and this already is used in the We can flip that around and make it a method of class ChemicalSystem
...
def get_components_of_type(self, type:type[Component])->list[Component]:
# insert logic you already wrote for thisseptop already uses this functionality, so we'd just be polishing it and putting it upstream in gufe |
|
Oh, this is great. I also wrote a utility function for this in feflow for the protein mutation protocol. Just in case you find anything useful there, but also great to see we all need this functionality in some way. https://github.com/OpenFreeEnergy/feflow/blob/1bcfdd31d069f166adfa616fdc45faf4aa71085d/feflow/utils/misc.py#L12 I like the way to extract specific types of components instead of getting a list of all the types that we then have to dissect further. The main difference from my approach is that I'm using a |
atravitz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great @jthorton! Just a couple things to address and some non-blocking thoughts. And optionally get @hannahbaumann and @ijpulidos feedback since they'll be users.
Co-authored-by: Alyssa Travitz <31974495+atravitz@users.noreply.github.com>
|
pre-commit.ci autofix |
for more information, see https://pre-commit.ci
atravitz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great! I just fixed some docs formatting. merge when ready @jthorton
Fixes #566 by implementing an isin method on a ChemicalSystem which can check if an instance of type of a Component are present in the system and optionally return any matches.
Tips
Since this will create a commit, it is best to make this comment when you are finished with your work.
Checklist
newsentryDevelopers certificate of origin