Conversation
|
I still need to write quite a lot of tests, but here is already the demo I am using to write the thing: https://gist.github.com/jbarnoud/80c13803b551ce4853eb1b5cb3d9e4a8 |
|
For test purposes I would need a system with a ridiculously large number of segments. Is there a way to alter a universe to reasign segments? I could write a function that takes a universe and recreate a |
|
@jbarnoud you should be able to use |
package/MDAnalysis/core/groups.py
Outdated
| Group with elements of `self` and `other` concatenated | ||
|
|
||
| """ | ||
| def _get_other_index(self, other): |
There was a problem hiding this comment.
I don't like this method because it does two things at once (both the level check, as well as returned .ix). I do like centralising the level check so we get consistent error messages etc, but I'd prefer something like a decorator:
@only_same_level
def __add__(self, other):
# at this point, same_level check has happened
@only_same_level
def union(self, other):
etcOr if not a decorator, we could have a level check function inside each method
There was a problem hiding this comment.
The decorator is a good idea indeed. I'll move to that.
|
|
||
| @staticmethod | ||
| def make_groups(u, level): | ||
| a = getattr(u, level)[10:100] |
There was a problem hiding this comment.
If you use the default make_Universe() system, you should be able to get away with using only 5 segments to test sets. There's nothing about the set operations which requires more than 5 objects
There was a problem hiding this comment.
make_universe seems indeed to be the solution. I'll see how to reduce the number of element, I was just being greedy 😃
| yield self._test_union, a, b, c, d | ||
| yield self._test_intersection, a, b, c, d | ||
| yield self._test_difference, a, b, c, d | ||
| yield self._test_symmetric_difference, a, b, c, d |
There was a problem hiding this comment.
I know you've probably already thought of this, but we also need predictable behaviour when we work with duplicate items in a Group. Ie u.atoms[[0, 1, 2, 1, 2]] | u.atoms[[3, 4]] == u.atoms[[0, 1, 2, 3, 4]]
package/MDAnalysis/core/groups.py
Outdated
| | ``s.issuperset(t)`` | | test if all elements of | | ||
| | | | ``t`` are part of ``s`` | | ||
| +-------------------------------+------------+----------------------------+ | ||
| | ``s.concatenate(t)`` | ``s + t`` | new Group with elements | |
There was a problem hiding this comment.
Probably worth mentioning that this preserves order (and duplicates?), whereas .union will return a sorted unique version
There was a problem hiding this comment.
I plan to split the table between set operations and not set operations.
| | ``s.difference(t)`` | ``s - t`` | new Group with elements of | | ||
| | | | ``s`` that are not in ``t``| | ||
| +-------------------------------+------------+----------------------------+ | ||
| | ``s.symmetric_difference(t)`` | ``s ^ t`` | new Group with elements | |
There was a problem hiding this comment.
It'd be cool if the symbol equivalent worked, you could override __xor__ etc
There was a problem hiding this comment.
This is planned, indeed.
|
We had some discussion about |
|
Time to step away for a bit. I think I dressed all of @richardjgowers comments:
In addition, I added the All the set operators are implemented. I still have some testing to do. Especially, I need to test that levels cannot be mixed, and that the methods work when provided an elements. I also need to fix the error messages in the |
|
I did not notice that |
|
I think it is ready for review. I'll rebase once it is ready to be merged. |
package/MDAnalysis/core/groups.py
Outdated
| the same order. Groups that are not at the same level or that belong | ||
| to different universe cannot be compared. | ||
| """ | ||
| o_ix = other.ix_array |
There was a problem hiding this comment.
Here you probably need to use .ix otherwise Atoms evaluate equal to AtomGroups, eg u.atoms[1] == u.atoms[[1]]
substract -> subtract reworked subtract to use numpy set operations tried to use self.ix and self.universe rather than shortcutting
Suggested changes for PR #1239
substract -> subtract reworked subtract to use numpy set operations tried to use self.ix and self.universe rather than shortcutting
|
I rebased, squashed, fixed the conflict, and force pushed. |
|
Ok this is a big change(tm) so we should get a +2, poke @MDAnalysis/coredevs One thing I thought of looking at it today, we implement |
| | | | common to ``s`` and ``t`` | | ||
| +-------------------------------+------------+----------------------------+ | ||
| | ``s.difference(t)`` | | new Group with elements of | | ||
| | | | ``s`` that are not in ``t``| |
There was a problem hiding this comment.
Wouldn't s - t be the equivalent here?
There was a problem hiding this comment.
Ah, I see why not: the treatment as set.
There was a problem hiding this comment.
Nope, currently its:
ag1 = u.atoms[[1, 2, 1, 2, 3, 4]]
ag2 = u.atoms[[3, 4, 5]]
ag1 + ag2 == [1, 2, 1, 2, 3, 4, 3, 4, 5]
ag1 & ag2 == [3, 4]
ag1 | ag2 == [1, 2, 3, 4, 5]
ag1 - ag2 == [1, 2, 1, 2] # ie take from ag1 unless in ag2@jbarnoud Maybe we should make __sub__ == difference and leave subtract as method only access?
There was a problem hiding this comment.
@richardjgowers I went back and forth to many time about it, I cannot have an opinion on that anymore. The current version follows @orbeckst suggestion.
There was a problem hiding this comment.
I vote for the current status. Since + is not a set operator I think that it's more intuitive that - also isn't.
There was a problem hiding this comment.
Alternatively, we could use another unused operator for difference() (such as %).
Not sure I love such a hackish idea, though.
There was a problem hiding this comment.
I think we are converging on using - for the set difference.
|
An other thing people use to set operators can trip on is that sets overload the comparison operators to test if a set is a subset/superset of an other. I chose not to do the same here, to avoid confusion with how the comparison operators are used for components in mda. All these symbol ambiguity may be annoying for a user that often use sets and their operators. On the other hand, I mostly wrote these methods to be able to do this: dppc_upper = upper_leaflet & dppc |
|
Is '/' already used?
'\' would be the "proper" set difference operator but that would be super-stupid to use in Python…
… On 14 Mar, 2017, at 05:22, mnmelo ***@***.***> wrote:
Alternatively, we could use another unused operator for difference() (such as %).
Not sure I love such a hackish idea, though.
--
Oliver Beckstein * orbeckst@gmx.net
skype: orbeckst * orbeckst@gmail.com
|
|
@orbeckst |
|
We can't use symbols differently to the rest of Python, ie people shouldn't have to relearn the language because of our design choices. As far as I can see we can either do:
I think I'm in camp 2 so that @jbarnoud 's snippet of |
|
I am fine with keeping
EDIT: I mean, assign So yes: Option 2 in #1239 (comment) |
orbeckst
left a comment
There was a problem hiding this comment.
Looking really nice as far as I can tell.
I am just blocking it until the - operator is resolved.
package/MDAnalysis/core/groups.py
Outdated
| | ``s.concatenate(t)`` | ``s + t`` | new Group with elements | | ||
| | | | from ``s`` and from ``t`` | | ||
| +-------------------------------+------------+----------------------------+ | ||
| | ``s.subtract(t)`` | ``s - t`` | new Group with elements | |
There was a problem hiding this comment.
remove - operator for subtract(). (see discussion)
package/MDAnalysis/core/groups.py
Outdated
| | | | contain the same elements | | ||
| | | | in the same order | | ||
| +-------------------------------+------------+----------------------------+ | ||
| | ``x in s`` | | test if ``x`` is part of | |
There was a problem hiding this comment.
What does "part of s" mean? In exact contigious order or "any elements of x anywhere in s"?
Probably the former when I compare this to issubset() below but that could be made clearer.
There was a problem hiding this comment.
I'll clarify this. It means component x is part of group s.
| | | | common to ``s`` and ``t`` | | ||
| +-------------------------------+------------+----------------------------+ | ||
| | ``s.difference(t)`` | | new Group with elements of | | ||
| | | | ``s`` that are not in ``t``| |
There was a problem hiding this comment.
I think we are converging on using - for the set difference.
package/MDAnalysis/core/groups.py
Outdated
| return other.ix in self.ix | ||
|
|
||
| def __sub__(self, other): | ||
| return self.subtract(other) |
There was a problem hiding this comment.
change to self.difference(other)
| """ | ||
| return self._ix | ||
|
|
||
| @property |
There was a problem hiding this comment.
Why not just call it ix? The added _array sounds superfluous.
But I admit to my ignorance of the topology system: apparently there's also a ix already defined and that's not what you want? Under which circumstances is ix_array != ix? For Atom? Add some more docs. Perhaps add a SeeAlso or link to :meth:ix.
There was a problem hiding this comment.
self.ix has a different meaning in ComponentBase and in GroupBase. For groups, self.ix == self.ix_array; both are the array of indices of the components. For components, instead, self.ix is the index of the component as an integer. self.ix_array gives a common interface between groups and components to get the index as an array without having to check for the type.
I'll add a word in the doc.
| @_only_same_level | ||
| def difference(self, other): | ||
| """Elements from this Group that do not appear in another | ||
|
|
There was a problem hiding this comment.
This is not the same as subtract.
One can also use -:
ag1 - ag2or
ag1.difference(ag2)| @_only_same_level | ||
| def symmetric_difference(self, other): | ||
| """Group of elements which are only in one of this Group or another | ||
|
|
| @_only_same_level | ||
| def intersection(self, other): | ||
| """Group of elements which are in both this Group and another | ||
|
|
| return a, b, c, d, e | ||
|
|
||
| @staticmethod | ||
| def make_groups_duplicated_and_scrumbled(u, level): |
There was a problem hiding this comment.
I like "scrumbled" (... even if you mean "scrambled") ;-)
There was a problem hiding this comment.
Argghh. I wrote it with a french accent.
| assert_(str(sg) == '<SegmentGroup [<Segment 4AKE>]>') | ||
|
|
||
|
|
||
|
|
There was a problem hiding this comment.
Impressive battery of tests!
The only thing that seems to be missing is tests for the operator versions, -, &, |, ^.
There was a problem hiding this comment.
Their binding is tested test_shortcut_overriding.
|
Does the change of behavior for |
|
@kain88-de this is purely adding things, ie |
| class GroupBase(_MutableBase): | ||
| """Base class from which a Universe's Group class is built. | ||
|
|
||
| Instances of :class:`GroupBase` provide the following operations that |
There was a problem hiding this comment.
We should link to these docs from another place. People who use MDAnalysis will most likely not look at the docs for the low level classes like GroupBase. The set operations are a quite neat addition and others will likely want to use them. So should this be linked/show-cased in the normal doc rst files or should we just link to this from AtomGroup/ResidueGroup/SegmentGroup?
There was a problem hiding this comment.
I was definitely going to include these operations in #1190 which is the tutorial/introduction docs. But yeah, adding them to GroupBase is effectively hiding them from users
There was a problem hiding this comment.
Can we link to this from the high-level groups? I haven't done a lot of cross linking in rst docs. We could definitely add a See Also section to the high level groups.
|
So the consensus is |
orbeckst
left a comment
There was a problem hiding this comment.
Minor comments but you addressed all my previous questions.
package/MDAnalysis/core/groups.py
Outdated
| | ``x in s`` | | test if ``x`` is part of | | ||
| | | | ``s`` | | ||
| +-------------------------------+------------+----------------------------+ | ||
| | ``x not in s`` | | test if ``x`` is not part | |
There was a problem hiding this comment.
not in is not really an operator. It is just the logical opposite of in.
There was a problem hiding this comment.
Is it worthwhile pointing out in the docs that this works as expected?
I am fine with leaving it out as you did, given that everything "just works"™ in a pythonic fashion.
There was a problem hiding this comment.
Hopefully it "just works".
package/MDAnalysis/core/groups.py
Outdated
| AtomGroups can be compared and combined using group operators. For | ||
| instance, AtomGroups can be concatenated:: | ||
|
|
||
| ag_concat = ag1 + ag2 # or ag_concat = ag1.concat(ag2) |
There was a problem hiding this comment.
I wouldn't put the alternative in a comment. Either a separate code line or mention it in text with a link to concat().
| treats the AtomGroups as sets, duplicates are removed from the resulting | ||
| group, and atoms are ordered:: | ||
|
|
||
| ag_union = ag1 | ag2 # or ag_union = ag1.union(ag2) |
| new :class:`AtomGroup` for multiple matches. This makes it | ||
| difficult to use the feature consistently in scripts. | ||
|
|
||
| AtomGroups can be compared and combined using group operators. For |
package/MDAnalysis/core/groups.py
Outdated
| :meth:`difference` is the set version of :meth:`subtract`. The resulting | ||
| group is sorted and deduplicated. | ||
|
|
||
| All set methods are listed in the table bellow. These methods treats the |
| | ``s.difference(t)`` | ``s - t`` | new Group with elements of | | ||
| | | | ``s`` that are not in ``t``| | ||
| +-------------------------------+------------+----------------------------+ | ||
| | ``s.symmetric_difference(t)`` | ``s ^ t`` | new Group with elements | |
|
I started to actually use these operators for real applications. Here is just a little example because I relly enjoy the feature: # Split lipids by leaflet and lipid type
lipids = saturated + unsaturated
# We assume that each lipid is a single residue with its atoms
# ordered from the head to the tails
top_head = sum(r.atoms[0] for r in lipids.residues)
top_head_z = top_head.positions[:, 2]
upper_mask = top_head_z > top_head_z.mean()
upper = lipids.residues[upper_mask].atoms
lower = lipids - upper
upper_saturated = upper & saturated
lower_saturated = lower & saturated
upper_unsaturated = upper & unsaturated
lower_unsaturated = lower & unsaturated |
|
Blog post!!! |
|
@jbarnoud can you do a final clean up of the history and then I am happy to merge (unless anyone else has any further comments). |
|
Haha yeah, blog it!
…On Wed, 15 Mar 2017, 8:47 p.m. Oliver Beckstein, ***@***.***> wrote:
Blog post!!!
—
You are receiving this because your review was requested.
Reply to this email directly, view it on GitHub
<#1239 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AI0jBzCfsi8BXyZ91Wbl10wwb5uyoW8Sks5rmE5VgaJpZM4MYhkw>
.
|
See #726 This commit introduce several method to the GroupBase class, including all the set operators, and a subtract method. Some operator symbols are overridden.
substract -> subtract reworked subtract to use numpy set operations tried to use self.ix and self.universe rather than shortcutting
|
@orbeckst I reworked the history. There is now 4 commits instead on plenty. I did not squash everything together to avoid loosing track @richardjgowers' contribution. |
|
Perfect! I'll just wait for travis, just to make extra sure. |
Fixes #726
Add operators, notably but not only set operators, to GroupBase. Change also the meaning of the
==operator: previously,a == bwas testing instance identoty (same asa is b), now two groups are equals if they are at the same level, from the same universe, and with the same elements in the same order.I would like to change the behavior of the
inoperator. I am not sure of what it does now:I would like to be able to do things like
atom in residue_group, orresidue in segment.PR Checklist