Skip to content

plane and axis centering transformations#1973

Closed
davidercruz wants to merge 22 commits intoMDAnalysis:developfrom
davidercruz:more-translations
Closed

plane and axis centering transformations#1973
davidercruz wants to merge 22 commits intoMDAnalysis:developfrom
davidercruz:more-translations

Conversation

@davidercruz
Copy link
Contributor

Adds plane and axis centering transformations

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

@coveralls
Copy link

coveralls commented Jul 7, 2018

Coverage Status

Coverage decreased (-0.03%) to 89.922% when pulling e066f17 on davidercruz:more-translations into 1a1b2f1 on MDAnalysis:develop.

Copy link
Contributor

@jbarnoud jbarnoud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops! I forgot to click "submit". I am very sorry about that. It is great that Github saves pending reviews!!!!!


Returns
-------
:class:`~MDAnalysis.coordinates.base.Timestep` object
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line could be just MDAnalysis.coordinates.base.Timestep without the :class: and "object". See https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard.

is centered on an axis. This axis can only be parallel to the x, y or z planes.

Example
-------
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to tell what your example is doing, or I just see random code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the example is aimed at the user which is not really supposed to use the function directly on a TimeStep. Your example should probably use the transformation feature.

center = np.asarray([_origin[0], ag_center[1], _origin[2]], np.float32)
if axis == 'z':
center = np.asarray([_origin[0], _origin[1], ag_center[2]], np.float32)
vector = center - ag_center
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we guaranteed that the dtype of vector will be float32?

Copy link
Contributor Author

@davidercruz davidercruz Jul 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

center_of_mass and center_of_geometry output an array of dtype float32. I can typecast it again, but it isn't really necessary.

ps - unless someone changes the center_of functions in the future. If that changes than tests will likely fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to typecast again. Though, it would be good to have a test for each transformation to make sure the coordinates have the correct dtype.

----------
ag: AtomGroup
atom group to be centered on the unit cell.
plane: str
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'x', 'y', and 'z' are not planes so your explanation is unclear. I guess if I want to center on the 'XY' plane, I need to provide 'z'; but I am not really sure. This makes for an unclear API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still an issue.

Copy link
Contributor Author

@davidercruz davidercruz Jul 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Centering on xz,xy and yz planes is not trivial. Since these planes cross more than 1 axis, the center of these planes is the center of the cross-section area formed by these planes in the unit cell.
Currently, the code only works with planes that are parallel to x=0, y=0 or z=0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I see it, assuming a orthogonal box, centering on XY means you do not change the Z coordinate. It is always tricky to have a mental picture of an arbitrary triclinic box, but assuming the lower triangle of the box matrix is 0, then it should work the same. If the triangle is not 0, then it is trickier as you do have to change Z; but we can make sure it is not the case and fail if it is.

Copy link
Contributor Author

@davidercruz davidercruz Jul 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay I see I was confused, sorry. XY is the same as z=0, XZ is y=0 and YZ is x=0. I was talking about non-perpendicular planes, sorry. I will simply change the strings to match.


No newline at end of file


def center_in_plane(ag, plane, coordinate=0, origin=None, center='geometry', wrap=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why defaulting 'origin' to None and not its value?


"""

if isinstance(plane, string_types):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You do not need to test the type. You just need to test is it is not None and not in 'xyz'. In general, testing on type should be avoided. Python is keen on duck-typing. If I write an object that pretends to be 'x', you should not discriminate against it if you can avoid it.

if plane == 'z':
shift = _origin[2]+coordinate
position = [boxcenter[0], boxcenter[1], shift]
ts = center_in_box(ag, center=center, point=position, wrap=wrap)(ts)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are doing for every frame all the checks that you removed from the wrapped function in center_in_box. This does not seem very efficient.

Copy link
Contributor

@jbarnoud jbarnoud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more comments on top of the previous ones that are still valid.


Enhancements

* Added box/axis/plane centering trajectory transformations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding this line to the changelog, you can update the other line about centering method by listing this PR next to the other one.

Returns
-------
:class:`~MDAnalysis.coordinates.base.Timestep` object
`~MDAnalysis.coordinates.base.Timestep`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just the type, no back-tick nor tilde.

.. code-block:: python

ag = u.residues[1].atoms
ts = MDAnalysis.transformations.center_in_axis(ag,'x', [0,0,1], center='mass')(ts)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My previous comment on the example applies here as well.

center = np.asarray([_origin[0], ag_center[1], _origin[2]], np.float32)
if axis == 'z':
center = np.asarray([_origin[0], _origin[1], ag_center[2]], np.float32)
vector = center - ag_center
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to typecast again. Though, it would be good to have a test for each transformation to make sure the coordinates have the correct dtype.

@codecov
Copy link

codecov bot commented Jul 11, 2018

Codecov Report

Merging #1973 into develop will increase coverage by 0.03%.
The diff coverage is 97.95%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #1973      +/-   ##
===========================================
+ Coverage    89.44%   89.48%   +0.03%     
===========================================
  Files          157      157              
  Lines        18775    18857      +82     
  Branches      2713     2722       +9     
===========================================
+ Hits         16794    16874      +80     
- Misses        1378     1379       +1     
- Partials       603      604       +1
Impacted Files Coverage Δ
package/MDAnalysis/transformations/__init__.py 100% <100%> (ø) ⬆️
package/MDAnalysis/transformations/translate.py 98.33% <97.93%> (-1.67%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update abcdb1e...589ce9c. Read the comment docs.

@jbarnoud
Copy link
Contributor

@davidercruz My comments on the API still stand:

  • Your functions are called center_* but do not move the atom group to the center of the box by default;
  • the name origin is confusing, especially next to center and the name of the functions;
  • x, y, and z are not planes, you cannot use them as values for the plane argument of center_to_plane

At this point, I think these are the main issues. The rest looks OK, but depends on the API.

For origin and center, I suggest to rename them center_from and center_to, respectively. The naming is not super consistent with the rest but you could change your other transformations to match that nomenclature. These names are likely not the best, but I think they are less confusing.

Copy link
Contributor

@jbarnoud jbarnoud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you:

  • remove the d argument. The name is bad (it is not explicit at all), and it triger a behaviour that is out of the scope of the function;
  • avoid the succesion of ifs with the dict trick I showed you;
  • rename center_from to center_to, one of my earlier comment was confusing.

Copy link
Contributor

@jbarnoud jbarnoud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have connection issues so I am reviewing from my phone. It is less than optimal so I expect I missed some stuff...


def wrapped(ts):
if isinstance(center_from, string_types) and center_from == 'center':
if isinstance(center_to, string_types) and center_to == 'center':
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You do not need to test for the type. Testing for the value is enough.

used to define the plane on which the given AtomGroup will be centered. Suported
planes are yz, xz and xy planes.
center_to: array-like or str
coordinate from which the axes are centered. Can be an array of three coordinate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coordinates to which...

"""

pbc_arg = wrap
if plane is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can 'plane' be None? What would it mean?

if center_to != 'center':
_origin = center_to
position[plane] = _origin[plane]
vector = np.asarray(position, np.float32) - center_method()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you really working in 2D there? From what I read you are centering in the box, not the plane.

raise ValueError('{} is not a valid "center_to"'.format(center_to))
else:
center_to = np.asarray(center_to, np.float32)
if center_to.shape != (3, ) and center_to.shape != (1, 3):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it actually work with a (1, 3) vector ? Did you try?

Copy link
Member

@kain88-de kain88-de left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks nice. I mainly would like the center calculations to make use of the flexibility in MDAnalysis that we expose in the encore module

raise ValueError('{} is not a valid point'.format(point))
try:
if center == 'geometry':
if center_of == 'geometry':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we have center_of here all the time? Can't you make use of the generic center as well? it allows to center with respect to arbitrary weights. People will want to do this. Think of IDP proteins where you want to calculate a center with respect to the flexibility of residues.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could have an argument weights='mass' like we have for the encore methods. The argument would accept mass and centroid as string arguments or an array-like object or None.

axis = axes[axis]
if center_to is not None:
if isinstance(center_to, string_types):
if not center_to == 'center':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary check and nesting of if clauses. You can leave out the isinstance check in the first place. Here you also have an unnecessary nesting of the if clause. It would be better to write the two tests into the same if-clause.

which sets [0, 0, 0] as the origin of the axis.
center_of: str, optional
used to choose the method of centering on the given atom group. Can be 'geometry'
or 'mass'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to be able to give this arbitrary weights to calculate the center. This is for all your methods in all PRs you have so far. It's a useful addition and makes MDAnalysis more flexible. I actually ran across an issue where I would like to calculate the center with respect to arbitrary weights several times. I'm not aware that any other tool besides MDAnalysis allows to do this.

.. code-block:: python

ag = u.residues[1].atoms
transform = MDAnalysis.transformations.center_in_axis(ag, 'x', [0,0,1], center_of='mass')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mda.transformations....

@kain88-de
Copy link
Member

looks like there are some python 2.7 specific errors left

@davidercruz
Copy link
Contributor Author

davidercruz commented Jul 22, 2018

apparently bytes from strings (b'xy' as plane for example) behave differently in python 2 and python 3. While in python 3 the functions fails as expected, in python 2 it doesn't. What should I do? Should I remove this type from the tests?

@kain88-de
Copy link
Member

kain88-de commented Jul 22, 2018 via email

@davidercruz
Copy link
Contributor Author

@kain88-de I think I had already faced this problem in the groupby PR months ago. Python 2 doesn't have a bytes type. So, while a bytes string is different from a regular string in python3, in python 2 it is the same.

except (KeyError, TypeError):
raise ValueError('{} is not a valid plane'.format(plane))

if isinstance(center_to, string_types):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you remove the import of string_types?

@jbarnoud
Copy link
Contributor

jbarnoud commented Aug 2, 2018

@davidercruz Can yuu elaborate why you have to use such a weird workflow to reach your goal? You should only need

transformations = (mda.transformations.unwrap(prot))
                              mda.transformations.center_in_box(prot, wrap=False),)

Is your notebook visible somewhere? If no, could you share it as a github gist?

@davidercruz
Copy link
Contributor Author

@jbarnoud That is fine for a protein in solution, in a trajectory without solvent.
But when you have a membrane protein, and the protein is off center, when you center the protein, part of the membrane will be outside the unit cell. That's why I do the wrap (to put everything back in the unit cell) and the unwrap of the whole system (ag).

@jbarnoud
Copy link
Contributor

jbarnoud commented Aug 2, 2018

@davidercruz And what does the unwrap option of the centering methods (I did not have time to catch up yet).

@davidercruz
Copy link
Contributor Author

@jbarnoud It runs make_whole for the atomgroup that needs to be centered. If #2012 is merged, make_whole will be called directly from atoms.center() just for the calculation, but without changes to the actual AtomGroup.

@richardjgowers
Copy link
Member

richardjgowers commented Aug 2, 2018 via email

@jbarnoud
Copy link
Contributor

jbarnoud commented Aug 2, 2018

The unwrap argument is a good idea, though I would not encourage it in a transformation workflow as would will end up unwrapping the same molecules multiple times. For transformations, a function that puts things back in the expected image would be better. It would combine the two last steps in your workflow, hopefully in a most efficient way.

About the unwrap argument: what happens if I say warp=True, unwrap=True? At some point, somebody suggested having a pbc argument that could take 'no', 'wrap', or 'unwrap' as value. This could be an option.

@davidercruz
Copy link
Contributor Author

davidercruz commented Aug 2, 2018

@jbarnoud It raises a ValueError. My inital thought was to simply give priority to unwrap and if both were True, then wrap would become False, but now I think an error is better.
Should I added this to the docs?

@davidercruz
Copy link
Contributor Author

davidercruz commented Aug 2, 2018

@richardjgowers @jbarnoud
I completely forgot that ag.wrap(compound='fragments') existed. I can create a clean() transformation that does just this.

Thus the workflow could be

ag = u.atoms
prot = u.select_atoms('protein')
transformations = (mda.transformations.unwrap(ag))
                   mda.transformations.center_in_box(prot),
                   mda.transformations.clean(ag))

@kain88-de
Copy link
Member

I completely forgot that ag.wrap(compound='fragments') existed. I can create a clean() transformation that does just this.

why not add the compound keyword to the wrap transformation? Can't you simply pass all kwargs along?

@kain88-de
Copy link
Member

when you center the protein, part of the membrane will be outside the unit cell

A colleague has asked about exactly that problem and if the transformations can help with it.

@orbeckst
Copy link
Member

Could someone knowledgeable with the discussion (@davidercruz @jbarnoud @richardjgowers @kain88-de ) please summarize with a view towards:

  • Is there a chance that this PR will get finished?
  • What are the major hurdles that would have to be overcome to make it work? (In case someone else might want to pick up the PR.)

Thanks.

@jbarnoud
Copy link
Contributor

jbarnoud commented Dec 14, 2018 via email

@orbeckst
Copy link
Member

orbeckst commented Dec 14, 2018 via email

@davidercruz
Copy link
Contributor Author

As soon as @jbarnoud submits his review I'll make the changes necessary :D

@orbeckst
Copy link
Member

@jbarnoud @richardjgowers @davidercruz any chance that this gets into 0.20.0 #2240 (as we had promised in the 0.19.x blog post)?

orbeckst added a commit to MDAnalysis/binder-notebook that referenced this pull request Apr 26, 2019
- fix #13
- needs MDAnalysisData >= 0.7.0 for the "membrane peptide" dataset
- needs MDAnalysis >= 0.20.0 for the transformations
  (currently PRs MDAnalysis/mdanalysis#2038 ,
  MDAnalysis/mdanalysis#1991 , and
  MDAnalysis/mdanalysis#1973 )
- updated text in notebook
- adde a few more empty lines for clarity
orbeckst added a commit to MDAnalysis/binder-notebook that referenced this pull request Apr 29, 2019
- fix #13
- needs MDAnalysisData >= 0.7.0 for the "membrane peptide" dataset
- needs MDAnalysis >= 0.20.0 for the transformations
  (currently PRs MDAnalysis/mdanalysis#2038 ,
  MDAnalysis/mdanalysis#1991 , and
  MDAnalysis/mdanalysis#1973 )
- updated text in notebook
- adde a few more empty lines for clarity
@orbeckst
Copy link
Member

orbeckst commented Jun 4, 2019

@jbarnoud @kain88-de – sorry to ping you both but you both have open review requests on this PR. Can you provide a quick note if these requests still stand or if they have been addressed? Thanks!

@IAlibay IAlibay added the close? Evaluate if issue/PR is stale and can be closed. label May 18, 2021
@hmacdope hmacdope added the stale label Nov 8, 2023
@hmacdope
Copy link
Member

hmacdope commented Nov 8, 2023

Closing as stale, feel free to re-open if you want to continue.

@hmacdope hmacdope closed this Nov 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

close? Evaluate if issue/PR is stale and can be closed. pbc_project stale

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants