Skip to content

Fixes velocity units and adds support for scale_factors in NCDFReader#2326

Merged
orbeckst merged 14 commits intoMDAnalysis:developfrom
IAlibay:NetCDFVEL
Aug 19, 2019
Merged

Fixes velocity units and adds support for scale_factors in NCDFReader#2326
orbeckst merged 14 commits intoMDAnalysis:developfrom
IAlibay:NetCDFVEL

Conversation

@IAlibay
Copy link
Member

@IAlibay IAlibay commented Aug 15, 2019

Fixes #2323

Changes made in this Pull Request:

  • Now checks for presence of scale_factor in all AMBER NetCDF convention variables
  • Adds support for scale_factor values, set in NCDFReader __init__ and multiplied through in _read_frame.
  • Implements several new exception checks, mainly revolving around ensuring that units are properly adhered to.
  • Also included in this PR is part framework for the introduction of an NCRST file reader (singleframe version of the NCDFReader for .ncrst files). These can be seen in the _NCDFGenerator class of test_netcdf.py

Questions/Issues:

  • The NCDFWriter uses degrees units for cell_angles instead of the NetCDF convention degree (singular). I have kept support for the plural version, however, should we be thinking about changing the writer to using the singular version so that we properly adhere to the convention?
  • The NCDFWriter currently writes everything without scale_factor values. However, as far as I can tell all AMBER MD engines instead write out velocities with a scale_factor of 20.455 and the actual values in Angstrom/AKMA time units. Whilst this shouldn't be a problem for readers that properly implement the AMBER NetCDF convention, I have a suspicion that isn't actually the case for a lot of other tools/libraries. Would it be worth being more prudent by altering writing so that it matches the AMBER MD engines?

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

@codecov
Copy link

codecov bot commented Aug 15, 2019

Codecov Report

Merging #2326 into develop will increase coverage by 0.06%.
The diff coverage is 98.66%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #2326      +/-   ##
===========================================
+ Coverage    89.82%   89.88%   +0.06%     
===========================================
  Files          173      173              
  Lines        21711    21767      +56     
  Branches      2846     2857      +11     
===========================================
+ Hits         19502    19566      +64     
+ Misses        1615     1609       -6     
+ Partials       594      592       -2
Impacted Files Coverage Δ
package/MDAnalysis/coordinates/TRJ.py 94.81% <98.66%> (+3.3%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9b227a8...168a900. Read the comment docs.

@orbeckst
Copy link
Member

orbeckst commented Aug 16, 2019

@IAlibay thanks fore the detailed analysis:

Questions/Issues:

  • The NCDFWriter uses degrees units for cell_angles instead of the NetCDF convention degree (singular). I have kept support for the plural version, however, should we be thinking about changing the writer to using the singular version so that we properly adhere to the convention?

Which convention do you refer to? AMBER or NetCDF? Can you link to what you're referring to?

In principle we should implement whatever the specs for the file format state. IIRC the AMBER NetCDF specs are actually pretty thorough (unlike the mess that "DCD" is). What do they say?

EDIT: see comments in review (read both until 1.0, auto-correct, issue deprecation warning for auto-correct degrees)

  • The NCDFWriter currently writes everything without scale_factor values. However, as far as I can tell all AMBER MD engines instead write out velocities with a scale_factor of 20.455 and the actual values in Angstrom/AKMA time units. Whilst this shouldn't be a problem for readers that properly implement the AMBER NetCDF convention, I have a suspicion that isn't actually the case for a lot of other tools/libraries. Would it be worth being more prudent by altering writing so that it matches the AMBER MD engines?

Sounds sensible to me, especially once we properly support scale_factor.

Such changes need to be properly advertised

  • CHANGELOG in the Changes section (in addition to a fix for supporting scale_factor)
  • .. versionchanged:: 0.20.0 + explanation similar to the above in the docs

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, I appreciate your attention to the detail of the format and the thoughts that you put into testing.

In addition to your broader questions (see #2326 (comment)) please consider the comments inline. Thanks!

@IAlibay
Copy link
Member Author

IAlibay commented Aug 16, 2019

Thanks for reviewing @orbeckst

So with regards to changing the NCDFWriter behaviour, I was thinking it might make sense to address this in a separate PR, if you think it's appropriate? My rationale was mainly that it addresses a different feature change to what was mentioned in issue #2323.

@IAlibay thanks fore the detailed analysis:

Questions/Issues:

  • The NCDFWriter uses degrees units for cell_angles instead of the NetCDF convention degree (singular). I have kept support for the plural version, however, should we be thinking about changing the writer to using the singular version so that we properly adhere to the convention?

Which convention do you refer to? AMBER or NetCDF? Can you link to what you're referring to?

In principle we should implement whatever the specs for the file format state. IIRC the AMBER NetCDF specs are actually pretty thorough (unlike the mess that "DCD" is). What do they say?

EDIT: see comments in review (read both until 1.0, auto-correct, issue deprecation warning for auto-correct degrees)

Just for reference, my "degrees" -> "degree" coment was based on http://ambermd.org/netcdf/nctraj.xhtml Section 6.2: cell_angles.

  • The NCDFWriter currently writes everything without scale_factor values. However, as far as I can tell all AMBER MD engines instead write out velocities with a scale_factor of 20.455 and the actual values in Angstrom/AKMA time units. Whilst this shouldn't be a problem for readers that properly implement the AMBER NetCDF convention, I have a suspicion that isn't actually the case for a lot of other tools/libraries. Would it be worth being more prudent by altering writing so that it matches the AMBER MD engines?

Sounds sensible to me, especially once we properly support scale_factor.

Such changes need to be properly advertised

  • CHANGELOG in the Changes section (in addition to a fix for supporting scale_factor)
  • .. versionchanged:: 0.20.0 + explanation similar to the above in the docs

So with regards to the writer, do we want to implement a single writing format, i.e. no scale_factors for everything but velocities? Or would it be preferred to have either user defined scale_factors or somehow pick up whatever has been provided on read?

The latter case definitely offers more user freedom, albeit at the cost of a lot more complexity. Although, honestly can't think of a normal use case for it. In my experience AMBER style NetCDF files generally just adhere to whatever format the MD engines write in (note: I think those generated by Yank might be different, but I'm not sure).

@orbeckst
Copy link
Member

We can address the Writer in a separate PR – can you open an issue and add a comment to the code with a TODO to reference the issue? Just in case @richardjgowers creates 0.20.0 before the other PR is done...

Or would it be preferred to have either user defined scale_factors or somehow pick up whatever has been provided on read?

That seems the best approach, especially if velocities gets the scale_factor. If scale_factor are supported for all dimensions then let's support them if we can.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor things...

@IAlibay
Copy link
Member Author

IAlibay commented Aug 18, 2019

@orbeckst Looks like Travis is failing in the MacOS py3.6 build.

I think I have it narrowed down to the recent release of pytest 5.1.0. Looks like the explicit kwargs handling was changed: pytest-dev/pytest@683b263

The other builds aren't pip-installing pytest, so they are still using the older 5.0.1 version of pytest.

I'll raise an issue on this.

Fixed in #2330

@orbeckst
Copy link
Member

All looking good to me, let's wait for green CI. Feel free to ping me when you think it's ready for a squash-merge.

@IAlibay
Copy link
Member Author

IAlibay commented Aug 19, 2019

@orbeckst Travis is failing on the py2.7 MacOS build for test_libdcd.py. It looks like it might be a non-deterministic timings issue (line 1549 in https://travis-ci.com/MDAnalysis/mdanalysis/jobs/226411993).

I'm not sure if this is common? I could try to re-start Travis by closing/re-opening the PR if you think it might help?

@orbeckst
Copy link
Member

This does not look like something related to this PR and in principle I'd be happy to merge. However, I just restarted Job #1176.10 and I expect it come back as passing, in which case we can all be happy and merge.

@orbeckst orbeckst merged commit b7a8590 into MDAnalysis:develop Aug 19, 2019
@orbeckst
Copy link
Member

Thanks @IAlibay !

@IAlibay IAlibay deleted the NetCDFVEL branch August 20, 2019 05:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

NCDFReader velocities can sometimes be improperly scaled

2 participants