Skip to content

Move the water bridge analysis to the new analysis class#2087

Merged
orbeckst merged 30 commits intoMDAnalysis:developfrom
xiki-tempula:new_wba
Aug 13, 2019
Merged

Move the water bridge analysis to the new analysis class#2087
orbeckst merged 30 commits intoMDAnalysis:developfrom
xiki-tempula:new_wba

Conversation

@xiki-tempula
Copy link
Contributor

@xiki-tempula xiki-tempula commented Oct 3, 2018

Move the water bridge analysis to the new analysis class, add high order water bridge support.
Rewritte the count_by_type and count_by_time function, add custom anaylsis function support.

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

Move the water bridge analysis to the new analysis class, add high
    Order water bridge support. (PR #2087)
@xiki-tempula
Copy link
Contributor Author

It seems that the line result = [(*key, result_dict[key]*1.0/length) for key in result_dict] is falling in the older versions of python. I wonder if there are any hacks to go around this problem except writing another loop?

@kain88-de
Copy link
Member

yes this is not valid syntax for python 3.4 and older. What did you want to replace?

@xiki-tempula
Copy link
Contributor Author

@kain88-de
My current thought would be replaced with

result = [[i for i in key] for key in result_dict]
[result[i].append(result_dict[key]*1.0/length) for i, key in enumerate(result_dict)]
return result

@kain88-de
Copy link
Member

kain88-de commented Oct 3, 2018 via email

@xiki-tempula
Copy link
Contributor Author

@kain88-de It is a tuple has the form
(from_index, to_index, (from_resname, from_resid, from_name), (to_resname, to_resid, to_name)).

@kain88-de
Copy link
Member

kain88-de commented Oct 3, 2018 via email

@xiki-tempula
Copy link
Contributor Author

xiki-tempula commented Oct 3, 2018

@kain88-de Sorry, I haven't made this clear. This is the default output version. In this update, the user has the freedom to define the key to this dictionary.
The current output is formatted so that it is compatible with the previous version. Since the output is not used by other module in the mda. Perhaps, I could just do [[k, v/length] for k, v in six.iteritems(result_dict)].

@codecov
Copy link

codecov bot commented Oct 5, 2018

Codecov Report

Merging #2087 into develop will increase coverage by 0.07%.
The diff coverage is 91.05%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #2087      +/-   ##
===========================================
+ Coverage    89.75%   89.82%   +0.07%     
===========================================
  Files          173      173              
  Lines        21536    21711     +175     
  Branches      2804     2846      +42     
===========================================
+ Hits         19329    19502     +173     
  Misses        1615     1615              
- Partials       592      594       +2
Impacted Files Coverage Δ
...age/MDAnalysis/analysis/hbonds/wbridge_analysis.py 91.54% <91.05%> (+5.09%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7854011...ebf2fb2. Read the comment docs.

@orbeckst
Copy link
Member

The doc test had stalled. I restarted it. Then we should also see coverage.

Please also fix the conflicts with develop.

@xiki-tempula
Copy link
Contributor Author

@orbeckst Thank you for the advice. I have fixed the conflict but I think the py36 develop build might have some problem which makes the test fails.

@orbeckst
Copy link
Member

I restarted the 3.6 jobs, maybe that helps?

@orbeckst
Copy link
Member

If not, drop an email to the dev list; I am not sure what's wrong.

@orbeckst
Copy link
Member

I just merged @richardjgowers PR #2145 so rebase against develop (or merge develop) and see if this fixes things.

@xiki-tempula
Copy link
Contributor Author

@orbeckst Thank you for the information. Would you mind restart the Travis build, please? I guess there we probably don't need another commit.

@orbeckst
Copy link
Member

Did you merge develop? This is needed for the change to take place. This will automatically run Travis.

@xiki-tempula
Copy link
Contributor Author

@orbeckst Thank you for the advice. I have merged the develop to this PR and it seems that MDAnalysisTests.analysis.test_encore.TestEncore is failing.

@orbeckst
Copy link
Member

orbeckst commented Nov 22, 2018 via email

@orbeckst
Copy link
Member

I restarted the one job that failed.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a big PR. It mixes new features (higher order wbridges) with refactoring (AnalysisBase). It would have been easier if these two things had been addressed separately. In particular, new features automatically imply that this has to go into a 0.20.x (because we use semantic versioning.

Many of my comments are regarding documentation, mainly to make clearer what happens in the module.

I am also concerned with code duplication between the new wbridge analysis and the hbond analysis. I am not 100% sure what the best course forward is but I'd like to hear suggestions. Perhaps create a mixin-class and raise an issue for refactoring hbond analysis; not everything has to be accomplished in one PR (many smaller PRs with weel-defined goals are much better).

In the tests I'd also suggest to reduce code-duplication by using more fixtures as this also leads to cleaner separation of data and test logic.

ADDENDUM

  • remove file package/.DS_Store from the PR
  • rebase against develop (or merge develop)

raise pytest.fail("selection_type aaa should rasie error")

def test_empty_selection(self):
grofile = '''Test gro file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make it a fixture

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


def test_loop(self):
'''Test if loop can be handled correctly'''
grofile = '''Test gro file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make it a fixture

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

return Atomtypes(np.array([guess_atom_type(name) for name in names], dtype=object))


class TestHydrogenBondAnalysis(object):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be in a different test?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or is the name wrong?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@orbeckst I'm thinking of unifying the output of HydrogenBondAnalysis and WaterBridgeAnalysis. So I have transported the HydrogenBondAnalysis to here to make sure WaterBridgeAnalysis can pass the tests which were written for HydrogenBondAnalysis.

@xiki-tempula
Copy link
Contributor Author

@orbeckst Sorry for bothering you in this busy season of GSoC, I have submitted a commit which passes all the test. Would you mind give a review, please? Thank you. Once this PR is merged, we can probably be thinking of how to use capped_distance to improve the performance.

@orbeckst
Copy link
Member

@xiki-tempula , can you please go through the comments above and for each comment reply how you addressed it or if you decided to not address it an why? Think of it as addressing revisions in a paper. Don't click the "Resolved" button on the comments. I will do this after having read your replies.

Given the decisions regarding refactoring H-bond analysis in #2238 , you don't need to worry about code duplication between HydrogenBondAnalysis and WaterBridgeAnalysis for the moment. This might come up later with the new module.

Reading your summary of changes will make it a lot easier for me to go through things. The PR is so big that I don't have the time to re-read all the lines of code and compare to what I said in December. (As a general rule: the smaller the PR the better the chance that someone can review it in a timely manner.)

Ping me when you're done and I will review your comments and code where necessary.

Thanks!

@xiki-tempula
Copy link
Contributor Author

xiki-tempula commented Aug 5, 2019

@orbeckst Thank you for the comments. I have added my comments to the comments. Most of the issue is with the use of the fixture. Now, all the universe used in the test section is made into a fixture and is listed on the top of the test section.

There are some lines in the test which exceeds 80 characters. I can fix that but I prefer to do that when things are finalised.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, just address these minor issues (see comments):

  • remove .DS_Store files – add .DS_Store to .gitignore
  • small documentation issues (see comments)
  • very minor code style issue

# hbonds linking the selection 1 and selection 2 to the bridging
# water 1
[ # hbond 1 from selection 1 to the bridging water 1
<donor index (0-based)>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • add the versionchanged to the class
  • add an entry to CHANGELOG's Changes stating that the output format of WaterBridgeAnalysis changed


.. _pandas.DataFrame: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html
Note that the result is arranged in the format of (key, proportion of time). When no
custom analysis function is supplied, the key is expended for backward compatibility.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • fix expended -> expanded (?)
  • state what it is backward compatible to

key = (s1_resname, s1_resid, s2_resname, s2_resid)
output[key] += 1

w.count_by_type(analysis_func=analysis)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention the "in-place" modification requirement for the analysis() function.

As the _timeseries to timeserie conversion will be deprecated in 1.0
this function will automatically lose its value.
"""
def _expand_timeseries(self, entry, output_format=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "old" mean? Make this precise with version information, e.g. "format for release up to 0.19.2"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The explanation has been added.

# atom1 is hydrogen bond donor, position not swapped.
atom1, atom2 = atom1, atom2
else:
raise KeyError('Only \'sele1_sele2\' or \'donor_acceptor\' are allowed as output format.')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use double quotation marks to get rid of ugly backslash escapes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double quotations added.

See Also
--------
:attr:`table` : structured array of the data
.. versionchanged:: 0.20.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

say what changed

.. versionchanged:: 0.20.0
   The output format was changed bla bla...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change has been mentioned

@orbeckst
Copy link
Member

orbeckst commented Aug 9, 2019

I merged develop into it and fixed the merge conflict in CHANGELOG; note that your GitHub handle @xiki-tempula was already in the authors list for 0.20.0.

@xiki-tempula
Copy link
Contributor Author

@orbeckst I have made the corrections. All the lines in the test file which are too long are being altered. In the main file, though some of the lines are a bit too long, I think having them in the same line might make it easier to read.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was a lot of work but finally: LGTM!

Thanks for being patient and addressing all issues.

@orbeckst orbeckst merged commit e910a12 into MDAnalysis:develop Aug 13, 2019
@xiki-tempula xiki-tempula deleted the new_wba branch August 13, 2019 09:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants