Secondary structure determination #2609

Closed
lilyminium wants to merge 7 commits into MDAnalysis:develop from lilyminium:dssp
@lilyminium
Member

@lilyminium lilyminium commented Mar 10, 2020

Fixes #2608

Changes made in this Pull Request:

  • added secondary_structure module
  • contains wrapper classes for:
    • mdtraj.compute_dssp
    • DSSP (mkdssp)
    • STRIDE (stride)
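Because two of the wrappers listed above depend on external executables, a quick availability check can help when debugging installation. The following is only a minimal sketch: the executable names `mkdssp` and `stride` come from the list above, while the helper `find_external_tools` is hypothetical and is not part of this PR.

```python
import shutil

# Executable names taken from the PR description; the mapping itself is illustrative.
EXTERNAL_TOOLS = {"DSSP": "mkdssp", "STRIDE": "stride"}


def find_external_tools(tools=EXTERNAL_TOOLS):
    """Map each tool label to the path of its executable, or None if it is absent."""
    return {label: shutil.which(exe) for label, exe in tools.items()}


if __name__ == "__main__":
    for label, path in find_external_tools().items():
        print(f"{label}: {path or 'not found on PATH'}")
```

A wrapper class could call a check like this before launching a subprocess and raise a clear error if the binary is missing.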

Let’s see how these installation scripts go...

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

Demo notebook here.

@lilyminium lilyminium force-pushed the dssp branch 2 times, most recently from 29f64ef to 09fd9af March 10, 2020 06:52

@richardjgowers
Member

@lilyminium this will be a cool addition. WRT installation, a better long-term solution might be to create a conda package for these tools.

Also I'm remembering something about one of these tools having a different license that we have to be careful about?

@lilyminium
Member Author

@richardjgowers is it STRIDE?

All rights reserved, whether the whole or part of the program is
concerned. Permission to use, copy, and modify this software and its
documentation is granted for academic use, provided that:

i. this copyright notice appears in all copies of the software and
related documentation;

ii. the reference given below (Frishman and Argos, 1995) must be
cited in any publication of scientific results based in part or
completely on the use of the program;

iii. bugs will be reported to the authors.

The use of the software in commercial activities is not allowed
without a prior written commercial license agreement.

DSSP uses:

Boost Software License - Version 1.0 - August 17th, 2003

Permission is hereby granted, free of charge, to any person or organization
obtaining a copy of the software and accompanying documentation covered by
this license (the "Software") to use, reproduce, display, distribute,
execute, and transmit the Software, and to prepare derivative works of the
Software, and to permit third-parties to whom the Software is furnished to
do so, all subject to the following:

The copyright notices in the Software and this entire statement, including
the above license grant, this restriction and the following disclaimer,
must be included in all copies of the Software, in whole or in part, and
all derivative works of the Software, unless such copies or derivative
works are solely in the form of machine-executable object code generated by
a source language processor.

MDTraj is LGPL.

@codecov

codecov bot commented Mar 10, 2020

Codecov Report

Merging #2609 into develop will decrease coverage by 2.97%.
The diff coverage is 19.78%.

@@             Coverage Diff             @@
##           develop    #2609      +/-   ##
===========================================
- Coverage    90.61%   87.64%   -2.98%     
===========================================
  Files          174      179       +5     
  Lines        23554    24585    +1031     
  Branches      3072     3219     +147     
===========================================
+ Hits         21343    21547     +204     
- Misses        1585     2409     +824     
- Partials       626      629       +3     

Impacted Files                                            Coverage Δ
package/MDAnalysis/converters/base.py                     0.00% <0.00%> (ø)
package/MDAnalysis/coordinates/PDB.py                     90.09% <ø> (ø)
...MDAnalysis/analysis/secondary_structure/wrapper.py     77.01% <77.01%> (ø)
...ge/MDAnalysis/analysis/secondary_structure/dssp.py     89.09% <89.09%> (ø)
...ge/MDAnalysis/analysis/secondary_structure/base.py     94.44% <94.44%> (ø)
...DAnalysis/analysis/secondary_structure/__init__.py     100.00% <100.00%> (ø)
... and 1 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e416aa7...67b152e.

@@ -0,0 +1,2415 @@
# -*- Mode: python; tab-width: 4; indent-tabs-mode:nil; coding:utf-8 -*-
Member

Why is there a copy of coordinates/base.py in converters?

Member Author

Oops, mixed up my branches.

Member

@orbeckst orbeckst left a comment

Having secondary structure determination is great.

I am somewhat ambivalent about falling back to mdtraj's dssp implementation. Their LGPL license allows us to import their code, but they would not be able to do the same with ours, so even though this is perfectly legal it feels a bit like taking from a competitor without giving anything back. Perhaps if we also had a converter for mdtraj objects, there would be more of an interoperability angle. @MDAnalysis/coredevs, do you have an opinion on how we should proceed?

  - BUILD_CMD="pip install -e package/ && (cd testsuite/ && python setup.py build)"
  - CONDA_MIN_DEPENDENCIES="mmtf-python mock six biopython networkx cython matplotlib scipy griddataformats hypothesis gsd codecov"
- - CONDA_DEPENDENCIES="${CONDA_MIN_DEPENDENCIES} seaborn>=0.7.0 clustalw=2.1 netcdf4 scikit-learn joblib>=0.12 chemfiles"
+ - CONDA_DEPENDENCIES="${CONDA_MIN_DEPENDENCIES} seaborn>=0.7.0 clustalw=2.1 netcdf4 scikit-learn joblib>=0.12 chemfiles mdtraj"
Member

Will a user have to install mdtraj in order to use MDAnalysis? I'd like to avoid this situation.

Member Author

No, it raises an ImportError only when the class is instantiated.
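
The behaviour described here (an optional dependency that is only required once the class is actually used) is commonly achieved with a deferred import. The sketch below illustrates that general pattern and is not the code from this PR; the class name `OptionalBackendAnalysis` and the `backend_name` parameter are hypothetical.

```python
import importlib


class OptionalBackendAnalysis:
    """Hypothetical analysis class with an optional third-party backend."""

    def __init__(self, backend_name="mdtraj"):
        # The import is deferred to instantiation, so importing the module
        # that defines this class never requires the optional package.
        try:
            self.backend = importlib.import_module(backend_name)
        except ImportError as err:
            raise ImportError(
                f"{backend_name} is required to use {type(self).__name__}; "
                "install it to enable this analysis"
            ) from err
```

With this layout, users who never instantiate the class are unaffected by the missing package, and those who do get an error that names what to install.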

        n_name='N', ca_name='CA', pro_name='PRO'):
    # TODO: implement this on its own w/o mdtraj?
    try:
        from mdtraj.geometry._geometry import _dssp
Member

Would it be worthwhile to copy the code and include it here (under LGPL) and of course leave all the citations?

Member Author

I suppose so, if this is better than importing MDTraj?

cite_module=True)


class DSSP(SecondaryStructureBase):
Member

I would change the name to DSSPmdtraj or something else. You can then still derive DSSP(DSSPmdtraj) but in case at some point we write our own version or bundle it then we can easily switch DSSP over.
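
A minimal sketch of the naming scheme suggested above, assuming a stand-in `SecondaryStructureBase` (the PR's real base class has more machinery) and a placeholder `run` method that is purely illustrative:

```python
class SecondaryStructureBase:
    """Stand-in for the PR's base class, reduced to a single method."""

    def run(self):
        raise NotImplementedError


class DSSPmdtraj(SecondaryStructureBase):
    """Implementation explicitly tied to the mdtraj backend."""

    backend = "mdtraj"

    def run(self):
        # Placeholder for the real computation.
        return f"computed with {self.backend}"


class DSSP(DSSPmdtraj):
    """Public name; for now it simply inherits the mdtraj-backed behaviour."""
```

If a native or bundled implementation is added later, only the base of `DSSP` needs to change, and user code that refers to `DSSP` keeps working.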

@IAlibay
Member

IAlibay commented Feb 8, 2022

As discussed with @lilyminium, given the large dependencies being added here, the plan would be to move it downstream to a "structure analysis" package which we have planned as part of the upcoming MDAKits work.
