Implemented extraction of chains from DMS formats #2872
orbeckst merged 13 commits into MDAnalysis:develop from ianmkenney:fix/dms-chains
Used PDBParser as a reference
Hello @ianmkenney! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues and found none: there are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-07-28 23:01:55 UTC
Should keep using segids to avoid confusion
Codecov Report
@@ Coverage Diff @@
## develop #2872 +/- ##
===========================================
- Coverage 92.80% 92.78% -0.02%
===========================================
Files 185 185
Lines 24205 24214 +9
Branches 3133 3137 +4
===========================================
+ Hits 22463 22467 +4
- Misses 1696 1701 +5
Partials 46 46
richardjgowers
left a comment
hey @ianmkenney looks good but needs a simple test
@richardjgowers I'm creating an altered ADK dms (Seyler SL, Kumar A, Thorpe MF, Beckstein O (2015) Path Similarity Analysis: A Method for Quantifying Macromolecular Pathways. PLoS Comput Biol 11(10): e1004568. https://doi.org/10.1371/journal.pcbi.1004568).
Setting the domains as follows from sqlite3:
    UPDATE particle SET segid = "CORE";
    UPDATE particle
    SET segid = "NMP"
    WHERE resid BETWEEN 30 AND 59;
    UPDATE particle
    SET segid = "LID"
    WHERE resid BETWEEN 122 AND 159;
A check on the selections of those domains should be sufficient for tests, right?
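The relabeling described in this comment can be tried without a real DMS file. The following is a minimal sketch using Python's standard `sqlite3` module against an in-memory table. The two-column `particle` table and the 214-row residue range are assumptions made for illustration only; a real DMS `particle` table has more columns and one row per atom. The string literals use standard single quotes instead of the double quotes in the comment.

```python
import sqlite3

# Hypothetical stand-in for the `particle` table of a DMS file: only the
# columns touched by the UPDATE statements are modelled here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE particle (id INTEGER PRIMARY KEY, resid INTEGER, segid TEXT)"
)
conn.executemany(
    "INSERT INTO particle (resid, segid) VALUES (?, ?)",
    [(resid, "") for resid in range(1, 215)],  # assumed: one row per resid 1..214
)

# The same three updates: everything becomes CORE, then two resid ranges
# are relabeled as NMP and LID.
conn.execute("UPDATE particle SET segid = 'CORE'")
conn.execute("UPDATE particle SET segid = 'NMP' WHERE resid BETWEEN 30 AND 59")
conn.execute("UPDATE particle SET segid = 'LID' WHERE resid BETWEEN 122 AND 159")
conn.commit()


def resids_in(segid):
    """Return the sorted resids currently labeled with `segid`."""
    rows = conn.execute(
        "SELECT resid FROM particle WHERE segid = ? ORDER BY resid", (segid,)
    )
    return [row[0] for row in rows]


counts = {segid: len(resids_in(segid)) for segid in ("CORE", "NMP", "LID")}
```

Because `BETWEEN` is inclusive on both ends, the NMP and LID ranges cover 30 and 38 resids respectively in this toy table, and every remaining row keeps the CORE label.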
["A", "B", "A"] -> ["A", "B"]
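One way to read this commit title: a segid that reappears later in the file should map back to the segment already created for it, not start a new one. The sketch below illustrates that idea in plain Python. The function name `unique_segments` is hypothetical and this is not necessarily how the DMS parser implements it.

```python
def unique_segments(segids):
    """Collapse repeated segids.

    Returns the unique segids in order of first appearance, plus one
    segment index per input entry pointing into that unique list.
    """
    index_of = {}   # segid -> position in the unique list (dicts keep insertion order)
    seg_index = []  # per-entry index of the segment it belongs to
    for segid in segids:
        if segid not in index_of:
            index_of[segid] = len(index_of)
        seg_index.append(index_of[segid])
    return list(index_of), seg_index


unique, seg_index = unique_segments(["A", "B", "A"])
```

Here `unique` holds two segids and `seg_index` records that the first and third entries share the same segment.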
Derived from adk_closed.dms and edited in sqlite3:
    UPDATE particle SET segid = "CORE";
    UPDATE particle SET segid = "NMP" WHERE resid BETWEEN 30 AND 59;
    UPDATE particle SET segid = "LID" WHERE resid BETWEEN 122 AND 159;
where the domain definitions can be found in: Seyler SL, Kumar A, Thorpe MF, Beckstein O (2015) Path Similarity Analysis: A Method for Quantifying Macromolecular Pathways. PLoS Comput Biol 11(10): e1004568. https://doi.org/10.1371/journal.pcbi.1004568
Added new selections in test_dms.py and relabeled previous selections
@richardjgowers should be ready for a review
        vals = cur.fetchall()
    except sqlite3.DatabaseError:
        errmsg = "Failed reading the atoms from DMS Database"
        raise IOError(errmsg) from None
What was the reason for removing raise from None? See PR #2357 for rationale.
        bonds = cur.fetchall()
    except sqlite3.DatabaseError:
        errmsg = "Failed reading the bonds from DMS Database"
        raise IOError(errmsg) from None
What was the reason for removing raise from None?
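For readers unfamiliar with the idiom under discussion: `raise ... from None` suppresses the implicit exception chain, so the traceback shows only the new `IOError` and omits the "During handling of the above exception, another exception occurred" section. The following is a minimal sketch of that behavior. The helper `fetch_atoms` and the query are hypothetical, and the in-memory database is deliberately empty so that the query fails.

```python
import sqlite3


def fetch_atoms(cur):
    """Hypothetical helper mirroring the error handling in the snippets above."""
    try:
        cur.execute("SELECT * FROM particle")
        return cur.fetchall()
    except sqlite3.DatabaseError:
        errmsg = "Failed reading the atoms from DMS Database"
        # `from None` hides the original sqlite3 error from the traceback.
        raise IOError(errmsg) from None


conn = sqlite3.connect(":memory:")  # empty database: there is no `particle` table
try:
    fetch_atoms(conn.cursor())
except IOError as exc:
    caught = exc  # keep a reference; `exc` is unbound once the block exits
```

After this runs, `caught.__cause__` is `None` and `caught.__suppress_context__` is `True`. The original sqlite3 error remains reachable through `caught.__context__` for debugging, but it is not printed.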
richardjgowers
left a comment
I'm happy once @orbeckst 's question is answered, thanks @ianmkenney !
orbeckst
left a comment
Looks good. I have one potential performance comment (see inline).
I changed your CHANGELOG entry to be under 2.0.0.
Thanks @ianmkenney !
* partially addresses MDAnalysis#1387 for the DMS parser
* Changes made in this Pull Request:
  * Repeated residues are not squashed
  * Multiple segids are created when multiple chains are present
* Implemented extraction of chains from DMS formats
* Properly split repeating segids: ["A", "B", "A"] -> ["A", "B"]
* tests: New DMS file that includes multiple segids

  Derived from adk_closed.dms and edited in sqlite3

  ```sql
  UPDATE particle SET segid = "CORE";
  UPDATE particle SET segid = "NMP" WHERE resid BETWEEN 30 AND 59;
  UPDATE particle SET segid = "LID" WHERE resid BETWEEN 122 AND 159;
  ```

  where the domain definitions can be found in:
  Seyler SL, Kumar A, Thorpe MF, Beckstein O (2015) Path Similarity Analysis: A Method for Quantifying Macromolecular Pathways. PLoS Comput Biol 11(10): e1004568. https://doi.org/10.1371/journal.pcbi.1004568
* tests:
  - Modified atom selections in test_dms.py
  - added new selections in test_dms.py and relabeled previous selections
  - Added DMS with no segid
* Updated CHANGELOG
* partially addresses #1387 for the DMS parser
* Changes made in this Pull Request:
  * Repeated residues are not squashed
  * Multiple segids are created when multiple chains are present
* Implemented extraction of chains from DMS formats
* Properly split repeating segids: ["A", "B", "A"] -> ["A", "B"]
* tests: New DMS file that includes multiple segids

  Derived from adk_closed.dms and edited in sqlite3

  ```sql
  UPDATE particle SET segid = "CORE";
  UPDATE particle SET segid = "NMP" WHERE resid BETWEEN 30 AND 59;
  UPDATE particle SET segid = "LID" WHERE resid BETWEEN 122 AND 159;
  ```

  where the domain definitions can be found in:
  Seyler SL, Kumar A, Thorpe MF, Beckstein O (2015) Path Similarity Analysis: A Method for Quantifying Macromolecular Pathways. PLoS Comput Biol 11(10): e1004568. https://doi.org/10.1371/journal.pcbi.1004568
* tests:
  - Modified atom selections in test_dms.py
  - added new selections in test_dms.py and relabeled previous selections
  - Added DMS with no segid
* Updated CHANGELOG

(cherry picked from commit 2f16a16)
Used PDBParser as a reference
Addresses #1387
Changes made in this Pull Request:
PR Checklist