Adds PDB element writing and fixes reading #3001
lilyminium merged 18 commits into MDAnalysis:develop from
There was quite a lot of discussion on #2422 about what to do when there are empty/unknown element records. The approach taken here reflects the discussions in #2422, and other places where we've taken the conservative stance that unknown elements are not assigned an element record on reading, and therefore would not have an element record written out either. I'm sure many will have strong opinions here, including but not limited to @lilyminium, @RMeli, and @richardjgowers.
Codecov Report
@@           Coverage Diff            @@
##           develop    #3001   +/-   ##
========================================
  Coverage    93.09%   93.09%
========================================
  Files          186      186
  Lines        24663    24665    +2
  Branches      3197     3195    -2
========================================
+ Hits         22959    22961    +2
  Misses        1656     1656
  Partials        48       48
  vals['tempFactor'] = tempfactors[i]
  vals['segID'] = segids[i][:4]
- vals['element'] = guess_atom_element(atomnames[i].strip())[:2]
+ vals['element'] = elements[i][:2]
Should this have an upper() call? I've seen PDB files with either mixed case or all upper case. We enforce UpperLower within MDA, but it would be best to write out in a manner that's consistent. Thoughts?
Yeah I like .capitalize(). That being said, given that we already validate and capitalize elements on input, if the user reeeeeeally wants bespoke capitalisation then maybe we should let them.
So we normalize all the inputs with capitalize(), which is where this gets complicated. I prefer then converting back to all-caps (upper()) elements because I think we see these a bit more often in PDBs, but if users had a specific preference then we'd be overriding that :/
It seems that all-caps used to be the spec, and here is an example of capitalized symbols causing issues in NGLView -- I vote for upper case unless/until someone specifically raises an issue about it.
That's convincing enough for me, I've changed it to upper in bd220dc :)
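The outcome of this thread can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `format_element_field` is a hypothetical helper, and it assumes the PDB convention that the element symbol sits in a two-character, right-justified field (columns 77-78).

```python
def format_element_field(element: str) -> str:
    """Return a 2-character, right-justified, upper-case element field.

    Hypothetical helper for illustration only; not part of MDAnalysis.
    """
    # Truncate to two characters (as the [:2] slice in the diff does),
    # then upper-case so that "Cl" is written out as "CL".
    symbol = element.strip()[:2].upper()
    # An empty string stays blank, so an unknown element writes no symbol.
    return symbol.rjust(2)


fields = [format_element_field(e) for e in ["Cl", "c", "", "NA"]]
```

Upper-casing only at write time leaves the in-memory elements attribute in the capitalized form that is validated on input.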
IMHO, not assigning unknown elements is very sensible. I'd rather have the code fail cleanly because the elements attribute is missing instead of getting erroneous results because of an incorrect guess.
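A minimal sketch of that conservative stance on the reading side, under stated assumptions: `KNOWN_ELEMENTS` is a deliberately tiny illustrative set and `parse_element` is a hypothetical function, neither taken from MDAnalysis. A recognised symbol is normalised to capitalized form; anything else becomes an empty record instead of a guess.

```python
# Illustrative subset only; a real lookup would cover the periodic table.
KNOWN_ELEMENTS = {"H", "C", "N", "O", "S", "P", "NA", "CL", "FE"}


def parse_element(raw: str) -> str:
    """Return the capitalized symbol, or '' if it is not recognised.

    Hypothetical function for illustration only.
    """
    symbol = raw.strip().upper()
    if symbol in KNOWN_ELEMENTS:
        # Normalise to the UpperLower form, e.g. "CL" -> "Cl".
        return symbol.capitalize()
    # Unknown or blank: keep an empty record, do not guess.
    return ""


parsed = [parse_element(r) for r in [" C", "CL", "XX", ""]]
```

Downstream code can then treat an empty string as "no element available" and fail or skip explicitly.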
lilyminium left a comment
Looks good, a couple nitpicks. So the Atomtypes attribute remains unchecked or guessed elements, and the Elements attribute is the same information run through a sensibility filter?
Co-authored-by: Lily Wang <31115101+lilyminium@users.noreply.github.com>
So this is an interesting one that links back to #2918. Personally I wouldn't ever guess on read, but we have this default need for atomtypes, and what an atom type is isn't properly defined. Given that atom types can be "atom name, atom element, or force field atom type", maybe the answer here is to assign atom types to the raw input atom name? My only worry here is how much would break if we did this (I have no idea how widely used atomtype is downstream).
I would also be concerned that changing |
@pytest.fixture
def dummy_universe_without_elements():
Could you look at minimizing the number of warnings that get issued by your new tests? I used 2 different approaches for that in #2886: filling all the required attributes in the fixture or filtering the warnings in the tests.
3d7857d should have done the trick, only warnings left are parmed's ABCs from collections import.
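The two approaches mentioned in this thread can be illustrated with the standard-library `warnings` module. This is only a sketch: `write_atoms` is a hypothetical stand-in for a writer that warns about missing attributes, not the MDAnalysis PDB writer, and the attribute names are chosen for illustration.

```python
import warnings


def write_atoms(attrs):
    """Hypothetical writer that warns once per missing attribute."""
    required = ("names", "resnames", "elements")
    for name in required:
        if name not in attrs:
            warnings.warn(f"{name} missing; using a default", UserWarning)
    return [attrs.get(name, "") for name in required]


# Baseline: a missing attribute produces a warning.
with warnings.catch_warnings(record=True) as caught_unfiltered:
    warnings.simplefilter("always")
    write_atoms({"names": "CA", "resnames": "ALA"})

# Approach 1: fill every required attribute so nothing warns.
with warnings.catch_warnings(record=True) as caught_full:
    warnings.simplefilter("always")
    write_atoms({"names": "CA", "resnames": "ALA", "elements": "C"})

# Approach 2: leave the attribute out on purpose and filter the
# expected warning by its message prefix.
with warnings.catch_warnings(record=True) as caught_filtered:
    warnings.simplefilter("always")
    warnings.filterwarnings("ignore", message="elements missing")
    write_atoms({"names": "CA", "resnames": "ALA"})
```

In a pytest suite the second approach is commonly expressed with the `pytest.mark.filterwarnings` marker on the test instead of a context manager.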
lilyminium left a comment
LGTM, thank you @IAlibay and sorry for the delay! 😶
- Fixes MDAnalysis#2423 and MDAnalysis#2422
- PDB parser now allows for partial element parsing, setting an empty record if the element is not recognised.
- PDB writer now uses the elements attribute instead of guessing.
Completes/supersedes #2442 Fixes #2422 #2423