Skip to content

Add a reindex argument to the PDBWritter#2886

Merged
jbarnoud merged 10 commits intodevelopfrom
issue1072-reindex-pdb
Oct 21, 2020
Merged

Add a reindex argument to the PDBWritter#2886
jbarnoud merged 10 commits intodevelopfrom
issue1072-reindex-pdb

Conversation

@jbarnoud
Copy link
Contributor

@jbarnoud jbarnoud commented Aug 6, 2020

Fixes #1072

Changes made in this Pull Request:

  • Add a reindex argument to the PDBWriter. The argument is True by default and the atom serials are generated starting from 1. If the argument is set to False, then atoms.ids is used for the serial. This is reminiscent of Add reindex option to write atom id to gro file #1781.

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

@pep8speaks
Copy link

pep8speaks commented Aug 6, 2020

Hello @jbarnoud! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 1087:80: E501 line too long (91 > 79 characters)

Line 45:80: E501 line too long (82 > 79 characters)
Line 235:80: E501 line too long (81 > 79 characters)

Comment last updated at 2020-10-20 08:29:55 UTC

@codecov
Copy link

codecov bot commented Aug 6, 2020

Codecov Report

Merging #2886 into develop will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff            @@
##           develop    #2886   +/-   ##
========================================
  Coverage    93.05%   93.05%           
========================================
  Files          186      186           
  Lines        24611    24623   +12     
  Branches      3188     3191    +3     
========================================
+ Hits         22902    22914   +12     
  Misses        1661     1661           
  Partials        48       48           
Impacted Files Coverage Δ
package/MDAnalysis/coordinates/PDB.py 94.44% <100.00%> (+0.19%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6b0cbd6...72d75c4. Read the comment docs.

@jbarnoud jbarnoud force-pushed the issue1072-reindex-pdb branch from 19910be to 73518c3 Compare October 16, 2020 16:04
@jbarnoud jbarnoud marked this pull request as ready for review October 16, 2020 16:36
Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only some half-backed comments, sorry, not much time.

Does the PDB standard prescribe continuous serials starting from 1? (IIRC we never really found out but maybe someone knows.)

What is the user's expectation: save PDB with exactly the given serials or save with something looking like a "normal" PDB?

If users want renumbered serials, should we have them renumber a serial attribute in the topology instead and then use that for writing instead of just doing something on the fly?

An indexing issue meant it previously used the first charater (Issue #2224)

.. versionchanged:: 2.0.0
Add the "reindex" argument.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

State which value of reindex preserves old (pre 2.0.0) behavior.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we put kwargs as bare reST roles i.e.

Add the `redindex` argument. Setting this keyword to ``True`` (the default) preserves
the behavior in earlier versions of MDAnalysis.

multi frame PDB file in which frames are written as MODEL_ ... ENDMDL_
records. If ``None``, then the class default is chosen. [``None``]
reindex: bool (optional)
If ``True`` (default), the atom serial is set to be consecutive
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Which value preserves compatibility with 1.x?
  2. Have we ever been able to rountrip a PDB when the serials were not consecutively numbered starting from 1? Is this something we would want to be able to do? Is this something a user would generally expect?

Also silence some warnings for the tests in the PR.
There was a test in PDBWriter._write_pdb_bonds about writing the bonds
for an atomgroup that does not have AG.ids but asking to not reindex.
The test was silencing the problem and would lead to the bonds not being
written. However, that part of the code would never be reached because
writing the atoms would have raised an exception before that method is
called.

This commit removes this test. Now, if that part of the code is called
(which should not happen), an exception is raised when the method tries
to access AG.ids.
@jbarnoud
Copy link
Contributor Author

What we called the atom id is actually called atom serial, which let me assume it should be consecutive and starting at 1. In the discussion in the issue, @arose commented in that direction. So the default seems to be what is expected. However, it is unconvenient for some usecase, especially when patching systems together; hence the new keyword.

@orbeckst
Copy link
Member

Thanks for the explanation. Then this makes sense, with the default the most likely PDB-standard implementation and reindex=False for the special cases.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy with the UI. Perhaps someone else can have a look at the implementation.

The test coverage is still a bit low, so that's my only complaint.

@jbarnoud
Copy link
Contributor Author

@orbeckst

The test coverage is still a bit low, so that's my only complaint.

With 100% patch coverage, it should be good now.

@orbeckst
Copy link
Member

Thanks, could you please address the merge conflict in CHANGELOG? Otherwise all good from my side.

@jbarnoud jbarnoud merged commit 0aace31 into develop Oct 21, 2020
PicoCentauri pushed a commit to PicoCentauri/mdanalysis that referenced this pull request Mar 30, 2021
* Add a reindex argument to the PDBWritter

* Tests for PDB reindex

* Remove extra print

* Test PDB CONECT with reindex=False

* Update CHANGELOG

* More precise version change for PDB reindex

* Fix long lines

* Increase test coverage for MDAnalysis#2886

Also silence some warnings for the tests in the PR.

* Remove useless branch in PDB.py

There was a test in PDBWriter._write_pdb_bonds about writing the bonds
for an atomgroup that does not have AG.ids but asking to not reindex.
The test was silencing the problem and would lead to the bonds not being
written. However, that part of the code would never be reached because
writing the atoms would have raised an exception before that method is
called.

This commit removes this test. Now, if that part of the code is called
(which should not happen), an exception is raised when the method tries
to access AG.ids.
@IAlibay IAlibay deleted the issue1072-reindex-pdb branch May 29, 2022 12:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add option to keep original Atom IDs when writing PDB

4 participants