Serialize FileIO and TextIOWrapper and Universe#2723
Serialize FileIO and TextIOWrapper and Universe#2723orbeckst merged 120 commits intoMDAnalysis:developfrom
Conversation
Codecov Report
@@ Coverage Diff @@
## develop #2723 +/- ##
===========================================
+ Coverage 92.80% 92.86% +0.06%
===========================================
Files 185 186 +1
Lines 24205 24383 +178
Branches 3133 3149 +16
===========================================
+ Hits 22463 22643 +180
+ Misses 1696 1694 -2
Partials 46 46
Continue to review full report at Codecov.
|
|
@yuxuanzhuang ignore the failing Python 2.7 tests, once we release 1.0 we're not supporting Python 2 any more. |
package/MDAnalysis/lib/util.py
Outdated
| return buffer | ||
| elif mode == 'rt' or mode == 'r': | ||
| return TextIOPickable(buffer) | ||
|
|
There was a problem hiding this comment.
I think this should fail for anything else (ValueError for unsupported mode).
There was a problem hiding this comment.
Maybe add a comment here saying that we should never get here; it's not obvious until you read all code up to this point.
Or better
assert False, "mode = {} argument was never processed".format(mode)There was a problem hiding this comment.
Do you mean replacing the ValueError for unsupported mode?
package/MDAnalysis/lib/util.py
Outdated
|
|
||
| # not as comprehensive as built-in open func--no need for other args | ||
| # only should be used for 'reading' modes | ||
| def pickle_open(name, mode): |
| def test_offset(f): | ||
| f.readline() | ||
| f_pickled = pickle.loads(pickle.dumps(f)) | ||
| assert_equal(f.tell(),f_pickled.tell()) |
There was a problem hiding this comment.
I can recommend running a formatting tool with every save. I use black for python nowadays.
IAlibay
left a comment
There was a problem hiding this comment.
just a couple of formatting/test changes
|
I guess these func and classes should be something for internal use, i.e. being called by |
This might be the case right now. But sometimes they find other uses, too. You could start out with them being "private" (but still fully documented) and we can later make them public with a defined API.
|
orbeckst
left a comment
There was a problem hiding this comment.
Overall this looks very good and promising. I have a bunch of comments and suggestions (see inline), mainly
- docs
- suggestion to internally change the order of the pickle (purely for aesthetics and code readability)
- spelling: should be "picklable" instead of "pickable" everywhere; I made suggestions but didn't correct everything (such as all the tests are still "pickable")
If pickle_open() is used elsewhere then we should also provide a corresponding context manager. I don't know how urgent that is and it does not need to happen in this PR.
As I also said in another comment, it might be worthwhile splitting lib.utils and moving all file/IO related functionality into its own module.
package/MDAnalysis/lib/util.py
Outdated
| return buffer | ||
| elif mode == 'rt' or mode == 'r': | ||
| return TextIOPickable(buffer) | ||
|
|
There was a problem hiding this comment.
Maybe add a comment here saying that we should never get here; it's not obvious until you read all code up to this point.
Or better
assert False, "mode = {} argument was never processed".format(mode)|
Yeah I'd put this in a new file then maybe import it into the |
|
@richardjgowers do you mean a module |
|
@orbeckst the file can be called anything, so either of those are fine. I don't think a user needs to understand the file layout of things, instead in from picklable_file_io import FileIOPicklable, ...so |
| ('CHAIN', [PDB, PDB, PDB], dict()), | ||
| ('CHAIN', [XTC, XTC, XTC], dict()), | ||
| ]) | ||
| def ref_reader(request): |
There was a problem hiding this comment.
Would it make sense to put that test in BaseReaderTest (and maybe _SingleFrameReader) so any new reader get tested? While it looks cleaner to have the test here, I am worried we will miss readers, especially when new readers get added.
There was a problem hiding this comment.
Do you mean cover all the _SingleFrameReader as how class TestPDBReader(_SingleFrameReader) is implemented? As for pickle test in BaseReaderTest and MutliframeReaderTest, they were already added.
There was a problem hiding this comment.
It is what I meant, indeed. My concern when I see this fixture is that we will forget to add the new readers to the list. Ideally I would like something that fails if a reader is not tested for pickling and multiprocessing.
There was a problem hiding this comment.
I think for now, only tests in NAMDBIN, PDB, PQR utilize _SingleFrameReader. Makes sense to create a new PR for other single frame readers e.g. MMTF, to be inherited from that? (which it along can be backported.)
There was a problem hiding this comment.
I'd be in favor of getting the PR merged and open a new one where we think hard about how to do consistent testing of all our readers and parsers.
We'll also need to look at converters... but that's a different issue. Literally.
IAlibay
left a comment
There was a problem hiding this comment.
I think any further questions I have re: documentation can/should be documented in the userguide or similar. I'm good with this, thanks for all the work @yuxuanzhuang !
|
ping reviewers @IAlibay @fiona-naughton – can you please check if your comments were addressed? If yes, please approve the PR. If not, please point out specifically what needs to be done, in the form of actionable requests. As discussed on the call, we'd like to get this merged ASAP so that @yuxuanzhuang can move forward. |
|
Oops... sorry @IAlibay – you approved while I was writing my message. Thanks! |
|
looks good to me - nice work @yuxuanzhuang ! |
|
@yuxuanzhuang congratulations – it is done! 🎉 |
|
Thanks for all the help! |
- Fixes #2878 - Changes made in this Pull Request: - add more comprehensive tests for pickling low-level dcd and xdr files - fix a bug that was introduced in #2723 (wrong seeking in DCD and XDR, note: this bug was only in develop, never in released code, see discussion in PR #2908) - update CHANGELOG - maybe backport in #2908
* Fixes MDAnalysis#2878 * basic approach: composition instead of inheritance for pickling Universe (which was tested in PR MDAnalysis#2704 (which was derived from PR MDAnalysis#2140)) * Changes made in this Pull Request: - add new classes and pickle_open function to picklable_file_io.py - add new picklable `BufferedReader`, `FileIO`, and `TextIOWrapper` classes. - implement `__getstate__`, `__setstate__` to `Universe` and `BaseReader` - fix DCD, XDR pickle issue - modified gsd and ncdf to be picklabel - modified ChainReader to be picklabel - ensure chemfiles is picklable * add tests (MultiFrameReader will test for serializability) * update CHANGELOG * update docs Note: This merge squashed 120 commits. See PR MDAnalysis#2723 for the full history with 420 comments.
- Fixes MDAnalysis#2878 - Changes made in this Pull Request: - add more comprehensive tests for pickling low-level dcd and xdr files - fix a bug that was introduced in MDAnalysis#2723 (wrong seeking in DCD and XDR, note: this bug was only in develop, never in released code, see discussion in PR MDAnalysis#2908) - update CHANGELOG - maybe backport in MDAnalysis#2908
Fixes #2878
Universein Updating Serialization Functionality from PR #2140 #2704Changes made in this Pull Request:
pickle_openfunction topicklable_file_io.py__getstate__,__setstate__toUniverseandBaseReaderPR Checklist