Fix py3 compatibility regarding the HydrogenBondAnalysis.save_table() #1744
xiki-tempula wants to merge 2 commits into MDAnalysis:develop from xiki-tempula:py3_save
  self.generate_table()
- with open(filename, 'w') as f:
+ with open(filename, 'wb') as f:
      cPickle.dump(self.table, f, protocol=cPickle.HIGHEST_PROTOCOL)
This is actually not good for python 2/3 compatibility. It means that a pickle written with python 3 won't be readable with python 2. What is the type of table? Is it a numpy array? Then we have more portable options.
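To make the portability concern concrete, here is a minimal sketch (not the code in this PR). It assumes a plain list as a hypothetical stand-in for the table and picks protocol 2, the newest pickle protocol that Python 2 can read. Whether a numpy recarray written this way actually loads across interpreters is not verified here.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the analysis table (the real one is a numpy recarray).
table = [(0.0, 1, 2.9), (1.0, 4, 3.1)]

path = os.path.join(tempfile.mkdtemp(), "hbond_table.pickle")

# Binary mode is required on Python 3 because pickle writes bytes.
# Protocol 2 is the newest protocol that Python 2 can still read;
# pickle.HIGHEST_PROTOCOL is larger than 2 on Python 3.
with open(path, "wb") as f:
    pickle.dump(table, f, protocol=2)

with open(path, "rb") as f:
    loaded = pickle.load(f)
```

Pinning the protocol trades the newer protocols' efficiency for files that an older interpreter can still open.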
@kain88-de Thank you for the comment. I have tested reading and writing in py3 but have only checked it in py2.
This is a numpy recarray. I guess np.save() might be a good option?
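A rough sketch of what the np.save() route could look like. The field names (time, donor_index, distance) are made up for illustration, since the real columns of the table are not shown in this thread.

```python
import os
import tempfile

import numpy as np

# Hypothetical fields; the real table's columns are not shown in this thread.
table = np.rec.fromrecords(
    [(0.0, 1, 2.9), (1.0, 4, 3.1)],
    names="time,donor_index,distance",
)

npy_path = os.path.join(tempfile.mkdtemp(), "hbond_table.npy")

# allow_pickle=False makes np.save refuse object arrays, so the file
# contains only the .npy header and raw data, no pickle stream.
np.save(npy_path, table, allow_pickle=False)

restored = np.load(npy_path, allow_pickle=False)
# np.load returns a plain structured ndarray; view it as a recarray
# to get attribute access (restored.time) back.
restored = restored.view(np.recarray)
```

This only works without pickle as long as no field has an object dtype.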
I would use np.savetxt; then a user can consume the table however they like. But this is a deprecation. So if this is read from somewhere, we should allow reading a pickle for some time. Maybe even writing a pickle is OK until version 1.0 (if the user really wants to).
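For comparison, a small sketch of the np.savetxt route, again with hypothetical field names and per-column formats chosen only for this example.

```python
import os
import tempfile

import numpy as np

# Hypothetical fields; the real table's columns are not shown in this thread.
text_table = np.rec.fromrecords(
    [(0.0, 1, 2.9), (1.0, 4, 3.1)],
    names="time,donor_index,distance",
)

txt_path = os.path.join(tempfile.mkdtemp(), "hbond_table.txt")

# One format per field; the header line is written as a "# ..." comment.
np.savetxt(
    txt_path,
    text_table,
    fmt=["%.1f", "%d", "%.3f"],
    header=" ".join(text_table.dtype.names),
)

# Any tool can read this back; np.loadtxt skips the comment line and
# returns a plain 2D float array (field names and dtypes are lost).
columns = np.loadtxt(txt_path)
```

The text file is readable from either Python version and from non-Python tools, at the cost of losing the per-field dtypes on the way back in.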
richardjgowers left a comment
Needs tests to identify the bad behaviour and make sure we never regress.
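Such tests might look roughly like the sketch below. Here save_table() is a hypothetical free function standing in for the real method, with an extra mode argument added only so the old text-mode behaviour can be exercised.

```python
import os
import pickle
import sys
import tempfile


def save_table(table, filename, mode="wb"):
    """Hypothetical stand-in for HydrogenBondAnalysis.save_table()."""
    with open(filename, mode) as f:
        pickle.dump(table, f, protocol=pickle.HIGHEST_PROTOCOL)


def test_save_table_roundtrip():
    # The fixed behaviour: write in binary mode, read the same data back.
    table = [(0.0, 1, 2.9)]
    path = os.path.join(tempfile.mkdtemp(), "table.pickle")
    save_table(table, path)
    with open(path, "rb") as f:
        assert pickle.load(f) == table


def test_text_mode_fails_on_py3():
    # The bad behaviour: on Python 3, pickle writes bytes and a file
    # opened in text mode rejects them with a TypeError.
    path = os.path.join(tempfile.mkdtemp(), "table.pickle")
    try:
        save_table([(0.0, 1, 2.9)], path, mode="w")
    except TypeError:
        failed = True
    else:
        failed = False
    assert failed == (sys.version_info[0] >= 3)
```

In the real test suite the round-trip test would go through the actual method on a small trajectory rather than a stand-in.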
The alternative approach is to remove the method. Having a convenience method seemed, well, convenient at the time, but if this requires us to worry about data formats then that's not good. Maybe use datreant.data or some other tools. It makes sense if you want to re-analyze your data with the same functions. But we don't have a
This is not going to be supported much longer, for exactly the same reason you mention here. Storing binary data is surprisingly hard for arbitrary objects.
Yeah, maybe this doesn't make much sense then anyway. @xiki-tempula how do you use the save feature?
@kain88-de Thanks for your comment. To be honest, I don't use this feature. I will just pickle whatever I want, which is sometimes the whole HydrogenBondAnalysis object or sometimes only the timeseries attribute. I agree with @orbeckst. It is probably better to just get rid of this function.
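If the function is retired gradually rather than deleted outright, a deprecation shim could look something like this sketch. The function name, message wording, and protocol choice are all assumptions for illustration, not what MDAnalysis does.

```python
import os
import pickle
import tempfile
import warnings


def save_table(table, filename):
    """Hypothetical deprecated convenience writer."""
    warnings.warn(
        "save_table() is deprecated; pickle the data yourself instead",
        DeprecationWarning,
        stacklevel=2,
    )
    with open(filename, "wb") as f:
        pickle.dump(table, f, protocol=2)


# Record warnings so the deprecation can be checked programmatically.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    save_table([(0.0, 1, 2.9)], os.path.join(tempfile.mkdtemp(), "t.pickle"))
```

Existing callers keep working for a release or two while the warning points them at the replacement.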
Then I suggest we purge git grep 'save\w*(self' gives Let's open another issue, and then we have to go through these methods and find out if they are absolutely necessary or can be removed. We can discuss more on the new issue. Does this sound like a sensible way forward?
@orbeckst I agree. Actually, there is another entity.
Just need to add a few incendiary remarks here:
It seems to me that the Python ecosystem has problems with solving this issue, together with a certain disregard for long-term stability. It wouldn't so much be an issue if everyone could agree to keep formats and file access stable. In principle, libraries such as netcdf or hdf5 or XDR (as used in Gromacs formats) are portable means to store data.
Thanks, I had forgotten to pull the latest. I edited my comment.
We can discuss this on the datreant repo. But we do plan to keep the data module semi-alive in mdsynthesis so old code can still work.
Fixes #1743
Changes made in this Pull Request:
Change the HydrogenBondAnalysis.save_table() method so that it will work under py3. I can add a test to prove that the current implementation works in both py2/3, but I'm not sure whether it is necessary, as technically I only changed one character.
PR Checklist