This is a very good idea. If in doubt, I always prefer the pythonic way to the sndfile way. I pretty much agree with everything you wrote. This would be a backwards incompatible change (though a rather trivial one). I think it is well-justified though, since it makes using PySoundFile more intuitive to Pythonistas. After all, PySoundFile is still tagged alpha.
Why would we not want to use
Are file descriptors really consistent between Windows and Unix? We should definitely test all the corner cases on all OSes before committing to file descriptors or file objects.
We should definitely test this thoroughly as well. Especially on non-SSD hardware. I have never done any performance measurements for PySoundFile. Maybe it would make sense to build a small benchmark program for comparing different versions/OSes. |
I think it just cannot work.
Also this sentence from the reference sounds like it's not a good idea: I tried it anyway but libsndfile gave me this error: I'm not quite sure what that means, but it sounds like libsndfile cannot really deal with this situation.
Sure, but you can have this quite easily by using
No idea. But I didn't find anything in the docs that suggests any problems. I have no idea how to sensibly test the buffering stuff. But if you have an idea how to test this, we can of course try! I started implementing the suggested changes and it seems to work quite well for the most part. I think the most severe change will be that we have to delete any possibly available file content (including the file header) in I don't know yet how to handle file-like objects in this case. What do you think? |
|
I didn't realize that truncating the file deletes the header as well. But it makes perfect sense, of course. It might be worth considering to "do what I mean" instead of doing it the same way as Basically, we would apply the Unix file modes on the data, but not on the header. Personally, I quite like this. I think it makes sense that truncating a SoundFile does not delete its header. What is your opinion on this? I'll try to implement a small performance benchmark next weekend. Maybe this will tell us something about the differences between platforms and implementations. As for |
|
That is a lot of modes. Let's summarize:
Did I miss any? |
No, I think those are all.
I think we should embrace either the Python modes or the libsndfile modes. What you're describing sounds like a mixture of both. The real problem, however, will be that most of your suggested modes actually need
But we would actually have to use
This would also need And both
The old I think
No, you're right. Therefore, I think we can just do it like you suggested, pass it to libsndfile and see what happens. I didn't find a formal specification for the behavior of the |
|
This seems to be fraught with ambiguities and unexpected behavior. I think we should experiment with the actual behavior of sndfile and see what does or doesn't work. If most of the common functionality works as expected, I am all for using pythonic modes. If there are too many weird cases, we should probably stick with sndfile's modes. Regardless, we can rename |
|
I implemented the discussed changes, and I think they work quite well. So far, I've tested them on Debian Linux with Python 2.7.8 and 3.4.1. |
|
I just tried it on OSX 10.6 and the tests are passing (Python 2.7 and 3.3). |
|
This looks very good! I would have thought this to be a lot more complicated. |
|
Yes, this looks better. This is a terrific contribution! While the impact on actual usage is quite minor, I think that this level of attention to detail goes a long way in making PySoundFile more pythonic! |
|
It seems that using file descriptors is problematic on Windows (see #63), therefore we shouldn't use file descriptors when the user specifies a file name. We could simply use the actual Python file object, which would be a trivial change to this PR.
Point 1. is probably irrelevant, as calling those callback functions will probably need far fewer resources than reading/writing the data. Another possibility would be to re-investigate the options from #59. |
|
Another option, which just came to my mind, would be to use I don't know if that really improves anything, though. |
|
A third option would be to try to use file descriptors, and fall back to virtual I/O if that fails (and don't tell anyone that this basically means "every time on Windows"). I don't think that the error message issue with virtual I/O is particularly bad, since error messages in the callbacks are very unlikely. We'd have to investigate performance, though. |
I don't know if this would work ... I suspect that if I think we would have the same problem with the second option in #59. But probably my worries are unjustified? |
|
I guess we'll have to try it to find out. |
|
I switched from file descriptors to virtual I/O in c971db2. To my great surprise, the tests now run about 40% faster on my Linux computer! |
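For readers unfamiliar with virtual I/O: libsndfile can be given a set of callbacks (file length, seek, read, write, tell) instead of a file name or descriptor. The following is only a plain-Python sketch of what such callbacks do on top of a file-like object; the class name VirtualIO is hypothetical, the real callbacks receive C pointers through CFFI, and this is not the code from c971db2.

```python
import io


class VirtualIO(object):
    """Hypothetical plain-Python analogue of virtual I/O callbacks."""

    def __init__(self, file):
        self.file = file  # any seekable binary file-like object

    def get_filelen(self):
        # Determine the size by seeking to the end, then restore the position.
        pos = self.file.tell()
        self.file.seek(0, io.SEEK_END)
        size = self.file.tell()
        self.file.seek(pos, io.SEEK_SET)
        return size

    def seek(self, offset, whence):
        self.file.seek(offset, whence)
        return self.file.tell()

    def read(self, buf, count):
        # Copy up to `count` bytes into the caller's buffer (a bytearray here).
        data = self.file.read(count)
        buf[:len(data)] = data
        return len(data)

    def write(self, data, count):
        self.file.write(bytes(data[:count]))
        return count

    def tell(self):
        return self.file.tell()


vio = VirtualIO(io.BytesIO(b'abcdef'))
buf = bytearray(4)
n_read = vio.read(buf, 4)
```

Because everything goes through the file object's own methods, the same code path works for real files and for in-memory objects such as io.BytesIO.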
Information about what is allowed in the "file" argument should be written into the docstring, not into the error message.
|
I just tested it on Windows with This deselects the 38 test cases which use file descriptors. The other 109 test cases pass! I wouldn't disable those tests automatically on Windows, because there may be some systems where it actually works. |
|
Cool! |
... in order to be compatible with Python file objects: https://docs.python.org/3.4/library/io.html#io.IOBase.seekable
|
I vote for merging #60. This looks very good! |
Change open modes to be more in line with Python's open() function
Until very recently I had the opinion that if a feature is available in both libsndfile and plain old Python, we should do it in libsndfile's way.
It seems that my opinion is changing ...
I wanted to fix SoundFile.__init__() (see #59) and got the idea that probably the real problem is that libsndfile tries to handle different situations within a single SFM_RDWR mode.
One solution to the whole problem would be to use Python's native file modes instead of the ones provided by libsndfile.
Initially, I thought this wouldn't be possible (see commit message of a3295f7), but it's actually only impossible if read-write mode has to be one single mode.
I'm proposing to allow the following modes:
- 'rb', 'r' (for convenience, we would internally add 'b' to that because we only ever deal with binary files)
- 'wb', 'w', 'xb' (raises FileExistsError to avoid accidental overwriting), 'x'
- 'r+b', 'r+'
- 'w+b', 'w+', 'x+b', 'x+'
In addition, we could allow mode=None where the mode is obtained from the mode attribute of the user-provided file-like object.
All other Python file modes should be disallowed, especially 'a' (because I think it wouldn't play nicely with libsndfile) and 't'.
Our current 'rw' mode should also be disallowed.
In the documentation, we could state that "'b' is always implied" and use the variants without 'b' for simplicity.
However, this would be a difference to Python, where 't' is implied.
The default could be mode='r', as before.
All this would have the additional advantage that 'x' mode would be possible, making sure that no files are accidentally overwritten.
In order to implement this, we would have to use Python's open() function and then pass the opened file to libsndfile.
This could be done either as file object or as file descriptor.
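The mode rules proposed above could be sketched as a small validation helper. This is only an illustration under assumptions: the name normalize_mode and its error messages are hypothetical and not part of PySoundFile, and the helper also accepts other placements of 'b' (such as 'rb+'), which goes slightly beyond the spellings listed in the proposal.

```python
# Base modes from the proposal, written without the implied 'b'.
BASE_MODES = {'r', 'w', 'x', 'r+', 'w+', 'x+'}


def normalize_mode(mode, file=None):
    """Return a binary open() mode string for a proposed mode (hypothetical helper)."""
    if mode is None:
        # mode=None: take the mode from the user-provided file-like object.
        mode = getattr(file, 'mode', None)
        if mode is None:
            raise ValueError("mode=None needs a file object with a 'mode' attribute")
    if not isinstance(mode, str):
        raise TypeError("mode must be a string or None")
    if 't' in mode:
        raise ValueError("text mode is not allowed: %r" % mode)
    base = mode.replace('b', '', 1)  # 'b' is always implied
    if base not in BASE_MODES:
        # This rejects 'a' and the old 'rw' mode, among others.
        raise ValueError("invalid mode: %r" % mode)
    return base + 'b'
```

With this, normalize_mode('r') gives 'rb' and normalize_mode('x+') gives 'x+b', while 'a', 'rt' and 'rw' raise ValueError.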
I guess libsndfile does its own buffering (?), therefore it would make sense to use open(..., buffering=0).
I guess the "virtual I/O" callback functions are a little less efficient, so it may be better to get the file descriptor with fileno() and pass this to libsndfile.
In order to be consistent with Python's open(), we should also make sure that the read and write positions are modified accordingly after opening.
For the same reason, we should probably also drop the which argument from seek(), as already suggested by @bastibe (I don't remember where exactly; update: https://github.com/bastibe/PySoundFile/pull/35#issuecomment-49274456).
It would also finally make sense to add a tell() method (see #44).
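To make the behavior of Python's open() concrete, here is a minimal sketch using only the standard library. The helper open_unbuffered and the file name example.raw are assumptions for illustration; nothing here calls libsndfile, it only shows what the file looks like right after opening in the different modes.

```python
import os
import tempfile


def open_unbuffered(path, mode):
    """Open a binary file without Python-level buffering; return (file, fd)."""
    f = open(path, mode, buffering=0)  # buffering=0 requires a binary mode
    return f, f.fileno()


tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'example.raw')

# 'xb' creates the file and refuses to overwrite an existing one.
f, fd = open_unbuffered(path, 'xb')
f.write(b'HEADERDATA')
f.close()
try:
    open_unbuffered(path, 'xb')
    exclusive_failed = False
except FileExistsError:
    exclusive_failed = True

# 'r+b' keeps the existing content; the position starts at 0.
f, fd = open_unbuffered(path, 'r+b')
pos_rplus = f.tell()
size_rplus = os.fstat(fd).st_size
f.close()

# 'w+b' truncates everything, which for a sound file includes the header.
f, fd = open_unbuffered(path, 'w+b')
size_wplus = os.fstat(fd).st_size
f.close()

os.remove(path)
os.rmdir(tmpdir)
```

The truncation in 'w' and 'w+' modes is the reason any existing header would have to be written again after opening.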