Skip to content

Add flush() and close()#14

Closed
mgeier wants to merge 1 commit intobastibe:masterfrom
mgeier:flush-and-close
Closed

Add flush() and close()#14
mgeier wants to merge 1 commit intobastibe:masterfrom
mgeier:flush-and-close

Conversation

@mgeier
Copy link
Contributor

@mgeier mgeier commented Feb 25, 2014

This is meant as a proof-of-concept implementation, there is definitely room for improvement.

@bastibe
Copy link
Owner

bastibe commented Feb 26, 2014

I implemented flush in 42dc599, just like you proposed. Thank you for the idea!

@bastibe
Copy link
Owner

bastibe commented Feb 26, 2014

I think that if we were to provide a close(), we should definitely also provide an open(). Otherwise a SoundFile zombie would be perfectly useless. This however seems to imply that the SoundFile should be created a zombie and only come to life in a context manager or open().

I would find that behavior much worse than the current one. What do you think?

As for __del__, I think it is indeed guaranteed to be called when the object goes out of scope. The only issue is that some implementations might postpone deletion to the next garbage collection run instead of calling __del__ immediately after the reference count drops to zero. This is typically still in a very reasonable time frame though, so I don't think that would be a problem. Any thoughts on that?

@mgeier
Copy link
Contributor Author

mgeier commented Mar 2, 2014

Thanks for implementing flush().

A SoundFile zombie is indeed useless, but I think that's expected behavior. In Python itself it works the same way.
The built-in function open() is used to create a file object of some kind. This file object has a close() method which closes the file and leaves the object in a zombie state.
If you try to use any of the methods afterwards, you get ValueError: I/O operation on closed file.
The attributes are still accessible, e.g. encoding. I just saw that there is also an attribute closed which conveys the fact that the file is indeed closed.
There is no open() method to re-open the file. It's dead. It has ceased to be.
My original idea to completely strip out the object's guts with vars(self).clear() seems not to be what people would expect either. Attributes should stay and methods should generate meaningful error messages.

Long story short, I think an open() method is not needed and people wouldn't expect it.

An open() function, however, would make very much sense, e.g.:

import pysoundfile as sf
f = sf.open('myfile.wav', ...)
f.write(...)
f.whatever ...
f.close()

A context manager would be even better, but let's assume an interactive session here.

If the pysoundfile module gets an open() function, users wouldn't even need to know the class SoundFile, it doesn't get much easier than that. And it's perfectly mimicking Python's behavior.

The SoundFile constructor can of course still be used, but I think the open() function is nicer and easier to use for both novices and experienced Pythonistas.

Regarding __del__(): I'm not an expert and I just recently read about this, but it is my understanding that it is not guaranteed that __del__() is called at all!

Citing http://docs.python.org/3.3/reference/datamodel.html:

An implementation is allowed to postpone garbage collection or omit it altogether

and

It is not guaranteed that __del__() methods are called for objects that still exist when the interpreter exits.

Even if these should be extremely rare cases, it would be reason enough for me to provide a close() method and be done with it. The close() method can of course be called in __del__(), just for the cases where a user forgot to call it manually (or via context manager).

@bastibe
Copy link
Owner

bastibe commented Mar 3, 2014

You finally convinced me. An open function and a close method really do make sense. So we'd have a function open that just returns a new SoundFile instance. This would even re-use the existing context manager, which is great! Similar functionality would probably be useful for PySoundCard as well.

Most of the arguments to open would get passed straight to the SoundFile constructor. I propose a few changes though:

  • read_mode and write_mode and read_write_mode should be called "r", "w" and "rw". I have wanted to do this for a long time, but I didn't want to break backwards compatibility. The open function is a much better place for it though.
  • Maybe the open function should accept file formats as strings instead of integer constants? This would mean fewer imports, but less flexibility. It would make open a higher level interface, and leave the actual constructor for more in-depth tasks.

What do you think about this?

@marcecj
Copy link

marcecj commented Mar 3, 2014

Warning: following are my thoughts from thinking about these issues for just a short time, and without being intimate with the pysoundfile source code.

I don't understand the problem with read_mode etc.. Consumers of pysoundfile should be using the module-level constants pysoundfile.read_mode etc. anyway (I mean, nobody in their right mind would pass mode=48, would they?), so couldn't you just redefine those while simultaneously changing the API to match? Or am I missing something? Hell, if you're really worried about backwards compatibility, you could always check if mode is an int or a string and handle those cases separately, maybe with a deprecation warning in the int case.

As to the second point, from what I can tell that would make sense. AIUI that would limit you to a handful of (common) formats (well, as many as you want, but at some point I expect it to get tedious ;) ), but I suspect that's OK for a higher-level interface that might be used by applications that don't care about all the niche formats that libsndfile supports (i.e., it would be the "stupid" interface to pysoundfile). Of course, you could again handle strings and int's separately. You would probably use a dict mapping anyway, like for the snd_* globals, so the constructor would be either blindly accepting an int or mapping a string to an int via the dict (similarly to the four *_file etc. globals), raising your line count by three (slightly more if you add some argument checking). Then if you want to use a common format, pass a string, and if you have special needs, | your int together however you like.

@mgeier
Copy link
Contributor Author

mgeier commented Mar 3, 2014

I'm glad I could convince you!

Now I guess the open() function deserves its own issue, right?

I don't think it's a good idea to give open() and SoundFile.__init__() different sets of parameters.
This will be really hard for users to learn. I think open() should merely forward everything to the SoundFile constructor.
I don't see why an argument that could be "improved" in open() shouldn't also be "improved" in the SoundFile constructor.
I wouldn't see open() as part of the high-level convenience interface.
I think all of libsndfile's options should be available and the calls should look reasonably Pythonic.

Regarding open mode:
I think @marcecj's idea of allowing both numeric and string arguments is good if we really care about backward compatibility.
However, I think currently we are working towards major changes/additions and I think it wouldn't be that bad if in a few months there is a new version with a few breaking changes (as long as it doesn't happen in every version from then on).
I think it is closer to Python if (only) 'r'/'w'/'rw' are supported (like in Python's open() function). I don't think the original libsndfile names are important here, especially as @bastibe
renamed them anyway (consistently they would have been called M_READ, M_WRITE and M_RDWR).

Regarding file formats:
I don't like string arguments here, I would prefer module-level numeric constants like sf.WAV, sf.FLAC etc.
Because of the sheer number of combinations we should stay as close as possible to libsndfile.
What about keeping the conceptual separation of major and minor formats? With reasonable default values this should become quite manageable.
The major file type could be guessed from the file extension (but still overrideable) and every major format should have a default minor format (which we would have to define, because it's not specified in libsndfile).

An example where all options are set by the user:

myfile = sf.SoundFile('myfile.wav', 44100, 'w', channels=6, format=sf.WAVEX, subtype=sf.PCM_24, endian=sf.LITTLE)

Another example with more defaults (channels=2, format=sf.WAV (from file extension), subtype=sf.PCM_16 (default subtype for sf.WAV), endian=FILE (default endian-ness)):

myotherfile = sf.SoundFile('myotherfile.wav', 44100, 'w')

I don't see the necessity for pre-defining tuples of (format, subtype, endian), because each format has IMHO quite an obvious choice for a default subtype and endian is by default FILE, even in libsndfile.

@marcecj: I think dicts are not needed here, the numeric constants can be defined at module level, e.g. WAV = 0x010000 and the combinations from (format, subtype, endian) would be |-ed together by the library, the user doesn't need to do that.
The only dict which would be needed is the mapping from major formats to default subtypes.

@mgeier
Copy link
Contributor Author

mgeier commented Mar 6, 2014

I implemented some of the discussed points in #18.

There is and open() function which forwards everything, without changes, to the SoundFile constructor (2948783).

I changed the open mode to r/w/rw which are mapped to the module-level constants _M_READ, _M_WRITE and _M_RDWR. I also removed the underscore from self.mode because that's no secret (2572742).

I also changed the handling of file types as described above.
Now the abovementioned examples actually work like this (only the argument order changed slightly):

import pysoundfile as sf
myfile = sf.open(name='myfile.wav', mode='w', sample_rate=44100, channels=6, subtype=sf.PCM_24, endian=sf.LITTLE, format=sf.WAVEX)

The same thing can also be used with only positional arguments, if desired:

myfile = sf.open('myfile.wav', 'w', 44100, 6, sf.PCM_24, sf.LITTLE, sf.WAVEX)

The format can also be OR'ed together, if that's preferred:

myfile = sf.open('myfile.wav', 'w', 44100, 6, format=sf.WAVEX|sf.PCM_24|sf.LITTLE)

In most cases the file type can be obtained from the file extension, in this case WAV would be used instead of WAVEX (little endian is default anyway):

myfile = sf.open('myfile.wav', 'w', 44100, 6, sf.PCM_24)

And finally, if the default 16-bit samples are desired, it get's shorter once again:

myfile = sf.open('myfile.wav', 'w', 44100, 6)

I would still regard this as the low-level interface because you still have to use the write() method and then close the file (or use a context manager).
BTW: I didn't use a default value channels=2, because I think that would be too much (and it would complicate option handling).
It get's even easier with the high-level functions read() and write():

data = np.array(...)
sf.write(data, 'myfile.wav', 44100)

That's exactly the minimalism I'm after, in most cases I don't want to write more (but neither less, I wouldn't want a default value for sample_rate).
The rest of the examples would look like this:

sf.write(data, 'myfile.wav', 44100, sf.PCM_24)
sf.write(data, 'myfile.wav', 44100, format=sf.WAVEX|sf.PCM_24|sf.LITTLE)

The other named arguments work too, of course.
Even that works (although I'm not sure why because I'm specifying the subtype and format get's the default value sf.WAV):

sf.write(data, 'myfile.wav', 44100, sf.WAVEX|sf.PCM_24)

And read is easy anyway:

myarray = sf.read('myfile.wav')

This reads the whole file by default, a part of a file can be read with the frames, start and stop arguments:

myarray = sf.read('myfile.wav', 22050)

The only special case are RAW files, but they can also be handled:

myarray = sf.read('myfile.raw', sample_rate=44100, channels=2, subtype=sf.PCM_24)

Unlike I expected, I had a hard time finding default subtypes for most formats, because I really know only few of them. Missing mappings should be added (see the module-level dict _default_subtypes).
The type-tuples which were defined before (and removed by me) do still work as expected, though.

@bastibe
Copy link
Owner

bastibe commented Mar 7, 2014

I like this very much! Do you want to package these changes into a single pull-request? If not, I can just cherry-pick them from #18.

@mgeier
Copy link
Contributor Author

mgeier commented Mar 8, 2014

I'm glad you like it!

I'll have a look at #18 and will write a comment there which commits you could directly cherry-pick. The rest I will combine into one or more new commits.

@bastibe
Copy link
Owner

bastibe commented May 27, 2014

Why is this still open? I think we implemented this in 42dc599 and e01b7ab.

@bastibe bastibe closed this May 27, 2014
mgeier added a commit that referenced this pull request May 29, 2014
This is the combination of a few commits from #18, plus a few more
things.
See also #14.
@mgeier mgeier deleted the flush-and-close branch June 18, 2014 08:13
@mgeier mgeier mentioned this pull request Dec 11, 2014
mgeier added a commit that referenced this pull request Dec 17, 2014
Now the integer mode constants are called READ, WRITE and RDWR.

They are still in the module namespace, but now they have their own type
_ModeType (derived from int).

Alternatively, 'r'/'w'/'rw' and 'READ'/'WRITE'/'RDWR' (case-insensitive)
can be used in the SoundFile constructor.
This includes the strings suggested by @bastibe in #14.

The strings 'r+'/'w+'/'a'/'a+'/... from the standard Python file objects
are not used, because their behaviour is different from the modes in
libsndfile which would lead to confusion.

The mode is now the second constructor argument, right after the file
name. This way, 'rw' can be used as positional parameter:

f = SoundFile('myfile.wav', 'rw')
mgeier added a commit that referenced this pull request Dec 17, 2014
mgeier added a commit that referenced this pull request Dec 17, 2014
This is the combination of a few commits from #18, plus a few more
things.
See also #14.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants