Adds raw_read/raw_write to exposed API by tgarc · Pull Request #182 · bastibe/python-soundfile

tgarc · 2017-01-08T22:59:43Z

adds support for reading/writing files directly from byte buffers for additional 'dtype' formats:

int24
int8
uint8

Notes:

raw_read/write only supports the dtype argument, since 'int24' has no native C type
much of the functionality of _check_buffer/_cdata_io/_check_dtype has been rewritten into these functions specifically because the current functions aren't intended to handle int24 types


bastibe · 2017-01-09T09:36:02Z

soundfile.py

                out[frames:] = fill_value
        return out

+    def read_raw(self, frames=-1, dtype=None):


I think that the dtype parameter is confusing here. Maybe instead of having both frames and dtype a simple numbytes would be preferable. At any rate, dtype can't be an argument, since the file itself has a non-mutable datatype.

That was my initial thought too but the raw_read/write functions actually require that you specify a number of bytes that is a multiple of the audio file frame size. From the libsndfile docs:

The number of bytes read or written must always be an integer multiple of the number of channels multiplied by the number of bytes required to represent one sample from one channel.

Hence, it made more sense to make the interface this way.
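The size constraint quoted above can be sketched in a few lines. This is only an illustration of the arithmetic: `BYTES_PER_SAMPLE`, `frames_to_bytes` and `check_raw_size` are hypothetical names and are not part of soundfile or libsndfile.

```python
# Bytes per sample for a few PCM widths (illustrative table, not a library API).
BYTES_PER_SAMPLE = {"int8": 1, "uint8": 1, "int16": 2, "int24": 3, "int32": 4}


def frames_to_bytes(frames, channels, dtype):
    """Return the byte count that covers `frames` whole frames."""
    return frames * channels * BYTES_PER_SAMPLE[dtype]


def check_raw_size(numbytes, channels, dtype):
    """Return the frame count, or raise if `numbytes` splits a frame."""
    frame_size = channels * BYTES_PER_SAMPLE[dtype]
    if numbytes % frame_size:
        raise ValueError(
            f"{numbytes} bytes is not a multiple of the frame size {frame_size}")
    return numbytes // frame_size


print(frames_to_bytes(1024, 2, "int24"))  # 6144
print(check_raw_size(6144, 2, "int24"))   # 1024
```

Taking a frame count and deriving the byte count can never produce an invalid size, whereas a bare numbytes argument would need a check like the second function.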

But isn't the dtype given by the audio file?

bastibe · 2017-01-09T09:37:23Z

soundfile.py

+        Returns
+        -------
+        buffer
+            A buffer containing the read data.


I do not want pysoundfile to leak CFFI data structures. I would much prefer a bytes or bytearray object instead of a cffi buffer.

@bastibe I don't think there is reason for concern here. The repr() mentions some CFFI type, but it is really just like any other Python buffer object.

I think it's the right thing to use buffer objects, since those are the lowest level Python data structures and supported by many built-in and third-party libraries.

Many things might also work with bytearray(), but I see no advantage in wrapping the buffers in bytearrays.
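To illustrate the point about buffer objects, here is a minimal sketch using only the standard library. A `memoryview` stands in for the buffer a raw read would return; that substitution is an assumption made so the example runs without CFFI.

```python
import io
import struct

# Stand-in for the buffer returned by a raw read: any object exposing the
# buffer protocol behaves the same way for the calls below.
raw = memoryview(bytearray(struct.pack("<4h", 1, -2, 3, -4)))

# Buffer objects can be written to file-like objects directly...
sink = io.BytesIO()
sink.write(raw)

# ...unpacked in place without copying first...
samples = struct.unpack_from("<4h", raw)

# ...and copied to bytes or bytearray only when a caller really needs that.
as_bytes = bytes(raw)
as_bytearray = bytearray(raw)

print(samples)                      # (1, -2, 3, -4)
print(sink.getvalue() == as_bytes)  # True
```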

bastibe · 2017-01-09T09:37:38Z

Thank you for the pull request.

What is your use case for this? Why would you ever want to read the raw binary data instead of numpy arrays?

bastibe · 2017-01-09T09:38:24Z

soundfile.py

        assert written == len(data)
        self._update_len(written)

+    def write_raw(self, data, dtype=None):


doesn't need a dtype, see above.

bastibe · 2017-01-09T09:38:41Z

soundfile.py

+
+        return _ffi.buffer(cdata)
+
+    def read_raw_into(self, buffer, dtype=None):


doesn't need a dtype, see above.

tgarc · 2017-01-09T15:44:55Z

@bastibe My use case for this is streaming audio data using sounddevice. The buffer_read/write functions are what I typically use; the only real reason for using raw_read/write is to be able to read/write 24 bit audio data directly (without upcasting/downcasting). It's definitely kind of a corner case, since I believe the OS will typically handle downcasting 32bit data to 24bit when writing to audio devices, but since the functionality was already there in libsndfile I thought it was worth exposing it.

Said another way, the raw_read/write functions are the closest thing to accessing the bytes directly from an open sndfile object.

mgeier · 2017-01-09T16:05:46Z

Just for reference, there has already been an issue about this topic: #25.

I also tried to incorporate sf_read_raw() and sf_write_raw() in #72, but I didn't find a meaningful way to do this. So I skipped them.

Regarding API, note that the word "raw" is quite ambiguous in the context of libsndfile, and since you are using buffers, there is also a high potential for confusion with buffer_read() and buffer_write().

Also note that sf_read_raw() and sf_write_raw() work only on a subset of the supported file types (I don't know exactly, probably only the RIFF-based types?), therefore I don't think it's worth supporting it.

Did you try using the wave and/or aiff modules from the standard library?
They are quite limited, but if you need packed 24bit data in memory, they could actually be the right thing to use?
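As a rough sketch of the standard-library route for WAV files, the `wave` module accepts a sample width of 3 bytes and hands back the packed frames unchanged. The sample values and the in-memory file are made up for the example, and a little-endian host is assumed.

```python
import io
import wave

# Four mono samples packed as little-endian signed 24-bit integers.
values = [0, 1, -1, 8388607]
packed = b"".join(v.to_bytes(3, "little", signed=True) for v in values)

# Write a 24-bit WAV into memory (a file path would work the same way).
container = io.BytesIO()
with wave.open(container, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(3)  # 3 bytes per sample = 24 bit
    w.setframerate(48000)
    w.writeframes(packed)

# Read the frames back: readframes() returns the packed bytes as stored.
container.seek(0)
with wave.open(container, "rb") as r:
    raw = r.readframes(r.getnframes())
    width = r.getsampwidth()

decoded = [int.from_bytes(raw[i:i + width], "little", signed=True)
           for i in range(0, len(raw), width)]
print(width, decoded)
```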

What are your concerns regarding conversion from 24bit integer to 32bit float?
Speed, size or accuracy?
IIRC, the conversion is lossless, and I could imagine that the speed difference might be negligible.
And unless you are loading huge files into memory (instead of streaming them from disk), the size difference shouldn't matter that much either, right?
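The losslessness claim can be checked with a small experiment. float32 has a 24-bit significand, so every signed 24-bit integer divided by a power of two is exactly representable. The sketch below assumes the conversion is a plain scaling by 2**23; `to_float32` and `from_float32` are hypothetical helpers, not the code libsndfile actually runs.

```python
import struct

SCALE = 2 ** 23  # full-scale value for signed 24-bit samples


def to_float32(n):
    """Scale a 24-bit integer to [-1.0, 1.0) and round it to float32."""
    return struct.unpack("<f", struct.pack("<f", n / SCALE))[0]


def from_float32(x):
    """Scale a float32 sample back to a 24-bit integer."""
    return int(x * SCALE)


# Check the extremes plus a stride through the whole 24-bit range.
samples = [-SCALE, -1, 0, 1, SCALE - 1] + list(range(-SCALE, SCALE, 4097))
lossless = all(from_float32(to_float32(n)) == n for n in samples)
print(lossless)  # True
```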

tgarc · 2017-01-10T06:35:31Z

@mgeier Thanks for pointing out those discussions; I hadn't realized you'd tried to implement this before. I think I had the same idea as you in that I thought it would be good to have a simple read_bytes kind of functionality. But as you've stated earlier (and is mentioned in the sndfile docs) raw_read and raw_write only work for a subset of audio formats. There's also a dirty little caveat in the sndfile docs:

Note : The result of using of both regular reads/writes and raw reads/writes on compressed file formats other than SF_FORMAT_ALAW and SF_FORMAT_ULAW is undefined.

So in general it's not as 'user-friendly' as the other functions. On the other hand it still provides a way to read bytes directly from several standard audio formats (which ones I'm not entirely sure yet).

I'd like to do some more testing with this and see how many formats are actually supported. I understand the concerns about adding something to the API which has incomplete functionality though.

bastibe · 2017-01-16T13:06:07Z

Wouldn't it be easier to stream the whole file over the network, and open the receiving socket with pysoundfile?

tgarc · 2017-01-22T01:38:25Z

That makes sense but I'm actually wanting to stream pcm audio to an audio device.

bastibe · 2017-01-22T10:47:38Z

Why not open the file without SoundFile, and just skip the header before streaming?
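For a plain RIFF/WAVE file, skipping the header could look roughly like the sketch below. `find_data_chunk` is a hypothetical helper that walks the chunk list until it reaches the `data` chunk. It assumes a little-endian host and uncompressed PCM, and it does not handle RF64 or other container formats.

```python
import io
import struct
import wave


def find_data_chunk(f):
    """Return (offset, size) of the 'data' chunk in a RIFF/WAVE stream."""
    riff, _, wave_id = struct.unpack("<4sI4s", f.read(12))
    if riff != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    while True:
        header = f.read(8)
        if len(header) < 8:
            raise ValueError("no data chunk found")
        chunk_id, size = struct.unpack("<4sI", header)
        if chunk_id == b"data":
            return f.tell(), size
        # Chunks are word-aligned: skip the payload plus any pad byte.
        f.seek(size + (size & 1), io.SEEK_CUR)


# Build a small 24-bit stereo WAV in memory to have something to parse.
payload = bytes(range(12))  # two stereo frames of 3-byte samples
container = io.BytesIO()
with wave.open(container, "wb") as w:
    w.setnchannels(2)
    w.setsampwidth(3)
    w.setframerate(44100)
    w.writeframes(payload)

container.seek(0)
offset, size = find_data_chunk(container)
pcm = container.read(size)  # packed PCM bytes following the header
print(offset, size, pcm == payload)
```

The bytes read after `offset` are the packed samples as stored in the file, which is what the raw read functions would have returned for this format.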

The point of SoundFile is to be able to decode audio files. If you want to explicitly not decode them, why use SoundFile?

I'm sorry, but I am going to reject this pull request. A good library is a library that does one minimal job, and while this is certainly a worthwhile functionality, I don't think that it is a good fit for this library.

tgarc added 5 commits January 8, 2017 15:18

updated raw_read/write functions docs

e0c9880

expose needs_endswap

9802c23

removed additional type from _ffi_types

431f6f1

fixed a variable name mistake in read_raw_into

a367046


bastibe closed this Jan 22, 2017

mgeier mentioned this pull request Mar 7, 2017

Create 8 Bit Alaw File #194

Closed



3 participants
