Several changes based on #18 (#22)
mgeier wants to merge 31 commits into bastibe:master from mgeier:many-changes
All constants are now in one place. In addition, _getAttributeNames() was added, which helps auto-completion in IDEs.
In new-style classes the base class __setattr__() has to be called instead of inserting the value into self.__dict__. See http://docs.python.org/2/reference/datamodel.html#customizing-attribute-access
Closes #14
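To illustrate the `__setattr__()` point from the commit message above, here is a minimal sketch. The class `Info` and its attribute names are hypothetical and not taken from the PR; only the delegation to the base class reflects what the commit describes.

```python
class Info(object):
    """Hypothetical new-style class that restricts attribute names."""

    _allowed = ('title', 'artist')

    def __setattr__(self, name, value):
        if name not in self._allowed:
            raise AttributeError("unknown attribute: %r" % name)
        # Delegate to the base class instead of writing to self.__dict__,
        # so that properties and other descriptors keep working.
        super(Info, self).__setattr__(name, value)


info = Info()
info.title = 'Demo'
```

Assigning `info.title` succeeds, while assigning a name outside `_allowed` raises `AttributeError`.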
This is relevant for tools that do introspection, e.g. for auto-completion in IDEs.
The first constructor argument can now be:
- a string (path to a file)
- an int (a file descriptor)
- a file-like object
The argument virtual_io is not needed, because this can be detected automatically.
Closes #19
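The automatic detection described above could look roughly like the following sketch. The helper name `classify_file_argument` and the set of methods checked for a file-like object are assumptions made for illustration; the PR's actual code may decide differently.

```python
import io


def classify_file_argument(file):
    """Return 'path', 'fd' or 'file-like' (hypothetical helper)."""
    if isinstance(file, str):
        return 'path'
    if isinstance(file, int):
        return 'fd'
    # Duck typing: assume that read(), seek() and tell() are enough
    # to treat the object as file-like (an assumption of this sketch).
    if all(hasattr(file, name) for name in ('read', 'seek', 'tell')):
        return 'file-like'
    raise TypeError("Invalid file: %r" % (file,))


kinds = [classify_file_argument(arg)
         for arg in ('myfile.wav', 3, io.BytesIO())]
```

Because the type of the argument already tells the three cases apart, a separate flag such as virtual_io carries no extra information.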
|
Whoa, that's a lot of changes! That's almost a complete rewrite! Are you interested in becoming a committer? How can I contact you privately about this? |
I'd be honored. But as long as you're open for discussion it's also OK if everything goes through your hands.
my first name dot my last name at gmail dot com |
_FormatType, _SubtypeType and _EndianType are now combined into _Format. _ModeType was renamed to _Mode. Code for string input is now inside these classes. Format constants are now directly assigned in the module namespace, which should make clearer where they come from. The dictionaries _formats, _subtypes and _endians are not necessary anymore. Many assertions were replaced by exceptions.
... and change assertion to exception.
Closes #17
|
I was thinking more along these lines: This will behave just like a normal
Why? If sndfile were to introduce a new format, it would be quite easy to define.
Not really, as it does not really duplicate code anywhere. It is merely a string and a variable name that are the same, but which are used for different purposes. On the other hand, it is explicit, which is better than implicit, as per PEP20. |
|
I admit that my suggestion is less explicit. As a side note, I also kept the line length to 79 characters, as per PEP8, which is the reason why I defined the alias Anyway, let's assume PEP20 and DRY cancel each other out, the question of supporting undefined values still remains open and may decide our choice of implementation. Supporting future versions of libsndfile is just one aspect, probably not the most important one. Another one is the combination of type/subtype/endian. I think it's easier for many users to have separate arguments I would like to support both. When we decide to support combined format IDs (which I hope!), |
|
We could just check every combination of the three masks with the format. This would provide safety from invalid values. I guess that this safety would be desirable. I wonder if we are overengineering this though. Maybe straight-up integers would really be sufficient after all. In fact, I can see users being confused by our ints-that-print. Alternatively, maybe it would be preferable to use normal classes that contain the int and overload the What do you think about this? |
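For readers following the "ints-that-print" idea, here is a minimal sketch of an int subclass that shows a name instead of a number. The class name `NamedInt` is hypothetical and this is not the code from the PR; the numeric values are libsndfile's constants for the WAV format and the PCM_16 subtype.

```python
class NamedInt(int):
    """An int that prints as a name (hypothetical illustration)."""

    def __new__(cls, value, name):
        self = super(NamedInt, cls).__new__(cls, value)
        self.name = name
        return self

    def __repr__(self):
        return self.name

    __str__ = __repr__


WAV = NamedInt(0x010000, 'WAV')
PCM_16 = NamedInt(0x0002, 'PCM_16')

# Still a number, so it can be combined and passed to the C library ...
combined = WAV | PCM_16
# ... but the result of the combination is a plain int without a name.
combined_type = type(combined)
```

The last two lines show one possible source of the confusion mentioned above: the constants print as names, but any value computed from them falls back to a bare integer.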
|
Regarding DRY vs. explicit, I understand DRY as a reminder to not copy and paste code all over the place. This avoids problems when one copy of the code is changed, but not another. It also avoids problems with two almost identical but subtly different pieces of code. Note that in our case, both instances of the code in question would be very close to one another, with little consequence if one was changed but not the other and no danger of mistaking one for the other. Being explicit is a core quality of code in general. "explicit is better than implicit" is a guard against too much magic (at least that is how I understand it). It is meant to prevent things from having non-obvious meaning, such as declaring a global variable and thereby altering the behavior of some class. Again, it would be a rather benign case in our code and errors would merely lead to a wrong print somewhere. Still, code is above all written to be read, and only incidentally for the computer to execute. I would therefore value explicitness over DRY. However, both approaches obscure the meaning of our constants somewhat. The reader will wonder in what way that Also, while we're on the topic of slinging acronyms, I would like to invoke KISS, as an argument for plain old ints. YAGNI, also, as a guard against overengineering ;-) |
Where exactly would you check the combination with the masks? In this PR I suggest the following (in
        format = format | subtype | endian
        if not format:
            raise ValueError("No format specified")
        if not format & _TYPEMASK:
            raise ValueError("Invalid format")
        if not format & _SUBMASK:
            raise ValueError("Invalid subtype")
        if not format & _ENDMASK and endian != FILE:
            raise ValueError("Invalid endian-ness")
        if not format_check(format):
            raise ValueError(
                "Invalid combination of format, subtype and endian")
        self._info.format = format
We might be over-engineering it ... But naive users don't even need to know that the constants are actually I think plain But that's an important question we should agree on: If we type Users who know that those are actually numeric values, can easily get the numbers with When I came up with the
I know the "rule" to prefer composition over inheritance, but in this case, inheritance is a much better fit. A format constant is a I think overloading the Regarding DRY, I also consider repetition in close vicinity a violation of the rule. In case of the
I was hoping it would make the meaning of the constants clearer! What is more obscure:
I think it's reasonable to assume some knowledge of Python from a reader of the source code.
Yes, plain old
Specify formats as strings:
f = sf.open("myfile.wav", sf.WRITE, 44100, 2, 'PCM_24')
Error checking on positional arguments:
f = sf.open("myfile.wav", sf.WRITE, 44100, sf.PCM_24)
In this case no error would be raised and
That's not a serious suggestion, but we could follow KISS even further and not provide the named constants at all. I think we should try to find the right balance between convenience for the user and clean-ness of the implementation. |
|
I have been thinking a lot about this. On the one hand, we need numeric constants for libsndfile. On the other hand, we would like to have some human-readable representation of them. If numeric constants are hard to use, let's not use them. Let's use string constants. Their meaning is clear. They can be easily translated to ints using a dict in the
class Format:
    def __init__(self, format, subtype=None, endian=FILE):
        if subtype is None:
            # look up the default subtype for the given format
            subtype = _default_subtypes[format]
        self.type = format
        self.subtype = subtype
        self.endian = endian
I think this is a much cleaner solution, and it sidesteps most of our previous discussion. The Better yet, What do you think about this proposal? |
|
As for using If we were to use a class, we should at least output something like |
|
Anyway, if this discussion continues, we should probably split it off into its own issue, and merge the rest. |
|
Using string literals is definitely one of our options, I don't understand the proposed Shall it be exposed to the user? |
|
I propose that we use strings for all the constants. When we want to open a SoundFile, we provide a Thus, |
|
Also,
# part of class Format:
def snd_format(self):
    # translate the format string to libsndfile's major format number
    types = {
        'wav': 0x010000,
        'aiff': 0x020000,
        ...
        'rf64': 0x220000
    }
    if self.format.lower() not in types:
        raise ValueError('invalid type')
    format = _TYPEMASK & types[self.format.lower()]
    # add the subtype bits
    subtypes = {
        'pcm_s8': 0x0001,
        'pcm_16': 0x0002,
        ...
        'vorbis': 0x0060
    }
    if self.subtype.lower() not in subtypes:
        raise ValueError('invalid subtype')
    format |= _SUBMASK & subtypes[self.subtype.lower()]
    # add the endian-ness bits
    endians = {
        'file': 0x00000000,
        'little': 0x10000000,
        'big': 0x20000000,
        'cpu': 0x30000000
    }
    if self.endian.lower() not in endians:
        raise ValueError('invalid endian')
    format |= _ENDMASK & endians[self.endian.lower()]
    # let libsndfile decide whether the combination is valid
    if not format_check(format):
        raise ValueError('Invalid combination of format, subtype and endian')
    return format
|
I'm slowly getting used to the idea of using string literals, but I wouldn't expose the Having to instantiate the class is unnecessarily complicated. I would stay with the three separate arguments Here are some calls which I would like to support:
f = SoundFile('file.wav', 'w', 44100, 2)
f = SoundFile('file.wav', 'w', 44100, 2, 'PCM_24')
f = SoundFile('file.wav', 'w', 44100, 2, format='WAVEX', subtype='PCM_24')
f = SoundFile('file.wav', 'w', 44100, 2, format='WAVEX', subtype='PCM_24', endian='FILE')
f = SoundFile('file.wav', 'w', 44100, 2, format='WAVEX')
The implementation of this should be straightforward and would indeed sidestep many of the problems we discussed in this issue. When changing to string, another question comes up: should we use uppercase or lowercase as default representation? Another question: should we also kick out the numeric constants for the open mode? |
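One way the calls listed above could be resolved is sketched below. Everything here is an assumption for illustration: the helper name `resolve_format`, the parameter order (subtype first, inferred from the positional 'PCM_24' call), deriving the format from the file extension, and the contents of the default-subtype table.

```python
import os

# Hypothetical defaults; a real table would cover all formats.
_DEFAULT_SUBTYPES = {'WAV': 'PCM_16', 'WAVEX': 'PCM_16'}


def resolve_format(filename, subtype=None, endian=None, format=None):
    """Fill in missing format/subtype/endian strings (hypothetical)."""
    if format is None:
        # Assumption: fall back to the file extension without the dot.
        format = os.path.splitext(filename)[1][1:]
        if not format:
            raise ValueError("No format specified")
    format = format.upper()
    if subtype is None:
        try:
            subtype = _DEFAULT_SUBTYPES[format]
        except KeyError:
            raise ValueError("No default subtype for %r" % format)
    if endian is None:
        endian = 'FILE'
    return format, subtype.upper(), endian.upper()


results = [
    resolve_format('file.wav'),
    resolve_format('file.wav', 'PCM_24'),
    resolve_format('file.wav', format='WAVEX', subtype='PCM_24'),
    resolve_format('file.wav', format='WAVEX'),
]
```

With this argument order, the most common override (the subtype) can be given positionally, while format and endian are normally passed by keyword.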
|
My idea with Regarding capitalization, I would go for UPPER_CASE, because that is the way libsndfile named the constants. Should we provide variables for the strings as well? We probably should, if only to have a place for documenting them in the code. On the other hand, there will be a dictionary for translating strings to numbers anyway, so this is a bit of duplication there. As for open mode, I would choose |
Do you mean If yes, I think a dictionary should suffice. I created named variables for each string in my original proposal only for symmetry reasons. But with changing to format strings this is not necessary anymore.
I also think it's the easiest, Python's |
|
Alright, let's do it! |
|
The commits in this PR are quite confusing, so I started a new PR with part of the changes: #30. If that's OK, I'll continue with further PRs for the rest of the discussed changes. |
|
Agreed! |
This commit holds several changes which were discussed in #22 (which was itself based on #18).
Change handling of file formats: File formats are handled as three simple strings: format, subtype and endian.
Add "which" argument to seek(). This is needed because the combination with logical or (e.g. SEEK_SET | READ) isn't possible with string formats.
Update frames counter on write().
Hide all non-API names in module namespace (by prefixing _). This is relevant for tools that do introspection, e.g. for auto-completion in IDEs.
Support sf_open_fd(), remove obsolete argument virtual_io. The first constructor argument can now be:
- a string (path to a file)
- an int (a file descriptor)
- a file-like object
The argument virtual_io is not needed, because this can be detected automatically. Closes #19
Get file extension from 'name' attribute of a file-like object.
Change public attributes of SoundFile class to properties.
Add 'name' property.
Add properties 'format_info' and 'subtype_info'.
Proper handling of dtype's in read() and write().
Add function default_subtype().
Change argument name fObj to file. Return dict directly instead of creating a local variable. This was part of #22.
This pull request contains most of the changes of #18, but with more documentation and hopefully with a little less noise.
I also fixed 2 bugs regarding the virtual IO feature (963c1fa and 4e068f3, see also #19), I hope I didn't make it worse.
In addition to #18, I included the following features:
available_formats() and available_subtypes() (88ab0ef)