Implement Unit Tests using pytest. by bastibe · Pull Request #48 · bastibe/python-soundfile

bastibe · 2014-06-17T18:55:19Z

For discussion, see #41 .

This is not complete yet.

So far, this is not quite as big an improvement as I would have liked. In particular, while the parametric fixtures are quite nice, there is a lot of repetition in there that I haven't been able to get rid of.

At least the tests themselves are pretty clean though.

The write tests suffer from the fact that the files are opened write-only, and thus there is no way to check whether the data was actually written correctly. This can only be done in read-write mode, which is not implemented yet. The tests for read-write-mode would look rather similar to the write-only tests though, and I haven't figured out how to solve that repetition yet.

Finally, I have not figured out how to test both the functions and the methods at the same time yet.

- converted tests for read - converted tests for seek - converted tests for write

this simplifies the fixture setup some. A lot less repetition now.

bastibe · 2014-06-17T19:23:34Z

50fc1b8 improves the fixtures quite a bit.

I took some inspiration from other tests I have seen and used long function names instead of docstrings. I'm not sure I like it. But at least it makes the function names somewhat less redundant--before, they said kind of the same thing as the docstrings.

What do you think?

now symmetric with test_w

There are some peculiarities with how to set the file format when the file is in rw mode. Now, file format is optional for rw mode.

bastibe · 2014-06-18T09:02:19Z

The test rewrite is almost complete now. Turns out, rw mode is quite complicated. In rw mode, a file format may be given, but is not required. I had to make a few changes in order to make rw mode work correctly.

Also, rw mode does not seem to work with virtual-io. The libsndfile documentation on the topic is very vague.

Another interesting thing: Opening a file in rw mode sets the read pointer to 0 and the write pointer to end. This can get quite confusing.

bastibe · 2014-06-18T09:19:44Z

OK, the test rewrite is complete now. Please review the changed tests, then we can begin implementing new tests.

This should have been part of the last commit.

bastibe · 2014-06-18T09:24:30Z

The tests are slightly longer now, but they cover a lot more ground, too.

In particular, all functionality is tested for filename-opening, filehandle-opening and stream-opening. Total test runtime is about equal to before.

I think that the tests themselves look much nicer now, although the fixture setup code is quite hairy.

Test coverage is 83% at this point (according to pytest-cov)

mgeier · 2014-06-18T09:52:47Z

The tests look great, I like the py.test style much more than that of the unittest module!

I did some clean-up, but one function is still too long to fit into a line: test_read_into_out_over_end_with_fill_should_return_full_data_and_write_into_out.

bastibe · 2014-06-18T10:29:38Z

Thank you!

Yeah, that function is ridiculous. I don't think we should change it though, since the name contains a lot of important information.

Since the tests are holding up development in other areas, we should merge soon.

Any more comments? If you approve, just merge.

mgeier · 2014-06-18T12:00:14Z

I found one issue with Python 2.7, I would like to try to somewhat change the fixtures and I have a few more comments.
I can open separate issues for them, so they don't keep us from merging.

But there is one thing I would like to have changed before merging:
It's really not advisable that all values in the test WAV file are the same.
We might mess up the ordering (rows vs. columns etc.) but we would not be aware of that because all values are 0.5!
I think every sample in the test files should have a unique value.
We could also use this to test the seek()-related functionality instead of only relying on the return value of seek().
I gave an example (https://github.com/bastibe/PySoundFile/issues/41#issuecomment-44557032), but even there not all values are unique.
All values should be unique and accurately representable in both decimal and binary floating point.

Probably even 4 frames would be enough:
[[1.0, -1.0], [0.75, -0.75], [0.5, -0.5], [0.25, -0.25]]

Further, I strongly suggest to use a 32-bit float file instead of a 16-bit PCM file.
We should try to avoid all float/integer conversions, that's none of our business (and it unecessarily complicates things).

If you for some reason prefer a PCM file, we should use int16 or int32 in the tests (which would make the tests a bit more verbose).

bastibe · 2014-06-18T12:10:33Z

I found one issue with Python 2.7, I would like to try to somewhat change the fixtures and I have a few more comments.
I can open separate issues for them, so they don't keep us from merging.

I think a separate issue would be a good idea if it is nothing serious.

I think every sample in the test files should have a unique value.

That is a very good idea! I'll see to it tonight (unless you do it by then).

Further, I strongly suggest to use a 32-bit float file instead of a 16-bit PCM file.

Ideally, I'd prefer separate test files for 32-bit floats and 16-bit PCMs in both FLAC and WAV containers. At any rate, 16-bit PCM is the default format everywhere though, so this should be our main concern. 16-bit PCM is what most people will use most of the time.

mgeier · 2014-06-18T13:14:46Z

That is a very good idea! I'll see to it tonight (unless you do it by then).

Go ahead, I won't commit anything for now.

I think it's not really important which format we use for testing, neither is it important that we use more than one.
We have to trust that libsndfile does The Right Thing.
We should only be concerned about selecting formats, but I guess that's already part of the current tests.

I think that we shouldn't concern ourselves with issues regarding the conversion from floating point to two's complement integers.
So please use float32/float64 together with 32-bit float files and/or int16/int32 together with PCM files but don't mix the two.

bastibe · 2014-06-18T16:37:49Z

AAHH, big error. Don't look at that commit!

also added regression test for this

bastibe · 2014-06-18T16:47:59Z

OK, should be fixed now. Sorry for the push forcing. I hope it didn't cause any grief.

Anyway, RAW reading and writing should work correctly now, as should all the other kinds. Also, I added a test to make sure that RAW reading will continue to work in the future.

thus testing for data permutations

mgeier · 2014-06-18T17:08:29Z

Now I'm starting to understand the problem!

'rw' files can be either new (empty) files or already existing files.
In the former case the previous code didn't work ... I never tried this ... it didn't even occur to me!

I think it would be great if we could distinguish the two cases.

Because after your changes, if the file already exists, the specified sample rate, channels and format are just silently ignored!

BTW, the check with "RAW files must specify a subtype" seems strange, because before this case was handled automatically (in _format_int()).

bastibe · 2014-06-18T17:09:52Z

BTW, the check with "RAW files must specify a subtype" seems strange, because before this case was handled automatically (in _format_int()).

I know, but the error message was pretty unclear and didn't match the messages for a missing sample_rate or channels.

bastibe · 2014-06-18T17:11:01Z

I think it would be great if we could distinguish the two cases.

It would be great to be able to detect whether a file exists before we try to open it. This should be done in a new issue though.

bastibe · 2014-06-18T17:14:50Z

Anyway, I converted all the test files to contain meaningful data similar to your suggestion. I did not convert every read to integer though. Instead, I converted every np.all to a function allclose, which calls np.allclose(x, y, atol=2**-15).

I know this is not an ideal solution, but I chose this route as it was less likely to introduce errors. We can try converting everything to integer in a later issue.

Any comments?

mgeier · 2014-06-18T17:58:42Z

I think it would be better if you postpone the whole change to __init__().
Because now you're replacing some buggy code with some other buggy code.

I did not convert every read to integer though. Instead, I converted every np.all to a function allclose

But that's exactly what I hoped to avoid.
We can and should use exact equality.
If you use PCM data you shouldn't use 0.5 but 16384 or any other valid integer which is actually stored in the WAV file and than compare it exactly with this expected value.

It would be probably easier to use 32-bit float files, where you can store exactly 0.5 and compare it exactly to 0.5.

It probably wouldn't change anything in the reliability of the tests, but it would lead to simpler and more beautiful test code.

I think values like 0.5 look nicer in the code than 16384 therefore I think I'd prefer float for the majority of tests.
Also, as float64 is the default, we don't have to specify int16 all the time, again leading to more beautiful code.

OTOH, if you insist on using PCM_16, you should make a test file with small values, like [[1, -1], [2, -2], [3, -3], [4, -4]] or something similar, which would also look nice in the code and would be simpler to handle.

bastibe · 2014-06-18T18:16:37Z

All of that is nice, and we really should talk about all of this.

But the point of this PR was to create a working test bench. This goal has been achieved. The tests might not be perfect yet, but they test the right things. The tested code might not be perfect yet, but it provably does the right thing. That was the point of this.

Now let's merge, and then refactor stuff to our heart's content with the security of a working test bench. This has stalled development long enough.

mgeier · 2014-06-19T10:55:38Z

IMHO, there are two things which have to be done before merging:

Add proper WAV files for testing (I've written previously what I mean by that) and test with exact equality (none of that allclose() stuff).
Remove all changes to pysoundfile.py and all test code related to them. The 'rw' problem should be properly dealt with in a separate PR.

bastibe · 2014-06-19T16:20:57Z

Do the first thing, then. But make sure to also test floating point reading and writing, because they are the primary use case.

The second thing, no. Refactor the code if you like. I fully admit that it is not perfect. But is is now more correct than it was before AND we now have tests to make sure that it stays correct. Removing that would be wrong.

bastibe · 2014-06-19T16:50:22Z

As I said, we can discuss and change this later no problem. But right now, this issue is blocking other developments, and it is not functionally broken. So let's please merge, and see about a refactor in a separate issue.

mgeier · 2014-06-19T17:44:41Z

OK then, if you can't wait ... but please at least change the ValueError back to TypeError.
I don't want to make a separate pull request for that.

I will do my other suggested changes later as separate pull requests.

Sorry for the mess.

Implement Unit Tests using pytest.

began to implement pytest tests.

47df4c5

- converted tests for read - converted tests for seek - converted tests for write

bastibe changed the title ~~Convert tests from UnitTest to pytest. For discussion, see #41~~ Convert tests from UnitTest to pytest. Jun 17, 2014

bastibe changed the title ~~Convert tests from UnitTest to pytest.~~ Implement Unit Tests using pytest. Jun 17, 2014

bastibe mentioned this pull request Jun 17, 2014

Improve tests #41

Closed

refactored test fixtures

50fc1b8

this simplifies the fixture setup some. A lot less repetition now.

bastibe added 4 commits June 17, 2014 21:34

converted a few more tests

3fda6b2

renamed read test file to test_r

0a99d40

now symmetric with test_w

added preparation for rw mode tests

f8e29ba

added tests for rw mode and fixed related bugs

4970fc3

There are some peculiarities with how to set the file format when the file is in rw mode. Now, file format is optional for rw mode.

added mono tests

b5d7a0a

add another test file.

88bd737

This should have been part of the last commit.

mgeier added 3 commits June 18, 2014 11:45

Remove obsolete imports

38eaf40

Rename duplicate test functions

1db46c0

Some (mainly automatic) PEP 8 formatting changes

4816315

fix for reading RAW files

15eac9e

also added regression test for this

changed test files to contain unique data

4f32ae1

thus testing for data permutations

bastibe mentioned this pull request Jun 19, 2014

Implement a generator for optionally overlapping blocks #35

Merged

Revert ValueError to TypeError for missing args

0982d60

Sorry for the mess.

bastibe added a commit that referenced this pull request Jun 20, 2014

Merge pull request #48 from bastibe/tests

e1b6dda

Implement Unit Tests using pytest.

bastibe merged commit e1b6dda into master Jun 20, 2014

bastibe mentioned this pull request Jun 24, 2014

Re-factor tests #50

Merged

bastibe deleted the tests branch April 25, 2016 08:42

Conversation

bastibe commented Jun 17, 2014

Uh oh!

bastibe commented Jun 17, 2014

Uh oh!

bastibe commented Jun 18, 2014

Uh oh!

bastibe commented Jun 18, 2014

Uh oh!

bastibe commented Jun 18, 2014

Uh oh!

mgeier commented Jun 18, 2014

Uh oh!

bastibe commented Jun 18, 2014

Uh oh!

mgeier commented Jun 18, 2014

Uh oh!

bastibe commented Jun 18, 2014

Uh oh!

mgeier commented Jun 18, 2014

Uh oh!

bastibe commented Jun 18, 2014

Uh oh!

bastibe commented Jun 18, 2014

Uh oh!

mgeier commented Jun 18, 2014

Uh oh!

bastibe commented Jun 18, 2014

Uh oh!

bastibe commented Jun 18, 2014

Uh oh!

bastibe commented Jun 18, 2014

Uh oh!

mgeier commented Jun 18, 2014

Uh oh!

bastibe commented Jun 18, 2014

Uh oh!

mgeier commented Jun 19, 2014

Uh oh!

bastibe commented Jun 19, 2014

Uh oh!

bastibe commented Jun 19, 2014

Uh oh!

mgeier commented Jun 19, 2014

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants