
Fast Gaussian blur #961

Merged
wiredfool merged 59 commits into python-pillow:master from homm:fast-box-blur
Nov 27, 2014
Conversation

@homm
Member

@homm homm commented Oct 15, 2014

No description provided.

Member

Why the difference in structure between this one for(i=1...) and the next one for(i=0...)?

Member Author

Because imIn can't be used as both source and target in the first pass: it is the source image. But we can reuse imOut (this loop) and temp (next loop).

Member

Ah, got it. Might make sense to note that imOut is being used as scratch space.
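The scratch-buffer reuse being discussed can be sketched in pure Python for a single row. This is a hedged illustration of the pattern, not the PR's C code; `box_blur_1d` and `blur_3_passes` are hypothetical names:

```python
def box_blur_1d(src, dst, radius):
    # One box-blur pass: dst[i] is the mean of the (2*radius + 1)-wide
    # window of src around i; out-of-bounds reads clamp to the edge pixel.
    n = len(src)
    for i in range(n):
        window = [src[min(max(j, 0), n - 1)] for j in range(i - radius, i + radius + 1)]
        dst[i] = sum(window) / len(window)
    return dst

def blur_3_passes(line, radius):
    # The first pass must not overwrite its own input (line is the source
    # image), so it writes into a fresh buffer; after that, the two scratch
    # buffers can ping-pong between passes, which is why imOut and temp
    # are reusable in the loops discussed above.
    out = box_blur_1d(line, [0.0] * len(line), radius)
    temp = box_blur_1d(out, [0.0] * len(line), radius)
    return box_blur_1d(temp, out, radius)
```

An impulse far from the edges keeps its total mass through all three passes, since each pass only redistributes values within the window.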

@wiredfool
Member

Looking at the results for a synthetic image, 20 pixels wide, one white pixel in the middle:

Blur as a function of radius. Individual numbers are the pixel values; the last number is the total of all the pixel values. With perfect precision, it would be 255, since the area under a Gaussian curve is 1. Quantization being what it is, it's not exact, especially when running into the edge.

Radius: 0    0   0   0   0   0   0   0   0   0   0 255   0   0   0   0   0   0   0   0   0  255
Radius: 1    0   0   0   0   0   0   0   1  14  60 103  60  14   1   0   0   0   0   0   0  253
Radius: 2    0   0   0   0   0   2   8  18  32  44  49  44  32  18   8   2   0   0   0   0  257
Radius: 3    0   0   1   2   5   9  15  21  27  31  32  31  27  21  15   9   5   2   1   0  254
Radius: 4    1   2   4   6   9  12  16  19  22  23  24  23  22  19  16  12   9   6   4   2  251
Radius: 5    4   5   7   8  11  13  15  17  18  19  19  19  18  17  15  13  11   9   7   5  250
Radius: 6    6   7   8   9  11  12  13  14  15  16  16  16  15  14  13  12  11  10   9   8  235
Radius: 7    7   8   9  10  10  11  12  13  13  13  13  13  13  13  12  11  11  10   9   8  219
Radius: 8    8   9   9  10  10  11  11  11  12  12  12  12  12  12  11  11  11  10  10   9  213
Radius: 9    8   8   9   9   9  10  10  10  11  11  11  11  11  12  12  12  11  11  11  11  208
Radius:10   10  10  10  11  11  11  11  11  12  12  12  12  12  12  13  13  13  13  13  13  235

This looks reasonable to me with a couple caveats, which require further checking.

  • The r=1 distribution looks flattish, like the tails are fatter than the peak. It's probably a matter of my concept of radius not matching the calculated radius. It's certainly close to correct, not off by a factor of 2 or anything.
  • The reflection from the edge that's visible on the right of r=10 looks odd to me. The peak of the distribution has shifted significantly to the right and has gotten higher than in the r=8,9 cases.

Blur as a function of number of passes:
For radius=3, a varying number of passes through the same image.

R: 3 P: 1    0   0   0   0   0  16  25  25  25  25  25  25  25  25  25  16   0   0   0   0  257
R: 3 P: 2    0   0   0   1   6  11  16  21  25  30  34  30  25  21  16  11   6   1   0   0  254
R: 3 P: 3    0   0   1   2   5   9  15  21  27  31  32  31  27  21  15   9   5   2   1   0  254
R: 3 P: 4    0   0   1   2   5   9  15  21  27  31  33  31  27  21  15   9   5   2   1   0  255
R: 3 P: 5    0   0   1   2   5   9  14  21  27  31  33  31  27  21  14   9   5   2   1   0  253
R: 3 P: 6    0   1   1   3   5   9  15  21  28  32  34  32  28  21  15   9   5   3   1   1  264
R: 3 P: 7    0   0   1   2   5   9  14  21  27  32  34  32  27  21  14   9   5   2   1   0  256
R: 3 P: 8    0   0   1   2   5   8  14  20  27  32  34  32  27  20  14   8   5   2   1   0  252
R: 3 P: 9    0   0   1   2   5   9  14  21  27  32  34  32  27  21  14   9   5   2   1   0  256
R: 3 P:10    0   0   0   2   4   8  14  20  26  31  33  31  26  20  14   8   4   2   0   0  243
R: 3 P:11    0   0   0   2   4   9  14  21  28  32  34  32  28  21  14   9   4   2   0   0  254
R: 3 P:12    0   0   0   1   4   8  13  20  27  32  33  32  27  20  13   8   4   1   0   0  243
R: 3 P:13    0   0   0   2   4   8  14  20  27  32  34  32  27  20  14   8   4   2   0   0  248
R: 3 P:14    0   0   1   2   5   9  14  20  26  31  33  31  26  20  14   9   5   2   1   0  249
R: 3 P:15    0   0   1   3   5  10  15  21  27  32  34  32  27  21  15  10   5   3   1   0  262
R: 3 P:16    0   0   1   3   5  10  15  21  28  33  35  33  28  21  15  10   5   3   1   0  267
R: 3 P:17    0   0   1   3   5  10  15  21  27  32  34  32  27  21  15  10   5   3   1   0  262
R: 3 P:18    0   0   1   3   6  10  16  23  30  35  37  35  30  23  16  10   6   3   1   0  285
R: 3 P:19    0   0   0   0   2   6  12  20  27  33  35  33  27  20  12   6   2   0   0   0  235

I'd say 3 or 4 passes is closest here; any more than that and there's some significant numerical instability. I'm guessing that it's a matter of iterating over the operations and squashing down to an 8-bit value between each pass. I suspect that to actually converge, we'd have to be using much more precision in the intermediate steps.
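The suspected mechanism is easy to reproduce in a pure-Python sketch (`box_blur_round` is a hypothetical stand-in, not this PR's C code): when every pass rounds its output back to integers, the per-pass totals of an impulse row wander around 255 instead of staying fixed.

```python
def box_blur_round(line, radius):
    # One box-blur pass that, like an 8-bit pipeline, rounds every output
    # pixel to an integer before the next pass can see it.
    n = len(line)
    return [int(round(sum(line[min(max(j, 0), n - 1)]
                          for j in range(i - radius, i + radius + 1))
                      / (2 * radius + 1)))
            for i in range(n)]

line = [0] * 10 + [255] + [0] * 10
totals = []
for _ in range(19):
    line = box_blur_round(line, 3)
    totals.append(sum(line))
# totals drifts away from 255 because each pass re-quantizes the pixels:
# the very first pass already yields 7 pixels of round(255/7) = 36, i.e. 252
```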

Member Author

Converted to PPM, since PNG support may not be available.

@homm
Member Author

homm commented Oct 30, 2014

Looking at the results for a synthetic image, 20 pixels wide, one white pixel in the middle:

There is no "middle" pixel in a 20 px wide image :) That is why the result is not symmetric for large radii.

The r=1 distribution looks flatish, like the tails are fatter than the peak. It's probably a matter of my concept of radius not matching the calculated radius.

In fact this is not a radius; it is σ of a normal distribution. And the result is correct to within a tolerance of 1.

[screenshot: screen shot 2014-10-30 at 15 20 33]
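As a sanity check (my own sketch, not code from this PR): sampling a σ=1 normal pdf at integer pixel offsets and scaling by 255 lands within a few levels of the r=1 row above, the residual coming from the box-blur approximation and rounding.

```python
import math

def gauss_row(sigma, half_width):
    # How a unit impulse of value 255 would spread under an ideal Gaussian:
    # sample the normal pdf at integer pixel offsets and scale by 255.
    pdf = lambda x: math.exp(-x * x / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
    return [round(255 * pdf(k)) for k in range(-half_width, half_width + 1)]
```

For sigma=1 this gives 1, 14, 62, 102, 62, 14, 1 against the observed 1, 14, 61, 104, 61, 14, 1.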

The reflection from the edge that's visible on the right of r=10 looks odd to me. The peak of the distribution has shifted to significantly to the right and has gotten higher from the r=8,9 cases.

Again, this is because your image is not symmetric. This is the result for a 21 px wide image:

R  0:   0   0   0   0   0   0   0   0   0   0 255   0   0   0   0   0   0   0   0   0   0 
R  1:   0   0   0   0   0   0   0   1  14  61 104  61  14   1   0   0   0   0   0   0   0 
R  2:   0   0   0   0   0   2   8  18  32  44  49  44  32  18   8   2   0   0   0   0   0 
R  3:   0   0   1   2   5   9  15  21  27  31  32  31  27  21  15   9   5   2   1   0   0 
R  4:   1   2   4   6   9  12  16  19  22  23  24  23  22  19  16  12   9   6   4   2   1 
R  5:   4   5   7   8  11  13  15  17  18  19  19  19  18  17  15  13  11   8   7   5   4 
R  6:   6   7   8   9  11  12  13  14  15  16  16  16  15  14  13  12  11   9   8   7   6 
R  7:   7   8   9  10  10  11  12  13  13  13  13  13  13  13  12  11  10  10   9   8   7 
R  8:   8   9   9  10  10  11  11  11  12  12  12  12  12  11  11  11  10  10   9   9   8 
R  9:   8   8   9   9   9  10  10  10  10  10  11  10  10  10  10  10   9   9   9   8   8 
R 10:  10  10  10  10  11  11  11  11  11  11  11  11  11  11  11  11  11  10  10  10  10 

There is no shift at all. And this is what happens if you keep increasing the blur radius:

R  9:   8   8   9   9   9  10  10  10  11  11  11  11  11  12  12  12  11  11  11  11 
R 10:  10  10  10  11  11  11  11  11  12  12  12  12  12  12  13  13  13  13  13  13 
R 11:  12  12  12  12  12  12  12  12  12  12  12  12  12  12  12  12  12  12  12  12 
R 12:  11  11  11  11  11  11  11  11  11  11  11  11  11  11  11  11  11  11  11  11 
R 13:  10  10  10  10  10  10  10  10  10  10  10  10  10  10  10  10  10  10  10  10 
R 14:   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9 
R 15:   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9   9 

The shift goes away right at R=11. It appears because of the multiple passes and the fact that we assume pixels outside the image have the same color as the nearest edge pixel. So after the first pass with radius 10 we have something like this:

  0  13  13  13  13  13  13  13  13  13  13  13  13  13  13  13  13  13  13  13

On the next pass we assume that all pixels to the left are 0, while all pixels to the right are 13.
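That clamped-edge effect can be checked with a one-pass box-blur sketch (`box_blur_clamped` is a hypothetical name, not the PR's code): blurring the row above again pulls the left end down toward the zeros it sees beyond the edge, while the right end stays exactly at 13.

```python
def box_blur_clamped(line, radius):
    # One box-blur pass where out-of-bounds reads take the nearest edge
    # pixel's value, the boundary rule described above.
    n = len(line)
    return [sum(line[min(max(j, 0), n - 1)]
                for j in range(i - radius, i + radius + 1)) / (2 * radius + 1)
            for i in range(n)]

after_first_pass = [0] + [13] * 19
second = box_blur_clamped(after_first_pass, 10)
# second[0] is pulled well below 13 (it sees eleven clamped zeros);
# second[-1] is still exactly 13, since everything to its right clamps to 13
```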

For radius=3, a varying number of passes through the same image.

The number of passes is not even part of the public API. N=3 already gives a good Gaussian approximation. The pixel values we are seeing are rounded: when we get 12.49 it is rounded to 12, while 12.51 is rounded to 13. Of course, error accumulates on each pass; more passes = more error. But I want to note that the result is really good: there is only a very small difference for N between 3 and 19.

@wiredfool
Member

Ok, I don't know what I was seeing thinking that the distribution was fat-tailed. I think I may have been looking at a cdf or similar, rather than the pdf.

I should have been more clear about the iterations -- I'm not seeing that as an issue, just confirmation that 3 iterations is pretty near optimal, and that the behavior gets worse after that.

@wiredfool
Member

I finally got my Mac mini back up and running and managed to run the big-endian tests. There are definitely endianness issues.

Traceback (most recent call last):
  File "/home/erics/Pillow/Tests/test_imageops_usm.py", line 67, in test_usm_accuracy
    self.assertEqual(i.tobytes(), snakes.tobytes())
AssertionError: '\x03\xff\xff\xff\xff\x00\xff\x00...' != 
'\xff\xff\xff\x03\x00\xff\x00\xff...'

I think what might work is to iterate through the image32 line pointer as a (UINT8 *) and do each pixel individually, except for the alpha channel, which can just be copied (has_alpha or not, we can just copy it along).
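The failure mode in that traceback (a byte-reversed dword) can be illustrated with Python's struct module. This is a sketch of the concept only, not the PR's C code:

```python
import struct

# One RGBA pixel laid out in memory as successive R, G, B, A bytes.
r, g, b, a = 0x03, 0xFF, 0xFF, 0xFF
per_byte = bytes([r, g, b, a])   # byte-by-byte writes: layout-independent

# The same pixel written as a single 32-bit word. "<" and ">" force
# little- and big-endian layouts, standing in for the two kinds of host.
word = (a << 24) | (b << 16) | (g << 8) | r
little = struct.pack("<I", word)
big = struct.pack(">I", word)
# On a little-endian host the word write matches the per-byte layout;
# on a big-endian host the four bytes come out reversed, exactly the
# '\x03\xff\xff\xff' vs '\xff\xff\xff\x03' mismatch in the traceback.
```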

@wiredfool
Member

This works on bigendian machines: wiredfool@712fc5b . It's not cleaned up, and I haven't checked to see if it runs on little endian ones yet.

@homm
Member Author

homm commented Nov 6, 2014

Since assignment of a dword is faster (at least on x86), I think we should make a macro with two implementations.

@wiredfool
Member

Fair enough.

@wiredfool
Member

Given this: #977 (comment), what do you want to do with the endianness bug here?

@wiredfool
Member

Looking at the documentation portion of this, for now, notes on the ImageFilter calls should be enough. We should also document the ImageOps functions, and generally prefer them for any multichannel image, since the filters split the image, run each channel through the filter, and then recombine; that's extra work/memory over just running it on all channels at once.

I'd say that the filters for Gaussian and USM should probably be deprecated for new code in favor of the ImageOps versions once we get the return types worked out there.

@homm
Member Author

homm commented Nov 21, 2014

@wiredfool I've fixed endianness in both blur and unsharp. Please update the checklist.

@wiredfool
Member

Ok, will check.

@wiredfool
Member

Ok, works on PPC now.

wiredfool added a commit that referenced this pull request Nov 27, 2014
@wiredfool wiredfool merged commit 8a3302b into python-pillow:master Nov 27, 2014
@homm
Member Author

homm commented Nov 27, 2014

🍸
(I had looked for a champagne bottle emoji, but didn't find one)

@wiredfool
Member

Yes. It's been a long process. Good to get it in.
