What did you do?
Resized a file in Linux and saved as GIF. Resized same file in Windows and saved as GIF.
What did you expect to happen?
I should get identical files in Windows and in Linux.
What actually happened?
Files have different contents, sometimes different sizes. The differences are not visible to my eye, but I want reproducible and identical results when possible.
What are your OS, Python and Pillow versions?
- OS: Ubuntu 20.04.1 LTS (GNU/Linux 4.19.128-microsoft-standard x86_64) (WSL2); Windows 10 Pro 64-bit build 18363.1316
- Python: 3.8.5 in Linux; 3.9.1 in Windows
- Pillow: 8.2.0.dev0 -- my own builds from recent source in both Linux and Windows
Why does this happen?
I tracked the problem down to the use of qsort() in build_distance_tables() in Quant.c. qsort() is not guaranteed to be a stable sort, so the different implementations in Windows and Linux can order equal keys differently, leading to different quantization results. build_distance_tables() is called by quantize() and quantize2(), which are called by ImagingQuantize(), which is called by _quantize() in _imaging.c, which is called by quantize() and convert() in Image.py, and ultimately by resize(). (convert() is also called in a lot of other places; I don't know if others will end up using quantize().)
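To illustrate the stability point, here is a minimal Python sketch. It is not Pillow code: the sample data and the names `sort_ties_forward` and `sort_ties_backward` are made up for illustration. It models two sort routines that are both correct with respect to the key, yet place equal keys in a different order, which is the freedom different `qsort()` implementations have.

```python
# Hypothetical (distance, palette_index) pairs; several share a distance.
entries = [(5, 0), (3, 1), (5, 2), (3, 3)]


def sort_ties_forward(items):
    """Sort by distance only; equal distances keep their input order."""
    return sorted(items, key=lambda e: e[0])


def sort_ties_backward(items):
    """Sort by distance only; equal distances end up in reverse input order."""
    return sorted(reversed(items), key=lambda e: e[0])


a = sort_ties_forward(entries)
b = sort_ties_backward(entries)

# Both results are correctly ordered by distance...
assert [d for d, _ in a] == [d for d, _ in b]

# ...but entries with equal distance come out in a different order.
print(a)
print(b)
print(a == b)
```

Any later step that depends on the position of an entry in the sorted table can then produce different output from the same input.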
How can this be corrected?
Use a sort routine embedded in Quant.c instead of calling the standard library qsort(). Not difficult. See (edited: I created a branch on my fork) https://github.com/raygard/Pillow/blob/quant_sort/src/libImaging/Quant.c for my fix. I've been using this for several months. (The sort routine is very well tested.)
I'm new at this and don't yet know how to create a pull request. (But I'm going to try...)
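The fix linked above swaps `qsort()` for a sort routine built into Quant.c. A different, more general way to get the same ordering from any correct sort, shown here only as an illustration and not as what the linked branch does, is to make the comparison a total order by breaking ties on a unique field. The sketch below is plain Python with made-up data; `total_order_key` is a hypothetical helper, and shuffling the input stands in for implementations that handle ties differently.

```python
import random

# Hypothetical (distance, palette_index) pairs; the index is unique per entry.
entries = [(5, 0), (3, 1), (5, 2), (3, 3), (5, 4)]


def total_order_key(entry):
    """Compare by distance first, then by the unique index to break ties."""
    distance, index = entry
    return (distance, index)


expected = sorted(entries, key=total_order_key)

# With no two keys equal, the input order cannot influence the result.
rng = random.Random(0)
for _ in range(20):
    shuffled = entries[:]
    rng.shuffle(shuffled)
    assert sorted(shuffled, key=total_order_key) == expected

print(expected)
```

In C the equivalent idea would be a comparison function that never returns 0 for two distinct elements, so stability no longer matters.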
How to reproduce my results and test my fix
See the code below. Run this on some not-too-small image (static, not animation), in Linux and Windows, using the current release or development version of PIL. I used Pillow/Tests/images/old-style-jpeg-compression.png for this test. The program will resize and save as GIF, and display the CRC of the resulting file.
Then rebuild PIL with my Quant.c in Linux and Windows and try it again. In my case, I get these results:
Linux, before:
ray@radon:/mnt/e/rdg/wsl/pil$ python3 test_quant_sort.py Pillow/Tests/images/old-style-jpeg-compression.png
fn='Pillow/Tests/images/old-style-jpeg-compression.png' im.mode='RGB' im.size=(4160, 870)
crc='a8b873ca'
Windows, before:
15:48:33.98 ++e:\rdg\pil>py -3 test_quant_sort.py Pillow-master\Tests\images\old-style-jpeg-compression.png
fn='Pillow-master\\Tests\\images\\old-style-jpeg-compression.png' im.mode='RGB' im.size=(4160, 870)
crc='481f4453'
Linux, after:
ray@radon:/mnt/e/rdg/wsl/pil$ python3 test_quant_sort.py Pillow/Tests/images/old-style-jpeg-compression.png
fn='Pillow/Tests/images/old-style-jpeg-compression.png' im.mode='RGB' im.size=(4160, 870)
crc='9160fc08'
Windows, after:
15:49:38.65 ++e:\rdg\pil>py -3 test_quant_sort.py Pillow-master\Tests\images\old-style-jpeg-compression.png
fn='Pillow-master\\Tests\\images\\old-style-jpeg-compression.png' im.mode='RGB' im.size=(4160, 870)
crc='9160fc08'
test_quant_sort.py:

import sys, io, os, binascii
from PIL import Image


def main():
    fn = sys.argv[1]
    with Image.open(fn).convert('RGB') as im:
        print(f'{fn=} {im.mode=} {im.size=}')
        # Shrink to 20% of the original size in each dimension.
        factor = 0.2
        newsize = tuple(int(x*factor) for x in im.size)
        newim = im.resize(newsize, resample=Image.LANCZOS)
        # Save as GIF to memory and show the CRC-32 of the encoded bytes.
        b = io.BytesIO()
        newim.save(b, format='GIF')
        crc = '%08x' % (binascii.crc32(b.getvalue()) & 0xffffffff)
        print(f'{crc=}')
        # Also write the GIF to the current directory for inspection.
        newim.save(f'{os.path.splitext(os.path.basename(fn))[0]}_resized.gif')
        newim.close()


main()