Description
What did you do?
I was using the torchio library, which relies on humanize to return the memory size of image arrays. The images I processed had dimensions 512x512x166, with each pixel being a 32-bit (4-byte) integer.
What did you expect to happen?
torchio calls the naturalsize function with the occupied memory size in bytes and sets binary=True. Manually calculating the expected value:
512×512×166×4÷(1024×1024)=166 MiB
So, I expected the function to return approximately 166 MiB.
What actually happened?
Instead, the returned value was -2 MiB, accompanied by this warning:
RuntimeWarning: overflow encountered in scalar multiply
ret: str = format % ((base * bytes_ / unit)) + s
This overflow occurs because the input to the function was of type np.int32 instead of Python's native int. Since np.int32 has a maximum value of 2^31−1, the multiplication of base and bytes_ results in an overflow.
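The arithmetic behind the -2 MiB result can be reproduced with plain Python integers. This is a minimal sketch rather than humanize's code: it assumes `unit` is `base ** 3` for a value in the MiB range (which is what the reported -2.0 implies) and emulates 32-bit two's-complement wraparound by hand instead of using NumPy.

```python
# Sketch of the overflow arithmetic, using only built-in ints.
# Assumption: unit == base ** 3 for this input, inferred from the reported -2.0.
base = 1024
unit = base ** 3
bytes_ = 512 * 512 * 166 * 4  # 174_063_616, fits in a signed 32-bit integer

product = base * bytes_  # 178_241_142_784, far above 2**31 - 1

# Emulate what a signed 32-bit multiply keeps: the low 32 bits, reinterpreted as signed.
wrapped = (product + 2**31) % 2**32 - 2**31

print(wrapped)          # the wrapped product, -2**31
print(wrapped / unit)   # -2.0, matching the "-2.0 MiB" in the report
print(product / unit)   # 166.0, the value expected without overflow
```

The product is exactly 166 × 2^30, which leaves only the sign bit set once it is truncated to 32 bits, hence -2^31 and a final value of -2.0.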
Steps to Reproduce
import humanize
import numpy as np
print(humanize.naturalsize(512*512*166*4, binary=True))
# Expected: 166.0 MiB
# Works as expected with Python’s built-in int type
print(humanize.naturalsize(np.int32(512*512*166*4), binary=True))
# Returns: -2.0 MiB
# RuntimeWarning: overflow encountered in scalar multiply
#   ret: str = format % ((base * bytes_ / unit)) + s
Proposed Solutions
- Change the Order of Operations
Adjusting the order of operations can avoid overflow. In the current line:
ret: str = format % ((base * bytes_ / unit)) + s
When bytes_ is np.int32, multiplying base and bytes_ produces an np.int32 result, which overflows before it’s divided by unit. Dividing either base or bytes_ by unit before the multiplication makes every intermediate result a float, so no 32-bit integer multiplication takes place:
ret: str = format % ((base * (bytes_ / unit))) + s
# or
ret: str = format % ((base / unit * bytes_)) + s
- Convert Input to Float
Alternatively, cast value to float without checking its type. Currently, the casting applies only if value is a string:
# Current approach
if isinstance(value, str):
    bytes_ = float(value)
else:
    bytes_ = value
Updating it to cast all inputs to float could resolve the issue:
bytes_ = float(value)
I haven’t created a pull request since I’m unsure if you’d prefer developers to ensure input compatibility or handle this within the function.
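To compare the two proposals without depending on NumPy, here is a rough sketch that uses a hypothetical `FakeInt32` class as a stand-in for a 32-bit integer scalar. The class, the `wrap_int32` helper, and the `unit = base ** 3` value are all assumptions made for illustration. They only mimic the behaviour described above: integer multiplication wraps at 32 bits, while division or a float operand produces a float.

```python
def wrap_int32(n: int) -> int:
    """Reduce an integer to the signed 32-bit range (two's complement)."""
    return (n + 2**31) % 2**32 - 2**31


class FakeInt32:
    """Hypothetical stand-in for a 32-bit integer scalar (not NumPy's type)."""

    def __init__(self, value: int) -> None:
        self.value = wrap_int32(value)

    def __float__(self) -> float:
        return float(self.value)

    def __mul__(self, other):
        if isinstance(other, float):
            return self.value * other  # a float operand gives a float result
        return FakeInt32(self.value * other)  # an int operand stays 32-bit and may wrap

    __rmul__ = __mul__

    def __truediv__(self, other):
        return self.value / other  # true division always gives a float


base = 1024
unit = base ** 3  # assumed divisor for a value in the MiB range
bytes_ = FakeInt32(512 * 512 * 166 * 4)

current = base * bytes_ / unit                # multiply first: wraps, then divides
divide_bytes_first = base * (bytes_ / unit)   # proposal 1, first variant
divide_base_first = base / unit * bytes_      # proposal 1, second variant
cast_first = base * float(bytes_) / unit      # proposal 2: cast the input to float

for label, result in [
    ("current", current),
    ("divide bytes_ first", divide_bytes_first),
    ("divide base first", divide_base_first),
    ("cast to float first", cast_first),
]:
    print(f"{label}: {result:.1f} MiB")
```

In this model only the current ordering yields -2.0; the three alternatives all yield 166.0. Whether real NumPy scalars behave identically in every version is not verified here.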
Environment
OS: Windows 11
Python: 3.12.1
Humanize: 4.11.0
Numpy: 1.26.3