bp.jl_bitround
#29
|
ds=xr.tutorial.load_dataset("air_temperature")
v=list(ds.data_vars)[0]
i=23
(bp.xr_bitround(ds[v],i)-bp.jl_bitround(ds[v],i)).squeeze().isel(time=0).plot()
xr.testing.assert_equal(bp.xr_bitround(ds[v],i),ds[v]) # passes
xr.testing.assert_equal(bp.jl_bitround(ds[v],i),ds[v]) # fails
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
Input In [86], in <cell line: 3>()
1 xr.testing.assert_equal(bp.xr_bitround(ds[v],i),ds[v])
----> 3 xr.testing.assert_equal(bp.jl_bitround(ds[v],i),ds[v])
[... skipping hidden 1 frame]
File /work/mh0727/m300524/conda-envs/bitinfo/lib/python3.10/site-packages/xarray/testing.py:81, in assert_equal(a, b)
79 assert type(a) == type(b)
80 if isinstance(a, (Variable, DataArray)):
---> 81 assert a.equals(b), formatting.diff_array_repr(a, b, "equals")
82 elif isinstance(a, Dataset):
83 assert a.equals(b), formatting.diff_dataset_repr(a, b, "equals")
AssertionError: Left and right DataArray objects are not equal
Differing values:
L
array([[[241.20001, 242.5 , ..., 235.5 , 238.6 ],
[243.79999, 244.5 , ..., 235.29999, 239.29999],
...,
[295.90002, 296.2 , ..., 295.90002, 295.2 ],
[296.29004, 296.79004, ..., 296.79004, 296.60004]],
[[242.1 , 242.70001, ..., 233.6 , 235.79999],
[243.6 , 244.1 , ..., 232.5 , 235.70001],
...,
[296.2 , 296.7 , ..., 295.5 , 295.10004],
[296.29004, 297.2 , ..., 296.40002, 296.60004]],
...,
[[245.79001, 244.79001, ..., 243.98999, 244.79001],
[249.89001, 249.29001, ..., 242.48999, 244.29001],
...,
[296.29004, 297.19 , ..., 295.09003, 294.39 ],
[297.79004, 298.39 , ..., 295.49 , 295.19 ]],
[[245.09 , 244.29001, ..., 241.48999, 241.79001],
[249.89001, 249.29001, ..., 240.29001, 241.69 ],
...,
[296.09003, 296.89 , ..., 295.69 , 295.19 ],
[297.69 , 298.09003, ..., 296.19 , 295.69 ]]], dtype=float32)
R
array([[[241.2 , 242.5 , ..., 235.5 , 238.59999],
[243.79999, 244.5 , ..., 235.29999, 239.29999],
...,
[295.9 , 296.19998, ..., 295.9 , 295.19998],
[296.29 , 296.79 , ..., 296.79 , 296.6 ]],
[[242.09999, 242.7 , ..., 233.59999, 235.79999],
[243.59999, 244.09999, ..., 232.5 , 235.7 ],
...,
[296.19998, 296.69998, ..., 295.5 , 295.1 ],
[296.29 , 297.19998, ..., 296.4 , 296.6 ]],
...,
[[245.79 , 244.79 , ..., 243.98999, 244.79 ],
[249.89 , 249.29 , ..., 242.48999, 244.29 ],
...,
[296.29 , 297.19 , ..., 295.09 , 294.38998],
[297.79 , 298.38998, ..., 295.49 , 295.19 ]],
[[245.09 , 244.29 , ..., 241.48999, 241.79 ],
[249.89 , 249.29 , ..., 240.29 , 241.68999],
...,
[296.09 , 296.88998, ..., 295.69 , 295.19 ],
[297.69 , 298.09 , ..., 296.19 , 295.69 ]]], dtype=float32)
EDIT: I think the data with just one decimal is responsible. |
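One way to check whether a mismatch like the one above is only last-bit rounding is to count how many representable float32 steps (ULPs) separate the two arrays. This is a minimal sketch, assuming plain NumPy float32 arrays (for example taken from a DataArray via `.values`). The helper names `ordered_int` and `ulp_distance` are hypothetical and are not part of xarray or of the package under review.

```python
import numpy as np


def ordered_int(x):
    """Map float32 values to integers that sort in the same order as the floats."""
    # Reinterpret the float32 bit patterns as signed integers, widened to int64
    # so that the subtraction in ulp_distance cannot overflow.
    i = np.asarray(x, dtype=np.float32).view(np.int32).astype(np.int64)
    # Negative floats are stored as sign + magnitude; flip them so that
    # more negative floats map to more negative integers (-0.0 maps to 0).
    return np.where(i < 0, -(i & 0x7FFFFFFF), i)


def ulp_distance(a, b):
    """Number of representable float32 steps between a and b (finite values assumed)."""
    return np.abs(ordered_int(a) - ordered_int(b))


if __name__ == "__main__":
    # Hypothetical stand-ins for the original and the rounded data.
    original = np.array([241.2, 242.5, 238.6], dtype=np.float32)
    shifted = np.nextafter(original, np.float32(np.inf))  # one step upwards
    print(ulp_distance(original, shifted))
```

If the largest distance is 1, the two arrays differ only in the last mantissa bit, which is what rounding at the final bit position would produce.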
|
Can you link me to the Python code for
def encode(self, buf):
    if self.keepbits == 23:
        return buf
I can easily add that for BitInformation.jl too, because at the moment we have
julia> bitstring.(A,:split)
10-element Vector{String}:
"0 01111101 00100011110101011001100"
"1 01111101 11111100000100010010110"
"0 01111111 00100110111101011010101"
"0 01111100 00000101000001101001010"
"1 01111111 10100000111000010100000"
"0 01111100 01011110111011000110101"
"0 01111010 01100000001000101011100"
"1 01111110 00101101011101001110111"
"1 01111101 10000010000111001000111"
"1 01111101 11000011000111110011100"
julia> bitstring.(round(A,22),:split) # keepbits=22, all correct
10-element Vector{String}:
"0 01111101 00100011110101011001100" # no rounding
"1 01111101 11111100000100010010110" # no rounding
"0 01111111 00100110111101011010100" # round to zero=even (tie)
"0 01111100 00000101000001101001010" # no rounding
"1 01111111 10100000111000010100000" # no rounding
"0 01111100 01011110111011000110100" # round to zero=even (tie)
"0 01111010 01100000001000101011100" # no rounding
"1 01111110 00101101011101001111000" # round away from zero (with carry)
"1 01111101 10000010000111001001000" # round away from zero (with carry)
"1 01111101 11000011000111110011100" # no rounding
julia> bitstring.(round(A,23),:split) # keepbits=23
10-element Vector{String}:
"0 01111101 00100011110101011001100" # no rounding, correct
"1 01111101 11111100000100010010110" # no rounding, correct
"0 01111111 00100110111101011010110" # round away from zero, incorrect
"0 01111100 00000101000001101001010" # no rounding, correct
"1 01111111 10100000111000010100000" # no rounding, correct
"0 01111100 01011110111011000110110" # round away from zero, incorrect
"0 01111010 01100000001000101011100" # no rounding, correct
"1 01111110 00101101011101001111000" # round away from zero, incorrect
"1 01111101 10000010000111001001000" # round away from zero, incorrect
"1 01111101 11000011000111110011100" # no rounding, correct
This means that for the edge case of keepbits=23 some rounding away from zero is still possible, which obviously shouldn't happen. I'll see how best to deal with that in a patch release. The ball is in my court here, but that shouldn't stop us if you want to make BitInformation.round the default. |
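The behaviour discussed here (round to nearest with ties to even, and no change at all when keepbits=23) can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the code of BitInformation.jl or of the proposed numcodecs codec: the function name `bitround_float32` and the constant `MANTISSA_BITS` are hypothetical, and only float32 input is handled.

```python
import numpy as np

MANTISSA_BITS = 23  # explicit mantissa bits of a float32


def bitround_float32(values, keepbits):
    """Round float32 values to `keepbits` mantissa bits (nearest, ties to even)."""
    a = np.asarray(values, dtype=np.float32)
    if not 0 <= keepbits <= MANTISSA_BITS:
        raise ValueError("keepbits must be between 0 and 23 for float32")
    if keepbits == MANTISSA_BITS:
        # Edge case from the discussion: nothing is dropped, so return the input values unchanged.
        return a.copy()

    drop = MANTISSA_BITS - keepbits  # number of trailing mantissa bits to discard
    bits = a.view(np.uint32)         # reinterpret the bit patterns as integers

    # Adding (half - 1) plus the last kept bit rounds to nearest and sends
    # exact ties to the even neighbour; a carry may propagate into kept bits.
    half_minus_one = np.uint32((1 << (drop - 1)) - 1)
    last_kept_bit = (bits >> np.uint32(drop)) & np.uint32(1)
    mask = np.uint32((0xFFFFFFFF >> drop) << drop)  # zeroes the dropped bits

    rounded = (bits + half_minus_one + last_kept_bit) & mask
    return rounded.view(np.float32)


if __name__ == "__main__":
    x = np.array([0.1, 1.15, 241.2], dtype=np.float32)
    for kb in (23, 22, 7):
        print(kb, bitround_float32(x, kb))
```

Applied to the bit patterns listed above, this sketch reproduces the keepbits=22 results and leaves every value untouched for keepbits=23.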
|
This is awesome! I guess we handle keepbits==23 differently in the possible numcodecs implementation and in BitInformation. See here. The suggested numcodecs implementation returns the input as is, and BitInformation should do the same according to this test. It seems the issue needs to be fixed upstream. |
|
Thanks @milankl. The codec is indeed not yet merged and I just have that in my own numcodecs development branch to make it available. With zarr-developers/numcodecs#290 being merged, we could also think of having bitround as an external filter which gets automatically registered with numcodecs. I played a little bit with that here. |
|
This will be addressed by milankl/BitInformation.jl#37, which can be released as v0.5.1 later today. |
|
BitInformation.jl v0.5.1 is released. |

bp.jl_bitround
bp.xr_bitround
numcodecs.bitround
Closes #25
deals with #27