ggml : fix quants nans when all the group weights are very close to zero #7313
Conversation
JohannesGaessler
left a comment
There is no way the change in threshold has any significant effects on the results. Even a threshold of

I tried progressively higher values and found that some quants still fail with
0e1c4f6 to f59edee
While increasing

To clarify, I'm not sure how generalizable my results are to other models; I think the model for which the fix is needed should at least be checked as well, since that particular model seems to have some blocks with only very small values.

I have tried to find the lowest possible eps for the quants that require lower than

I don't really like this solution. I think the best way to handle this would be to check for zero before doing the division, but that would require deeper changes, the code is not very easy to follow, and I don't want to risk introducing bugs that may cause models with bad quants to be distributed.
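
For illustration only, the "check before dividing" alternative mentioned above could look roughly like the sketch below. `safe_div` is a hypothetical helper, not a function in ggml, and it goes slightly further than a plain zero check by also rejecting quotients that overflow to infinity or become `nan`.

```c
#include <math.h>

// Hypothetical helper (not part of ggml): divide num by den, but return
// `fallback` when the divisor is zero or the quotient is not finite.
static float safe_div(float num, float den, float fallback) {
    if (den == 0.0f) {
        return fallback;            // exact zero divisor: skip the division
    }
    const float q = num / den;      // may still overflow for a tiny divisor
    return isfinite(q) ? q : fallback;
}
```

A guard like this would have to be applied at every division site in the quantization code, which is the kind of deeper change the comment is wary of.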
6b41894 to 61e8a0a
61e8a0a to f07e570
When the group abs max value is very close to zero but not zero, it may still result in a division by zero when computing the scale, which ends with a `nan` scale. To avoid this, we check the max value against an epsilon instead of zero. With the IQ quants, this could also result in an `Oops: found point %u not on grid` error.

While doing this, I noticed that there was already a similar check with a `1e-30` epsilon in `make_qx_quants`; however, values this small can still result in `nan`, so I bumped it to `1e-20` and extended it to all the cases that I could find. I used the commented code in `test-backend-ops` to find these cases. It is possible that an even higher epsilon may be necessary.

I don't expect this to result in lower precision in the quants since the epsilon is so small, but it may be worth checking.
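
As a rough illustration of the failure mode and the fix described above, here is a minimal sketch of a toy symmetric 8-bit quantizer. It is not the actual ggml code: the names `group_abs_max` and `group_inv_scale`, the factor `127`, and the `eps` parameter are assumptions made for this example. Only the idea of comparing the group abs max against an epsilon, and the `1e-20` value, come from the description.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

// Largest absolute value in a group of n weights.
static float group_abs_max(const float *x, size_t n) {
    float amax = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        const float ax = fabsf(x[i]);
        if (ax > amax) {
            amax = ax;
        }
    }
    return amax;
}

// Inverse scale for a toy symmetric 8-bit quantizer (illustrative only).
// Groups whose abs max is <= eps are treated as all-zero and get an
// inverse scale of 0. Passing eps = 0 reproduces a plain "is it exactly
// zero?" check.
static float group_inv_scale(const float *x, size_t n, float eps) {
    assert(n > 0);
    const float amax = group_abs_max(x, n);
    if (amax <= eps) {
        return 0.0f;
    }
    // For a tiny but nonzero amax this quotient overflows to +inf, and
    // multiplying a zero weight by +inf then yields nan.
    return 127.0f / amax;
}
```

With a group such as `{0.0f, 1e-37f, -1e-37f, 0.0f}`, calling `group_inv_scale(x, 4, 0.0f)` returns an infinite inverse scale, so `x[0] * iscale` is `nan`. Calling it with `1e-20f` as the epsilon treats the group as all zeros and avoids the problem.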
Fixes #7311.