Fix covmat reg tests#779
Conversation
|
I guess the true test is whether the CI passes because I can't easily test this locally |
There was a problem hiding this comment.
It works for me. I think we should merge this soonish (go Travis!)
just out of curiosity tried it in my computer to see whether it happens for me and it seems to be a numerical precision thing:
>>> a = np.arange(9, dtype=np.float32).reshape(3,3) ; eigvalsh(a@a.T)[0]
3.1273037e-06
>>> a = np.arange(9, dtype=np.float64).reshape(3,3) ; eigvalsh(a@a.T)[0]
-3.771475733269901e-15
I wonder what made it start failing though.
| if 1 / e_val[0] <= sqr_threshold: # eigh gives eigenvals in ascending order | ||
| # if eigenvalues are close to zero, can be negative which messes this up | ||
| # take abs here to be safe | ||
| if abs(1 / e_val[0]) <= sqr_threshold: # eigh gives eigenvals in ascending order |
There was a problem hiding this comment.
Maybe rather than an absolute value a + epsilon where epsilon=1e-12 or something would be better? That way if it is not 0 but actually negative it will "correctly" fail.
There was a problem hiding this comment.
hmm I'm not sure what the best behaviour is if the method is supplied with a covariance matrix which is actually not positive semidefinite.
So with your suggestion if min eigenval was like -10 then (1/-10 + epsilon) < sqr_threshold
and it just wouldn't do anything to the covmat but also wouldn't raise any errors
I think that this method assumes that the covmat is semi-positive definite and it would be better to raise an error if e_val[0] + epsilon < 0 but I'm not sure exactly what we should set that threshold to be
There was a problem hiding this comment.
Well, float64 is 16 digits and it seems that the problem is around that bound so 1e-14 seems fair. If the precision problems are greater than that we might want to start thinking about using a different library.
But yeah, I also don't know what the best behavior would be. Personally I like failures when things are not what they are supposed to be, isn't there a @check_covmat validphys decorator anywhere?
There was a problem hiding this comment.
I guess if we did a bit of both then we at least know when it fails
PRECISION_THREHSOLD = 1e-12
...
if e_val[0] < PRECISION_THREHSOLD:
raise SomeError("print what the value was so we can see whether its numerical or a non PSD covmat")
if 1 / (e_val[0] + PRECISION_THREHSOLD) <= sqr_threshold:
...…l if smallest eigenvalue is more negative than -tolerance
|
thoughts on that? I quite like it as is because it fails loudly if the covmat is not positive definite within some tolerance we set. Seeing as it relates to a project which @Zaharid was leading I wouldn't mind getting his opinion on it before merge |
|
I'm happy with it. I agree, we can talk during the CM today. |
|
I'd say that rather than picking arbitrary thresholds, we should treat negative eigenvalues returned by Note that it doesn't check if the matrix is symmetric and simply assumes it is, so under that assumption the above is exactly true. Then for the purposes of regularization, zero or close to zero (as in less than the threshold) doesn't make any difference, which is kind of the point. |
|
I guess a shorter way of saying the same thing is that we always want to regularize on the condition |
|
ok in practice we can change the if to be if e_val[0] > 1/sqr_threshold:
return ...right? Only positive numbers which are big enough cause matrix to not be regularized. Then raise a warning if e_val[0] < 0? |
|
Looks OK to me, but I was never very good with inequalities. Indeed we want an inequality that is not equivalent to the previous one! |
|
Should we merge this? |
|
I think that would be good |
I actually kept the old test and added another because I think that the procedure before wasn't robust and the test actually helped find this issues. To (hopefully) fix that I added in an abs() around the correlation matrix eigenvalue in case it is close to zero but negative
Then I added another test which checks with a slightly more sane matrix. Also using numpy.testing.assert_allclose whereever possible in that test module for better errors
closes #777