Fix covmat reg tests by wilsonmr · Pull Request #779 · NNPDF/nnpdf

wilsonmr · 2020-05-13T10:14:31Z

I actually kept the old test and added another because I think that the procedure before wasn't robust and the test actually helped find this issues. To (hopefully) fix that I added in an abs() around the correlation matrix eigenvalue in case it is close to zero but negative

Then I added another test which checks with a slightly more sane matrix. Also using numpy.testing.assert_allclose whereever possible in that test module for better errors

closes #777

wilsonmr · 2020-05-13T10:15:29Z

I guess the true test is whether the CI passes because I can't easily test this locally

scarlehoff

It works for me. I think we should merge this soonish (go Travis!)

just out of curiosity tried it in my computer to see whether it happens for me and it seems to be a numerical precision thing:

>>> a = np.arange(9, dtype=np.float32).reshape(3,3) ; eigvalsh(a@a.T)[0]
3.1273037e-06
>>> a = np.arange(9, dtype=np.float64).reshape(3,3) ; eigvalsh(a@a.T)[0]
-3.771475733269901e-15

I wonder what made it start failing though.

scarlehoff · 2020-05-13T10:34:43Z

-    if 1 / e_val[0] <= sqr_threshold: # eigh gives eigenvals in ascending order
+    # if eigenvalues are close to zero, can be negative which messes this up
+    # take abs here to be safe
+    if abs(1 / e_val[0]) <= sqr_threshold: # eigh gives eigenvals in ascending order


Maybe rather than an absolute value a + epsilon where epsilon=1e-12 or something would be better? That way if it is not 0 but actually negative it will "correctly" fail.

hmm I'm not sure what the best behaviour is if the method is supplied with a covariance matrix which is actually not positive semidefinite.

So with your suggestion if min eigenval was like -10 then (1/-10 + epsilon) < sqr_threshold
and it just wouldn't do anything to the covmat but also wouldn't raise any errors

I think that this method assumes that the covmat is semi-positive definite and it would be better to raise an error if e_val[0] + epsilon < 0 but I'm not sure exactly what we should set that threshold to be

Well, float64 is 16 digits and it seems that the problem is around that bound so 1e-14 seems fair. If the precision problems are greater than that we might want to start thinking about using a different library.

But yeah, I also don't know what the best behavior would be. Personally I like failures when things are not what they are supposed to be, isn't there a @check_covmat validphys decorator anywhere?

I guess if we did a bit of both then we at least know when it fails

PRECISION_THREHSOLD = 1e-12 ... if e_val[0] < PRECISION_THREHSOLD: raise SomeError("print what the value was so we can see whether its numerical or a non PSD covmat") if 1 / (e_val[0] + PRECISION_THREHSOLD) <= sqr_threshold: ...

…l if smallest eigenvalue is more negative than -tolerance

wilsonmr · 2020-05-13T12:32:18Z

thoughts on that? I quite like it as is because it fails loudly if the covmat is not positive definite within some tolerance we set.

Seeing as it relates to a project which @Zaharid was leading I wouldn't mind getting his opinion on it before merge

scarlehoff · 2020-05-13T12:39:30Z

I'm happy with it. I agree, we can talk during the CM today.

Zaharid · 2020-05-14T11:00:27Z

I'd say that rather than picking arbitrary thresholds, we should treat negative eigenvalues returned by eigh as zero, possibly with a big warning. These are "truly" either really zero or very small meaning that that matrix is not positive definite but only semipositive (which indeed is a case we should handle), but the eigenvalues can never actually be negative.

Note that it doesn't check if the matrix is symmetric and simply assumes it is, so under that assumption the above is exactly true.

Then for the purposes of regularization, zero or close to zero (as in less than the threshold) doesn't make any difference, which is kind of the point.

Zaharid · 2020-05-14T11:13:33Z

I guess a shorter way of saying the same thing is that we always want to regularize on the condition e_val[0] <= 0 so we could just add that.

wilsonmr · 2020-05-14T11:14:19Z

ok in practice we can change the if to be

if e_val[0] > 1/sqr_threshold:
    return ...

right? Only positive numbers which are big enough cause matrix to not be regularized. Then raise a warning if e_val[0] < 0?

Zaharid · 2020-05-14T11:18:20Z

Looks OK to me, but I was never very good with inequalities. Indeed we want an inequality that is not equivalent to the previous one!

scarlehoff · 2020-05-15T08:12:19Z

Should we merge this?

wilsonmr · 2020-05-18T10:27:32Z

I think that would be good

expand covmat reg tests and make procedure more robust

c9dc4b1

wilsonmr requested review from Zaharid and scarlehoff May 13, 2020 10:14

scarlehoff approved these changes May 13, 2020

View reviewed changes

changed handling of zero eigenvalues to use epsilon parameter and fai…

b8f04a1

…l if smallest eigenvalue is more negative than -tolerance

change algorithm to still work with negative eigs but give warning

d3ff0f7

voisey mentioned this pull request May 18, 2020

adds resource list script #764

Merged

Zaharid approved these changes May 18, 2020

View reviewed changes

wilsonmr merged commit 7694bb5 into master May 18, 2020

wilsonmr deleted the fix_covmat_test branch May 18, 2020 10:49

wilsonmr mentioned this pull request Sep 10, 2020

New version of conda-build throws error if channel URL path is conda #915

Closed

Conversation

wilsonmr commented May 13, 2020

Uh oh!

wilsonmr commented May 13, 2020

Uh oh!

scarlehoff left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

scarlehoff May 13, 2020

Choose a reason for hiding this comment

Uh oh!

wilsonmr May 13, 2020

Choose a reason for hiding this comment

Uh oh!

scarlehoff May 13, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wilsonmr May 13, 2020

Choose a reason for hiding this comment

Uh oh!

wilsonmr commented May 13, 2020

Uh oh!

scarlehoff commented May 13, 2020

Uh oh!

Zaharid commented May 14, 2020

Uh oh!

Zaharid commented May 14, 2020

Uh oh!

wilsonmr commented May 14, 2020

Uh oh!

Zaharid commented May 14, 2020

Uh oh!

scarlehoff commented May 15, 2020

Uh oh!

wilsonmr commented May 18, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

scarlehoff left a comment •

edited

Loading

scarlehoff May 13, 2020 •

edited

Loading