Add analytic single-sample test for ELBO gradient estimators #2894
elbo = Elbo(
    max_plate_nesting=1,  # set this to ensure rng agrees across runs
    num_particles=1,
From this test alone it is unclear whether there is a bug in stochastic gradient estimation; rather it is clear only that Trace_ELBO disagrees with a particular hand-coded estimator (at a single sample). That is, it still seems possible that Trace_ELBO might agree with your hand-coded estimator in expectation, but disagree at each sample. To confirm this is a true inference bug, we could also try setting num_particles=100000, vectorize_particles=True and see whether the expected gradients coming out of Trace_ELBO and TraceEnum_ELBO agree. WDYT?
You are right! The missing term is a score function whose expectation is zero, so Trace_ELBO cleverly drops it. The expected gradients of Trace_ELBO and TraceEnum_ELBO therefore agree (as tested in test_subsample_gradient).
I fixed the hand-coded formula for Trace_ELBO so the test should pass now. @fritzo do you want to add the test to the repo? If not, I'll close the PR.
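The point about agreeing in expectation while disagreeing at each sample can be checked on a toy problem. The sketch below is not Pyro code and not the test from this PR. It assumes a hypothetical setup with model p(z) = Normal(0, 1) and a non-reparameterized guide q(z) = Normal(loc, scale), and the helper names (log_normal, single_sample_grads, mean_grads) are made up for illustration. It compares a score function estimator that keeps the derivative of -log q with one that drops it. For this toy problem the exact gradient of the ELBO with respect to loc is -loc.

```python
import math
import random


def log_normal(z, loc, scale):
    """Log density of Normal(loc, scale) evaluated at z."""
    return (
        -0.5 * ((z - loc) / scale) ** 2
        - math.log(scale)
        - 0.5 * math.log(2 * math.pi)
    )


def single_sample_grads(z, loc, scale):
    """Two single-sample score function estimators of d ELBO / d loc.

    Toy setup (an assumption for illustration): model p(z) = Normal(0, 1),
    guide q(z) = Normal(loc, scale), sampled without reparameterization.
    """
    elbo = log_normal(z, 0.0, 1.0) - log_normal(z, loc, scale)
    score = (z - loc) / scale ** 2  # d log q(z) / d loc at fixed z
    full = elbo * score - score     # keeps the d(-log q) / d loc term
    reduced = elbo * score          # drops that zero-mean term
    return full, reduced


def mean_grads(loc=0.5, scale=1.0, num_samples=200_000, seed=0):
    """Average both estimators over many samples drawn from the guide."""
    rng = random.Random(seed)
    total_full = 0.0
    total_reduced = 0.0
    for _ in range(num_samples):
        z = rng.gauss(loc, scale)
        full, reduced = single_sample_grads(z, loc, scale)
        total_full += full
        total_reduced += reduced
    return total_full / num_samples, total_reduced / num_samples


if __name__ == "__main__":
    # At a single sample the two estimators differ by the score term.
    print("single sample:", single_sample_grads(z=1.3, loc=0.5, scale=1.0))
    # Averaged over many samples, both approach the exact gradient -loc.
    print("averaged:", mean_grads())
```

At any single z the two estimators differ by exactly the score term, yet their averages both converge to -loc, which is why a single-sample comparison against a hand-coded estimator cannot by itself establish an inference bug.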
Trace_ELBO is missing a term
This test compares analytically derived pathwise and score function gradient estimators for a single particle to gradients obtained by Trace_ELBO and TraceEnum_ELBO. When reparameterization is disabled, gradients obtained by Trace_ELBO don't match the analytically derived score function gradients.
By comparing gradient values, it seems like grads obtained by Trace_ELBO are missing a second term (delbo_dscale and delbo_dloc) in the derivative of a surrogate loss:
On the other hand, TraceEnum_ELBO passes all tests.