Optimization Loss isn't Outlier Robust

I've been improving RobustDiff for #165, but as I've done so I've found that the optimization results are actually getting *worse*. The degradation starts when I decouple `qr_ratio` into `log_q` and `log_r`, because the scale of the two matters relative to `huberM`. Suddenly the method has more expressive power and can find lower-cost solutions, but with worse RMSE and $R^2$ against the true underlying derivative. I've realized the optimization loss $L(\Phi)$ isn't itself robust, so the hyperparameter optimization is favoring solutions that bend in the direction of outliers.

The solution? Pop a Huber loss in the RMSE evaluation, to get something we might call "Root Mean Huber Error" or "Robust Root Mean Error". This gives us another parameter to pick: the point where that Huber switches from quadratic to linear. If we normalize the inputs by their standard deviation, then we can choose the parameter as some number of sigma, like 2 or 3, which would then count ~95.45% or ~99.73% of inliers as inliers, assuming a Gaussian distribution. This will affect the scale of the first term in the loss function, so we'll have to account for this against the total variation smoothing term.
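To make the idea concrete, here is a minimal sketch of what a "Root Mean Huber Error" could look like. Everything named here (`huber`, `root_mean_huber_error`, the arguments `M` and `sigma`) is hypothetical and not part of the existing code. It assumes the textbook Huber form and leaves the choice of scale `sigma` to the caller. The factor of 2 is one possible convention, chosen so the result matches RMSE / sigma whenever every normalized residual falls in the quadratic region.

```python
import numpy as np

def huber(r, M):
    """Textbook Huber function: 0.5*r^2 for |r| <= M, M*(|r| - 0.5*M) beyond."""
    a = np.abs(r)
    return np.where(a <= M, 0.5 * a**2, M * (a - 0.5 * M))

def root_mean_huber_error(y, y_hat, M=2.0, sigma=1.0):
    """Hypothetical robust stand-in for RMSE.

    Residuals are divided by `sigma` (a scale supplied by the caller, an
    assumption of this sketch) so that M can be read as "number of sigma".
    The result is in units of sigma, not in the units of y.
    """
    r = (np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)) / sigma
    return float(np.sqrt(2.0 * np.mean(huber(r, M))))

# One large outlier dominates RMSE but contributes only linearly here.
y_hat = np.zeros(4)
y = np.array([0.1, -0.2, 0.3, 10.0])
rmse = float(np.sqrt(np.mean((y - y_hat) ** 2)))
rmhe = root_mean_huber_error(y, y_hat, M=2.0, sigma=1.0)
print(f"RMSE = {rmse:.3f}, RMHE = {rmhe:.3f}")
```

Because the output is in units of sigma and grows only linearly in the tails, its magnitude differs from plain RMSE, which is why the weight on the total variation term would need to be revisited.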

The TV term is potentially problematic itself too, because if we are allowed to independently drive down the Huber parameter of RobustDiff's process term, optimization might favor approximating the 1-norm to make the solution artificially sparse. I've run into this in other cases too, and disallow order 1 for TVR for most datasets, because it can "hack" the loss function.
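The 1-norm limit can be checked numerically. The sketch below again assumes the textbook Huber form, which may differ by a constant factor from the one RobustDiff uses, and `l1_gap` is a hypothetical helper. It measures how far `huber(x, M) / M` is from `|x|`; for this form the gap is never more than `M / 2`, so shrinking the Huber parameter makes the penalty behave like a scaled 1-norm.

```python
import numpy as np

def huber(r, M):
    """Textbook Huber function: 0.5*r^2 for |r| <= M, M*(|r| - 0.5*M) beyond."""
    a = np.abs(r)
    return np.where(a <= M, 0.5 * a**2, M * (a - 0.5 * M))

def l1_gap(x, M):
    """Largest difference between huber(x, M) / M and |x| over the samples in x."""
    return float(np.max(np.abs(huber(x, M) / M - np.abs(x))))

x = np.linspace(-3.0, 3.0, 601)
for M in (1.0, 0.1, 0.001):
    # The gap shrinks in proportion to M, i.e. the penalty approaches M * |x|.
    print(f"M = {M:g}: max |huber/M - |x|| = {l1_gap(x, M):.6f}")
```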


Optimization Loss isn't Outlier Robust #167
