Optimization Loss isn't Outlier Robust #167

@pavelkomarov


I've been improving RobustDiff for #165, but as I've done so I've found that the optimization results are actually getting worse. The degradation starts when I decouple `qr_ratio` into `log_q` and `log_r`, because the scale of the two matters relative to `huberM`. Suddenly the method has more expressive power and can find lower-cost solutions, but those solutions have worse RMSE and $R^2$ against the true underlying derivative. I've realized that $L(\Phi)$ isn't itself robust, so the hyperparameter optimization favors solutions that bend toward outliers.

The solution? Pop a Huber loss into the RMSE evaluation, to get something we might call "Root Mean Huber Error" or "Robust Root Mean Error". That gives us another parameter to pick: the point where the Huber loss switches from quadratic to linear. If we normalize the inputs by their standard deviation, we can choose that parameter as some number of sigma, like 2 or 3, which would then count ~95% or ~99.73% of inliers as inliers, assuming a Gaussian distribution. This changes the scale of the first term in the loss function, so we'll have to rebalance it against the total variation smoothing term.
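A minimal sketch of what such a metric could look like, assuming NumPy. The names `huber` and `robust_rme` are hypothetical and not part of any existing API, and normalizing the residuals by their own standard deviation is just one reading of "normalize the inputs". The factor of 2 is my own choice so that the metric reduces to RMSE divided by sigma when the threshold `M` is very large.

```python
import numpy as np

def huber(r, M):
    """Huber loss: quadratic for |r| <= M, linear beyond that."""
    a = np.abs(r)
    return np.where(a <= M, 0.5 * r**2, M * (a - 0.5 * M))

def robust_rme(estimate, measured, M=2.0):
    """Hypothetical 'Robust Root Mean Error'.

    Residuals are divided by their standard deviation (an assumption),
    so M is expressed in units of sigma, e.g. 2 or 3.
    """
    r = np.asarray(estimate, dtype=float) - np.asarray(measured, dtype=float)
    sigma = np.std(r)
    if sigma == 0:
        return 0.0  # all residuals identical; nothing to scale by
    # Factor of 2 undoes the 1/2 in the quadratic branch of the Huber loss.
    return float(np.sqrt(2.0 * np.mean(huber(r / sigma, M))))
```

With a very large `M` every residual stays in the quadratic branch, so the result matches the plain normalized RMSE. With `M=2`, a residual many sigma away only contributes linearly.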

The TV term is potentially problematic too: if the optimizer is allowed to independently drive down the Huber parameter of RobustDiff's process term, it might favor approximating the 1-norm to make the solution artificially sparse. I've run into this in other cases too, and I disallow order 1 for TVR on most datasets, because it can "hack" the loss function.
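To illustrate the concern, here is a small numerical sketch, assuming NumPy, of how a Huber penalty with a small threshold behaves like a scaled 1-norm. The helper names `huber` and `l1_gap` and the sample vector `d` are made up for illustration and are not RobustDiff's actual process term.

```python
import numpy as np

def huber(r, M):
    """Huber loss: quadratic for |r| <= M, linear beyond that."""
    a = np.abs(r)
    return np.where(a <= M, 0.5 * r**2, M * (a - 0.5 * M))

def l1_gap(d, M):
    """Difference between the 1-norm of d and the Huber penalty rescaled by 1/M."""
    d = np.asarray(d, dtype=float)
    return float(np.abs(d).sum() - huber(d, M).sum() / M)

d = np.array([0.0, 0.5, -2.0, 3.0])  # illustrative increments only
for M in (10.0, 1.0, 0.01):
    print(f"M={M}: gap to 1-norm = {l1_gap(d, M):.4f}")
```

As `M` shrinks, every nonzero entry lands in the linear branch and the gap goes to zero, so an optimizer that is free to lower `M` is effectively free to pick a sparsity-promoting 1-norm penalty.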

Labels: enhancement, research