Update abs diff rule to 0 at non-differentiable point #98
oxinabox merged 1 commit into JuliaDiff:master from agerlach:abs
|
Tracker.jl breakage is unrelated. I believe this is correct. (But of course I do; I am explicitly a proponent of this property.) I will merge this tomorrow unless someone raises good objections. |
|
Re: Tracker.jl, I was hoping that was the case. Thanks. |
Codecov Report
Patch coverage:
@@           Coverage Diff           @@
##           master      #98   +/-   ##
=======================================
  Coverage   97.86%   97.86%
=======================================
  Files           3        3
  Lines         187      187
=======================================
  Hits          183      183
  Misses          4        4
|
git blame shows it was added in #33 and there the explanation is arguably a bit clearer: |
|
Some additional historical context: It seems the rule for |
|
@devmotion Thanks for the extra context. |
|
I think we should revert this. It breaks higher order derivatives for some differentiable functions. E.g.
julia> ForwardDiff.hessian(t -> abs(t[1])^2, [0.0])
1×1 Matrix{Float64}:
 2.0
(TestDiffRules) pkg> add DiffRules@1.14
   Resolving package versions...
    Updating `~/TestDiffRules/Project.toml`
  [b552c78f] ↑ DiffRules v1.13.0 ⇒ v1.14.0
    Updating `~/TestDiffRules/Manifest.toml`
  [b552c78f] ↑ DiffRules v1.13.0 ⇒ v1.14.0
julia> ForwardDiff.hessian(t -> abs(t[1])^2, [0.0])
1×1 Matrix{Float64}:
 0.0
The example here is of course trivial but |
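For readers following along, here is a minimal sketch of why the Hessian of abs(x)^2 at 0 changes from 2.0 to 0.0. It does not use ForwardDiff or DiffRules; the `Dual` type, `myabs`, `dabs_old`, `dabs_new`, and `second_derivative` are all hypothetical names made up for this illustration. It assumes the pre-change rule behaves like `signbit(x) ? -1 : 1` and the post-change rule like `sign(x)`, which matches the "returns 1" versus "returns 0" behavior described in this PR.

```julia
# Minimal forward-mode dual number (hypothetical, not ForwardDiff.Dual).
# V may itself be a Dual, which is how second derivatives are obtained.
struct Dual{V}
    val::V   # primal value
    der::V   # derivative (tangent) part
end

Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.val * b.der + a.der * b.val)

# Assumed pre-change rule: piecewise constant, equal to 1 at x == 0.0.
dabs_old(x::Real) = ifelse(signbit(x), -one(x), one(x))
dabs_old(x::Dual) = Dual(dabs_old(x.val), zero(x.val))

# Assumed post-change rule: sign(x), equal to 0 at x == 0.
dabs_new(x::Real) = sign(x)
dabs_new(x::Dual) = Dual(sign(x.val), zero(x.val))  # sign is treated as locally constant

# abs with a pluggable derivative rule.
myabs(rule, x::Real) = abs(x)
myabs(rule, x::Dual) = Dual(myabs(rule, x.val), rule(x.val) * x.der)

# Second derivative via nested duals.
function second_derivative(f, x0::Float64)
    x = Dual(Dual(x0, 1.0), Dual(1.0, 0.0))
    return f(x).der.der
end

f_old(x) = (y = myabs(dabs_old, x); y * y)   # abs(x)^2 with the old rule
f_new(x) = (y = myabs(dabs_new, x); y * y)   # abs(x)^2 with the new rule

println(second_derivative(f_old, 0.0))  # 2.0
println(second_derivative(f_new, 0.0))  # 0.0
```

With the old rule, abs acts like the identity at 0.0, so abs(x)^2 differentiates like x^2. With the new rule, every tangent that passes through abs at 0 is multiplied by 0, so the second-order information is lost.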
This PR updates the diffrule for abs to return 0 at the non-differentiable point. The current implementation returns 1. Although valid, this can prevent convergence in gradient descent. The implementation in this PR is the behavior the ChainRules.jl docs advise.
This also comes with the added benefit of not requiring the type to support the ternary operator, which is a problem for types such as IntervalArithmetic.Interval. This is the use case that led me to make this PR.
With this PR:
The diffrule for abs has the following comment, which I'm not sure how to interpret, as it doesn't work with IntervalArithmetic.Interval or Intervals.Interval. Additionally, the current definition assumes that 0 is not in the interval.
DiffRules.jl/src/rules.jl
Line 71 in 2001650
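To illustrate the gradient descent concern from the description, here is a small sketch in plain Julia. The names `abs_deriv_one_at_zero`, `abs_deriv_zero_at_zero`, and `descend` are hypothetical, and the two rules are assumptions that mirror "returns 1" and "returns 0" at the non-differentiable point rather than the exact DiffRules source.

```julia
# Assumed stand-ins for the two derivative conventions at x == 0.
abs_deriv_one_at_zero(x) = signbit(x) ? -one(x) : one(x)  # returns 1 at 0.0
abs_deriv_zero_at_zero(x) = sign(x)                       # returns 0 at 0.0

# Plain fixed-step gradient descent on abs(x), using the supplied derivative.
function descend(deriv, x; lr = 0.5, steps = 5)
    for _ in 1:steps
        x -= lr * deriv(x)
    end
    return x
end

println(descend(abs_deriv_one_at_zero, 1.0))   # -0.5
println(descend(abs_deriv_zero_at_zero, 1.0))  # 0.0
```

Starting from 1.0 with step 0.5, both runs reach the minimizer 0.0 after two steps. The rule that returns 1 at zero then steps away to -0.5 and keeps bouncing between -0.5 and 0.0, while the rule that returns 0 stays at the minimizer.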