How-To: Automatic Differentiation

Retro computes gradients (and optionally Hessians) through DifferentiationInterface.jl, which means any AD backend that DI supports can be used.

Supported backends (non-exhaustive)

| Backend        | Type tag           | Mode               | Best for                 |
|----------------|--------------------|--------------------|--------------------------|
| ForwardDiff.jl | AutoForwardDiff()  | Forward            | Small-to-medium $n$      |
| Enzyme.jl      | AutoEnzyme()       | Reverse            | Large $n$, compiled code |
| Zygote.jl      | AutoZygote()       | Reverse            | ML-style models          |
| FiniteDiff.jl  | AutoFiniteDiff()   | Finite differences | Black-box functions      |

Tip

AutoForwardDiff() and AutoEnzyme() are re-exported by Retro via ADTypes, so you don't need to import them separately.

Using ForwardDiff (simplest)

using Retro, ForwardDiff

f(x) = 100*(x[2] - x[1]^2)^2 + (1 - x[1])^2
prob = RetroProblem(f, [-1.2, 1.0], AutoForwardDiff())
result = optimize(prob)

Using Enzyme

using Retro, Enzyme

prob = RetroProblem(f, [-1.2, 1.0], AutoEnzyme())  # f as defined above
result = optimize(prob)

Mixing user gradient + AD Hessian

If you have a hand-coded gradient but want AD for the Hessian:

function my_grad!(g, x)
    # Analytic gradient of the Rosenbrock objective, written in place
    g[1] = -400*x[1]*(x[2] - x[1]^2) - 2*(1 - x[1])
    g[2] = 200*(x[2] - x[1]^2)
    return g
end

prob = RetroProblem(f, my_grad!, [-1.2, 1.0], AutoForwardDiff())
result = optimize(prob; hessian_approximation = ExactHessian())

The AD backend is only used for the Hessian here; the gradient uses your function.

Fully analytic (no AD)
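The snippet below assumes a hand-coded Hessian for the Rosenbrock objective defined above; my_hess! is a sketch of what such an in-place Hessian might look like (the name matches the usage below, but any in-place two-argument function works):

```julia
function my_hess!(H, x)
    # In-place Hessian of f(x) = 100*(x[2] - x[1]^2)^2 + (1 - x[1])^2
    H[1, 1] = 1200*x[1]^2 - 400*x[2] + 2
    H[1, 2] = -400*x[1]
    H[2, 1] = -400*x[1]
    H[2, 2] = 200.0
    return H
end
```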

prob = RetroProblem(f, my_grad!, my_hess!, [-1.2, 1.0])
# no AD backend needed

How preparation works

When you construct a RetroProblem (or the underlying objective type), Retro calls DifferentiationInterface.prepare_gradient and DifferentiationInterface.prepare_hessian once on the initial x0. These prep objects are cached inside the objective and reused at every subsequent evaluation, making repeated calls allocation-free.
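To see what this buys you, here is a minimal sketch of the same prepare-then-reuse pattern written directly against DifferentiationInterface (assuming the DI v0.6-style API; Retro does the equivalent internally):

```julia
import DifferentiationInterface as DI
using ForwardDiff: ForwardDiff      # backend package must be loaded
using ADTypes: AutoForwardDiff

f(x) = 100*(x[2] - x[1]^2)^2 + (1 - x[1])^2
backend = AutoForwardDiff()
x0 = [-1.2, 1.0]

# One-time preparation: configs/tapes are built here, not on every call
prep = DI.prepare_gradient(f, backend, x0)

# Subsequent evaluations reuse `prep`, so the hot loop stays allocation-free
g = similar(x0)
DI.gradient!(f, g, prep, backend, x0)
```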

Troubleshooting

"Method error: no method matching prepare_gradient" → You forgot to load the backend package. Add using ForwardDiff (or whichever backend you chose) before constructing the problem.

Slow first call → Normal: Julia compiles the code on first use, and subsequent calls will be fast. If startup latency matters, consider PrecompileTools.jl or building a custom sysimage.

Backend doesn't support Hessians → Some reverse-mode backends only support gradients. Use BFGS() (default) as the Hessian approximation — it only needs gradients. Switch to ExactHessian() only with backends that support second-order derivatives (ForwardDiff, Enzyme).
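For example, with a gradient-only reverse-mode backend such as Zygote, the quasi-Newton default keeps everything first-order. This sketch reuses the RetroProblem/optimize API from the sections above; note that AutoZygote() comes from ADTypes and is not among the re-exports listed in the Tip:

```julia
using Retro, Zygote
using ADTypes: AutoZygote   # not re-exported by Retro (see Tip above)

f(x) = 100*(x[2] - x[1]^2)^2 + (1 - x[1])^2
prob = RetroProblem(f, [-1.2, 1.0], AutoZygote())

# BFGS builds its Hessian approximation from gradients alone,
# so no second-order support is required from the backend
result = optimize(prob; hessian_approximation = BFGS())
```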