4 — Working in Subspaces

The problem

The trust-region subproblem,

\[\min_{\|p\|\le\Delta}\; \nabla f^\top p + \tfrac12 p^\top H\, p,\]

is an $n$-dimensional constrained optimisation problem in its own right. Solving it exactly requires an eigenvalue decomposition of $H$, which costs $O(n^3)$ and dominates the computation for large $n$.

The idea

Instead of solving in the full $n$-dimensional space, project the problem onto a low-dimensional subspace and solve there. This is much cheaper and, for well-chosen subspaces, gives nearly as good a step.
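Concretely, "projecting" means replacing $\nabla f$ and $H$ by their restrictions to the subspace: given an orthonormal basis $V \in \mathbb{R}^{n\times k}$, the reduced subproblem uses $V^\top \nabla f$ and $V^\top H V$. A minimal NumPy sketch of this (the function name and shapes are illustrative only, not part of Retro's API):

```python
import numpy as np

def project_subproblem(g, H, V):
    """Restrict gradient g and Hessian H to the subspace spanned by the
    (orthonormal) columns of V, shape (n, k)."""
    return V.T @ g, V.T @ H @ V

# Project a 4-dimensional problem onto a 2-dimensional subspace.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
H = A @ A.T                                       # symmetric Hessian
g = rng.standard_normal(4)
V, _ = np.linalg.qr(rng.standard_normal((4, 2)))  # orthonormal basis
g2, H2 = project_subproblem(g, H, V)              # shapes (2,) and (2, 2)
```

The reduced problem is a $k$-dimensional trust-region subproblem, so its exact solution costs $O(k^3)$ instead of $O(n^3)$.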

Retro offers three subspace strategies:


TwoDimSubspace (default)

Build a 2-D plane spanned by:

  1. The (negative) gradient direction $-\nabla f$
  2. The Newton direction $-H^{-1} \nabla f$ (or its quasi-Newton analogue)

These two vectors are orthogonalised (Gram–Schmidt) and the 2×2 projected Hessian is formed. The trust-region subproblem is then solved exactly in this 2-D space using an eigenvalue decomposition — which is trivial for a 2×2 matrix.

        ↑ Newton direction
        │  ╱ ← 2-D trust region
        │╱      (ellipse in the plane)
        ·────→ gradient direction

Why this works

The gradient captures steepest descent; the Newton direction captures curvature. Together they span the most informative 2-D slice of the problem.

If the two directions are (nearly) parallel, Retro falls back to a 1-D subspace (gradient only).
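The basis construction, including the parallel-direction fallback, can be sketched as follows (a NumPy illustration with hypothetical names; Retro's internals may differ):

```python
import numpy as np

def two_dim_basis(g, H, tol=1e-8):
    """Orthonormal basis for span{-g, -H^{-1} g} via Gram-Schmidt.
    Falls back to the 1-D gradient subspace when the two directions
    are nearly parallel."""
    d1 = -g / np.linalg.norm(g)            # unit steepest-descent direction
    d2 = -np.linalg.solve(H, g)            # (quasi-)Newton direction
    r = d2 - (d2 @ d1) * d1                # Gram-Schmidt residual
    if np.linalg.norm(r) < tol * np.linalg.norm(d2):
        return d1[:, None]                 # nearly parallel -> 1-D subspace
    return np.column_stack([d1, r / np.linalg.norm(r)])

H = np.array([[4.0, 1.0],
              [1.0, 3.0]])
g = np.array([1.0, -2.0])
V = two_dim_basis(g, H)    # V.T @ V ≈ I: columns are orthonormal
```

With $V$ in hand, the projected gradient $V^\top \nabla f$ and 2×2 projected Hessian $V^\top H V$ define the small subproblem that is then solved exactly.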

CGSubspace

Steihaug–Toint truncated conjugate gradient. This builds a Krylov subspace incrementally using Hessian–vector products, without ever forming the full Hessian. It is ideal for large-scale problems.

The CG iteration stops when:

  • The residual is small enough.
  • Negative curvature is detected (the step heads to the TR boundary).
  • The trust-region boundary is reached.
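The three stopping rules above can be sketched as follows. This is a NumPy illustration of the standard Steihaug–Toint recursion, not Retro's actual implementation; only a Hessian–vector product `hvp` is required.

```python
import numpy as np

def _to_boundary(p, d, delta):
    """Positive tau with ||p + tau*d|| = delta."""
    a, b, c = d @ d, 2 * (p @ d), p @ p - delta**2
    return (-b + np.sqrt(b * b - 4 * a * c)) / (2 * a)

def steihaug_cg(hvp, g, delta, tol=1e-8, max_iter=None):
    """Truncated CG for min g.p + 0.5 p.H.p subject to ||p|| <= delta.
    H is accessed only through hvp(v) = H @ v."""
    n = len(g)
    max_iter = max_iter if max_iter is not None else n
    p = np.zeros(n)
    r = g.copy()                     # residual g + H p at p = 0
    d = -r
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol:
            return p                 # (1) residual small: interior solution
        Hd = hvp(d)
        dHd = d @ Hd
        if dHd <= 0:                 # (2) negative curvature: go to boundary
            return p + _to_boundary(p, d, delta) * d
        alpha = (r @ r) / dHd
        if np.linalg.norm(p + alpha * d) >= delta:
            return p + _to_boundary(p, d, delta) * d   # (3) boundary reached
        p = p + alpha * d
        r_new = r + alpha * Hd
        beta = (r_new @ r_new) / (r @ r)
        r = r_new
        d = -r + beta * d
    return p
```

On a convex problem with a large radius this reduces to plain CG and returns the Newton step; with a small radius or indefinite curvature it exits early on the boundary, which is what makes it attractive at large scale.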

FullSpace

Solves the trust-region subproblem in the full $n$-dimensional space using an eigenvalue decomposition of $H$. Most accurate, but $O(n^3)$ — only practical for small problems (roughly $n \le 10$).

For very small $n$, Retro uses StaticArrays to avoid allocations entirely.

Robust secular solve

The EigenTRSolver used under the hood tries Newton iteration on the secular equation first; if that doesn't converge it falls back to Brent root-finding (bracket + inverse-quadratic interpolation), and finally to a Cauchy step. This three-phase chain guarantees a feasible step even for near-singular Hessians.
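In the eigenbasis the secular equation is explicit: with $H = Q\,\mathrm{diag}(w)\,Q^\top$ and $\hat g = Q^\top \nabla f$, one seeks $\lambda \ge \max(0, -w_{\min})$ with $\lVert p(\lambda)\rVert = \Delta$, $p(\lambda) = -(H+\lambda I)^{-1}\nabla f$. The sketch below (NumPy; the function name is hypothetical and the hard case is omitted) runs Newton on $\phi(\lambda)=1/\Delta-1/\lVert p(\lambda)\rVert$ and falls back to bisection when a Newton step leaves the bracket; bisection here stands in for the Brent phase described above.

```python
import numpy as np

def secular_tr_step(H, g, delta, maxit=100, tol=1e-10):
    """Exact trust-region step via the secular equation (sketch)."""
    w, Q = np.linalg.eigh(H)
    gh = Q.T @ g                          # gradient in the eigenbasis
    if w[0] > 0:                          # try the interior Newton step first
        p = -gh / w
        if np.linalg.norm(p) <= delta:
            return Q @ p
    lo = max(0.0, -w[0]) + 1e-12          # need lam > -lambda_min
    hi = lo + 1.0
    while np.linalg.norm(gh / (w + hi)) > delta:
        hi = lo + 2.0 * (hi - lo)         # expand until ||p(hi)|| <= delta
    lam = 0.5 * (lo + hi)
    for _ in range(maxit):
        s = gh / (w + lam)                # = -p(lam) in the eigenbasis
        nrm = np.linalg.norm(s)
        if abs(nrm - delta) <= tol * delta:
            break
        if nrm > delta:                   # keep the bracket valid
            lo = lam
        else:
            hi = lam
        dnrm = -np.sum(s * s / (w + lam)) / nrm      # d||p||/dlam
        lam_new = lam - (1.0/delta - 1.0/nrm) / (dnrm / nrm**2)
        # Bisect whenever the Newton update escapes the bracket.
        lam = lam_new if lo < lam_new < hi else 0.5 * (lo + hi)
    return Q @ (-gh / (w + lam))
```

The $\phi$ parametrisation is the standard choice because it is nearly linear in $\lambda$, so the guarded Newton iteration converges in a handful of steps even when $H$ is indefinite.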


Choosing a subspace

Subspace            Cost per iteration                Best for
TwoDimSubspace()    $O(n)$ + 2×2 eigensolve           General use, default
CGSubspace()        $O(n \cdot k)$ ($k$ = CG iters)   Large problems
FullSpace()         $O(n^3)$                          Small, high-accuracy problems

# Large problem — use CG
result = optimize(prob; subspace = CGSubspace())

# Tiny problem — use full eigensolve
result = optimize(prob; subspace = FullSpace())

Previous ← Handling Bounds · Next → Hessian Approximations