Licensed under the Creative Commons Attribution-NonCommercial license. Please share & remix noncommercially, mentioning its origin.

Optimization

no free lunch

  • Wikipedia
  • “any two algorithms are equivalent when their performance is averaged across all possible problems.”

  • a “good” optimization method is one that works on the kinds of problems you have

implementations

the devil is in the details; different implementations of the same algorithm may differ in …

  • performance
  • robustness
  • interface (arguments, return values)
  • controls
  • convergence testing
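
A minimal sketch, using only base R, of how two implementations of the same idea can differ: optim() and nlminb() minimize the same toy objective, but with different argument names, different control settings, and differently structured return values.

    ## same quadratic objective handed to two base-R optimizers
    f <- function(p) sum((p - c(1, 2))^2)

    o1 <- optim(par = c(0, 0), fn = f, method = "BFGS",
                control = list(reltol = 1e-10, maxit = 500))
    o2 <- nlminb(start = c(0, 0), objective = f,
                 control = list(rel.tol = 1e-10, iter.max = 500))

    o1$par; o1$value; o1$convergence    ## optim(): value, numeric convergence code
    o2$par; o2$objective; o2$message    ## nlminb(): objective, text message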

Local optimization

derivative-based optimization

  • uses first-derivative (gradient/direction) and second-derivative (Hessian/curvature) information
  • assumes smooth objective function
  • R’s optimization functions don’t require a user-supplied gradient function; if none is given, they approximate the gradient by finite differences (see the sketch below)
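
A sketch with base R’s optim() and the standard Rosenbrock test function: if the gr argument is omitted, gradient-based methods such as BFGS approximate the gradient by finite differences; supplying an analytic gradient is usually cheaper and more accurate.

    ## Rosenbrock function and its analytic gradient
    rosen <- function(p) (1 - p[1])^2 + 100 * (p[2] - p[1]^2)^2
    rosen_gr <- function(p) {
        c(-2 * (1 - p[1]) - 400 * p[1] * (p[2] - p[1]^2),
          200 * (p[2] - p[1]^2))
    }
    optim(c(-1.2, 1), rosen, method = "BFGS")                 ## finite-difference gradient
    optim(c(-1.2, 1), rosen, gr = rosen_gr, method = "BFGS")  ## analytic gradient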

derivative-based methods

  • see Press et al. (1994) (or later editions)
  • gradient descent: widely used in machine learning
    • easy, scalable
    • but you can almost certainly do better (see the sketch after this list)
  • Newton’s method: simple but rarely practical as written (unstable far from the optimum; computing the full Hessian is expensive)
  • Levenberg-Marquardt: specifically for least squares problems
  • conjugate gradients (CG), quasi-Newton (BFGS, L-BFGS-B)
    • various fancier methods
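
A sketch of the “easy, but you can do better” point above: a naive fixed-step gradient descent versus BFGS (via optim()) on the Rosenbrock function; the step size and iteration count are arbitrary illustrative choices, and tuning them is part of the problem.

    ## naive fixed-step gradient descent vs. BFGS on the Rosenbrock function
    rosen    <- function(p) (1 - p[1])^2 + 100 * (p[2] - p[1]^2)^2
    rosen_gr <- function(p) c(-2 * (1 - p[1]) - 400 * p[1] * (p[2] - p[1]^2),
                              200 * (p[2] - p[1]^2))
    grad_descent <- function(p, gr, step = 1e-3, n = 5000) {
        for (i in seq_len(n)) p <- p - step * gr(p)
        p
    }
    grad_descent(c(-1.2, 1), rosen_gr)                            ## crawls slowly along the valley
    optim(c(-1.2, 1), rosen, gr = rosen_gr, method = "BFGS")$par  ## much closer to c(1, 1)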

derivative-free methods

  • for non-smooth surfaces, or when derivatives are expensive
  • Nelder-Mead
  • BOBYQA, COBYLA, NEWUOA (Powell) …
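
A sketch with a non-smooth toy objective: Nelder-Mead (the default method in base R’s optim()) uses no derivatives at all; the commented-out line assumes the minqa package (one home of Powell’s BOBYQA) is installed.

    ## non-smooth objective: absolute values plus a mild quadratic penalty
    f_abs <- function(p) sum(abs(p - c(1, 2))) + 0.1 * sum(p^2)
    optim(c(5, 5), f_abs)              ## Nelder-Mead, derivative-free
    ## minqa::bobyqa(c(5, 5), f_abs)   ## Powell's BOBYQA, if minqa is available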

Stochastic/global optimization

stochastic optimization

simulated annealing

  • local search with decreasing noise
  • local “jumps” according to a candidate distribution
  • always accept improvements; accept a worse candidate with probability \(\exp(-\Delta G/T)\) (see the sketch after this list)
    • high temperature: random search
    • low temperature: local ratchet
  • tuning: candidate distribution, cooling schedule
  • very general (don’t need continuous parameters)
  • compare Metropolis-Hastings in Markov chain Monte Carlo
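
A bare-bones sketch of the algorithm, with the acceptance rule and cooling schedule written out explicitly; the candidate distribution (Gaussian jumps), starting temperature, and cooling rate are all arbitrary tuning choices. (Base R’s optim() offers a variant of this via method = "SANN".)

    ## minimal simulated annealing: accept uphill moves with prob exp(-delta/temp)
    sim_anneal <- function(fn, p, temp = 10, cool = 0.999, n = 10000, sd = 0.5) {
        f_cur <- fn(p)
        best <- p; f_best <- f_cur
        for (i in seq_len(n)) {
            cand <- p + rnorm(length(p), sd = sd)             ## local jump
            delta <- fn(cand) - f_cur
            if (delta < 0 || runif(1) < exp(-delta / temp)) { ## acceptance rule
                p <- cand; f_cur <- f_cur + delta
            }
            if (f_cur < f_best) { best <- p; f_best <- f_cur }
            temp <- temp * cool                               ## geometric cooling
        }
        list(par = best, value = f_best)
    }
    sim_anneal(function(p) sum(p^2) + sin(5 * p[1]), c(4, 4))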

genetic algorithms

  • population-based
  • biological analogies
  • reproduction, selection, mutation, recombination (“crossover”)
  • flexible
  • parameters often coded as bit strings
  • genoud() in the rgenoud package (see the sketch below)
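
A sketch assuming the rgenoud package is available: genoud() evolves a population of real-valued parameter vectors by selection, mutation, and crossover; the population size and parameter bounds below are arbitrary illustrative settings.

    library(rgenoud)
    ## multimodal toy objective; Domains gives lower/upper bounds per parameter
    fit_fn <- function(p) sum(p^2) + sin(5 * p[1])
    g <- genoud(fit_fn, nvars = 2, pop.size = 200,
                Domains = cbind(rep(-5, 2), rep(5, 2)))
    g$par   ## best parameter vector found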

differential evolution

  • population-based
  • derivative-free
  • continuous parameters
  • DEoptim package
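
A sketch assuming the DEoptim package is available: differential evolution searches a continuous, box-constrained parameter space; the bounds and iteration limit below are arbitrary illustrative choices.

    library(DEoptim)
    ## same multimodal toy objective, bounded search region
    de_fn <- function(p) sum(p^2) + sin(5 * p[1])
    res <- DEoptim(de_fn, lower = c(-5, -5), upper = c(5, 5),
                   control = DEoptim.control(itermax = 100, trace = FALSE))
    res$optim$bestmem   ## best parameter vector found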

Press, William H., Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery. 1994. Numerical Recipes in C: The Art of Scientific Computing. Cambridge University Press.