Licensed under the Creative Commons attribution-noncommercial license. Please share & remix noncommercially, mentioning its origin.
Likelihood
Definition
- probability of data given a model (structure & parameters)
- in R: distributions via d* functions (base, Distributions Task View)
- usually: complex model for the location, simpler models for the scale and shape
- e.g. Gamma with fixed shape, varying mean
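
As a rough illustration of the points above, here is a minimal sketch of a likelihood built from a d* function: a Gamma response with a single shape parameter and a mean that varies with a covariate. The simulated data, the log link, the parameter values, and the names `nll` and `fit` are all assumptions made for this example.

```r
# Simulated data (hypothetical): Gamma response, mean depends on x, fixed shape
set.seed(101)
x <- runif(200)
true_shape <- 2
mu <- exp(1 + 2 * x)                          # "complex" model for the location
y <- rgamma(200, shape = true_shape, scale = mu / true_shape)

# Negative log-likelihood: minus the sum of log-densities from dgamma()
nll <- function(par) {
  mu_hat <- exp(par[1] + par[2] * x)          # mean varies with x
  shape  <- exp(par[3])                       # one shape, fitted on the log scale
  -sum(dgamma(y, shape = shape, scale = mu_hat / shape, log = TRUE))
}

# Maximize the likelihood by minimizing the negative log-likelihood
fit <- optim(c(0, 0, 0), nll, method = "BFGS", hessian = TRUE)
fit$par   # intercept, slope, log(shape)
```

The Gamma mean is shape times scale, so setting `scale = mu_hat / shape` lets the mean vary while the shape stays constant across observations.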
MLEs are consistent and asymptotically Normal
- consistent = converge to the true values as the number of independent observations grows to infinity
- asymptotic Normality is the basis for the approximate (Wald) standard errors from summary()
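
To make the link between asymptotic Normality and summary() concrete, the sketch below checks that the standard errors reported for a Poisson GLM are the square roots of the diagonal of the estimated covariance matrix. The simulated data and object names are assumptions for illustration only.

```r
# Hypothetical Poisson regression data
set.seed(1)
d <- data.frame(x = rnorm(100))
d$y <- rpois(100, lambda = exp(0.5 + 0.3 * d$x))

m <- glm(y ~ x, family = poisson, data = d)

# Standard errors as reported by summary()
se_summary <- coef(summary(m))[, "Std. Error"]

# The same quantities from the estimated covariance matrix of the MLEs
se_vcov <- sqrt(diag(vcov(m)))

all.equal(se_summary, se_vcov)

# Wald 95% confidence intervals: estimate +/- qnorm(0.975) * SE
ci_wald <- cbind(lower = coef(m) - qnorm(0.975) * se_vcov,
                 upper = coef(m) + qnorm(0.975) * se_vcov)
ci_wald
```

For a fitted `glm`, `confint.default(m)` should give the same Wald intervals.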
MLEs are asymptotically efficient
- important but a bit delicate.
- as number of independent observations \(n\) increases, the standard errors on each parameter decrease in proportion to \(C/\sqrt{n}\) for some constant \(C\)
- asymptotically efficient means that no unbiased estimator has standard errors that shrink at a strictly faster rate (i.e., with a smaller value of \(C\), or a higher power of \(n\) in the denominator).
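
The \(C/\sqrt{n}\) scaling can be checked by simulation. The sketch below uses an assumed toy case, the MLE of an exponential rate (one over the sample mean), and measures the spread of the estimate at several sample sizes. The function name `sd_of_mle` and all settings are hypothetical.

```r
set.seed(42)

# Standard deviation of the MLE of an exponential rate across repeated samples
sd_of_mle <- function(n, nsim = 2000, rate = 2) {
  est <- replicate(nsim, 1 / mean(rexp(n, rate = rate)))  # MLE of rate is 1/mean
  sd(est)
}

n_vals <- c(100, 400, 1600)
sds <- sapply(n_vals, sd_of_mle)

# Quadrupling n should roughly halve the standard error
sds[-length(sds)] / sds[-1]

# sd * sqrt(n) should be roughly constant; that constant plays the role of C
sds * sqrt(n_vals)
```

In this particular case theory gives \(C\) equal to the true rate, so the last line should return values near 2, up to simulation noise.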
MLEs = Swiss Army knife
- MLEs make sense
- lots of justifying theory
- when maximum likelihood can do the job, it’s rarely the best tool for the job, but it’s rarely much worse than the best (at least for large samples)
- most statistical models (least-squares, GLMs) are special cases of MLE
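
As a small check of the "special case" claim, the sketch below fits a straight line twice: once by least squares with lm(), and once by numerically maximizing a Normal likelihood. The simulated data and the names `nll`, `fit_ml`, and `fit_ls` are assumptions for illustration.

```r
# Hypothetical linear-regression data with Normal errors
set.seed(7)
x <- runif(50)
y <- 1 + 2 * x + rnorm(50, sd = 0.5)

# Normal negative log-likelihood; residual SD fitted on the log scale
nll <- function(par) {
  -sum(dnorm(y, mean = par[1] + par[2] * x, sd = exp(par[3]), log = TRUE))
}

fit_ml <- optim(c(0, 0, 0), nll, method = "BFGS")   # maximum likelihood
fit_ls <- lm(y ~ x)                                 # least squares

# Intercept and slope should agree up to optimizer tolerance
rbind(mle = fit_ml$par[1:2], ols = coef(fit_ls))
```

The coefficient estimates coincide, but the variance estimates do not quite: the MLE of the residual variance divides the residual sum of squares by \(n\), whereas lm() divides by the residual degrees of freedom.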
Inference
- Wald approximation: quadratic approximation (parabolas/ellipses)
  - p-values: \(Z\)-scores (\(\hat \beta/\sigma_{\hat \beta} \sim N(0,1)\))
  - confidence intervals: based on \(N(\hat \beta, \sigma_{\hat \beta})\)
  - strongly asymptotic
- likelihood:
  - p-values: likelihood ratio test (\(-2 \Delta \log L \sim \chi^2_k\), where \(k\) is the difference in the number of parameters between the nested models)
  - CIs: likelihood profiles
- bootstrapping
  - nonparametric (slow, requires independence)
  - parametric (slow, model-dependent)
- Bayesian
  - requires priors
  - strongly model-dependent
  - often slow
  - … but solves many problems
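
The Wald and likelihood-based approaches can be compared on a single coefficient. The following is a minimal sketch in base R, assuming a simulated Poisson regression. The data, the helper `prof_dev`, and the search range of five standard errors on each side are hypothetical choices, not a general recipe.

```r
# Hypothetical Poisson regression data
set.seed(2024)
d <- data.frame(x = rnorm(60))
d$y <- rpois(60, lambda = exp(0.2 + 0.4 * d$x))

m0 <- glm(y ~ 1, family = poisson, data = d)   # null model (slope fixed at 0)
m1 <- glm(y ~ x, family = poisson, data = d)   # full model

# Wald: Z-score, p-value, and 95% CI from the quadratic approximation
est <- unname(coef(m1)["x"])
se  <- unname(sqrt(diag(vcov(m1)))["x"])
z   <- est / se
p_wald  <- 2 * pnorm(abs(z), lower.tail = FALSE)
ci_wald <- est + c(-1, 1) * qnorm(0.975) * se

# Likelihood ratio test: -2 * (difference in log-likelihood) vs chi-squared, 1 df
lrt_stat <- as.numeric(2 * (logLik(m1) - logLik(m0)))
p_lrt <- pchisq(lrt_stat, df = 1, lower.tail = FALSE)

# Profile CI: fix the slope at b, refit the intercept, and find where the
# deviance difference crosses the 95% chi-squared cutoff
prof_dev <- function(b) {
  m_b <- glm(y ~ 1 + offset(b * x), family = poisson, data = d)
  as.numeric(2 * (logLik(m1) - logLik(m_b))) - qchisq(0.95, df = 1)
}
ci_prof <- c(uniroot(prof_dev, c(est - 5 * se, est))$root,
             uniroot(prof_dev, c(est, est + 5 * se))$root)

rbind(wald = ci_wald, profile = ci_prof)
c(wald = p_wald, lrt = p_lrt)
```

With well-behaved data like these the two sets of answers should be close. They tend to diverge when the log-likelihood surface is far from quadratic, which is where the "strongly asymptotic" caveat on the Wald approximation matters.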
Beyond Normal errors, finite-size corrections are tricky
…