Abbott, M. C., & Machta, B. B. (2022). Far from Asymptopia (No. arXiv:2205.03343). arXiv. https://arxiv.org/abs/2205.03343
Abramovich, F., Sapatinas, T., & Silverman, B. W. (1998). Wavelet thresholding via a Bayesian approach. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 60(4), 725–749. https://doi.org/10.1111/1467-9868.00151
Agrawal, R., Huggins, J. H., Trippe, B., & Broderick, T. (2019). The kernel interaction trick: Fast Bayesian discovery of pairwise interactions in high dimensions. arXiv:1905.06501 [cs, stat]. http://arxiv.org/abs/1905.06501
Alam, M. A., & Fukumizu, K. (2014). Hyperparameter selection in kernel principal component analysis. Journal of Computer Science, 10(7), 1139.
Anderson, E. (1935). The irises of the Gaspé Peninsula. Bulletin of the American Iris Society, 59, 2–5.
Anderson, E. (1936). The species problem in Iris. Annals of the Missouri Botanical Garden, 23(3), 457–509. https://doi.org/10.2307/2394164
Atlas. (2013). QR factorization for ridge regression. In Mathematics Stack Exchange. https://math.stackexchange.com/questions/299481/qr-factorization-for-ridge-regression
Banerjee, S. (2017). High-dimensional Bayesian geostatistics. Bayesian Analysis, 12(2), 583–614. https://doi.org/10.1214/17-BA1056R
Banerjee, S., & Gelfand, A. E. (2003). On smoothness properties of spatial processes. Journal of Multivariate Analysis, 84(1), 85–100. https://doi.org/10.1016/S0047-259X(02)00016-7
Barber, R. F., Candès, E. J., Ramdas, A., & Tibshirani, R. J. (2021). Predictive inference with the jackknife+. The Annals of Statistics, 49(1), 486–507. https://doi.org/10.1214/20-AOS1965
Barber, S., & Nason, G. P. (2004). Real nonparametric regression using complex wavelets. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 66(4), 927–939. https://doi.org/10.1111/j.1467-9868.2004.B5604.x
Barillec, R., Ingram, B., Cornford, D., & Csató, L. (2011). Projected sequential Gaussian processes: A C++ tool for interpolation of large datasets with heterogeneous noise. Computers & Geosciences, 37(3), 295–309. https://doi.org/10.1016/j.cageo.2010.05.008
Bates, S., Hastie, T., & Tibshirani, R. (n.d.). Cross-validation: What does it estimate and how well does it do it?
Bezdek, J. C., Keller, J. M., Krishnapuram, R., Kuncheva, L. I., & Pal, N. R. (1999). Will the real iris data please stand up? IEEE Transactions on Fuzzy Systems, 7(3), 368–369. https://doi.org/10.1109/91.771092
Bien, J., Taylor, J., & Tibshirani, R. (2013). A lasso for hierarchical interactions. The Annals of Statistics, 41(3), 1111–1141. https://doi.org/10.1214/13-AOS1096
Blanchet, F. G., Legendre, P., & Borcard, D. (2008). Forward selection of explanatory variables. Ecology, 89(9), 2623–2632. https://doi.org/10.1890/07-0986.1
Bodin, E., Campbell, N. D. F., & Ek, C. H. (2017). Latent Gaussian process regression. arXiv:1707.05534 [cs, stat]. http://arxiv.org/abs/1707.05534
Bodmer, W., Bailey, R. A., Charlesworth, B., Eyre-Walker, A., Farewell, V., Mead, A., & Senn, S. (2021). The outstanding scientist, R. A. Fisher: His views on eugenics and race. Heredity, 126(4), 565–576. https://doi.org/10.1038/s41437-020-00394-6
Bourotte, M., Allard, D., & Porcu, E. (2016). A flexible class of non-separable cross-covariance functions for multivariate space–time data. Spatial Statistics, 18, 125–146. https://doi.org/10.1016/j.spasta.2016.02.004
Breiman, L. (1996). Heuristics of instability and stabilization in model selection. The Annals of Statistics, 24(6), 2350–2383. https://doi.org/10.1214/aos/1032181158
Breiman, L. (2001). Statistical modeling: The two cultures. Statistical Science, 16(3), 199–215. http://www.jstor.org/stable/2676681
Breiman, L., & Friedman, J. H. (1985). Estimating optimal transformations for multiple regression and correlation. Journal of the American Statistical Association, 80(391), 580–598. https://doi.org/10.1080/01621459.1985.10478157
Breiman, L., & Friedman, J. H. (1988). Tree-structured classification via generalized discriminant analysis: Comment. Journal of the American Statistical Association, 83(403), 725–727. https://doi.org/10.2307/2289296
Breiman, L., & Spector, P. (1992). Submodel selection and evaluation in regression. The X-random case. International Statistical Review / Revue Internationale de Statistique, 60(3), 291–319. https://doi.org/10.2307/1403680
bremen79. (2020). Neural networks (maybe) evolved to make Adam the best optimizer. In Parameter-free Learning and Optimization Algorithms.
Bryan, J. (2017). Project-oriented workflow. In Tidyverse. https://www.tidyverse.org/blog/2017/12/workflow-vs-script/
Buckingham-Jeffery, E., Isham, V., & House, T. (2018). Gaussian process approximations for fast inference from infectious disease data. Mathematical Biosciences, 301, 111–120. https://doi.org/10.1016/j.mbs.2018.02.003
Burden, S., Cressie, N., & Steel, D. G. (2015). The SAR model for very large datasets: A reduced rank approach. Econometrics, 3(2), 317–338. https://doi.org/10.3390/econometrics3020317
Biecek, P., & Burzykowski, T. (n.d.). 5 Introduction to instance-level exploration. In Explanatory model analysis.
Biecek, P., & Burzykowski, T. (2020). Explanatory model analysis.
Bussola, N., Marcolini, A., Maggio, V., Jurman, G., & Furlanello, C. (2020). AI slipping on tiles: Data leakage in digital pathology. arXiv. https://doi.org/10.48550/arXiv.1909.06539
Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16, 321–357. https://doi.org/10.1613/jair.953
Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–794. https://doi.org/10.1145/2939672.2939785
Chen, Y., & Yang, Y. (2021). The one standard error rule for model selection: Does it work? Stats, 4(4), 868–892. https://doi.org/10.3390/stats4040051
Chipman, H. A., George, E. I., & McCulloch, R. E. (2010). BART: Bayesian additive regression trees. The Annals of Applied Statistics, 4(1). https://doi.org/10.1214/09-AOAS285
Cho, P. H. (2018). Does XGBoost do Newton boosting? In GitHub. https://github.com/dmlc/xgboost/issues/3227
Clarke, B., Clarke, J., & Yu, C. W. (2014). Statistical problem classes and their links to information theory. Econometric Reviews, 33(1–4), 337–371. https://doi.org/10.1080/07474938.2013.807190
Cygu, S., Seow, H., Dushoff, J., & Bolker, B. M. (2023). Comparing machine learning approaches to incorporate time-varying covariates in predicting cancer survival time. Scientific Reports, 13(1), 1370. https://doi.org/10.1038/s41598-023-28393-7
Dahlgren, J. P. (2010). Alternative regression methods are not considered in Murtaugh (2009) or by ecologists in general. Ecology Letters, 13(5), E7–E9. https://doi.org/10.1111/j.1461-0248.2010.01460.x
Datta, A., Banerjee, S., Finley, A. O., & Gelfand, A. E. (2016). Hierarchical nearest-neighbor Gaussian process models for large geostatistical datasets. Journal of the American Statistical Association, 111(514), 800–812. https://doi.org/10.1080/01621459.2015.1044091
Datta, A., Banerjee, S., Finley, A. O., Hamm, N. A. S., & Schaap, M. (2016). Nonseparable dynamic nearest neighbor Gaussian process models for large spatio-temporal data with an application to particulate matter analysis. The Annals of Applied Statistics, 10(3). https://doi.org/10.1214/16-AOAS931
De Oliveira, V., & Han, Z. (2022). On information about covariance parameters in Gaussian Matérn random fields. Journal of Agricultural, Biological and Environmental Statistics. https://doi.org/10.1007/s13253-022-00510-5
Dezeure, R., Bühlmann, P., Meier, L., & Meinshausen, N. (2015). High-dimensional inference: Confidence intervals, p-values and R software hdi. Statistical Science, 30(4), 533–558. https://doi.org/10.1214/15-STS527
Donoho, D. L., Johnstone, I. M., Kerkyacharian, G., & Picard, D. (1995). Wavelet shrinkage: Asymptopia? Journal of the Royal Statistical Society: Series B (Methodological), 57(2), 301–337. https://doi.org/10.1111/j.2517-6161.1995.tb02032.x
Dyson, F. (2005). Wise man. New York Review of Books. https://www.nybooks.com/articles/2005/10/20/wise-man/
Efron, B., & Gong, G. (1983). A leisurely look at the bootstrap, the jackknife, and cross-validation. The American Statistician, 37(1), 36–48. https://doi.org/10.1080/00031305.1983.10483087
Eilers, P. H. C., & Marx, B. D. (1996). Flexible smoothing with B-splines and penalties. Statistical Science, 11(2), 89–121. https://doi.org/10.1214/ss/1038425655
El-Bachir, Y., & Davison, A. C. (n.d.). Fast automatic smoothing for generalized additive models.
Elith, J., Leathwick, J. R., & Hastie, T. (2008). A working guide to boosted regression trees. Journal of Animal Ecology, 77(4), 802–813. https://doi.org/10.1111/j.1365-2656.2008.01390.x
Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7(2), 179–188. https://doi.org/10.1111/j.1469-1809.1936.tb02137.x
Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 29(5), 1189–1232. https://www.jstor.org/stable/2699986
Friedman, J., Hastie, T., Höfling, H., & Tibshirani, R. (2007). Pathwise coordinate optimization. The Annals of Applied Statistics, 1(2), 302–332. https://doi.org/10.1214/07-AOAS131
Friedman, J., Hastie, T., & Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics (Oxford, England), 9(3), 432–441. https://doi.org/10.1093/biostatistics/kxm045
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1–22. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2929880/
Garrido-Merchán, E. C., & Hernández-Lobato, D. (2017). Dealing with integer-valued variables in Bayesian optimization with Gaussian processes. arXiv:1706.03673 [stat]. http://arxiv.org/abs/1706.03673
Gelman, A. (2020). The typical set and its relevance to Bayesian computation. In Statistical Modeling, Causal Inference, and Social Science. https://statmodeling.stat.columbia.edu/2020/08/02/the-typical-set-and-its-relevance-to-bayesian-computation/
Gelman, A. (2021). Reflections on Breiman’s two cultures of statistical modeling. Observational Studies, 7(1), 95–98. https://doi.org/10.1353/obs.2021.0025
Giraud-Carrier, C., & Provost, F. (2005). Toward a justification of meta-learning: Is the no free lunch theorem a show-stopper? Proceedings of the ICML-2005 Workshop on Meta-Learning.
Girolami, M., Calderhead, B., & Chin, S. A. (2019). Riemannian manifold Hamiltonian Monte Carlo. arXiv:0907.1100 [cs, math, stat]. https://arxiv.org/abs/0907.1100
Golub, G. H., Heath, M., & Wahba, G. (1979). Generalized cross-validation as a method for choosing a good ridge parameter. Technometrics, 21(2), 215–223. https://doi.org/10.1080/00401706.1979.10489751
Görtler, J., Kehlbeck, R., & Deussen, O. (2019). A visual exploration of Gaussian processes. Distill, 4(4), e17. https://doi.org/10.23915/distill.00017
Gramacy, R. B. (n.d.). Surrogates.
Guo, C., Pleiss, G., Sun, Y., & Weinberger, K. Q. (2017). On calibration of modern neural networks. Proceedings of the 34th International Conference on Machine Learning, 1321–1330. https://proceedings.mlr.press/v70/guo17a.html
Hand, D. J. (2009). Measuring classifier performance: A coherent alternative to the area under the ROC curve. Machine Learning, 77(1), 103–123. https://doi.org/10.1007/s10994-009-5119-5
Harris, D. J. (2015). Generating realistic assemblages with a joint species distribution model. Methods in Ecology and Evolution, 6(4), 465–473. https://doi.org/10.1111/2041-210X.12332
Hastie, T. (2020). Ridge regularization: An essential concept in data science. Technometrics, 62(4), 426–433. https://doi.org/10.1080/00401706.2020.1791959
Hastie, T., & Tibshirani, R. (1987). Generalized additive models: Some applications. Journal of the American Statistical Association, 82(398), 371–386. https://doi.org/10.1080/01621459.1987.10478440
Hastie, T., Tibshirani, R., & Friedman, J. H. (2009). The elements of statistical learning: Data mining, inference, and prediction. Springer. http://public.eblib.com/EBLPublic/PublicView.do?ptiID=437866
Hensman, J., & Ghahramani, Z. (n.d.). Scalable variational Gaussian process classification.
Irfan, M. O., & Bull, P. (2021). Cleaning foregrounds from single-dish 21 cm intensity maps with kernel principal component analysis. Monthly Notices of the Royal Astronomical Society, 508(3), 3551–3568. https://doi.org/10.1093/mnras/stab2855
Jakkala, K. (2021). Deep Gaussian processes: A survey. arXiv:2106.12135 [cs, stat]. http://arxiv.org/abs/2106.12135
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning (Vol. 112). Springer.
Janson, L., Fithian, W., & Hastie, T. (2015). Effective degrees of freedom: A flawed metaphor. Biometrika, 102(2), 479–485. https://doi.org/10.1093/biomet/asv019
Jones, A. (2021). The Matérn class of covariance functions. In Andy Jones. https://andrewcharlesjones.github.io/journal/matern-kernels.html
Jović, A., Brkić, K., & Bogunović, N. (2015). A review of feature selection methods with applications. 38th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), 1200–1205. https://doi.org/10.1109/MIPRO.2015.7160458
Jurek, M., & Katzfuss, M. (2022). Scalable spatio-temporal smoothing via hierarchical sparse Cholesky decomposition. arXiv. https://doi.org/10.48550/arXiv.2207.09384
Kammann, E. E., & Wand, M. P. (2003). Geoadditive models. Journal of the Royal Statistical Society: Series C (Applied Statistics), 52(1), 1–18. https://doi.org/10.1111/1467-9876.00385
Kuhn, M. (2017). Nested resampling with rsample. In Applied Predictive Modeling. http://appliedpredictivemodeling.com/blog/2017/9/2/njdc83d01pzysvvlgik02t5qnaljnd
Kumar, I. E., Venkatasubramanian, S., Scheidegger, C., & Friedler, S. (2020). Problems with Shapley-value-based explanations as feature importance measures. Proceedings of the 37th International Conference on Machine Learning, 5491–5500. http://proceedings.mlr.press/v119/kumar20e/kumar20e.pdf
Lambert, B., & Vehtari, A. (2022). R*: A robust MCMC convergence diagnostic with uncertainty using decision tree classifiers. Bayesian Analysis, 17(2), 353–379. https://doi.org/10.1214/20-BA1252
Larsen, K. (2015). GAM: The predictive modeling silver bullet. In MultiThreaded (StitchFix). https://multithreaded.stitchfix.com/blog/2015/07/30/gam/
Lee, J. D., Sun, Y., & Saunders, M. A. (2014). Proximal Newton-type methods for minimizing composite functions. SIAM Journal on Optimization, 24(3), 1420–1443. https://doi.org/10.1137/130921428
Lindeløv, J. K. (2019). Common statistical tests are linear models (or: How to teach stats). https://lindeloev.github.io/tests-as-linear/
Loh, W.-Y., & Vanichsetakul, N. (1988). Tree-structured classification via generalized discriminant analysis. Journal of the American Statistical Association, 83(403), 715–725. https://doi.org/10.1080/01621459.1988.10478652
Marra, G., & Wood, S. N. (2011). Practical variable selection for generalized additive models. Computational Statistics & Data Analysis, 55(7), 2372–2387. https://doi.org/10.1016/j.csda.2011.02.004
McCormick, T. (2021). The "given data" paradigm undermines both cultures. arXiv:2105.12478 [cs, stat]. http://arxiv.org/abs/2105.12478
Meinshausen, N. (n.d.). Quantile regression forests.
Milà, C., Mateu, J., Pebesma, E., & Meyer, H. (2022). Nearest neighbour distance matching leave-one-out cross-validation for map validation. Methods in Ecology and Evolution, 13(6), 1304–1316. https://doi.org/10.1111/2041-210X.13851
Miller, A. C., Foti, N. J., & Fox, E. B. (2021). Breiman’s two cultures: You don’t have to choose sides. arXiv:2104.12219 [cs, stat]. http://arxiv.org/abs/2104.12219
Minderer, M., Djolonga, J., Romijnders, R., Hubis, F., Zhai, X., Houlsby, N., Tran, D., & Lucic, M. (2021). Revisiting the calibration of modern neural networks. Advances in Neural Information Processing Systems, 34, 15682–15694. https://proceedings.neurips.cc/paper/2021/hash/8420d359404024567b5aefda1231af24-Abstract.html
Mount, J. (2012). How robust is logistic regression? In Win Vector LLC. https://win-vector.com/2012/08/23/how-robust-is-logistic-regression/
Nazarathy, Y., & Klok, H. (2021). Statistics with Julia: Fundamentals for data science, machine learning and artificial intelligence. Springer International Publishing. https://doi.org/10.1007/978-3-030-70901-3
Neal, R. M. (2012). Bayesian learning for neural networks (Vol. 118). Springer Science & Business Media.
Paciorek, C., & Schervish, M. (2003). Nonstationary covariance functions for Gaussian process regression. Advances in Neural Information Processing Systems, 16. https://proceedings.neurips.cc/paper/2003/hash/326a8c055c0d04f5b06544665d8bb3ea-Abstract.html
Peng, H., Long, F., & Ding, C. (2005). Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(8), 1226–1238. https://doi.org/10.1109/TPAMI.2005.159
Perperoglou, A., Sauerbrei, W., Abrahamowicz, M., & Schmid, M. (2019). A review of spline function procedures in R. BMC Medical Research Methodology, 19(1), 46. https://doi.org/10.1186/s12874-019-0666-3
Poynor, V., & Munch, S. (2017). Combining functional data with hierarchical Gaussian process models. Environmental and Ecological Statistics, 24(2), 175–199. https://doi.org/10.1007/s10651-017-0366-2
Prechelt, L. (2012). Early stopping - but when? In G. Montavon & K.-R. Müller (Eds.), Neural networks: Tricks of the trade (pp. 53–67). http://page.mi.fu-berlin.de/~prechelt/Biblio/stop_tricks1997.pdf
Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. (2007). Numerical recipes: The art of scientific computing (3rd ed.). Cambridge University Press.
Raper, S. (2020). Leo Breiman’s "Two Cultures". Significance, 17(1), 34–37. https://doi.org/10.1111/j.1740-9713.2020.01357.x
Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian processes for machine learning. The MIT Press.
Ratz, A. V. (2021). Can QR decomposition be actually faster? Schwarz-Rutishauser algorithm. In Medium. https://towardsdatascience.com/can-qr-decomposition-be-actually-faster-schwarz-rutishauser-algorithm-a32c0cde8b9b
Reiss, P. T., & Ogden, R. T. (2009). Smoothing parameter selection for a class of semiparametric linear models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(2), 505–523. https://doi.org/10.1111/j.1467-9868.2008.00695.x
Riihimäki, J., & Vehtari, A. (2010). Gaussian processes with monotonicity information. Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 645–652. https://proceedings.mlr.press/v9/riihimaki10a.html
Robert, C. P., & Roberts, G. O. (2021). Rao-Blackwellization in the MCMC era (No. arXiv:2101.01011). arXiv. https://doi.org/10.48550/arXiv.2101.01011
Roberts, D. R., Bahn, V., Ciuti, S., Boyce, M. S., Elith, J., Guillera-Arroita, G., Hauenstein, S., Lahoz-Monfort, J. J., Schröder, B., Thuiller, W., Warton, D. I., Wintle, B. A., Hartig, F., & Dormann, C. F. (2017). Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure. Ecography, 40(8), 913–929. https://doi.org/10.1111/ecog.02881
Rue, H., Martino, S., & Chopin, N. (2009). Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(2), 319–392. https://doi.org/10.1111/j.1467-9868.2008.00700.x
Sansó, B., Schmidt, A. M., & Nobre, A. A. (2008). Bayesian spatio-temporal models based on discrete convolutions. The Canadian Journal of Statistics / La Revue Canadienne de Statistique, 36(2), 239–258. http://www.jstor.org/stable/20445307
Schölkopf, B., Smola, A., & Müller, K.-R. (1997). Kernel principal component analysis. In W. Gerstner, A. Germond, M. Hasler, & J.-D. Nicoud (Eds.), Artificial Neural Networks ICANN’97 (pp. 583–588). Springer. https://doi.org/10.1007/BFb0020217
Shafer, G., & Vovk, V. (2008). A tutorial on conformal prediction. Journal of Machine Learning Research, 9, 371–421. https://jmlr.csail.mit.edu/papers/volume9/shafer08a/shafer08a.pdf
Shalizi, C. R. (2022). Advanced data analysis from an elementary point of view. https://www.stat.cmu.edu/~cshalizi/ADAfaEPoV/
Sigrist, F. (2018). Gradient and Newton boosting for classification and regression. In arXiv.org. https://doi.org/10.48550/arXiv.1808.03064
Kuhn, M., & Silge, J. (n.d.). 18 Explaining models and predictions. In Tidy modeling with R.
Simon, N., Friedman, J., Hastie, T., & Tibshirani, R. (2011). Regularization paths for Cox’s proportional hazards model via coordinate descent. Journal of Statistical Software, 39(5), 1–13. https://doi.org/10.18637/jss.v039.i05
Stone, M. (1977). An asymptotic equivalence of choice of model by cross-validation and Akaike’s criterion. Journal of the Royal Statistical Society: Series B (Methodological), 39(1), 44–47. https://www.jstor.org/stable/2984877
Valavi, R., Elith, J., Lahoz-Monfort, J. J., & Guillera-Arroita, G. (2019). blockCV: An R package for generating spatially or environmentally separated folds for k-fold cross-validation of species distribution models. Methods in Ecology and Evolution, 10(2), 225–232. https://doi.org/10.1111/2041-210X.13107
van den Goorbergh, R., van Smeden, M., Timmerman, D., & Van Calster, B. (2022). The harm of class imbalance corrections for risk prediction models: Illustration and simulation using logistic regression. Journal of the American Medical Informatics Association, ocac093. https://doi.org/10.1093/jamia/ocac093
van Houwelingen, J. C. (2001). Shrinkage and penalized likelihood as methods to improve predictive accuracy. Statistica Neerlandica, 55(1), 17–34. https://doi.org/10.1111/1467-9574.00154
Vehtari, A., Gelman, A., Simpson, D., Carpenter, B., & Bürkner, P.-C. (2021). Rank-normalization, folding, and localization: An improved R̂ for assessing convergence of MCMC (with discussion). Bayesian Analysis, 16(2), 667–718. https://doi.org/10.1214/20-BA1221
Venables, W. N. (1998). Exegeses on linear models. http://www.stats.ox.ac.uk/pub/MASS3/Exegeses.pdf
Wager, S., Hastie, T., & Efron, B. (n.d.). Confidence intervals for random forests: The jackknife and the infinitesimal jackknife.
Wainer, J., & Cawley, G. (2021). Nested cross-validation when selecting classifiers is overzealous for most practical applications. Expert Systems with Applications, 182, 115222. https://doi.org/10.1016/j.eswa.2021.115222
Walters, C. J., & Ludwig, D. (1981). Effects of measurement errors on the assessment of stock–recruitment relationships. Canadian Journal of Fisheries and Aquatic Sciences, 38(6), 704–710. https://doi.org/10.1139/f81-093
Wand, M. P., & Ormerod, J. T. (2011). Penalized wavelets: Embedding wavelets into semiparametric regression. Electronic Journal of Statistics, 5. https://doi.org/10.1214/11-EJS652
Warnes, J. J., & Ripley, B. D. (1987). Problems with likelihood estimation of covariance functions of spatial Gaussian processes. Biometrika, 74(3), 640–642. http://www.jstor.org/stable/2336705
Wenger, J., Pleiss, G., Hennig, P., Cunningham, J. P., & Gardner, J. R. (2022). Preconditioning for scalable Gaussian process hyperparameter optimization. arXiv. https://doi.org/10.48550/arXiv.2107.00243
Wenger, S. J., & Olden, J. D. (2012). Assessing transferability of ecological models: An underappreciated aspect of statistical validation. Methods in Ecology and Evolution, 3(2), 260–267. https://doi.org/10.1111/j.2041-210X.2011.00170.x
Witten, D. M., Tibshirani, R., & Hastie, T. (2009). A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics, kxp008. https://doi.org/10.1093/biostatistics/kxp008
Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67–82. https://doi.org/10.1109/4235.585893
Wood, S. N. (2003). Thin plate regression splines. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 65(1), 95–114. https://doi.org/10.1111/1467-9868.00374
Wood, S. N. (2011). Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73(1), 3–36. https://doi.org/10.1111/j.1467-9868.2010.00749.x
Wood, S. N. (2017b). P-splines with derivative based penalties and tensor product smoothing of unevenly distributed data. Statistics and Computing, 27(4), 985–989. https://doi.org/10.1007/s11222-016-9666-x
Yang, Y. (2005). Can the strengths of AIC and BIC be shared? A conflict between model identification and regression estimation. Biometrika, 92(4), 937–950. https://doi.org/10.1093/biomet/92.4.937
Yuan, M., & Lin, Y. (2006). Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(1), 49–67. https://doi.org/10.1111/j.1467-9868.2005.00532.x
Zhang, L. (2018). Nearest neighbor Gaussian processes (NNGP) based models in Stan. In Stan Case Studies. https://mc-stan.org/users/documentation/case-studies/nngp.html
Zhao, S., Witten, D., & Shojaie, A. (2021). In defense of the indefensible: A very naïve approach to high-dimensional inference. Statistical Science, 36(4), 562–577. https://doi.org/10.1214/20-STS815
Zou, H., Hastie, T., & Tibshirani, R. (2007). On the “degrees of freedom” of the lasso. The Annals of Statistics, 35(5), 2173–2192. https://doi.org/10.1214/009053607000000127