Due date: April 5 (midnight), on GitHub.

  1. Build a complete pipeline with a data set of your choice and a tree-based model of your choice in R (using tidymodels) or Python (using scikit-learn). For each step, include a paragraph explaining why you did that step the way you did (which components you included and, possibly, what you decided not to do). A minimal scikit-learn sketch appears after this list.
  2. For the gradient boosting algorithm, we want to unpack step 2(c) of algorithm 10.3 (ESL) to derive the optimal value of the weights (\(\gamma_{jm}\)) for each leaf \(j\) at boosting step \(m\). (The derivations in Chen and Guestrin (2016) or Bujokas (2022) may be clearer.)
     1. Derive \(\gamma_{jm}\) for both the MSE (\(L_2\) norm) and the binomial deviance loss functions. (The expressions after this list state the starting point and the standard results as a check.)
     2. Do the same for Newton boosting (Chen and Guestrin 2016), where we use a second-order rather than a first-order approximation to the loss function. (The resulting leaf weight is also sketched after this list.)
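
For the pipeline item, here is a minimal sketch in Python with scikit-learn of the kind of structure expected. Every concrete choice below (the built-in breast-cancer data set, the imputer, the model, the tuning grid, the metric) is a placeholder for your own, documented choices, not a template to reproduce:

```python
# A minimal pipeline sketch: data set, imputer, model, grid, and metric
# are all placeholders for your own choices.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Tree-based models need no feature scaling, but any imputation or encoding
# still belongs inside the pipeline so it is refit on training folds only.
pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("model", RandomForestClassifier(random_state=42)),
])

# Tune a small grid with 5-fold cross-validation on the training set.
grid = GridSearchCV(
    pipe,
    param_grid={
        "model__n_estimators": [200, 500],
        "model__max_depth": [3, 5, None],
    },
    cv=5,
    scoring="roc_auc",
)
grid.fit(X_train, y_train)

print(grid.best_params_)
print("held-out ROC AUC:", grid.score(X_test, y_test))
```

Each explanatory paragraph would then attach to one of these steps: why this split and resampling scheme, why a tree-based model needs little or no preprocessing, which hyperparameters you tuned and which you left at their defaults.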
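
For the \(\gamma_{jm}\) derivation, the quantity in step 2(c) is the constant that minimizes the total loss over the observations falling in leaf \(R_{jm}\), holding the current model \(f_{m-1}\) fixed:

\[
\gamma_{jm} = \arg\min_{\gamma} \sum_{x_i \in R_{jm}} L\big(y_i,\, f_{m-1}(x_i) + \gamma\big).
\]

As a check on your own work (the exact constants depend on how the loss is scaled and how \(y\) is coded; here \(y_i \in \{0, 1\}\), \(f\) is the log-odds, and \(p_i = 1/(1 + e^{-f_{m-1}(x_i)})\)): for squared-error loss, setting the derivative with respect to \(\gamma\) to zero gives the mean residual in the leaf,

\[
\gamma_{jm} = \frac{1}{|R_{jm}|} \sum_{x_i \in R_{jm}} \big(y_i - f_{m-1}(x_i)\big),
\]

while for the binomial deviance there is no closed form, so a single Newton–Raphson step is used,

\[
\gamma_{jm} \approx \frac{\sum_{x_i \in R_{jm}} (y_i - p_i)}{\sum_{x_i \in R_{jm}} p_i (1 - p_i)}.
\]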
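
For the Newton boosting part, the loss is replaced by its second-order Taylor expansion around \(f_{m-1}\). Writing \(g_i = \partial L(y_i, f)/\partial f\) and \(h_i = \partial^2 L(y_i, f)/\partial f^2\), both evaluated at \(f = f_{m-1}(x_i)\), Chen and Guestrin (2016) show that the optimal weight for leaf \(j\) (with instance set \(I_j\) and L2 regularization \(\lambda\)) is

\[
w_j^{\ast} = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda},
\]

which reduces to the plain Newton step \(-\sum_i g_i / \sum_i h_i\) when \(\lambda = 0\). Your derivation should show how this follows from minimizing the quadratic approximation leaf by leaf.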

Cite and comment on all references that you used.

References

Bujokas, Eligijus. 2022. “Gradient Boosting in Python from Scratch.” Medium. https://towardsdatascience.com/gradient-boosting-in-python-from-scratch-788d1cf1ca7.

Chen, Tianqi, and Carlos Guestrin. 2016. “XGBoost: A Scalable Tree Boosting System.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–94. https://doi.org/10.1145/2939672.2939785.