In case anyone missed it earlier this week, my latest blog post extends Gradient Boosting to fit just about any model you want.
The result: smooth curves that learn high-dimensional interaction effects, and that you can fit at scale!
Then fitting our Gradient Boosting Spline model is as simple as calling {DecisionTreeRegressor} in a for loop to build up our coefficient predictions.
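Not from the post itself, but the loop is roughly this shape. Here's a minimal sketch with plain squared-error boosting and made-up toy data (the post fits spline coefficients instead of raw predictions):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# toy data, invented for illustration
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, size=200)

learning_rate = 0.1
pred = np.full(len(y), y.mean())   # start from the mean
trees = []
for _ in range(100):
    residual = y - pred            # negative gradient of squared error
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    trees.append(tree)
    pred += learning_rate * tree.predict(X)
```

Each tree only has to explain what the previous ones missed, which is the whole trick.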
{JAX} makes this super easy because all we need to do as practitioners is define our loss function as a function of our parameters and let autodiff handle the partial derivatives.
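To make that concrete, here's a tiny sketch (my own toy loss, one parameter per observation) of what handing the derivatives to autodiff looks like:

```python
import jax
import jax.numpy as jnp

# loss written purely as a function of the parameters
def loss(params, y):
    return jnp.mean((y - params) ** 2)

grad_fn = jax.grad(loss)        # autodiff: d(loss)/d(params)

y = jnp.array([1.0, 2.0, 3.0])
params = jnp.zeros(3)
g = grad_fn(params, y)          # per-parameter gradients, no calculus by hand
```

Swap in any differentiable loss and `jax.grad` still works, which is what makes the approach so general.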
Gradient Boosting Machines generate predictions for each observation by iteratively learning to predict the gradient of the loss function. In this blog post I show how you can use the same methodology to fit entire parameter sets.
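Here's a rough sketch of that idea (my own toy example, not the post's code): boost TWO parameters per observation, the mean and log-sd of a Normal, with one tree per parameter per round, each fit to the gradient of the negative log-likelihood:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# invented heteroskedastic toy data
rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(300, 1))
y = X[:, 0] + rng.normal(0, np.exp(0.3 * X[:, 0]))

mu = np.full(len(y), y.mean())              # per-observation mean
log_sd = np.full(len(y), np.log(y.std()))   # per-observation log-sd
lr = 0.1
for _ in range(50):
    var = np.exp(2 * log_sd)
    d_mu = -(y - mu) / var                  # dNLL/dmu
    d_log_sd = 1 - (y - mu) ** 2 / var      # dNLL/dlog_sd
    mu -= lr * DecisionTreeRegressor(max_depth=2).fit(X, d_mu).predict(X)
    log_sd -= lr * DecisionTreeRegressor(max_depth=2).fit(X, d_log_sd).predict(X)

nll = np.mean(log_sd + (y - mu) ** 2 / (2 * np.exp(2 * log_sd)))
```

Every observation ends up with its own fitted mean AND spread, not just a single point prediction.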
I have a new blog post out today that I'm really excited about. I walk through how you can use Gradient Boosting to fit entire vectors of parameters for each observation, not just a single prediction. statmills.com/2026-04-06-g... #pydata #rstats
There is an additional layer of smoothing you can do with low-rank smoothers, which not only smooths the data but speeds up the fitting because we use fewer parameters!
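A toy illustration of the low-rank idea (my own sketch, not the post's mgcv code): represent a smooth over n points with k << n basis functions, here the low-frequency eigenvectors of a difference penalty, so we only estimate k coefficients:

```python
import numpy as np

n, k = 100, 8
x = np.linspace(0, 1, n)
y = np.sin(2 * np.pi * x) + np.random.default_rng(2).normal(0, 0.3, n)

# second-difference penalty matrix
D = np.diff(np.eye(n), 2, axis=0)
P = D.T @ D
eigval, eigvec = np.linalg.eigh(P)
B = eigvec[:, :k]            # the k smoothest basis vectors

coef, *_ = np.linalg.lstsq(B, y, rcond=None)   # only k parameters to fit
fit = B @ coef
```

Fitting 8 coefficients instead of 100 is where the speedup comes from, and truncating the wiggly eigenvectors is itself a form of smoothing.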
The smoothing you can do with location data is really interesting! Similar to penalizing neighboring coefficients in a GAM, you can penalize the difference between neighboring census tracts.
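The post uses {mgcv}'s Markov Random Field smooths for this; here's a hand-rolled Python sketch of the underlying penalty on a made-up 4-node adjacency, just to show the mechanics:

```python
import numpy as np

# toy "map": a chain of 4 tracts, each adjacent to the next
edges = [(0, 1), (1, 2), (2, 3)]
n = 4
L = np.zeros((n, n))             # graph Laplacian encodes the penalty
for i, j in edges:
    L[i, i] += 1; L[j, j] += 1
    L[i, j] -= 1; L[j, i] -= 1

y = np.array([1.0, 3.0, 2.0, 8.0])   # noisy per-tract means
lam = 2.0
# penalized least squares: minimize ||y - b||^2 + lam * b' L b
b = np.linalg.solve(np.eye(n) + lam * L, y)
# each tract's estimate is pulled toward its neighbors -> smoother map
```

Cranking up `lam` pulls adjacent tracts closer together, exactly like the wiggliness penalty on neighboring coefficients in a 1-D GAM smooth.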
This allows us to look at some cool maps, like where first-time home buyers are buying and where the most expensive homes are.
My new blog post explores some open housing data with some interesting (to me) maps. I walk through how to smooth location data using the {mgcv} package with Markov Random Fields for those interested in learning more!
statmills.com/2025-02-04-f...
I hate how it's never the actual data analysis that trips me up when switching between R and Python, it's the silly base stuff like `int` and `len` that takes me multiple tries to switch over 🤬