Feature Engineering for Tree-based Time Series Models

Chris Kuo/Dr. Dataman
13 min read · May 1, 2024

Sample eBook chapters (free): https://github.com/dataman-git/modern-time-series/blob/main/20240522beauty_TOC.pdf

eBook on Teachable.com: $22.50
https://drdataman.teachable.com/p/home

The print edition on Amazon.com: $65 https://a.co/d/25FVsMx

Around the 2020s, a series of prize competitions for time series forecasting, known as the M competitions, was hosted by the International Institute of Forecasters in affiliation with the prestigious International Journal of Forecasting. The winning models were awarded prizes ranging from $2,000 to $25,000, and the competitions drew researchers from around the world to invent new time series models.

Who were the winners? Surprisingly, the top-ranking models were dominated by tree-based machine learning methods. In particular, gradient-boosting models such as LightGBM (Ke et al., 2017) prevailed in the competitions, as documented by Januschowski et al. (2022). It is for this important reason that I include tree-based time series forecasting in this book.

Tree-based models are supervised learning models. A supervised learning model requires a data frame with a target variable and features as predictors. How do we create a data frame with features from a univariate time series?
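
One common approach, sketched below with pandas, is to turn the series into lagged copies of itself: each row's features are the values observed a few steps earlier, and the target is the current value. The function name, column names, and number of lags here are illustrative assumptions for this sketch, not the book's own code.

```python
import pandas as pd

def make_lag_frame(series: pd.Series, n_lags: int = 3) -> pd.DataFrame:
    """Turn a univariate series into a supervised-learning data frame.

    Each feature column lag_k holds the value observed k steps earlier;
    column y is the target to predict.
    """
    df = pd.DataFrame({"y": series})
    for k in range(1, n_lags + 1):
        df[f"lag_{k}"] = series.shift(k)  # value from k periods ago
    return df.dropna()                    # drop rows without a full lag history

# Toy usage with made-up daily values
idx = pd.date_range("2024-01-01", periods=8, freq="D")
sales = pd.Series([10, 12, 13, 15, 14, 16, 18, 17], index=idx, name="sales")
frame = make_lag_frame(sales, n_lags=3)
print(frame)  # columns: y, lag_1, lag_2, lag_3 -- ready for a tree-based model
```

The resulting frame can be fed to any tree-based regressor, with y as the target and the lag columns as predictors.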
