Model building with time series data

07 Mar 2017 By Roger Peng

A nice post by Alex Smolyanskaya over at the Stitch Fix blog about some of the unique challenges of model building in a time series context:

Cross validation is the process of measuring a model’s predictive power by testing it on randomly selected data that was not used for training. However, autocorrelations in time series data mean that data points are not independent from each other across time, so holding out some data points from the training set doesn’t necessarily remove all their associated information. Further, time series models contain autoregressive components to deal with the autocorrelations. These models rely on having equally spaced data points; if we leave out random subsets of the data, the training and testing sets will have holes that destroy the autoregressive components.
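One common way around both problems described in the quote is to split by time instead of at random: train on a contiguous stretch of the past and test on the block that immediately follows it, then slide forward. What follows is a minimal sketch of that idea, not code from the Stitch Fix post. The function name `expanding_window_splits` and its parameters are made up for illustration, and it assumes the observations are already sorted in time and equally spaced.

```python
from typing import Iterator, List, Tuple


def expanding_window_splits(
    n_samples: int, n_folds: int, test_size: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists where training always precedes testing.

    Hypothetical helper for illustration. Assumes rows are ordered by time.
    """
    # Index at which the first test block starts; everything before it
    # is the smallest training window.
    first_test_start = n_samples - n_folds * test_size
    if first_test_start < 1:
        raise ValueError("not enough observations for the requested folds")

    for fold in range(n_folds):
        test_start = first_test_start + fold * test_size
        # Contiguous past: no holes, so lagged terms remain well defined.
        train = list(range(0, test_start))
        # Contiguous future block, never seen during training.
        test = list(range(test_start, test_start + test_size))
        yield train, test


if __name__ == "__main__":
    for train, test in expanding_window_splits(n_samples=10, n_folds=3, test_size=2):
        print(f"train {train[0]}..{train[-1]}  test {test[0]}..{test[-1]}")
```

Because every training window is an unbroken run of time points, an autoregressive model can be fit on it as usual, and the test block only ever contains observations that come later. Neighboring points across the train/test boundary are still correlated, so this reduces the leakage described above but does not eliminate it.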

Simply Statistics A statistics blog by Rafa Irizarry, Roger Peng, and Jeff Leek
