Abstract
This paper develops a new class of flexible minimum description length (MDL) procedures for multiple changepoint detection. Existing MDL approaches, which are penalized likelihood methods, use description length principles from information theory to construct penalties that depend on both the number of changepoints and the lengths of the series’ segments. While MDL methods have yielded promising results in time series changepoint problems, state-of-the-art MDL approaches are not flexible enough to incorporate domain experts’ knowledge that some times are more likely than others to be changepoints. Furthermore, current MDL methods do not readily handle multivariate series in which changepoints can occur in some, but not necessarily all, component series. The Bayesian MDL method developed in this paper provides a general framework that accounts for various forms of prior knowledge, which substantially increases changepoint detection power. Asymptotically, our estimated multiple changepoint configuration is shown to be consistent. Our method is motivated by a climate application: identifying mean shifts in monthly temperature records. In addition to autocorrelation and seasonal means, our method takes into account metadata, a record of station relocations and gauge changes, thus permitting documented and undocumented changepoint times to be studied in tandem. The multivariate extension allows maximum and minimum temperatures to be examined jointly. (Co-authors: Robert Lund and Anuradha Hewaarachchi)