Change-Point Detection in a Sequence of Bags-of-Data

Kensuke Koshijima, Hideitsu Hino, Noboru Murata

Research output: Contribution to journal, Article, peer-reviewed

2 Citations (Scopus)


Most existing change-point detection methods assume that the observation at each time step is a single multi-dimensional vector, yet there are many situations where this assumption does not hold. This paper addresses that limitation by proposing a nonparametric, computationally efficient method for a setting where each observation is a collection of random variables, which we call a bag of data. After estimating the distribution underlying each bag and embedding those distributions in a metric space, we derive the change-point score by evaluating how the sequence of distributions fluctuates in that space using a distance-based information estimator. We also incorporate a procedure that adaptively determines when to raise alerts by calculating a confidence interval for the change-point score at each time step. This avoids false alarms in highly noisy situations and enables detection of changes of various magnitudes. Experiments and numerical examples on both synthetic and real datasets demonstrate the generality and effectiveness of our approach.
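The pipeline outlined in the abstract (compare bags as distributions, score how the sequence fluctuates, attach a confidence interval before alerting) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes one-dimensional, equal-size bags, uses the Earth Mover's Distance as the metric, replaces the paper's distance-based information estimator with a simple "cross-window minus within-window distance" score, and uses a percentile bootstrap for the interval. All function names and parameters (`emd_1d`, `window_score`, `bootstrap_interval`, `make_bags`, `tau`, `n_boot`) are hypothetical.

```python
import random
from statistics import mean


def emd_1d(bag_a, bag_b):
    """Earth Mover's Distance between two equal-size 1-D bags.

    For equal-size 1-D samples this is the mean absolute difference
    between the sorted values.
    """
    if len(bag_a) != len(bag_b):
        raise ValueError("this sketch assumes equal-size bags")
    return mean(abs(a - b) for a, b in zip(sorted(bag_a), sorted(bag_b)))


def window_score(bags, t, tau):
    """Stand-in score (an assumption, not the paper's estimator):
    mean EMD across the two windows around t minus mean EMD within them."""
    if tau < 2 or t - tau < 0 or t + tau > len(bags):
        raise ValueError("need tau >= 2 and tau bags on each side of t")
    ref, test = bags[t - tau:t], bags[t:t + tau]
    cross = mean(emd_1d(a, b) for a in ref for b in test)
    within = mean(
        emd_1d(w[i], w[j])
        for w in (ref, test)
        for i in range(tau)
        for j in range(i + 1, tau)
    )
    return cross - within


def bootstrap_interval(bags, t, tau, n_boot=200, alpha=0.05, rng=None):
    """Percentile bootstrap interval for the score at t, obtained by
    resampling the points inside each bag with replacement."""
    rng = rng or random.Random(0)
    window = bags[t - tau:t + tau]
    scores = []
    for _ in range(n_boot):
        resampled = [rng.choices(bag, k=len(bag)) for bag in window]
        scores.append(window_score(resampled, tau, tau))
    scores.sort()
    lower = scores[int(alpha / 2 * n_boot)]
    upper = scores[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def make_bags(change_at, n_bags, bag_size, shift, seed=0):
    """Synthetic data: Gaussian bags whose mean jumps by `shift` at `change_at`."""
    rng = random.Random(seed)
    return [
        [rng.gauss(shift if i >= change_at else 0.0, 1.0) for _ in range(bag_size)]
        for i in range(n_bags)
    ]


if __name__ == "__main__":
    bags = make_bags(change_at=10, n_bags=20, bag_size=50, shift=3.0)
    for t in (5, 10):
        lower, upper = bootstrap_interval(bags, t, tau=3)
        print(t, round(window_score(bags, t, 3), 3), (round(lower, 3), round(upper, 3)))
```

One possible alert rule under these assumptions is to flag time `t` when the lower bound of its interval exceeds the upper bound of the interval at an earlier reference step, so that a noisy but unchanged stretch does not trigger an alarm.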

Original language: English
Article number: 7095580
Pages (from-to): 2632-2644
Number of pages: 13
Journal: IEEE Transactions on Knowledge and Data Engineering
Issue number: 10
Publication status: Published - 2015 Oct 1


Keywords

  • Change-point detection
  • Earth Mover's Distance
  • anomaly detection
  • entropy estimator

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics


