Abstract
This paper addresses a limitation common to most existing change-point detection methods by proposing a nonparametric, computationally efficient method. Most existing methods assume that the observation at each time step is a single multi-dimensional vector, but in many situations this assumption does not hold. We therefore consider a setting in which each observation is a collection of random variables, which we call a bag of data. After estimating the distribution underlying each bag and embedding those distributions in a metric space, we derive a change-point score by using a distance-based information estimator to evaluate how the sequence of distributions fluctuates in that space. We also incorporate a procedure that adaptively determines when to raise alerts by computing a confidence interval for the change-point score at each time step. This avoids false alarms in highly noisy situations and enables the detection of changes of various magnitudes. Experimental studies and numerical examples on both synthetic and real datasets demonstrate the generality and effectiveness of the approach.
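To make the bag-of-data setting concrete, the following is a minimal sketch, not the method of the paper. It assumes one-dimensional bags, uses the Earth Mover's Distance (listed in the keywords) as the metric between empirical distributions, and replaces the distance-based information estimator with a much simpler stand-in score: the mean distance between the bags just before a time step and the bags just after it. The function names `emd_1d`, `change_score`, and `score_sequence`, and the `window` parameter, are hypothetical and chosen only for illustration. The confidence-interval step is not covered.

```python
from bisect import bisect_right


def emd_1d(bag_a, bag_b):
    """Earth Mover's Distance between two 1-D empirical distributions.

    Computed as the area between the two empirical CDFs.
    """
    a, b = sorted(bag_a), sorted(bag_b)
    points = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(points, points[1:]):
        # Both empirical CDFs are constant on the interval [left, right).
        cdf_a = bisect_right(a, left) / len(a)
        cdf_b = bisect_right(b, left) / len(b)
        total += abs(cdf_a - cdf_b) * (right - left)
    return total


def change_score(bags, t, window):
    """Stand-in score (an assumption, not the paper's estimator):
    mean EMD between the `window` bags before t and the `window` bags from t on.
    """
    past = bags[t - window:t]
    future = bags[t:t + window]
    dists = [emd_1d(p, f) for p in past for f in future]
    return sum(dists) / len(dists)


def score_sequence(bags, window=2):
    """Score every time step that has a full window on both sides."""
    return {
        t: change_score(bags, t, window)
        for t in range(window, len(bags) - window + 1)
    }


# Toy sequence: four bags centred near 0, then four bags centred near 5.
# Bags may have different sizes; here they happen to be equal.
bags = [[0.0, 0.1, 0.2]] * 4 + [[5.0, 5.1, 5.2]] * 4
scores = score_sequence(bags, window=2)
print(max(scores, key=scores.get), scores)
```

In this toy sequence the score is largest at the time step where the bags switch from one distribution to the other. A fixed threshold on such a score is sensitive to noise, which is the problem the adaptive, confidence-interval-based alert procedure described in the abstract is meant to address.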
Original language | English |
---|---|
Article number | 7095580 |
Pages (from-to) | 2632-2644 |
Number of pages | 13 |
Journal | IEEE Transactions on Knowledge and Data Engineering |
Volume | 27 |
Issue number | 10 |
Publication status | Published - 2015 Oct 1 |
Keywords
- Change-point detection
- Earth Mover's Distance
- Anomaly detection
- Entropy estimator
ASJC Scopus subject areas
- Information Systems
- Computer Science Applications
- Computational Theory and Mathematics