A PCA-Based Change Detection Framework for Multidimensional Data Streams

Detecting changes in multidimensional data streams is an important and challenging task. In unsupervised change detection, changes are usually detected by comparing the distribution in a current (test) window with that in a reference window. It is thus essential to design divergence metrics and density estimators for comparing the data distributions, but most existing ones are designed for univariate data, and multidimensional streams make both density estimation and comparison more difficult. In this paper, we propose a framework for detecting changes in multidimensional data streams based on Principal Component Analysis (PCA), which is used to project the data into a lower-dimensional space, thus facilitating density estimation and change-score calculation. The proposed framework also has advantages over existing approaches: it reduces computational cost with an efficient density estimator, improves the change-score calculation by introducing effective divergence metrics, and minimizes the effort required from users to set the threshold parameter by using the Page-Hinkley test.
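To make the pipeline concrete, here is a minimal Python sketch of the general idea: fit PCA on a reference window, project each test window onto the retained components, score the per-component difference in distributions, and feed the scores to a Page-Hinkley test. It is an illustration under stated assumptions, not the released code. Histograms stand in for the paper's density estimator, a two-sided KL divergence stands in for its divergence metrics, and every name and default here (`fit_pca`, `change_score`, `PageHinkley`, `first_change`, `window`, `step`, `bins`, `delta`, `lam`) is hypothetical.

```python
import numpy as np

def fit_pca(reference, var_kept=0.999):
    """Fit PCA on the reference window; keep components explaining var_kept of variance."""
    mean = reference.mean(axis=0)
    # Rows of vt are the principal directions of the centered reference data.
    _, s, vt = np.linalg.svd(reference - mean, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    k = min(int(np.searchsorted(explained, var_kept)) + 1, len(s))
    return mean, vt[:k]

def change_score(ref_proj, test_proj, bins=20, smoothing=1.0):
    """Largest two-sided KL divergence over the projected components.

    Assumption: smoothed histograms replace the paper's density estimator.
    """
    scores = []
    for j in range(ref_proj.shape[1]):
        lo = min(ref_proj[:, j].min(), test_proj[:, j].min())
        hi = max(ref_proj[:, j].max(), test_proj[:, j].max())
        p, _ = np.histogram(ref_proj[:, j], bins=bins, range=(lo, hi))
        q, _ = np.histogram(test_proj[:, j], bins=bins, range=(lo, hi))
        # Additive smoothing keeps every bin probability strictly positive.
        p = (p + smoothing) / (p + smoothing).sum()
        q = (q + smoothing) / (q + smoothing).sum()
        kl_pq = np.sum(p * np.log(p / q))
        kl_qp = np.sum(q * np.log(q / p))
        scores.append(max(kl_pq, kl_qp))
    return max(scores)

class PageHinkley:
    """Standard Page-Hinkley test for an upward shift in a sequence of scores."""
    def __init__(self, delta=0.005, lam=1.0):
        self.delta, self.lam = delta, lam
        self.n, self.mean = 0, 0.0
        self.cum, self.min_cum = 0.0, 0.0

    def update(self, x):
        self.n += 1
        self.mean += (x - self.mean) / self.n      # running mean of the scores
        self.cum += x - self.mean - self.delta     # cumulative deviation
        self.min_cum = min(self.min_cum, self.cum)
        return (self.cum - self.min_cum) > self.lam

def first_change(stream, window=200, step=50, delta=0.005, lam=1.0):
    """Return the end index of the first test window that raises an alarm, else None."""
    reference = stream[:window]
    mean, comps = fit_pca(reference)
    ref_proj = (reference - mean) @ comps.T
    ph = PageHinkley(delta, lam)
    for start in range(window, len(stream) - window + 1, step):
        test_proj = (stream[start:start + window] - mean) @ comps.T
        if ph.update(change_score(ref_proj, test_proj)):
            return start + window
    return None
```

Calling `first_change` on an array of shape `(n_samples, n_features)` returns the index at which the first alarm is raised, or `None` if no change is flagged. In this sketch the reference window is fixed at the start of the stream; a full detector would reset the reference window and the test statistics after each detected change.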

More details can be found in the paper:
Abdulhakim A Qahtan, Basma Harbi, Suojin Wang, Xiangliang Zhang, "A PCA-Based Change Detection Framework for Multidimensional Data Streams". In Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2015).


Download the code