Change Detection in Data Streams by Testing Exchangeability
Shen-Shyang Ho, JPL/Caltech. The research is part of the author’s PhD dissertation (in computer science) at George Mason University. Conference travel is partially sponsored by a NASA Postdoctoral Program (NPP) Travel Grant.
Outline
• Introduction
• Previous Work (Statistics and Machine Learning/Data Mining/Computer Vision)
• Intuition
• Background (Exchangeability/Martingale)
• Methodology
• Comparison and Experimental Results
• Application I: Adaptive Support Vector Machine (Classification Model)
• Application II: Video Shot Change Detection (Cluster Model)
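Introduction
Let $x_1, x_2, \ldots, x_n$ be a sequence of independent $p$-dimensional random vectors with parameters $\theta_1, \theta_2, \ldots, \theta_n$, respectively. Test the following hypothesis: $H_0: \theta_1 = \theta_2 = \cdots = \theta_n$ against $H_1$: there exists a $k$ such that $\theta_1 = \cdots = \theta_k \neq \theta_{k+1} = \cdots = \theta_n$.
Assumption: Data vectors are observed sequentially.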
Previous Work
• Statistics: Sequential analysis is statistical inference under the assumption that the number of observations/samples required is not predetermined.
  • Sequential Probability Ratio Test: A. Wald (1945)
  • Application: Quality Control (Military/Manufacturing)
  • CUSUM (Cumulative Sum): E. S. Page (1954)
  • Refer to the journal “Sequential Analysis: Design Methods and Applications” for recent research.
  • In the most recent issue (vol. 27, no. 2, 2008), 3 out of 6 papers address change detection: structural change, a minimax method for change-point detection problems, and multidecision quickest change-point detection.
• Machine Learning/Data Mining:
  • Applications: Concept Drift Problem, Adaptive Classifier, Anomaly in Internet Traffic, Video-Shot Change Detection
  • The proposed methodology is usually problem-specific.
  • Monitoring error, sliding window, weighted data, ensemble classifier …
  • Statistical methods: Likelihood ratio method, Bayesian methods, Hypothesis Testing …
Related Data Mining/Machine Learning/Computer Vision Research
• Xiuyao Song, Mingxi Wu, Christopher M. Jermaine, Sanjay Ranka: Statistical change detection for multi-dimensional data. KDD 2007: 667-676
• J. Z. Kolter and M. A. Maloof: Dynamic Weighted Majority: An ensemble method for drifting concepts. Journal of Machine Learning Research 8: 2755-2790, 2007
• Ralf Klinkenberg and Thorsten Joachims: Detecting Concept Drift with Support Vector Machines. Proceedings of the Seventeenth International Conference on Machine Learning (ICML): 487-494, 2000
• Bi Song, Namrata Vaswani, Amit K. Roy Chowdhury: Closed-Loop Tracking and Change Detection in Multi-Activity Sequences. CVPR 2007
• Paul L. Rosin: Thresholding for Change Detection. ICCV 1998: 274-279
• Balachander Krishnamurthy, Subhabrata Sen, Yin Zhang, Yan Chen: Sketch-based change detection: methods, evaluation, and applications. Internet Measurement Conference 2003: 234-247
• Tsuyoshi Idé, Keisuke Inoue: Knowledge Discovery from Heterogeneous Dynamic Systems using Change-Point Correlations. SDM 2005
• Tsuyoshi Idé, Koji Tsuda: Change-Point Detection using Krylov Subspace Learning. SDM 2007
• Daniel Kifer, Shai Ben-David, Johannes Gehrke: Detecting Changes in Data Streams. Proc. 30th VLDB Conference, 2004
• …
Motivation “Lack of Exchangeability” implies “Change in Data Distribution/Model”
Intuition
• 1 2 3 4 5 6 7 8 9 10
• 1 9 3 5 2 6 7 4 8 10
• 2 3 4 5 6 7 8 9 10
• 1 9 3 5 2 6 7 2 8 10
Identically distributed but may be dependent
Background
• Vovk et al.’s work on “Testing Exchangeability Online” (ICML 2003) and “Algorithmic Learning in a Random World” (Springer):
  • Testing the exchangeability assumption in an online mode.
  • Explicit martingale for testing the hypothesis of exchangeability (refer to http://www.vovk.net (conformal prediction))
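Background
Let $\{Z_i : 1 \le i < \infty\}$ be a sequence of random variables. A finite sequence of random variables $Z_1, \ldots, Z_n$ is exchangeable if the joint distribution $p(Z_1, \ldots, Z_n)$ is invariant under any permutation of the indices of the random variables.
A martingale is a sequence of random variables $\{M_i : 0 \le i < \infty\}$ such that $M_n$ is a measurable function of $Z_1, \ldots, Z_n$ for all $n = 0, 1, \ldots$ (in particular, $M_0$ is a constant value) and the conditional expectation of $M_{n+1}$ given $M_0, \ldots, M_n$ is equal to $M_n$, i.e., $E(M_{n+1} \mid M_0, \ldots, M_n) = M_n$.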
Methodology - Strangeness
• Strangeness measures how well one data point is represented by a data model compared to the other points; it is computed for each data point seen so far.
• Applicable to classification, regression, or cluster models.
• Measures diversity/disagreement: the higher the strangeness of a point, the less likely it is to come from the model.
Condition for a valid strangeness measure: the strangeness value of a data point at a particular time instance should be independent of the order in which it is observed with respect to the other data points.
Classification Model
Strangeness (K-NN):
Data stream over t: A for t = 1 to 1000, B for t = 1001 to 2000, C for t = 2001 to 3000 (aaaaa…aaaaabbbbbb…bbbbbccccc…cccccc)
Strangeness (SVM): Lagrange multiplier
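The K-NN formula itself is not preserved in this transcript. The sketch below assumes the form commonly used in the conformal prediction literature: the sum of distances to the k nearest neighbours of the same class, divided by the sum of distances to the k nearest neighbours of other classes. The function name `knn_strangeness`, the Euclidean distance, and the toy data are illustrative assumptions, not the author's code.

```python
import math

def knn_strangeness(points, labels, i, k=3):
    """Assumed K-NN strangeness of point i: summed distance to its k nearest
    same-class neighbours over summed distance to its k nearest
    other-class neighbours. Depends only on the set of points, not their order."""
    same, other = [], []
    for j, (p, lab) in enumerate(zip(points, labels)):
        if j == i:
            continue  # a point is not its own neighbour
        d = math.dist(points[i], p)
        (same if lab == labels[i] else other).append(d)
    # Assumes at least one other-class point at non-zero distance.
    return sum(sorted(same)[:k]) / sum(sorted(other)[:k])

# Toy 1-D data (hypothetical): the last point carries label "a"
# but sits inside the "b" group, so it should look strange.
points = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,), (11.5,)]
labels = ["a", "a", "a", "b", "b", "b", "a"]
typical = knn_strangeness(points, labels, 0, k=2)
odd = knn_strangeness(points, labels, 6, k=2)
print(typical, odd)
```

Because the value depends only on which points are present, it meets the order-independence condition stated for a valid strangeness measure.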
Cluster Model Strangeness of a data vector in a cluster
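The cluster-model formula is not preserved in this transcript. As a hedged illustration, the sketch below assumes one simple choice: the strangeness of a data vector is its Euclidean distance to the centroid of its cluster. The name `cluster_strangeness` and the toy points are hypothetical.

```python
import math

def cluster_strangeness(points, i):
    """Assumed cluster strangeness: distance from point i to the centroid
    of all points in the cluster (order-independent by construction)."""
    dim = len(points[0])
    n = len(points)
    centroid = [sum(p[d] for p in points) / n for d in range(dim)]
    return math.dist(points[i], centroid)

# Four points near the origin plus one far away (toy data).
cluster = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (11.0, 1.0)]
print([round(cluster_strangeness(cluster, i), 3) for i in range(len(cluster))])
```

The far-away vector receives the largest strangeness, which is the behaviour a change detector needs when new frames or points stop fitting the current cluster.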
Regression Model
$s_i = \frac{|y_i - f(x_i)|}{\exp(g(x_i))}$, where $f$ is the regression function and $g$ is the error estimation function for $f$ at $x_i$ (Papadopoulos et al., Inductive Confidence Machines for Regression, ECML, LNAI 2430, pp. 345-356, 2002)
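Methodology
p-value of a new point $x_n$ given the previously seen data points $x_1, \ldots, x_{n-1}$:
$p_n = \frac{\#\{i : s_i > s_n\} + \theta_n \, \#\{i : s_i = s_n\}}{n}$, with $i = 1, \ldots, n$
• where $s_i$ is the strangeness measure for $x_i$
• and $\theta_n$ is randomly chosen from [0,1] for each new point
• $\theta_n$ is necessary so that the sequence of p-values is uniformly distributed in [0,1] for any strangeness measure (Vovk, 2003)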
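A minimal sketch of the randomized p-value, assuming the form used in Vovk et al.'s work: count the points strictly stranger than the new one, add a random fraction of the ties (the new point ties with itself), and divide by the number of points seen. The function name `randomized_p_value` and its `theta` override (handy for reproducible checks) are assumptions for illustration.

```python
import random

def randomized_p_value(strangeness, theta=None, rng=random):
    """p-value of the newest point, whose strangeness is the last list entry.

    strangeness: strangeness values s_1..s_n of all points seen so far.
    theta: tie-breaking value in [0, 1]; drawn at random when omitted.
    """
    s_new = strangeness[-1]
    if theta is None:
        theta = rng.random()
    greater = sum(1 for s in strangeness if s > s_new)
    equal = sum(1 for s in strangeness if s == s_new)  # includes the new point
    return (greater + theta * equal) / len(strangeness)

# A new point that is the strangest so far gets a small p-value;
# one that is the least strange gets a large p-value.
print(randomized_p_value([0.1, 0.5, 0.3, 0.9], theta=0.5))
print(randomized_p_value([0.9, 0.5, 0.3, 0.1], theta=0.5))
```

Small p-values therefore signal points that the current model represents poorly, which is what the change test accumulates evidence from.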
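Methodology
Consider the null hypothesis $H_0$: “no change in the data stream” against the alternative hypothesis $H_1$: “a change occurs in the data stream”. The test for change continues as long as $0 < M_n < \lambda$, where $M_n$ is the martingale value computed from the p-values $p_1, \ldots, p_n$ and $\lambda$ is a positive threshold. One rejects the null hypothesis when $M_n \ge \lambda$.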
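The martingale formula is not preserved in this transcript. The sketch below assumes the randomized power martingale from Vovk et al.'s work, M_n = ∏ ε p_i^(ε-1) with M_0 = 1, and reports the first index at which it reaches the threshold. The function name and the default values `epsilon=0.92` and `lam=20.0` are illustrative assumptions, not settings taken from these slides.

```python
def martingale_change_test(p_values, epsilon=0.92, lam=20.0):
    """Run the assumed power-martingale test over a stream of p-values.

    Returns (n, M_n) for the first n with M_n >= lam, or (None, M_last)
    if the threshold is never reached.
    """
    m = 1.0  # M_0 is a constant
    for n, p in enumerate(p_values, start=1):
        p = max(p, 1e-12)  # guard against p == 0 (negative exponent)
        m *= epsilon * p ** (epsilon - 1.0)
        if m >= lam:
            return n, m  # reject the null hypothesis: change detected
    return None, m

# Toy stream: unremarkable p-values, then a run of very small ones.
stable = [0.5] * 50
shifted = stable + [0.001] * 20
print(martingale_change_test(stable))
print(martingale_change_test(shifted))
```

While p-values stay moderate the product shrinks or hovers, and a run of small p-values makes it grow quickly, so the detection delay depends on how far the martingale has drifted below 1 before the change.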
Experimental Result – Performance Measure
Experimental Result – Varying
Experimental Result – Varying
Linearly Non-separable Classification Model; Linearly Separable Classification Model
Experimental Result
• Ringnorm/Twonorm (change in dataset every 1000 points)
• Nursery Categorical Dataset (change in class compositions every 1000 points)
Application: Adaptive SVM
Simulated USPS 3-Digit Image Data Stream
t: 01120120…0340033404…156556115…77789987…
Application: Adaptive SVM
• A (blue): True change point known to the SVM
• B (red): Adaptive SVM using the martingale method
• C (magenta): SVM using a sliding window of size 250
• D (black): SVM using a sliding window of size 500
• E (green): SVM using a sliding window of size 1000
Application: Video-Shot Change Detection Martingale Change Detection using multiple features (MVMT: Multiple-view martingale test)
Application: Video-Shot Change Detection
• Histogram Intersection (HI)
• Chi-Square Measure
• Euclidean Distance (ED)
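The three frame-comparison measures listed above can be sketched as follows for two histograms of equal length. The exact variants used in the talk are not given in the transcript; this assumes the plain histogram intersection (a similarity), one common symmetric chi-square form, and the ordinary Euclidean distance. Function names and the toy histograms are hypothetical.

```python
import math

def histogram_intersection(h1, h2):
    """Similarity: summed bin-wise minimum (1.0 for identical normalized histograms)."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def chi_square(h1, h2):
    """Assumed symmetric chi-square distance; bins empty in both are skipped."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)

def euclidean(h1, h2):
    """Euclidean distance between the two histogram vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))

# Two normalized 3-bin histograms standing in for consecutive frames.
frame_a = [0.5, 0.5, 0.0]
frame_b = [0.25, 0.25, 0.5]
print(histogram_intersection(frame_a, frame_b))
print(chi_square(frame_a, frame_b))
print(euclidean(frame_a, frame_b))
```

Note that histogram intersection grows with similarity while the other two grow with dissimilarity, so it would need to be inverted (for example, 1 minus the intersection) before being used as a strangeness-style score.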
Reference
• S.-S. Ho and H. Wechsler, Detecting Change-Points in Unlabeled Data Streams using Martingale, Proc. 20th Int. Joint Conf. on Artificial Intelligence (IJCAI 2007), Hyderabad, India, Jan. 6-12, 2007.
• S.-S. Ho, A Martingale Framework for Concept Change Detection in Time-Varying Data Streams, Proc. Int. Conf. on Machine Learning (ICML 2005), Bonn, Germany, Aug. 7-11, 2005.
• S.-S. Ho and H. Wechsler, Adaptive Support Vector Machine for Time-Varying Data Streams Using the Martingale, Proc. Int. Joint Conf. on Artificial Intelligence (IJCAI 2005), Edinburgh, Scotland, July 30 - Aug. 5, 2005.
• S.-S. Ho and H. Wechsler, On the Detection of Concept Change in Time-Varying Data Streams by Testing Exchangeability, Proc. Conference on Uncertainty in Artificial Intelligence (UAI 2005), Edinburgh, Scotland, July 26-29, 2005.
• http://shenshyang.googlepages.com/codes (MATLAB codes + datasets)
Acknowledgement
• Harry Wechsler, PhD Advisor (George Mason University)
• Volodya Vovk (Royal Holloway, University of London)
• Alexander Gammerman (Royal Holloway, University of London)
• Oak Ridge Associated University (ORAU)