Space Shuttle Engine Valve Anomaly Detection by Data Compression
Space Shuttle Engine Valve Anomaly Detection by Data Compression Matt Mahoney
Outline • Problem Statement • Related Work • Anomaly Detection by Data Compression • Future Work
Problem: How to Detect Anomalies in Space Shuttle Valves • [Figure: solenoid current traces, normal vs. abnormal]
Current Method • Identify features (zero crossings, peaks…) • Specify correct behavior using SCL rules
Goal • Reduce the human workload in specifying “normal” behavior of time-series data • Rule output should be in Space Command Language (SCL, an expert system language) to allow manual adjustments • Anomaly detection must be real time (1K-10K samples per second)
Related Work • Automated waveform segmentation (Gecko, Stan Salvador) • Segment characteristics (level, slope, curvature) identify states • Rules are specified as allowed state transitions • Problem: segmentation is slow
Proposal: Modeling using Data Compression • Train model on “normal” time series • Test by measuring goodness of fit to the trained model
Cross Entropy • Measures fitness of a model M relative to a true (but unknown) probability distribution, P • Minimized when M = P • Estimated by a data compressor that uses M • $H_M(P) = \sum_{x \in X} -P(x) \log M(x)$ • $H_M(P)$ = Cross entropy (compressed data size) • X = set of all possible inputs (waveforms) • P(x) = true probability of x • M(x) = estimated probability by model M
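The claim that cross entropy is minimized when M = P can be checked numerically. The sketch below is illustrative only: the three-symbol distribution is made up, and base-2 logarithms (bits) are an assumption, since the slide does not state a base.

```python
import math

def cross_entropy(p, m):
    """Return H_M(P) = sum over x of -P(x) * log2 M(x), in bits per symbol."""
    return sum(-px * math.log2(m[x]) for x, px in p.items() if px > 0)

# Hypothetical "true" distribution P over three symbols.
p = {"a": 0.5, "b": 0.25, "c": 0.25}

# A model that matches P exactly, and a uniform model that does not.
m_exact = dict(p)
m_uniform = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}

h_exact = cross_entropy(p, m_exact)      # equals the entropy of P
h_uniform = cross_entropy(p, m_uniform)  # larger, because M != P
```

A compressor built on the mismatched model would spend more bits per symbol, which is the effect the anomaly detector relies on.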
Measuring Cross Entropy • [Figure: normal and abnormal traces, uncompressed and compressed; panels labeled Normal 1, Normal 2, Normal 1 or 2, Abnormal]
Anomaly Score Score(y) = (C(xy) – C(x)) / C(y) • x = Training (normal) waveform • y = Test (possibly abnormal) waveform • xy = Concatenation of x and y • C(.) = Size after compression • A higher score (worse compression after training) indicates an anomaly
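A minimal sketch of this score, using Python's `zlib` (DEFLATE, the same family as GZIP) as the compressor C. The helper names and the synthetic random-walk traces are assumptions made for illustration; they are not the TEK valve data or the code used in the talk.

```python
import random
import zlib

def compressed_size(data: bytes) -> int:
    """C(.): size in bytes after zlib (DEFLATE) compression."""
    return len(zlib.compress(data, 9))

def anomaly_score(train: bytes, test: bytes) -> float:
    """Score(y) = (C(xy) - C(x)) / C(y) for training trace x and test trace y."""
    return (compressed_size(train + test) - compressed_size(train)) / compressed_size(test)

def random_walk(seed: int, n: int = 1000) -> bytes:
    """Synthetic 8-bit trace: a bounded random walk standing in for a waveform."""
    rng = random.Random(seed)
    level, out = 128, bytearray()
    for _ in range(n):
        level = min(255, max(0, level + rng.choice((-1, 0, 1))))
        out.append(level)
    return bytes(out)

x = random_walk(1)                      # training trace
y_similar = bytearray(x)                # near-repeat of the training trace
for i in (100, 400, 700):
    y_similar[i] ^= 1                   # perturb a few samples
y_similar = bytes(y_similar)
y_different = random_walk(2)            # unrelated trace

score_similar = anomaly_score(x, y_similar)
score_different = anomaly_score(x, y_different)
```

The near-repeat compresses well once the compressor has seen x, so its score is low; the unrelated trace gains little from x and scores higher.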
Data Compressors • GZIP (Gailly): LZ77; duplicate strings are replaced by pointers to the previous occurrence • PAQ3 (Mahoney): weighted context mixing; arithmetic coding of next-bit probability • RK 1.04 (Taylor): PPMZ (models longest matching context); delta coding option for analog data
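To illustrate why delta coding can help with analog data, here is a generic first-order delta coder applied before `zlib`. This is a sketch of the general idea only; it is not RK's implementation, and the ramp signal is a contrived example.

```python
import zlib

def delta_encode(data: bytes) -> bytes:
    """Replace each byte with its difference from the previous byte (mod 256)."""
    prev = 0
    out = bytearray()
    for b in data:
        out.append((b - prev) & 0xFF)
        prev = b
    return bytes(out)

def delta_decode(deltas: bytes) -> bytes:
    """Invert delta_encode by accumulating the differences (mod 256)."""
    level = 0
    out = bytearray()
    for d in deltas:
        level = (level + d) & 0xFF
        out.append(level)
    return bytes(out)

# Contrived smooth signal: a sawtooth ramp of 8-bit samples.
ramp = bytes(range(256)) * 4

raw_size = len(zlib.compress(ramp, 9))
delta_size = len(zlib.compress(delta_encode(ramp), 9))
```

After delta coding, the ramp becomes an almost constant stream of small differences, which a string-matching compressor handles far better than the raw sample values.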
Data • TEK 0, TEK 1 = Normal on/off cycle of Marotta valve S/N 37898 • TEK {2, 3, 5, 10, 11, 15, 16, 17} = various forced failures • 1000 solenoid current samples at 1 ms intervals • Range: -3.1 to 7.06 A at 0.04 A resolution • Converted to 1000 8-bit values (1000 byte files)
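The conversion to 8-bit values might look like the sketch below. The exact mapping is not given on the slide; this assumes a linear code with -3.1 A mapped to 0 and one code per 0.04 A step, which places 7.06 A at code 254. The function name and constants are illustrative.

```python
I_MIN = -3.1   # amperes, lower end of the stated range
STEP = 0.04    # amperes per code, the stated resolution

def quantize(samples):
    """Map current samples (in A) to bytes, assuming a linear code from I_MIN."""
    codes = bytearray()
    for amps in samples:
        code = round((amps - I_MIN) / STEP)
        codes.append(min(255, max(0, code)))   # clamp to the 8-bit range
    return bytes(codes)

# Endpoints of the stated range plus one interior value.
encoded = quantize([-3.1, 7.06, 0.9])
```

Under this assumption the stated range spans 255 codes, which is consistent with the data fitting in one byte per sample.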
Experimental Procedure • Nor 0: Train on TEK 0, test on TEK 1 (normal) • Nor 1: Train on TEK 1, test on TEK 0 (normal) • Ab 0: Train on TEK 0, average of tests on 8 abnormal traces • Ab 1: Train on TEK 1, average of tests on 8 abnormal traces
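The four experiments can be expressed as a small driver. This is a sketch under assumptions: `run_experiment` and its arguments are hypothetical names, `score_fn(train, test)` stands for any implementation of the anomaly score, and the traces are supplied by the caller.

```python
from statistics import mean

def run_experiment(normal, abnormal, score_fn):
    """Return the Nor 0, Nor 1, Ab 0 and Ab 1 results.

    normal   -- dict with keys 0 and 1 holding the two normal traces
    abnormal -- dict mapping trace number to an abnormal trace
    score_fn -- callable(train, test) returning an anomaly score
    """
    n0, n1 = normal[0], normal[1]
    return {
        "Nor 0": score_fn(n0, n1),   # train on TEK 0, test on TEK 1
        "Nor 1": score_fn(n1, n0),   # train on TEK 1, test on TEK 0
        "Ab 0": mean(score_fn(n0, a) for a in abnormal.values()),
        "Ab 1": mean(score_fn(n1, a) for a in abnormal.values()),
    }

# Toy check with a stand-in score (length difference), not a real compressor.
toy = run_experiment(
    normal={0: b"aa", 1: b"aaa"},
    abnormal={2: b"aaaa", 3: b"aaaaaa"},
    score_fn=lambda train, test: len(test) - len(train),
)
```

An anomaly is indicated when the Ab results are clearly higher than the corresponding Nor results.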
Run Time Performance (750 MHz PC) • Real time = 1K samples/sec • GZIP: 3000K samples/sec • PAQ3: 40K samples/sec • RK -mx3 -fd1: 78K samples/sec
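Throughput figures of this kind can be estimated by timing repeated compression of a trace. The sketch below measures `zlib` only, on a placeholder trace; the function name and repeat count are arbitrary, and results will differ from the 750 MHz numbers above.

```python
import time
import zlib

def samples_per_second(trace: bytes, repeats: int = 200) -> float:
    """Estimate compression throughput in samples (bytes) per second."""
    start = time.perf_counter()
    for _ in range(repeats):
        zlib.compress(trace, 9)
    elapsed = max(time.perf_counter() - start, 1e-9)   # guard against zero
    return len(trace) * repeats / elapsed

# Placeholder 1000-sample trace, one byte per sample.
trace = bytes(i % 256 for i in range(1000))
rate = samples_per_second(trace)
```

Comparing `rate` against the required sample rate indicates whether a given compressor could keep up in real time on the target hardware.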
Summary • Data compression detects anomalies in the TEK valve data (2 normal, 8 abnormal traces) • GZIP and PAQ3 detect anomalies in 8 of 8 cases using either training set • RK detects 7 of 8 anomalies using either training set (TEK 15 appears more “normal” to all 3 compressors)
Future Work • Verify with more data sets (voltage, temperature, plunger blockage) • Identify anomalous points within the trace • Improve modeling of analog data • Translate models to SCL • Work is preliminary; much needs to be done
Thank You • For more information, http://cs.fit.edu/~mmahoney/nasa/