
Visually Mining and Monitoring Massive Time Series

Visually Mining and Monitoring Massive Time Series. Lin, J., Keogh, E., Lonardi, S., Lankford, J.P. and Nystrom, D.M. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004. Amy Karlson, V. Shiv Naga Prasad. 15 February 2004. CMSC 838S.


Presentation Transcript


  1. Visually Mining and Monitoring Massive Time Series • Lin, J., Keogh, E., Lonardi, S., Lankford, J.P. and Nystrom, D.M. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004. • Amy Karlson, V. Shiv Naga Prasad • 15 February 2004 • CMSC 838S • Images courtesy of Jessica Lin and Eamonn Keogh

  2. What are Time Series? • Simply: • Observations of a variable made over time • Typical across a wide variety of domains • Medicine • Physiology • Finance • Microbiology • Meteorology • Surveillance

  3. Motivation: Critical Decision Making • Domains • Spacecraft Launch • Medicine • Research Directions • Mining Archives • Extract rules, patterns, regularities • Visualizing Streams • Novel visualization and interaction for: • Query by content • Motif discovery • Anomaly detection

  4. Some Visual Time Series Systems • TimeSearcher • Direct-manipulation pattern query • ThemeRiver • Theme strength over time • Spirals • Periodic data with known period • [Figure credits and captions, in slide order: Hochheiser and Shneiderman; dot.com stocks 1999-2002; Havre, Hetzler, Whitney & Nowell, InfoVis 2000; Weber et al.]

  5. VizTree • Construct a subsequence tree to span the space of subsequences of a given time series. • Use this to collect statistics about the series. • Size of the structure is independent of the length of the series.

  6. VizTree Approach - Overview • Place windows along the time series to obtain subsequences. • Quantize along the time and value dimensions to obtain sequences of discrete symbols. • Construct a subsequence tree to represent all possible such sequences. • Count how often each branch of the subsequence tree is traversed. • Use these counts for motif and anomaly detection, and for comparing time series.

  7. Subsequences Place windows along the time series to obtain subsequences.
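
The windowing step on this slide can be sketched as follows. This is a minimal illustration, not the VizTree implementation: the function name `sliding_windows` and the `step` parameter are assumptions introduced here, and the slides do not say how far the window advances each time.

```python
def sliding_windows(series, window, step=1):
    """Return every contiguous subsequence of length `window`.

    `step` (hypothetical parameter) is how far the window advances each time.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    last_start = len(series) - window
    return [series[i:i + window] for i in range(0, last_start + 1, step)]


series = [1, 3, 2, 5, 4, 6]
print(sliding_windows(series, window=3))
# [[1, 3, 2], [3, 2, 5], [2, 5, 4], [5, 4, 6]]
```

A series shorter than the window simply yields no subsequences.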

  8. Discretization • Subsequences are patterns. • Take windows along time series – length of window ~ length of subsequence. • Discretize the range of data - one symbol for each quantum. • Divide window into segments ~ represent one segment with one symbol.

  9. Symbolic Aggregate approXimation (SAX) • [Figure: one subsequence divided into segments, with the quantization levels and the representative symbol for each segment] • Discrete version = acdcbdba
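
A rough sketch of how one window might be turned into a SAX word. The slides only say that the window is split into segments and the value range into quantization levels; the z-normalization and the equiprobable Gaussian breakpoints used below follow the usual published description of SAX and should be read as assumptions here. The name `sax_word` and its parameters are illustrative.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax_word(subsequence, n_segments, alphabet="abcd"):
    """Discretize one window into a word of `n_segments` symbols."""
    if len(subsequence) % n_segments != 0:
        raise ValueError("this sketch needs len(subsequence) divisible by n_segments")
    mu, sigma = mean(subsequence), pstdev(subsequence)
    # z-normalize the window; a perfectly flat window maps to all zeros
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in subsequence]
    # quantize along time: replace each segment by its mean
    seg_len = len(z) // n_segments
    seg_means = [mean(z[i * seg_len:(i + 1) * seg_len]) for i in range(n_segments)]
    # quantize along value: breakpoints cut the standard normal into equal-probability bins
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]
    return "".join(alphabet[bisect_right(breakpoints, m)] for m in seg_means)


print(sax_word([0, 0, 1, 1, 2, 2, 3, 3], n_segments=4))  # abcd
```

With eight segments and a four-letter alphabet, the same function would produce words of the shape shown on the slide (such as acdcbdba).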

  10. Subsequence Tree - example • symbols = {a, b, c} • # segments per window = 2 • Tree spans the space of subsequences. • Branching factor ~ # symbols (size of alphabet) • Depth ~ # segments per window • Branch thickness ~ frequency of occurrence of the subsequence. • [Figure: example tree with branches labeled a, b, c at each of the two levels]
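
The tree on this slide can be approximated with a small counting trie. This is only a sketch under stated assumptions: `build_tree` and `branch_count` are hypothetical names, and unlike the tree described on the slide, which spans every possible word, this version creates a branch only when a word first uses it. Either way the number of nodes is bounded by the alphabet size and the depth, not by the length of the series.

```python
def build_tree(words):
    """Build a trie in which every node counts the words passing through it."""
    root = {"count": 0, "children": {}}
    for word in words:
        node = root
        node["count"] += 1
        for symbol in word:
            node = node["children"].setdefault(symbol, {"count": 0, "children": {}})
            node["count"] += 1
    return root


def branch_count(tree, prefix):
    """Traversal count of the branch reached by `prefix` (the branch 'thickness')."""
    node = tree
    for symbol in prefix:
        node = node["children"].get(symbol)
        if node is None:
            return 0  # no word ever went down this branch
    return node["count"]


tree = build_tree(["ab", "ab", "ac", "ba"])
print(branch_count(tree, "a"), branch_count(tree, "ab"), branch_count(tree, "bc"))
# 3 2 0
```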

  11. VizTree Tool Demo

  12. Query by Content: Subsequence Matching • Finding known patterns • Chunking • Breaking a time series into individual series • Methods • Time (e.g. power usage) • Shape (e.g. heart beats) • Search Approaches • Exact - Slow • Approximate - Fast • Exploration • Hypothesis Testing • [Slide graphic: two arrows labeled "VizTree"]

  13. Motif Discovery • Finding unknown patterns • Not exact matches • VizTree allows exploration at varying levels of precision • E.g., cc** vs. ccac
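
The "cc** vs. ccac" example can be read as matching symbol words against a pattern in which `*` stands for any symbol, so a shorter fixed prefix gives a coarser match. A minimal sketch of that reading follows; the helper names `matches` and `pattern_frequency` and the sample words are made up for illustration.

```python
def matches(word, pattern):
    """True if `word` agrees with `pattern` wherever the pattern is not '*'."""
    return len(word) == len(pattern) and all(
        p == "*" or p == w for w, p in zip(word, pattern)
    )


def pattern_frequency(words, pattern):
    """Number of words that match `pattern`."""
    return sum(1 for word in words if matches(word, pattern))


words = ["ccac", "ccab", "ccac", "abca", "ccbb"]  # illustrative data only
print(pattern_frequency(words, "cc**"))  # 4 (coarse: any word starting with cc)
print(pattern_frequency(words, "ccac"))  # 2 (fine: the full word must match)
```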

  14. Anomaly Detection • Finding abnormal patterns. • Use data already seen to identify anomalies • Identified by thin branches
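
One way to read "thin branches" is that a word seen rarely, or never, in the data observed so far is a candidate anomaly. The sketch below works on that assumption; the function name `flag_anomalies` and the threshold `max_count` are hypothetical, and the slides do not state how thin a branch must be to count as anomalous.

```python
from collections import Counter


def flag_anomalies(reference_words, new_words, max_count=1):
    """Return (position, word) for new words seen at most `max_count` times before."""
    seen = Counter(reference_words)  # missing words count as 0
    return [(i, w) for i, w in enumerate(new_words) if seen[w] <= max_count]


reference = ["abab"] * 5 + ["abba"]  # illustrative data only
incoming = ["abab", "bbbb", "abba"]
print(flag_anomalies(reference, incoming))
# [(1, 'bbbb'), (2, 'abba')]
```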

  15. Comparing Series: Diff Tree • Same parameters give the same tree structure • Compare the test branch frequencies with the reference branch frequencies • Blue = underrepresented • Green = overrepresented • Red = equivalent • Thickness = magnitude
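
A sketch of the comparison behind a diff tree, assuming both series were discretized with the same parameters. The slide does not say how frequencies are normalized, so the difference of relative frequencies used here is an assumption, and `diff_frequencies` and `label` are illustrative names. The sign of each difference would select the color and its absolute value the thickness.

```python
from collections import Counter


def diff_frequencies(reference_words, test_words):
    """Relative frequency in the test series minus that in the reference series."""
    if not reference_words or not test_words:
        raise ValueError("both word lists must be non-empty")
    ref, test = Counter(reference_words), Counter(test_words)
    n_ref, n_test = len(reference_words), len(test_words)
    return {
        word: test[word] / n_test - ref[word] / n_ref
        for word in sorted(set(ref) | set(test))
    }


def label(difference):
    """Map a signed difference to the categories named on the slide."""
    if difference < 0:
        return "underrepresented"
    if difference > 0:
        return "overrepresented"
    return "equivalent"


reference = ["ab", "ab", "ba", "bb"]  # illustrative data only
test = ["ab", "ba", "ba", "ba"]
for word, d in diff_frequencies(reference, test).items():
    print(word, d, label(d))
```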

  16. Thoughts on VizTree (Vis.) • Most of “discovery” is implicit • Manual search • Parameter setting might be an issue • Automation might help • Tree Visualization • Use of real estate? • Effective? • Intuitive? • Alternatives?

  17. Thoughts on VizTree (HCI) • Primarily a tool for researchers now • (Also, we might have an outdated version) • Even so, some HCI suggestions: • Indication of how tree detail relates to tree overview • Zoom into a specific area of the time series (rather than zoom+scroll) • Selection in subsequence detail relates to subsequence overview • Unfortunately, least interesting patterns are most easily accessed (branches at root) • “Snap to branch” or “snap to intersection”? • Ability to turn off highlighting (undo)

  18. Summary: Unique Contributions • Fundamental support for aperiodic series • Scalable • Resource requirements do not grow linearly with the length of the series • Rich visual feature set • Global summaries • Diff-trees between multiple series • Local patterns and anomalies
