Presented by Justin Domke
Dynamic query tools for time series data sets: Timebox widgets for interactive exploration. Harry Hochheiser, Ben Shneiderman. Presented by Justin Domke
Motivation • Data that changes over time is common. • Algorithmic and statistical methods are good at answering questions. • How to choose the questions themselves?
Standard time plots are very compelling, but can only display a limited amount of data
Idea: Query the data!
Notation • n_i is an item in a time series data set. • n_i(t) is the value of n_i at time t.
Three Widgets: (1) Timebox. A timebox is a 4-tuple b = (t_min, t_max, v_min, v_max). n_i satisfies b if, for all t with t_min ≤ t ≤ t_max, v_min ≤ n_i(t) ≤ v_max.
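The timebox condition can be sketched as a small predicate. This is only an illustration, assuming each series is a list indexed by integer time steps, so that series[t] plays the role of n_i(t); the names Timebox and satisfies are hypothetical and do not come from the talk.

```python
from typing import NamedTuple, Sequence


class Timebox(NamedTuple):
    """Hypothetical container for b = (t_min, t_max, v_min, v_max)."""
    t_min: int
    t_max: int
    v_min: float
    v_max: float


def satisfies(series: Sequence[float], box: Timebox) -> bool:
    """True if every value at times t_min..t_max (inclusive) lies in [v_min, v_max]."""
    return all(
        box.v_min <= series[t] <= box.v_max
        for t in range(box.t_min, box.t_max + 1)
    )


prices = [10.0, 12.0, 15.0, 14.0, 30.0]
box = Timebox(t_min=1, t_max=3, v_min=11.0, v_max=16.0)
print(satisfies(prices, box))  # True: 12, 15 and 14 all lie in [11, 16]
```

Widening the box to t_max = 4 would make the same series fail, because the value 30 falls outside the value range.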
Three Widgets: (2) Variable Time Timebox. A variable time timebox is a 5-tuple b = (t_min, t_max, v_min, v_max, R). n_i satisfies b if there exists t_0 with t_min ≤ t_0 ≤ t_max - R such that, for all t with t_0 ≤ t ≤ t_0 + R, v_min ≤ n_i(t) ≤ v_max.
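A variable time timebox adds an existential search over the window start t_0. The sketch below assumes integer time steps and a list-based series, and simply tries every admissible start; the function name satisfies_variable is hypothetical.

```python
from typing import Sequence


def satisfies_variable(
    series: Sequence[float],
    t_min: int,
    t_max: int,
    v_min: float,
    v_max: float,
    r: int,
) -> bool:
    """True if some window [t0, t0 + r] inside [t_min, t_max] stays in [v_min, v_max]."""
    # t0 ranges over t_min .. t_max - r (inclusive), as in the definition.
    for t0 in range(t_min, t_max - r + 1):
        if all(v_min <= series[t] <= v_max for t in range(t0, t0 + r + 1)):
            return True
    return False


series = [1.0, 9.0, 5.0, 6.0, 5.0, 9.0, 1.0]
print(satisfies_variable(series, 0, 6, 4.0, 7.0, r=2))  # True: window 2..4 holds 5, 6, 5
print(satisfies_variable(series, 0, 6, 4.0, 7.0, r=3))  # False: no 4-point window fits
```

If R is longer than the span t_max - t_min, there is no admissible start and the sketch returns False.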
Three Widgets: (3) Angular Query Widget. An angular query widget is a 4-tuple b = (t_min, t_max, θ_min, θ_max). n_i satisfies b if, for all t with t_min ≤ t < t_max, θ_min ≤ φ(n_i(t), n_i(t+1)) ≤ θ_max, where φ is the angle formed on the graph by the line between consecutive points.
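An angular query can be sketched in the same style. This is an assumption-heavy illustration: it takes φ to be the angle, in degrees, of the line segment joining each point to the next one, with time steps of unit width. On a real display the angle also depends on axis scaling, which the sketch ignores. The names segment_angle and satisfies_angular are hypothetical.

```python
import math
from typing import Sequence


def segment_angle(v0: float, v1: float, dt: float = 1.0) -> float:
    """Angle in degrees of the segment from (t, v0) to (t + dt, v1)."""
    return math.degrees(math.atan2(v1 - v0, dt))


def satisfies_angular(
    series: Sequence[float],
    t_min: int,
    t_max: int,
    theta_min: float,
    theta_max: float,
) -> bool:
    """True if every segment between t_min and t_max has an angle in [theta_min, theta_max]."""
    return all(
        theta_min <= segment_angle(series[t], series[t + 1]) <= theta_max
        for t in range(t_min, t_max)
    )


series = [0.0, 1.0, 2.0, 2.5]
print(satisfies_angular(series, 0, 2, 40.0, 50.0))  # True: both segments rise at about 45 degrees
print(satisfies_angular(series, 0, 3, 40.0, 50.0))  # False: the last segment is shallower
```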
Demonstration • Standard Timeboxes • Drag From Display Window • Manipulate multiple boxes • Coupling of windows • Variable Time Timeboxes • Angular Queries • Query Inversion • Query Multiple Variables • Leaders and Laggards
Performance • Over 75% of time is spent on query evaluation. • Naïve approach: • For each item in the set, examine every point in each timebox. • Easy improvement: • Throw an item out if it fails any query.
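The naive approach with the easy improvement might look like the following sketch. It assumes a dataset stored as a dict of lists and timeboxes given as plain (t_min, t_max, v_min, v_max) tuples; these choices and the function names are illustrative, not taken from the tool itself.

```python
from typing import Dict, List, Sequence, Tuple

Box = Tuple[int, int, float, float]  # (t_min, t_max, v_min, v_max)


def in_box(series: Sequence[float], box: Box) -> bool:
    """Examine the points inside one timebox, stopping at the first violation."""
    t_min, t_max, v_min, v_max = box
    return all(v_min <= series[t] <= v_max for t in range(t_min, t_max + 1))


def sequential_query(dataset: Dict[str, Sequence[float]], boxes: List[Box]) -> List[str]:
    """Return the names of the items that satisfy every timebox."""
    matches = []
    for name, series in dataset.items():
        # all() short-circuits, so an item is thrown out at its first failing box.
        if all(in_box(series, box) for box in boxes):
            matches.append(name)
    return matches


dataset = {
    "A": [5.0, 6.0, 7.0, 8.0],
    "B": [5.0, 9.0, 7.0, 8.0],
    "C": [1.0, 6.0, 7.0, 2.0],
}
boxes = [(0, 1, 4.0, 7.0), (2, 3, 6.0, 9.0)]
print(sequential_query(dataset, boxes))  # ['A']
```

The cost is still proportional to the number of items times the points covered by the timeboxes in the worst case; early rejection only helps when many items fail quickly.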
Performance (2) – Alternatives • Suppose the data has n time series, each with m time points. • Think of this as mn points in 2-D space. • Use geometric methods to find the points in each given range. • Increment a counter for a series each time one of its points falls in the range. If the count equals the number of time points the timebox spans, the series satisfies the query. • Use an orthogonal range tree or a grid approach with buckets.
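The counting idea can be sketched with a simple grid. This is a minimal illustration and not the implementation measured in the talk: it assumes integer time steps, one point per series per time step, and a grid with one column per time step and value buckets of a fixed, arbitrarily chosen width. The names build_grid and grid_query are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Grid = Dict[Tuple[int, int], List[Tuple[float, str]]]


def build_grid(dataset: Dict[str, Sequence[float]], bucket_size: float) -> Grid:
    """Place every (t, value) point in the cell keyed by (t, value bucket)."""
    grid: Grid = defaultdict(list)
    for name, series in dataset.items():
        for t, v in enumerate(series):
            grid[(t, int(v // bucket_size))].append((v, name))
    return grid


def grid_query(grid: Grid, bucket_size: float,
               t_min: int, t_max: int, v_min: float, v_max: float) -> List[str]:
    """Count in-range points per series; a full count means the timebox is satisfied."""
    counts: Dict[str, int] = defaultdict(int)
    lo, hi = int(v_min // bucket_size), int(v_max // bucket_size)
    for t in range(t_min, t_max + 1):
        for b in range(lo, hi + 1):  # visit only the buckets overlapping the value range
            for v, name in grid.get((t, b), ()):
                if v_min <= v <= v_max:
                    counts[name] += 1
    needed = t_max - t_min + 1  # one point per time step in the box
    return sorted(name for name, c in counts.items() if c == needed)


dataset = {
    "A": [5.0, 6.0, 7.0, 8.0],
    "B": [5.0, 9.0, 7.0, 8.0],
    "C": [1.0, 6.0, 7.0, 2.0],
}
grid = build_grid(dataset, bucket_size=2.0)
print(grid_query(grid, 2.0, t_min=0, t_max=1, v_min=4.0, v_max=7.0))  # ['A']
```

For a query with several timeboxes, each box would be evaluated this way and the resulting sets of names intersected.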
Performance (3) • Figure: average query completion time vs. number of items for random data (100 time points). • Legend: Seq = sequential; Orth = orthogonal range tree; Grid-X = grid approach with X buckets.
Performance (4) • Figure: average query completion time vs. number of time points for random data (100 items). • Legend: Seq = sequential; Orth = orthogonal range tree; Grid-X = grid approach with X buckets.
Design Studies • 24 Computer Science students completed various tasks using different but semantically equivalent input mechanisms: • Timebox queries • Fill-in • Range sliders
Design Study 1 • Fully specified tasks. (“During days 22-23, are there more stocks between 69-119, 59-109, or 49-99?”) • Form fill-in fastest. • Range sliders second. • Timeboxes last.
Design Study 2 • More open-ended tasks. • Compare: • Timeboxes with graphical output • Forms with graphical output • Forms with tabular output • No statistically significant difference. (Were the users already familiar with timeboxes?)
Comments • Problems with user interface? • Why “timesearcher”, instead of “parallelcoordinatesearcher”? • In the performance experiment, what did the data look like? • In the design study, were the users already familiar with Timesearcher?