
GGobi

GGobi. Dr. Yan Liu Department of Biomedical, Industrial and Human Factors Engineering Wright State University . Introduction. Overview An open source visualization program for exploring high-dimensional data Basic plots


Presentation Transcript


  1. GGobi • Dr. Yan Liu • Department of Biomedical, Industrial and Human Factors Engineering, Wright State University

  2. Introduction • Overview: an open source visualization program for exploring high-dimensional data • Basic plots: scatterplot, scatterplot matrix, parallel coordinates, time series, histogram • Interaction: tour, linking and brushing, dynamic selection, probing, zooming, etc. • Interface: a graphical user interface (GUI) when using GGobi as a stand-alone tool; a command-line interface when using GGobi with R (rggobi package)

  3. Load Dataset • File >> Open • Two supported file formats: XML (Extensible Markup Language) and CSV (comma-separated values)

  4. First Two Windows after Loading a Dataset • The two windows: the scatterplot (XY plot) and the GGobi console • Move the mouse cursor over a control to see a tooltip that explains its function • Console controls: click the buttons to select which variables are mapped to the X (horizontal) and Y (vertical) axes; when the cycling box is checked, all possible XY plots (changing the variables mapped to the X and/or Y coordinates) are displayed automatically, one after another; select whether to fix the variable mapped to the X or Y axis while cycling; adjust the cycling speed; specify the direction of cycling (according to the list)

  5. Menu bar • File: open an existing file, open a new console, save data, close the current console, or quit the application • Display: open a new plot window (2D scatterplot, scatterplot matrix, parallel coordinates, time series, or bar chart) • View: specify the projection (1D, 1D tour, 2D, 2D tour, etc.) for the current display • Interaction: specify the interaction (scaling, highlighting, moving, etc.) with the current display • Tools: open other windows to manipulate characteristics of the data and the view

  6. Open a New Plot • Display >> New Barchart to open a new bar chart of the dataset • The current active plot is surrounded by a narrow white band • The GGobi console corresponds to the current active plot • Click a plot (in the plot region) to make it active

  7. Tour • A d-dimensional grand tour is a continuous geometric transformation of a d-dimensional coordinate system such that all possible orientations of the coordinate axes are eventually achieved • Allows for an in-depth study of high-dimensional data • Types of tour in GGobi: 1D tour; 2D tour; Rotation (2D tour with three variables); 2x1D tour
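The idea of a tour as a smooth sequence of projections can be sketched in a few lines. The following is a simplified illustration for the 1D case only, assuming that the projection direction moves along a great circle between two random unit vectors; it is not GGobi's actual tour algorithm, and all function names are hypothetical.

```python
import math
import random

def random_unit_vector(d, rng):
    """Draw a random direction on the unit sphere in d dimensions."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def slerp(a, b, t):
    """Move a fraction t along the great circle from unit vector a to unit vector b."""
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    omega = math.acos(dot)
    if math.sin(omega) < 1e-12:  # (anti)parallel vectors: nothing to interpolate
        return list(a)
    wa = math.sin((1.0 - t) * omega) / math.sin(omega)
    wb = math.sin(t * omega) / math.sin(omega)
    return [wa * x + wb * y for x, y in zip(a, b)]

def project(data, direction):
    """Project each record (a list of d values) onto the direction vector."""
    return [sum(x * c for x, c in zip(row, direction)) for row in data]

def tour_frames(data, steps, rng):
    """Yield (direction, projected data) pairs along one interpolation path."""
    d = len(data[0])
    start = random_unit_vector(d, rng)
    target = random_unit_vector(d, rng)
    for i in range(steps + 1):
        direction = slerp(start, target, i / steps)
        yield direction, project(data, direction)
```

Each frame holds one coefficient per variable, always between -1 and 1, and a full tour would chain many such paths so that the view keeps moving through the variable space.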

  8. 1D Tour • View >> 1D Tour • Generates a continuous sequence of 1D projections of the active variable space • The active variable space is the subset of attributes currently selected • Variable circles of selected attributes are drawn with a bold outline • A deselected variable fades out gradually to maintain continuity of motion • The projected data are displayed as an average shifted histogram (ASH): a set of histograms with different origins is generated and the histograms are then averaged • To select the manipulation variable manually, click the purple Manip button and then click the variable circle • Horizontal mouse motions in the plot window then alter the coefficient of the manipulation attribute (from -1 to 1)
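The averaging step behind the ASH can be illustrated as follows. This is a minimal sketch, assuming m histograms that share one bin width h and whose origins are shifted by h/m; the function name and parameters are hypothetical, and GGobi's own implementation may differ.

```python
import math

def ash_density(data, x, h, m, origin=0.0):
    """Average, at the point x, the density estimates of m histograms
    with bin width h whose origins are shifted by h / m from one another."""
    n = len(data)
    total = 0.0
    for j in range(m):
        shifted_origin = origin + j * h / m
        # Index of the bin that contains x in the j-th shifted histogram.
        bin_of_x = math.floor((x - shifted_origin) / h)
        # Number of observations that fall into that same bin.
        count = sum(
            1 for v in data
            if math.floor((v - shifted_origin) / h) == bin_of_x
        )
        total += count / (n * h)  # histogram density estimate at x
    return total / m
```

With m = 1 this reduces to an ordinary histogram; larger m smooths the steps, which matches the console control for the number of histograms.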

  9. 1D Tour (Cont.) • Axes for the tour (can be removed via Options >> show axes) • Console controls: select the attributes for the tour; current manipulation attribute; adjust the tour speed; stop/start the tour; initiate the tour (the coefficient of the manipulation attribute starts from 1); start the tour from a randomly selected coefficient of the manipulation attribute; control the number of histograms (the more histograms, the smoother the display)

  10. 2D Tour • View >> 2D Tour • Generates a continuous sequence of 2D projections of the active variable space • The projected data are displayed as a scatterplot • Many features are similar to those of the 1D tour • Axes for the tour (can be removed via Options >> show axes) • Select the attributes for the tour • Manipulation modes • Oblique: unconstrained manipulation • Horizontal/Vertical: constrains manipulation to the horizontal/vertical axis • Radial: constrains manipulation to the current direction of the variable axis, keeping its angle fixed • Angular: rotates the variable axis in the plane of the plot window, keeping the length of the axis fixed

  11. 2D Tour with Three Variables • View >> Rotation • A 2D tour restricted to three variables • The three axes are individually represented by toggle buttons labeled X, Y and Z

  12. 2x1D Tour • View >> 2x1D • Generates two independent continuous sequences of 1D projections, one for each of two active variable spaces • The two projections are plotted horizontally and vertically, producing a scatterplot • Axes for the tour • Select the horizontal and vertical attributes for the tour • Manipulation modes • Combined: changes both the horizontal and the vertical manipulation variable coefficients • EqualComb: constrains the horizontal and vertical changes to be equal • Horizontal/Vertical: constrains manipulation to the horizontal/vertical axis

  13. Scaling of Axes • Interaction >> Scale • Changes the view of the data rather than transforming the data • Using the sliders on the console: zoom vertically/horizontally; pan vertically/horizontally; control whether to hold the aspect ratio constant during zooming; more scale controls • Direct manipulation in the plot window: left mouse button for panning (moving the data freely around the window); right or middle mouse button for zooming (move up/down to zoom in/out along the vertical axis; move right/left to zoom in/out along the horizontal axis)

  14. Brush • Interaction >> Brush • Interactively paint (highlight) points • More powerful when multiple plots are linked • The “brush”: all points inside the brush are affected; press the left mouse button to drag it around the plotting window; press the right or middle button to resize or reshape it • Console controls: choose the color, glyph shape and size, and edge line type used by the brush; if the box is checked, brushed points remain highlighted after the brush is moved away; undo the most recent persistent brushing changes

  15. Brush (Cont.) • Specify the characteristics of “brushed” points • “Shadow”: brushed points are drawn in a color that is very close to the background color, so that they are de-emphasized yet still provide context for the rest of the data • Specify the characteristics of “brushed” line segments

  16. Brush menu options • “Exclude shadowed points/edges”: exclude shadowed points/edges from the plot; the view is rescaled without them • “Include shadowed points/edges”: redraw the shadowed points/edges and include them in the rescaling • “Unshadow all points/edges”: restore the points/edges to their usual colors • “Reset brush”: restore the brush to its default size and position • “Brush on”: if unchecked, the brush can move freely around the plotting window without changing the points it covers (useful when you need to position the brush quickly before painting) • “Update brushing continuously”: update linked brushing with every mouse motion; if unchecked, linked views are updated only when the mouse button is released

  17. Two linked views • Linking rules • by “Case ID”: when points in one view are painted, the points corresponding to the same records (cases) in the other view are also painted • by “Area”: points that have the same value of the variable Area are painted in both views
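The two linking rules can be contrasted with a small sketch. It assumes records stored as dictionaries and brushed points given as row indices; the function names and the sample values of Area are hypothetical and are not taken from GGobi.

```python
def linked_by_case_id(brushed, records):
    """Case-ID linking: exactly the same records are painted in the other view."""
    return set(brushed)

def linked_by_variable(brushed, records, variable):
    """Variable linking: every record that shares a value of `variable`
    with any brushed record is painted as well."""
    values = {records[i][variable] for i in brushed}
    return {i for i, rec in enumerate(records) if rec[variable] in values}

# Hypothetical data: three records with a categorical variable "Area".
records = [{"Area": "north"}, {"Area": "south"}, {"Area": "north"}]
by_id = linked_by_case_id({0}, records)
by_area = linked_by_variable({0}, records, "Area")
```

Brushing only the first record paints just that record under case-ID linking, but paints both "north" records when linking by Area.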

  18. Identification of Points • Interaction >> Identify • Specify how to label each record (by which attribute): by “Record Label” by default (supplied by the user); by “Record Number” (if no record label is supplied); or by a specific attribute • Remove labels in the plot • Label all records in the plot • Recenter the plot based on the record selected in the plotting window

  19. Variable Manipulation Tool • Tools >> Variable Manipulation • Displays some statistics of the attributes • For a continuous (real) attribute: the variable transformation (if any); the min, max, mean, and median of the raw data; and the # of missing values • For a categorical attribute: the # of values for each level of the attribute • Select subsets of attributes to be plotted when launching parallel coordinates or a scatterplot matrix • Use the CTRL/Shift key to select multiple non-consecutive/consecutive attributes

  20. Variable Manipulation Tool (Cont.) • Set variable ranges: change the variable ranges used for projecting data into the plot window • Rescale the plot using the user-specified ranges • Clone the selected variables and add the new variables to the console and the data table • Create a new variable: its value is set to either the row numbers (1 to # of records) or a set of integers reflecting the groups defined by brushing • Rename the selected variable

  21. Variable Transformation Tool • Tools >> Variable Transformation • Stage 0: adjust the domain of variables (change the minimum to 0 or 1, or take the negative) • Stage 1: data-independent transformations, preserving user-defined limits: Box-Cox family of power transformations, $T(Y) = (Y^{\lambda} - 1)/\lambda$, where $\lambda$ is the transformation parameter; logarithm (base 10); inverse; absolute value; scale to a specific range [a, b] • Stage 2: data-dependent transformations: standardize, $T(Y) = (Y - \bar{Y})/s$, where $\bar{Y}$ is the sample mean and $s$ is the sample standard deviation; normal score; Z-score; sort; rank; etc.
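The two formulas on this slide can be written out directly. This is a minimal sketch rather than GGobi's code; the function names are hypothetical, and the treatment of λ = 0 as the natural logarithm follows the usual Box-Cox convention, which the slide does not state.

```python
import math
import statistics

def box_cox(y, lam):
    """Box-Cox transformation T(Y) = (Y**lam - 1) / lam for Y > 0.
    lam == 0 is handled as log(Y), the limit of the formula (an added assumption)."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive values")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam

def standardize(values):
    """T(Y) = (Y - sample mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(v - mean) / sd for v in values]
```

The positivity requirement of the Box-Cox formula is one reason a stage for adjusting the domain (for example, raising the minimum to 1) comes before the power transformation.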

  22. Jittering • Tools >> Variable Jittering • Adds random noise to the selected variables to reduce the problem of overlapping data points • Select uniform or normal random jitter • Choose the degree of jittering (0-1)
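Jittering can be sketched as adding scaled random noise to each value. The scaling used here (degree times the range of the variable) is an assumption made for illustration, not GGobi's documented formula, and the function name is hypothetical.

```python
import random

def jitter(values, degree, kind="uniform", rng=None):
    """Return a jittered copy of values.
    degree: amount of jitter between 0 and 1.
    kind: "uniform" or "normal" random noise.
    Noise is scaled by degree * range of the variable (an assumption)."""
    if not 0.0 <= degree <= 1.0:
        raise ValueError("degree must be between 0 and 1")
    if rng is None:
        rng = random.Random()
    spread = (max(values) - min(values)) or 1.0  # avoid a zero scale
    jittered = []
    for v in values:
        if kind == "uniform":
            noise = rng.uniform(-1.0, 1.0)
        elif kind == "normal":
            noise = rng.gauss(0.0, 1.0)
        else:
            raise ValueError("kind must be 'uniform' or 'normal'")
        jittered.append(v + degree * spread * noise)
    return jittered
```

A degree of 0 leaves the data unchanged; small positive degrees separate tied values enough to reveal how many points share a location.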

  23. Color Schemes • Tools >> Color Schemes • Select a color scheme and use it to color points • The number of colors in the selected color scheme cannot be less than the number of colors currently in use • The colors placed first in the color scheme are used first to color data points • Four types of color schemes • Diverging: used when the midpoint of the variable has an important meaning • Sequential: used to highlight a continuous progression of values (for continuous variables) • Spectral: uses different colors of the spectrum (3-11 colors) • Qualitative: used for categorical variables

  24. Automatic Brushing • Tools >> Automatic Brushing • Paints data according to the values of a specific variable • Select the variable • Sliders show the values of the chosen variable that define the boundaries between colors (adjust them by moving the sliders) • The # of points in each color is displayed • Define bin boundaries • “constant bin width”: the range (difference between the max. and min. values) of each bin is (almost) the same • “constant bin count”: the # of points in each bin is (almost) the same • Control how the display responds as the sliders are moved: continuously (only when the data size is small) or on mouse release
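The difference between the two binning rules is easiest to see on skewed data. The sketch below is an illustration with hypothetical function names, not GGobi's implementation; it computes k - 1 boundaries under each rule and assigns every value a color index.

```python
import bisect

def equal_width_boundaries(values, k):
    """Boundaries that split the range [min, max] into k bins of equal width."""
    lo, hi = min(values), max(values)
    return [lo + i * (hi - lo) / k for i in range(1, k)]

def equal_count_boundaries(values, k):
    """Boundaries that put (almost) the same number of points in each of k bins."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // k] for i in range(1, k)]

def assign_bins(values, boundaries):
    """Color index of each value: the number of boundaries at or below it."""
    return [bisect.bisect_right(boundaries, v) for v in values]

values = [1, 2, 3, 4, 100, 200]
by_width = assign_bins(values, equal_width_boundaries(values, 3))
by_count = assign_bins(values, equal_count_boundaries(values, 3))
```

With these values, constant bin width leaves four of the six points in the first color, while constant bin count places two points in each color.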

  25. Select Subset of Data • Tools >> Case Subsetting and Sampling • Random sample without replacement: specify the # of records to be randomly sampled • Consecutive block: specify the first record and the # of records in the block • Limits: use the limits defined by the user in the variable manipulation table to define the subset
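The first two subsetting rules can be sketched as follows, assuming records are numbered from 1 as on the slides. The function names are hypothetical, and clipping a block at the end of the data is an assumption about edge behavior.

```python
import random

def random_subset(n_records, sample_size, rng=None):
    """Record numbers (1-based) of a random sample drawn without replacement."""
    if rng is None:
        rng = random.Random()
    return sorted(rng.sample(range(1, n_records + 1), sample_size))

def consecutive_block(n_records, first, size):
    """Record numbers (1-based) of a block of `size` records starting at `first`,
    clipped at the last record (an assumption about edge behavior)."""
    return list(range(first, min(first + size, n_records + 1)))
```

For example, `consecutive_block(10, 8, 5)` selects only records 8 to 10, because the data end before the block does.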

  26. Select Subset of Data (Cont.) • Every nth case: the first record and every (1 + kN)th record (k = 1, 2, …; N: the interval size) are sampled • Sticky label: all cases with “sticky” labels (created through identification) are in the subset • Row label: type in a string, and specify where it should fall (or not fall) in the record labels and whether matching cases should be included or ignored
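The every-nth-case rule can be sketched under the assumption that it selects records 1, 1 + N, 1 + 2N, and so on, with records numbered from 1. The function name and the `first` parameter are hypothetical.

```python
def every_nth(n_records, interval, first=1):
    """Record numbers first, first + N, first + 2N, ... up to n_records,
    where N is the interval size (records are numbered from 1)."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return list(range(first, n_records + 1, interval))

subset = every_nth(10, 3)
```

Here `subset` holds records 1, 4, 7 and 10 of a ten-record dataset.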

  27. Save Display • Tools >> Save Display Description • Saves the description of the display as an XML file
