Tile-based parallel coordinates and its application in financial visualization. Jamal Alsakran, Ye Zhao, Kent State University, Department of Computer Science, Kent, OH, and Xinlei Zhao, Kent State University, Department of Finance, Kent, OH
Tile-based parallel coordinates and its application in financial visualization Jamal Alsakran, Ye Zhao Kent State University, Department of Computer Science, Kent, OH and Xinlei Zhao Kent State University, Department of Finance, Kent, OH Office of the Comptroller of the Currency, Washington, USA
Motivation • Visual clutter weakens, and can even negate, the effectiveness of parallel coordinates as the data size increases • Interactive visualization lets users gain wider insight into the data • Financial data analysis is a significant application domain for visual analytics
Background • Johansson et al. (05, 06) proposed high-precision textures to represent the data and first introduced an opacity transfer function to reveal structures in the data • Zhou et al. (08) proposed energy minimization to perform visual clustering, using transfer functions to assign opacity and colors to different clusters • For financial data: ThemeRiver, Growth Matrix, pixel-based techniques, etc.
Tile-based Parallel Coordinates • The parallel coordinates plotting area defines an image, I(W,H), with width W and height H • Each data item q is projected as a polyline onto the image I(W,H) • For each fragment I(x,y), where 0 ≤ x < W and 0 ≤ y < H, we compute the number of polylines intersecting it, denoted D(x,y) • The result is a polyline-intersection density image D(W,H).
Tile-based Parallel Coordinates • Tile-based PC generalizes the traditional pixel-based view of plotting by defining each fragment as a rectangular region of the image space with a user-specified size • A classical PC plot is the special case in which each fragment is exactly one pixel of the image space
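The density image described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `tile_density`, the polyline representation (a list of `(x, y)` points in image coordinates), and the dense-sampling intersection test are all assumptions made for the example. Sampling once per pixel step approximates the true line-tile intersection and may miss a tile that a line only clips at a corner.

```python
def tile_density(polylines, width, height, tile_w, tile_h):
    """Count, for each tile, how many polylines pass through it.

    Hypothetical helper: polylines are lists of (x, y) points with
    0 <= x < width and 0 <= y < height. Returns a rows x cols grid.
    """
    cols = -(-width // tile_w)    # ceiling division
    rows = -(-height // tile_h)
    density = [[0] * cols for _ in range(rows)]
    for line in polylines:
        touched = set()           # each polyline counts once per tile
        for (x0, y0), (x1, y1) in zip(line, line[1:]):
            # Sample roughly once per pixel along the segment (approximation).
            steps = int(max(abs(x1 - x0), abs(y1 - y0))) + 1
            for i in range(steps + 1):
                t = i / steps
                x = x0 + t * (x1 - x0)
                y = y0 + t * (y1 - y0)
                col = min(int(x) // tile_w, cols - 1)
                row = min(int(y) // tile_h, rows - 1)
                touched.add((row, col))
        for row, col in touched:
            density[row][col] += 1
    return density


# Two made-up polylines on an 8 x 8 image.
lines = [[(0, 0), (7, 0)], [(0, 0), (7, 7)]]
coarse = tile_density(lines, 8, 8, tile_w=4, tile_h=4)   # 2 x 2 tiles
classic = tile_density(lines, 8, 8, tile_w=1, tile_h=1)  # one pixel per tile
```

Setting `tile_w = tile_h = 1` reproduces the classical per-pixel density, while larger tiles aggregate nearby lines into the same fragment.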
Tile-based Parallel Coordinates [Figure: image of width W and height H with axes X and Y, showing a fragment I(x,y)]
Color and Opacity TFs • Transfer functions are employed to assign local optical attributes according to the density values • For each fragment I(x,y), we define four transfer functions TF to determine the three color elements, R, G, B, and the opacity, O, from its density value D(x,y) • The histogram of the densities is plotted to facilitate the manipulation of the transfer functions
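As a rough illustration of the four transfer functions, the sketch below maps each density value to red, green, blue, and opacity through piecewise-linear curves and builds the density histogram. The control points, the names `make_transfer_function` and `shade`, and the piecewise-linear form are assumptions for the example; the slides do not specify the shape of the transfer functions.

```python
from collections import Counter


def make_transfer_function(control_points):
    """Piecewise-linear curve through (density, value) pairs.

    Assumes at least two control points with distinct densities.
    """
    pts = sorted(control_points)

    def tf(d):
        if d <= pts[0][0]:
            return pts[0][1]
        if d >= pts[-1][0]:
            return pts[-1][1]
        for (d0, v0), (d1, v1) in zip(pts, pts[1:]):
            if d0 <= d <= d1:
                return v0 + (v1 - v0) * (d - d0) / (d1 - d0)

    return tf


def shade(density, tf_r, tf_g, tf_b, tf_o):
    """Apply the four transfer functions to every tile of a density grid."""
    return [[(tf_r(d), tf_g(d), tf_b(d), tf_o(d)) for d in row]
            for row in density]


# Illustrative control points (made up): sparse tiles are faint blue,
# dense tiles are opaque red.
red = make_transfer_function([(0, 0.0), (10, 1.0)])
green = make_transfer_function([(0, 0.0), (10, 0.0)])
blue = make_transfer_function([(0, 1.0), (10, 0.0)])
opacity = make_transfer_function([(0, 0.0), (1, 0.2), (10, 1.0)])

density = [[2, 1], [0, 1]]
rgba = shade(density, red, green, blue, opacity)

# Histogram of density values, used to guide transfer-function editing.
histogram = Counter(d for row in density for d in row)
```

Editing the control points while watching the histogram is one way to emphasize either the dense main trend or the sparse outliers.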
Color and Opacity TFs [Figure: histogram of density values; axes: density, occurrence]
Fast Computing of Line-Tile Intersection • Immediate visual feedback while users continuously change the tile size is crucial for interactivity • A fast line-rasterization algorithm (the Bresenham algorithm) is employed • To fully exploit the Bresenham algorithm, we perform a coordinate transformation that scales each tile to one pixel
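The idea of scaling each tile to one pixel and then rasterizing can be sketched as follows. This is an assumption-laden illustration: `bresenham` is the standard integer line algorithm, while `tiles_crossed` and its integer-division scaling are hypothetical choices for the example, since the slides do not give the exact transformation. Bresenham selects one cell per step along the major axis, so the result approximates the exact set of intersected tiles.

```python
def bresenham(x0, y0, x1, y1):
    """Standard integer Bresenham line: grid cells from (x0, y0) to (x1, y1)."""
    cells = []
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        cells.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:      # step along x
            err += dy
            x0 += sx
        if e2 <= dx:      # step along y
            err += dx
            y0 += sy
    return cells


def tiles_crossed(p0, p1, tile_w, tile_h):
    """Scale endpoints so each tile becomes one cell, then rasterize.

    Hypothetical helper: returns (col, row) tile indices for one segment.
    """
    c0, r0 = int(p0[0] // tile_w), int(p0[1] // tile_h)
    c1, r1 = int(p1[0] // tile_w), int(p1[1] // tile_h)
    return bresenham(c0, r0, c1, r1)


# A segment on a 400-pixel-wide plot, rasterized at two tile sizes.
fine = tiles_crossed((0, 0), (399, 199), tile_w=1, tile_h=1)
coarse = tiles_crossed((0, 0), (399, 199), tile_w=100, tile_h=100)
```

Because the work per segment is proportional to the number of tiles along its longer axis, larger tiles make each update cheaper, which fits the goal of immediate feedback while the tile size changes.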
Example [Figure: original plot compared with tile-based plots at # tiles = 450, # tiles = 150, and # tiles = 20]
Case Study: Mutual Funds • A mutual fund allows a group of investors to pool their money and invest together • Our study covers 5785 funds • Each data item represents one mutual fund, whose characteristics are investigated to find their correlation with the annual return • The study examines the most significant characteristics, including total net asset size, cash holdings, front-end load, rear-end load, expense ratio, and turnover
Front Load vs. Return [Figure: panels with # of tiles = 100 and # of tiles = 20]
Turnover vs. Return with Outliers • The method easily accommodates emphasized outliers together with the main trend • It emphasizes crucial data items while keeping the whole data set as a background view • Outliers can be compared more easily with mainstream data
Analyzing Statistical Regression with Visualization • Tile-based PC is used to visually analyze the performance of a traditional statistical method widely used by financial analysts • The method is the standard linear regression model, which assumes a linear relation between the explanatory variables and the dependent variable: Estimated return = coefficient * characteristic + intercept • Comparison shows that our method is more informative
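For reference, the single-variable form of the regression model on this slide can be fitted with ordinary least squares as sketched below. The function name `fit_line` and the sample numbers are made up for illustration and are not taken from the mutual fund data set; the slides do not say which fitting procedure was used.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = coef * x + intercept (one variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    coef = sxy / sxx
    intercept = mean_y - coef * mean_x
    return coef, intercept


# Made-up characteristic values and returns, for illustration only.
characteristic = [0.0, 1.0, 2.0, 3.0]
actual_return = [1.0, 3.0, 5.0, 7.0]

coef, intercept = fit_line(characteristic, actual_return)
estimated_return = [coef * x + intercept for x in characteristic]
```

Plotting `estimated_return` and `actual_return` as two separate parallel coordinates views is one way to compare what the regression predicts against the real distribution.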
Analyzing Statistical Regression with Visualization [Figure: two panels, "Regression Data" and "Real Data"]
Full Attributes Visualization with Outliers • The red polyline represents the best performer, Dreyfus Premier Greater China B (DPCBX), which produced an 85% return for investors • The purple polyline is the second-best mutual fund, Old Mutual Clay Finlay China C (OMNCX) • The best performers' achievements in 2006 have no direct relation to their fund properties and management activities
Conclusion • Novel tile-based density and transfer functions for visual clutter reduction • The tile-based parallel coordinates technique improves performance, gives users more control, and promotes visual understanding • Visual analytics results on a financial data set of 2006 U.S. mutual funds illustrate the potential of the method in financial economics