Name That Cluster: Text vs. Graphics

Presentation Transcript


  1. Name That Cluster: Text vs. Graphics Shuang Wu REU-DIMACS, 2010 Mentor: James Abello

  2. Talk Outline • Project description • Our research project Input: time data recorded from the ‘Name That Cluster’ web page. Output: statistical results on participants’ different behaviors when using three interfaces. • Collected Statistics • Conclusions

  3. Description • For a pre-computed collection of search engine queries, users select for each query one out of three interfaces: Textual, Graphical, and Hybrid. • The evaluation process consists of exploring the clusters associated with each query, naming the corresponding clusters, and selecting cluster ratings (ClusterFitRatings and Name Ratings). • The ClusterFitRatings are on a scale from -1 to 4 and the Name Ratings are on a scale from -1 to 2. Note: -1 means that a participant didn’t give a rating. • The collected statistics are: Exploration Times, Naming Times, Cluster Rating Times, Name Rating Times, ClusterFitRatings, and Name Ratings.

  4. Statistics Data The raw data collected online is: • Userid • QueryString: the evaluated query • ClusterNum: the evaluated cluster in that query • Name: name/description/summary given to the cluster • Timestamp: server date/time at which the evaluation was written to the database
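As an illustration, here is a minimal sketch of one such raw record as a Python structure. The class and field names are assumptions for exposition, not the study's actual code.

```python
# Illustrative sketch of the raw-data schema described above; names are
# assumptions, not the study's actual implementation.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Evaluation:
    userid: str           # participant identifier
    query_string: str     # the evaluated query
    cluster_num: int      # the evaluated cluster within that query
    name: str             # name/description/summary given to the cluster
    timestamp: datetime   # server date/time when the evaluation was stored
```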

  5. General information from the raw data • 440 clusters were evaluated in the Textual interface, 338 in the Graphical interface, and 378 in the Hybrid interface. • In the following analysis we used the Exploration Time and the Evaluation Time, defined as the sum of the Naming time, ClusterFitRating time, and Name Rating time. • Notation: Ex(T), Ex(G), Ex(H) denote Exploration time per interface; T, G, H denote Evaluation time per interface; NT(T), NT(G), NT(H) denote Naming time per interface.
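The Evaluation Time definition above amounts to a simple sum; a one-line sketch (the function and argument names are illustrative):

```python
# Evaluation time = Naming time + ClusterFitRating time + Name Rating time,
# per the definition on this slide. Names are illustrative.
def evaluation_time(naming_time, cluster_fit_rating_time, name_rating_time):
    return naming_time + cluster_fit_rating_time + name_rating_time
```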

  6. Statistics collected on a per-cluster basis We removed the outliers. Note: we treated data points more than 3.5 standard deviations from the mean as outliers.
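A minimal sketch of the stated outlier rule, assuming NumPy is available; the function name and the use of the population standard deviation are illustrative choices:

```python
# Drop observations more than 3.5 standard deviations from the mean,
# per the rule stated on the slide. A sketch, not the study's code.
import numpy as np

def drop_outliers(times, k=3.5):
    times = np.asarray(times, dtype=float)
    z = np.abs(times - times.mean()) / times.std()
    return times[z <= k]
```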

  7. Statistics Tests Two-sample t-test: • Tests for a difference in the means of two samples. • Null Hypothesis: there is no difference in the two means. vs. Alternative Hypothesis: the mean of the first sample is larger/smaller than the mean of the second sample. • Reject the Null Hypothesis if the P-value is less than .05. ANOVA F-test: • Tests for a difference in means among three or more samples. • Null Hypothesis: all means are equal. vs. Alternative Hypothesis: at least one of the means is different. • Reject the Null Hypothesis if the P-value is less than .05.
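A hedged sketch of both tests using scipy.stats (the tool actually used in the study is not stated); here t, g, and h stand for arrays of per-cluster times for the Textual, Graphical, and Hybrid interfaces:

```python
# Sketch of the two tests described on this slide, using scipy.stats.
# Variable names are illustrative assumptions.
from scipy import stats

def compare_interfaces(t, g, h, alpha=0.05):
    # Two-sample t-test: difference in means of two samples.
    t_stat, p_ttest = stats.ttest_ind(t, g, equal_var=False)
    # ANOVA F-test: are all three interface means equal?
    f_stat, p_anova = stats.f_oneway(t, g, h)
    # True means reject the corresponding null hypothesis at level alpha.
    return p_ttest < alpha, p_anova < alpha
```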

  8. Statistics Results After a series of t-tests and ANOVA F-tests we obtained the following results. • Exploration time: there is no difference in the average Exploration times per interface. • Naming time: the Textual interface has the largest mean Naming time. • Evaluation time: the Graphical interface has a larger mean Evaluation time than the Textual and Hybrid interfaces.

  9. ClusterFitRatings vs. Evaluation/Naming times We wanted to see if there was a relationship between ClusterFitRatings and Evaluation times or Naming times for the cluster collection.

  10. Statistics Results

  11. Name Ratings vs. Evaluation/Naming times We also wanted to see if there was a relationship between Name Ratings and Evaluation times or Naming times for the cluster collection.

  12. Statistics Results

  13. Statistics Results According to the results from the previous four pages and a regression test (a test for a linear relationship between a response variable and an explanatory variable), we made the following observations. • When participants gave ClusterFitRating = 4, they had shorter mean Evaluation and Naming times than for the other ClusterFitRatings, in all interfaces. • In all interfaces, participants either had shorter mean Evaluation and Naming times when they gave a Name Rating of -1 or 2 than when they gave other ratings, or there was no significant time difference. • There are linear correlations between ClusterFitRatings and Name Ratings in all interfaces.
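A sketch of such a regression test using scipy.stats.linregress, assuming the response variable is a vector of times and the explanatory variable a vector of ratings; the names and the choice of library are illustrative:

```python
# Test for a linear relationship between an explanatory variable (e.g.
# ClusterFitRating) and a response variable (e.g. Evaluation time).
# A sketch under assumed names, not the study's actual code.
from scipy import stats

def linear_relation(ratings, times, alpha=0.05):
    result = stats.linregress(ratings, times)
    # A small p-value on the slope indicates a linear relationship.
    return result.slope, result.rvalue, result.pvalue < alpha
```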

  14. Statistics collected on a per-query basis We wanted to see if there was a per-query variation of task time among the three interfaces. To do this, we grouped the queries that were evaluated with all three interfaces by different users. There were 16 queries that were evaluated by different users with all 3 interfaces. For each such query we tested for the difference.

  15. Note: * indicates the existence of outliers; M(Ex(T)), etc. denote the means of Exploration time per interface; M(T), etc. denote the means of Evaluation time per interface.

  16. Statistics Results According to the results from the last two pages, we made the following observations for these 16 query user triples. • The Textual interface has shorter mean Evaluation and Exploration times than the Graphical and Hybrid interfaces. • The difference in the average Evaluation and Exploration times between the Textual and Graphical interfaces is larger than the difference between the Textual and Hybrid interfaces, and larger than the difference between the Graphical and Hybrid interfaces.

  17. Statistics Process cont. To see if there is an interface with the shortest Exploration and Evaluation times for these 16 qualified queries, we found the minimum number of triples over all such queries, in order to best deal with the leftovers (the data remaining after grouping into triples). After considering the number of triples per query and the outliers in these data, we set this minimum number to five.

  18. This is part of a table with five randomly selected triples from each query.
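A hedged sketch of how five random triples per query could be drawn, assuming the triples are rows of a pandas DataFrame with a 'query' column (the column and function names are assumptions, not the study's actual code):

```python
# Randomly select up to n triples per query from a DataFrame of triples.
# Assumes a 'query' column; names are illustrative.
import pandas as pd

def sample_triples(triples: pd.DataFrame, n: int = 5, seed: int = 0) -> pd.DataFrame:
    return (triples.groupby("query", group_keys=False)
                   .apply(lambda g: g.sample(n=min(n, len(g)), random_state=seed)))
```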

  19. Statistics Results After a series of t-tests, ANOVA F-tests, and regression tests, we obtained the following results for this tripled set of 16 queries. • There is no difference in the means of Exploration times and Evaluation times for each interface. • There exist linear correlations between Exploration times and Evaluation times in the Graphical and Hybrid interfaces, but not in the Textual interface.

  20. Overall Conclusions • The Textual interface has the largest mean Naming time. • The Graphical interface has the largest mean Evaluation time. • Participants who gave the highest ClusterFitRating had shorter mean Evaluation and Naming times in all interfaces. • There exists a linear correlation between ClusterFitRatings and Name Ratings in all interfaces, and a linear correlation between Exploration times and Evaluation times in the Graphical and Hybrid interfaces.

  21. References • Name That Cluster online survey, http://gem1.rutgers.edu/userstudy/login.php • Abello, J., Schulz, H., Gaudin, B., and Tominski, C. (2007). Name That Cluster - Text vs. Graphics, IEEE InfoVis Conference, Sacramento, November 2007. • Ramsey, Fred L., The Statistical Sleuth: A Course in Methods of Data Analysis, Duxbury/Thomson Learning, 2002.

  22. Thank you THE END
