
Search, Text Mining, Web Site Usability


Presentation Transcript


  1. Search, Text Mining, Web Site Usability. Marti Hearst, SIMS. UCB CS Research Fair

  2. BAILANDO Projects: Better Access to Information using Language Analysis and Novel Dynamic Organizations

  3. Current BAILANDO Projects
     • CHA-CHA & FLAMENCO:
       • Better Search Interfaces
     • LINDI:
       • UI support for Search
       • Text Data Mining
     • TANGO:
       • Automated Web Site Usability

  4. Search UIs: Combine Browsing & Search; Place Search Results in Context; Large Category Hierarchies

  5. Cha-Cha Students: Mike Chen, Jamie Laflen, Jason Hong, Jimmy Lin, Shiang Chen

  6. Medical Category Hierarchy

  7. DynaCat (Pratt, Hearst, & Fagan 99)

  8. DynaCat Study
     • Design
       • Three queries
       • 24 cancer patients
       • Compared three interfaces
         • ranked list, clusters, categories
     • Results
       • Participants strongly preferred categories
       • Participants found more answers using categories
       • Participants took the same amount of time with all three interfaces
     • Similar results were found in another study by Chen and Dumais (CHI 2000)

  9. Cat-a-Cone Interface (Hearst & Karadi 97)

  10. FLAMENCO: Improving Search via Large Category Hierarchies
      • How to show intersections across category types?
      • How to preview related categories in a user-tailored, dynamic manner?

  11. Text Data Mining: Relationships between pieces of information in documents can reveal new, previously unknown facts.

  12. Imagine: You are a medical researcher. Your patient has spinal inflammation, numbness in fingers, low TC levels, and negative results for all tests. How can you help her?

  13. Idea: A new way of searching text. Link pieces of information together to formulate hypotheses …

  14. LINDI: Linking Information for New DIscoveries
      • Three main parts
        • Search UI for building and reusing hypothesis-seeking strategies.
        • Statistical language analysis techniques for interpreting the text.
        • Backend for interfacing with various databases and translating different formats.

  15. Gathering Evidence: Spinal Inflammation; Numbness in fingers; Low TC Levels

  16. Gathering Evidence: Find diseases associated with each: Spinal Inflammation; Numbness in fingers; Low TC Levels

  17. Supporting Cascaded Search Operations: Spinal Inflammation; Numbness in fingers; Low TC Levels
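
One plausible reading of the "gathering evidence" and "cascaded search" slides is that each finding is searched separately and the result sets are then narrowed step by step. The Python sketch below illustrates that reading only; it is not taken from the LINDI system. The `ASSOCIATIONS` table, the `search` and `cascade` functions, and the `condition_*` labels are all hypothetical placeholders, not medical data.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical index from a finding to the conditions associated with it.
# In a real system each lookup would be a literature search; the labels
# below are placeholders, not medical facts.
ASSOCIATIONS: Dict[str, Set[str]] = {
    "spinal inflammation": {"condition_A", "condition_B", "condition_C"},
    "numbness in fingers": {"condition_B", "condition_C", "condition_D"},
    "low TC levels": {"condition_C", "condition_E"},
}


def search(finding: str) -> Set[str]:
    """Stand-in for one search step: conditions linked to a finding."""
    return ASSOCIATIONS.get(finding, set())


def cascade(findings: List[str]) -> List[Tuple[str, List[str]]]:
    """Run one search per finding and keep the running intersection."""
    candidates: Optional[Set[str]] = None
    steps = []
    for finding in findings:
        hits = search(finding)
        # The first search seeds the candidates; later searches narrow them.
        candidates = hits if candidates is None else candidates & hits
        steps.append((finding, sorted(candidates)))
    return steps


steps = cascade(["spinal inflammation", "numbness in fingers", "low TC levels"])
for finding, remaining in steps:
    print(f"{finding}: {remaining}")
```

Recording the candidates after every step, rather than only the final intersection, keeps the intermediate evidence visible, which seems to be the point of letting a user build and reuse a search strategy.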

  18. (No text on this slide.)

  19. New Language Analysis
      • First use category labels to retrieve candidate documents
      • Then use language analysis to detect causal relationships between concepts
        • Title: Magnesium deficiency implicated in increased stress levels.
        • Interpretation: <nutrient><reduction> related-to <increase><symptom>
      • Use these to find relationships and formulate hypotheses
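
To make the title-to-interpretation step concrete, here is a deliberately simple Python sketch that reproduces the slide's example with a hand-written word list. This is an assumption-laden toy: the slides describe statistical language analysis, which this keyword matching is not, and `LEXICON`, `RELATION_CUES`, `tag`, and `interpret` are hypothetical names invented for illustration.

```python
import re
from typing import List, Optional

# Hypothetical mini-lexicon mapping words to concept tags. A real system
# would rely on category labels and statistical analysis instead.
LEXICON = {
    "magnesium": "nutrient",
    "deficiency": "reduction",
    "increased": "increase",
    "stress": "symptom",
}

# Hypothetical phrases treated as signalling a relationship.
RELATION_CUES = ("implicated in", "associated with", "linked to")


def tag(text: str) -> List[str]:
    """Return the concept tag of each known word, in order of appearance."""
    words = re.findall(r"[a-z]+", text.lower())
    return [LEXICON[word] for word in words if word in LEXICON]


def interpret(title: str) -> Optional[str]:
    """Split a title at the first relation cue found and tag each side."""
    lowered = title.lower()
    for cue in RELATION_CUES:
        if cue in lowered:
            left, right = lowered.split(cue, 1)
            lhs = "".join(f"<{t}>" for t in tag(left))
            rhs = "".join(f"<{t}>" for t in tag(right))
            return f"{lhs} related-to {rhs}"
    return None  # no relation cue, so no interpretation is produced


print(interpret("Magnesium deficiency implicated in increased stress levels."))
```

The output has the same template shape as the interpretation on the slide; abstracting titles into such templates is what would let relationships from different documents be compared and chained.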

  20. Statistical Semantic Parsing
      • Modern statistical techniques
        • Mainly applied to syntactic structure
      • Probabilistic knowledge representation
        • Represent hypotheses with different degrees of certainty.

  21. Automating Assessment of Web Site Usability

  22. Why Worry?
      • Problem: IBM's extranet
        • Heavy use of help and search
        • Unhappy users
      • Solution
        • Massive web site redesign
        • Focus on info-organization, not the purchasing process.
        • Cost: "in the millions"
      • Results
        • Not announced or trumped up
        • Use of "help" decreased 84%
        • Sales increased 400%

  23. Web TANGO: Tool for Assessing NaviGation & Organization
      • Goal: automated support for comparing design alternatives
      • How: Assess usability of the information architecture
        • Approximate people’s information-seeking behavior (Monte Carlo simulation)
        • Output quantitative usability metrics
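
As a rough illustration of how a Monte Carlo simulation can turn an information architecture into quantitative metrics, the Python sketch below sends many simulated visitors through a small link graph and reports how often they reach a target page and in how many clicks. Everything here is an assumption made for illustration: the `SITE` graph is invented, the visitors click uniformly at random (a far cruder behavior model than a real tool would need), and `simulate_visit` and `estimate_usability` are hypothetical names, not part of Web TANGO.

```python
import random
from typing import Dict, List, Optional, Tuple

# Hypothetical site structure: page -> pages it links to.
SITE: Dict[str, List[str]] = {
    "home": ["products", "support", "about"],
    "products": ["home", "laptops", "phones"],
    "support": ["home", "faq", "contact"],
    "about": ["home"],
    "laptops": ["products"],
    "phones": ["products"],
    "faq": ["support"],
    "contact": ["support"],
}


def simulate_visit(site, start: str, target: str, max_clicks: int,
                   rng: random.Random) -> Optional[int]:
    """Simulate one visitor who follows links uniformly at random.

    Returns the number of clicks used to reach the target, or None if the
    visitor gives up after max_clicks.
    """
    page = start
    for clicks in range(1, max_clicks + 1):
        page = rng.choice(site[page])
        if page == target:
            return clicks
    return None


def estimate_usability(site, start: str, target: str, trials: int = 10_000,
                       max_clicks: int = 10, seed: int = 0) -> Tuple[float, float]:
    """Monte Carlo estimate of (success rate, mean clicks when successful)."""
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    outcomes = [simulate_visit(site, start, target, max_clicks, rng)
                for _ in range(trials)]
    successes = [c for c in outcomes if c is not None]
    success_rate = len(successes) / trials
    mean_clicks = sum(successes) / len(successes) if successes else float("nan")
    return success_rate, mean_clicks


rate, clicks = estimate_usability(SITE, "home", "faq")
print(f"success rate: {rate:.2f}, mean clicks when successful: {clicks:.1f}")
```

Running the same estimate on two alternative link graphs for the same content would give the kind of side-by-side comparison of design alternatives that the slide names as the goal.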

  24. Guidelines
      • There are many usability guidelines
      • A survey of 21 sets of web guidelines found little overlap (Ratner et al. 96)
      • Why?
        • Our hypothesis: not empirically validated
      • So … let’s figure out what works!

  25. An Empirical Study: Which features distinguish well-designed web pages?

  26. Methodology
      • Data collection
        • 1108 pages
        • 163 sites
        • 3 levels per site
      • 14 metrics
        • About 85% accurate
        • Text cluster and text positioning counts less accurate

  27. Metrics

  28. Preliminary Results
      • Linear regression to predict Webby judges' ratings
        • Top 30% vs bottom 30%
      • Prediction accuracy:
        • 72% if categories not taken into account
        • 83% if categories assessed separately
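
The general idea of using linear regression to separate top-rated from bottom-rated pages can be sketched in a few lines of Python. This is a toy under stated assumptions, not the study's analysis: it uses a single made-up metric instead of the 14 page metrics, the `pages` data is invented, accuracy is measured on the same data used for fitting, and `fit_line` and `predict_top` are hypothetical helper names.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_top(x: float, intercept: float, slope: float,
                threshold: float = 0.5) -> bool:
    """Predict 'top-rated' when the fitted value reaches the threshold."""
    return intercept + slope * x >= threshold


# Made-up data: (value of one page metric, 1 = top group, 0 = bottom group).
pages = [(120, 0), (150, 0), (200, 0), (260, 1),
         (310, 0), (340, 1), (400, 1), (450, 1)]
xs = [float(x) for x, _ in pages]
ys = [float(y) for _, y in pages]

intercept, slope = fit_line(xs, ys)
correct = sum(predict_top(x, intercept, slope) == bool(y) for x, y in pages)
accuracy = correct / len(pages)
print(f"accuracy on the toy data: {accuracy:.0%}")
```

Fitting one such model per site category, instead of a single model for all pages, is one way to read the difference between the two accuracy figures reported on the slide.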

  29. Goals
      • Create empirical foundations for what is still guesswork
      • Next step: A free online tool
      • Long term goal: A Monte Carlo simulator for comparing potential designs

  30. For More Information: http://webtango.berkeley.edu, hearst@sims.berkeley.edu
