Partitioning Search-Engine Returned Citations for Proper-Noun Queries

Presentation Transcript

  1. Partitioning Search-Engine Returned Citations for Proper-Noun Queries
     • Reema Al-Kamha
     • Supported by NSF

  2. The Problem
     • Search engines return too many citations
       • Example: “Bonnie Lake”
       • Google returns around 800 citations
       • Citations ranked best first
       • Many refer to the same object
     • Can we partition by same object?
     • Proper-Noun Queries
       • Discard citations not of the right kind
       • Partition the rest by same object
       • Retain the best-first ranking

  3. “Bonnie Lake” Query to Google

  4. The Interface

  5. “Bonnie Lake” Query Result

  6. Solution
     • Classification
       • Group 1: those of the chosen kind
       • Group 2: those not of the chosen kind
     • Partition
       • Three facets
         • Attributes
         • Links
         • Page Similarity
       • Sub-facets for each facet
       • Confidence Matrix for each sub-facet
       • (Weighted) Mean for each facet
       • Final Confidence Matrix
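
A minimal sketch of how the sub-facet matrices could be combined, not the presenter's implementation. It assumes each confidence matrix is a square list of lists indexed by citation, that a facet's matrix is the element-wise (weighted) mean of its sub-facet matrices, and that the final matrix is the plain mean of the facet matrices; the slide does not say how the final matrix is derived, so that last step and all names and numbers here are hypothetical.

```python
from typing import Dict, List, Optional

Matrix = List[List[float]]


def weighted_mean(matrices: List[Matrix],
                  weights: Optional[List[float]] = None) -> Matrix:
    """Element-wise (weighted) mean of equally sized square matrices."""
    if weights is None:
        weights = [1.0] * len(matrices)  # unweighted mean by default
    total = sum(weights)
    n = len(matrices[0])
    return [
        [sum(w * m[i][j] for w, m in zip(weights, matrices)) / total
         for j in range(n)]
        for i in range(n)
    ]


def final_confidence(facets: Dict[str, List[Matrix]]) -> Matrix:
    """Mean per facet, then an (assumed) plain mean across facets."""
    facet_means = [weighted_mean(sub_facets) for sub_facets in facets.values()]
    return weighted_mean(facet_means)


# Hypothetical confidences for two citations.
facets = {
    "attributes": [[[1.0, 0.8], [0.8, 1.0]]],
    "links": [[[1.0, 1.0], [1.0, 1.0]],   # e.g. "link together"
              [[1.0, 0.0], [0.0, 1.0]]],  # e.g. "common URL prefix"
    "page_similarity": [[[1.0, 0.5], [0.5, 1.0]]],
}
final = final_confidence(facets)
```

With these made-up values the links facet averages to 0.5 for the pair, so the final confidence that the two citations refer to the same object is about 0.6.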

  7. Attributes
     • Attribute(s) (One-to-One): latitude and longitude
     • Single Attribute (Functional Determination): province with a lake’s name
     • Multiple Attributes (Functional Determination): campground name and highway with a lake’s name
     • Attributes (Nonfunctional Determination): country with a lake’s name
     • Distinguishing Attribute: state for a lake

  8. Links
     • Returned citations that link together
     • Returned citations that have a common URL prefix: same host, same file name, or same URL
       • Example of same host:
         http://www.cs.byu.edu/info/dwembley.html
         http://www.cs.byu.edu/info/directory.php
       • Example of same file:
         http://sunsite.unc.edu/javafaq/oldnews.html
         http://helios.oit.unc.edu/javafaq/oldnews.html
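
The host and file comparisons above can be sketched with Python's standard `urllib.parse`. This is an illustration only: the function name `link_subfacets` is hypothetical, and treating "same file" as "same URL path" is an assumption drawn from the slide's example, where the two URLs share `/javafaq/oldnews.html` but not the host.

```python
from urllib.parse import urlparse


def link_subfacets(url_a: str, url_b: str) -> dict:
    """Compare two citation URLs on host, file (path), and whole URL."""
    a, b = urlparse(url_a), urlparse(url_b)
    return {
        # Host names are case-insensitive, so compare them lowercased.
        "same_host": a.netloc.lower() == b.netloc.lower(),
        # Assumption: "file" means the path component of the URL.
        "same_file": a.path == b.path,
        "same_url": url_a == url_b,
    }


host_example = link_subfacets(
    "http://www.cs.byu.edu/info/dwembley.html",
    "http://www.cs.byu.edu/info/directory.php",
)
file_example = link_subfacets(
    "http://sunsite.unc.edu/javafaq/oldnews.html",
    "http://helios.oit.unc.edu/javafaq/oldnews.html",
)
```

For the slide's two examples, `host_example` reports only `same_host` as true and `file_example` reports only `same_file` as true.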

  9. Confidence Matrix for Returned Citations that Link Together [matrix figure not in transcript; surviving labels: 1, 4]

  10. Page Similarity
     • Similarity between each pair of returned citations
     • Similarity between each pair of citation-referenced documents
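
The slides do not name the similarity measure. As a stand-in, the sketch below uses cosine similarity over word counts, a common choice for comparing texts; the measure, the tokenizer, and the function names are all assumptions, not the presenter's method. The same function could be applied to citation snippets or to the text of the referenced pages.

```python
import math
import re
from collections import Counter
from typing import List


def term_counts(text: str) -> Counter:
    """Lowercased word counts; a deliberately simple tokenizer."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine of the angle between the two word-count vectors."""
    ca, cb = term_counts(text_a), term_counts(text_b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def similarity_matrix(texts: List[str]) -> List[List[float]]:
    """Pairwise similarities, with 1.0 on the diagonal."""
    n = len(texts)
    return [
        [1.0 if i == j else cosine_similarity(texts[i], texts[j])
         for j in range(n)]
        for i in range(n)
    ]
```

A matrix built this way is symmetric and holds values between 0 and 1, so it can serve directly as one sub-facet confidence matrix.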

  11. Confidence Matrix for Similarity between two Citation-Referenced Documents

  12. Modified Confidence Matrix for Similarity between two Citation-Referenced Documents

  13. Final Matrix
     • Pairs: 1,4; 3,5; 5,8; 7,8
     • Partition: {1,4} {3,5,7,8} {2} {6}
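
The slide's pairs and resulting groups are consistent with merging transitively: 3,5 and 5,8 and 7,8 chain into {3,5,7,8}. The sketch below reproduces that grouping with a union-find structure. How the presenter actually turns the final matrix into groups is not stated, so the transitive merge and the function name `partition` are assumptions.

```python
from typing import Hashable, Iterable, List, Tuple


def partition(items: Iterable[Hashable],
              same_pairs: Iterable[Tuple[Hashable, Hashable]]) -> List[list]:
    """Group items so that every listed pair lands in the same group."""
    items = list(items)
    parent = {x: x for x in items}

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in same_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Items arrive in rank order, so each group is listed at the position
    # of its best-ranked member and stays ranked internally.
    groups = {}
    for x in items:
        groups.setdefault(find(x), []).append(x)
    return list(groups.values())


groups = partition(range(1, 9), [(1, 4), (3, 5), (5, 8), (7, 8)])
# groups == [[1, 4], [2], [3, 5, 7, 8], [6]]
```

Because the citations are passed in rank order, the best-first ranking is retained both across groups and within each group.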

  14. “Bonnie Lake”—Results

  15. Measurements
     • Classification (percent correctly classified)
     • Number of Partitions (precision and recall)
     • Each Partition (precision and recall)
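
The slide does not define how precision and recall are computed for partitions. One common option, used here as an assumption rather than as the presenter's definition, is pairwise scoring: a pair of citations counts as a positive when both are placed in the same group. The classification measure is plain percent correct. All function names and the gold data are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Set, Tuple


def percent_correct(predicted: Sequence, gold: Sequence) -> float:
    """Percentage of citations whose predicted class matches the gold class."""
    hits = sum(p == g for p, g in zip(predicted, gold))
    return 100.0 * hits / len(gold)


def same_object_pairs(groups: List[list]) -> Set[frozenset]:
    """All unordered pairs of citations that share a group."""
    return {frozenset(p) for g in groups for p in combinations(g, 2)}


def pairwise_precision_recall(predicted: List[list],
                              gold: List[list]) -> Tuple[float, float]:
    """Pairwise precision and recall of a predicted partition."""
    pred, true = same_object_pairs(predicted), same_object_pairs(gold)
    hits = len(pred & true)
    precision = hits / len(pred) if pred else 1.0
    recall = hits / len(true) if true else 1.0
    return precision, recall


# Hypothetical gold partition in which {3,5} and {7,8} are different objects.
predicted = [[1, 4], [3, 5, 7, 8], [2], [6]]
gold = [[1, 4], [3, 5], [7, 8], [2], [6]]
precision, recall = pairwise_precision_recall(predicted, gold)
```

In this made-up case the predicted partition finds every true pair (recall 1.0) but over-merges, so only 3 of its 7 same-group pairs are correct.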

  16. Current Implementation Status
     • Interface
     • Google connection
     • Citation retrieval
     • Page retrieval

  17. Contribution
     • Solve one type of object-identity problem
     • Provide an additional tool for search engine queries