Spring 2008 Progress Report SPR2008PR

Spring 2008 Progress Report SPR2008PR. David Gleich and Ying Wang (with Margot Gerritsen and Amin Saberi too!). Library of Congress, May 27th or May 28th. Alternate titles: "Why LCSH is better than Wikipedia"; "Matching stuff to fluff".

Presentation Transcript


  1. Spring 2008 Progress Report SPR2008PR. David Gleich and Ying Wang (with Margot Gerritsen and Amin Saberi too!). Library of Congress, May 27th or May 28th.

  2. Alternate Titles • Why LCSH is better than Wikipedia • Matching stuff to fluff • A novel quadratic programming framework for the network alignment problem.

  3. Outline • The matching problem and its myriad uses • Parsing Wikipedia and LCSH for all of the data • Theories on subject ontologies (you probably know better)

  4. Last fall Ying, Jeremy, Vinayak, and I spoke to a few of you about the similarities between LCSH and Wikipedia categories. We started working on ways of comparing these databases.

  5. From MARC to GRAPH
     • Concatenate subfields of 1xx tags for nodes.
     • Use 550 and 551 tags for edges.
     • Use 450 and 451 tags for alternate names.
     Example record:
       ...
       150 0 _aKlingon (Artificial language)
       450 0 _atlhIngan (Artificial language)
       550 0 _wg _aLanguages, Artificial
       ...
     Graph nodes shown: Klingon (artificial language); Languages, Artificial
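
The three rules on slide 5 can be sketched in a few lines of Python. This is a minimal sketch, not the authors' parser. It assumes a hypothetical plain-text rendering of the records (one field per line, records separated by a blank line), joins subfields with "--", and skips the `w` control subfield when building labels. The names `parse_field`, `label`, and `records_to_graph` are illustrative only.

```python
import re

# Hypothetical plain-text rendering of authority records (an assumption):
# one field per line, blank line between records.
SAMPLE = """\
150 0 _aKlingon (Artificial language)
450 0 _atlhIngan (Artificial language)
550 0 _wg _aLanguages, Artificial

150 0 _aLanguages, Artificial
"""

# A subfield is "_" + one-character code + text up to the next subfield.
SUBFIELD = re.compile(r"_(\w)(.*?)(?=\s+_\w|$)")

def parse_field(line):
    """Split one field line into (tag, [(subfield code, value), ...])."""
    tag = line[:3]
    return tag, [(c, v.strip()) for c, v in SUBFIELD.findall(line[3:])]

def label(subfields):
    """Concatenate subfields into one label, skipping the 'w' control code.

    The "--" separator and the skipping of 'w' are assumptions.
    """
    return "--".join(v for c, v in subfields if c != "w")

def records_to_graph(text):
    nodes, edges, alt_names = set(), set(), {}
    for record in text.strip().split("\n\n"):
        fields = [parse_field(l) for l in record.splitlines() if l.strip()]
        # 1xx tag: the heading becomes the node.
        heading = next((label(s) for t, s in fields if t.startswith("1")), None)
        if heading is None:
            continue
        nodes.add(heading)
        for tag, subs in fields:
            if tag in ("550", "551"):      # related headings become edges
                edges.add((heading, label(subs)))
            elif tag in ("450", "451"):    # alternate names
                alt_names.setdefault(heading, []).append(label(subs))
    return nodes, edges, alt_names

nodes, edges, alt_names = records_to_graph(SAMPLE)
```

On the sample record this gives two nodes, one edge from "Klingon (Artificial language)" to "Languages, Artificial", and one alternate name for the Klingon heading.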

  6. LCSH Overview (largest connected component). Figure labels: Privacy, Right of; Privacy; Privacy (Jewish Law); Privacy (Islamic Law); Privacy (Canon Law).

  7. Wikipedia to GRAPH. Figure labels: "see also" and "narrower term"; example nodes "Determinants" and "Linear algebra".

  8. Wikipedia ideas • Evaluate LCSH graph vs. WC graph • Try to match LCSH with WC • ... many more ideas ...

  9. MATCHING. Matching means taking a node in LCSH and finding only one node in WC that is a good pair. Most famous matching problem: stable marriage.

  10. Stable Marriage. Figure: diagram with numbered labels (1 to 6) connecting David Gleich, Brad Pitt, Angelina Jolie, and Laura Bofferding. Slide approved by Laura Bofferding, 2008 May 27.
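
For readers who have not seen the stable marriage problem mentioned on slides 9 and 10, the classic Gale-Shapley proposal procedure is sketched below. The slides do not say which procedure the authors had in mind, so this is only a standard textbook illustration. The participants `x`, `y`, `u`, `v` and their preference lists are made up, and the sketch assumes equally many proposers and receivers with complete preference lists.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Return a stable matching {proposer: receiver}.

    Each dict maps a participant to a list of the other side, most
    preferred first. Assumes complete lists and equal group sizes.
    """
    # rank[r][p] = position of proposer p in receiver r's list (lower is better)
    rank = {r: {p: k for k, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next receiver to propose to
    engaged = {}                                  # receiver -> proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(r)
        if current is None:
            engaged[r] = p                 # r was free: accept
        elif rank[r][p] < rank[r][current]:
            engaged[r] = p                 # r prefers p: swap partners
            free.append(current)
        else:
            free.append(p)                 # r rejects p: p tries again later
    return {p: r for r, p in engaged.items()}

# Hypothetical preferences, for illustration only.
proposers = {"x": ["u", "v"], "y": ["u", "v"]}
receivers = {"u": ["y", "x"], "v": ["x", "y"]}
stable = gale_shapley(proposers, receivers)
```

Here both proposers want `u` first, `u` prefers `y`, and so `x` ends up with `v`. No pair would both rather be with each other than with their assigned partners, which is what "stable" means.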

  11. Matching WC and LCSH • LCSH and WC have short text labels; use the labels to come up with a set of potential links. Figure labels: Algebra, Linear algebra, Linear algebra, Linear functions (Graph A, Graph B).
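
One simple way to turn short text labels into potential links, as slide 11 suggests, is to link any two labels that share a word. The slide does not specify the text-matching rule, so the shared-token rule, the `min_shared` threshold, and the assignment of the example labels to Graph A and Graph B below are all assumptions made for illustration.

```python
import re

def tokens(label):
    """Lowercased alphanumeric words of a label."""
    return set(re.findall(r"[a-z0-9]+", label.lower()))

def candidate_links(labels_a, labels_b, min_shared=1):
    """Return {(a, b)} for labels sharing at least min_shared tokens."""
    # Inverted index: token -> labels in B containing it.
    index = {}
    for b in labels_b:
        for t in tokens(b):
            index.setdefault(t, set()).add(b)
    links = set()
    for a in labels_a:
        shared = {}  # label in B -> number of tokens shared with a
        for t in tokens(a):
            for b in index.get(t, ()):
                shared[b] = shared.get(b, 0) + 1
        links.update((a, b) for b, n in shared.items() if n >= min_shared)
    return links

# Labels from the slide; which graph each belongs to is assumed.
graph_a = ["Algebra", "Linear algebra"]
graph_b = ["Linear algebra", "Linear functions"]
links = candidate_links(graph_a, graph_b)
```

With these labels, "Algebra" links only to "Linear algebra", while "Linear algebra" links to both labels in Graph B. Raising `min_shared` to 2 keeps only the exact "Linear algebra" pair.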

  12. Matching with links • How? (Figure: Graph A, Graph B)

  13. Matching without links • Bipartite matching problem/stable marriage • Maximize the cardinality (number of pairs). (Figure: Graph A, Graph B)
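
Maximizing the number of pairs, as on slide 13, is the maximum-cardinality bipartite matching problem. A minimal sketch using the standard augmenting-path method follows. It is a textbook approach chosen for brevity, not necessarily what the authors used, and the example links are hypothetical.

```python
def max_cardinality_matching(links):
    """Return {a: b} pairing as many nodes as possible, one partner each.

    links is an iterable of (a, b) candidate pairs between the two sides.
    """
    adj = {}
    for a, b in sorted(links):
        adj.setdefault(a, []).append(b)
    match_b = {}  # b -> a currently matched to it

    def try_augment(a, seen):
        # Try to find a partner for a, re-routing earlier matches if needed.
        for b in adj.get(a, []):
            if b in seen:
                continue
            seen.add(b)
            if b not in match_b or try_augment(match_b[b], seen):
                match_b[b] = a
                return True
        return False

    for a in adj:
        try_augment(a, set())
    return {a: b for b, a in match_b.items()}

# Hypothetical candidate links between Graph A and Graph B.
example_links = {
    ("Algebra", "Linear algebra"),
    ("Linear algebra", "Linear algebra"),
    ("Linear algebra", "Linear functions"),
}
matching = max_cardinality_matching(example_links)
```

The example shows a weakness of cardinality alone: to match both nodes of Graph A, "Linear algebra" has to be paired with "Linear functions" rather than with its identical label.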

  14. Matching with squares • Enumerate squares • Maximize cardinality and squares. (Figure: a square on nodes i, j, i', j' across Graph A and Graph B)
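
The two steps on slide 14 can be illustrated on a toy instance. The sketch assumes that a square consists of an edge (i, j) in Graph A, an edge (i', j') in Graph B, and the two candidate links (i, i') and (j, j'). The weights `alpha` and `beta` and the exhaustive search are illustrative choices only: the slides do not give the objective's weights, and brute force works only for tiny inputs.

```python
from itertools import combinations

def enumerate_squares(edges_a, edges_b, links):
    """Return pairs of links ((i, i2), (j, j2)) that close a square."""
    ea = {frozenset(e) for e in edges_a}
    eb = {frozenset(e) for e in edges_b}
    return [((i, i2), (j, j2))
            for (i, i2), (j, j2) in combinations(sorted(links), 2)
            if frozenset((i, j)) in ea and frozenset((i2, j2)) in eb]

def is_matching(pairs):
    """True if no node appears in more than one pair."""
    left = [a for a, _ in pairs]
    right = [b for _, b in pairs]
    return len(set(left)) == len(left) and len(set(right)) == len(right)

def best_square_matching(edges_a, edges_b, links, alpha=1.0, beta=1.0):
    """Exhaustively maximize alpha * (pairs) + beta * (squares)."""
    squares = enumerate_squares(edges_a, edges_b, links)
    ordered = sorted(links)
    best, best_score = (), 0.0
    for k in range(1, len(ordered) + 1):
        for subset in combinations(ordered, k):
            if not is_matching(subset):
                continue
            chosen = set(subset)
            n_sq = sum(1 for p, q in squares if p in chosen and q in chosen)
            score = alpha * len(subset) + beta * n_sq
            if score > best_score:
                best, best_score = subset, score
    return set(best), best_score

# Hypothetical toy graphs and candidate links.
edges_a = [("a1", "a2")]
edges_b = [("b1", "b2")]
toy_links = {("a1", "b1"), ("a1", "b2"), ("a2", "b2"), ("a2", "b3")}
best, score = best_square_matching(edges_a, edges_b, toy_links)
```

Three matchings here have two pairs each, but only the one using links (a1, b1) and (a2, b2) also closes a square, so it wins with a score of 3.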

  15. Matching with squares • Bipartite matching: polynomial • Square matching: NP-complete