Pedro DeRose University of Wisconsin-Madison

Presentation Transcript


  1. Pedro DeRose, University of Wisconsin-Madison. The DBLife Prototype System in The Cimple Project on Community Information Management

  2. Community Information Management
     • Numerous Web communities
       • database researchers, movie fans, legal professionals, bioinformatics, etc.
       • enterprise intranets, tech support groups
     • Each community = many data sources + many members
     • Members often want to integrate data, query, and discover community information
       • Any interesting connection between researchers X and Y?
       • Find all citations of this paper on the Web in the past week.
       • What is new in the past 24 hours in the database community?
       • What are the current hot topics?
       • Who has moved where?

  3. Cimple Project @ Wisconsin/Yahoo! Research
     • Structured community portal, driven by extraction + integration + mass collaboration
     • Data sources: researcher homepages, conference pages, group pages, DBworld mailing list, DBLP, Web pages, text documents
     • Services: keyword search, SQL querying, question answering, browse, mining, alert/monitor, news summary
     • Users personalize the system and provide feedback
     [Figure: architecture diagram. Labels include the entities "Jim Gray" and "SIGMOD-04" and the relationship "give-talk".]

  4. The Research Team
     • Core Members: Pedro DeRose, Warren Shen, AnHai Doan, Raghu Ramakrishnan
     • Supporting Members: Fei Chen, Yoonkyong Lee, Doug Burdick, Mayssam Sayyadian, Xiaoyong Chai, Ting Chen

  5. Prototype System: DBLife
     • Integrate data of the DB research community
     • Live at dblife-labs.cs.wisc.edu
     • 1,075 data sources
       • 463 researcher homepages
       • 103 department homepages
       • 54 conference homepages
       • 99 faculty hubs
       • 56 database group pages
       • 203 project homepages
       • 85 colloquia
       • 11 event pages
       • DBWorld
       • DBLP
     • Crawled daily, 11,000+ pages = 160+ MB / day

  6. Information Extraction

  7. Data Integration
     • Raghu Ramakrishnan: co-authors = A. Doan, Divesh Srivastava, ...
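The co-author list on this slide uses the abbreviated mention "A. Doan", while the team slide lists "AnHai Doan", so integration has to decide when two name mentions may refer to the same person. The following is a minimal sketch of one simple heuristic, not the method DBLife actually uses. The function names `normalize` and `names_compatible` and the matching rule itself are assumptions made for illustration.

```python
def normalize(name: str) -> list[str]:
    """Lowercase a name and split it into tokens, dropping periods."""
    return name.replace(".", " ").lower().split()


def names_compatible(a: str, b: str) -> bool:
    """Return True if two person mentions could refer to the same person.

    Hypothetical rule (an assumption, not DBLife's matcher): the last
    tokens must be equal, and the first tokens must either be equal or
    one must be a single-letter initial of the other.
    """
    ta, tb = normalize(a), normalize(b)
    if not ta or not tb or ta[-1] != tb[-1]:
        return False
    if len(ta) == 1 or len(tb) == 1:
        return True  # a bare surname is compatible with any first name
    fa, fb = ta[0], tb[0]
    if fa == fb:
        return True
    # Treat a one-letter first token as an initial of the other name.
    return (len(fa) == 1 and fb.startswith(fa)) or (
        len(fb) == 1 and fa.startswith(fb)
    )


if __name__ == "__main__":
    print(names_compatible("A. Doan", "AnHai Doan"))
    print(names_compatible("A. Doan", "Divesh Srivastava"))
```

A rule this loose over-merges in a large community (two different people can share an initial and a surname), so a real system would need more evidence, such as shared co-authors or affiliations, before merging mentions.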

  8. Resulting ER Graph
     [Figure: entity-relationship graph. Entities: “Proactive Re-optimization”, Pedro Bizarro, Shivnath Babu, David DeWitt, Jennifer Widom, SIGMOD 2005. Edge labels: write (×3), coauthor (×3), advise (×2), PC-member, PC-Chair.]
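An ER graph like this one is what makes questions such as "any interesting connection between researchers X and Y?" answerable. The sketch below stores the graph as labeled edges and finds a shortest connection with breadth-first search. The entity names and relation labels are taken from the slide, but which entities each edge connects is an assumption, because the flattened slide text preserves only the labels. `EDGES`, `build_adjacency`, and `find_connection` are hypothetical names, and this is not the DBLife implementation.

```python
from collections import deque

# Labeled, undirected edges: (entity, relation, entity).
# ASSUMPTION: the endpoints of each edge are illustrative; the slide text
# only preserves the entity names and the relation labels.
EDGES = [
    ("Shivnath Babu", "write", "Proactive Re-optimization"),
    ("Pedro Bizarro", "write", "Proactive Re-optimization"),
    ("David DeWitt", "write", "Proactive Re-optimization"),
    ("Shivnath Babu", "coauthor", "Pedro Bizarro"),
    ("Pedro Bizarro", "coauthor", "David DeWitt"),
    ("Shivnath Babu", "coauthor", "David DeWitt"),
    ("David DeWitt", "advise", "Pedro Bizarro"),
    ("Jennifer Widom", "advise", "Shivnath Babu"),
    ("Jennifer Widom", "PC-Chair", "SIGMOD 2005"),
    ("David DeWitt", "PC-member", "SIGMOD 2005"),
]


def build_adjacency(edges):
    """Map each entity to a list of (relation, neighbor) pairs."""
    adj = {}
    for a, rel, b in edges:
        adj.setdefault(a, []).append((rel, b))
        adj.setdefault(b, []).append((rel, a))
    return adj


def find_connection(edges, start, goal):
    """Return a shortest path as (from, relation, to) steps, or None."""
    adj = build_adjacency(edges)
    if start not in adj or goal not in adj:
        return None
    parent = {start: None}  # entity -> (previous entity, relation)
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for rel, nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = (node, rel)
                queue.append(nxt)
    if goal not in parent:
        return None
    # Walk back from the goal to the start, then reverse the steps.
    path = []
    node = goal
    while parent[node] is not None:
        prev, rel = parent[node]
        path.append((prev, rel, node))
        node = prev
    path.reverse()
    return path


if __name__ == "__main__":
    for step in find_connection(EDGES, "Jennifer Widom", "Pedro Bizarro"):
        print(" -[{1}]- ".format(*step).join([step[0], step[2]]))
```

Breadth-first search returns a connection with the fewest hops, which is only one possible reading of "interesting"; ranking connections by relation type or rarity would be a separate design decision.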

  9. Provide Services

  10. Mass Collaboration: An Example

  11. Summary
      • Community Information Management
        • increasingly crucial problem
      • The Cimple project
        • sample challenges: information extraction, data integration, mass collaboration
        • extends the footprint of DB technologies to Web data
        • develops new DB technologies
      • DBLife prototype
        • more at dblife.cs.wisc.edu, latest features (e.g., wiki) at dblife-labs.cs.wisc.edu
        • research/education tool, community service, benchmark, challenge problem
