
SEMEF: A Taxonomy-Based Discovery of Experts, Expertise and Collaboration Networks

Delroy Cameron. Master's Thesis, Computer Science, University of Georgia. 11/27/2007. Advisor: I. Budak Arpinar. Committee: Prashant Doshi, Robert J. Woods.


Presentation Transcript


  1. SEMEF: A Taxonomy-Based Discovery of Experts, Expertise and Collaboration Networks. Delroy Cameron, Master's Thesis, Computer Science, University of Georgia. Advisor: I. Budak Arpinar. Committee: Prashant Doshi, Robert J. Woods. 11/27/2007

  2. OUTLINE
     • Background
     • Expertise Profiles
     • Ranking Experts
     • Collaboration Networks Expansion
     • Results and Evaluation
     • Conclusion
     • Demo

  3. BACKGROUND
     • Semantic Web
       • What?
         • Extension of current Web
         • Attach Meaning to Data
       • Why?
         • Underutilization of Current Web
         • HTML Limitations
       • Goal
         • Enhance Information Exchange
         • Automatic Information Discovery
         • Interoperability of Services

  4. BACKGROUND
     • Semantic Web
       • Technologies
         • XML
         • RDF/RDFS/OWL
         • URI
         • Ontology

     "David Billington is a Professor of Mathematics"

       <course name="Mathematics">
         <lecturer>David Billington</lecturer>
       </course>

       <lecturer name="David Billington">
         <teaches>Mathematics</teaches>
       </lecturer>

       <teachingOffering>
         <lecturer>David Billington</lecturer>
         <course>Mathematics</course>
       </teachingOffering>

       <rdf:Description rdf:id="mynamespace:Professor_2">
         <rdf:has_name>David Billington</rdf:has_name>
         <rdf:teaches rdf:resource="#Mathematics"/>
       </rdf:Description>

  5. BACKGROUND
     • Semantic Web
       • Common Challenges
         • Entity Disambiguation
         • Ontology Mapping/Alignment
         • Trust/Provenance
         • Semantic Association Discovery
       • Application
         • Social Networks
         • Bio-Informatics
         • National Security
         • GPS Data Mining

  6. BACKGROUND
     • Social Networks
       • What?
         • Connected through Social Relationships
       • Characteristics
         • Clustering Coefficient (connectedness to neighbors)
         • Centrality (average shortest path length)
         • Geodesic (shortest path length)
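The three characteristics listed on this slide can be sketched on a small co-authorship graph. This is a minimal illustration, not code from the thesis: the graph `coauthors`, the function names, and the adjacency-set representation are all assumptions made for the example.

```python
from collections import deque


def geodesic(graph, source, target):
    """Shortest path length (number of edges) between two nodes, or None."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # target is not reachable from source


def average_geodesic(graph, node):
    """Average shortest path length from `node` to every reachable node."""
    dists = [geodesic(graph, node, other) for other in graph if other != node]
    reachable = [d for d in dists if d is not None]
    return sum(reachable) / len(reachable) if reachable else None


def clustering_coefficient(graph, node):
    """Fraction of pairs of neighbors of `node` that are linked to each other."""
    neighbors = list(graph.get(node, ()))
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbors[j] in graph.get(neighbors[i], ())
    )
    return 2.0 * links / (k * (k - 1))


# Hypothetical undirected co-authorship graph stored as adjacency sets.
coauthors = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a", "e"},
    "e": {"d"},
}
```

Breadth-first search is enough here because every co-authorship edge counts as one step; a weighted graph would need a different shortest-path routine.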

  7. BACKGROUND
     • Peer-Review Process
       • What?
         • Review scholarly manuscripts
       • Challenges
         • Slow
         • Conflict of Interest
         • Finding Suitable Reviewers
         • Arbitrary Knowledge Approach
         • Research Diversification
         • Emerging Fields

  8. CONTRIBUTIONS
     • Applicability of Semantics
     • Finding Expertise
       • Fine Levels of Granularity
     • Finding Experts
       • Taxonomy
     • Collaboration Networks
       • Discovery of Unknown Experts

  9. SEMEF
     • SEMantic Expert Finder
     • Finding Expertise (Expertise Profiles)
       • Collecting Expertise
       • Quantifying Expertise
     • Finding (Ranking) Experts
       • w/ and w/o taxonomy
     • Collaboration Networks
       • Geodesic
       • C-Nets

  10. EXPERTISE PROFILES
      • Collecting Expertise
        • Collect all publications
        • Map papers to topics
        • Quantify all papers
      • Publications Dataset
        • DBLP: 473,296 papers (conference/session names, Nov. 2007)
        • ACM, IEEE, Science Direct: 29,454 papers (abstracts/index terms)
        • Combined: 476,299 papers

  11. EXPERTISE PROFILES
      • Collecting Expertise
      • Papers-to-Topics Dataset
        • Combined (476,299)
        • Topics (320)
        • Relationships (676,569)
        • Expertise Profiles (560,792)

  12. EXPERTISE PROFILES
      • Quantifying Expertise
        • Mapping each paper to a distinct value
        • Publication Impact
          • Hector Garcia-Molina (248 papers - 2003)
          • E. F. Codd (49 papers - 2003)
        • Citeseer Impact Statistics (1221 venues)
        • DBLP URIs

  13. EXPERTISE PROFILES
      Figure 1: Expertise Profile. The figure shows author_A with topic weights topic1 (4.50), topic2 (1.86) and topic3 (3.08), and six papers (paper1 to paper6) with impact values 1.54, 1.10, 1.86, 1.86, 1.54 and 1.54.
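The values in Figure 1 are consistent with each topic weight being the sum of the impact values of the papers mapped to that topic. The grouping below (paper1 to paper3 under topic1, paper4 under topic2, paper5 and paper6 under topic3) is inferred from the arithmetic and is an assumption, since the figure's edges did not survive in this transcript.

```latex
\begin{align*}
w(\text{topic1}) &= 1.54 + 1.10 + 1.86 = 4.50 \\
w(\text{topic2}) &= 1.86 \\
w(\text{topic3}) &= 1.54 + 1.54 = 3.08
\end{align*}
```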

  14. RANKING EXPERTS
      • Taxonomy of Topics
        • Session names
        • Conference Names
        • O'CoMMA
        • Paper Abstracts
        • Index Terms
      Figure 2: Taxonomy of Topics (counts shown in the figure: 216, 50, 60, 192, 128, 320)

  15. RANKING EXPERTS
      • Case 1
        • Single Topic without Taxonomy
          • Traverse all Expertise Profiles
          • Sum impact (papers → topics)
      • Case 2
        • Single Topic with Taxonomy
          • Traverse all Expertise Profiles
          • Sum impact (papers → topics, subtopics)
      Prevent Expertise Overestimation: 1) Map 2) Papers to leaf nodes only

  16. RANKING EXPERTS
      • Case 3
        • Array of Topics without Taxonomy
          • Same as Case 2
      • Case 4
        • Array of Topics with Taxonomy
          • Filter input topics
          • Sum impact (papers → topics, subtopics)
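The taxonomy-based cases rely on two steps that the slides only name: expanding a topic to its subtopics and filtering the input topics. The sketch below shows one possible reading. It assumes that "filter input topics" means dropping any input topic that already lies in the subtree of another input topic, so that no subtree is counted twice; the slides do not define the filter. The taxonomy contents and all function names are hypothetical.

```python
def descendants(taxonomy, topic):
    """All subtopics below `topic` (children, grandchildren, and so on)."""
    found = []
    stack = list(taxonomy.get(topic, ()))
    while stack:
        current = stack.pop()
        if current not in found:
            found.append(current)
            stack.extend(taxonomy.get(current, ()))
    return found


def filter_topics(taxonomy, topics):
    """Drop input topics already covered by another input topic's subtree."""
    covered = set()
    for topic in topics:
        covered.update(descendants(taxonomy, topic))
    kept = []
    for topic in topics:
        if topic not in covered and topic not in kept:
            kept.append(topic)
    return kept


def expand_topics(taxonomy, topics):
    """Filtered input topics plus all of their subtopics, without duplicates."""
    expanded = []
    for topic in filter_topics(taxonomy, topics):
        for t in [topic] + descendants(taxonomy, topic):
            if t not in expanded:
                expanded.append(t)
    return expanded


# Hypothetical taxonomy: parent topic -> list of direct subtopics.
taxonomy = {
    "Search": ["Web Search", "Ranking"],
    "Web Search": ["Crawling"],
}
```

With this taxonomy, an input of "Search", "Crawling" and "Databases" is filtered to "Search" and "Databases", because "Crawling" is already reached through "Search".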

  17. COLLABORATION NETWORKS EXPANSION
      • Geodesic
      Figure 3: Geodesic Relationships. Four panels (STRONG, MEDIUM, WEAK, UNKNOWN) relate author_A and author_B through opus:Article_in_Proceedings and opus:Proceedings resources, connected by opus:author and opus:isIncludedIn edges.
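A possible reading of Figure 3 can be sketched as a classifier over publication records. The rules below are assumptions reconstructed from the figure residue, not definitions taken from the thesis: STRONG when two authors share an article, MEDIUM when they share a co-author, WEAK when they have articles in the same proceedings, and UNKNOWN otherwise. The record layout and the sample data are hypothetical.

```python
def classify_relationship(papers, a, b):
    """Classify the link between authors a and b (assumed rules, see lead-in)."""
    papers_a = [p for p in papers.values() if a in p["authors"]]
    papers_b = [p for p in papers.values() if b in p["authors"]]

    # STRONG: both appear as authors of the same article.
    if any(b in p["authors"] for p in papers_a):
        return "STRONG"

    # MEDIUM: no joint article, but at least one common co-author.
    coauthors_a = set().union(*(p["authors"] for p in papers_a)) - {a}
    coauthors_b = set().union(*(p["authors"] for p in papers_b)) - {b}
    if coauthors_a & coauthors_b:
        return "MEDIUM"

    # WEAK: articles of both authors are included in the same proceedings.
    venues_a = {p["proceedings"] for p in papers_a}
    venues_b = {p["proceedings"] for p in papers_b}
    if venues_a & venues_b:
        return "WEAK"

    return "UNKNOWN"


# Hypothetical publication records: article id -> authors and proceedings.
papers = {
    "p1": {"authors": {"A", "B"}, "proceedings": "c1"},
    "p2": {"authors": {"B", "C"}, "proceedings": "c2"},
    "p3": {"authors": {"D"}, "proceedings": "c1"},
    "p4": {"authors": {"E"}, "proceedings": "c3"},
}
```

The checks run from the closest relationship to the most distant one, so a pair that qualifies for several categories is reported under the strongest.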

  18. COLLABORATION NETWORKS EXPANSION
      • C-Net
        • Ordering Cluster of Experts
        • Collaboration Strength*
      Node values shown in the diagram: Super Node {14.80}; coauthor_1 {0.73, 0.5}; coauthor_2 {1.81, 1.0}; coauthor_3 {0.73, 0.5}; coauthor_4 {0.73, 0.5}; coauthor_5 {1.54, 1.0}; coauthor_n {1.1, 0.8}
      Figure 3: Geodesic Relationships
      * Newman, M. E. J.: Coauthorship Networks and Patterns of Scientific Collaboration. Proceedings of the National Academy of Sciences of the United States of America, 101(suppl. 1): 5200-5205 (2004).
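The slide credits collaboration strength to Newman. In Newman's weighting, a paper with n authors adds 1/(n - 1) to the strength of every pair of its co-authors, so small author lists count for more than large ones. The sketch below implements that weighting only; whether SEMEF applies it unchanged is not stated on the slide, and the function name and sample input are assumptions.

```python
from collections import defaultdict
from itertools import combinations


def collaboration_strength(author_lists):
    """Pairwise strength: each paper with n authors adds 1/(n - 1) per pair."""
    strength = defaultdict(float)
    for authors in author_lists:
        unique = sorted(set(authors))  # ignore repeated names on one paper
        n = len(unique)
        if n < 2:
            continue  # single-author papers create no collaboration
        for i, j in combinations(unique, 2):
            strength[(i, j)] += 1.0 / (n - 1)
    return dict(strength)


# Hypothetical input: one author list per paper.
example = collaboration_strength([["x", "y"], ["x", "y", "z"]])
```

Pairs are keyed in sorted order, so the strength between "x" and "y" is stored once under ("x", "y").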

  19. RESULTS AND EVALUATION
      • Evaluation
        • WWW Search Track (2005/6/7)
        • Input Topics: Call For Papers
        • SWETO-DBLP Subset (67,366 authors)
        • DBLP (560,792)
      • Validation
      • Collaboration Networks Expansion

  20. RESULTS AND EVALUATION
      • Validation
      Table 1: Past PC Lists comparison with SEMEF
      (Search Track columns give the number of PC members found in the SEMEF list.)

      Percentage in SEMEF List   Search 2005   Search 2006   Search 2007   Average   Cumulative Percentage in PC List
      (top) 0-10%                10            13            13            12        35%
      10-20%                     5             8             6             6         52%
      20-30%                     6             0             0             2         58%
      30-40%                     4             1             1             2         65%
      40-50%                     6             2             0             3         73%
      50-60%                     3             1             1             2         79%
      60-70%                     4             0             0             1         82%
      70-80%                     1             1             0             1         85%
      80-90%                     1             0             0             0         85%
      90-100%                    0             0             0             0         85%
      Total                      40/48         26/29         21/25         29/34
      Total (%)                  83            89            84            85

  21. RESULTS AND EVALUATION • Validation Figure 4: Average Number of PC in SEMEF List

  22. RESULTS AND EVALUATION • Validation Figure 5: Average PC Distribution in SEMEF List

  23. RESULTS AND EVALUATION
      • Collaboration Networks Expansion

      Table 3: PC Chair – PC Member Geodesic Relationships (PC List, number of expert relationships)

      Relationships    Search 2005       Search 2006       Search 2007       Above Average Expertise (in PC)
                       Chair1   Chair2   Chair1   Chair2   Chair1   Chair2
      STRONG           2        0        3        0        3        0        0
      MEDIUM           10       7        6        2        7        8        4
      WEAK             31       17       15       20       11       14       10
      EXTREMELY WEAK   1        2        1        2        0        0        0

      Table 4: PC Chair – SEMEF List Geodesic Relationships (SEMEF, number of expert relationships)

      Relationships    Search 2005       Search 2006       Search 2007       Above Average Expertise (in PC)
                       Chair1   Chair2   Chair1   Chair2   Chair1   Chair2
      STRONG           6        2        10       3        10       2        3
      MEDIUM           106      53       88       55       88       76       16
      WEAK             649      293      608      582      605      576      58
      EXTREMELY WEAK   99       26       66       26       66       43       3

  24. CONCLUSION
      • Expertise Profiles
        • Publication Data
        • Publication Impact Statistics
        • Papers-to-Topics Relationships
      • Ranking Experts
        • w/ and w/o Taxonomy
        • Single and Array of Topics
      • Collaboration Networks Expansion
        • Semantic Association Discovery
        • Geodesic
        • C-Nets

  25. DEMO
      • Web Application
        • Apache Tomcat 6.0
        • Java Server Pages
        • Ubuntu 7.10

  26. RELATED WORK
      • Particle Swarm Algorithm
      • ExpertiseNets
      • Expertise Browser
        • Experience Atoms
      • Expertise Recommender
        • Change history
        • Tech Support Heuristics
        • Profiling, Identification, Supervisor

  27. RELATED WORK
      • Web-Based Communities
        • Expert Rank
      • Formal Probabilistic Models
        • Candidate Models
        • Document Models
      • RDF-Matcher

  28. EXPERTISE PROFILE ALGORITHM
      Algorithm findExpertiseProfile(researcherURI, list of publications)
        create empty 'expertise profile'
        foreach paper of researcher do
          get 'topics' list of paper (using papers-to-topics dataset)
          get 'publication impact'
          if 'publication impact' is null do
            'publication impact' ← default weight
          foreach 'topic' in 'topics' do
            if 'expertise profile' contains 'topic' do
              'weight' ← 'publication impact' + existing 'weight' from 'expertise profile'
              update 'expertise profile' with <'topic', 'weight'>
            else
              add <'topic', 'publication impact'> pair to 'expertise profile'
        end
        return 'expertise profile'
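The profile-building pseudocode on this slide can be sketched in Python as follows. This is an illustration under stated assumptions rather than the thesis implementation: the dictionary-based inputs, the parameter names, and the default weight of 1.0 are all hypothetical, since the slide names a default weight but gives no value for it.

```python
DEFAULT_WEIGHT = 1.0  # hypothetical value; the slide does not state it


def find_expertise_profile(publications, papers_to_topics, publication_impact,
                           default_weight=DEFAULT_WEIGHT):
    """Sum publication impact per topic over a researcher's papers."""
    profile = {}
    for paper in publications:
        impact = publication_impact.get(paper)
        if impact is None:
            impact = default_weight  # venue has no impact statistic
        for topic in papers_to_topics.get(paper, ()):
            # Add to the existing weight, or start the topic at this impact.
            profile[topic] = profile.get(topic, 0.0) + impact
    return profile


# Hypothetical data mirroring the values shown in Figure 1.
impact = {"paper1": 1.54, "paper2": 1.10, "paper3": 1.86,
          "paper4": 1.86, "paper5": 1.54, "paper6": 1.54}
topics = {"paper1": ["topic1"], "paper2": ["topic1"], "paper3": ["topic1"],
          "paper4": ["topic2"], "paper5": ["topic3"], "paper6": ["topic3"]}
profile = find_expertise_profile(sorted(impact), topics, impact)
```

Under the assumed paper-to-topic grouping, the resulting weights round to the 4.50, 1.86 and 3.08 shown for author_A in Figure 1.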

  29. RANKING EXPERTS ALGORITHM
      Algorithm rankValue(researcherURI, list of topics)
        set 'expertRank' to zero
        create temporary 'expertise profile'
        filter topics
        foreach topic in filtered topics list do
          get 'papers' for this topic (using papers-to-topics dataset)
          foreach paper in papers list do
            if researcher is author do
              get 'publication impact' as 'weight'
              'expertRank' ← 'expertRank' + 'weight'
              add <'topic', 'weight'> pair to temporary 'expertise profile'
            endif
          end
        end
        return 'expertRank'
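The ranking pseudocode on this slide can be sketched in Python as below. Several details are assumptions: the topic filter is reduced to removing duplicate topics as a stand-in, because the slide does not define it; the default weight for papers without an impact value is borrowed from the profile algorithm; and the dictionary inputs and names are hypothetical.

```python
def rank_value(researcher, topics, topic_papers, paper_authors,
               publication_impact, default_weight=1.0):
    """Return (rank, temporary profile) for one researcher over given topics."""
    expert_rank = 0.0
    profile = {}
    # Stand-in for "filter topics": drop duplicates while keeping order.
    for topic in dict.fromkeys(topics):
        for paper in topic_papers.get(topic, ()):
            if researcher in paper_authors.get(paper, ()):
                weight = publication_impact.get(paper)
                if weight is None:
                    weight = default_weight  # assumed fallback
                expert_rank += weight
                profile[topic] = profile.get(topic, 0.0) + weight
    return expert_rank, profile


# Hypothetical data for two topics and two researchers.
topic_papers = {"t1": ["p1", "p2"], "t2": ["p3"]}
paper_authors = {"p1": {"r"}, "p2": {"s"}, "p3": {"r"}}
impact = {"p1": 1.54, "p3": 1.10}
rank, temp_profile = rank_value("r", ["t1", "t2", "t1"],
                                topic_papers, paper_authors, impact)
```

A paper mapped to two of the requested topics contributes its weight once per topic in this sketch, so the overlap between topics determines how much a single paper can add to the rank.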
