Iterative MapReduce for SPARQL BGP Processing: A Comprehensive Approach

Research Meeting 2009-12-28 Jaeseok Myung

Summary
• Classes (grade entry)
• Undergraduate graduation theses (이승재, 김홍찬)
• Seoul National University mentoring in progress
• Research
• SPARQL BGP Processing with Iterative MR
• Implementation: HBase
• WAIM 2010 (1/29), VLDB 2010 (3/9)
• How MR works for triples?
• Why do we need iterative MRs?
Center for E-Business Technology

Outline
• Introduction
• Related Work
• BGP Processing with MR
• MR Iteration (why joins cause MR iterations, N-Triple storage structure)
• Naïve Approach (Single-Random)
• Our Approach
• Multi-Greedy Algorithm
• Discussion (edge preserving, performance by type, key selection)
• Experiments
• Environmental Settings (Hadoop, LUBM, Complex Query, Amazon EC2, Converter)
• SPARQL Processing Results (varying the number of nodes, varying the data size)
• Dealing with Intermediate Result (intermediate file I/O is expensive; CGL-MR, MR-Online)
• Conclusion (research needed on storage structures that are richer than N-Triple and compressible, and on indexing)
• Reference

MapReduce (source: 한재선, SearchDay2008, http://nexr.tistory.com)

How MR works for triples? (1/2)
SELECT ?a ?b WHERE {
  ?a dbpedia:spouse ?b .
  ?a dbpedia:wikilink dbpediares:actor .
  ?b dbpedia:wikilink dbpediares:actor .
  ?a dbpedia:placeOfBirth ?c .
  ?b dbpedia:placeOfBirth ?c
}
Actors who are married to each other and born in the same place
[Figure: Mapper. Input triples over a1, b1, c1 and actor (spouse, link, place) are matched against triple patterns 1-5 and emitted keyed by the bound value: a1 → (1), (2), (4); b1 → (1), (3), (5); c1 → (4), (5).]

How MR works for triples? (2/2)
SELECT ?a ?b WHERE {
  ?a dbpedia:spouse ?b .
  ?a dbpedia:wikilink dbpediares:actor .
  ?b dbpedia:wikilink dbpediares:actor .
  ?a dbpedia:placeOfBirth ?c .
  ?b dbpedia:placeOfBirth ?c
}
Actors who are married to each other and born in the same place
[Figure: Reducer. Triples arriving under the same key are grouped: a1 → (1, 2, 4), i.e. a1 spouse b1, a1 link actor, a1 place c1; b1 → (1, 3, 5); c1 → (4, 5).]
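The map and reduce steps on these two slides can be imitated in memory with plain Python. This is only a sketch, not the authors' Hadoop implementation: the five input triples are read off the figure, and the helper names (`match`, `map_phase`, `reduce_phase`) and the choice to key each emitted record on a (variable, value) pair are assumptions made for illustration.

```python
from collections import defaultdict

# Triple patterns 1-5 from the slide's query; terms starting with "?" are variables.
PATTERNS = {
    1: ("?a", "dbpedia:spouse", "?b"),
    2: ("?a", "dbpedia:wikilink", "dbpediares:actor"),
    3: ("?b", "dbpedia:wikilink", "dbpediares:actor"),
    4: ("?a", "dbpedia:placeOfBirth", "?c"),
    5: ("?b", "dbpedia:placeOfBirth", "?c"),
}

# Example data assumed from the figure: a1 and b1 are actors, married, born in c1.
TRIPLES = [
    ("a1", "dbpedia:spouse", "b1"),
    ("a1", "dbpedia:wikilink", "dbpediares:actor"),
    ("b1", "dbpedia:wikilink", "dbpediares:actor"),
    ("a1", "dbpedia:placeOfBirth", "c1"),
    ("b1", "dbpedia:placeOfBirth", "c1"),
]


def match(pattern, triple):
    """Return {variable: value} if the triple matches the pattern, else None."""
    bindings = {}
    for p_term, t_term in zip(pattern, triple):
        if p_term.startswith("?"):
            # A variable must bind to the same value everywhere in the pattern.
            if bindings.get(p_term, t_term) != t_term:
                return None
            bindings[p_term] = t_term
        elif p_term != t_term:
            return None
    return bindings


def map_phase(triples, patterns):
    """Emit ((variable, value), (pattern id, triple)) for every binding."""
    for triple in triples:
        for pid, pattern in patterns.items():
            bindings = match(pattern, triple)
            if bindings is None:
                continue
            for variable, value in bindings.items():
                yield (variable, value), (pid, triple)


def reduce_phase(pairs):
    """Group the mapper output by key, as the shuffle and reducer would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


groups = reduce_phase(map_phase(TRIPLES, PATTERNS))
for key in [("?a", "a1"), ("?b", "b1"), ("?c", "c1")]:
    print(key, sorted({pid for pid, _ in groups[key]}))
```

The three printed groups carry pattern ids 1, 2, 4 for `?a = a1`, 1, 3, 5 for `?b = b1`, and 4, 5 for `?c = c1`, matching the groups on the slide. Because patterns 2 and 3 have the same shape, this sketch also produces keys such as `("?b", "a1")`; the slide does not show them.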

Why do we need iterative MR?
SELECT ?a ?b WHERE {
  ?a dbpedia:spouse ?b .
  ?a dbpedia:wikilink dbpediares:actor .
  ?b dbpedia:wikilink dbpediares:actor .
  ?a dbpedia:placeOfBirth ?c .
  ?b dbpedia:placeOfBirth ?c
}
Actors who are married to each other and born in the same place
[Figure: query graph in which each triple pattern appears to be a node labelled with its variables (1: a|b, 2: a, 3: b, 4: a|c, 5: b|c), shown next to the reducer groups a1 → (1, 2, 4), b1 → (1, 3, 5), c1 → (4, 5).]

Why do we need iterative MR?
SELECT ?a ?b WHERE {
  ?a dbpedia:spouse ?b .
  ?a dbpedia:wikilink dbpediares:actor .
  ?b dbpedia:wikilink dbpediares:actor .
  ?a dbpedia:placeOfBirth ?c .
  ?b dbpedia:placeOfBirth ?c
}
Actors who are married to each other and born in the same place
[Figure: the same query graph (1: a|b, 2: a, 3: b, 4: a|c, 5: b|c), followed by panels (a)-(d) showing further query graphs whose nodes are labelled with variable pairs such as a|b, b|c, c|d, d|e and a|b, a|c, a|d, a|e, a|f, a|g; the exact layouts are not recoverable from the transcript.]
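One way to see why a single MR job is not enough is to count the jobs on the query graph. The sketch below assumes that each job joins on exactly one variable and merges every intermediate result containing that variable. It is not the Single-Random or Multi-Greedy algorithm named in the outline, whose details the slides do not give; `plan_iterations` and `choose_key` are hypothetical names.

```python
def plan_iterations(patterns, choose_key):
    """Return the join variables used, one per simulated MR job.

    patterns: dict mapping a pattern id to the set of variables it contains.
    choose_key: function that picks one variable from a sorted list of candidates.
    """
    # Each intermediate result is (pattern ids covered, variables available).
    groups = [(frozenset([pid]), frozenset(variables))
              for pid, variables in patterns.items()]
    chosen = []
    while len(groups) > 1:
        all_vars = set().union(*(vs for _, vs in groups))
        # Only a variable present in at least two results can drive a join.
        shared = sorted(v for v in all_vars
                        if sum(v in vs for _, vs in groups) >= 2)
        if not shared:
            raise ValueError("query graph is disconnected; no join key left")
        key = choose_key(shared)
        joined = [g for g in groups if key in g[1]]
        rest = [g for g in groups if key not in g[1]]
        merged_ids = frozenset().union(*(ids for ids, _ in joined))
        merged_vars = frozenset().union(*(vs for _, vs in joined))
        groups = rest + [(merged_ids, merged_vars)]
        chosen.append(key)
    return chosen


# Variables of triple patterns 1-5 in the slide's query.
QUERY = {1: {"a", "b"}, 2: {"a"}, 3: {"b"}, 4: {"a", "c"}, 5: {"b", "c"}}

first_key_plan = plan_iterations(QUERY, lambda keys: keys[0])
last_key_plan = plan_iterations(QUERY, lambda keys: keys[-1])
print(first_key_plan)  # ['a', 'b']
print(last_key_plan)   # ['c', 'b', 'a']
```

Joining on `a` first covers patterns 1, 2 and 4, and a second job on `b` brings in 3 and 5, so two iterations suffice. Starting with `c` takes three. Under this simplified model the order of join keys changes the number of MR jobs, which is the kind of effect the "key selection" item in the outline refers to.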

Naïve vs. Our Approach
• Write-up in progress

Outline
• Introduction
• Related Work
• Preliminaries
• BGP Processing with MR
• MR Iteration (why joins cause MR iterations, N-Triple storage structure)
• Naïve Approach (Single-Random)
• Our Approach
• Multi-Greedy Algorithm
• Improvement
• Using Advanced Storage for Selection Task
• Using Selectivity Info. for Minimizing BGP Iteration
• Discussion (edge preserving, performance by type, key selection)
• Experiments
• Environmental Settings (Hadoop, LUBM, Complex Query, Amazon EC2, Converter)
• SPARQL Processing Results (varying the number of nodes, varying the data size)
• Dealing with Intermediate Result (intermediate file I/O is expensive; CGL-MR, MR-Online)
• Conclusion (research needed on storage structures that are richer than N-Triple and compressible, and on indexing)
• Reference
