Graph Algebra

Presentation Transcript

  1. Graph Algebra with Pattern Matching and Aggregation Support

  2. Graphs Nowadays • Variety of Sources • Scientific Studies • Business Activities • Social Needs • Internet • Data are often • Large Scale • Highly Linked • Schema-less

  3. Managing Graph Data • Primary Role of a Database • Persistent store • Efficient query • RDBMS • Storage model: vertices and edges as tuples • Query: links are followed by joins • Graph Database • Storage model: graphs • Query: path traversal
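The contrast between the two storage models can be sketched in a few lines. This is only an illustration with made-up data; the table layout, the names `people`, `knows`, `adjacency`, and both helper functions are assumptions, not part of the presentation.

```python
# Relational style: vertices and edges are tuples, and a link is followed by a join.
people = [(1, "Ann"), (2, "Bob"), (3, "Cy")]   # (id, name), hypothetical data
knows = [(1, 2), (2, 3)]                        # (src_id, dst_id), hypothetical data

def friends_by_join(person_id):
    """Scan the edge table and join each hit back to the vertex table."""
    return [name
            for (src, dst) in knows if src == person_id
            for (pid, name) in people if pid == dst]

# Graph style: each vertex keeps its adjacency, so a link is a direct traversal.
names = {1: "Ann", 2: "Bob", 3: "Cy"}
adjacency = {1: [2], 2: [3], 3: []}

def friends_by_traversal(person_id):
    """Follow the stored adjacency list without scanning any table."""
    return [names[n] for n in adjacency[person_id]]

join_result = friends_by_join(1)
traversal_result = friends_by_traversal(1)
```

Both functions answer the same question; the difference is that the join version touches the whole edge table for every hop, while the traversal version only touches the neighbours of the starting vertex.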

  4. Why not RDBMS? • Schema Issues • Each inserted item may have a different schema (Web Graph) • Hard to represent semi-structured information • Scalability Issues • ACID properties vs. CAP theorem • Query Performance • Join-intensive queries are difficult to optimize

  5. Graph Databases and Query Languages • No universal language!

  6. No Universal Language Like SQL? • No commonly agreed algebra • Relational Algebra? • Expressive, and proven effective over time • NOT suitable for GRAPH • Graph Algebra? • Still preliminary work

  7. Issues with Relational Algebra (RA) • Defined on tuples or sets of tuples • Mismatch with the nature of graphs • Operators lose their semantics • What is Union, Intersection, Join in GRAPH? • I/O type? • Tables, not graphs • Domain-centric, not data-centric • Does not anticipate out-of-order data • Treats tuples as independent • Unaware of the links among tuples • Queries written using RA are verbose and complex

  8. Advantages of Graph Algebra • An algebra is itself a query language • Makes it easy to work out a language with strong theoretical support • Evaluate the expressiveness of given languages • Justify when to use what: Gremlin, Cypher etc. • Query Optimization • Operator order EQUALS execution plan • Algebraic equivalence IMPLIES query optimization

  9. Advantages of Graph Algebra • Separation of Query and System: • One can write a query on any system as long as a common algebra is supported • Knowing RA, one can write SQL, PL/SQL, T-SQL on MySQL, Oracle, SQL Server • Integrate new operators into the database: • Current graph database systems do not support newly developed queries: • Graph OLAP, Graph Cube, Graph Aggregation etc. • A proper algebra can incorporate these operators

  10. Existing Work on Graph Algebra • Graph QL [1] • A graph-based algebra whose operators are defined on graphs • Selection • Join (not properly defined) • Template • VAQL [2] • Focused on visualization • Selection • Aggregation (restricted) • Visualization • Selection is restricted to isomorphism • Aggregation is not defined over edges • No algebraic equivalence [1] He, Huahai, and Ambuj K. Singh. "Graphs-at-a-time: query language and access methods for graph databases." Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data. ACM, 2008. [2] Shaverdian, Anna A., et al. "A graph algebra for scalable visual analytics." IEEE Computer Graphics and Applications 32.4 (2012): 26-33.

  11. What do we want from a Graph Algebra? • Universal • Independent of graph type: • Directed vs. undirected. Simple vs. hyper. Homogeneous vs. heterogeneous. • Expressive • Able to answer typical graph queries: • Pattern matching, reachability, path finding etc. • Covers Relational Algebra (RA) • This ensures that a graph database can handle relational data as well • Scalable • Able to manage data at scale • Supports queries that summarize and aggregate data

  12. Extended Algebra: Graph Model • The data model is an attributed graph, consisting of: • A vertex set, in which each vertex has a unique ID • An edge set • Attributes for each vertex • Attributes for each edge • Edges carry an identifier as well • In a simple graph, an edge can be represented by its end points • Information for the graph as a whole
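One possible in-memory rendering of this model is sketched below. The class name `Graph`, its field names, and the sample data are hypothetical choices made for illustration; the presentation does not prescribe a concrete data structure.

```python
from dataclasses import dataclass, field

@dataclass
class Graph:
    """Attributed graph: vertex set, edge set, and graph-level information."""
    vertices: dict = field(default_factory=dict)  # vertex id -> attribute dict
    edges: dict = field(default_factory=dict)     # edge id -> (src id, dst id, attribute dict)
    attrs: dict = field(default_factory=dict)     # information about the graph itself

    def add_vertex(self, vid, **attrs):
        self.vertices[vid] = dict(attrs)

    def add_edge(self, eid, src, dst, **attrs):
        # Edges carry their own identifier, so parallel edges stay distinguishable.
        if src not in self.vertices or dst not in self.vertices:
            raise KeyError("both endpoints must already be vertices")
        self.edges[eid] = (src, dst, dict(attrs))

g = Graph(attrs={"name": "social"})
g.add_vertex("v1", name="Ann", birthday=1990)
g.add_vertex("v2", name="Bob", birthday=1985)
g.add_edge("e1", "v1", "v2", label="friend")
```

Keying edges by their own identifier, rather than by the pair of end points, is what lets the same structure hold multigraphs as well as simple graphs.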

  13. Extended Algebra – Operators • Projection • Restriction • Unification • Pattern Matching • Aggregation

  14. Operators: Projection • Purpose: • Select the data the user is interested in from the base graph • Syntax: • The parameters are the attribute lists for vertices, edges and the graph • The result is a new graph whose attributes are trimmed to those lists
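A minimal sketch of projection, assuming a graph is held as a plain dictionary with `vertices`, `edges` and `attrs` entries. That layout and the function name `project` are assumptions for illustration only.

```python
def _keep(attrs, names):
    """Return only the attributes whose names were requested."""
    return {k: v for k, v in attrs.items() if k in names}

def project(graph, vertex_attrs, edge_attrs, graph_attrs):
    """Build a new graph with the same structure but trimmed attribute sets."""
    return {
        "vertices": {vid: _keep(a, vertex_attrs)
                     for vid, a in graph["vertices"].items()},
        "edges": {eid: (src, dst, _keep(a, edge_attrs))
                  for eid, (src, dst, a) in graph["edges"].items()},
        "attrs": _keep(graph["attrs"], graph_attrs),
    }

# Hypothetical sample graph.
base = {
    "vertices": {"v1": {"name": "Ann", "birthday": 1990},
                 "v2": {"name": "Bob", "birthday": 1985}},
    "edges": {"e1": ("v1", "v2", {"label": "friend", "since": 2010})},
    "attrs": {"name": "social", "source": "demo"},
}
projected = project(base, ["name"], ["label"], ["name"])
```

Vertices and edges are never dropped here; only their attributes are trimmed, which is what separates projection from restriction.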

  15. Operators: Restriction • Purpose: • Restrict the base graph by attribute values • Syntax: • Vertex restriction: select all the vertices (and their induced edges) that match a vertex predicate • Edge restriction: select all the edges (and their endpoints) that match an edge predicate • Graph restriction: select the graphs in which every vertex matches the vertex predicate, every edge matches the edge predicate, and the graph matches the graph predicate
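Vertex and edge restriction might look as follows. The dictionary layout of the graph, the function names, and the sample data are assumptions; predicates are modelled as ordinary Python callables over an attribute dictionary.

```python
def restrict_vertices(graph, pred):
    """Keep vertices whose attributes satisfy pred, plus the edges they induce."""
    vertices = {vid: a for vid, a in graph["vertices"].items() if pred(a)}
    edges = {eid: e for eid, e in graph["edges"].items()
             if e[0] in vertices and e[1] in vertices}
    return {"vertices": vertices, "edges": edges, "attrs": dict(graph["attrs"])}

def restrict_edges(graph, pred):
    """Keep edges whose attributes satisfy pred, plus their end points."""
    edges = {eid: e for eid, e in graph["edges"].items() if pred(e[2])}
    endpoints = {v for (src, dst, _) in edges.values() for v in (src, dst)}
    vertices = {vid: a for vid, a in graph["vertices"].items() if vid in endpoints}
    return {"vertices": vertices, "edges": edges, "attrs": dict(graph["attrs"])}

# Hypothetical sample graph.
base = {
    "vertices": {"v1": {"name": "Ann", "birthday": 1990},
                 "v2": {"name": "Bob", "birthday": 1985},
                 "v3": {"name": "Cy", "birthday": 1992}},
    "edges": {"e1": ("v1", "v2", {"label": "friend"}),
              "e2": ("v1", "v3", {"label": "friend"}),
              "e3": ("v2", "v3", {"label": "comment"})},
    "attrs": {"name": "social"},
}
young = restrict_vertices(base, lambda a: a["birthday"] > 1989)
comments = restrict_edges(base, lambda a: a["label"] == "comment")
```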

  16. Operator: Unification • Purpose: • Concatenate graphs • Syntax: • Vertex unification: unify vertices with identical IDs • Edge unification: add edges between pairs of vertices that match a given predicate • Attribute unification: create a virtual vertex for each distinct value of a given attribute
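Vertex and edge unification could be sketched as below. Several details are assumptions made to keep the example small: graphs are plain dictionaries, edge IDs are unique across the input graphs, attributes of the second graph win on conflict, and the generated edge IDs and `label` attribute are invented for illustration.

```python
def unify_vertices(g1, g2):
    """Merge two graphs; vertices with identical IDs become one vertex."""
    vertices = {vid: dict(a) for vid, a in g1["vertices"].items()}
    for vid, a in g2["vertices"].items():
        vertices.setdefault(vid, {}).update(a)  # assumption: g2 wins on conflict
    return {"vertices": vertices,
            "edges": {**g1["edges"], **g2["edges"]},
            "attrs": {**g1["attrs"], **g2["attrs"]}}

def unify_edges(graph, pred, label):
    """Add an edge u -> v for every ordered vertex pair for which pred holds."""
    edges = dict(graph["edges"])
    for u, ua in graph["vertices"].items():
        for v, va in graph["vertices"].items():
            if pred(ua, va):
                edges[f"{label}:{u}->{v}"] = (u, v, {"label": label})
    return {"vertices": graph["vertices"], "edges": edges, "attrs": graph["attrs"]}

# Hypothetical sample graphs sharing vertex v2.
g1 = {"vertices": {"v1": {"name": "Ann"}, "v2": {"name": "Bob"}},
      "edges": {"e1": ("v1", "v2", {"label": "friend"})}, "attrs": {}}
g2 = {"vertices": {"v2": {"city": "Oslo"}, "v3": {"name": "Cy", "city": "Oslo"}},
      "edges": {"e2": ("v2", "v3", {"label": "friend"})}, "attrs": {}}

merged = unify_vertices(g1, g2)

def same_city(a, b):
    """Pair predicate: both vertices have the same city; name order avoids duplicates."""
    return a.get("city") is not None and a.get("city") == b.get("city") and a["name"] < b["name"]

linked = unify_edges(merged, same_city, "same_city")
```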

  17. Operator: Unification P(v1,v1) and P(v4,v5) are true

  18. Operator: Unification

  19. Operator: Pattern Matching • Purpose: • Find the subgraphs of the base graph that match a given pattern • Syntax: • The pattern is itself a graph. Its definition comes from [1] • One form returns all the matching graphs • The other form returns an abstract match, in which only the vertices that appear in the pattern are returned [1] Fan, Wenfei, et al. "Adding regular expressions to graph reachability and pattern queries." 2011 IEEE 27th International Conference on Data Engineering (ICDE). IEEE, 2011.
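The sketch below shows a deliberately simplified form of pattern matching: a backtracking search that maps each pattern vertex to a distinct graph vertex and requires every pattern edge to exist with the same label. It does not implement the regular-expression patterns of [1]; the pattern encoding, the function name `match_pattern`, and the sample data are assumptions.

```python
def match_pattern(graph, pattern):
    """Yield dicts mapping each pattern vertex to a distinct graph vertex.

    pattern = {"vertices": {pattern_vertex: predicate_on_attrs},
               "edges": [(pattern_src, pattern_dst, label)]}
    """
    p_vertices = list(pattern["vertices"])
    g_edges = {(src, dst, a.get("label")) for (src, dst, a) in graph["edges"].values()}

    def consistent(assign):
        # Check every pattern edge whose two end points are already assigned.
        return all((assign[pu], assign[pv], label) in g_edges
                   for (pu, pv, label) in pattern["edges"]
                   if pu in assign and pv in assign)

    def extend(i, assign):
        if i == len(p_vertices):
            yield dict(assign)
            return
        pv = p_vertices[i]
        for gv, attrs in graph["vertices"].items():
            if gv in assign.values() or not pattern["vertices"][pv](attrs):
                continue
            assign[pv] = gv
            if consistent(assign):
                yield from extend(i + 1, assign)
            del assign[pv]

    yield from extend(0, {})

# Hypothetical sample graph: a -> b -> c, all "friend" edges.
base = {
    "vertices": {"a": {"birthday": 1990}, "b": {"birthday": 1991}, "c": {"birthday": 1980}},
    "edges": {"e1": ("a", "b", {"label": "friend"}),
              "e2": ("b", "c", {"label": "friend"})},
    "attrs": {},
}
any_vertex = lambda attrs: True
pattern = {"vertices": {"x": any_vertex, "y": any_vertex},
           "edges": [("x", "y", "friend")]}
matches = list(match_pattern(base, pattern))
```

Each result is a vertex mapping only, which is closer to the abstract form of matching; the full matching subgraph can be recovered by looking up the mapped edges in the base graph.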

  20. Operator: Pattern Matching

  21. Operator: Aggregation • Purpose: • To summarize a given graph • Syntax: • Graph aggregation: every vertex is supplied to a vertex aggregation function and every edge set is supplied to an edge aggregation function • Vertex aggregation: given a set of vertices, group them by a given grouping criterion • Edge aggregation: given a set of edges, group them by a given grouping criterion
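A small sketch of aggregation, under several assumptions: vertices are grouped by the value of one attribute, edges are grouped by the pair of groups their end points fall into, and the aggregation functions are passed in as callables. The function name `aggregate` and the output layout are invented for illustration.

```python
from collections import defaultdict

def aggregate(graph, group_key, vertex_fn, edge_fn):
    """Summarize a graph: one super-vertex per group, one super-edge per group pair."""
    groups = defaultdict(list)
    group_of = {}
    for vid, attrs in graph["vertices"].items():
        groups[attrs[group_key]].append(attrs)
        group_of[vid] = attrs[group_key]

    edge_sets = defaultdict(list)
    for (src, dst, attrs) in graph["edges"].values():
        edge_sets[(group_of[src], group_of[dst])].append(attrs)

    return {
        "vertices": {key: vertex_fn(members) for key, members in groups.items()},
        "edges": {pair: edge_fn(members) for pair, members in edge_sets.items()},
    }

# Hypothetical sample graph.
base = {
    "vertices": {"v1": {"city": "Oslo"}, "v2": {"city": "Oslo"}, "v3": {"city": "Rome"}},
    "edges": {"e1": ("v1", "v2", {"label": "friend"}),
              "e2": ("v1", "v3", {"label": "friend"}),
              "e3": ("v2", "v3", {"label": "friend"})},
    "attrs": {},
}
# Count the members of every vertex group and every edge set.
summary = aggregate(base, "city", len, len)
```

Replacing `len` with another function, for example one that averages an attribute, gives other summaries without changing the grouping logic.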

  22. Operator: Aggregation

  23. Expressiveness • This set of operators is more expressive than Relational Algebra and Graph QL • It can represent many graph queries • Reachability • Graph Cube computation • I-OLAP and T-OLAP

  24. Algebra Equivalence • When operators are chained, they form a query execution plan • Example query: find the network induced by the person whose friends comment on each other's posts, with birthday greater than 1989, and output those names as a graph • [Figure: execution plan for the example query; labels: Base Graph, friend, Comment, friend, Matched Result, Restriction, V-Unification, v.name]

  25. Algebra Equivalence • To generate multiple execution plans for the same query, we need theoretical support: • Identity Equivalence: • An operator can be represented by other operators • p is a common attribute predicate • D(P) decomposes a pattern P into edges • ...
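To make the idea of equivalent plans concrete, the sketch below checks that two vertex restrictions can be applied in either order with the same result. This commutativity rule is an assumed illustration and not necessarily one of the rules listed on the slide; the graph layout, `restrict_vertices`, and the sample data are likewise hypothetical.

```python
def restrict_vertices(graph, pred):
    """Keep vertices whose attributes satisfy pred, plus the edges they induce."""
    vertices = {vid: a for vid, a in graph["vertices"].items() if pred(a)}
    edges = {eid: e for eid, e in graph["edges"].items()
             if e[0] in vertices and e[1] in vertices}
    return {"vertices": vertices, "edges": edges, "attrs": dict(graph["attrs"])}

# Hypothetical sample graph.
base = {
    "vertices": {"v1": {"birthday": 1990, "city": "Oslo"},
                 "v2": {"birthday": 1985, "city": "Oslo"},
                 "v3": {"birthday": 1992, "city": "Oslo"},
                 "v4": {"birthday": 1993, "city": "Rome"}},
    "edges": {"e1": ("v1", "v2", {"label": "friend"}),
              "e2": ("v1", "v3", {"label": "friend"}),
              "e3": ("v3", "v4", {"label": "friend"})},
    "attrs": {},
}

def born_after_1989(attrs):
    return attrs["birthday"] > 1989

def lives_in_oslo(attrs):
    return attrs["city"] == "Oslo"

# Two execution plans for the same query: only the operator order differs.
plan_a = restrict_vertices(restrict_vertices(base, born_after_1989), lives_in_oslo)
plan_b = restrict_vertices(restrict_vertices(base, lives_in_oslo), born_after_1989)
plans_agree = plan_a == plan_b
```

An optimizer that knows such a rule is free to run the more selective restriction first, which is the sense in which algebraic equivalence leads to query optimization.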

  26. Conclusion • Graph Algebra plays an important role in graph database development • We take a step forward by proposing a Graph Algebra which: • extends existing algebraic work with • Regular pattern matching • Aggregation • is expressive and well-defined • contains equivalence rules for further query optimization

  27. Thank you!