1 / 15

Datalog

Datalog. Another query language cleaner closer to a “logic” notation, prolog more convenient for analysis can express queries that are not expressible in relational algebra or SQL (recursion). No grouping and aggregation, bags, orderby. Predicates and Atoms.

baby
Télécharger la présentation

Datalog

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Datalog • Another query language • cleaner • closer to a “logic” notation, prolog • more convenient for analysis • can express queries that are not expressible in relational algebra or SQL (recursion). • No grouping and aggregation, bags, orderby.

  2. Predicates and Atoms - relations are represented by predicates - tuples are represented by atoms. Purchase( “joe”, “bob”, “Nike Town”, “Nike Air”) - arithmetic comparison atoms: X < 100, X+Y+5 > Z/2 - negated atoms: NOT Product(“Brooklyn Bridge”, $100, “Microsoft”)

  3. Datalog Rules and Queries A datalog rule has the following form: head :- atom1, atom2, …., atom,… Examples: PerformingComp(name) :- Company(name,sp,c), sp > $50 AmericanProduct(prod) :- Product(prod,pr,cat,mak), Company(mak, sp,“USA”) All the variables in the head must appear in the body. A single rule can express exactly select-from-where queries.

  4. The Meaning of Datalog Rules AmericanProduct(prod) :- Product(prod,pr,cat,mak), Company(mak, sp,“USA”) Consider every assignment from the variables in the body to the constants in the database. If each of the atoms in the body is in the database, then the tuple for the head is in the resulting.

  5. More Examples CREATE VIEW Seattle-view AS SELECT buyer, seller, product, store FROM Person, Purchase WHERE Person.city = “Seattle” AND Person.per-name = Purchase.buyer SeattleView(buyer,seller,product,store) :- Person(buyer, “Seattle”, phone), Purchase(buyer, seller, product, store).

  6. More Examples (negation, union) SeattleView(buyer,seller,product,store) :- Person(buyer, “Seattle”, phone), Purchase(buyer, seller, product, store) not Purchase(buyer, seller, product, “The Bon”) Q5(buyer) :- Purchase(buyer, “Joe”, prod, store) Q5(buyer) :- Purchase(buyer, seller, store, prod), Product(prod, price, cat, maker) Company(maker, sp, country), sp > 50.

  7. Rule Safety Every variable that appears anywhere in the query must appear also in a relational, non-negated atom in the query. Q(X,Y,Z) :- R1(X,Y) & X < Z not safe Q(X,Y,Z) :- R1(X,Y) & NOT R2(X,Y,Z) not safe

  8. Defining Views SeattleView(buyer,seller,product,store) :- Person(buyer, “Seattle”, phone), Purchase(buyer, seller, product, store) not Purchase(buyer, seller, product, “The Bon”) Q6(buyer) :- SeattleView(buyer, “Joe”, prod, store) Q6(buyer) :- SeattleView(buyer, seller, store, prod), Product(prod, price, cat, maker) Company(maker, sp, country), sp > 50.

  9. From Relational Algebra to Datalog We can translate any relational algebra operation to datalog: - projection - selection - union - intersection - join

  10. Exercises Product ( name, price, category, maker) Purchase (buyer, seller, store, product) Company (name, stock price, country) Person( name, phone number, city) Ex #1: Find people who bought telephony products. Ex #2: Find names of people who bought American products Ex #3: Find names of people who bought American products and did not buy French products Ex #4: Find names of people who bought American products and they live in Seattle. Ex #5:Find people who bought stuff from Joe or bought products from a company whose stock prices is more than $50.

  11. Transitive Closure Suppose we are representing a graph by a relation Edge(X,Y): Edge(a,b), Edge (a,c), Edge(b,d), Edge(c,d), Edge(d,e) b a d e c How can I express the query: Find all nodes reachable from a.

  12. Only in Datalog Recursive queries: Path( X, Y ) :- Edge( X, Y ) Path( X, Y ) :- Path( X, Z ), Path( Z, Y ). A query is recursive if there is a cycle in the dependency graph of the predicates.

  13. Evaluating Recursive Queries Path( X, Y ) :- Edge( X, Y ) Path( X, Y ) :- Path( X, Z ), Path( Z, Y ). Semantics: evaluate the rules until a fixed point: Iteration #0: Edge: {(a,b), (a,c), (b,d), (c,d), (d,e)} Path: {} Iteration #1: Path: {(a,b), (a,c), (b,d), (c,d), (d,e)} Iteration #2: Path gets the new tuples: (a,d), (b,e), (c,e) Iteration #3: Path gets the new tuple: (a,e) Iteration #4: Nothing changes -> We stop. Note: number of iterations depends on the data. Cannot be anticipated by only looking at the query!

  14. Another Recursive Query • Flight(x,y) : there is a flight from city x to y. • Hub(x): x is a hub. • Query: find pairs of cities for which there is a flight that goes through some hub. Fl(x,y) :- Flight(x,y) Fl(x,y) :- Flight(x,z), Fl(z,y) Answer(x,y) :- Fl(x,z), Hub(z), Fl(z,y).

  15. Deductive Databases • General idea: some relations are stored (extensional), others are defined by datalog queries (intensional). • Many research projects (MCC, Stanford, Wisconsin) [Great Ph.D theses!] • SQL3 realized that recursion is useful, and added linear recursion. • Hard problem: optimizing datalog performance. • Ideas from deductive databases made it into the mainstream.

More Related