
Declarative Probabilistic Programming with Datalog

19th International Conference on Database Theory (ICDT), March 15, 2016.





Presentation Transcript


  1. Declarative Probabilistic Programming with Datalog
     19th International Conference on Database Theory (ICDT), March 15, 2016

  2. Example of a Probabilistic Program
     Pearl’s alarm example in Chimple. Probability of =1?
       int earthquake = Flip(0.001);
       for ( int i=0; i<knownAlarms.length; ++i ) {
         burglaries[i] = Flip(0.01);
         broken[i] = Flip(0.05);
         if ( broken[i]==0 && (burglaries[i]>0 || earthquake>0) )
           alarms[i] = 1;
         else
           alarms[i] = 0;
       }
       // 0/1 “observations”
       for ( int i=0; i<knownAlarms.length; ++i ) {
         if ( knownAlarms[i]!=alarms[i] ) addInfiniteCost();
       }
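To make the inference question on this slide concrete, here is a minimal Python sketch that computes a posterior by brute-force enumeration of all executions. It assumes the query is about `earthquake` (the slide text does not name the variable), and it models the infinite-cost call as discarding executions that disagree with the observations. The function names are hypothetical and this is not how Chimple itself performs inference.

```python
from itertools import product

P_EARTHQUAKE = 0.001
P_BURGLARY = 0.01
P_BROKEN = 0.05


def flip_prob(p, value):
    """Probability that Flip(p) returns `value` (0 or 1)."""
    return p if value == 1 else 1.0 - p


def alarm(broken, burglary, earthquake):
    """Deterministic alarm rule copied from the slide."""
    return 1 if broken == 0 and (burglary > 0 or earthquake > 0) else 0


def posterior_earthquake(known_alarms):
    """Pr(earthquake = 1 | alarms == known_alarms), by exhaustive enumeration."""
    n = len(known_alarms)
    weight = {0: 0.0, 1: 0.0}
    for earthquake in (0, 1):
        for bits in product((0, 1), repeat=2 * n):
            burglaries, broken = bits[:n], bits[n:]
            alarms = [alarm(broken[i], burglaries[i], earthquake) for i in range(n)]
            if alarms != list(known_alarms):
                continue  # "infinite cost": this execution contradicts the observations
            w = flip_prob(P_EARTHQUAKE, earthquake)
            for i in range(n):
                w *= flip_prob(P_BURGLARY, burglaries[i]) * flip_prob(P_BROKEN, broken[i])
            weight[earthquake] += w
    return weight[1] / (weight[0] + weight[1])


print(posterior_earthquake([1, 1]))
```

With two alarms both observed ringing, the posterior comes out at roughly 0.91, even though the prior is 0.001. Enumeration is exponential in the number of alarms, so it only serves as an illustration.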

  3. Probabilistic Programming
     • Idea:
       • Develop ordinary programs with randomness
       • Execution engine answers inference queries over the space of executions
         • What is Pr(var1=“true” | var15=17)?
         • What is the most likely var1 given that var15=17?
     • Chimple (Java), Figaro (Scala), Venture (Scheme), PyMC (Python), …
     • Common approach to ML development
       • Randomized procedure for generating params and observations; params as likely values conditioned on observed training data
     • Promoted by DARPA’s PPAML program
       • “Probabilistic Programming for Advancing ML”
       • “Unfortunately, building effective ML applications currently still requires Herculean efforts on the part of highly trained experts ...”

  4. LogicBlox
     • Database management solution, featuring LogiQL, a Datalog extension
     • ~160 employees, background in DB, PL, ML
     • Advancing Datalog-based analytics
       • Existential second-order logic
       • Enables optimization, predictive/prescriptive analytics
       • Leapfrog Triejoin replaces traditional relational algebra ops
       • [Veldhuizen, ICDT14] [Green, PODS15] [Green+, VLDB15] [Aref+, SIGMOD15]
     • Interest in probabilistic programming: business analytics, DARPA projects (PPAML, MUSE)

  5. Why Datalog?
     “Highly declarative” database programming: the rule set describes the desired outcome, not a procedure
     • Execution-order independence
       • The result is invariant under different chase strategies (even in the presence of recursion)
       • Hence, safe to manipulate the execution
     • Invariance under transformations / rewritings that preserve logical equivalence
       • Hence, safe to manipulate the program

  6. Simple Retail Example
     Deterministic:
       Sales(sid,pid,r) ← Stores(sid,city), HistRate(city,pid,r), IsPromoted(pid,false)
       Sales(sid,pid,r) ← Stores(sid,city), HistPrmRate(city,pid,r), IsPromoted(pid,true)
     Probabilistic:
       IsPromoted(pid,Flip[0.02]) ← Products(pid,pname)
       Sales(sid,pid,Poisson[r]) ← Stores(sid,city), HistRate(city,pid,r), IsPromoted(pid,0)
       Sales(sid,pid,Poisson[r]) ← Stores(sid,city), HistPrmRate(city,pid,r), IsPromoted(pid,1)
     • Inference examples:
       • Given yesterday’s sales numbers, which products are on promotion?
       • What is the expected revenue?
       • Build a histogram of the revenues in Bordeaux
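The probabilistic rules on this slide can be read as a generative procedure. The Python sketch below is one hedged reading: it draws one promotion flip per product and one Poisson sample per (store, product) pair, and it also computes the expected sales for one pair in closed form. The data layout (dictionaries keyed by store and by (city, product)), the function names, and the choice of Knuth's Poisson sampler are assumptions made for this illustration and are not part of the slides.

```python
import math
import random

P_PROMOTED = 0.02  # Flip[0.02] in the IsPromoted rule


def sample_poisson(rate, rng):
    """Knuth's multiplication method; adequate for small rates."""
    threshold = math.exp(-rate)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def sample_world(stores, products, hist_rate, hist_prm_rate, rng):
    """Draw one possible outcome of the probabilistic rules.

    stores: {sid: city}; products: iterable of pid;
    hist_rate / hist_prm_rate: {(city, pid): rate}.
    """
    # IsPromoted(pid, Flip[0.02]) <- Products(pid, pname)
    is_promoted = {pid: int(rng.random() < P_PROMOTED) for pid in products}
    sales = {}
    for sid, city in stores.items():
        for pid in products:
            # Pick the rule whose IsPromoted condition matches the sampled flip.
            rates = hist_prm_rate if is_promoted[pid] else hist_rate
            if (city, pid) in rates:
                sales[(sid, pid)] = sample_poisson(rates[(city, pid)], rng)
    return is_promoted, sales


def expected_sales(city, pid, hist_rate, hist_prm_rate):
    """E[Sales] for one store in `city`, by the law of total expectation."""
    return ((1 - P_PROMOTED) * hist_rate[(city, pid)]
            + P_PROMOTED * hist_prm_rate[(city, pid)])
```

Repeating `sample_world` many times and collecting the sales of the Bordeaux stores would give a Monte Carlo version of the histogram query; conditioning on yesterday's sales needs an actual inference method and is not covered by this sketch.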

  7. Simple HMM Example
     [Figure: a network of users; Alice posts “Join my club!”, and every other user is annotated with “Saw?”, “Posted?” and “Joined Club?”]
     Deterministic (recursive):
       Posted(‘Alice’) ← true
       Saw(x) ← Follows(x,y), Posted(y)
       Posted(x) ← Saw(x)
       Joined(x) ← Saw(x)
     Probabilistic:
       Posted(‘Alice’,1) ← true
       Saw(x,Flip[p]) ← Follows(x,y), Posted(y,1), SeeRate(x,p)
       Posted(x,Flip[q]) ← Saw(x,1), PostRate(x,q)
       Joined(x,Flip[0.1]) ← Saw(x,1)

  8. Probabilistic Semantics
     • Ordinary Datalog: database (EDBs) + Datalog rules → outcome (IDBs)
     • Probabilistic variant: database (EDBs) + Datalog* rules → probability space over possible outcomes (IDBs)
     [Figure: several possible outcomes, each labelled with its probability]

  9. Probabilities in Logic
     • Decades of research on probabilistic variants of logic
     • Various approaches [De Raedt & Kimmig]:
       • “Distribution Semantics”
         • Ordinary logic; base facts randomly chosen via an external process
         • Dantsin [91]; ICL [Poole93]; Sato [95]; PRISM [SatoKameya97]; Fuhr [00]; Poole [00]; Dalvi-Suciu [04]; Poole [08]; ProbLog [Kimmig+11], …
       • “Probabilistic Programming”
         • Imperative / stratified programs (logic defines a procedural sampler)
         • BLOG [Milch+05]; P-Log [Baral+07]; World-set alg. [Antova+07]; MCDB [Jampani+08]; Trio [Widom+08]; SimSQL [Cai+13]
       • “Knowledge-Based Model Construction”
         • Different semantics: logic as parameterized factors (grounding=factor)
         • “[…] logic is used as a template for constructing a graphical model”
         • PLP [Haddawy94]; RBN [Jaeger97]; PRM [KollerPfeffer98]; MLN [Domingos04]; DeepDive [Re+12]; PSL [Bröcheler+10]; Probabilistic Datalog+/- [Gottlob+13]
     • More reading: De Raedt, Kimmig: Probabilistic (logic) programming concepts (2015)

  10. Our Goal: Modelling
      Recall:
      • Execution-order independence
        • The result is invariant under different chase strategies, even in the presence of recursion
        • Hence, safe to manipulate the execution
      • Invariance under transformations / rewritings that preserve logical equivalence
        • Hence, safe to manipulate the program
      Establish a “strongly declarative” extension of Datalog for building statistical models

  11. What is the Semantics?
      • A(x,Poisson[l]) ← B(x,l), C(x,y)
      • A(x,Poisson[l]) ← B(x,l), D(x,y), E(y)
      • E(y) ← A(y,k)
      Questions: Should a rule “fire” for every y? Should both rules “fire” for x? When is the lhs “satisfied”? What’s a “fixpoint” for recursion? What if we have several l’s?
      Equivalent?
      • F(x,l) ← B(x,l), C(x,y)
      • F(x,l) ← B(x,l), D(x,y), E(y)
      • A(x,Poisson[l]) ← F(x,l)
      • E(y) ← A(y,k)

  12. Our Interpretation
      • A(x,Poisson[l]) ← B(x,y,l)
      Interpreted as: “If the RHS holds, then there should be a value r chosen for A(x,_), drawn from Poisson[l], such that A(x,r) holds”
      • [∃r APoisson(x,l,r)] ← B(x,y,l)
      • A(x,r) ← B(x,y,l), APoisson(x,l,r)
      • Ordinary existential Datalog (no probabilities)
      • Richer syntax in the paper (simplified here)
      • Next: assign probabilities to solutions

  13. Corresponding Chase
      • Same as ordinary chase, except for the existential rules
      • For an existential rule, a value is selected from the corresponding distribution
      Rule: [∃r APoisson(x,l,r)] ← B(x,y,l)
      Grounding: [∃r APoisson(‘a’,7,r)] ← B(‘a’,‘b’,7)
      Sample c ~ Poisson(7), then add APoisson(‘a’,7,c)
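The chase step for the single rule A(x,Poisson[l]) ← B(x,y,l) can be sketched in Python as follows. The point it tries to show is that the existential witness is keyed by (x, l) only, so several B facts that share x and l reuse one sampled value instead of triggering fresh samples. The function names, the dictionary representation of APoisson, and the Poisson sampler are assumptions for this illustration; the paper's actual syntax is richer.

```python
import math
import random


def sample_poisson(rate, rng):
    """Knuth's multiplication method; adequate for small rates."""
    threshold = math.exp(-rate)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def chase(b_facts, rng):
    """Chase the two rules
         [exists r: APoisson(x, l, r)] <- B(x, y, l)
         A(x, r) <- B(x, y, l), APoisson(x, l, r)
    over a set of B facts given as (x, y, l) triples."""
    a_poisson = {}   # (x, l) -> r, i.e. the auxiliary relation APoisson(x, l, r)
    a_facts = set()  # derived A(x, r) facts
    for (x, y, l) in sorted(b_facts):
        if (x, l) not in a_poisson:
            # The existential rule is violated for (x, l): sample a witness once.
            a_poisson[(x, l)] = sample_poisson(l, rng)
        # The ordinary rule then copies the witness into A.
        a_facts.add((x, a_poisson[(x, l)]))
    return a_poisson, a_facts


aux, derived = chase({("a", "b", 7), ("a", "c", 7), ("d", "b", 3)}, random.Random(1))
print(aux, derived)
```

In this run the two facts B(‘a’,‘b’,7) and B(‘a’,‘c’,7) lead to a single sample for (‘a’, 7), which is one possible answer to the earlier question of whether the rule should fire for every y.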

  14. Probabilistic Interpretation
      Chase gives a probabilistic interpretation: chase up to a fixpoint
      • In the paper we also give a model-theoretic definition
      Well defined?

  15. Challenge 1: Uncountability
      • A possible outcome may be infinite
        • Example: R(r,Distribution[r]) ← R(x,r)
      • We consider only discrete numerical distributions
        • Still, the space of solutions may be uncountable
        • Cannot be ignored in probability computations!
        • Hence, we need an interpretation by means of a probability measure space
      • Solution: adopt the notion of cylinder sets for infinite Markov processes [Ash & Doleans-Dade 00]
      • In the paper: “weak acyclicity” guarantees finiteness of solutions (& discreteness)
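As a rough illustration of the cylinder-set idea, the following LaTeX sketch shows how a probability is commonly assigned to the set of all chase runs that share a finite prefix. The notation (a chase-tree node v, step probabilities p_1, ..., p_k, and the cylinder C_v) is assumed for this sketch and does not appear on the slides; the paper's formal definition may differ in detail.

```latex
% Assumed notation: v is a node of the chase tree, reached from the root
% by k sampling steps whose sampled values had probabilities p_1, ..., p_k.
\mathcal{C}_v \;=\; \{\, \text{maximal chase paths that pass through } v \,\},
\qquad
\Pr(\mathcal{C}_v) \;=\; \prod_{i=1}^{k} p_i .
```

Under this reading, a single infinite run can have probability zero while the cylinders still carry positive mass, which is why the semantics is phrased as a measure over sets of outcomes and not as a probability per outcome.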

  16. Challenge 2: Chase Order
      Is the semantics given by the probabilistic interpretation sensitive to the chase order?

  17. Independent of Chase Order?
      [Figure: the same user network; Alice posts “Join my club!”, and every other user is annotated with “Saw?”, “Posted?” and “Joined Club?”]
        Posted(‘Alice’,1) ← true
        Saw(x,Flip[p]) ← Posted(y,1), Follows(x,y), SeeRate(x,p)
        Posted(x,Flip[q]) ← Saw(x,1), PostRate(x,q)
        Joined(x,Flip[0.1]) ← Saw(x,1)

  18. And Now?
      [Figure: the same user network, with the “Saw?”, “Posted?” and “Joined Club?” annotations visited in a different order]
        Posted(‘Alice’,1) ← true
        Saw(x,Flip[p]) ← Posted(y,1), Follows(x,y), SeeRate(x,p)
        Posted(x,Flip[q]) ← Saw(x,1), PostRate(x,q)
        Joined(x,Flip[0.1]) ← Saw(x,1)

  19. Main Theorem: For a given program and input instance, all fair chases give the same probability measure space
      • Fairness: every violation gets to be resolved at some point (standard requirement for infinite chase)
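The theorem can be checked by hand on a tiny instance. The Python sketch below enumerates the full chase tree of the “Join my club!” program under two different policies for choosing which pending violation to resolve next, and compares the resulting distributions over outcomes. The follower graph, the rates, and all function names are made up for this illustration, and each Flip is assumed to be sampled once per user, in line with the interpretation given on the earlier slides. It is a sanity check on one finite example, not a proof.

```python
from collections import defaultdict

# Hypothetical input instance (EDBs). (x, y) in FOLLOWS means x follows y.
FOLLOWS = {("Bob", "Alice"), ("Carol", "Alice"), ("Carol", "Bob")}
SEE_RATE = {"Bob": 0.5, "Carol": 0.4}
POST_RATE = {"Bob": 0.3, "Carol": 0.2}
JOIN_RATE = 0.1


def pending(flips):
    """Groundings of the Flip rules whose body holds but that have no sample yet."""
    posted = {"Alice"} | {x for (rel, x), v in flips.items() if rel == "Posted" and v == 1}
    saw = {x for (rel, x), v in flips.items() if rel == "Saw" and v == 1}
    todo = {}
    for x, y in FOLLOWS:
        if y in posted and x in SEE_RATE:
            todo[("Saw", x)] = SEE_RATE[x]
    for x in saw:
        if x in POST_RATE:
            todo[("Posted", x)] = POST_RATE[x]
        todo[("Joined", x)] = JOIN_RATE
    return sorted((key, p) for key, p in todo.items() if key not in flips)


def chase(flips, weight, pick, out):
    """Explore every branch of the chase tree; `pick` fixes the chase order."""
    todo = pending(flips)
    if not todo:  # fixpoint reached: record this outcome with its probability
        out[frozenset(flips.items())] += weight
        return
    key, p = pick(todo)
    chase({**flips, key: 1}, weight * p, pick, out)
    chase({**flips, key: 0}, weight * (1 - p), pick, out)


def outcome_distribution(pick):
    out = defaultdict(float)
    chase({}, 1.0, pick, out)
    return dict(out)


first = outcome_distribution(lambda todo: todo[0])   # always resolve the first violation
last = outcome_distribution(lambda todo: todo[-1])   # always resolve the last violation
same = first.keys() == last.keys() and all(
    abs(first[k] - last[k]) < 1e-12 for k in first
)
print(len(first), same)
```

Both policies are fair on this finite instance, since every pending violation is eventually resolved, so the theorem predicts identical distributions; the comparison uses a small tolerance because the two orders multiply the same probabilities in a different sequence.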

  20. What is the Semantics?
      • A(x,Poisson[l]) ← B(x,l), C(x,y)
      • A(x,Poisson[l]) ← B(x,l), D(x,y), E(y)
      • E(y) ← A(y,k)
      Questions: Should both rules “fire” for x? Should a rule “fire” for every y? When is the lhs “satisfied”? What’s a “fixpoint” for recursion? What if we have several l’s?
      Equivalent?
      • F(x,l) ← B(x,l), C(x,y)
      • F(x,l) ← B(x,l), D(x,y), E(y)
      • A(x,Poisson[l]) ← F(x,l)
      • E(y) ← A(y,k)

  21. Rewrite Example
      Version 1:
      • ForwardAd(x, y, Flip[0.5]) ← GotAd(x), FBFriends(x, y).
      • ForwardAd(x, y, Flip[0.5]) ← GotAd(x), TwitterFollows(y, x).
      • GotAd(x) ← ForwardAd(y, x, 1).
      Version 2:
      • Channel(x, y) ← FBFriends(x, y); TwitterFollows(y, x).
      • ForwardAd(x, y, Flip[0.5]) ← GotAd(x), Channel(x, y).
      • GotAd(x) ← ForwardAd(y, x, 1).
      Version 3:
      • Channel(x, y) ← FBFriends(x, y); TwitterFollows(y, x).
      • ForwardAd(x, y, Flip[0.5]) ← Channel(x, y), ForwardAd(z, x, 1).

  22. FO Equivalence
      • How to define “equivalence” for rewriting?
        • Which rewriting operations are “legitimate”?
      • We can view a probabilistic program as an FO theory with function symbols
        • The distribution functions (e.g., Poisson, Flip) are viewed as functions in the signature
      • Two programs are FO equivalent if they are equivalent when viewed as FO theories

  23. Theorem: Two programs that are FO equivalent are also equivalent as probabilistic programs (i.e., define the same probability measure space for all input instances)

  24. Three Core Components
      • PlogiQL: LB’s interface to probabilistic programming within LogiQL; PlogiQL-to-LogiQL rewriting
      • PPDL (Probabilistic Programming Datalog): a theoretical framework for declarative probabilistic programming within Datalog [ICDT2016]
      • FAQ: novel algorithm for the general class of Functional Aggregate Queries with optimality guarantees [PODS2016 best paper]

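  25. Solver Strategies
      [Diagram: data and a PlogiQL program go in and a result comes out. Labelled steps: rewrite (PlogiQL program to LogiQL program), grounding (to a graphical model: BN / factors), inference. Labelled solvers: internal solver (FAQ algorithm) and external statistical solver. Labelled outputs: probabilities / MAP and probabilities / samples.]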

  26. Concluding Remarks
      • Introduced PPDL: Datalog for programming statistical models
      • PPDL retains the essentials of Datalog
        • Independence of execution (chase order)
        • Invariance under equivalent FO rewriting
      • Currently working on implementation
        • Translation to generic solvers, lifted inference, samplers
      • Thank you!
