680 likes | 858 Vues
CSEP505: Programming Languages Lecture 2: functional programming, syntax, semantics via interpretation or translation. Dan Grossman Winter 2009. Where are we. Programming: To finish: Caml tutorial Idioms using higher-order functions Similar to objects Tail recursion Languages:
E N D
CSEP505: Programming LanguagesLecture 2: functional programming, syntax, semantics via interpretation or translation Dan Grossman Winter 2009
Where are we Programming: • To finish: Caml tutorial • Idioms using higher-order functions • Similar to objects • Tail recursion Languages: • Abstract syntax, Backus-Naur Form • Definition via interpretation • Definition via translation We are unlikely to finish these slides today; that’s okay CSE P505 Winter 2009 Dan Grossman
Picking up our tutorial • We did: • Recursive higher-order functions • Records • Recursive datatypes • We started some important odds and ends: • Standard-library • Common higher-order function idioms • Tuples • Nested patterns • Exceptions • Need to do: • (Simple) Modules CSE P505 Winter 2009 Dan Grossman
Standard library • Values (e.g., functions) bound to foo in module M are accessed via M.foo • Standard library organized into modules • For homework 1, will use List, String, and Char • Mostly List, for example, List.fold_left • And we point you to the useful functions • Standard library a mix of “primitives” (e.g., String.length) and useful helpers written in Caml (e.g., List.fold_left) • Pervasives is a module implicitly “opened” CSE P505 Winter 2009 Dan Grossman
Higher-order functions • Will discuss “map” and “fold” idioms more next time, but to help get through early parts of homework 1: let rec mymap f lst = match lst with [] -> [] | hd::tl -> (f hd)::(mymap f tl) let lst234 = mymap (funx-> x+1) [1;2;3] letlst345= List.map (funx-> x+1) [1;2;3] let incr_list = mymap (funx-> x+1) CSE P505 Winter 2009 Dan Grossman
Tuples Defining record types all the time is unnecessary: • Types: t1 * t2 * … * tn • Construct tuples e1,e2,…,en • Get elements with pattern-matching x1,x2,…,xn • Advice: use parentheses! let x = (3,"hi",(fun x -> x), fun x -> x ^ "ism") let z = match x with (i,s,f1,f2) -> f1 i (*poor style *) let z = (let (i,s,f1,f2) = x in f1 i) CSE P505 Winter 2009 Dan Grossman
Pattern-matching revealed • You can pattern-match anything • Only way to access datatypes and tuples • A variable or _ matches anything • Patterns can nest • Patterns can include constants (3, “hi”, …) • Patterns are not expressions, though syntactically a subset • Plus some bells/whistles (as-patterns, or-patterns) • Exhaustiveness and redundancy checking at compile-time! • let can have patterns, just sugar for one-branch match! CSE P505 Winter 2009 Dan Grossman
Fancy patterns example To avoid overlap, two more cases (more robust if type changes) typesign=P |N |Z let multsign x1 x2= let sign x = if x>=0 then (if x=0 then Z else P) else N in match (sign x1,sign x2) with (P,P) -> P | (N,N) -> N | (Z,_) -> Z | (_,Z) -> Z | _ -> N (* many say bad style *) CSE P505 Winter 2009 Dan Grossman
Fancy patterns example (and exns) ’a list -> ’b list -> ’c list -> (’a*’b*’c) list exception ZipLengthMismatch let rec zip3 lst1 lst2 lst3= match (lst1,lst2,lst3) with ([],[],[]) -> [] | (hd1::tl1,hd2::tl2,hd3::tl3) -> (hd1,hd2,hd3)::(zip3 tl1 tl2 tl3) | _ ->raise ZipLengthMismatch CSE P505 Winter 2009 Dan Grossman
Pattern-matching in general • Full definition of matching is recursive • Over a value and a pattern • Produce a binding list or fail • You implement a simple version in homework 1 • Example: (p1,p2,p3) matches (v1,v2,v3) if pi matches vi for 1<=i<=3 • Binding list is 3 subresults appended together CSE P505 Winter 2009 Dan Grossman
“Quiz” What is letf x y= x + y letf pr= (match pr with (x,y) -> x+y) letf (x,y) = x + y letf (x1,y1) (x2,y2) = x1 + y2 CSE P505 Winter 2009 Dan Grossman
Exceptions See the manual for: • Exceptions that carry values • Much like datatypes but extendsexn • Catching exceptions • try e1 with … • Much like pattern-matching but cannot be exhaustive • Exceptions are not hierarchical (unlike Java/C# subtyping) CSE P505 Winter 2009 Dan Grossman
Modules • So far, only way to hide things is local let • Not good for large programs • Caml has a fancy module system, but we need only the basics • Modules and signatures give • Namespace management • Hiding of values and types • Abstraction of types • Separate type-checking and compilation • By default, Caml builds on the filesystem CSE P505 Winter 2009 Dan Grossman
Module pragmatics • foo.ml defines module Foo • Bar uses variable x, type t, constructor C in Foo via Foo.x, Foo.t, Foo.C • Can open a module, use sparingly • foo.mli defines signature for module Foo • Or “everything public” if no foo.mli • Order matters (command-line) • No forward references (long story) • Program-evaluation order • See manual for .cm[i,o] files, -c flag, etc. CSE P505 Winter 2009 Dan Grossman
Module example foo.ml: foo.mli: typet1=X1 of int | X2 of int let get_int t= match t with X1 i-> i | X2 i-> i typeeven= int let makeEven i= i*2 let isEven1 i= true (* isEven2 is “private” *) let isEven2 i=(i mod 2)=0 (* choose to show *) typet1=X1 of int | X2 of int val get_int : t1->int (* choose to hide *) typeeven valmakeEven:int->even valisEven1:even->bool CSE P505 Winter 2009 Dan Grossman
Module example bar.ml: foo.mli: typet1=X1 of int | X2 of int let conv1 t= match t with X1 i-> Foo.X1 i | X2 i-> Foo.X2 i let conv2 t= match t with Foo.X1 i-> X1 i | Foo.X2 i-> X2 i let _ = Foo.get_int(conv1(X1 17)); Foo.isEven1(Foo.makeEven17) (* Foo.isEven1 34 *) (* choose to show *) typet1=X1 of int | X2 of int val get_int : t1->int (* choose to hide *) typeeven valmakeEven:int->even valisEven1:even->bool CSE P505 Winter 2009 Dan Grossman
Not the whole language • Objects • Loop forms (bleach) • Fancy module stuff (e.g., functors) • Polymorphic variants • Mutable fields • … Just don’t need much of this for class (nor do I use it much) • Will use floating-point, etc. (easy to pick up) CSE P505 Winter 2009 Dan Grossman
Summary • Done with Caml tutorial • Focus on “up to speed” while being precise • Much of class will be more precise • Next: functional-programming idioms • Uses of higher-order functions • Tail recursion • Life without mutation or loops Will use Caml but ideas are more general CSE P505 Winter 2009 Dan Grossman
6 closure idioms Closure: Function plus environment where function was defined • Environment matters when function has free variables • Create similar functions • Combine functions • Pass functions with private data to iterators • Provide an abstract data type • Currying and partial application • Callbacks CSE P505 Winter 2009 Dan Grossman
Create similar functions let addn m n= m + n let add_one = addn 1 let add_two = addn 2 let rec f m = if m=0 then [] else (addn m)::(f (m-1)) let lst65432 = List.map (fun x -> x 1) (f 5) CSE P505 Winter 2009 Dan Grossman
Combine functions let f1 g h= (fun x -> g (h x)) type’a option = None|Someof ’a (*predefined*) let f2 g hx= match g x with None -> h x | Some y-> y (* just a function pointer *) let print_int = f1 print_string string_of_int (* a closure *) let truncate1 lim f = f1 (fun x -> min lim x) f let truncate2 lim f = f1 (min lim) f CSE P505 Winter 2009 Dan Grossman
Private data for iterators let rec map f lst= match lst with [] -> [] |hd::tl-> (f hd)::(map f tl) (* just a function pointer *) let incr lst= map (funx-> x+1) lst let incr = map (funx-> x+1) (* a closure *) let mul i lst= map (funx-> x*i) lst let mul i = map (funx-> x*i) CSE P505 Winter 2009 Dan Grossman
A more powerful iterator let rec fold_left f acc lst= match lst with [] -> acc |hd::tl-> fold_left f (f acc hd) tl (* just function pointers *) let f1= fold_left (funx y-> x+y) 0 let f2 = fold_left (funx y-> x && y>0) true (* a closure *) let f3lstlo hi= fold_left (funx y-> if y>lo && y<hi then x+1 else x) 0 lst CSE P505 Winter 2009 Dan Grossman
Thoughts on fold • Functions like fold decouple recursive traversal (“walking”) from data processing • No unnecessary type restrictions • Similar to visitor pattern in OOP • Private fields of a visitor like free variables • Very useful if recursive traversal hides fault tolerance (thanks to no mutation) and massive parallelism MapReduce: Simplified Data Processing on Large Clusters Jeffrey Dean and Sanjay Ghemawat 6th Symposium on Operating System Design and Implementation 2004 CSE P505 Winter 2009 Dan Grossman
Provide an ADT • Note: This is mind-bending stuff type set = { add : int -> set; member : int -> bool } letempty_set = let exists lst j = (*could use fold_left!*) let rec iter rest = matchrestwith []->false | hd::tl ->j=hd || iter tlin iter lstin let rec make_set lst = { add=(fun i ->make_set(i::lst)); member=exists lst }in make_set [] CSE P505 Winter 2009 Dan Grossman
Thoughts on ADT example • By “hiding the list” behind the functions, we know clients do not assume the representation • Why? All you can do with a function is apply it • No other primitives on functions • No reflection • No aspects • … CSE P505 Winter 2009 Dan Grossman
Currying • We’ve been using currying a lot • Efficient and convenient in Caml • (Partial application not efficient, but still convenient) • Just remember that the semantics is to build closures: • More obvious when desugared: let f = fun x -> (fun y -> (fun z -> … )) let a = ((f 1) 2) 3 CSE P505 Winter 2009 Dan Grossman
Callbacks • Library takes a function to apply later, on an event: • When a key is pressed • When a network packet arrives • … • Function may be a filter, an action, … • Various callbacks need private state of different types • Fortunately, a function’s type does not depend on the types of its free variables CSE P505 Winter 2009 Dan Grossman
Callbacks cont’d • Compare OOP: subclassing for private state type event = … val register_callback: (event->unit)->unit abstract class EventListener{ abstract voidm(Event); //”pure virtual” } voidregister_callback(EventListener); • Compare C: a void* arg for private state voidregister_callback(void*, void (*)(void*,Event); // void* and void* better be compatible // callee must pass back the same void* CSE P505 Winter 2009 Dan Grossman
Recursion and efficiency • Recursion is more powerful than loops • Just pass loop state as another argument • But isn’t it less efficient? • Function calls more time than branches? • Compiler’s problem • An O(1) detail irrelevant in 99+% of code • More stack space waiting for return • Shared problem: use tail calls where it matters • An O(n) issue (for recursion-depth n) CSE P505 Winter 2009 Dan Grossman
Tail recursion example • More complicated, more efficient version (* factorial *) letrecfact1 x = if x==0 then 1 else x * (fact1(x-1)) letfact2 x = let rec facc x= if x==0 then acc else f (acc*x) (x-1) in f 1 x • Accumulator pattern (base-case becomes initial accumulator) CSE P505 Winter 2009 Dan Grossman
Another example let recsum1 lst= match lst with [] -> 0 |hd::tl-> hd + (sum1 tl) letsum2 lst= let rec facc lst = match lst with [] -> acc |hd::tl-> f (acc+hd) tl in f 0 lst • Again O(n) stack savings • But input was already O(n) size CSE P505 Winter 2009 Dan Grossman
Half-example type tree = Leaf of int | Node of tree * tree let sum tr= let rec facc tr = match tr with Leaf i-> acc+i | Node(left,right) -> f (f acc left) right in f 0 tr • One tail-call, one non • Tail recursive version will build O(n) worklist • No space savings • That’s what the stack is for! • O(1) space requires mutation and no re-entrancy CSE P505 Winter 2009 Dan Grossman
Informal definition If the result of f x is the result of the enclosing function, then the call is a tail call (in tail position): • In (fun x -> e), the e is in tail position. • If if e1 then e2 else e3 is in tail position, then e2 and e3 are in tail position. • If let p = e1 in e2 is in tail position, then e2 is in tail position. • … • Note: for call e1 e2, neither is in tail position CSE P505 Winter 2009 Dan Grossman
Defining languages • We have built up some terminology and relevant programming prowess • Now • What does it take to define a programming language? • How should we do it? CSE P505 Winter 2009 Dan Grossman
Syntax vs. semantics Need: what every string means: “Not a program” or “produces this answer” Typical decomposition of the definition: • Lexing, a.k.a. tokenization, string to token list • Parsing, token list to labeled tree (AST) • Type-checking (a filter) • Semantics (for what got this far) For now, ignore (3) (accept everything) and skip (1)-(2) CSE P505 Winter 2009 Dan Grossman
Abstract syntax To ignore parsing, we need to define trees directly: • A tree is a labeled node and an ordered list of (zero or more) child trees. • A PL’s abstract syntax is a subset of the set of all such trees: • What labels are allowed? • For a label, what children are allowed? Advantage of trees: no ambiguity, i.e., no need for parentheses CSE P505 Winter 2009 Dan Grossman
Syntax metalanguage • So we need a metalanguage to describe what syntax trees are allowed in our language. • A fine choice: Caml datatypes type exp = Int of int | Var of string |Plusof exp * exp |Timesof exp * exp type stmt = Skip | Assign of string * exp |Seqof stmt * stmt |Ifof exp * stmt * stmt |Whileof exp * stmt • +: concise and direct for common things • -: limited expressiveness (silly example: nodes labeled Foo must have a prime-number of children) • In practice: push such limitations to type-checking CSE P505 Winter 2009 Dan Grossman
We defined a subset? • Given a tree, does the datatype describe it? • Is root label a constructor? • Does it have the right children of the right type? • Recur on children • Worth repeating: a finite description of an infinite set • (all?) PLs have an infinite number of programs • Definition is recursive, but not circular! • Made no mention of parentheses, but we need them to “write a tree as a string” CSE P505 Winter 2009 Dan Grossman
BNF A more standard metalanguage is Backus-Naur Form • Common: should know how to read and write it e ::= c| x |e+e|e*e s ::= skip| x := e |s;s|if e then s else s |while es (x in {x1,x2,…,y1,y2,…,z1,z2,…,…}) (c in {…,-2,-1,0,1,2,…}) Also defines an infinite set of trees. Differences: • Different metanotation (::= and |) • Can omit labels, e.g., “every c is an e” • We changed some labels (e.g., := for Assign) CSE P505 Winter 2009 Dan Grossman
Ambiguity revisited • Again, metalanguages for abstract syntax just assume there are enough parentheses • Bad example: if x then skip else y := 0; z := 0 • Good example: y:=1; (while x (y:=y*x; x:= x-1)) CSE P505 Winter 2009 Dan Grossman
Our first PL • Let’s call this dumb language IMP • It has just mutable ints, a while loop, etc. • No functions, locals, objects, threads, … Defining it: • Lexing (e.g., what ends a variable) • Parsing (make a tree from a string) • Type-checking (accept everything) • Semantics (to do) You’re not responsible for (1) and (2)! Why… CSE P505 Winter 2009 Dan Grossman
Syntax is boring • Parsing PLs is a computer-science success story • “Solved problem” taught in compilers • Boring because: • “If it doesn’t work (efficiently), add more keywords/parentheses” • Extreme: put parentheses on everything and don’t use infix • 1950s example: LISP (foo …) • 1990s example: XML <foo> … </foo> • So we’ll assume we have an AST CSE P505 Winter 2009 Dan Grossman
Toward semantics e ::= c| x |e+e|e*e s ::= skip| x := e |s;s|if ethen s else s |while es (x in {x1,x2,…,y1,y2,…,z1,z2,…,…}) (c in {…,-2,-1,0,1,2,…}) Now: describe what an AST “does/is/computes” • Do expressions first to get the idea • Need an informal idea first • A way to “look up” variables (the heap) • Need a metalanguage • Back to Caml (for now) CSE P505 Winter 2009 Dan Grossman
An expression interpreter • Definition by interpretation: Program means what an interpreter written in the metalanguage says it means type exp = Int of int | Var of string |Plusof exp * exp |Timesof exp * exp type heap = (string * int) list let reclookup h str = … (*lookup a variable*) let recinterp_e (h:heap) (e:exp) = match e with Int i->i |Var str->lookup h str |Plus(e1,e2) ->(interp_e h e1)+(interp_e h e2) |Times(e1,e2)->(interp_e h e1)*(interp_e h e2) CSE P505 Winter 2009 Dan Grossman
Not always so easy let recinterp_e (h:heap) (e:exp) = match e with Int i-> i |Var str-> lookup h str |Plus(e1,e2) ->(interp_e h e1)+(interp_e h e2) |Times(e1,e2)->(interp_e h e1)*(interp_e h e2) • By fiat, “IMP’s plus/times” is the same as Caml’s • We assume lookup always returns an int • A metalanguage exception may be inappropriate • So define lookup to return 0 by default? • What if we had division? CSE P505 Winter 2009 Dan Grossman
On to statements • A wrong idea worth pursuing: let recinterp_s (h:heap) (s:stmt) = match s with Skip -> () |Seq(s1,s2) -> interp_s h s1 ; interp_s h s2 |If(e,s1,s2) ->if interp_e h e then interp_s h s1 else interp_s h s2 |Assign(str,e) ->(* ??? *) |While(e,s1) ->(* ??? *) CSE P505 Winter 2009 Dan Grossman
What went wrong? • In IMP, expressions produce numbers (given a heap) • In IMP, statements change heaps, i.e., they produce a heap (given a heap) let recinterp_s (h:heap) (s:stmt) = match s with Skip -> h |Seq(s1,s2) ->let h2 = interp_s h s1 in interp_s h2 s2 |If(e,s1,s2) ->if (interp_e h e) <> 0 then interp_s h s1 else interp_s h s2 |Assign(str,e) ->update h str (interp_e h e) |While(e,s1) ->(* ??? *) CSE P505 Winter 2009 Dan Grossman
About that heap • In IMP, a heap maps strings to values • Yes, we could use mutation, but that is: • less powerful (old heaps do not exist) • less explanatory (interpreter passes current heap) type heap = (string * int) list let reclookup h str= match h with [] -> 0 (* kind of a cheat *) |(s,i)::tl->if s=str then i else lookup tl str letupdate h str i= (str,i)::h • As a definition, this is great despite terrible waste of space CSE P505 Winter 2009 Dan Grossman
Meanwhile, while • Loops are always the hard part! let recinterp_s (h:heap) (s:stmt) = match s with … | While(e,s1) ->if (interp_e h e) <> 0 thenlet h2 = interp_s h s1 in interp_s h2 s else h • s is While(e,s1) • Semi-troubling circular definition • That is, interp_s might not terminate CSE P505 Winter 2009 Dan Grossman