Exploring Collection-Oriented Programming: Techniques and Languages

Functional Collection-Oriented Programming
Guy Blelloch
Carnegie Mellon University

Collection-oriented programming
• Programmer emphasis is on operations over collections of values (data driven).
• Array based: APL, Nial, FP, Matlab
• Database: SQL, Linq
• Scripting: SETL, Python
• Data parallel: *Lisp, HPF, Nesl, Id, ZPL
• Map-reduce
• All of these support some form of map and some form of reduce.

Collection-oriented programming
• Concise code
  • Promotes a functional style of programming
  • Has become popular even without parallelism (Matlab, Python, SQL, …)
• Parallelism
  • Map is naturally parallel
  • Many collection operations are parallel: reduce, scan, collect, flatten, transpose, …
  • Most often DETERMINISTIC
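The operations named on this slide can be illustrated with a minimal Python sketch. It uses only the standard library and runs sequentially; the point is the shape of each operation, not parallel execution. The variable names are illustrative and not from the slides.

```python
from functools import reduce
from itertools import accumulate, chain
import operator

values = [3, 1, 4, 1, 5]

# map: apply a function to every element independently
squares = [x * x for x in values]

# reduce: combine all elements with an associative operator
total = reduce(operator.add, values, 0)

# scan: running (inclusive) prefix sums
prefix = list(accumulate(values))

# flatten: turn a collection of collections into one collection
flat = list(chain.from_iterable([[1, 2], [], [3]]))
```

Because each of these is a pure function of its input, a parallel implementation can evaluate it in any order and still produce the same result, which is where the determinism comes from.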

Collection-oriented programming
• “Concurrency” (non-deterministic environment)
  • On its own, not really useful for “concurrent” applications (e.g. operating systems, or the front end of a web server).

Flat vs. Nested
Can collections contain collections? Can arbitrary functions be mapped?
• Flat languages: APL, SQL, Map-reduce, HPF, Matlab
• Nested languages: SETL, Python, Nesl, Id
I conjecture that flat CO languages will never be general purpose: they are not good for trees, divide-and-conquer, …

Quicksort in NESL
  function quicksort(S) =
    if (#S <= 1) then S
    else let
      a  = S[rand(#S)];
      S1 = {e in S | e < a};
      S2 = {e in S | e == a};
      S3 = {e in S | e > a};
      R  = {quicksort(v) : v in [S1, S3]};
    in R[0] ++ S2 ++ R[1];
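For readers more familiar with Python, the same structure can be sketched with list comprehensions. This is a sequential sketch: in NESL the three filters and the map over [S1, S3] are parallel, whereas Python evaluates them one after another.

```python
import random


def quicksort(s):
    """Three-way quicksort written as operations over whole collections."""
    if len(s) <= 1:
        return s
    a = s[random.randrange(len(s))]       # random pivot
    s1 = [e for e in s if e < a]          # elements below the pivot
    s2 = [e for e in s if e == a]         # elements equal to the pivot
    s3 = [e for e in s if e > a]          # elements above the pivot
    # Nested map: apply quicksort to each sub-collection.
    r = [quicksort(v) for v in (s1, s3)]
    return r[0] + s2 + r[1]
```

The recursive call sits inside a map over a collection of collections, which is the nested pattern a flat language cannot express directly.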

Quicksort in X10
  double[] quicksort(double[] S) {
    if (S.length < 2) return S;
    double a = S[rand(S.length)];
    double[] S1, S2, S3;
    finish {
      async { S1 = quicksort(lessThan(S, a)); }
      async { S2 = eqTo(S, a); }
      S3 = quicksort(grThan(S, a));
    }
    return append(S1, append(S2, S3));
  }

Matrix Multiplication
  fun A*B {
    if #A < k then baseCase..
    A11,A12,A21,A22 = QuadSplit(A)
    B11,B12,B21,B22 = QuadSplit(B)
    Parallel {
      C11 = A11*B11 + A12*B21
      C12 = A11*B12 + A12*B22
      C21 = A21*B11 + A22*B21
      C22 = A21*B12 + A22*B22
    }
    return QuadJoin(C11,C12,C21,C22)
  }
Need to be able to program for locality.
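The recursion above can be sketched in Python on lists of lists. The sketch assumes square matrices whose side is a power of two, and it uses `n <= k` as the base-case test, which is one reading of the elided `baseCase..`. The helper names `quad_split`, `quad_join` and `mat_add` are illustrative. The four quadrant products run sequentially here; in the slide they sit inside a `Parallel` block.

```python
def quad_split(m):
    """Split a square matrix into four equal quadrants."""
    h = len(m) // 2
    top, bottom = m[:h], m[h:]
    return ([r[:h] for r in top], [r[h:] for r in top],
            [r[:h] for r in bottom], [r[h:] for r in bottom])


def quad_join(c11, c12, c21, c22):
    """Inverse of quad_split."""
    return ([l + r for l, r in zip(c11, c12)] +
            [l + r for l, r in zip(c21, c22)])


def mat_add(x, y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def matmul(a, b, k=1):
    n = len(a)
    if n <= k:
        # Base case: direct row-by-column products.
        return [[sum(a[i][t] * b[t][j] for t in range(n))
                 for j in range(n)] for i in range(n)]
    a11, a12, a21, a22 = quad_split(a)
    b11, b12, b21, b22 = quad_split(b)
    c11 = mat_add(matmul(a11, b11, k), matmul(a12, b21, k))
    c12 = mat_add(matmul(a11, b12, k), matmul(a12, b22, k))
    c21 = mat_add(matmul(a21, b11, k), matmul(a22, b21, k))
    c22 = mat_add(matmul(a21, b12, k), matmul(a22, b22, k))
    return quad_join(c11, c12, c21, c22)
```

Each recursive call touches only one quadrant of each operand, which is what makes the recursive formulation attractive for locality.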

Question:
• How general is functional CO programming?
• Advantages
  • High-level/concise
  • Natural/intuitive
  • Deterministic parallelism (for all partial results)
  • No need for annotations, commutativity, regions
  • No speculation
  • Simple cost model (even including locality)
• Potential disadvantages
  • Performance
  • Major rewriting of code
  • Does not support “concurrency” on its own

Barnes Hut
  function bTree(pts, box as (x0,y0,s)) =
    if #pts = 0 then EMPTY
    else if #pts = 1 then LEAF(pts[0])
    else let
      xm = x0 + s/2;
      ym = y0 + s/2;
      parallelLet
        T1 = bTree({(x,y,d) in pts | x<xm & y<ym}, (x0,y0,s/2));
        T2 = bTree({(x,y,d) in pts | x<xm & y>=ym}, (x0,y0+s/2,s/2));
        ..
    in NODE(cmass(T1,T2,T3,T4),box,T1,T2,T3,T4)

Barnes Hut
  function force(p,LEAF(p’)) = force(p,p’)
         | force(p,EMPTY) = 0
         | force(p,NODE(c,box,T1,T2,T3,T4)) =
             if far(p,box) then forceC(p,c)
             else force(p,T1)+force(p,T2)+force(p,T3)+force(p,T4)
  function forces(Points,T) = {move(p,force(p,T)) : p in Points};

“Algorithms in the Real World”
• Compression:
  • JPEG
* Easily expressed with no shared writeable state
^ Depends on algorithm




Graph Connectivity/Spanning Tree

Graph Connectivity
[Figure: example graph on vertices 0 to 6]
Edge List Representation:
  Edges = [(0,1), (0,2), (2,3), (3,4), (3,5), (3,6), (1,3), (1,5), (5,6), (4,6)]

Graph Contraction
[Figure: the example graph on vertices 0 to 6 shown through three steps: form stars, relabel, contract]

Graph Connectivity
[Figure: example graph on vertices 0 to 6 with the hook edges marked]
Edge List Representation:
  Edges = [(0,1), (0,2), (2,3), (3,4), (3,5), (3,6), (1,3), (1,5), (5,6), (4,6)]
  Hooks = [(0,1), (1,3), (1,5), (3,6), (4,6)]

Graph Connectivity
L = Vertex Labels, E = Edge List
  function connectivity(L, E) =
    if #E == 0 then L
    else let
      FL = {coinToss(.5) : x in [0:#L]};
      H  = {(u,v) in E | FL[u] and not(FL[v])};
      L  = L <- H;
      E  = {(L[u],L[v]) : (u,v) in E | L[u] /= L[v]};
    in connectivity(L,E);
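A sequential Python sketch of this random-mate contraction follows. Two things are assumptions rather than part of the slide: the `seed` parameter, added for reproducibility, and the final pointer-following pass, added so that every vertex reports the root of its component rather than the vertex it hooked to in some earlier round.

```python
import random


def connectivity(num_vertices, edges, seed=0):
    """Label each vertex with a representative of its connected component."""
    rng = random.Random(seed)            # assumption: seeded for repeatability
    labels = list(range(num_vertices))   # L: every vertex starts as its own root
    while edges:
        # FL: one coin flip per vertex (True = heads).
        heads = [rng.random() < 0.5 for _ in range(num_vertices)]
        # H: edges whose source flipped heads and whose target flipped tails.
        hooks = [(u, v) for (u, v) in edges if heads[u] and not heads[v]]
        # L <- H: each hooked source points at its target.
        for u, v in hooks:
            labels[u] = v
        # Relabel edges and drop those now inside a single star.
        edges = [(labels[u], labels[v]) for (u, v) in edges
                 if labels[u] != labels[v]]

    def root(x):
        # Not on the slide: follow hook pointers up to the component root.
        while labels[x] != x:
            x = labels[x]
        return x

    return [root(x) for x in range(num_vertices)]
```

A vertex only ever hooks onto a vertex that flipped tails, and tails vertices do not hook in that round, so the pointers always form a forest and the final pass terminates.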

Conclusions/Questions
• Perhaps functional programming is adequate for most or all parallel applications.
• Collections seem to encourage a functional style even in non-functional languages.
• Gives fully deterministic results, including partial results.

Quicksort in Multilisp
  (defun quicksort (L) (qs L nil))

  (defun qs (L rest)
    (if (null L) rest
        (let* ((a (car L))
               (L1 (filter (lambda (b) (< b a)) (cdr L)))
               (L3 (filter (lambda (b) (>= b a)) (cdr L))))
          (qs L1 (future (cons a (qs L3 rest)))))))

  (defun filter (f L)
    (if (null L) nil
        (if (f (car L))
            (future (cons (car L) (filter f (cdr L))))
            (filter f (cdr L)))))

Quicksort in Multilisp (futures)
Work = O(n log n)
Span = O(n)
Not a very good parallel algorithm

Scan code
  function addscan(A) =
    if (#A <= 1) then [0]
    else let
      sums  = {A[2*i] + A[2*i+1] : i in [0:#A/2]};
      evens = addscan(sums);
      odds  = {evens[i] + A[2*i] : i in [0:#A/2]};
    in interleave(evens,odds);
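The same recursive scan can be sketched in Python. The sketch assumes the input length is a power of two, as the pairing of adjacent elements in the slide suggests, and it computes an exclusive prefix sum: the result at position i is the sum of all elements before i.

```python
def addscan(a):
    """Exclusive prefix sum by recursive pairing (length assumed a power of two)."""
    if len(a) <= 1:
        return [0]
    half = len(a) // 2
    # Sum adjacent pairs, halving the problem size.
    sums = [a[2 * i] + a[2 * i + 1] for i in range(half)]
    # Scan of the pair sums gives the results for even positions.
    evens = addscan(sums)
    # Odd positions add the preceding even element.
    odds = [evens[i] + a[2 * i] for i in range(half)]
    # interleave(evens, odds)
    return [x for pair in zip(evens, odds) for x in pair]
```

For example, `addscan([1, 2, 3, 4])` yields `[0, 1, 3, 6]`. Each level performs two maps over half the data, so the recursion depth is logarithmic in the input length.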

Fourier Transform
  function fft(a,w) =
    if #a == 1 then a
    else let
      r = {fft(b, w[0:#w:2]) : b in [a[0:#a:2], a[1:#a:2]]}
    in {a + b * w : a in r[0] ++ r[0]; b in r[1] ++ r[1]; w in w};
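A Python sketch of the same recursion is shown below. It assumes the input length is a power of two and that `w` holds all n-th roots of unity in order. The helper `roots_of_unity` and the negative-exponent sign convention are assumptions, since the slide does not say how `w` is built.

```python
import cmath


def roots_of_unity(n):
    # Assumption: forward-transform convention, w[k] = exp(-2*pi*i*k/n).
    return [cmath.exp(-2j * cmath.pi * k / n) for k in range(n)]


def fft(a, w):
    """Radix-2 FFT of list a, given the len(a) roots of unity in w."""
    if len(a) == 1:
        return a
    # Recurse on even- and odd-indexed elements with every other root.
    even, odd = (fft(b, w[::2]) for b in (a[0::2], a[1::2]))
    # Butterfly: repeating each half lets one zip cover all n outputs.
    return [x + y * z for x, y, z in zip(even + even, odd + odd, w)]
```

Doubling `even` and `odd` works because the second half of `w` is the negation of the first half, so the same expression produces both the sum and the difference terms.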

Sparse Vector Matrix Multiply
  function sparseMxV(M,v) =
    {sum({v[i]*w : i,w in row}) : row in M};
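In Python the same nested expression might look like the sketch below. It assumes each row of the matrix is stored as a list of (column index, value) pairs, which is how the `i,w in row` pattern in the slide reads.

```python
def sparse_mxv(m, v):
    """Multiply a sparse matrix (rows of (column, value) pairs) by a dense vector."""
    return [sum(v[i] * w for i, w in row) for row in m]


# Usage: a 3x3 matrix with three non-zero entries and one empty row.
matrix = [[(0, 2.0), (2, 1.0)], [], [(1, 3.0)]]
vector = [1.0, 2.0, 3.0]
result = sparse_mxv(matrix, vector)
```

Rows can have different lengths, so the inner reductions do unequal amounts of work; this is the kind of irregular nesting that flat collection languages handle poorly.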

MapReduce
  function mapReduce(MAP,REDUCE,documents) =
    let temp = flatten({MAP(d) : d in documents});
    in flatten({REDUCE(k,vs) : (k,vs) in collect(temp)});

  function wordcount(docs) =
    mapReduce(d => {(w,1) : w in wordify(d)},
              (w,c) => [(w,sum(c))],
              docs);

  wordcount(["this is is document 1",
             "this is document 2"]);

  [("1",1),("this",2),("is",3),("document",2),("2",1)]
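The same definition can be sketched in Python. Here `collect` is written out explicitly as a group-by-key, and `str.split` stands in for `wordify`; both are assumptions about helpers the slide leaves undefined.

```python
from collections import defaultdict
from itertools import chain


def collect(pairs):
    """Group (key, value) pairs into (key, [values]) pairs."""
    groups = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)
    return list(groups.items())


def map_reduce(map_fn, reduce_fn, documents):
    # flatten({MAP(d) : d in documents})
    temp = list(chain.from_iterable(map_fn(d) for d in documents))
    # flatten({REDUCE(k, vs) : (k, vs) in collect(temp)})
    return list(chain.from_iterable(reduce_fn(k, vs) for k, vs in collect(temp)))


def wordcount(docs):
    return map_reduce(lambda d: [(w, 1) for w in d.split()],  # split stands in for wordify
                      lambda w, c: [(w, sum(c))],
                      docs)


counts = wordcount(["this is is document 1", "this is document 2"])
```

The counts match those on the slide; the order of the pairs depends on how `collect` orders its groups, so it may differ from the order shown there.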
