Optimizing Compilers CISC 673 Spring 2009 Potential Languages of the Future Chapel, Fortress, X10. John Cavazos University of Delaware. Overview. Developed for DARPA HPCS Program High Productivity Computing Systems Chapel: Cascade High-Productivity Language Fortress: The new Fortran?
Optimizing Compilers • CISC 673 • Spring 2009 • Potential Languages of the Future: Chapel, Fortress, X10 • John Cavazos, University of Delaware
Overview • Developed for DARPA HPCS Program • High Productivity Computing Systems • Chapel: Cascade High-Productivity Language • Fortress: The new Fortran? • X10: A Parallel Variant of Java
Chapel • Chapel: Cascade High-Productivity Language • Characteristics: • Global-view parallel language • Support for general parallelism • Locality-aware • Object-oriented • Generic programming
Global vs Fragmented models • Global-view programming model • Algorithm/data structures expressed as a whole • Model executes as single thread upon entry • Parallelism introduced through language constructs • Examples: Chapel, OpenMP, HPF • Fragmented programming model • Algorithms expressed on a task-by-task basis • Explicit decomposition of data structures/control flow • Examples: MPI, UPC, Titanium
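The contrast between the two models can be illustrated with a small sketch. This is not taken from the slides: the averaging stencil, the function names, and the block decomposition are all assumptions chosen for illustration. The global-view version describes the computation over the whole array, while the fragmented version makes each task work out its own index range.

```python
def stencil_global(a):
    """Global-view: the computation is stated once, over the whole array."""
    return [(a[i - 1] + a[i + 1]) / 2 for i in range(1, len(a) - 1)]


def stencil_fragment(a, task_id, num_tasks):
    """Fragmented: one task computes only its own block of interior points.

    The programmer, not the compiler, derives the block bounds. In a real
    message-passing code the neighbours a[lo - 1] and a[hi] would live on
    other tasks and have to be exchanged explicitly; here they are simply
    read from the shared list to keep the sketch short.
    """
    interior = len(a) - 2
    lo = 1 + task_id * interior // num_tasks
    hi = 1 + (task_id + 1) * interior // num_tasks
    return [(a[i - 1] + a[i + 1]) / 2 for i in range(lo, hi)]


def stencil_fragmented(a, num_tasks):
    """Run every task in turn and concatenate the per-task results."""
    out = []
    for task_id in range(num_tasks):
        out.extend(stencil_fragment(a, task_id, num_tasks))
    return out
```

Both versions compute the same values; the difference is who writes the decomposition logic.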
Global vs Fragmented models • Global-view languages leave detail to compiler • Fragmented languages obfuscate code
Support for General Parallelism • “Single level of parallelism” • Prevalence of SPMD model • MPI (very popular) • Supports coarse-grained parallelism • OpenMP • Supports fine-grained parallelism • Should support “nested” parallelism • Should also cleanly support data/task parallelism
Data distribution and Locality • Hard for compiler to do good job of these • Responsibility of performance-minded programmer • Language should provide abstractions to: • control data distribution • control locality of interacting variables
Object-oriented Programming • Proven successful in mainstream languages • Separating interfaces from implementation • Enables code reuse • Encapsulate related code and data
Generic Programming • Algorithms are written without specifying types • Types are instantiated later • Latent types • Compiler can infer types from the program’s context • Variable type inferred from its initialization expression • Function argument types inferred from the actual arguments at call sites • If the compiler cannot infer a type, it reports an error • Chapel is statically typed • All types are inferred (and type checking done) at compile time • For performance reasons
Chapel: Data Parallelism
// a 2D ARITHMETIC DOMAIN storing indices (1,1) …(m,n)
var D: domain(2) = [1..m, 1..n];
// an m x n array of floating point values
var A: [D] float;
// an INFINITE DOMAIN storing string indices
var People: domain(string);
// array of integers indexed with strings in the People domain
var Age: [People] int;
People += "John";   // add string "John" to People domain
Age("John") = 62;   // set John's age
Chapel: Data Parallelism
// FORALL over domain of tuple of integers of domain D
forall ij in D { A(ij) = …; }
// FORALL over domain of strings from People domain
forall I in People { Age(I) = …; }
// Simple Example
forall I in 1..N do a(I) = b(I);
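For readers without a Chapel compiler, the simple `forall` example can be approximated as follows. This is only a sketch: the `forall` helper, the thread pool, and the 0-based indexing are assumptions of this example, not Chapel semantics. It shows the essential property, namely that iterations are independent and may run in any order.

```python
from concurrent.futures import ThreadPoolExecutor


def forall(domain, body, max_workers=4):
    """Apply body to every index in domain; iterations may run in any order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() drains the iterator, so every iteration has finished
        # (and any exception has been re-raised) before forall returns.
        list(pool.map(body, domain))


N = 8
b = [i * i for i in range(N)]
a = [0] * N


def copy_element(i):
    # Each iteration writes a distinct element, so no locking is needed.
    a[i] = b[i]


# Rough analogue of: forall I in 1..N do a(I) = b(I);
forall(range(N), copy_element)
```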
Chapel: Task Parallelism
// Begin Statement spawns new task
begin writeln("output from spawned task");
writeln("output from main task");
// Cobegin Statement
// synchronization happens at the end of the cobegin block
cobegin {
  stmt1();
  stmt2();
  stmt3();
}
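The `cobegin` behaviour (start every statement as its own task, then wait for all of them at the end of the block) can be mimicked with plain threads. The `cobegin` helper below is a hypothetical name for this sketch, not a Chapel or Python API.

```python
import threading


def cobegin(*stmts):
    """Run each callable in its own thread and wait for all of them."""
    threads = [threading.Thread(target=stmt) for stmt in stmts]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # the implicit synchronization at the end of the block


done = set()
cobegin(
    lambda: done.add("stmt1"),
    lambda: done.add("stmt2"),
    lambda: done.add("stmt3"),
)
```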
Chapel: Task Parallelism
// NOTE: Parallel tasks can coordinate with sync variables
var finishedMainOutput$: sync bool;
begin {
  finishedMainOutput$;
  writeln("output from spawned task");
}
writeln("output from main task");
finishedMainOutput$ = true;
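The sync-variable example relies on full/empty semantics: a read blocks until the variable has been written, and a write blocks until it has been read. A minimal sketch of that behaviour, assuming a hypothetical `SyncVar` class built on `threading.Condition`, could look like this.

```python
import threading


class SyncVar:
    """Full/empty variable: read waits until full, write waits until empty."""

    def __init__(self):
        self._cond = threading.Condition()
        self._full = False
        self._value = None

    def write(self, value):
        with self._cond:
            while self._full:
                self._cond.wait()
            self._value = value
            self._full = True
            self._cond.notify_all()

    def read(self):
        with self._cond:
            while not self._full:
                self._cond.wait()
            self._full = False  # reading leaves the variable empty again
            self._cond.notify_all()
            return self._value


output = []
finished_main_output = SyncVar()


def spawned_task():
    finished_main_output.read()  # blocks until the main task has written
    output.append("output from spawned task")


task = threading.Thread(target=spawned_task)
task.start()
output.append("output from main task")
finished_main_output.write(True)
task.join()
```

Because the spawned task cannot pass the read until the main task writes, the main task's output always comes first.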
Fortress Overview • Developed at Sun • Entirely new language • Fortress features • Targeted to scientific computing • Mathematical notation • Implicitly parallel whenever possible • Constructs and annotations to serialize when necessary • Whenever possible, implement language feature in library
Fortress: Task Parallelism • For loops • All iterations can execute in parallel • do … also do … end • Can specify parallel tasks • Tuples • Set of parallel expressions or functions
Fortress: for loop parallelism • For loops • [Slide figure: sample output 5 4 6 3 7 2 9 10 1 8, the iterations 1 through 10 in a non-sequential order]
Fortress: Task Parallelism Examples
• tuples
(factorial(10), factorial(5), factorial(2))
• do … also do … end
do
  factorial(10)
also do
  factorial(5)
also do
  factorial(2)
end
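The tuple form evaluates its component expressions as parallel tasks and yields all of the results together. A rough Python analogue, assuming a hypothetical `parallel_tuple` helper that takes zero-argument callables, is sketched below.

```python
from concurrent.futures import ThreadPoolExecutor
from math import factorial


def parallel_tuple(*thunks):
    """Evaluate each zero-argument callable as its own task; return a tuple."""
    with ThreadPoolExecutor(max_workers=len(thunks)) as pool:
        futures = [pool.submit(thunk) for thunk in thunks]
        # Results are collected in argument order, whatever order tasks finish.
        return tuple(f.result() for f in futures)


results = parallel_tuple(
    lambda: factorial(10),
    lambda: factorial(5),
    lambda: factorial(2),
)
```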
Fortress: atomic expressions Note: Z can be 2 or 0, but not 1!
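The code for this slide did not survive in the transcript. The sketch below is a guess at the kind of example the note describes, not the original: it assumes one task atomically increments two variables while another atomically reads their sum. A single global lock stands in for Fortress's `atomic` blocks, which is a simplification.

```python
import threading

atomic_lock = threading.Lock()  # stand-in for an atomic region
state = {"x": 0, "y": 0}


def writer():
    with atomic_lock:
        # Both updates become visible together, never just one of them.
        state["x"] += 1
        state["y"] += 1


def reader(result):
    with atomic_lock:
        result.append(state["x"] + state["y"])


def run_once():
    """Race one writer against one reader and return the observed sum."""
    state["x"] = state["y"] = 0
    result = []
    threads = [
        threading.Thread(target=writer),
        threading.Thread(target=reader, args=(result,)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result[0]


observed = {run_once() for _ in range(200)}
```

The reader sees the state either before or after the writer's block, so the sum is 0 or 2; the intermediate value 1 is never visible.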
Fortress: Regions • Every thread, object, element in the array has an associated region • Hierarchically form a tree • Describe machine resources
X10 Overview • Developed at IBM • X10 is an extended subset of Java • Base language = Java 1.4 language
Fixes some Java limitations • Java programming model: single uniform heap • X10 introduces partitioned global address spaces • Java intra-node and inter-node parallelism is heavyweight • Threads and messages/processes are too heavyweight • X10 introduces asynchronous activities
X10 != Java • Some features removed from Java language • Java Concurrency -- threads, synchronized • Java Arrays replaced with X10 arrays • Java dynamic class loading removed • Some features added to Java language • Concurrency -- async, finish, foreach, ateach, etc. • Distribution – block, blockCyclic, etc. • X10 arrays -- distributed arrays according to A.distribution
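The distribution names refer to standard ways of mapping array indices to places. The functions below sketch the usual textbook definitions for a one-dimensional array; the function names and the handling of sizes that do not divide evenly are assumptions of this example and may differ from X10's own rules.

```python
def block(n, places):
    """Contiguous chunks: index i goes to place i // ceil(n / places)."""
    chunk = -(-n // places)  # ceiling division
    return [i // chunk for i in range(n)]


def cyclic(n, places):
    """Round-robin: index i goes to place i mod places."""
    return [i % places for i in range(n)]


def block_cyclic(n, places, block_size):
    """Blocks of block_size indices are dealt out to places round-robin."""
    return [(i // block_size) % places for i in range(n)]
```

For example, `block(8, 4)` assigns two consecutive indices to each place, while `block_cyclic(8, 2, 2)` alternates blocks of two between two places.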
X10 Concurrency • Distributed Collections • Map collection elements to places • Collection<D,E> is a collection with distribution D and element type E • Parallel Execution • foreach (point p: R) S • Creates |R| async statements in parallel at current place • async (P) S • Creates a new activity to execute statement S at place P
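The relationship between `async`, `foreach`, and an enclosing `finish` can be sketched in Python. Everything here is an approximation under stated assumptions: each place is modelled as a thread pool, and the `Finish` class with its `async_at` and `foreach` methods is a hypothetical helper, not X10's API.

```python
from concurrent.futures import ThreadPoolExecutor, wait


class Finish:
    """Tracks activities spawned in its scope and waits for all on exit."""

    def __init__(self):
        self._futures = []

    def __enter__(self):
        return self

    def async_at(self, place, fn, *args):
        # Analogue of `async (P) S`: start a new activity at place P.
        self._futures.append(place.submit(fn, *args))

    def foreach(self, place, region, fn):
        # Analogue of `foreach (point p: R) S`: one activity per point.
        for p in region:
            self.async_at(place, fn, p)

    def __exit__(self, exc_type, exc, tb):
        wait(self._futures)
        for f in self._futures:
            f.result()  # re-raise any exception from an activity
        return False


places = [ThreadPoolExecutor(max_workers=2) for _ in range(2)]
squares = {}


def record_square(p):
    squares[p] = p * p


with Finish() as scope:
    scope.foreach(places[0], range(4), record_square)  # |R| = 4 activities
    scope.async_at(places[1], record_square, 10)       # one activity elsewhere

for place in places:
    place.shutdown()
```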