This presentation by Kathy Yelick delves into the evolution and future directions of Programming Languages and Systems (PL&S) research at Berkeley. It discusses historical trends in language design, compiler development, and software engineering while acknowledging ongoing challenges such as performance, reliability, and programmer productivity. Yelick highlights various Berkeley projects, like Titanium and BANE, and examines the complexities of software reliability and correctness in an increasingly software-dependent world. The talk invites reflection on the future of comparative language design and compiler innovation.
Programming Languages and Systems at Berkeley and Beyond: Past, Present, and Future • Kathy Yelick
The Questions • Programming Languages and Systems (PL&S): • aka Languages: • this is too narrow (some of us don’t do much “language” research) • aka Software: • this is too broad (what doesn’t involve software?) • Who are we? • What do we do?
The Culture of PL&S • The middle management of EECS • Blamed for • slow execution time • buggy software • low programmer productivity • languages that are too big, restrictive, ugly, etc. • Need to have control over • hardware complexity • programmer quality • consumers (features over robustness)
The Big Motivators • Ease of Programming • Hardware costs -> 0 • Software costs -> infinity • Correctness • Increasing reliance on software increases cost of software errors (medical, financial, etc.) • Performance • Increasing machine complexity • New languages and applications • Enabling Java; network packet filters
History of Programming Language Research • [Timeline figure spanning the 70s, 80s, 90s, and 2K] • Topics shown: General Purpose Language Design; Domain-Specific Language Design; Parsing Theory; Type Systems Theory; Flop optimization; Memory Optimizations; Data and Control Analysis; Type-Based Analysis; Garbage Collection; Threads; Program Verification; Program Checking Tools
Topics • Programming Language and Systems Research • Language Design • Compilers & Tools • Libraries & Runtime Systems • Software Engineering • Berkeley Projects: Current and Future • BANE • Titanium • Proof Carrying Code • Future Emphasis: Reliability
Language Design • Economics of programming languages • Programmer training is the dominant cost • implies languages are rarely replaced • Languages are adopted to fill a void • not because of language quality • Is there anything left for PL designers? • Niche languages: • Everyone does language design, but doing it well is hard • Understanding languages: • E.g., Titanium’s type system is sound, Split-C’s is not • Language design at Berkeley: • Lisp (Fateman), Ada (Hilfinger), Tioga (*), Titanium (*)
Compilers and Tools • Economics of compilers • Large industrial teams built commercial compilers • How can academia compete? • Focus on new algorithms and future problems • Need software infrastructure for experiments • from others (SUIF, gcc) or our own (Titanium, BANE) • Compilers and Runtime Systems at Berkeley • Historical and continuing strength • Code gen, profiling (Graham), sw pipelining (Aiken) • Analysis and optimization of parallel code (Yelick) • Automatic (compile-time) memory management (Aiken) • Environments (Graham, Fateman)
Libraries • Open problems in complex platforms/applications • Scientific libraries (overlaps with SciComp group) • Parallel and distributed machines • Economics of Libraries • Market and competition are less intense • Can’t afford to hand-code for each machine • Berkeley strength: • Load balancing (Graham, Yelick, and many others) • Data structures (Yelick), matrices (Demmel, Kahan, Yelick), Meshes (Shewchuk) • High precision (Demmel, Fateman, Kahan, Shewchuk) • Symbolic (Fateman, Kahan) • New: tools to automate library construction
Software Engineering • Economics of Software Engineering • Robust software is expensive • Old approaches: • Formal: Verification, specification • Informal: Software process, patterns • What Berkeley is doing: • Automatic analysis of large programs (Aiken) • Software fault isolation (Graham) • Proof Carrying Code (Necula) • Model checking (Henzinger, Brayton, S-V) • Experience (lots of large software construction projects) • What’s missing? • “Core” Software Engineering
Projects: Titanium • Problem: portable scientific computing • The Approach • Domain-specific language and compiler: • Old applications: astrophysics, combustion • New applications in Bioengineering • modeling the cell to cure cancer (Arkin) • modeling bio-MEMS devices for treatment (Liepmann) • Language design • Dialect of Java with in-house compiler (to C) • Support for fast, safe multidimensional arrays • Types for distributed data, regions • Optimizations • Communication, memory, arrays, synchronization
Projects: BANE • Problem: removing bugs from large programs • The Approach • automatic analysis • discover small facts about big programs • Target: 1,000,000 line systems • Examples: • Find relay races in RLL programs • RLL used in >50% of factories, at Disneyland, etc. • Prove C programs are Y2K ready • CVS 1.10 is OK, CVS 1.9 is not • Detect buffer overruns in security-critical code
Projects: Proof Carrying Code • The Problem: • How can I trust code from another language, person, machine? • The Approach: • programs carry a proof of what they promise • Semantic analog of digital signatures • Properties often from program analysis (e.g., types) • Passed through compilation by validating translations • client’s cheap trusted verifier checks the proof • Applications • Very fast network packet filters • “Native code” in ML that is safe • Mobile code security
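As a rough illustration of the approach on the slide above (code ships with evidence, and the client runs only a cheap check), the sketch below uses a hypothetical toy stack machine in Python. The instruction set, the `verify` function, and the choice of "claimed stack depth before each instruction" as the proof format are all assumptions made for this example. Real proof-carrying code attaches logical proofs to native code and is considerably richer than this.

```python
from typing import List, Tuple

# Hypothetical toy instruction set: (opcode, argument).
# PUSH n: push a constant; ADD: pop two, push one; POP: pop one;
# JMPZ t: pop one, jump to t if it was zero; JMP t: jump; HALT: stop.
Instr = Tuple[str, int]

# Stack effect (values popped, values pushed) for each opcode.
EFFECT = {"PUSH": (0, 1), "ADD": (2, 1), "POP": (1, 0),
          "JMPZ": (1, 0), "JMP": (0, 0), "HALT": (0, 0)}


def verify(code: List[Instr], proof: List[int]) -> bool:
    """Check the producer's claims in a single linear pass.

    proof[i] is the claimed stack depth on entry to instruction i.
    The verifier never searches for a proof; it only checks that each
    claim is consistent with the claims at its successor instructions.
    """
    if not code or len(proof) != len(code) or proof[0] != 0:
        return False
    for pc, (op, arg) in enumerate(code):
        if op not in EFFECT:
            return False
        pops, pushes = EFFECT[op]
        depth = proof[pc]
        if depth < pops:  # executing this would underflow the stack
            return False
        after = depth - pops + pushes
        if op == "HALT":
            continue
        successors = []
        if op in ("JMP", "JMPZ"):
            successors.append(arg)      # jump target
        if op != "JMP":
            successors.append(pc + 1)   # fall-through
        for s in successors:
            # Reject out-of-range targets and inconsistent claims.
            if not (0 <= s < len(code)) or proof[s] != after:
                return False
    return True
```

In this sketch, a program whose annotations pass `verify` cannot underflow the stack or jump outside the code on any reachable path. The consumer trusts only the small verifier, not the producer or whatever tool generated the annotations.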
Reliable Computing (Future) • Problem: build more reliable systems • Approaches: • Build from reliable components • Better languages for system design (H*) • Better environments for particular domains (F,G) • Build semantic models of system behavior (A,H,N) • Build reliable systems from unreliable components by spending cheap hardware resources (H,K,P,Y) • Introspection of network, disks, processor, software • Use statistical models to determine normal/abnormal • Fault tolerant, self-scrubbing data structures • Redundant computation: catch transient errors
Summary of PL&S at Cal • Good coverage in core language and compiler work • People move with opportunities • Traditional boundaries becoming blurred • Strength in analysis • Semantics with practical applications • Strength in collaborative work • Systems: Culler, Kubiatowicz, Patterson • Scientific computing: inside and outside department • Areas that are not well represented • Core Software Engineering • Logic
Faculty • Alex Aiken • Richard Fateman • Susan Graham • Mike Harrison • Tom Henzinger • Paul Hilfinger • George Necula • Kathy Yelick
Long Term • Language research can be loooong term • e.g., garbage collection • [Figure labels: Mobile Ambients, Partial Evaluation, Monads, Continuations, Pi Calculus, Regions, Software Fault Isolation, Type Inference, Set-Based Analysis, Proof Carrying Code]
Executive Summary • Anything related to programming • How do we know it does what we think it does? • A mix of • theory • systems • human factors
Language Design: History • 70s & 80s: • Design better general purpose languages • pure functional, object-oriented, logic… • Lisp (Fateman), Ada (Hilfinger) • 90s & 2Ks: • Domain-specific languages • Tioga (Stonebraker, Hellerstein, Aiken) • Titanium (Graham, Yelick, Hilfinger, Aiken) • Understanding semantics: type soundness, etc. • Titanium’s pointer types are sound (Split-C’s are not) • Good language design is hard • Almost everyone does it
Language Technology without Languages • Increasing connections to other areas of CS • transfer of PL ideas to non-language tools • avoids language adoption problems • foundational ideas are portable • High-performance thread systems • based on CPS conversion • Low overhead virtual machines • uses software fault isolation • More to come . . .
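To make the thread-system bullet above more concrete, here is a minimal Python sketch of how CPS (continuation-passing style) conversion can turn an ordinary loop into steps that a user-level scheduler interleaves. The names `count_cps` and `run` are hypothetical, and the round-robin scheduler is an assumption chosen for brevity. This illustrates the general technique, not the design of any particular Berkeley system.

```python
from collections import deque
from typing import Any, Callable, Deque, List

# A thunk is a suspended computation: calling it runs one step and
# returns the next thunk, or None when the thread has finished.
Thunk = Callable[[], Any]


def count_cps(name: str, i: int, n: int, log: List[str], k: Thunk) -> Any:
    """CPS form of: for j in range(i, n): log.append(f"{name}{j}")."""
    if i == n:
        return k()  # loop finished: invoke the continuation
    log.append(f"{name}{i}")
    # Yield point: hand the rest of the loop back to the scheduler
    # as a thunk instead of calling it directly.
    return lambda: count_cps(name, i + 1, n, log, k)


def run(threads: List[Thunk]) -> None:
    """Round-robin scheduler: run one step of each ready thread in turn."""
    ready: Deque[Thunk] = deque(threads)
    while ready:
        nxt = ready.popleft()()
        if nxt is not None:
            ready.append(nxt)
```

Because every step returns to `run` instead of calling its successor, each thread's state lives in a small closure and no per-thread stack is needed. That property is one reason CPS-based thread systems can be lightweight.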
Interests and Collaborations • [Figure labels: Compilers, Software Engineering, Semantics, Systems, Programming Language Design, Logic]