
Compiling High-level Access Interfaces for Multi-site Software

CHAIMS: Mega-Programming Research. Compiling High-level Access Interfaces for Multi-site Software. Stanford University. Objective: Investigate revolutionary approaches to large-scale software composition.


Presentation Transcript


  1. CHAIMS: Mega-Programming Research Compiling High-level Access Interfaces for Multi-site Software Stanford University Objective: Investigate revolutionary approaches to large-scale software composition. Approach: Develop & validate a composition-only language. Contributions and plans: Hardware and software platform independence. Asynchrony by splitting up CALL-statement. Performance optimization by invocation scheduling. Potential for multi-site dataflow optimization. www-db.stanford.edu/CHAIMS CHAIMS

  2. Participants • Support • DARPA ISO EDCS program (1996-1999) • Siemens Corporate Research (1996-1998) • DoD AFOSR AASERT student support (1997-1999) • Sloan Foundation - computer industry study (1996-97) • People • Gio Wiederhold (Prof. Res) PI - Marianne Siroker (Administration) • Dorothea Beringer (postdoc EPF Lausanne) since Dec.1997 • Ron Burback (CS PhD cand.) - Neil Sample (CS PhD Student) • Laurence Melloul (CS MS) - Woody Pollack (CS MS) • MS and BS CS graduated: Joshua Hui, Gaurav Bhatia, Prasanna Ramaswami, Kirti Kwatra, Pankaj Jain, Mehul Bastawala, Catherine Tornabene, Wayne Lim (I.E.), Connan King (E.E.). • Louis Perrochon (postdoc ETH Zurich) Fall quarter 1996 CHAIMS

  3. Gio Wiederhold: Personal Background • 1936 born Varese, Italy • 1957: Learned programming at NATO SHAPE ADTC • 1958-1975 Programmer and software engineer at IBM, UC, Stanford, Index, MaSCOR • 1963 - now Consultant for government, industry • 1974-1976 PhD on Database Design at UC SF • 1976 - now Professor Stanford: Computer Science, Medicine, Electrical Eng., Business School • Elected fellow ACMI, IEEE, ACM • Innovations: solid rocket fuel combustion - A-format - incremental compilers - timeshared real-time data acquisition - time-oriented databases - database design - knowledge-based system concepts - object creation from relations - mediators - security filters. CHAIMS

  4. Dorothea Beringer: Personal Background • Masters in Computer Science: hybrid-monitoring tool for debugging and software performance analysis for distributed software • Software engineer: telecommunication systems • Consultant: software methodologies, quality assurance, project management, CASE-tools • PhD: Modeling scenarios in object-oriented analysis • Teaching: Fusion • Now: CHAIMS -- large-scale software composition, distributed systems CHAIMS

  5. Presentation • Motivation and Objectives • changes in software production • basis for new visions and education • Concepts of CHAIMS • CHAIMS language • CHAIMS architecture and composition process • Scheduling • Dataflow optimization • Status, Plans, Conclusions CHAIMS

  6. Shift in Programming Tasks [Figure: relative share of integration versus coding tasks; time axis 1970, 1990, 2010.] CHAIMS

  7. Languages & Interfaces • Large languages intended to support coding and composition have not been successful • Algol 68 • PL/1 • Ada • CLOS • Databases are being successfully composed, using client-server and mediator architectures • distribution -- exploit network capabilities • heterogeneity -- autonomy creates heterogeneity • simple schemas -- some human interpretation • service model -- public and commercial sources CHAIMS

  8. Typical Scenario: Logistics A general has to ship troops and/or various material from San Diego NOSC to Washington DC: • different kinds of material: criteria for preferred transport differ • not every airport is equally suited • congestion, prices • actual weather • certain due or ready dates Today: calling different companies, looking up information on the web, making reservations by hand Tomorrow: system proposes possibilities that take into account various conditions • hand-coded systems • composition of processes CHAIMS

  9. Scaling alternatives ? CHAIMS

  10. CHAIMS [Figure: a megaprogram for composition, written by a domain programmer; the CHAIMS system automates generation of the client for a distributed system; the megamodules are provided by various megamodule providers.] CHAIMS

  11. Megamodules - Definition Megamodules are large, autonomous, distributed, heterogeneous services or processes. • large: computation intensive, data intensive, ongoing processes (monitoring services) • distributed: to be used by more than one client • heterogeneous: accessible by various distribution protocols (not only different languages and systems) • autonomous: maintenance and control over resources remain with the provider, differing ontologies ( ==> SKC) Examples: • logistics: “find best transportation route from A to B”, reservation systems • genomics: easier framework for composing various processing tools than ad-hoc coding CHAIMS

  12. Challenge: Fat Clients [Figure: the domain expert's client computer handles I/O, control, and computation; it reaches data resources and services a to e through wrappers that resolve differences.] CHAIMS

  13. Challenge: Thin Clients [Figure: the domain expert's client workstation with I/O modules; megamodules at several sites provide computation services and data resources.] CHAIMS

  14. Challenge: Heavy-weight Services Services are not free for a client: • execution time of a service • transfer time for data • fees for services What we need: ==> monitoring progress of a service ==> possibility to choose among equivalent services based on estimated waiting time and fees ==> parallelism among services ==> preliminary overview results, choosing level of accuracy / number of results for complex processes ==> novel optimization techniques CHAIMS

  15. Challenge: Empower Non-technical Domain Experts Company providing services: • domain experts of domain of service (e.g. weather) • technical experts for programming for distribution protocols, setting up servers in a middleware system • marketing experts “Megaprogrammer”: • is domain expert of domain that uses these services • is not technical expert of middleware system or experienced programmer • wants to focus on problem at hand (= results of using megaprogram) • e.g. scientist, logistics officer CHAIMS

  16. Challenge: Purely Compositional Language Possible? Which languages did succeed? • Algol, Ada: integrated composition and computation • C, C++: focus on computation Why a new language? • complexity: not all facilities of a common language (compare to approach of Java) • inhibiting traditional computational programming (compare C++ and Smalltalk concerning object-oriented programming) • focus on issue of composition, parallelism by asynchrony, and optimization CHAIMS

  17. CHAIMS “Logical” Architecture Customer Megaprogram clients (in CHAIMS) Network/Transport (DCE, CORBA,...) Megamodules (Wrapped or Native) CHAIMS

  18. CHAIMS Physical Architecture Megaprogram Clients in CHAIMS Network DCE, CORBA, JAVA RMI, DCOM... Megamodules (wrapped, native) each supporting setup, estimate, invoke, examine, extract, and terminate. CHAIMS

  19. Decomposing CALL statements CALL gained functionality • Copying • Code sharing • Parameterized computation • Objects with overloaded method names • Remote procedure calls to distributed modules • Constrained (black box) access to encapsulated data progress in scale of computing CHAIMS decomposes CALL functions Setup Estimate Invoke Examine Extract CHAIMS

  20. CHAIMS Primitives Pre-invocation: • SETUP: set up the connection to a megamodule • SET-, GETATTRIBUTES: set global parameters in a megamodule • ESTIMATE: get estimate of execution time for optimization Invocation and result gathering: • INVOKE: start a specific method • EXAMINE: test status of an invoked method • EXTRACT: extract results from an invoked method Termination: • TERMINATE: terminate a method invocation or a connection to a megamodule Control: • WHILE, IF Utility: • GETPARAM: get default parameters CHAIMS
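
The primitives listed on this slide can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: the class names `MegamoduleHandle` and `InvocationHandle`, the thread pool, and the dummy `all_routes` method are hypothetical stand-ins and not part of CHAIMS; a real megamodule would be reached over CORBA, RMI, or DCE.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict

DONE, NOT_DONE = "DONE", "NOT_DONE"


class InvocationHandle:
    """Hypothetical stand-in for a CHAIMS invocation handle."""

    def __init__(self, future: Future) -> None:
        self._future = future

    def examine(self) -> str:
        # EXAMINE: report status without blocking.
        return DONE if self._future.done() else NOT_DONE

    def extract(self) -> Any:
        # EXTRACT: fetch the result (blocks if the method is still running).
        return self._future.result()

    def terminate(self) -> None:
        # TERMINATE: cancels only if the call has not started yet.
        self._future.cancel()


class MegamoduleHandle:
    """Hypothetical stand-in for the handle returned by SETUP."""

    def __init__(self, methods: Dict[str, Callable[..., Any]],
                 estimates: Dict[str, float]) -> None:
        self._methods = methods
        self._estimates = estimates          # assumed, fixed time estimates
        self._attributes: Dict[str, Any] = {}
        self._pool = ThreadPoolExecutor(max_workers=4)

    def set_attributes(self, **attrs: Any) -> None:
        # SETATTRIBUTES: global parameters valid for later invocations.
        self._attributes.update(attrs)

    def estimate(self, method: str) -> float:
        # ESTIMATE: expected execution time of a method.
        return self._estimates[method]

    def invoke(self, method: str, **params: Any) -> InvocationHandle:
        # INVOKE: start the method and return immediately.
        merged = {**self._attributes, **params}
        return InvocationHandle(self._pool.submit(self._methods[method], **merged))

    def terminate(self) -> None:
        # TERMINATE: close the connection to the megamodule.
        self._pool.shutdown(wait=True)


def all_routes(pair_of_cities, criterion="cost"):
    # Dummy method body, for illustration only.
    a, b = pair_of_cities
    return [f"{a}->{b} direct ({criterion})"]


route_mmh = MegamoduleHandle({"AllRoutes": all_routes}, {"AllRoutes": 2.0})
route_mmh.set_attributes(criterion="time")
route_ih = route_mmh.invoke("AllRoutes", pair_of_cities=("San Diego", "Washington DC"))
while route_ih.examine() != DONE:
    pass  # a megaprogram could start other invocations here
routes = route_ih.extract()
route_mmh.terminate()
```

Because `invoke` returns a handle instead of a result, the caller decides when to wait, which is the point of splitting up the CALL statement.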

  21. Megaprogram Example: Overview General I/O-megamodule • Input function takes as parameter a default data structure containing names, types and default values for expected input Travel information: • Computing all possible routes between two cities • Computing the air and ground cost for each leg given a list of city-pairs and data about the goods to be transported Two megamodules that offer equivalent functions for calculating optimal routes • Optimum and BestRoute both calculate the optimum route given routes and costs • Global variables: Optimization can be done for cost or for time InputOutput - Input - Output RouteInfo - AllRoutes - CityPairList - ... AirGround - CostForGround - CostForAir - ... Routing - BestRoute - ... RouteOptimizer - Optimum - ... CHAIMS

  22. Megaprogram Example: Code
      // Setup connections to megamodules.
      io_mmh = SETUP ("InputOutput")
      route_mmh = SETUP ("RouteInfo")
      ...
      // Set global variables valid for all invocations of this client.
      best2_mmh.SETATTRIBUTES (criterion = "cost")
      // Get information from the megaprogram user about the goods to be
      // transported and about the two desired cities.
      cities_default = route_mmh.GETPARAM(Pair_of_Cities)
      input_cities_ih = io_mmh.INVOKE ("input", cities_default)
      WHILE (input_cities_ih.EXAMINE() != DONE) {}
      cities = input_cities_ih.EXTRACT()
      ...
      // Get all routes between the two cities.
      route_ih = route_mmh.INVOKE ("AllRoutes", Pair_of_Cities = cities)
      WHILE (route_ih.EXAMINE() != DONE) {}
      routes = route_ih.EXTRACT()
      // Get all city pairs in these routes.
      // Calculate the costs of all the routes.
      …
      // Figure out the optimal megamodule for picking the best route.
      IF (best1_mmh.ESTIMATE("Best_Route") < best2_mmh.ESTIMATE("Optimum") )
      THEN {best_ih = best1_mmh.INVOKE ("Best_Route", Goods = info_goods, Pair_of_Cities = cities, List_of_Routes = routes, Cost_Ground = cost_list_ground, Cost_Air = cost_list_air)}
      ELSE {best_ih = best2_mmh.INVOKE ("Optimum", Goods = info_goods, …
      // Pick the best route and display the result.
      ...
      // Terminate all invocations
      best2_mmh.TERMINATE()
      CHAIMS
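
The IF statement in this megaprogram picks between two equivalent megamodules by comparing their ESTIMATE values. A minimal Python sketch of that selection step follows; the `Service` tuple, the estimate values, and the toy route strings are assumptions made for illustration, not CHAIMS data.

```python
from typing import Callable, List, NamedTuple


class Service(NamedTuple):
    name: str
    estimate: Callable[[], float]        # stands in for ESTIMATE
    invoke: Callable[[List[str]], str]   # stands in for INVOKE plus EXTRACT


def choose_service(services: List[Service]) -> Service:
    # Pick the service with the lowest estimate; min() keeps the first on ties.
    return min(services, key=lambda s: s.estimate())


# Two hypothetical, functionally equivalent route optimizers.
best_route = Service("Best_Route", lambda: 12.0, lambda routes: min(routes, key=len))
optimum = Service("Optimum", lambda: 7.5, lambda routes: sorted(routes)[0])

chosen = choose_service([best_route, optimum])
result = chosen.invoke(["SAN-ORD-IAD", "SAN-IAD"])
```

In the slide's code a tie falls to the ELSE branch (`Optimum`), whereas `min` returns the first candidate; the list order would have to be swapped to reproduce that detail.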

  23. Operation of one Megamodule • SETUP • SETATTRIBUTES provides context • ESTIMATE serves scheduling • INVOKE initiates remote computation • EXAMINE checks for completion • EXTRACT obtains results • TERMINATE [Figure: the annotations "M handle", "I handle", and "I / ALL" mark which handle each primitive uses, a megamodule (M) handle or an invocation (I) handle.] CHAIMS

  24. CHAIMS Megaprogr. Language Purely compositional: • no primitives for arithmetic ==> math megamodules • no primitives for input/output ==> general and problem-specific I/O megamodules Splitting up CALL-statement: • parallelism by asynchrony in sequential program • novel possibilities for optimizations • reduction of complexity of invoke statements • higher-level language (assembler => HLLs, HLLs => composition/megamodule paradigm) CHAIMS

  25. Architecture: Runtime [Figure: the CSRT (compiled megaprogram) reaches megamodules a to e through the distribution system (CORBA, RMI…).] CHAIMS

  26. Architecture: Composition Process [Figure: the megamodule provider wraps non-CHAIMS-compliant megamodules (wrapper templates) and adds information to the CHAIMS repository; megamodules a to e.] CHAIMS

  27. Architecture: Composition Process [Figure: the megaprogrammer writes the megaprogram (in the CHAIMS language); the CHAIMS compiler takes information from the CHAIMS repository and generates the CSRT (compiled megaprogram).] CHAIMS

  28. Architecture: Overview [Figure: the megamodule provider wraps non-CHAIMS-compliant megamodules (wrapper templates) and adds information to the CHAIMS repository; the megaprogrammer writes the megaprogram (in the CHAIMS language); the CHAIMS compiler takes information from the repository and generates the CSRT (compiled megaprogram), which reaches megamodules a to e through the distribution system (CORBA, RMI…).] CHAIMS

  29. Architecture: CHAIMS-Language and CHAIMS-Protocols CHAIMS-language: The CHAIMS API defines the interface between megaprogrammer and megaprogram; the megaprogram is written in the CHAIMS language. CHAIMS-protocols: The CHAIMS protocols define the calls the megamodules have to understand. These protocols are slightly different for the different distribution protocols, and are defined by an IDL for CORBA, another IDL for DCE, and a Java class for RMI. [Figure: megaprogrammer, megaprogram, CORBA-idl, DCE-idl, Java-class, megamodules.] CHAIMS

  30. Architecture: Gentype Minimal Typing in CHAIMS: Integer, boolean only for control All else is placed into an ASN.1 bag, transparent to compiler : A Gentype is a triple of name, type and value, where value is either a simple type or a list of other gentypes (i.e. a complex type). Simple types: given by ASN.1, the ASN.1-conversion library for C++, our own conversion routines. Example: Person_Information Name of Person complex Personal Data complex Address First Name string Joe Last Name string Smith Date of Birth date 6/21/54 Soc.Sec.No string 345-34-345 CHAIMS
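
The Gentype triple described on this slide maps naturally onto a small recursive record. The following Python sketch is an illustration only: the `Gentype` dataclass and its `find` helper are hypothetical, the nesting of the Person_Information example is assumed from the order of fields on the slide, the Address subtree is omitted because its contents are not shown, and no ASN.1 / BER encoding is performed.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Gentype:
    """A (name, type, value) triple; value is simple or a list of Gentypes."""
    name: str
    type: str
    value: Union[str, int, List["Gentype"]]

    def is_complex(self) -> bool:
        return isinstance(self.value, list)

    def find(self, name: str) -> Optional["Gentype"]:
        # Depth-first search for the first element with the given name.
        if self.name == name:
            return self
        if self.is_complex():
            for child in self.value:
                hit = child.find(name)
                if hit is not None:
                    return hit
        return None


# Assumed nesting of the slide's example.
person = Gentype("Person_Information", "complex", [
    Gentype("Name of Person", "complex", [
        Gentype("First Name", "string", "Joe"),
        Gentype("Last Name", "string", "Smith"),
    ]),
    Gentype("Personal Data", "complex", [
        Gentype("Date of Birth", "date", "6/21/54"),
        Gentype("Soc.Sec.No", "string", "345-34-345"),
    ]),
])

last_name = person.find("Last Name")
```

A megaprogram would pass such a structure around as an opaque blob; only wrappers and megamodules would look inside it.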

  31. Wrapper: CHAIMS Compliance • CHAIMS protocol: support all CHAIMS primitives • State management and asynchrony: • clientId (megamodule handle in CHAIMS language) • callId (invocation handle in CHAIMS language) • results must be stored for possible extraction(s) until termination of the invocation • Data transformation: • all parameters of type blob (BER-encoded Gentype) must be converted into the megamodule-specific data types (combination of hand-coding and decoding routines) CHAIMS
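
The state-management requirement above (clientId, callId, results kept until termination) can be sketched as follows. This is a hedged illustration: the `Wrapper` class, its method names, and the dummy cost function are hypothetical; calls run synchronously for brevity, and the BER decoding step is left out.

```python
import itertools
from typing import Any, Callable, Dict


class Wrapper:
    """Hypothetical wrapper that keeps per-client and per-call state."""

    def __init__(self, methods: Dict[str, Callable[..., Any]]) -> None:
        self._methods = methods
        self._client_ids = itertools.count(1)
        self._call_ids = itertools.count(1)
        self._clients: Dict[int, Dict[str, Any]] = {}
        self._results: Dict[int, Any] = {}   # callId -> stored result

    def setup(self) -> int:
        # Returns a clientId (the megamodule handle seen by the megaprogram).
        client_id = next(self._client_ids)
        self._clients[client_id] = {}
        return client_id

    def invoke(self, client_id: int, method: str, **params: Any) -> int:
        # Returns a callId (the invocation handle); the result is stored.
        if client_id not in self._clients:
            raise KeyError(f"unknown clientId {client_id}")
        call_id = next(self._call_ids)
        self._results[call_id] = self._methods[method](**params)
        return call_id

    def extract(self, call_id: int) -> Any:
        # Results stay available for repeated extraction.
        return self._results[call_id]

    def terminate(self, call_id: int) -> None:
        # Only termination releases the stored result.
        self._results.pop(call_id, None)


# Dummy megamodule method, for illustration only.
wrapper = Wrapper({"Cost_for_Ground": lambda city_pairs: [len(a) + len(b) for a, b in city_pairs]})
client_id = wrapper.setup()
call_id = wrapper.invoke(client_id, "Cost_for_Ground", city_pairs=[("SAN", "IAD")])
first = wrapper.extract(call_id)
second = wrapper.extract(call_id)
wrapper.terminate(call_id)
still_stored = call_id in wrapper._results
```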

  32. Architecture: Three Views Composition View (megaprogram) - composition of megamodules - directing of opaque data blobs Data View - exchange of data - interpretation of data - in/between megamodules CHAIMS Layer Transport View moving around data blobs and CHAIMS messages Distribution Layer Objective: Clear separation between composition of services, computation of data, and transport CHAIMS

  33. Scheduler: Decomposed Execution [Figure: timelines comparing synchronous, asynchronous, and decomposed (no benefit for one module) execution of a remote method, showing the time available for other methods; legend: s = setup / set attributes, i = invoke a method, e = extract results.] CHAIMS

  34. Optimized Execution of Modules [Figure: non-optimized versus optimized timelines for modules M1 to M5, with M3 (>M1+M2) and M4 (<M1+M2); arrows mark data dependencies; optimized by scheduler according to estimates; legend: i = invoke a method, e = extract results.] CHAIMS

  35. Decomposed Parallel Execution Long setup times occur, for instance, when a subset of a large database has to be loaded for a simple search, say transatlantic flights for an optimal arrival. [Figure: timelines for modules M1 to M5, with M3 (>M1+M2) and M4 (<M1+M2); optimized by scheduler according to estimates; legend: set up / set attributes, invoke a method, extract results.] CHAIMS

  36. Decomposed Optimized Execution [Figure: timelines for modules M1 to M5, with M3 (>M1+M2) and M4 (<M1+M2), compared with the prior time; optimized by scheduler according to estimates; legend: set up / set attributes, invoke a method, extract results.] CHAIMS

  37. Repeated Invocations [Figure: timelines for modules M1 to M5, with M3 (>M1+M2) and M4 (<M1+M2), compared with the prior time; optimized by scheduler according to estimates; legend: set up / set attributes, invoke a method, extract results.] CHAIMS

  38. Repeated Extractions [Figure: timelines for modules M1 to M5, with M3 (>M1+M2) and M4 (<M1+M2), compared with the prior time; optimized by scheduler according to estimates; legend: set up / set attributes, invoke a method, extract results.] CHAIMS

  39. Scheduling: Simple Example
      Each statement group is labeled [order in unscheduled megaprogram / order in automatically prescheduled megaprogram]:
      [1 / 1] cost_ground_ih = cost_mmh.INVOKE ("Cost_for_Ground", List_of_City_Pairs = city_pairs, Goods = info_goods)
      [2 / 3] WHILE (cost_ground_ih.EXAMINE() != DONE) {}
              cost_list_ground = cost_ground_ih.EXTRACT()
      [3 / 2] cost_air_ih = cost_mmh.INVOKE ("Cost_for_Air", List_of_City_Pairs = city_pairs, Goods = info_goods)
      [4 / 4] WHILE (cost_air_ih.EXAMINE() != DONE) {}
              cost_list_air = cost_air_ih.EXTRACT()
      CHAIMS

  40. Scheduling: Possible Actions INVOKE: issue INVOKEs as soon as possible • may depend on other data • moving an INVOKE outside of an if-block depends on the cost function (ESTIMATE of this and following functions concerning execution time, dataflow, and fees (resources)) EXTRACT: move EXTRACTs to where the result is actually needed • there is no point in checking or waiting for results before they are needed • instead of waiting, poll all invocations and issue the next possible invocation as soon as its data can be extracted TERMINATE: terminate invocations that are no longer needed (saves resources) • not every method invocation has an extract (e.g. print-like functions) CHAIMS
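
The first rule above, issuing each INVOKE as early as its data dependencies allow, can be illustrated with a toy rescheduler. This sketch is an assumption-laden simplification: the `Stmt` record and `hoist_invokes` function are hypothetical, the WHILE/EXAMINE polling is folded into EXTRACT, and no cost function or if-blocks are modeled.

```python
from typing import FrozenSet, List, NamedTuple


class Stmt(NamedTuple):
    op: str                   # "INVOKE" or "EXTRACT"
    target: str               # variable the statement defines
    inputs: FrozenSet[str]    # variables the statement reads


def hoist_invokes(program: List[Stmt]) -> List[Stmt]:
    """Move every INVOKE upward until it meets a producer of its inputs."""
    scheduled: List[Stmt] = []
    for stmt in program:
        pos = len(scheduled)
        if stmt.op == "INVOKE":
            # Skip over earlier non-INVOKE statements that do not define an input.
            while (pos > 0
                   and scheduled[pos - 1].op != "INVOKE"
                   and scheduled[pos - 1].target not in stmt.inputs):
                pos -= 1
        scheduled.insert(pos, stmt)
    return scheduled


shared = frozenset({"city_pairs", "info_goods"})
unscheduled = [
    Stmt("INVOKE", "cost_ground_ih", shared),
    Stmt("EXTRACT", "cost_list_ground", frozenset({"cost_ground_ih"})),
    Stmt("INVOKE", "cost_air_ih", shared),
    Stmt("EXTRACT", "cost_list_air", frozenset({"cost_air_ih"})),
]
prescheduled = hoist_invokes(unscheduled)
order = [s.target for s in prescheduled]
```

With both invocations issued before the first wait, the ground and air cost computations can overlap, in the spirit of the two-INVOKE example on the "Scheduling: Simple Example" slide.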

  41. Compiling into a Network [Figure: two diagrams of a megaprogram and modules A to F, one for the current CHAIMS system and one with distribution dataflow optimization; legend: control flow, data flow.] CHAIMS

  42. CHAIMS Implementation • Specify minimal language • minimal functions: CALLs, While, If * • minimal typing {boolean, integer, string, handles, object} • objects encapsulated using ASN.1 standard • type conversion in wrappers, service modules * • Compiler for multiple protocols (one at a time, mixed *) • Wrapper generation for multiple protocols • Native modules for I/O, simple mathematics *, other • Implement API for CORBA, Java RMI, DCE usage • Wrap / construct several programs for simple demos • Schedule optimization * • Demonstrate use in heterogeneous setting * • Define full-scale demonstration (* = in process) CHAIMS

  43. Status • Definition of architecture for Megaprogramming • bottom up assessment of code to be generated • examples: room reservation, shipping • primitives • handles for parallel operation • heterogeneity -- common features of distribution protocols • Minimal language that can generate the code • no versus very few types -- ASN.1 for complex types • natural parallelism -- still a major research issue • Awareness of novel optimizations • information flow constraints -- scheduling • direct data flow between megamodules CHAIMS

  44. Focus for Future • Finishing basic infrastructure and demo examples. • CHAIMS interpreter to complement compiler. • Dynamic scheduling of invocations and extractions. • Flexible interaction with megamodules; extracting and handling overview results. • Direct dataflows between megamodules • (future project). CHAIMS

  45. Upcoming Changes to Architecture: PreCompiler + Interpreter [Figure, compiler path: user; megaprogram in CHAIMS language; CHAIMS compiler, simple scheduler; client code in C, C++, Java; IDL-file generator and compiler; stub code; C++, Java compiler and linker; executable client (CSRT); network. Interpreter path: user; a complete megaprogram in CHAIMS language, or some CHAIMS statements from the user, serves as input to the CHAIMS execution machine (interpreter and scheduler); CHAIMS-protocol; network.] CHAIMS

  46. Interpreter • Dynamic scheduler: • Parsed input is stored in an executable dependency graph. • Execution machine (interpreter / scheduler) works through the graph and makes appropriate calls: • estimate-calls are inserted to get necessary run-time information for scheduling (cost-function) • every invocation is issued as soon as possible (data-flow) and reasonable (according to cost-function) • all invocations for which the CSRT waits for results are polled regularly, and results extracted and new invocations issued as soon as possible CSRT would still be sequential! • Overview results, flexible interactions: • megaprogrammer can program statement by statement and get results immediately; results will influence what he/she does next • like ftp, web CHAIMS
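
The dependency-graph idea behind the dynamic scheduler can be sketched with Python's standard `graphlib` module (Python 3.9 or later). The graph below is a hypothetical rendering of the routing example; the cost function and the actual remote calls are left out, so each "wave" only shows which invocations could be issued together once their inputs are available.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def run_waves(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Return groups of invocations that become ready at the same time."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())   # everything whose inputs are done
        waves.append(ready)
        sorter.done(*ready)                  # pretend the results were extracted
    return waves


# Each key depends on the results named in its set (assumed data flow).
dependencies = {
    "cities": set(),
    "routes": {"cities"},
    "city_pairs": {"routes"},
    "cost_ground": {"city_pairs"},
    "cost_air": {"city_pairs"},
    "best": {"routes", "cost_ground", "cost_air"},
}
waves = run_waves(dependencies)
```

The two cost computations land in the same wave, which is where an interpreter could issue both invocations before polling for either result.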

  47. Conclusion: Research Questions • Is a Megaprogramming language focusing only on composition feasible? • Can it exploit on-going progress in client-server models and be protocol independent? • Can natural parallelism for distributed services be effectively scheduled? • Can high-level dataflow among distributed modules be optimized? • Can CHAIMS express clearly a high-level distributed SW architecture? • Can the approach affect SW process concepts and practice? CHAIMS

  48. Other Research Projects Related by common issue: Large-Scale Interoperation • Mediation -- modules in 3-tier Information Systems • {access, abstraction, integration, summarization, delivery} • maintenance management is a major benefit • Security and Privacy Mediators • filter results to complement access control • for healthcare privacy / manufacturing collaboration • Scalable Knowledge Composition • develop algebra ( ∩ ∪ − ) over ontologies • articulate distinct domains to create user contexts • Image databases • rapid search by match using wavelets • identifying pornography • extracting text from images and icons for privacy/search CHAIMS

  49. Paying for SW Services You cannot run an effective (SW) business and not be reimbursed for it. How? Four approaches: • Sell software (sell oilfield to customer) • Lease copy / usage rights (lease well) • Time / user limited access (fill tank) • Charge by use instance (provide bus) General problems, effects differ: • IP protection? • keeping SW updated • billing for est. value • performance effect
                 Buy     Lease   Limit   Use
      protect    poor    some    fair    good
      update     poor    ok      good    good
      bill       simple  simple  awkw.   hard
      perform    no      no      little  some
      CHAIMS

  50. Conclusion: Questions not addressed • Will one Client/Server protocol subsume all others? • distributed optimization remains an issue • Synchronization / Concurrency Control • autonomy of sources negates current concepts • if modules share databases, then database locks may span setup/terminate all for a megaprogram handle. • Will software vendors consider moving to a service paradigm? • need CHAIMS demonstration for evaluation CHAIMS
