Computational Research Division. High End Computing with K42. FastOS PI Meeting, June 9, 2005.
The HECRTF and FastOS reports enumerate unmet needs in the area of Operating Systems for HEC, including: • Availability of Research Frameworks • Support for Architectural Innovation • Performance Visibility • Adaptability to Application Requirements. This project uses the K42 Operating System to address these needs.
What is K42? • K42 is a GPLed research O/S at IBM • Framework for PERCS (DARPA HPCS) work • O/S research and architectural innovation • API and ABI compatible w/ Linux on PPC64 • Runs most Linux kernel modules (fs, etc) • Many OS services implemented in user-space • Object oriented • Every virtual or physical instance is an object • Every class may have multiple implementations • Implementations can be hot-swapped, per-instance • Very modular for easy addition/modification • Has extensive performance/tracing capabilities • Design/implementation is very SMP-scalable
Who? • IBM • The group most experienced with K42 and responsible for its continued development • The group performing O/S research for PERCS • Has contributed K42-developed ideas to Linux • Linux Trace Toolkit (LTT) • Object-based reverse mapping (memory mgmt) • Read-Copy-Update (RCU)
Who? • LBNL • DOE applications and application scientists • Especially via BIPS and SciDAC PERC project • Scalable Systems Software SciDAC project • BLCR (system-initiated checkpointing for Linux) • Access to OpenMPI team (former LAM/MPI) • UPC and Titanium teams (GASNet runtime) • Linux kernel experience • Including M-VIA and BLCR
Who? • University of New Mexico • Experience with implementing and porting light-weight kernels • SUNMOS, Puma, Cougar and Catamount • Experience with the development of the Portals API • Experience in configurable/adaptable systems software • X-kernel, Scout and Cactus
Who? • University of Toronto • A prominent member of the existing K42 research community • Origin of Tornado, the direct predecessor to K42 • Key preliminary work in arbiter-object technology for hot swap and dynamic adaptation
What are we doing? • Work divides into three major areas: • Framework for OS/Runtime research for HEC applications • Dynamic Adaptation • Architecture of a parallel operating system
Framework for OS/Runtime research for HEC • Make K42 usable as a platform • To perform basic O/S and Runtime research of importance to HEC • To develop/run/debug/tune HEC applications
Framework (1) • Issue: K42 is hard to build/install • Approach: Create a distribution (source and binary) for a dual-boot K42/Linux system • Issue: K42 runs only on PPC64 • Approach: Port to AMD64 (maybe EM64T?) • Issue: K42 lacks a full HEC environment • Approach: Build/port the required environment (SSS OSCAR) • Numeric libraries (OSCAR) • Batch system (Scalable Systems Software suite) • Programming models (MPI, UPC, Titanium, CAF)
Dynamic Adaptation • Utilize K42’s design • To expose performance information below the app-O/S interface • To allow static and dynamic specialization of O/S and runtime services
Dynamic Adaptation (1) • Issue: O/Ses and runtimes are performance-opaque • Approach: Extend what K42 has • K42 already has extensive performance/tracing capabilities • Expose K42’s object structure • Use arbiter objects for per-object collection of h/w counter data • Develop graphical tools to connect performance data to OS/runtime objects
Dynamic Adaptation (2) • Issue: What to adapt? • Approach: Study HEC Apps • Use performance tools to identify OS/Runtime objects which are bottlenecks and in what situations • Investigate what alternative implementations offer better performance in those situations • Add these implementations to K42
Dynamic Adaptation (3) • Issue: When to adapt? • Approach 1: user-directed customization • Approach 2: compiler-directed customization • Approach 3: Runtime adaptation using arbiter objects to monitor and adapt to changing conditions
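Approach 3 can be illustrated with a minimal sketch in Python (K42 itself is written in C++). The names `Arbiter`, `ListLookup`, `HashLookup` and the `threshold` parameter are hypothetical and chosen for illustration only; they are not K42 interfaces. The arbiter interposes on every call to one object instance, counts inserts, and switches that instance to an alternative implementation once the observed load passes the threshold.

```python
class ListLookup:
    """Simple implementation: linear scan, cheap for a handful of keys."""

    def __init__(self, items=None):
        self.items = list(items or [])

    def insert(self, key, value):
        self.items.append((key, value))

    def find(self, key):
        for k, v in self.items:
            if k == key:
                return v
        return None

    def export_state(self):
        return list(self.items)


class HashLookup:
    """Alternative implementation: hash table, better for many keys."""

    def __init__(self, items=None):
        self.table = dict(items or [])

    def insert(self, key, value):
        self.table[key] = value

    def find(self, key):
        return self.table.get(key)

    def export_state(self):
        return list(self.table.items())


class Arbiter:
    """Interposes on one instance, monitors its load, and adapts it."""

    def __init__(self, threshold):
        self.impl = ListLookup()
        self.threshold = threshold   # hypothetical tuning knob
        self.inserts = 0             # the monitored condition

    def insert(self, key, value):
        self.inserts += 1
        self.impl.insert(key, value)
        # Adapt: once the instance is "large", move its state into the
        # implementation that performs better in that situation.
        if self.inserts > self.threshold and isinstance(self.impl, ListLookup):
            self.impl = HashLookup(self.impl.export_state())

    def find(self, key):
        return self.impl.find(key)
```

Callers only ever see the `Arbiter` interface, so the switch is invisible to them. The sketch assumes unique keys; a real policy would also consider switching back when conditions change.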
Dynamic Adaptation (4) • Issue: How to adapt? • Approach: Use hot-swapping of object implementations in K42 • Allows one to replace implementations, per-instance, without the need to block the application
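The idea of per-instance hot-swapping can be sketched as follows, assuming callers reach every object through one level of indirection. `ObjectTable`, `SharedCounter` and `PerCpuCounter` are hypothetical names for this illustration, and the single lock is a simplification: the sketch briefly serializes callers during a swap, whereas the slide states that K42 avoids blocking the application.

```python
import threading


class SharedCounter:
    """One shared value: simple, but contended under heavy parallel use."""

    def __init__(self, value=0):
        self.value = value

    def add(self, cpu, n):
        self.value += n

    def total(self):
        return self.value


class PerCpuCounter:
    """One slot per CPU: cheap updates, more expensive reads."""

    def __init__(self, value=0, ncpus=4):
        self.slots = [0] * ncpus
        self.slots[0] = value   # carry over the state of the old instance

    def add(self, cpu, n):
        self.slots[cpu] += n

    def total(self):
        return sum(self.slots)


class ObjectTable:
    """Indirection layer: callers hold a reference name, never the object."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}

    def register(self, ref, impl):
        with self._lock:
            self._entries[ref] = impl

    def call(self, ref, method, *args):
        with self._lock:
            return getattr(self._entries[ref], method)(*args)

    def hot_swap(self, ref, new_class):
        # Replace the implementation behind ONE reference, transferring state.
        with self._lock:
            old = self._entries[ref]
            self._entries[ref] = new_class(old.total())
```

Because the swap is keyed by reference, one heavily used counter can move to the per-CPU implementation while every other counter keeps the simple one.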
Dynamic Adaptation (5) • Issue: Need a small set of applications, representative of HEC today and in the future • Approach: LBNL has identified an initial set of applications that we consider a good starting point • Covered in a separate presentation if time allows
Building a Parallel O/S • K42’s design principles yield excellent scaling on SMPs, with minimal UP impact • Apply these principles to parallel runtime services • Integrate these services with the O/S
Build a Parallel O/S (1) • Issue: K42 lacks a native RPC mechanism • Approach: • Adapt the best design features of • Protected Procedure Calls (inter-address space) • Active Messages (inter-node) • Simple but powerful AM-style mechanism for parallel runtime services • Reduce to protected procedure call in the intra-node case
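A minimal sketch of an AM-style mechanism that reduces to a direct call in the intra-node case is shown below. The `Node` class, its in-memory "network" dictionary, and the use of `pickle` for marshalling are assumptions made for illustration; they do not describe K42's protected procedure calls or any real Active Messages implementation.

```python
import pickle
from collections import deque


class Node:
    """One node with a table of named handlers and an inbox for messages."""

    def __init__(self, node_id, network):
        self.node_id = node_id
        self.network = network      # dict: node_id -> Node (stand-in for a NIC)
        self.handlers = {}
        self.inbox = deque()
        self.local_calls = 0
        self.remote_msgs = 0
        network[node_id] = self

    def register(self, name, fn):
        self.handlers[name] = fn

    def send(self, dest, name, *args):
        if dest == self.node_id:
            # Intra-node case: no marshalling, just invoke the handler.
            self.local_calls += 1
            return self.handlers[name](*args)
        # Inter-node case: marshal the handler name and arguments, then
        # enqueue at the destination. The send is one-way and asynchronous.
        self.remote_msgs += 1
        self.network[dest].inbox.append(pickle.dumps((name, args)))
        return None

    def poll(self):
        # Run the handler named in each pending message.
        while self.inbox:
            name, args = pickle.loads(self.inbox.popleft())
            self.handlers[name](*args)
```

The caller uses the same `send` interface in both cases; only the destination decides whether the request becomes a local call or a message.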
Build a Parallel O/S (2) • Issue: AM has no “name service” • Approach: Design and prototype a simple mechanism for locating required services • Services may be load-balanced • Services may migrate/fail-over
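One simple way to locate services is a registry that maps a service name to the endpoints currently providing it. The sketch below uses round-robin selection for load balancing and removal of an endpoint for migration or fail-over; the `NameService` class and the endpoint strings are hypothetical, not a proposed K42 design.

```python
class NameService:
    """Maps a service name to the endpoints that currently provide it."""

    def __init__(self):
        self._providers = {}   # service name -> list of endpoints
        self._next = {}        # service name -> round-robin position

    def register(self, service, endpoint):
        self._providers.setdefault(service, []).append(endpoint)
        self._next.setdefault(service, 0)

    def unregister(self, service, endpoint):
        # Called when a provider fails or migrates away.
        self._providers[service].remove(endpoint)

    def lookup(self, service):
        endpoints = self._providers.get(service, [])
        if not endpoints:
            raise LookupError("no provider for " + service)
        # Round-robin over the live providers to spread the load.
        i = self._next[service] % len(endpoints)
        self._next[service] = i + 1
        return endpoints[i]
```

A migrating service would register its new endpoint before unregistering the old one, so lookups never find the list empty.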
Build a Parallel O/S (3) • Issue: Asynchronous events are common in a parallel environment • Approach: Reusable event service • K42 is already an event-driven system • Design and prototype a distributed event service • Simple Publish/Subscribe API?
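What a simple publish/subscribe API might look like is sketched below. This is a single-process stand-in with hypothetical names (`EventService`, `subscribe`, `publish`); a distributed version would forward published events to subscribers on other nodes rather than calling them directly.

```python
from collections import defaultdict


class EventService:
    """Delivers each published event to every subscriber of its topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)   # topic -> callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def unsubscribe(self, topic, callback):
        self._subscribers[topic].remove(callback)

    def publish(self, topic, payload):
        # Iterate over a copy so a callback may unsubscribe itself.
        callbacks = list(self._subscribers[topic])
        for callback in callbacks:
            callback(topic, payload)
        return len(callbacks)   # number of deliveries
```

Publishers and subscribers know only the topic name, not each other, which is what makes the service reusable across runtime components.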
Build a Parallel O/S (4) • Issue: Parallel job management • Spawn, signal, ps • Approach: Extended Process Spaces • Not the same as SSI, more like PAGs • Process id tuple: (ProcSpace, ID) • Each parallel job is a ProcSpace • A process can see those ProcSpaces to which it is “attached” (creator, member or observer)
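The process-id tuple and the attachment rule can be sketched as follows. `ProcId`, `Role` and `ProcSpaceRegistry` are hypothetical names for this illustration, and the `ps` method shows only the visibility rule: a process sees the members of exactly those ProcSpaces it is attached to.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    CREATOR = 1
    MEMBER = 2
    OBSERVER = 3


@dataclass(frozen=True)
class ProcId:
    """Process id tuple: (ProcSpace, ID)."""
    space: str
    ident: int


class ProcSpaceRegistry:
    def __init__(self):
        self.members = {}    # space name -> list of ProcId in that space
        self.attached = {}   # space name -> {attached ProcId: Role}

    def create(self, creator, space):
        # Each parallel job gets its own ProcSpace.
        self.members[space] = []
        self.attached[space] = {creator: Role.CREATOR}

    def spawn(self, space):
        pid = ProcId(space, len(self.members[space]))
        self.members[space].append(pid)
        self.attached[space][pid] = Role.MEMBER
        return pid

    def observe(self, viewer, space):
        self.attached[space][viewer] = Role.OBSERVER

    def ps(self, viewer):
        # List processes only in spaces the viewer is attached to.
        return [pid
                for space, procs in self.members.items()
                if viewer in self.attached[space]
                for pid in procs]
```

A `signal` operation would follow the same pattern: check attachment to the target ProcSpace, then deliver to each member.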
Build a Parallel O/S (5) • Issue: Just TCP/IP sockets in K42 • Approach: • Characterize the application impact of s/w communication overheads • Investigate App/kernel/NIC APIs • Offload of communication processing • Application-specific customization • Implement O/S support for other communication abstractions • Active Messages • RDMA
Applications Performance Evaluation • Can K42 be a production HEC environment?
Applications Performance Evaluation • Head-to-head Linux-vs-K42 comparisons • Many comparisons already possible • Port of HEC environment will allow more complete comparison • Port to Opteron will allow head-to-head comparison to Catamount • Interesting question: Application programming on K42 is as easy as on Linux, but can K42 perform as well as Catamount on HEC workloads?
And then what? • What can/should one do with the resulting framework?
Possible Follow-on Work (1) • The research framework will enable or ease work in many potentially interesting research areas • Some were cut from our proposal • Some are part of other FastOS projects • Others are new • Presented on the following slides in no particular order
Possible Follow-on Work (2) • Filesystem work via native RPC mechanisms and RDMA networking • Cluster software management • K42 hot-swapping can allow full OS/Runtime upgrade of a live system with no downtime (like telco equipment) • Co-scheduling • Reduce O/S-induced load-imbalance (“noise”) that perturbs collective operations (especially barriers)
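Why O/S noise hurts barriers, and why co-scheduling helps, can be illustrated with a toy model. The model is an assumption made for illustration, not a measurement: every process does `work` units of computation per iteration, each process is interrupted for `noise` units once every `nprocs` iterations, and a barrier ends each iteration. The only difference between the two cases is whether the interruptions are aligned across processes.

```python
def run_time(nprocs, iters, work, noise, coscheduled):
    """Total time for `iters` compute-then-barrier steps under the toy model."""
    total = 0.0
    for i in range(iters):
        step_times = []
        for p in range(nprocs):
            if coscheduled:
                hit = (i % nprocs == 0)   # all processes interrupted together
            else:
                hit = (i % nprocs == p)   # interruptions staggered by process
            step_times.append(work + (noise if hit else 0.0))
        # A barrier completes only when the slowest process arrives.
        total += max(step_times)
    return total


uncoordinated = run_time(nprocs=4, iters=8, work=1.0, noise=0.5, coscheduled=False)
aligned = run_time(nprocs=4, iters=8, work=1.0, noise=0.5, coscheduled=True)
print(uncoordinated, aligned)   # prints 12.0 9.0
```

Each process loses the same amount of time to noise in both runs. With staggered interruptions some process is late at every barrier, so every iteration pays the full delay; with aligned interruptions only two of the eight iterations do.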
Possible Follow-on Work (3) • Scheduling and resource management for multi-threaded and multi-core processors • Example: page coloring for cache partitioning • Virtualization, checkpoint/restart and migration • Interposing objects makes virtualization trivial • Use of RCU makes most (all?) quiescing unnecessary when checkpointing • High performance network drivers for K42 • InfiniBand, Quadrics QSNetII, Myrinet
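The page-coloring example can be made concrete with a short sketch, assuming a physically indexed, set-associative cache. Under that assumption the number of page colors is the cache size divided by (associativity times page size), and a page's color is its physical frame number modulo the number of colors. The function names and the cache parameters in the usage lines are hypothetical.

```python
def num_colors(cache_bytes, ways, page_bytes):
    """Number of distinct page colors in a physically indexed cache."""
    return cache_bytes // (ways * page_bytes)


def page_color(phys_addr, page_bytes, colors):
    """Color of the physical page that contains phys_addr."""
    return (phys_addr // page_bytes) % colors


def partition_colors(colors, ncores):
    """Give each core a disjoint, contiguous range of colors."""
    per_core = colors // ncores
    return {core: range(core * per_core, (core + 1) * per_core)
            for core in range(ncores)}


def pick_frame(free_frames, core, partition, colors):
    """Return the first free frame whose color belongs to `core`, or None."""
    for frame in free_frames:
        if frame % colors in partition[core]:
            return frame
    return None


# Hypothetical cache: 2 MiB, 8-way set associative, 4 KiB pages.
colors = num_colors(2 * 1024 * 1024, 8, 4096)   # 64 colors
partition = partition_colors(colors, ncores=4)  # 16 colors per core
```

Pages of different colors map to disjoint cache sets, so an allocator that hands each core only frames from its own color range keeps the cores from evicting each other's data.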
Summary (1) • Produce a K42 platform that: • Is easy to install and use for O/S and runtime research in HEC • Includes nearly all of the HEC environment a user is expecting • Helps users to track/understand performance within the O/S and runtime • Accepts static customization hints from users and/or compilers
Summary (2) • Produce a K42 platform that: • Will dynamically identify performance bottlenecks in the O/S and runtime and dynamically switch to more appropriate object implementations • Includes custom HEC-appropriate application/kernel/network APIs • Includes an infrastructure for building of parallel operating environments • Includes a scalable mechanism for parallel job control
Summary (3) • Work with DOE SC applications • To determine HEC-appropriate implementations/policies/APIs • To improve applications performance • Evaluate the performance of K42 as a production HEC platform • Head-to-head vs. Linux • Head-to-head vs. Catamount