Microsoft and Research Infrastructure. DaCapo Winter Meeting 2007. Mark Lewin, Program Manager, External Research & Programs, Microsoft Research. mark.lewin@microsoft.com
Agenda • Phoenix • SSCLI – The Shared Source Common Language Infrastructure • Windows Research Kernel • Singularity
What Is Phoenix? • Phoenix is a framework providing a unique and rich infrastructure for writing compilers, development & analysis tools, and plug-ins • Foundation of next generation of Microsoft code generators, optimizers and analysis tools • Platform for third-party plug-ins • Platform for research and teaching: Phoenix Academic Program
Phoenix Goals • Provide industry leading compilation and tools infrastructure: “VC++ and .NET compilers and tools” • Build research/development community around infrastructure: “the Phoenix Platform” • Make the infrastructure scalable, configurable, and extensible: “JIT to WPO, compilation and analysis” • Make infrastructure quick to retarget and rehost
Key Features • Written in C++, usable by any .NET language • Dual-Mode: entire platform compiles to run native or on top of .NET • Phase & Plug-in model for third party extensions to: VC++ Compiler, Binary reader/writer, Analysis Tools, … • Support for Multi-threaded clients • Support for Code and Data extensibility • A single, strongly typed, explicit dataflow/control flow IR used throughout framework • IR & Type system capable of processing native and/or managed code • Strong inter-phase consistency checking • Many diverse compilers and tools reuse the common core
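The phase and plug-in model listed above can be pictured as an ordered list of phases into which a third-party plug-in splices its own phase. The following is a minimal sketch of that idea only; `Phase`, `PhaseList`, `insert_after`, and the phase names are hypothetical and are not the Phoenix API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Phase:
    """One step of the pipeline (hypothetical type, not a Phoenix class)."""
    name: str
    run: Callable[[list], list]  # takes a unit of IR, returns the transformed unit


@dataclass
class PhaseList:
    """Ordered list of phases that a plug-in is allowed to extend."""
    phases: List[Phase] = field(default_factory=list)

    def insert_after(self, anchor: str, phase: Phase) -> None:
        # Splice the new phase directly after the named existing phase.
        for i, existing in enumerate(self.phases):
            if existing.name == anchor:
                self.phases.insert(i + 1, phase)
                return
        raise KeyError(f"no phase named {anchor!r}")

    def run(self, unit: list) -> list:
        # Run every phase in order, threading the unit through.
        for phase in self.phases:
            unit = phase.run(unit)
        return unit


def count_instructions_plugin(log: list) -> Phase:
    """A toy analysis plug-in: records the instruction count, changes nothing."""
    def run(unit: list) -> list:
        log.append(len(unit))
        return unit
    return Phase("CountInstructions", run)


pipeline = PhaseList([
    Phase("Lower", lambda unit: [ins.lower() for ins in unit]),
    Phase("Emit", lambda unit: unit),
])
log = []
pipeline.insert_after("Lower", count_instructions_plugin(log))
result = pipeline.run(["ADD", "MUL", "RET"])
```

The host pipeline does not need to know about the plug-in in advance; the plug-in only needs the name of the phase it wants to follow.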
Phoenix Building Blocks: compilation and analysis framework for 10+ years • .NET CodeGen: JITs, PreJITs, OO and .NET optimizations • MSR Adv. Lang: language research, direct transfer to Phoenix, research insulated from code generation • Native CodeGen: advanced C++/OO optimizations, FP optimizations, OpenMP • MSR Tools: built on Phoenix APIs, both HL and LL APIs, managed APIs, program analysis • Academic DevKit: full sources, managed APIs, IP as DLLs, docs • Retargetable: "Machine Models", ~3 months: -Od, ~3 months: -O2 • Chip Vendor DevKit: ~6 month ports, sample port + docs, key ports (Arm) done at msft
Phoenix architecture (diagram) • Compilers: HL Opts, LL Opts, Code Gen • Tools: Browser, Visualizer, Lint, Formatter, Obfuscator, Refactor, Xlator, Profiler, SecurityChecker • Phx APIs • Phoenix Core: AST, IR, Syms, Types, CFG, SSA • Inputs: assembly, Native Image, C++ IR, C++ AST, Phx AST, Profile, C++, PREfast, Lex/Yacc, C#, VB, Delphi, Cobol, Eiffel, Tiger
Compiler pipeline (diagram) • Front-end (language-specific): Scan, Parse, CSA; AST and decorated AST (the for-loop source fragment on the slide is garbled) • Middle-end (optimizations on linear IR): CSE, ConstProp, DeadCode, OSR, Alg Idents, Inline, LoopUnroll, HoistInvars • Back-end (codegen): Instr Selection, RegAlloc, Schedule; Assem and Image
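To make two of the middle-end passes named on this slide concrete (ConstProp and DeadCode), here is a minimal sketch over a toy linear IR. The IR shape (tuples of `dest, op, arg1, arg2`), the opcode names, and the single-assignment, straight-line assumption are all illustrative choices and do not reflect the Phoenix IR.

```python
# Toy linear IR: (dest, op, arg1, arg2). Variables are strings, constants are ints.
# Assumes straight-line code in which every variable is assigned exactly once.

def const_prop(ir):
    """Propagate known constants and fold 'add' when both operands are constant."""
    known = {}
    out = []
    for dest, op, a, b in ir:
        a = known.get(a, a)  # replace a variable by its constant value if known
        b = known.get(b, b)
        if op == "const":
            known[dest] = a
        elif op == "add" and isinstance(a, int) and isinstance(b, int):
            op, a, b = "const", a + b, None
            known[dest] = a
        out.append((dest, op, a, b))
    return out


def dead_code(ir):
    """Drop instructions whose result is never used, scanning backwards."""
    live = set()
    out = []
    for dest, op, a, b in reversed(ir):
        if op == "ret" or dest in live:
            live.update(x for x in (a, b) if isinstance(x, str))
            out.append((dest, op, a, b))
    return out[::-1]


program = [
    ("x", "const", 2, None),
    ("y", "const", 3, None),
    ("z", "add", "x", "y"),
    ("w", "add", "z", "a"),       # "a" is an input, so this cannot be folded
    ("unused", "add", "x", "x"),
    (None, "ret", "w", None),
]
optimized = dead_code(const_prop(program))
```

After both passes only the addition that depends on the unknown input `a` and the return remain, because every constant definition has been folded into its use.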
Status • Phoenix VC++ compiler backend building/running Windows XP (40+ million lines of code) • Phoenix .NET JIT compiler passing 95+% of production JIT compiler tests • Phoenix static analysis being incorporated into current Enterprise Development tools; some internal tools have already been deployed • 600+ RDKs being used by researchers on a wide range of projects, with active community feedback • Working on improving optimizations to surpass current VC++ compiler performance • Working on productization
Phoenix RDK • Conventional license for non-commercial use. • Free to use for research and education. • Free download. • Depends on Visual Studio 2005 technology. • Support forum for faculty and teaching assistants. Next version, Winter 2007 RDK, coming soon.
Research Support • 17 Research projects funded through Rotor research grants in 2006, 12 funded in 2005, 5 funded in 2004 • Many Phoenix RFP project PI’s here at the Summit – ask them about their experiences.
Academic Users of Phoenix • Constructing Compact Debugging Traces with Binary Code Analysis and Instrumentation • Phoenix-Based Compiler Course Development • Compiler Backend Experimentation and Extensibility Using Phoenix • Adaptive Inline Substitution in Phoenix • Domain-Specific Language for Efficient Design-Rule Checking • Setpoint: An Aspect Oriented Framework Based on Semantic Pointcuts • Phase Aware Profiling with Phoenix • Using Call Graph Analyses to Guide Selective Specialization in Phoenix • Program Visualization with Fulcra and Phoenix • Navel: Automating Software Support Using Traces of Software Behavior • Techniques and Tools for Software Assurance • Type-Checking the Intermediate Languages in the Phoenix JIT Compiler
Phoenix -- Downloads and Info: http://research.microsoft.com/phoenix
What is SSCLI? SSCLI is a shared source implementation of CORE TECHNOLOGIES that underlie Microsoft’s .NET architecture. SSCLI is current with advances in the commercial Common Language Runtime. SSCLI is designed and documented for ACADEMIC RESEARCH and TEACHING.
SSCLI Motivations & Milestones Motivations: • Support ECMA/ISO standards efforts • Validate OS platform neutrality of .NET • Create an open laboratory for academic research • Track evolution of commercial CLR • Share! Milestones: • SSCLI 1.0 released at OOPSLA 2002 • Over 250,000 downloads through June, 2006 • SSCLI 2.0 released March 2006
CLR execution model (diagram spanning DEVELOPMENT, DEPLOYMENT, and EXECUTION) • Build: Source code, Compiler, Assembly (PE header + MSIL + Metadata + EH Table), PEVerify, NGEN • Loading and policy: Host, Evidence, Policy Manager, Policy, permission request, granted permissions, AssemblyLoader (GAC, app. directory, download cache), assembly info, module + class list • Execution: ClassLoader (vtable + class info), JIT + verification (native code + GC table), Managed Code Execution; scope labels (assembly), (class), (method) • CLR Services: GC, Exception, Class init, Security
Sample source code shown on the slide:
    public static void Main(String[] args)
    {
        String usr;
        FileStream f;
        StreamWriter w;
        try {
            usr = Environment.GetEnvironmentVariable("USERNAME");
            f = new FileStream("C:\\test.txt", FileMode.Create);
            w = new StreamWriter(f);
            w.WriteLine(usr);
            w.Close();
        } catch (Exception e) {
            Console.WriteLine("Exception:" + e.ToString());
        }
    }
Sample security policy shown on the slide (excerpt, cut off on the slide):
    <?xml version="1.0" encoding="utf-8" ?>
    <configuration>
      <mscorlib>
        <security>
          <policy>
            <PolicyLevel version="1">
              <CodeGroup class="UnionCodeGroup" version="1" PermissionSetName="Nothing" Name="All_Code" Description="Code group grants no permissions and forms the root of the code group tree.">
                <IMembershipCondition class="AllMembershipCondition" version="1"/>
                <CodeGroup class="UnionCodeGroup" version="1" PermissionSetName="FullTrust"
The Shared Source CLI (SSCLI) (diagram of the .NET stack) • VS.NET and compilers: C#, JScript, VB, VC/MC++, Debugger, Designers • System.Web (ASP.NET): UI, SessionState, HtmlControls, Caching, Security, WebControls, Configuration; Simple Web Services (Protocols, Discovery, Description) • System.WinForms: Design, ComponentModel • System.Drawing: Drawing2D, Printing, Imaging, Text • System.Data (ADO.NET): ADO, SQL, Design, Adapters • System.Xml: XSLT, Serialization, XPath • SDK Tools: CorDBG, ILAsm, ILDbDump, SN, ILDAsm, MetaInfo, PEVerify • ECMA CLI, System: Collections, IO, Security, Runtime.InteropServices, Configuration, Net, ServiceProcess, Remoting, Diagnostics, Reflection, Text, Serialization, Globalization, Resources, Threading • Common Language Runtime: JIT, GC, App Domain Loader, MSIL, Common Type System, Class Loader • Platform Adaptation Layer: Networking, Boot Loader, Threads, Sync, Timers, Filesystem
SSCLI: For Teaching and Research • Complete example to enable research in and support teaching of modern programming languages, compiler design, and runtime infrastructure. • SSCLI supports the ECMA standardization process with a real implementation • Commercial grade code (but documented for academia) • SSCLI license allows for “safe” examination of code • For compiler writers who want to target CLI: • JScript compiler shows dynamic techniques (in C#) • C# compiler shows nearly all runtime features • IL Assembler demonstrates low-level API implementation and use
How SSCLI Is Organized • Four major areas in source code • Runtime “execution engine” • Frameworks • Compilers and tools • Portability layer, tests, and build infrastructure • Other important points of interest • License • Documentation • Samples
Research Support • 40 Research projects funded through Rotor research grants in 2002, 40 funded in 2004 • SSCLI RFP Capstone Workshop II held Fall 2005 • Researchers from 18 countries • 27 research and teaching projects presented • http://research.microsoft.com/workshops/SSCLI2005/ • SSCLI RFP Capstone Workshop I held Fall 2003 • Researchers from over 20 countries • 16 refereed papers presented • IEE Journal: special Rotor research issue
Books • Shared Source CLI Essentials – Dave Stutz, Geoff Shilling, Ted Neward; O’Reilly (2003) • Compiling for the .NET Common Language Runtime (CLR) – John Gough; Prentice Hall (2002) • Inside Microsoft .NET IL Assembler – Serge Lidin; Microsoft Press (2002, 2006) • The Annotated CLI Standard – Jim Miller, Susann Ragsdale; Addison Wesley (2003) • Distributed Virtual Machines: Inside the Rotor CLI – Gary Nutt; Addison Wesley (2005)
SSCLI -- Downloads and Info: http://research.microsoft.com/sscli
Windows Academic Program • Windows Research Kernel (WRK) – the core kernel sources and binaries, integrated with an environment for building and testing experimental versions of the Windows kernel for use in teaching and research. • Windows Operating System Internals Curriculum Resource Kit (CRK) – presentation slides, experiments, labs, quizzes, and assignments for introducing case studies from the Windows kernel into operating system courses. • ProjectOZ – an operating systems project environment that uses the native kernel interfaces of Windows to provide simple, clean, user-mode abstractions of the CPU, MMU, trap mechanism, and physical memory that can be used to perform experiments in operating systems principles. • (Diagram: CRK, WRK, and ProjectOZ placed along an axis labeled RESEARCH and INSTRUCTION.)
Windows Research Kernel (WRK) • Source from the latest shipping Windows (NTOS) kernel • Version – Windows Server 2003 (x86/x64) and Windows XP x64 • Included sources – almost everything in NTOS: processes, threads, LPC, VM, scheduler, object manager, I/O manager, synchronization, worker threads, kernel memory manager, … • Excluded sources – plug-and-play, power-management, and specialized code such as the driver verifier, splash screen, branding, timebomb, etc. • Build environment – makefile-based, with an object library for the excluded sources • Kernels boot on native hardware or using VirtualPC
WRK Goals • Simplified licensing to allow classroom and lab use • Make it easier for faculty and students to compare and contrast Windows to other operating systems • Enable students to study source, and modify and build projects • Provide better support for research and publications based on Windows internals • Encourage more OS textbook and university-oriented internals books on Windows kernel
Windows Architecture (diagram) • User-mode: Applications, Subsystem servers, DLLs, System Services, Login/GINA, Kernel32, Critical services, User32/GDI, ntdll/run-time library • Kernel-mode: Trap interface / LPC, Security refmon, I/O Manager, Virtual memory, Procs and threads, Win32 GUI, Net devices, File filters, Scheduler, Filesys run-time, Net protocols, File systems, Net Interfaces, Volume mgrs, Cache mgr, Synchronization, Device stacks, Object Manager/Configuration Management (registry), Kernel run-time/Hardware Adaptation Layer
Important NT Kernel Features • Highly multi-threaded • Completely asynchronous I/O model • Thread-based scheduling • Object manager provides unified management of: kernel data structures; kernel references; user references (handles); namespace; synchronization objects; resource charging; cross-process sharing • Centralized ACL-based security reference monitor • Configuration store decoupled from file system
Important NT Kernel Features • Extensible filter-based I/O model with driver layering, standard device models, notifications, tracing, journaling, namespace, services/subsystems • Virtual address space managed separately from memory objects • Advanced VM features for databases (app management of virtual addresses, physical memory, I/O, dirty bits, and large pages) • Plug-and-play, power-management • System library mapped in every process provides trusted entry points
Windows Research Kernel http://www.microsoft.com/resources/sharedsource/Licensing/WindowsAcademic.mspx
Singularity • A multidisciplinary research OS, Languages, and Tools project from MSR • Key approaches: • Pervasive use of safe and analyzable programming languages • Improve system resilience despite software errors • Design for verifiability • Microkernel architecture • Written in Sing# (extended Spec#) • Not a Windows/CLR replacement
Large, Diverse Research Team • Led by Galen Hunt and Jim Larus • MSR Cambridge: Paul Barham, Richard Black, Tim Harris, Rebecca Isaacs, Dushyanth Narayanan • MSR Redmond, Advanced Compiler Technology Group: Juan Chen, Qunyan Mangus, Mark Plesko, Bjarne Steensgaard, David Tarditi • MSR Redmond, Foundations of Software Engineering Group: Wolfgang Grieskamp • MSR Redmond, Operating Systems Group: Mark Aiken, Chris Hawblitzel, Orion Hodson, Galen Hunt, Steven Levi • MSR Redmond, Security and Distributed Systems: Dan Simon, Brian Zill • MSR Redmond, Software Design and Implementation Group: John DeTreville, Ben Zorn • MSR Redmond, Software Improvement Group: Manuel Fahndrich, James Larus, Sriram Rajamani, Jakob Rehof • MSR Silicon Valley: Martin Abadi, Andrew Birrell, Ulfar Erlingsson, Roy Levin, Nick Murphy, Ted Wobber
Key Approaches • Pervasive use of safe (& analyzable) programming languages: type safety and memory safety; including device drivers, OS components, applications • Improve system resilience despite software errors: failure boundaries between components; improve extension model; explicit error notification • Increased verification: specification at multiple levels of abstraction; closed environments with explicit cross-domain interfaces; design for verifiability
Singularity OS • Closed kernel: 95% written in C#; 17% of files contain unsafe C#; 5% of files contain x86 or C++; OS services & drivers in processes; kernel closed at boot time • Software isolated processes (SIPs): all user code is verified safe; some unsafe code in trusted runtime; processes closed at start time • Safe and efficient communication via strong interfaces: channels between processes; channel behavior is specified & checked; checked behavior enables efficient communication • Type safety is crux of verification and protection • (Diagram labels: processes for content extension, web server, TCP/IP stack, and network driver, each with its class libraries (ext., app., serv., driver) and runtime; channels; kernel ABI; kernel with i/o mgr, chan mgr, scheduler, proc mgr, page mgr, kernel class library, runtime, HAL)
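The statement that channel behavior is specified and checked can be illustrated with a contract written as a small state machine. This is a sketch under assumptions: the contract below (`Start`, `Pending`, `Closed`, and the `Request`/`Reply`/`Error` messages) is invented for illustration, and it checks a recorded message trace at run time, which is a simplification and not how Singularity itself performs the check.

```python
# Hypothetical channel contract: state -> {(sender, message): next_state}.
CONTRACT = {
    "Start": {("client", "Request"): "Pending"},
    "Pending": {("server", "Reply"): "Start", ("server", "Error"): "Closed"},
    "Closed": {},
}


def check_trace(contract, trace, start="Start"):
    """Return (ok, state): whether the message trace follows the contract,
    and the state reached (or the state at which the trace went wrong)."""
    state = start
    for step in trace:
        transitions = contract[state]
        if step not in transitions:
            return False, state
        state = transitions[step]
    return True, state


good = check_trace(CONTRACT, [("client", "Request"), ("server", "Reply")])
bad = check_trace(CONTRACT, [("server", "Reply")])
```

Because both endpoints agree on the same state machine, a receiver never has to handle a message that is not legal in the current state, which is one way to read the slide's claim that checked behavior enables efficient communication.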
Pervasive Safe Languages • Singularity is written in extended C#: actually Spec# (C# + pre/post-conditions and invariants) • Added features for systems programming: increase programmer control over allocation, initialization, and memory layout • Language design to support programming and verification: message passing; factoring libraries into composable pieces; compile-time reflection
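For readers unfamiliar with pre- and post-conditions, the following sketch shows the idea in Python. It is not Spec# syntax: the `contract` decorator and `ContractError` are hypothetical helpers, and the conditions are checked only at run time here.

```python
import functools


class ContractError(Exception):
    """Raised when a precondition or postcondition does not hold."""


def contract(pre=None, post=None):
    """Attach a precondition on the arguments and a postcondition on the result."""
    def wrap(fn):
        @functools.wraps(fn)
        def checked(*args):
            if pre is not None and not pre(*args):
                raise ContractError(f"precondition of {fn.__name__} failed")
            result = fn(*args)
            if post is not None and not post(result, *args):
                raise ContractError(f"postcondition of {fn.__name__} failed")
            return result
        return checked
    return wrap


@contract(
    pre=lambda xs: len(xs) > 0,                                # caller's obligation
    post=lambda r, xs: r in xs and all(r >= x for x in xs),    # callee's guarantee
)
def maximum(xs):
    return max(xs)
```

Calling `maximum([])` raises `ContractError` at the call boundary instead of failing somewhere inside the function, which makes the caller's mistake explicit.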
What About The Runtime? • JVM and CLR designs are not always appropriate • Rich runtime ("one size fits all"): monolithic, general-purpose environment; large memory footprints (~4 MB process for CLR); many dependencies (CLR PAL requires >300 Win32 APIs) • JIT compilation increases runtime size and complexity: unpredictable performance • Replicate OS functionality: security, threading, configuration, etc. • More is less
Singularity Runtime (diagram labels: Application, Libraries, Singularity Runtime (GC, etc.), Bartok Compiler, Whole Program Optimization, x86 Executable, TAL Proof, Singularity Process)
Small, Customizable Runtime • Small execution environment: ahead-of-time, global optimizing compiler (MSR Bartok) specializes runtime and libraries; eliminate code for unused/disabled language features and unused application/library code; factorable runtime and libraries • Runtime, garbage collector, and libraries selectable on per-process basis: reduce memory and computation overhead; enforce design discipline and system policies per process • Eliminate OS functionality from runtime: security, resource allocation, etc. • Provide OS mechanism for enforcing system policy: runtime can constrain behavior (e.g. driver environment)
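Eliminating unused application and library code is possible when the whole program is known ahead of time. A minimal sketch of the idea is reachability over a call graph from the program entry point. The function names and the call graph below are made up for illustration, and this is not a description of how Bartok is implemented.

```python
def reachable(call_graph, roots):
    """Return every function reachable from the roots by following call edges."""
    seen = set()
    stack = list(roots)
    while stack:
        fn = stack.pop()
        if fn in seen:
            continue
        seen.add(fn)
        stack.extend(call_graph.get(fn, ()))
    return seen


# Hypothetical whole-program call graph: caller -> callees.
call_graph = {
    "main": ["parse", "gc_alloc"],
    "parse": ["gc_alloc"],
    "gc_alloc": [],
    "reflection_invoke": ["gc_alloc"],  # library feature this program never uses
    "float_format": [],                 # likewise unused
}

kept = reachable(call_graph, ["main"])
removed = sorted(set(call_graph) - kept)
```

This only works because the process is closed: if code could be loaded at run time, a function that looks unreachable today might be called tomorrow, and it could not be dropped safely.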
Run-Time Resilience • Software errors should not cause system failure • Resilient system architecture: isolate system components to prevent data corruption; provide clear failure notification; implement policy for restarting failed component
Process Architectures (diagram contrasting the Open Process and Single Process architectures; labels: App, Process, OS, OS Kernel)
Open Process Architecture • Open processes: dynamic code loading and runtime code generation (DLLs, Java class loading, browser plug-ins, device drivers in kernel, etc.); cross-process memory sharing; system API allows one process to alter state of another • Near ubiquitous (Windows, Unix, etc.) • Shared state reduces dependability: 85% of Windows crashes are caused by third party code in kernel; interfaces between extension and host are often poorly documented and understood; no isolation boundary between code and extension; extension can access non-public interfaces (reflection)
Single Process Architecture • All code and data in single address space: rely on language and memory safety to isolate components; dynamic code loading and runtime code generation; easy data sharing • Xerox PARC (Cedar, Smalltalk, etc.) and Lisp Machine model; Java and .NET model as well • Runtime is single point of failure: shared runtime must also meet all applications' requirements • Difficult to constrain interactions
Singularity Sealed Processes • Singularity processes are sealed: no dynamic code loading or run-time code generation; all code present when process starts execution • Extensions execute in separate processes: separate closed environments with well-defined interfaces; no shared memory • Process is fundamental unit of failure isolation • Better: security, verification, failure handling, optimization • (Diagram labels: Process, Extension, OS Kernel)
Software Isolated Processes (SIPs) • Protection and isolation enforced by language safety and kernel API design: process owns a set of pages; all of a process's objects reside on its pages (object space, not address space); language safety ensures process can't create or mutate references to other pages • Global invariants: no process contains a pointer to another process's object space; no pointers from exchange heap into process • (Diagram labels: P1, P2, P3)
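The two global invariants on this slide can be stated as a check over page ownership. The sketch below is illustrative only: the page table, the `"exchange"` owner label, and the list of pointers are invented data, and Singularity relies on language safety to guarantee the invariants instead of scanning memory like this.

```python
def isolation_violations(page_owner, pointers):
    """page_owner maps a page to its owner: a process name or "exchange".
    pointers is a list of (from_page, to_page) references.
    Returns the references that break either global invariant."""
    violations = []
    for src, dst in pointers:
        src_owner, dst_owner = page_owner[src], page_owner[dst]
        if src_owner == dst_owner:
            continue  # stays inside one object space (or inside the exchange heap)
        if dst_owner == "exchange":
            continue  # a process may refer to data in the exchange heap
        # Remaining cases: process -> another process, or exchange heap -> process.
        violations.append((src, dst))
    return violations


page_owner = {0: "P1", 1: "P1", 2: "P2", 3: "exchange"}
pointers = [
    (0, 1),  # P1 -> P1: fine
    (0, 3),  # P1 -> exchange heap: fine
    (1, 2),  # P1 -> P2: violates "no pointer into another process's object space"
    (3, 2),  # exchange heap -> P2: violates "no pointers from exchange heap into process"
]
violations = isolation_violations(page_owner, pointers)
```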
Interprocess Communications • Channels are strongly typed (value and behavior), bidirectional communication ports: message passing with extensive language support • Messages live outside processes, in exchange heap: only a single reference to a message • "Mailbox" semantics enforced by linear types • (Diagram labels: P1, P2, P3, exchange heap)
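The rule that only a single reference to a message exists means that sending a message transfers ownership instead of copying data. The sketch below models that rule with a run-time ownership table. The `ExchangeHeap` class and its methods are hypothetical, and the dynamic check stands in for what the slide says is enforced by linear types.

```python
class ExchangeHeap:
    """Toy exchange heap: every message has exactly one owning process."""

    def __init__(self):
        self._owner = {}    # message id -> owning process
        self._payload = {}  # message id -> data
        self._next_id = 0

    def allocate(self, owner, payload):
        message_id = self._next_id
        self._next_id += 1
        self._owner[message_id] = owner
        self._payload[message_id] = payload
        return message_id

    def send(self, sender, receiver, message_id):
        # Ownership moves to the receiver; the payload itself is not copied.
        if self._owner.get(message_id) != sender:
            raise PermissionError(f"{sender} does not own message {message_id}")
        self._owner[message_id] = receiver

    def read(self, process, message_id):
        if self._owner.get(message_id) != process:
            raise PermissionError(f"{process} does not own message {message_id}")
        return self._payload[message_id]


heap = ExchangeHeap()
msg = heap.allocate("P1", "hello")
heap.send("P1", "P2", msg)
received = heap.read("P2", msg)
```

After the send, any attempt by `P1` to read or resend `msg` raises `PermissionError`, so the two processes never share mutable state through the message.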
More Verification • Integrate specifications throughout system: language; interprocess communication; system configuration • Detect errors early, verify code late • Language safety essential to system integrity • (Diagram labels: Sing#, C# source, csc, sgc, MSIL+, byte code verification, bartok, compiler verification, x86, type assembly language verification, safety proof, Singularity TCB, Singularity system)