
Towards Elastic Operating Systems

Amit Gupta, Ehab Ababneh, Richard Han, Eric Keller. University of Colorado, Boulder.


Presentation Transcript


  1. Towards Elastic Operating Systems. Amit Gupta, Ehab Ababneh, Richard Han, Eric Keller. University of Colorado, Boulder

  2. OS + Cloud Today (OS/Process + ELB/Cloud Manager) • Resources limited: thrashing, limited CPUs, I/O bottlenecks (network, storage) • Present workarounds: additional scripting/code changes, extra modules/frameworks, coordination (synching/aggregating state)

  3. Stretch the Process (OS/Process) • Advantages • Expands available memory • Extends the scope of multithreaded parallelism (more CPUs available) • Mitigates I/O bottlenecks (network, storage)

  4. ElasticOS : Our Vision

  5. ElasticOS: Our Goals • “Elasticity” as an OS service • Elasticize all resources: memory, CPU, network, … • Single-machine abstraction: apps unaware whether they’re running on 1 machine or 1,000 machines • Simpler parallelism • Compatible with an existing OS (e.g., Linux)

  6. “Stretched” Process: Unified Address Space. (Diagram: the OS/process spans nodes; an elastic page table records each page’s location.)

  7. Movable Execution Context • OS handles elasticity; apps don’t change • Partition locality across multiple nodes • Useful for single (and multiple) threads • For multiple threads, seamlessly exploit network I/O and CPU parallelism

  8. Replicate Code, Partition Data. (Diagram: CODE pages replicated on every node; data split into Data 1 and Data 2.) • Unique copy of data (unlike DSM) • Execution context follows data (unlike process migration, SSI). A placement sketch follows below.
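
  A minimal sketch of how this placement rule might look, assuming hypothetical helpers (the deck names the policy, not an implementation): executable, read-only pages are replicated on every node, while writable data pages get a single home node.

      /* Placement policy sketch: replicate code, partition data.
       * struct epage and classify_page() are hypothetical, not
       * Linux kernel APIs. */
      enum placement { REPLICATE_ALL_NODES, PLACE_ON_HOME_NODE };

      struct epage {
          int exec;       /* mapped from a code (text) region */
          int writable;   /* page may be written              */
      };

      static enum placement classify_page(const struct epage *pg)
      {
          /* Code pages are read-only, so every node can hold a
           * copy with no consistency burden (unlike DSM). */
          if (pg->exec && !pg->writable)
              return REPLICATE_ALL_NODES;

          /* Data pages keep a unique copy; execution jumps to
           * wherever the data lives (unlike process migration). */
          return PLACE_ON_HOME_NODE;
      }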

  9. Exploiting Elastic Locality • We need an adaptive page-clustering algorithm • LRU, NSWAP, i.e., “always pull” • Execution follows data, i.e., “always jump” • Hybrid (initial): pull pages, then jump, as in the sketch below
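
  A sketch of that hybrid policy under stated assumptions: MAX_NODES, the threshold value, and the pull_page()/jump_execution_context() helpers are hypothetical, since the deck gives only the idea (slide 14 shows the same pull counter and threshold).

      #define MAX_NODES      64  /* assumed cluster size       */
      #define PULL_THRESHOLD  8  /* assumed tunable, not given */

      /* Hypothetical mechanisms; the deck names the operations only. */
      void pull_page(int node, unsigned long vaddr);
      void jump_execution_context(int node);

      static unsigned pulls_from[MAX_NODES]; /* recent pulls per remote node */

      /* Invoked when execution faults on a page held by remote_node. */
      void on_remote_page_fault(int remote_node, unsigned long vaddr)
      {
          if (++pulls_from[remote_node] < PULL_THRESHOLD) {
              /* Few recent pulls: cheaper to bring the page here. */
              pull_page(remote_node, vaddr);
          } else {
              /* Threshold reached: locality lives over there, so
               * move the execution context to the data instead. */
              pulls_from[remote_node] = 0;
              jump_execution_context(remote_node);
          }
      }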

  10. Status and Future Work • Complete our initial prototype • Improve our page placement algorithm • Improve context jump efficiency • Investigate Fault Tolerance issues

  11. Thank You. Questions? Contact: amit.gupta@colorado.edu

  12. Algorithm Performance (1)

  13. Algorithm Performance (2)

  14. Page Placement: Multinode Adaptive LRU. Pull pages first; when the pull threshold is reached, jump the execution context. (Diagram: two nodes, each with CPUs, memory, and swap.)

  15. Locality in a Single Thread: temporal locality. (Diagram: two nodes, each with CPUs, memory, and swap.)

  16. Locality across Multiple Threads. (Diagram: several nodes, each with CPUs, memory, and swap.)

  17. Unlike DSM…

  18. Exploiting Elastic Locality • Assumptions: replicate code pages, place data pages (vs. DSM) • We need an adaptive page-clustering algorithm • LRU, NSWAP • Ours (initial): pull pages, then jump

  19. Replicate Code, Distribute Data. (Diagram: CODE replicated on every node; execution hops across nodes as it accesses Data 1, Data 2, then Data 1 again.) • Unique copy of data (vs. DSM) • Execution context follows data (vs. process migration)

  20. Benefits • OS handles elasticity; apps don’t change • Partition locality across multiple nodes • Useful for single (and multiple) threads • For multiple threads, seamlessly exploit network I/O and CPU parallelism

  21. Benefits (delete) • OS handles elasticity • Application ideally runs unmodified • Application is naturally partitioned… • By page-access locality • By seamlessly exploiting multithreaded parallelism • By intelligent page placement

  22. How should we place pages?

  23. Execution Context Jumping: a single-thread example. (Diagram: one process address space spread over Node 1 and Node 2; execution jumps between them over time.) A sketch of the transferred state follows below.
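
  The deck does not spell out what a jump carries; a minimal sketch, assuming the transferred state is essentially the register file plus enough metadata to resume the thread remotely (all names hypothetical):

      #include <stdint.h>

      /* Hypothetical wire format for a context jump: pages stay
       * where they are, only execution state crosses the network. */
      struct jump_msg {
          uint64_t pid;      /* stretched-process identifier     */
          uint64_t ip;       /* instruction pointer to resume at */
          uint64_t sp;       /* stack pointer                    */
          uint64_t gpr[16];  /* general-purpose registers        */
          uint64_t flags;    /* condition codes / CPU flags      */
      };

      /* Ship the context and stop running locally; the receiver
       * loads the registers and continues where the thread left off. */
      void jump_to(int dest_node, const struct jump_msg *ctx);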

  24. “Stretch” a Process: Unified Address Space. (Diagram: one process address space spanning Node 1 and Node 2; each page table entry carries a node IP address.)

  25. Operating Systems Today • Resource limit = 1 node. (Diagram: a process on one OS with its memory, CPUs, and disks.)

  26. Cloud Applications at Scale. More queries? More resources? (Diagram: a cloud manager and load balancer front several processes, each with its own partitioned data, under a framework such as MapReduce.)

  27. Our Findings • Important tradeoff: data-page pulls vs. execution-context jumps • Latency cost is realistic • Our algorithm’s worst case, “always pull” (== NSWAP), still shows marginal improvements

  28. Advantages • Natural Groupings: Threads & Pages • Align resources with inherent parallelism • Leverage existing mechanisms for synchronization

  29. “Stretch” a Process: Unified Address Space. A “stretched” process = a collection of pages + other resources, spread across several machines. (Diagram: the page table carries node IP addresses; each machine contributes memory, swap, and CPUs.)

  30. (delete) Execution Context Follows Data • Replicate code pages: read-only => no consistency burden • Smartly distribute data pages • Execution context can jump: it moves towards the data • *Converse also allowed*

  31. Elasticity in Cloud Apps Today. (Diagram: input data flows into one node holding partitions D1, D2, …, Dx in memory and on disk; output data flows out.)

  32. (Diagram: input queries pass through a load balancer to partitions D1, D2, …, Dx, Dy in memory and on disk; output data flows out.)

  33. (delete) Goals: Elasticity Dimensions • Extend elasticity to memory, CPU, and I/O (network, storage)

  34. Thank You

  35. Bang Head Here !

  36. Stretching a Thread

  37. Overlapping Elastic Processes

  38. *Code Follows Data*

  39. Application Locality

  40. Possible Animation?

  41. Multinode Adaptive LRU

  42. Possible Animation?

  43. Open Topics • Fault tolerance • Stack handling • Dynamically linked libraries • Locking

  44. Elastic Page Table: each page may live in local memory, local swap space, remote memory, or remote swap. A sketch of an entry follows below.
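
  A minimal sketch of one entry in such a table, assuming the four location classes above plus the owning node's address (field names are hypothetical):

      #include <stdint.h>

      /* The four places the elastic page table can record. */
      enum epage_loc {
          LOC_LOCAL_MEM,    /* resident in this node's RAM     */
          LOC_LOCAL_SWAP,   /* swapped out on this node's disk */
          LOC_REMOTE_MEM,   /* resident in a remote node's RAM */
          LOC_REMOTE_SWAP,  /* swapped out on a remote node    */
      };

      /* One entry per virtual page of the stretched process. */
      struct epage_entry {
          enum epage_loc loc; /* where the page lives now          */
          uint32_t node_ip;   /* IPv4 of the holding node (remote) */
          uint64_t slot;      /* frame number or swap slot there   */
      };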

  45. “Stretch” a Process • Move beyond the resource boundaries of ONE machine: CPU, memory, network, I/O

  46. (Diagram: input data processed across two nodes, with partitions D1 and D2 spread over each node’s memory and disk; output data flows out.)

  47. (Diagram: data partitions D1 and D2 spread across two nodes’ memory and disks.)

  48. Reinventing the Elasticity Wheel
