
University of California, Irvine and San Diego



Presentation Transcript


  1. Energy-Aware System Design for Wireless Multimedia University of California, Irvine and San Diego Philips Semiconductors

  2. Talk Outline • Overview • Distributed Wireless Multimedia • Case Study: • The FORGE Framework • IP Reuse at Philips

  3. Power optimization in battery-operated mobile devices is a crucial research challenge • Devices operate in dynamic distributed environments. • Future power management strategies need to be aware of global system changes.

  4. Infrastructure for Mobile Multimedia Environments [Figure: low-power mobile devices (laptop, palmtop, iPAQ) send requests over a wireless network, through access points (AP) and a wide area network, to services (Online Gaming, Video Streaming, Distance Learning, Online Banking, Chat) and receive responses] • Best-Effort Service

  5. Enhanced Infrastructure [Figure: low-power mobile devices (laptop, palmtop, iPAQ) reach services (Online Gaming, Video Streaming, Distance Learning, Online Banking, Chat) over a wireless network, access points (AP) and a wide area network that now includes a Broker, a Directory Service and proxies PROXY-1 to PROXY-N. Proxy services: caching, compression, encryption, decryption, compositing, transcoding, executing remote tasks. Proxy inputs: data from mobile host, data from server]

  6. Challenges in Wireless Multimedia Processing • Proliferation of Devices • System support for a multitude of smart devices that • attach and detach from a distributed infrastructure • produce large volumes of information at a high rate • are limited by communication and power constraints • Need a customizable networking backbone • QoS-driven resource provisioning algorithms for highly dynamic environments • Need to deal adaptively with incoming requests • Dynamically reconfigure system to service requests

  7. High Data Volume of Multimedia Information

  8. Challenges in Wireless Multimedia Processing • Dealing with Device Mobility • Need high degree of “network awareness” • congestion rates, mobility patterns, etc. • global system state is constantly changing • Service Brokering for QoS-Aware Resource Provisioning • Admission control, load balancing, etc. • Multimedia Processing challenges • Soft Real-Time Constraints • Synchronization (e.g. lip sync, floor control) • Support for traditional media (text, images) and continuous media (audio/video) • Other considerations – Availability, Reliability, Cost-Effectiveness & Security

  9. Distributed Wireless Multimedia • Different forms of information accessible anytime • Multiple sessions with varying characteristics. • Services, Networks and Systems • Heterogeneous, evolve dynamically • Quality of Service • Constraints: timing, resource availability, network constraints (e.g. bandwidth), security, reliability … • Example: For Multimedia Streaming to Handheld Devices • QoS Parameters: jitter, frame rate, resolution, bit-rate, etc. • All these QoS parameters affect user perception. • Power is a new QoS dimension in distributed multimedia. • User must be able to watch the requested video without running out of battery

  10. Multimedia Streaming Example • We use this framework for examining the design challenges • Proxy node between servers and clients allows dynamic stream transformations (transcoding, adaptation, annotation, etc.) [Figure: media servers, wired network, proxy, access point, wireless network, clients (handheld PC, PDA, phone)]

  11. Opportunities in Wireless Multimedia System Design • Dynamic nature of multimedia tasks leaves some computational slack • Slack = Difference between computational capability and computational requirements due to deadlines • QoS trade-offs possible for reducing energy consumption • Example: Lower quality video needs less computation/bandwidth • Multimedia Applications Characteristics • Kernels of computation-dominated operations • E.g. MPEG: IDCT, motion compensation, VLD • Predictable, regular behavior (most of the time) • E.g. VLD, followed by IQ, IDCT • Clear computation and/or data access patterns (cyclic) • E.g. video frames are traversed in a known order • Exploit multimedia specific characteristics to enable a range of optimization techniques

  12. Implications on Wireless Multimedia System Design • Devise strategies that reduce energy • These strategies must adapt to/optimize for changes in • Application Data (video stream) • OS/Hardware (CPU, Memory, Reconfigurable logic) • Network (congestion, noise, node mobility) • Residual Energy (battery) • Environment (Ambient light, sound) • Strategies can • Change application behavior (compression ratio) • Reduce backlight • Buffer Data (and switch off network card)

  13. Abstraction Layers in Distributed Multimedia Systems [Figure: abstraction layers. Applications: Video Player, Other Tasks. Middleware: Network Management, Transcoding, Admission Control. Operating System: DVS, Scheduler. H/W: Network Card, CPU, Display, Cache, Memory, Reg Files. A server is connected to Client 1 … Client i … Client n] Challenges • Enable high quality of service (particularly multimedia services) at the mobile device: high computational capability • Do so within strict peak power and energy budgets • E.g.: Play video stream at highest quality (requires computation), while ensuring the entire video plays back (requires energy)

  14. Energy Aware System Design Techniques • Several approaches optimize energy for each component and each abstraction level • Solutions – at each abstraction level • Architecture: • Cache/Memory optimizations, Processor architectural optimizations • Operating System • Dynamic voltage scaling (DVS) • Dynamic power management (DPM) of System components: disks, network interfaces • Middleware solutions • Adaptive streaming, mobility based adaptations • Application adaptations • Profiling applications for low power execution

  15. Related Work [Figure: layer stack User/Application, Distributed Middleware, Operating System, Architecture (CPU, memory). Interfaces: Quality of Service, application/user feedback; DVS, DPM, driver interfaces, system calls. Adaptation types: distributed adaptation, cross-layer adaptation, application-specific adaptation] • PROXY-BASED ADAPTATION for POWER AWARENESS • Shenoy (transcoding), Chandra (network), Mohapatra (OS, arch, network + transcoding) • CROSS-LAYER ADAPTATION • GRACE (Illinois), FORGE/DYNAMO (UCI) • Flinn (ICDSP 2001), Yau (ICME 2002) • Krintz, Wolski (UCSD) • Noble (SOSP 97, MCSA 1999) • Li (CASES 2002), Othman (1998) • Abeni (RTSS 98) • Rudenko (ACM SAC 99), Satyanarayanan (2001) • Nahrstedt (GRACE, UIUC - MMCN 2002, 2003) • Shenoy (MMCN 2002), Rajkumar (ICDCS 2003) • Mohapatra (ICDCS, MWCN 2003), Xu (DCS 03) • Efstratiou, Friday (Middleware 2000) • FORGE Project UCI (ACM MM, RTAS, CIPC 03) • Ellis, Vahdat (ECOSystem, Currentcy, ASPLOS 02) • Hao, Nahrstedt (ICMCS 99, HPDC 99, Globecom) • DVS (Shin, Gupta, Weiser, Srivastava, Govil et al.) • DPM (Douglis, Helmbold, Delaluz, Kumpf et al.) • Chandra (MMCN 02), Katz (IEICE 97), Chou (02) • Feeney, Nilsson (Infocom 2001) • Soderquist (ACM Multimedia 97) • Azevedo (AWIA 2001) • Hughes, Adve (MICRO 01, ISCA 01) • Brooks (ISCA 2000), Choi (ISLPED 02) • Lebeck (ASPLOS 2000), Microsoft’s ACPI

  16. Talk Outline • Overview • Distributed Wireless Multimedia • Case Study: • The FORGE Framework • IP Reuse at Philips

  17. Traditional Approach: A Closer Look [Figure: wireless distributed infrastructure (traditional). A low-power mobile device sends requests over the wireless network and the wide area network to a server in the network infrastructure and receives responses. Power management on the low-power device is confined to its own layers: Application, Operating System, Architecture]

  18. Drawbacks • Limited co-ordination between the different computation layers (Architecture, OS, application) • Lack of generalized framework • Example (DVS in presence of architectural opt.) • Do not exploit global system knowledge • Network congestion levels • Device mobility information • Data characteristics • Cross-layer coordination directed by a distributed middleware framework can effectively address the above limitations.

  19. A Global Coordinated Approach in FORGE [Figure: local cross-layer adaptation on the device (User/Application, Distributed Middleware, Operating System, Architecture) combined with global proxy-based adaptation through the network, a Proxy and a Directory Service] • Build a power-aware distributed embedded system framework that can • Exploit global changes (network congestion, system loads, mobility patterns) to improve local adaptations • Distribute local information (e.g. device mobility, residual power) for improved global adaptations • Co-ordinate power management strategies at different levels (application, middleware, OS, architecture) • Maximize the utility (application QoS, power savings) of a mobile device.

  20. FORGE: Layers and Interactions [Figure: proxy-based adaptation at the proxy middleware, and cross-layer adaptation on the local device across user applications (utility), the middleware device runtime (API interface), the operating system (DVS, scheduler, device drivers) and hardware (network card, display, CPU, cache, memory, register files)] • Sent by the proxy: transcoded payload/data, settings for transmitted data, control information (network transmission) • Proxy middleware strategies: admission control, task partitioning, adaptive network transmission, network optimization, … • Reported by the device: mobility information, current residual power, utility levels supported, user requirements for admission control • Collected locally on the device: user info, application-specific info, NIC idle periods, video encoding info, display settings, residual power info • Power API: architecture-specific settings and knobs (register file sizes, cache config)

  21. Outline for the rest of the talk • Examine Energy optimization knobs at each abstraction level • Examine how cross-layer coordination can reduce energy further • Specifically, we will talk about: • Using Reconfigurable Caches • Adaptive DVS techniques • Network Card shut-down by buffering video data • Reducing Backlight by Video Enhancement

  22. Hardware/Architectural Level Knobs • Major sources of power consumption • Display (Backlight) • Network Interface • CPU – particularly memory sub-system • We will discuss two Middleware/HW optimizations: • Quality-Driven Cache Reconfiguration • Dynamic Backlight Adjustment

  23. Quality-Driven Cache Reconfiguration (Hardware-Level Optimization) • Why caches? • High relative power consumption (above 50%) • Influences external memory power • Idea: reconfigure data cache for specific video stream format requirements • Cache power knobs used: size, associativity • Goal: Find best configuration for each quality level • Plus: combine with dynamic voltage scaling (DVS) • Application: MPEG decoding • Frame decoding may take less than frame delay • Slack time: θ = Fd – D (between deadline & end of computation)
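
The slack definition on this slide can be illustrated with a small sketch. It assumes that Fd is the frame delay (the per-frame deadline), that D is the decode time at full speed, and that decode time scales inversely with clock frequency; none of these modelling choices is confirmed by the slides. The function names and the 25 frames/s and 24 ms figures are hypothetical; only the 400 MHz clock comes from the base configuration quoted later in the talk.

```python
def slack(frame_delay_s, decode_time_s):
    """Slack theta = Fd - D; a negative value means the deadline was missed."""
    return frame_delay_s - decode_time_s


def scaled_frequency(f_max_hz, frame_delay_s, decode_time_s):
    """Lowest clock that still finishes decoding by the deadline.

    Assumption: decode time is inversely proportional to clock frequency.
    """
    if decode_time_s >= frame_delay_s:
        return f_max_hz  # no slack, stay at full speed
    return f_max_hz * decode_time_s / frame_delay_s


frame_delay = 1 / 25   # hypothetical stream at 25 frames/s -> 40 ms per frame
decode_time = 0.024    # hypothetical decode time at full speed (24 ms)

theta = slack(frame_delay, decode_time)
f_min = scaled_frequency(400e6, frame_delay, decode_time)
print(f"slack = {theta * 1e3:.1f} ms, lowest safe clock = {f_min / 1e6:.0f} MHz")
```

Under these assumptions, a cache configuration that shortens D enlarges θ and lets DVS pick a lower clock, which is the interaction the next slides examine.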

  24. Impact of Cache Parameters on Energy • Profiled short (10 s) video clips (quality: low - high) for all cache configurations – parameters varied: • Size: 4KB – 64KB • Associativity: 1 – 32 • Energy savings: 10-15% (CPU + memory) over 32x32 baseline • Experimental Setup: • Wattch/SimpleScalar • Berkeley MPEG tools • Observations: • Associativity: largest impact on energy • Best cache configuration reflects • internal storage requirements for different frame sizes • decoding algorithm internal organization (data sets) [Figure: “Action” clip, high quality]

  25. Cache Configuration + DVS • Interaction of DVS with cache configurations • Cache configurations with the largest frame decoding slack enable the largest DVS savings • Results: up to 60% energy savings over base config • Base configuration: 400 MHz, 1.3 V, 32 KB, 32-way set associative [Table: Middleware Rule Base for Best Config at each Quality Level, quality high to low]

  26. OS Directed Power Management • OS has a global view of what is going on in the whole system • Applications should communicate: • Quality of service, timing restrictions • The OS decides how to configure the knobs available • Ex: Processor frequency and voltage scaling

  27. Power Aware Software Architecture (PASA) • PA-API (Power Aware API) • Application/OS Interface • Makes power aware OS services available to the application writer. • PA-OSL (Power Aware Operating System Layer) • Implements modified OS services and active components such as a DPM manager. • PA-HAL (Power Aware Hardware Abstraction Layer) • OS/Hardware Interface • Makes power control knobs available to the OS programmer. [Figure: software stack with Adaptable Applications, Power Aware API, Middleware, PA OS Services, Local PM, OS, Power Aware HAL, OS HAL, Hardware]

  28. Operating system driven DVS • Slow down the CPU based on workload and timing restrictions (slowdown factors f < 1) • We model real-time task sets with periods equal to deadlines, scheduled with rate-monotonic scheduling (RMS) • We implemented 4 variations of DVS with CPU shutdown: • Shutdown when idle – shut down the processor as soon as the CPU becomes idle • Static slowdown factors – calculated offline, based on RM schedulability analysis (using the WCETs) • Dynamic slowdown – run-time slowdown factors are predicted based on a history of execution times • Adaptive slowdown – a third slowdown factor, adapted according to the number of deadlines missed in a previous window of executions.
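
As a rough illustration of the static and adaptive variants, the sketch below derives a static slowdown factor from the Liu-Layland utilization bound for rate-monotonic scheduling and then nudges a factor up or down from a deadline-miss count. This is a sufficient test chosen for brevity, not necessarily the schedulability analysis used in the talk, and the adaptation rule, step size, floor and task parameters are all hypothetical.

```python
def rm_bound(n):
    """Liu-Layland utilization bound for n periodic tasks under RM."""
    return n * (2 ** (1.0 / n) - 1)


def static_slowdown(tasks):
    """Smallest speed factor (0 < f <= 1) that keeps the RM bound satisfied.

    tasks: list of (wcet, period) pairs measured at full speed, with
    deadlines equal to periods. Running at factor f stretches every WCET
    by 1/f, so utilization becomes U / f. Returns None when the bound
    test already fails at full speed.
    """
    utilization = sum(wcet / period for wcet, period in tasks)
    bound = rm_bound(len(tasks))
    if utilization > bound:
        return None
    return utilization / bound


def adapt_slowdown(factor, misses_in_window, max_misses=0, step=0.05, floor=0.1):
    """Hypothetical adaptive rule: speed up after misses, else slow down."""
    if misses_in_window > max_misses:
        return min(1.0, factor + step)
    return max(floor, factor - step)


# Hypothetical task set: (WCET, period) in milliseconds.
task_set = [(1, 10), (2, 20), (4, 40)]
f_static = static_slowdown(task_set)
print(f"static slowdown factor: {f_static:.3f}")
print(f"after 2 misses: {adapt_slowdown(0.5, 2):.2f}")
```

A lower floor or larger step saves more energy but risks more deadline misses, which is the trade-off the observations slide points out.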

  29. Implementation • We modified the eCos real-time operating system running on an XScale platform (80200EVB) with dynamic frequency and voltage scaling hardware. • For the DVS techniques, we implemented real task sets to validate the software implementation: • MPEG decoding, ADPCM and FFT

  30. Observations • Adaptive slowdown achieves about 30-40% savings • However, deadline misses increase (not shown here) • OS/Middleware have to trade off deadline misses against energy savings/slowdown factors [Figure legend: Task Set A: T1, T3, T4; B: T2, T3, T4; C: T1, T3, T5]

  31. Middleware Controlled Network Data Buffering • Wireless NICs consume significantly less energy in sleep mode (NIC = Network Interface Card) • Avg. power consumption in sleep mode = 0.184 W, whereas • idle & receive modes consume 1.34 & 1.435 W respectively • Transmitting video data in bursts can help save power. • NIC on device can be transitioned into sleep mode • The middleware on the proxy is used to buffer video data and transmit it in bursts to the device. • Additionally, based on the residual energy feedback from the device, the middleware can transcode the video stream according to a Quality/Power Matrix.
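
To see why bursty transmission helps, the sketch below compares the average NIC power when the card idles between packets with the average power when it sleeps between bursts, using the three power figures quoted on this slide. It is an idealized model: transition time and energy between modes are assumed to be zero, and the 25% receive fraction is a hypothetical value.

```python
P_SLEEP_W = 0.184  # average power in sleep mode (from the slide)
P_IDLE_W = 1.34    # average power in idle mode (from the slide)
P_RECV_W = 1.435   # average power in receive mode (from the slide)


def avg_power_always_on(recv_fraction):
    """NIC receives for recv_fraction of the time and idles otherwise."""
    return recv_fraction * P_RECV_W + (1 - recv_fraction) * P_IDLE_W


def avg_power_bursty(recv_fraction):
    """Same data received in bursts; NIC sleeps for the rest of the time.

    Assumption: mode transitions are free, which overstates the savings.
    """
    return recv_fraction * P_RECV_W + (1 - recv_fraction) * P_SLEEP_W


recv_fraction = 0.25  # hypothetical: stream keeps the NIC busy 25% of the time
on = avg_power_always_on(recv_fraction)
bursty = avg_power_bursty(recv_fraction)
savings = 1 - bursty / on
print(f"always on: {on:.3f} W, bursty: {bursty:.3f} W, savings: {savings:.0%}")
```

A higher-quality stream raises the receive fraction and leaves less time for sleeping, so the savings shrink as quality grows.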

  32. Power savings decrease as video quality increases • Amount of data buffering possible is less at higher quality • This is an ideal model: in practice, network noise will mean that the network interface has to be left on for longer periods of time

  33. Reducing Backlight for Lower Power • Identify “Groups of Scenes” with little variance in luminosity • Increase pixel luminance and reduce backlight level • To avoid loss of contrast (due to pixel luminance saturation) • Perform spatial convolution using high pass filter • This sharpens objects in the image [Figure: Power consumed at various backlight levels during streaming multimedia playback on the Compaq iPAQ]
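
The luminance compensation step can be sketched as follows, assuming perceived brightness is roughly the product of backlight level and pixel luminance, so that dimming the backlight to a given ratio is offset by scaling pixel values by the inverse. That model, the 8-bit pixel range and the sample values are assumptions; the sketch also shows where saturation appears, which is the contrast loss the slide mentions. The high-pass filtering step is not covered.

```python
def compensate(pixels, backlight_ratio):
    """Scale 8-bit luminance values to offset a dimmed backlight.

    backlight_ratio: new backlight level divided by the original (0 < r <= 1).
    Returns the compensated pixels and how many of them saturated at 255.
    """
    gain = 1.0 / backlight_ratio
    compensated = [min(255, round(p * gain)) for p in pixels]
    clipped = sum(1 for p in pixels if p * gain > 255)
    return compensated, clipped


def max_dimming(pixels):
    """Lowest backlight ratio that causes no saturation for these pixels."""
    return max(pixels) / 255


scene = [40, 80, 120, 200]  # hypothetical luminance samples from one scene
print(compensate(scene, 0.8))
print(compensate(scene, 0.5))
print(f"dimming limit without clipping: {max_dimming(scene):.2f}")
```

Scenes with little luminosity variance and a low peak allow deeper dimming before any pixel clips, which is why grouping scenes by luminosity comes first.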

  34. Video Streams used for Experiments [Figure: snapshots of the MPEG-1 video streams used in experiments: bipolar.mpg, iceegg.mpg, intro.mpg, simpsons.mpg] [Table: characteristics of video streams used in experiment]

  35. Three Backlight Compensation Approaches • SBC: Simple Backlight Compensation • Only identify GOS, reduce backlight on handheld • No video stream contrast enhancement • CBVLC: Constant Backlight with Video Luminosity Compensation • Backlight level set once at start of video stream • Video stream is enhanced (dynamically at the proxy) • DCA: Dual Compensation Approach • Backlight level is dynamically changed based on GOS • Video stream is enhanced based on Backlight level decision

  36. Results for Backlight Compensation [Figure: results for the High Bright, Super Bright and Medium Bright backlight levels]

  37. Summary • We explored ways to reduce power by integrating power optimization techniques across abstraction layers • HW/OS/Middleware: Cache Reconfiguration, DVS, Backlight Reduction • OS/Application: Power Aware API for DVS • Middleware/Network: NIC Shutdown using data buffering • Conclusion: A Cross-Layer Coordinated Strategy is required for maximum energy savings • Information available at different abstraction levels can be used by either the OS or the middleware to make global decisions

  38. Ongoing Work • Exploits repetitive and cyclic characteristics of MPEG-2, MPEG-4/H.263 • Application and data profiling possible for reducing energy consumption • Energy Characterization of Security and Digital Media Protection algorithms • Security and IP protection of multimedia content has spawned a range of security measures • First step: We analyzed the effects of watermarking on energy and computation time on PDAs • Task partitioning between proxy and handheld for reducing total energy (=computation+communication) • For Video Streaming, Video Conferencing, Watermarking
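
The partitioning criterion (total energy = computation + communication) can be sketched as a comparison between placements. The sketch assumes that only the handheld's energy is counted and that each placement is described by CPU power, compute time, NIC power and transfer time; the function names and all numbers except the 1.435 W receive power are hypothetical.

```python
def total_energy_j(cpu_power_w, compute_s, nic_power_w, transfer_s):
    """Handheld energy for one placement: computation plus communication."""
    return cpu_power_w * compute_s + nic_power_w * transfer_s


def choose_placement(options):
    """Return the name of the placement with the lowest total energy."""
    return min(options, key=lambda name: total_energy_j(*options[name]))


# Hypothetical watermarking task: (CPU W, compute s, NIC W, transfer s).
options = {
    "handheld": (1.0, 4.0, 1.435, 0.5),  # heavy local compute, little traffic
    "proxy": (1.0, 0.2, 1.435, 2.0),     # light local compute, more traffic
}
best = choose_placement(options)
print(best, {name: round(total_energy_j(*v), 4) for name, v in options.items()})
```

With these made-up numbers offloading wins, but a slower or noisier link lengthens the transfer and can reverse the decision.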

  39. Talk Outline • Overview • Distributed Wireless Multimedia • Case Study: • The FORGE Framework • IP Reuse at Philips

  40. Blowing away the Barriers to Large Scale IP Reuse • Ralph von Vignau • 5 January 2004 • DATE Conference 2004, Paris, La Defense

  41. Philips and IP Reuse • Philips Semiconductors is a leading SoC developer • A reuse structure and policy for IP has been systematically introduced into the development environment. • There are rules and tools to support the reuse • CoReUse for HW • MoReUse for SW

  42. Philips and IP Reuse Background - 1 • Philips Semiconductors has a strategy of developing products based on System Silicon Platforms (SSPs). • Chameleon (MIPS subsystem generator) • ChipBuilder (ARM based system generator) • Demonstrates the value of automatic methods of integrating IP blocks into a subsystem along with its verification environment.

  43. Philips and IP Reuse Background - 2 “Need a generic framework that enables platform developers to implement their system in a consistent, flexible and easy-to-use way” Combining automatic methods of integrating configurable IP blocks together with their verification environment

  44. Lessons Learned • Factors that enable successful IP reuse • A centrally driven and supported company policy • Wide deployment with consultancy • Central repository • High quality that can be trusted • Ease of use • Good documentation • Central support • Distributed champions • Visible improvements and successes

  45. The Limits of the Current Policies • A standard set of views is provided for each IP block • Ensures compatibility with the development flows • Supports easier integration • Ensures a minimum of documentation is available • Is supported by checking tools • However: • Verification reuse is not yet included • The checking is done by in-house tools • The rules only apply to in-house IP • A far more radical change is required to move to the next level of reuse methodology. • Higher automation • Faster integration and verification • Higher quality • Flexibility in design flows

  46. Requirements for the next Level of Reuse • Extend reuse both within Philips as well as to the IP bought for use within Philips • The use of IP from multiple vendors must be made easier and less costly • Tools from various EDA vendors must be easier to integrate into a design flow • The verification of IP must be more: • Comprehensive, stretching from unit tests to system verification • Reusable in all stages of the SoC development • A higher automation in the development flow must be supported • Automated IP integration • Verification suite compilation

  47. Supportive Standardization • Although there are several activities and working groups throughout the industry and standardization groups, none has the industry focus or time drive set by the SPIRIT Consortium. • Major improvements can be achieved only if there is an industry drive to common standards for the reuse of IP. • WWW.spiritconsortium.com

  48. The SPIRIT Consortium • SPIRIT • Structure for Packaging, Integrating and Re-using IP within Tool-flows • A consortium of leading companies in the EDA, IP, system and semiconductor industries • Aim • To develop industry standards • Ease integration of semiconductor IP into Systems • Enable the interoperability of tools for IP integration

  49. Reason for the SPIRIT consortium • Industry demands • Complex System-on-Chip and Programmable Platforms require IP re-use • Device manufacturers need to be able to select IP from multiple sources • Unifying IP descriptions and access to this information permits best-in-class choices for both IP and tools

  50. SPIRIT Consortium Background • The founding companies decided mutually to establish a unified set of standards to increase efficiency of IP based SoC design • Combining technological strengths of SPIRIT members to • Create standards that will help express complex IP • Deliver greater flexibility and efficiency to the SoC design process
