Performance Analysis of a RTOS by Emulation of an Embedded System

June 17th, 1999. T. Steckstor, K. Weiß, W. Rosenstiel, Lehrstuhl für Technische Informatik, University of Tübingen, D-72076 Tübingen, Germany. E-mail: stecki@fzi.de


Presentation Transcript


  1. Performance Analysis of a RTOS by Emulation of an Embedded System
     June 17th, 1999
     T. Steckstor, K. Weiß, W. Rosenstiel
     Lehrstuhl für Technische Informatik, University of Tübingen, D-72076 Tübingen, Germany
     e-mail: stecki@fzi.de

  2. Outline
     • Introduction
     • Emulation environment: SPYDER-CORE-P1
     • Benchmark example: Actuator-Sensor-Interface (ASI) master unit
     • Embedded system performance analysis
     • Analysis results of different cache configurations and cache sizes
     • Conclusion

  3. Introduction
     • Embedded systems in industrial automation
       • Application-specific hardware implemented on an FPGA
       • Application-specific software running on a microcontroller
       • The interaction between the hardware part and the software part imposes hard real-time requirements (reaction times of about 200µs)
     • Motivation from an embedded system designer's point of view
       • Sophisticated software task architecture (RTOS)
       • Novel microcontroller architecture with caches
       • Because fast reactions to external events are required, task switching and interrupt reaction times become a major performance bottleneck

  4. Emulation
     • Embedded system with complex internal system behavior
     • Emulation is very close to the final target system and gives a detailed internal view
     • Emulation makes it possible to find the best hw/sw partitioning early in the design process
     • Emulation answers the following questions:
       • What is the optimum clock speed?
       • How much performance is consumed by the RTOS?
       • How large is the performance gain from the on-chip caches, and what can this gain be used for?
       • How do different cache sizes affect the important RTOS task switching and interrupt reaction times?

  5. Emulation environment: SPYDER-CORE-P1
     [Block diagram of the CORE-P1 AT-ISA add-on board. Labelled blocks: microcontroller core (Embedded PowerPC PPC403, 25..80MHz); 32 bit microcontroller bus; DRAM 1-128MB; FLASH 8MB; DPRAM 2KB; Ethernet 10MBit (Intra/Internet); 2 serial ports; analog module; peripheral devices; 8 Bit I/O bus; extension headers; FPGA architectures; Xilinx XC4000; Xilinx XC6000; Actel add-on; driver; AT-ISA bus]

  6. Benchmark example: ASI master unit
     [Diagram of the ASI communication system: ASI master, ASI power supply and up to 32 ASI slaves, with slaves labelled 4I and 4O]
     • Master call: 0 SB A4 A3 A2 A1 A0 I4 I3 I2 I1 I0 PB 1
     • Slave answer: 0 I3 I2 I1 I0 PB 1
     • ASI real-time critical constant: 220µs
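The two telegram layouts on this slide (master call: 0 SB A4..A0 I4..I0 PB 1; slave answer: 0 I3..I0 PB 1) can be illustrated with a small bit-level sketch. The function names are hypothetical, and the parity rule is an assumption: the slide only names the bit PB, and the sketch takes it to be even parity over the bits between the start bit and PB. Bit timing and line coding are not modelled.

```python
def to_bits(value, width):
    """Return `value` as a list of `width` bits, most significant bit first."""
    if not 0 <= value < (1 << width):
        raise ValueError(f"{value} does not fit in {width} bits")
    return [(value >> i) & 1 for i in range(width - 1, -1, -1)]


def even_parity(bits):
    """ASSUMPTION: PB is chosen so that the payload plus PB holds an even number of 1s."""
    return sum(bits) % 2


def master_call(sb, address, info):
    """Build the 14-bit master call: 0 SB A4..A0 I4..I0 PB 1."""
    payload = to_bits(sb, 1) + to_bits(address, 5) + to_bits(info, 5)
    return [0] + payload + [even_parity(payload), 1]


def slave_answer(info):
    """Build the 7-bit slave answer: 0 I3..I0 PB 1."""
    payload = to_bits(info, 4)
    return [0] + payload + [even_parity(payload), 1]


if __name__ == "__main__":
    print(master_call(0, 5, 0b10101))
    print(slave_answer(0b1010))
```

A master call is 14 bits long and a slave answer 7 bits; both start with 0 and end with 1, matching the frame layout on the slide.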

  7. Benchmark example: Implementation
     [Diagram: ASI application software and ASI hardware mapped onto the SPYDER-CORE-P1 hardware shown on slide 5]
     • ASI application software: http-server, C-server, control, tele_send, tele_receive, int_service, TCP/IP, VxWorks real-time operating system
     • ASI hardware (single channel): ASI-UART, register interface to and from the microcontroller
     • Target chip: XC4005E, 166 CLBs, utilization: 85%

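  8. Embedded system performance analysis
     [Timing diagram for the PPC403GA/33MHz with all caches disabled; time axis t in µs with ticks at 0, 100 and 200, bounded by the ASI real-time critical constant (220µs)]
     • Labelled phases: Int., int_reaction, int_service, task change, semTake, control task, I/O, RTOS
     • Labelled durations: 30, 60, 40, 80 and 10µs
     • Time used by the RTOS: 170µs (77%)
     • Time used by the application: 50µs (23%)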
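The percentages on this slide follow directly from the 220µs real-time critical constant. A minimal check, assuming the 170µs share belongs to the RTOS (as the conclusion slide states) and that the five numbers on the time axis are consecutive segment durations in µs; the variable and function names are illustrative only:

```python
CRITICAL_US = 220                  # ASI real-time critical constant from the slide
RTOS_US = 170                      # time used by the RTOS
APP_US = 50                        # time used by the application
SEGMENTS_US = [30, 60, 40, 80, 10] # ASSUMPTION: durations read off the timeline


def share_percent(part_us, total_us=CRITICAL_US):
    """Return the share of `part_us` in `total_us`, rounded to whole percent."""
    return round(100 * part_us / total_us)


if __name__ == "__main__":
    print(share_percent(RTOS_US), share_percent(APP_US))
    print(sum(SEGMENTS_US) == CRITICAL_US)
```

Both shares add up to the full 220µs window, so at 33MHz without caches there is no slack left for additional work inside the real-time critical interval.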

  9. Embedded system performance analysis
     • Real-time critical constant is 220µs
     • Above a ratio of 1.0 the system is under-sized
     • Below a ratio of 1.0 the system is over-sized
     • Optimal working point (WP) is 33MHz
     • With I-2KB/D-1KB caches the performance gain at the optimal WP is 40%
     • With 8 times larger caches (I-16KB/D-8KB) the performance gain at the optimal WP is 350%
     [Plot: real-time execution time (used) divided by the real-time critical constant (220µs), ticks at 0.5, 1.0 and 1.5, versus clock frequency in MHz (25, 33, 40, 80); curves: without caches, with I-2KB/D-1KB, with I-16KB/D-8KB; optimal WP marked at 1.0]
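The sizing rule on this slide compares the used real-time execution time with the 220µs constant. The sketch below computes that ratio and classifies it. The clock scaling in `exec_time_us` is an assumption made only for illustration: it treats execution time as inversely proportional to clock frequency, anchored at 220µs for 33MHz without caches, and ignores memory wait states, so it does not reproduce the measured curves. All names are hypothetical.

```python
CRITICAL_US = 220.0   # real-time critical constant from the slide
WP_MHZ = 33.0         # optimal working point without caches


def load_ratio(exec_us, critical_us=CRITICAL_US):
    """Used real-time execution time divided by the real-time critical constant."""
    return exec_us / critical_us


def classify(ratio, tolerance=1e-9):
    """Above 1.0 the system is under-sized, below 1.0 it is over-sized."""
    if ratio > 1.0 + tolerance:
        return "under-sized"
    if ratio < 1.0 - tolerance:
        return "over-sized"
    return "optimal"


def exec_time_us(clock_mhz):
    """ASSUMPTION: ideal 1/f scaling from 220µs at 33MHz (no wait-state effects)."""
    return CRITICAL_US * WP_MHZ / clock_mhz


if __name__ == "__main__":
    for mhz in (25, 33, 40, 80):
        ratio = load_ratio(exec_time_us(mhz))
        print(f"{mhz} MHz: ratio {ratio:.2f} -> {classify(ratio)}")
```

Under this idealized scaling, clocks below 33MHz miss the 220µs deadline and faster clocks leave headroom, which is the sense in which 33MHz is the working point for the cache-less configuration.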

  10. Analysis results of different cache configurations
     [Two charts, PPC403GA 33MHz (WP) and PPC403GCX 33MHz (WP), each comparing four configurations: without I-Cache/without D-Cache (baseline, 100%), without I-Cache/with D-Cache, with I-Cache/without D-Cache, with I-Cache/with D-Cache. Values are listed in transcript order.]
     • Task switching time: baseline 100% (87µs); -1%, +10%, +152%, +46%, +340%, +60%
     • Interrupt reaction time: baseline 100% (27µs); +12%, -4%, +205%, +50%, +43%, +377%
     • Dhrystones: baseline 100% (6211); +11%, +10%, +187%, +207%, +529%, +455%

  11. Conclusion
     • The optimal working point is at 33MHz
     • At the optimal working point, 77% of the total execution time (220µs) is consumed by the RTOS
     • At the optimal working point, small caches improve execution performance by 40%; larger caches provide an average gain of 350%
     • Such gains can only be used for system services that are not real-time critical, e.g. network communication via the Internet
     • Cache sizes should be in the range of about 8-16KByte to provide a significant performance gain if the application runs under the control of an RTOS

  12. Demonstrator: Industrial shelf model
