
Paper Report

Paper Report. Multiprocessor System-on-Chip Profiling Architecture: Design and Implementation. Po-Hui Chen, Chung-Ta King, Yuan-Ying Chang, Shau-Yin Tseng. Institute of Information Systems and Applications, National Tsing Hua University, Hsinchu, Taiwan, ROC



  1. Paper Report: Multiprocessor System-on-Chip Profiling Architecture: Design and Implementation. Po-Hui Chen, Chung-Ta King, Yuan-Ying Chang, Shau-Yin Tseng. Institute of Information Systems and Applications, National Tsing Hua University, Hsinchu, Taiwan, ROC; Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan, ROC; SoC Technology Center, Industrial Technology Research Institute, Hsinchu, Taiwan, ROC. 2009 15th International Conference on Parallel and Distributed Systems. Presenter: Jyun-Yan Li

  2. Abstract • With the growing needs for advanced functionalities in modern embedded systems, it is now necessary to integrate multiple processors in the system, preferably on a single chip, to support the required computing complexity. The problem is that such multiprocessor system-on-chip (MPSoC) architecture is very complex and its internal behavior is very difficult to track. An effective tool for profiling the behavior of the MPSoC system is in great need. Such a tool is very useful during system design for exploiting various options and identifying potential bottlenecks. • In this paper, we introduce the MultiProcessor Profiling Architecture (MPPA) -- a general framework for profiling MPSoC embedded systems. The MPPA framework entails the use of FPGA emulation for the target system, the embedding of performance counters for recording system events, and the development of OS drivers for collecting the profiled data. To demonstrate its use, we show the implementation of an MPSoC emulation system based on Leon3 cores following the MPPA framework. We also show how the MPPA framework and the emulator help the designers to identify performance problems and improve their MPSoC embedded system design.

  3. Related work (diagram labels) • Debugging • Hardware implementation • Software implementation • Hardware & software debugging [7] • Linux device driver [3] • LEON3 & AMBA [11,13,17][10] • Low-level and system-level records with timestamps, sent to the host • Integration • MultiProcessor Profiling Architecture (MPPA) • Drive the MPPA • Count events & record [16,18,21] • Count events • This paper

  4. What is the Problem • Supporting a multiprocessor architecture • A processor core cannot access other cores without special support • Adding a profiling mechanism • Leads to large modifications of the original structure • Inserting registers into the architecture • It is not systematic

  5. Proposed Method • MultiProcessor Profiling Architecture (MPPA) • Event sensing: detects specific hardware events and notifies event collection • Event collection: accumulates event counts reported by event sensing
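The split between event sensing and event collection can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the event names, the `event_collector` struct, and the function names are all hypothetical and chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event IDs, used only for this illustration. */
enum mppa_event { EV_CACHE_MISS, EV_PIPELINE_STALL, EV_COUNT };

/* Event collection side: one 32-bit counter per event plus an enable flag. */
struct event_collector {
    uint32_t counters[EV_COUNT];
    int enabled;
};

/* Zero every counter and start accepting notifications. */
void collector_reset(struct event_collector *ec) {
    for (size_t i = 0; i < EV_COUNT; i++) {
        ec->counters[i] = 0;
    }
    ec->enabled = 1;
}

/* Event sensing side: called once per detected event; it only notifies
 * the collector, which accumulates the count while enabled. */
void sense_event(struct event_collector *ec, enum mppa_event ev) {
    if (ec->enabled && (unsigned)ev < EV_COUNT) {
        ec->counters[ev]++;
    }
}

/* Read back the accumulated count for one event (0 for an unknown ID). */
uint32_t collector_read(const struct event_collector *ec, enum mppa_event ev) {
    return (unsigned)ev < EV_COUNT ? ec->counters[ev] : 0;
}
```

In this model, calling `sense_event` twice with `EV_CACHE_MISS` after `collector_reset` leaves that counter at 2, and clearing `enabled` makes later notifications have no effect.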

  6. Mechanism

  7. Design flow

  8. Integrate MPPA with LEON3

  9. Hardware implementation (diagram labels) • Enable and disable the CVM and clear the EC's record • EC: records event occurrences • CVM: manipulates the counter value and monitors the input event signal

  10. Software implementation • Uses a device driver • Small overhead (fewer than 5 cache misses) • A set of Power Management Unit (PMU) library functions for user programs • pmu_init: opens the device node and sets up the memory mapping • pmu_clear: zeros all the counter values and enables performance event counting • pmu_msg: stops monitoring and reads the event statistics • pmu_end: closes the device node
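The call order of the four library functions can be sketched with a mock. The slides give only the function names and their roles, so the signatures, return values, and the in-memory `mock_regs` array below are assumptions; a real implementation would presumably open the driver's device node and memory-map the counter registers instead. `mock_hw_event` is a hypothetical helper that stands in for the hardware.

```c
#include <stddef.h>
#include <stdint.h>

#define PMU_NUM_COUNTERS 23 /* the deck reports twenty-three 32-bit counters */

/* Assumption: an in-memory array stands in for the memory-mapped counters. */
static uint32_t mock_regs[PMU_NUM_COUNTERS];
static volatile uint32_t *pmu_regs = NULL;
static int pmu_counting = 0;

/* pmu_init: "open the device node and map memory" (mocked here). */
int pmu_init(void) {
    pmu_regs = mock_regs;
    return 0;
}

/* pmu_clear: zero all counter values and enable event counting. */
int pmu_clear(void) {
    if (!pmu_regs) return -1;
    for (size_t i = 0; i < PMU_NUM_COUNTERS; i++) pmu_regs[i] = 0;
    pmu_counting = 1;
    return 0;
}

/* pmu_msg: stop monitoring and copy up to n counter values into out.
 * Returns the number of values copied, or -1 if the PMU is not open. */
int pmu_msg(uint32_t *out, size_t n) {
    if (!pmu_regs || !out) return -1;
    pmu_counting = 0;
    if (n > PMU_NUM_COUNTERS) n = PMU_NUM_COUNTERS;
    for (size_t i = 0; i < n; i++) out[i] = pmu_regs[i];
    return (int)n;
}

/* pmu_end: "close the device node" (mocked here). */
void pmu_end(void) {
    pmu_regs = NULL;
    pmu_counting = 0;
}

/* Hypothetical helper: simulates the hardware raising event number idx. */
void mock_hw_event(unsigned idx) {
    if (pmu_counting && idx < PMU_NUM_COUNTERS) mock_regs[idx]++;
}
```

A user program would bracket the code under measurement with `pmu_clear()` and `pmu_msg()`, and call `pmu_init()` and `pmu_end()` once at start-up and shutdown.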

  11. Experiment Result • Target platform: Xilinx ML501 FPGA emulation board running at about 80 MHz • Twenty-three 32-bit counters • Total gate count increases by 0.66% • Figure: Xilinx Virtex-5 FPGA synthesis result of the target platform with the MPPA architecture
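The reported figures allow a quick back-of-the-envelope check of counter capacity. The sketch below assumes a worst case in which an event fires on every clock cycle; that assumption, and the helper names, are not from the slides.

```c
#include <stdint.h>

/* Total storage held in the event counters, in bits. */
unsigned total_counter_bits(unsigned num_counters, unsigned bits_per_counter) {
    return num_counters * bits_per_counter;
}

/* Seconds until one counter wraps around, assuming it increments once per
 * clock cycle (worst case). Valid for counter_bits < 64. */
double seconds_to_overflow(unsigned counter_bits, double clock_hz) {
    double states = (double)(UINT64_C(1) << counter_bits);
    return states / clock_hz;
}
```

With 23 counters of 32 bits each, the counters hold 736 bits in total, and at 80 MHz a 32-bit counter that increments every cycle wraps after 2^32 / (80 × 10^6) ≈ 53.7 seconds, so longer runs would need periodic reads under this assumption.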

  12. Case study • Case 1

  13. Case study (cont.) • Case 2

  14. Conclusion • This paper presents MPPA, an efficient, compact, and less intrusive design for performance measurement without a dedicated bus. • Using MPPA can help designers find MPSoC bottlenecks

  15. My comment • This idea can be used to insert event sensors into a processor, or into bus masters and slaves, to detect events • Ex: cache miss, pipeline stall • The paper does not present how to deal with interrupts
