RAMP Stats and Monitoring

RAMP Stats and Monitoring. Derek Chiou, Bill Reinhart, Nikhil Patil with Krste Asanovic and Joel Emer. Goals/Requirements: provide functionality equivalent to software-based simulators at RAMP speeds; full observability; monitoring for events.

Presentation Transcript


  1. RAMP Stats and Monitoring. Derek Chiou, Bill Reinhart, Nikhil Patil with Krste Asanovic and Joel Emer

  2. Goals/Requirements • Provide functionality equivalent to software-based simulators at RAMP speeds • Full observability • Monitoring for events • Triggers for breakpoints, dumping state, etc. • Trace (lossy and lossless) • Aggregate statistics • Baseline functionality automatically included • Resource efficient • Flexible • Dynamic and static configurability • Integrated with other infrastructure (component interfaces)

  3. At Least Three Levels of Debug/Monitoring/Stats • Platform/Unmodel level • Bringing up the BEE3/ACP system independent of RAMP code • There may be strange bugs that get exercised by the RAMP usage model • Simulator (model) level • Simulator may model the target incorrectly • Monitor simulator bandwidth requirements • Could be very different from the target machine (e.g., cache of target cache) • Target level • The target machine may have been implemented as designed, but the design itself is incorrect • Stats/tracing of a working target • We focus on the simulator (model)/target levels, but hopefully some of this will be useful at the platform level as well

  4. Statistics/Monitoring Philosophy (Bill Reinhart, Nikhil A Patil) • Instrument simulator communication (e.g., RAMP channels) • Communication mechanisms are logically connected to the command network • Can export/examine/change anything being communicated • No need to add additional code if that is sufficient • Turn off to save resources when possible • Introduce additional communication to export state where communication does not already exist • Use standard simulator communication (channel) interfaces • Automatically provides target timing information • Connected to a null end-point that logically dumps the data • Like a pipe to /dev/null • Potentially have a non-timed interface, but it needs a time reference point

  5. Simple Example (figure: pipeline stages F, D, E, M, W; State; two compressors)

  6. Required Support • Endpoint support • Channel support • Transport (network) • Naming

  7. User vs Simulator Initiated • Precise user-initiated • Function call to read/write a value at a specific target time • Can be implemented through timed channels • Commands live in target time • Can be handled logically as a compressor • Discard data unless there is a command • How far ahead in target time should a pull command be issued? • Too close impacts performance but enables precise control • Too far makes reacting to an event difficult • Imprecise user-initiated • Issue a read of state, perform it whenever, report back the target time • Simulator-initiated • Dump everything, filter later • Can be slow if there is limited bandwidth, storage, or filtering
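The "handled logically as a compressor" idea for precise user-initiated reads can be sketched as a filter over timestamped samples. This is a minimal illustration, not the RAMP implementation: `Sample`, `command_filter`, and the use of a plain list as the timed channel are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    target_time: int  # target cycle at which the value was observed
    value: int


def command_filter(samples, read_times):
    """Discard every sample unless a read command exists for its target time."""
    pending = set(read_times)
    return [s for s in samples if s.target_time in pending]


# Hypothetical monitored signal producing one value per target cycle.
stream = [Sample(t, t * t) for t in range(10)]

# The user issued read commands for target cycles 3 and 7.
reported = command_filter(stream, read_times=[3, 7])
```

Because the commands carry target times, the filter needs them before the corresponding cycle is simulated, which is the "how far ahead" trade-off the slide raises.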

  8. Required Support: Endpoint • Provide state connected to the command network • Same interface as a register, drop-in replacement • Stats counters, monitor points, control points, etc. • Provide default compressors/filters • Output every n cycles • Output on rollover • Output toggled on signal • Etc.
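Two of the default filters listed above (output every n cycles, output on rollover) could behave roughly as follows. This is a software sketch under assumptions: the class name `StatsCounter`, the record layout, and the rule that a single increment is smaller than the counter range are all hypothetical.

```python
class StatsCounter:
    """Narrow counter that exports records periodically and on rollover."""

    def __init__(self, width_bits=8, every_n=None):
        self.limit = 1 << width_bits  # counter wraps at this value
        self.every_n = every_n        # export period in target cycles, or None
        self.value = 0
        self.exported = []            # (target_time, reason, value) records

    def tick(self, target_time, increment=0):
        # Assumes increment < limit, so at most one wrap per tick.
        self.value += increment
        if self.value >= self.limit:
            self.value -= self.limit
            self.exported.append((target_time, "rollover", self.value))
        if self.every_n and target_time % self.every_n == 0:
            self.exported.append((target_time, "periodic", self.value))


# A 2-bit counter incremented once per target cycle, exported every 3 cycles.
counter = StatsCounter(width_bits=2, every_n=3)
for t in range(1, 7):
    counter.tick(t, increment=1)
```

A host that counts rollover records can reconstruct the full count as `rollovers * limit + value`, which is why a narrow on-chip counter can be enough.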

  9. Required Support: Channel • Optional connection to the control network • Use internal buffering to look back in time • Channels are implemented as circular buffers in BRAM • Far more storage than needed (in general) • Can look back in time • Can save bandwidth by only exporting when needed (figure: circular buffer with tail and head pointers)
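The look-back behaviour of a channel stored in a circular buffer can be sketched in software as below. The class `ChannelBuffer` and its method names are assumptions for illustration; a real channel would live in BRAM and also track a tail pointer for the consumer.

```python
class ChannelBuffer:
    """Fixed-size circular buffer that retains the most recent messages."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.head = 0   # index of the next slot to write
        self.count = 0  # number of valid entries currently retained

    def push(self, target_time, data):
        self.slots[self.head] = (target_time, data)
        self.head = (self.head + 1) % len(self.slots)
        self.count = min(self.count + 1, len(self.slots))

    def look_back(self, n):
        """Return up to the last n entries, oldest first."""
        n = min(n, self.count)
        start = (self.head - n) % len(self.slots)
        return [self.slots[(start + i) % len(self.slots)] for i in range(n)]


buf = ChannelBuffer(capacity=4)
for t in range(6):
    buf.push(t, f"m{t}")

recent = buf.look_back(3)
```

Nothing is exported while the channel runs normally; history is read out only when a trigger asks for it, which is how the buffer saves bandwidth.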

  10. Required Support: Transport • Transport • To units: commands, configuration, state changes, etc. • From units: extract target/host state, statistics, etc. • Could be virtual channel(s) on a common physical network • Lossy network? • Lossless for now, support lossy at the endpoint • QoS? • A ring or a ring of rings for simplicity • An ordered network is simpler • Helps reconstruction of data outside • But could result in less efficiency

  11. Required Support: Naming/Tagging • Naming of source of data • Command • read P1.iCache.num_hits stats register translated to actual register • Returned data/Trace entry • Needs to be tagged to indicate data • Each stats entry also includes at least • Target time • Potentially platform/host time for platform/simulator-level debugging
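The naming and tagging requirements can be sketched as a lookup from a hierarchical name to a register address, with each returned entry tagged by its source name and target time. The address `0x0040`, the `REGISTER_MAP` table, and the record fields are hypothetical; only the name `P1.iCache.num_hits` comes from the slide.

```python
# Hypothetical mapping from hierarchical stat names to register addresses.
REGISTER_MAP = {"P1.iCache.num_hits": 0x0040}


def read_stat(name, registers, target_time, host_time=None):
    """Translate a stat name to its register and return a tagged entry."""
    address = REGISTER_MAP[name]
    entry = {
        "name": name,                # tag identifying the source of the data
        "target_time": target_time,  # always included
        "value": registers[address],
    }
    if host_time is not None:
        # Optional, for platform/simulator-level debugging.
        entry["host_time"] = host_time
    return entry


registers = {0x0040: 1234}  # stand-in for the hardware register file
entry = read_stat("P1.iCache.num_hits", registers, target_time=5000)
```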

  12. FPGA Debug. Hari Angepat, Chris Craik and Derek Chiou, Electrical and Computer Engineering, University of Texas at Austin

  13. Introduction • FPGA simulators offer order-of-magnitude speedups • However, they can suffer from traditional hardware issues of limited visibility and debugging challenges • RAMP simulators face additional complexity due to scalability requirements that may prevent instrumenting every signal in the simulator 1 FPGADBG

  14. Challenge • How to bring software-level debugging visibility to RAMP simulators without dramatically increasing resources or affecting timing closure

  15. Challenge • How to bring software-level debugging visibility to RAMP simulators without dramatically increasing resources or affecting timing closure • Revisit the idea of FPGA state readback in combination with gdb-style debug interfaces

  16. Our Technique • 1) Leverage the FPGA readback mechanism to exploit as much free visibility as possible • FPGA frame readback exists in V2Pro, V4, V5 • Can sample flip-flop state dynamically • Can sample BRAM/LUT (notes on this later) • Can use JTAG hardware as a latency-tolerant, low-resource physical link

  17. Our Technique • 2) Provide a GDB interface that can debug both a software process and an FPGA fabric simultaneously • Can display FPGA netlist symbols alongside software symbols • Can allow for hybrid CPU/FPGA platform debugging (i.e., X86-FSB-FPGA)

  18. FPGADBG Toolflow Software Sources (C/C++/…) Hardware Sources (Verilog/VHDL/…) Compiler Hierarchy Name Preservation Constraints Debug Flags (-g -Ox) Synthesis FPGA Implementation Symbol Table ASCII Disassembly Binary Executable Logic Allocation Map PAR Netlist FPGA Bitstream Dummy! FPGADBG – Interactive extension that enables non-intrusive debugging of software running on FPGA (GDB-Py) Software Debugger (GDB)

  19. Architecture • Designed as a set of C/Python libraries • GDB interface (plugin) • Netlist frontend (parsing, mapping) • FPGA backend (board comm, readback) • Hardware library (step control, ICAP readback) • GDB frontend allows connecting to software-based portions of a simulator • Assumes design-level support for step • Allows the design to ensure consistent state before sampling

  20. Architecture Target Application User Logic Target OS Target Virtual Machine GDB GDB Plugin Bindings (Python) Domain Step Control Readback Engine (ICAP) FPGADBG Core (Python) FPGA Chip Comm (C) FPGA Readback (C) Netlist Parser (Python) IO Logic (Transport Layer) FPGA Fabric HW/SW Simulation Platform

  21. Netlist Parsing
      Top myREG regOut dout
      Bit  6597758 0x005e0200   5758 Block=SLICE_X88Y18 Latch=XQ Net=dout(3)
      Bit  6597838 0x005e0200   5838 Block=SLICE_X88Y16 Latch=XQ Net=dout(1)
      Bit  6604350 0x005e0400   5758 Block=SLICE_X88Y18 Latch=YQ Net=dout(2)
      Bit  6604430 0x005e0400   5838 Block=SLICE_X88Y16 Latch=YQ Net=dout(0)
      inst "regOut(1)" "SLICE",placed R72C45 SLICE_X88Y16  ,cfg " BXINV::BX BXOUTUSED::#OFF BYINV::BY BYINVOUTUSED::#OFF BYOUTUSED::#OFF ... DXMUX::0 DYMUX::0 F::#OFF F5USED::#OFF FFX:myREG/dout_1:#FF FFX_INIT_ATTR::INIT0 FFX_SR_ATTR::SRLOW FFY:myREG/dout_0:#FF FFY_INIT_ATTR::INIT0 FFY_SR_ATTR::SRLOW    ... ";
      inst "regOut(3)" "SLICE",placed R71C45 SLICE_X88Y18  ,cfg " BXINV::BX BXOUTUSED::#OFF BYINV::BY BYINVOUTUSED::#OFF BYOUTUSED::#OFF ... DXMUX::0 DYMUX::0 F::#OFF F5USED::#OFF FFX:myREG/dout_3:#FF FFX_INIT_ATTR::INIT0 FFX_SR_ATTR::SRLOW FFY:myREG/dout_2:#FF FFY_INIT_ATTR::INIT0 FFY_SR_ATTR::SRLOW ... ";
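The `Bit ...` entries shown on this slide can be parsed with a short regular expression. This is a sketch, not the FPGADBG parser: the function name is hypothetical, and the interpretation of the three numeric columns as bit offset, frame address, and frame offset is an assumption based on their appearance.

```python
import re

# Assumed column meaning: bit offset, frame address, offset within the frame.
BIT_LINE = re.compile(
    r"Bit\s+(?P<bit_offset>\d+)\s+(?P<frame_address>0x[0-9a-fA-F]+)\s+"
    r"(?P<frame_offset>\d+)\s+Block=(?P<block>\S+)\s+Latch=(?P<latch>\S+)\s+"
    r"Net=(?P<net>\S+)"
)


def parse_bit_line(line):
    """Return the fields of one 'Bit' entry as a dict, or None if no match."""
    m = BIT_LINE.search(line)
    if m is None:
        return None
    return {
        "bit_offset": int(m["bit_offset"]),
        "frame_address": int(m["frame_address"], 16),
        "frame_offset": int(m["frame_offset"]),
        "block": m["block"],
        "latch": m["latch"],
        "net": m["net"],
    }


lines = [
    "Bit  6597758 0x005e0200   5758 Block=SLICE_X88Y18 Latch=XQ Net=dout(3)",
    "Bit  6597838 0x005e0200   5838 Block=SLICE_X88Y16 Latch=XQ Net=dout(1)",
    "Bit  6604350 0x005e0400   5758 Block=SLICE_X88Y18 Latch=YQ Net=dout(2)",
    "Bit  6604430 0x005e0400   5838 Block=SLICE_X88Y16 Latch=YQ Net=dout(0)",
]
entries = [parse_bit_line(line) for line in lines]
```

Each parsed entry ties a net name to a position in the readback data, which is the information needed to locate a flip-flop's value after a frame readback.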

  22. Netlist Parsing • FPGA toolflow introduces optimizations and naming issues Physical Netlist Alias Detection Vector Merger Hierarchy Construction Frame Address Mapping Symbolic Netlist FPGA Cmd Generator Readback Cmd Parser Bitstream Reorder FPGA Board Communication Readback Bitstream
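The "Vector Merger" stage named in this flow presumably recombines per-bit nets such as `dout(3)` ... `dout(0)` into one multi-bit value. A minimal sketch of that idea follows; the function name, the input format (net name mapped to a sampled bit), and the handling of scalar nets are assumptions.

```python
import re
from collections import defaultdict

# Matches indexed net names such as "dout(3)" -> base "dout", index 3.
INDEXED_NET = re.compile(r"^(.*)\((\d+)\)$")


def merge_vectors(bit_values):
    """Combine sampled bits of indexed nets into one integer per base name."""
    vectors = defaultdict(int)
    for net, bit in bit_values.items():
        m = INDEXED_NET.match(net)
        if m:
            vectors[m.group(1)] |= (bit & 1) << int(m.group(2))
        else:
            vectors[net] = bit & 1  # scalar net, kept as a single bit
    return dict(vectors)


# Hypothetical sampled flip-flop values after a readback.
sampled = {"dout(3)": 1, "dout(2)": 0, "dout(1)": 1, "dout(0)": 1, "valid": 1}
merged = merge_vectors(sampled)
```

With the sample above, `dout` is reassembled from bits 3 down to 0 as binary 1011.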

  23. Limitations • Hardware readback has limitations: • RAMs require offline readback due to resource contention issues • FPGA frames span large vertical stripes, potentially restricting visibility if some logic cannot be disabled during sampling • Hierarchy must be preserved during synthesis to ensure understandable net names • Step control requires design-level support

  24. Status & Future Work • Current prototype implements board communication with the XUP Virtex2Pro30 with JTAG-based frame readback • Frontend netlist parser supports hierarchical node generation, bit vector merging and some support for aliased signals • Full GDB shell expected to be released in Q1-2009 with support for Virtex5{110/330}