
HAsim Status Update

HAsim Status Update. VSSAD, Intel; CSG Group, CSAIL, MIT; UT Austin; Princeton University. Joel Emer, Michael Adler, Angshuman Parashar, Michael Pellauer, Murali Vijayaraghavan, Nikhil Patil, Abhishek Bhattacharjee.


Presentation Transcript


  1. HAsim Status Update
     VSSAD, Intel; CSG Group, CSAIL, MIT; UT Austin; Princeton University
     Joel Emer, Michael Adler, Angshuman Parashar, Michael Pellauer, Murali Vijayaraghavan, Nikhil Patil, Abhishek Bhattacharjee

  2. Recap: Virtual Platform
     • Set of Abstractions
     • Provide common set of functionalities across multiple physical platforms
       • XUP Board
       • PCI-express Board
       • Intel FSB Socket
       • Bluesim/Vsim
       • BEE3
     • Leverage Asim Plug N Play
     • Minimize module replacements/recoding while moving across platforms

  3. Virtual Platform Infrastructure
     [Diagram labels: FPGA Modules; Software Modules; Fetch, Decode, Exe, Memory, Front Panel, Control, FuncModel, Decode; Virtual Platform: Platform Interface, Front Panel, Memory; RRR Layers and Communication Layers on both the Hardware and the Software side.]

  4. RRR Specification Language

     // ----------------------------------------
     // create a new service called ISA_EMULATOR
     // ----------------------------------------
     service ISA_EMULATOR
     {
         // --------------------------------
         // declare services provided by CPU
         // --------------------------------
         server CPU <- FPGA;
         {
             method UpdateRegister(in REG_INDEX i, in REG_VALUE v);
             method Emulate(in INST_INFO i, out INST_ADDR a);
         };

         // ---------------------------------
         // declare services provided by FPGA
         // ---------------------------------
         server FPGA <- CPU;
         {
             method SyncRegister(in REG_INDEX i, in REG_VALUE v);
         };
     };

  5. Remote Request/Response
     [Diagram labels: RRR specification files; User Code and Client Stub on the FPGA; Server Stub and User Code on the CPU; Communication Layers (Runtime System) between FPGA and CPU.]

     User Code on the FPGA (calls the Client Stub):
         ClientStub_ISA_EMULATOR cpu;
         ...
         cpu.UpdateRegister_MakeRequest(REG_R27, regFile[REG_R27]);
         ...
         cpu.Emulate_MakeRequest(inst);
         ...
         targetPC <- cpu.Emulate_GetResponse();

     User Code on the CPU (called by the Server Stub):
         ISA_EMULATOR::UpdateRegister(REG_INDEX i, REG_VALUE v)
         {
             regFile[i] = v;
         }

         ISA_EMULATOR::Emulate(INST_INFO inst)
         {
             // emulate the instruction
             return target_PC;
         }
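The request/response split on this slide can be mimicked in plain software. The following is a minimal sketch, not HAsim code: the class names, the in-process "communication layer" (a queue), and the fall-through emulation rule are all assumptions made for illustration.

```python
# Illustrative model of the RRR pattern: a client stub forwards requests to a
# server and queues responses. All names here are hypothetical, not HAsim APIs.
from collections import deque


class IsaEmulatorServer:
    """Plays the role of the CPU-side user code behind the server stub."""

    def __init__(self):
        self.reg_file = {}

    def update_register(self, index, value):
        # One-way method: no response is produced.
        self.reg_file[index] = value

    def emulate(self, inst):
        # Stand-in for real emulation: assume every instruction falls
        # through to the next 4-byte slot.
        return inst["pc"] + 4


class ClientStub:
    """Plays the role of the FPGA-side client stub."""

    def __init__(self, server):
        self.server = server
        self.responses = deque()  # stands in for the communication layers

    def update_register_make_request(self, index, value):
        self.server.update_register(index, value)

    def emulate_make_request(self, inst):
        # The response is queued; the caller picks it up separately.
        self.responses.append(self.server.emulate(inst))

    def emulate_get_response(self):
        return self.responses.popleft()


server = IsaEmulatorServer()
cpu = ClientStub(server)
cpu.update_register_make_request(27, 0xBEEF)
cpu.emulate_make_request({"pc": 0x1000})
target_pc = cpu.emulate_get_response()
```

Splitting `emulate` into a make-request and a get-response call mirrors the slide's `Emulate_MakeRequest` / `Emulate_GetResponse` pair, which lets the requester keep working while the response is in flight.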

  6. Virtual Platform/RRR Status Update
     • Software + Hardware, Client + Server Stubs
     • Multiple Arguments for method calls
     • Auto-generation of Soft Connections through Platform Interface, and Remote Stubs
     • PCI-Express Physical Platform
     • Physical Channel implementation using CSRs
     • Soft Reset
     • Several services in HAsim
     • Very positive feedback from developers

  7. HAsim: MIPS → Alpha
     • Motivation
       • Couldn’t find any Full System MIPS simulator with multi-processor + large memory support
     • HAsim-Alpha
       • M5 “running” in software
         • Target Memory Image
         • Syscall Emulation
         • Other instructions not implemented on FPGA (e.g. FP currently)
       • Functional + Timing model on FPGA

  8. HAsim-Alpha Highlights
     • Implemented Alpha Functional Model
       • Primary changes
         • ISA spec
         • Instruction Format + Queries
         • Datapath
         • Execution Semantics
       • Unchanged
         • Dependency logic
         • Register File
         • Memory Subsystem (incl. Store Buffer)
     • Multiple timing models
       • Unpipelined
       • 5 Stage
       • In order with caches
       • OoO
     • Running long Alpha programs (e.g. SPEC2k)

  9. Old Instruction Emulation with Cache Flush
     [Timeline diagram (time axis) with lanes: FPGA, Functional Cache, RRR Layer, Software (Memory Server, Emulation Server, Instruction Simulator). Labeled events include: Execute, Flush, Write Line, Done, Sync Registers, Emulate Instruction, Emulation Done, Sync Registers, Execute.]

  10. Hybrid Instruction Emulation
      [Timeline diagram (time axis) with lanes: FPGA, Functional Cache, RRR Layer, Software (Memory Server, Emulation Server, Instruction Simulator). Labeled events include: Execute, Sync Registers, Emulate Instruction, Write Back or Invalidate, Done, Write Line, Ack, Emulation Done, Sync Registers, Execute.]

  11. RRR ISA Emulation Specification

      service ISA_EMULATOR
      {
          server sw (cpp, method) <- hw (bsv, connection)
          {
              method sync(in RNAME[RNAME_BITS] rname, in RVAL[RVAL_BITS] rval);
              method emulate(in INST[INST_BITS] inst,
                             in ISA_ADDRESS[FUNCP_ISA_V_ADDR_SIZE] pc,
                             out ISA_ADDRESS[FUNCP_ISA_V_ADDR_SIZE] newPc);
          };

          server hw (bsv, connection) <- sw (cpp, method)
          {
              method sync(in RNAME[RNAME_BITS] rname, in RVAL[RVAL_BITS] rval);
          };
      };

  12. Dynamic Simulator Configuration
      [Timeline diagram (time axis) with lanes: FPGA (four Param Nodes, DynamicParam Controller), RRR Layer, Software. Labeled items include: Enable Functional Cache?, Set Value, Done, Set Parameters, Done?]

  13. RRR Dynamic Parameter Specification

      service PARAMS
      {
          //
          // Send one dynamic parameter ID and value to the hardware.
          // An ACK is returned to guarantee that the parameter has
          // been received.
          //
          server hw (bsv, connection) <- sw (cpp, method)
          {
              method sendParam(in UINT32[32] pname, in UINT64[64] pval, out UINT8[8] ack);
          };
      };
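As a rough illustration of the acknowledged `sendParam` exchange, the sketch below sends each parameter and checks an ACK before moving on. The byte layout (little-endian, unpadded), the ACK value of 1, and every name in the code are assumptions for this example; the slide does not specify the wire format.

```python
# Sketch of sendParam(UINT32 pname, UINT64 pval) -> UINT8 ack.
# The encoding and all names below are assumed, not taken from HAsim.
import struct

ACK_OK = 1  # assumed ACK value


def encode_send_param(pname, pval):
    # 32-bit parameter ID followed by a 64-bit value: 12 bytes in total.
    return struct.pack("<IQ", pname, pval)


class ParamNodeModel:
    """Stands in for the hardware side that stores dynamic parameters."""

    def __init__(self):
        self.params = {}

    def handle(self, message):
        pname, pval = struct.unpack("<IQ", message)
        self.params[pname] = pval
        return struct.pack("<B", ACK_OK)  # 8-bit ACK


def set_parameters(hw, settings):
    # Send parameters one at a time; proceed only once each is acknowledged.
    for pname, pval in settings.items():
        (ack,) = struct.unpack("<B", hw.handle(encode_send_param(pname, pval)))
        if ack != ACK_OK:
            raise RuntimeError(f"parameter {pname} was not acknowledged")


hw = ParamNodeModel()
set_parameters(hw, {3: 1, 7: 2**40})
```

Waiting for the ACK before sending the next parameter is what lets software know that every parameter has been received before it continues.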

  14. Other Uses of RRR
      • Stats
      • Events
      • Assertions
      • Control Messages
      • Streams

  15. Modeling Back-Pressure using A-Ports

      Producer Interface:
          Bool canSend()             Do we have enough credits?
          Action enq(Maybe#(t) x)    Send data or invalid.
          Action pass()              Indicate end of cycle

      Producer usage:
          if (canSend) enq(x) else pass()

      Consumer Interface:
          Bool canReceive()          Is data available?
          AV#(Data) pop()            Receive data
          Action done(cred)          Indicate end of cycle, and send back credits

      Consumer usage:
          if (canReceive) x <- pop()
          done(x)

      [Diagrams: a single A-Port from Producer to Consumer; a Credit Port built from a Data A-Port (Producer to Consumer) and a Credits A-Port (Consumer to Producer).]

      No buffering present within the Ports
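The credit handshake above can be sketched as a small software model. This is a simplification under stated assumptions: the port latencies (A0/A1) are ignored, so data and credits become visible within the same model cycle, and the consumer's policy of draining one entry on even cycles is invented purely to create back-pressure. The method names follow the slide; everything else is hypothetical.

```python
# Minimal credit-port model. Port latency is ignored for brevity and the
# consumer's drain policy is an assumption made for this example.
class CreditPort:
    def __init__(self, initial_credits):
        self.credits = initial_credits  # credits visible to the producer
        self.in_flight = None           # the port itself holds no buffer

    # Producer side
    def can_send(self):
        return self.credits > 0

    def enq(self, x):
        assert self.can_send()
        self.credits -= 1
        self.in_flight = x

    def pass_(self):                    # "pass" is a Python keyword
        self.in_flight = None

    # Consumer side
    def can_receive(self):
        return self.in_flight is not None

    def pop(self):
        x, self.in_flight = self.in_flight, None
        return x

    def done(self, cred):
        self.credits += cred


def simulate(num_items, capacity, cycles):
    port = CreditPort(initial_credits=capacity)
    buffer, received = [], []           # buffering lives on the consumer side
    next_item = stalls = max_occupancy = 0
    for cycle in range(cycles):
        # Producer: send when a credit is available, otherwise pass.
        if next_item < num_items and port.can_send():
            port.enq(next_item)
            next_item += 1
        else:
            if next_item < num_items:
                stalls += 1
            port.pass_()
        # Consumer: accept data, then drain one entry on even cycles and
        # return one credit for each drained entry.
        if port.can_receive():
            buffer.append(port.pop())
        max_occupancy = max(max_occupancy, len(buffer))
        freed = 0
        if cycle % 2 == 0 and buffer:
            received.append(buffer.pop(0))
            freed = 1
        port.done(freed)
    return received, max_occupancy, stalls


received, max_occupancy, stalls = simulate(num_items=6, capacity=2, cycles=12)
```

Because the producer starts with exactly as many credits as the consumer has buffer slots, the consumer-side buffer can never overflow, and the producer stalls when credits run out.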

  16. Structures using Credit Ports
      • Model FIFOs using Credit Ports
      • “Stall ports”: Data (A1) from Producer to Consumer, Credits (A1) back. A stall down the pipeline doesn’t get combinationally propagated
      • “Pipeline ports”: Data (A1) from Producer to Consumer, Credits (A0) back. The pipeline registers in traditional pipelines

  17. Caches
      • Functional Partition
        • Functional Cache
          • Target memory image data from M5
        • Functional TLB
          • Target V → P translations
      • Timing Partition
        • I and D Cache models
        • Attempting to unify interface for all caches

  18. Timing Partition Cache Interface
      [Diagram: MEMORY stage sends a Request to the L1 Cache and gets back an Immediate Response and a Delayed Response; the L1 Cache is backed by MAIN MEMORY.]
      • Cache Req Interface:
        • LOAD
        • STORE
        • PREFETCH
        • INVALIDATE LINE
        • INVALIDATE ALL
        • KILL ALL
        • FLUSH LINE
        • FLUSH ALL
      • Cache Response:
        • Immediate Response:
          • HIT
          • MISS SERVICING
          • MISS RETRY
        • Delayed Response:
          • MISS RESPONSE
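One way to read the immediate/delayed response split is sketched below. The policy is an assumption, not the HAsim model: a load hits if the line is present, otherwise it is accepted (MISS SERVICING) while a miss slot is free and refused (MISS RETRY) when none is. The limit on outstanding misses, the fixed miss latency, and all names are hypothetical.

```python
# Toy L1 timing model: immediate responses come back from load(); delayed
# MISS RESPONSEs come back from tick(). Policy and names are assumptions.
from enum import Enum


class Immediate(Enum):
    HIT = "HIT"
    MISS_SERVICING = "MISS SERVICING"
    MISS_RETRY = "MISS RETRY"


class L1CacheModel:
    def __init__(self, max_outstanding=1, miss_latency=3):
        self.max_outstanding = max_outstanding
        self.miss_latency = miss_latency
        self.lines = set()   # line addresses currently present
        self.pending = {}    # line address -> cycle at which the fill arrives

    def load(self, addr, now):
        if addr in self.lines:
            return Immediate.HIT
        if addr in self.pending or len(self.pending) >= self.max_outstanding:
            return Immediate.MISS_RETRY      # requester must ask again later
        self.pending[addr] = now + self.miss_latency
        return Immediate.MISS_SERVICING      # a delayed response will follow

    def tick(self, now):
        """Return the line addresses whose MISS RESPONSE is due by `now`."""
        ready = [a for a, due in self.pending.items() if due <= now]
        for a in ready:
            del self.pending[a]
            self.lines.add(a)
        return ready

    def invalidate_line(self, addr):
        self.lines.discard(addr)


cache = L1CacheModel()
first = cache.load(0x40, now=0)    # accepted miss
second = cache.load(0x80, now=1)   # no free miss slot
```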

  19. Ongoing/Future Work
      • Virtual Platform Infrastructure
        • More Sophisticated Type System
        • Virtual Memory for FPGA
          • Share page tables with software application
          • Cache V → P translations in a TLB
          • FPGA requests user software for translations
          • Software kernel must shootdown FPGA TLB when mapping changes
          • Note: distinct from HAsim Functional TLB
      • Functional Model
        • Multiple Contexts
        • Ultimate goal: Run a full system
      • Timing Model
        • Multiple Contexts
        • Realistic Microarchitecture
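The proposed FPGA virtual memory scheme (cache translations in a TLB, ask software on a miss, shoot down on a mapping change) can be sketched as follows. This models future work that the slide only outlines, so the page size, class names, and the remap-triggers-shootdown rule are all assumptions.

```python
# Sketch of a TLB that fetches V -> P translations from software and is shot
# down when a mapping changes. All names and the 4 KiB page size are assumed.
PAGE_BITS = 12


class SoftwarePageTable:
    """Stands in for the software side that owns the page tables."""

    def __init__(self):
        self.table = {}  # virtual page number -> physical page number
        self.tlbs = []   # TLBs to notify when a mapping changes

    def map(self, vpn, ppn):
        changed = vpn in self.table and self.table[vpn] != ppn
        self.table[vpn] = ppn
        if changed:
            for tlb in self.tlbs:
                tlb.shootdown(vpn)

    def translate(self, vpn):
        return self.table[vpn]


class FpgaTlbModel:
    """Stands in for the FPGA-side TLB."""

    def __init__(self, page_table):
        self.page_table = page_table
        self.entries = {}
        self.misses = 0
        page_table.tlbs.append(self)

    def lookup(self, vaddr):
        vpn = vaddr >> PAGE_BITS
        offset = vaddr & ((1 << PAGE_BITS) - 1)
        if vpn not in self.entries:
            # Miss: request the translation from software and cache it.
            self.misses += 1
            self.entries[vpn] = self.page_table.translate(vpn)
        return (self.entries[vpn] << PAGE_BITS) | offset

    def shootdown(self, vpn):
        self.entries.pop(vpn, None)


page_table = SoftwarePageTable()
page_table.map(0x10, 0x200)
tlb = FpgaTlbModel(page_table)
paddr = tlb.lookup(0x10ABC)
```

Without the shootdown, the TLB would keep returning the old physical page after software remaps the virtual page.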

  20. Backup

  21. “Connection”-style Stubs

      User Code (hand-written):
          typedef struct {...} REG_INFO deriving (Bits, Eq);
          Connection_Send#(REG_INFO) link <- mkConnection_Send("ISA_EMULATOR_UpdateRegister");
          link.send(reg_info);

      Soft connections
          Open questions: Connections: Per-method or Per-service? How does Platform Interface get the RRR types?

      Platform Interface (auto-generated):
          Connection_Receive#(REG_INFO) link <- mkConnection_Receive("ISA_EMULATOR_UpdateRegister");
          ClientStub_ISA_EMULATOR <- mkClient...
          let a = link.receive();
          stub.makeRequest_UpdateRegister(a);

      Stub (auto-generated):
          interface ClientStub_ISA_EMULATOR;
              method Action makeRequest_UpdateRegister(REG_INFO reg_info);
          endinterface

      RRR Stack

  22. User Code (hand-written):
          typedef struct {...} REG_INFO deriving (Bits, Eq);
          `include "remote_client_stub_ISA_EMULATOR.bsh"
          ClientStub_ISA_EMULATOR stub <- mkClientStub_ISA_EM...
          stub.makeRequest_UpdateRegister(reg_info);

      Remote Stub (auto-generated):
          Connection_Send#(Bit#(70)) link <- mkConnection_Send("ISA_EMULATOR_UpdateRegister");
          method Action makeRequest_UpdateRegister(REG_INFO reg_info);
              link.send(pack(reg_info));
          endmethod

      Soft connections

      Platform Interface (auto-generated):
          Connection_Receive#(Bit#(70)) link <- mkConnection_Receive("ISA_EMULATOR_UpdateRegister");
          ClientStub_ISA_EMULATOR stub <- mkClientStub_ISA_EM...
          let a = link.receive();
          stub.makeRequest_UpdateRegister(a);

      Stub (auto-generated):
          interface ClientStub_ISA_EMULATOR;
              method Action makeRequest_UpdateRegister(Bit#(70) reg_info);
          endinterface

      RRR Stack
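The remote stub sends `pack(reg_info)` over an untyped `Bit#(70)` connection, so the Platform Interface never needs the RRR types. The sketch below shows that idea with integers. The slide elides the fields of `REG_INFO`; the split into a 6-bit register index and a 64-bit value is purely an assumption chosen to add up to 70 bits, with the first field placed in the most significant bits as Bluespec's derived `Bits` instances do.

```python
# Flatten a two-field record into a 70-bit integer and back.
# The field widths are hypothetical; the slide does not list REG_INFO's fields.
INDEX_BITS = 6   # assumed width of the register index
VALUE_BITS = 64  # assumed width of the register value


def pack_reg_info(index, value):
    if not 0 <= index < (1 << INDEX_BITS):
        raise ValueError("index does not fit in INDEX_BITS")
    if not 0 <= value < (1 << VALUE_BITS):
        raise ValueError("value does not fit in VALUE_BITS")
    # First field goes in the most significant bits.
    return (index << VALUE_BITS) | value


def unpack_reg_info(bits):
    return bits >> VALUE_BITS, bits & ((1 << VALUE_BITS) - 1)


packed = pack_reg_info(27, 0xDEADBEEF)
index, value = unpack_reg_info(packed)
```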

  23. Hello, World!

      hello.bsv:
          module mkSystem#(LowLevelPlatformInterface llpi)();
              Streams streams <- mkStreams(llpi);
              Reg#(Bool) done <- mkReg(False);

              rule hello (!done);
                  streams.makeRequest(`STREAMS_MESSAGE_HELLO);
                  done <= True;
              endrule
          endmodule

      hello.dict:
          def STREAMS.MESSAGE.HELLO "Hello, World!\n";

  24. RRR Memory Interface Specification

      service FUNCP_MEMORY
      {
          server sw (cpp, method) <- hw (bsv, connection)
          {
              method Load(in MEM_ADDRESS_RRR[64] addr, out MEM_VALUE[FUNCP_ISA_INT_REG_SIZE] data);
              method LoadCacheLine(in MEM_ADDRESS_RRR[64] addr, out MEM_CACHELINE[FUNCP_CACHELINE_BITS] data);
              method Store(in MEM_STORE_INFO_RRR[MEMORY_STORE_INFO_SIZE] info);
              method StoreCacheLine(in MEM_STORE_CACHELINE_INFO_RRR[MEMORY_STORE_CACHELINE_INFO_SIZE] info);

              // Store cache line with ACK
              method StoreCacheLine_Sync(in MEM_STORE_CACHELINE_INFO_RRR[MEMORY_STORE_CACHELINE_INFO_SIZE] info, out UINT32[32] ack);

              method VtoP(in MEM_VALUE[FUNCP_ISA_INT_REG_SIZE] va, out MEM_ADDRESS_RRR[64] pa);
          };

          server hw (bsv, connection) <- sw (cpp, method)
          {
              method Invalidate(in MEM_INVAL_CACHELINE_INFO_RRR[96] info, out UINT32[32] ack);
              method InvalidateAll(in UINT32[32] req, out UINT32[32] ack);
          };
      };

  25. Timing Partition Cache Interface
      [Diagram: MEMORY stage sends a Request to the L1 Cache and gets back an Immediate Response and a Delayed Response; the L1 Cache is backed by MAIN MEMORY.]
      • Cache Req Interface:
        • LOAD
        • STORE
        • PREFETCH
        • INVALIDATE LINE
        • INVALIDATE ALL
        • KILL ALL
        • FLUSH LINE
        • FLUSH ALL
      • Cache Response:
        • Immediate Response:
          • HIT
          • HIT SERVICING
          • MISS SERVICING
          • MISS RETRY
        • Delayed Response:
          • MISS RESPONSE
          • HIT RESPONSE

  26. Credit Ports

      Producer Interface:
          Bool canSend()             Do we have enough credits?
          Action enq(Maybe#(t) x)    Send data or invalid.
          Action pass()              Indicate end of cycle

      Producer usage:
          if (canSend) enq(x) else pass()

      Consumer Interface:
          Bool canReceive()          Is data available?
          AV#(Data) pop()            Receive data
          Action done(cred)          Indicate end of cycle, and send back credits

      Consumer usage:
          if (canReceive) x <- pop()
          done(x)

      [Diagrams: Producer and Consumer linked by Data and Credits; the same pair with a Data A-Port and a Credits A-Port.]

      No buffering present in the Ports

  27. Structures using Credit Ports
      • Since buffering is not modeled in credit ports using FIFOs, any sort of buffer can sit on the consumer side
      • Reduced the code size of timing models drastically
      [Diagram: Producer sends Data to a Consumer that holds a Completion Buffer; Credits flow back to the Producer.]
