Overview of Back-end Design for CComp: Assembly Language and Low-level Languages

Overview of Back-end for CComp
Zhaopeng Li
Software Security Lab.
June 8, 2009

Outline
• Design Points
• Assembly Language: “x86”
• Low-level Intermediate Language
• Future Work

Design Points
• Assembly Language
  • Target: SCAP with an x86 abstract machine;
  • The program logic may change in the next version;
  • Or another machine may be used.
• Low-level Intermediate Language
  • Hides some machine-specific details;
  • Note that this level can serve purely as a helper for generating code and proofs.

Assembly Language : “x86”

Some Topics about “x86”
• Data Representation
  • 32-bit vs. “fake” 32-bit
  • We do not care how the data is stored as bits.
  • Integer: 4 bytes
  • Pointer: 4 bytes
• Data Alignment
• Callee-saved Registers
  • EBX, ESI, EDI, EBP

Some Topics about “x86” (cont.)
• Calling convention:
  • Parameters are passed on the stack, pushed from right to left; or the first three are passed in registers EAX, ECX and EDX, and the rest are passed on the stack;
  • Registers EAX, ECX and EDX may be used freely by the callee; any other register the callee uses must be saved on the stack and popped before the function returns;
  • The return value is stored in register EAX;
  • The caller cleans up the stack (the parameters).

Some Topics about “x86” (cont.)
Prolog (typical):
    _function:
        push ebp        ; store the old base pointer
        mov esp, ebp    ; make the base pointer point to the current stack location
        sub x, esp      ; x is the size, in bytes
    or, equivalently:
        enter x, 0
Epilog (typical):
        mov ebp, esp    ; reset the stack to "clean" away the local variables
        pop ebp         ; restore the original base pointer
        ret             ; return from the function
    or, equivalently:
        leave
        ret
[Figure: the stack at function entry, after stack frame setup, and after the return, showing the parameters, old eip, old ebp and local variables relative to esp and ebp.]
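The frame produced by the typical prolog can be summarized as ebp-relative offsets. The sketch below is only an illustration: the helper names (`param_offset`, `local_offset`, `frame_layout`) are hypothetical, and it assumes the all-on-stack convention with right-to-left pushing and 4-byte slots.

```python
WORD = 4  # bytes per stack slot on 32-bit x86

def param_offset(i):
    """Offset from ebp of the i-th parameter (0-based, leftmost first)."""
    # Skip the saved ebp at [ebp] and the return address (old eip) at [ebp+4].
    return 2 * WORD + i * WORD

def local_offset(j):
    """Offset from ebp of the j-th 4-byte local slot (0-based)."""
    return -(j + 1) * WORD

def frame_layout(n_params, x):
    """Map ebp-relative offsets to slot descriptions, for x bytes of locals."""
    layout = {0: "old ebp", WORD: "old eip"}
    for i in range(n_params):
        layout[param_offset(i)] = f"parameter {i}"
    for j in range(x // WORD):
        layout[local_offset(j)] = f"local {j}"
    # Highest address first, matching the usual top-down stack picture.
    return dict(sorted(layout.items(), reverse=True))

layout = frame_layout(n_params=2, x=12)
for offset, what in layout.items():
    print(f"[ebp{offset:+d}] {what}")
```

With x = 12, esp ends up at ebp-12, which is the lowest slot listed.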

Assembly Abstract Machine “m86”
• Code Heap (C)
  • Code storage
  • Unchanged during execution
• Machine State
  • Memory (M)
  • Register File (R)
  • Instruction Pointer (eip)
    • Current instruction c = C(eip)
    • Or just use an instruction sequence (I)
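A minimal sketch of such a machine, assuming a tuple encoding of instructions, only three opcodes, and one code-heap slot per instruction. None of these choices come from the slides; they are simplifications for illustration. Operands are written source first, as in the prolog listing.

```python
from dataclasses import dataclass, field
from types import MappingProxyType

MASK = 0xFFFFFFFF  # keep register values within 32 bits

@dataclass
class State:
    M: dict = field(default_factory=dict)  # memory: address -> word
    R: dict = field(default_factory=dict)  # register file: name -> word
    eip: int = 0                           # instruction pointer

def step(C, s):
    """Execute the current instruction c = C(eip); C is never modified."""
    op, *args = C[s.eip]
    if op == "push":      # push src: esp -= 4, then M[esp] = R[src]
        (src,) = args
        s.R["esp"] = (s.R["esp"] - 4) & MASK
        s.M[s.R["esp"]] = s.R[src]
    elif op == "mov":     # mov src, dst
        src, dst = args
        s.R[dst] = s.R[src]
    elif op == "sub":     # sub $imm, dst
        imm, dst = args
        s.R[dst] = (s.R[dst] - imm) & MASK
    else:
        raise ValueError(f"unsupported instruction: {op}")
    s.eip += 1            # assumption: instructions occupy consecutive slots
    return s

# Code heap: a read-only mapping from code addresses to instructions.
C = MappingProxyType({
    0: ("push", "ebp"),
    1: ("mov", "esp", "ebp"),
    2: ("sub", 12, "esp"),
})

s = State(M={}, R={"esp": 0x1000, "ebp": 0x2000}, eip=0)
for _ in range(3):
    step(C, s)
```

Wrapping the code heap in `MappingProxyType` is one way to reflect "unchanged during execution": any attempt to assign into `C` raises a `TypeError`.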

Program Logic
• Based on SCAP
• Specification (p, g)
  • p : State -> Prop
  • g : State -> State -> Prop
• Inference Rules
  • Well-formed program
  • Well-formed basic block
  • Well-formed instruction
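To make the shape of a specification concrete, here is a rough sketch in which Prop is approximated by a boolean, so p and g become executable checks rather than logical propositions. The guarantee `g_f` is the one used for f in the "Figure Out G" slides; the precondition `p_f` (a 4-byte aligned esp) is a made-up example, not taken from the slides.

```python
from typing import Callable, Dict, NamedTuple, Tuple

class State(NamedTuple):
    M: Dict[int, int]   # memory
    R: Dict[str, int]   # register file

Pred = Callable[[State], bool]          # p : State -> Prop (Prop as bool here)
Guar = Callable[[State, State], bool]   # g : State -> State -> Prop
Spec = Tuple[Pred, Guar]

def p_f(s: State) -> bool:
    # Hypothetical precondition: the stack pointer is 4-byte aligned.
    return s.R["esp"] % 4 == 0

def g_f(s: State, s2: State) -> bool:
    # Guarantee relating the entry state s to the state s2 after the return:
    # ebp is preserved and the return address has been popped.
    return s2.R["ebp"] == s.R["ebp"] and s2.R["esp"] == s.R["esp"] + 4

spec_f: Spec = (p_f, g_f)

entry = State(M={}, R={"esp": 0x1000, "ebp": 0x2000})
after_ret = State(M={}, R={"esp": 0x1004, "ebp": 0x2000})
print(p_f(entry), g_f(entry, after_ret))
```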

Main Objects
• Code Generation
  • Minimize the proof size
  • E.g., temporary results should be kept in registers, not on the stack
• Assertion
  • Building (p, g) for each basic block
  • Generating (p, g) for each program point
• Proof
  • Generating proofs for functions/basic blocks
  • (reusing the proofs of the VCs at the source level)

Assertion Relationship
Intermediate Language:
    f : {p} //{q}
        Basic block1
    L1 : {p1}
        Basic block2
x86 Assembly Language:
    f : {(p’, g)}
        Basic block1
    L1 : {(p’1, g1)}
        Basic block2
Relationship:
    p’ = trans(p) /\ param_p /\ stack-reg_p
    g = trans(q) /\ callee-saved-reg_g /\ stack_g
    p’1 = trans(p1) /\ param_p1 /\ stack-reg_p1
    g1 = ?

Figure Out G
    f : {R’(ebp) = R(ebp) /\ R’(esp) = R(esp)+4}    (entry state R)
        push ebp                                    (state R0 afterwards, guarantee g0)
        mov esp, ebp
        sub $12, esp
    L1 : {g1}
        Basic block2
        leave
        ret                                         (final state R’)
Step for push ebp:
• Operational semantics: R0(ebp) = R(ebp) /\ R0(esp) = R(esp)-4
• Combined with g: R’(ebp) = R(ebp) /\ R0(ebp) = R(ebp) /\ R’(esp) = R(esp)+4 /\ R0(esp) = R(esp)-4
• Result g0: R’(ebp) = R0(ebp) /\ R’(esp) = R0(esp)+8
• The method:
  • Get the state relation from the rules of the operational semantics;
  • Use the g of the previous program point;
  • Do substitution and arithmetic.

Figure Out G (cont.)
Step for mov esp, ebp (state R1 afterwards, guarantee g1):
• Previous guarantee g0: R’(ebp) = R0(ebp) /\ R’(esp) = R0(esp)+8
• Operational semantics: R1(ebp) = R0(esp) /\ R1(esp) = R0(esp)
• Combined: R’(ebp) = R0(ebp) /\ R1(ebp) = R0(esp) /\ R’(esp) = R0(esp)+8 /\ R1(esp) = R0(esp)
• Result g1: R’(ebp) = M1(R1(ebp)) /\ R’(esp) = R1(esp)+8
  (using M1(R0(esp)) = R0(ebp), the value stored by push ebp)

Figure Out G (cont.)
Step for sub $12, esp (state R2 afterwards, guarantee g2):
• Previous guarantee g1: R’(ebp) = M1(R1(ebp)) /\ R’(esp) = R1(esp)+8
• Operational semantics: R2(ebp) = R1(ebp) /\ R2(esp) = R1(esp)-12
• Combined: R’(ebp) = M1(R1(ebp)) /\ R2(ebp) = R1(ebp) /\ R’(esp) = R1(esp)+8 /\ R2(esp) = R1(esp)-12
• Result g2: R’(ebp) = M2(R2(ebp)) /\ R’(esp) = R2(esp)+20
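The three derived guarantees can be sanity-checked on concrete numbers. The sketch below is an illustration under assumptions, not the proof-generation procedure itself: it fixes an arbitrary entry state, takes R’ as dictated by g, runs the three prolog instructions with hypothetical helper functions, and tests g0, g1 and g2 (the last conjunct of g2 read as R’(esp) = R2(esp)+20, which is what the substitution yields).

```python
def push(R, M, reg):
    """push reg: esp -= 4, then M[esp] = R[reg]; returns fresh copies."""
    R, M = dict(R), dict(M)
    R["esp"] -= 4
    M[R["esp"]] = R[reg]
    return R, M

def mov(R, M, src, dst):
    """mov src, dst (source first, as in the slides' listing)."""
    R = dict(R)
    R[dst] = R[src]
    return R, M

def sub(R, M, imm, dst):
    """sub $imm, dst."""
    R = dict(R)
    R[dst] -= imm
    return R, M

# Arbitrary entry state R; the final register file R' is fixed by g:
# R'(ebp) = R(ebp) /\ R'(esp) = R(esp) + 4.
R, M = {"ebp": 0x2000, "esp": 0x1000}, {}
Rp = {"ebp": R["ebp"], "esp": R["esp"] + 4}

R0, M0 = push(R, M, "ebp")
g0 = Rp["ebp"] == R0["ebp"] and Rp["esp"] == R0["esp"] + 8

R1, M1 = mov(R0, M0, "esp", "ebp")
g1 = Rp["ebp"] == M1[R1["ebp"]] and Rp["esp"] == R1["esp"] + 8

R2, M2 = sub(R1, M1, 12, "esp")
g2 = Rp["ebp"] == M2[R2["ebp"]] and Rp["esp"] == R2["esp"] + 20

print(g0, g1, g2)
```

A check on one state does not prove the relations; it only catches arithmetic slips such as a wrong constant or a wrong state index.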

Low-level Intermediate Language

Potential Benefits
• Hides some machine-specific details;
• Some optimizations could be done (optional);
• Makes the implementation simple and reusable
  • (*Note that this level is just a helper for generating code and proofs.*)
  • Only the translation from this level needs new code when targeting a different assembly logic

Code Generation (optional)
• Do some optimizations that do not affect the proof, such as:
  • Branch tunneling
  • Dead code elimination
• Future optimizations
  • Other low-level optimizations may be done here
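As a rough sketch of the two optimizations named above, consider a toy control-flow representation that is an assumption of this example, not CComp's: a dict from labels to instruction tuples, where `("jmp", L)` is an unconditional jump, `("jcc", L)` a conditional one, and every block ends in an explicit jump or `ret` (no fall-through). Dead code elimination is shown only in its simplest form, the removal of unreachable blocks.

```python
def tunnel(blocks):
    """Branch tunneling: retarget jumps that land on a jump-only block."""
    def final(label):
        seen = set()
        # Follow chains of blocks that consist of a single unconditional jump.
        while (label not in seen and len(blocks[label]) == 1
               and blocks[label][0][0] == "jmp"):
            seen.add(label)          # guards against jump cycles
            label = blocks[label][0][1]
        return label

    return {
        lbl: [(op, final(args[0])) if op in ("jmp", "jcc") else (op, *args)
              for op, *args in instrs]
        for lbl, instrs in blocks.items()
    }

def remove_unreachable(blocks, entry):
    """Drop blocks that no path from the entry label can reach."""
    reachable, work = set(), [entry]
    while work:
        lbl = work.pop()
        if lbl in reachable:
            continue
        reachable.add(lbl)
        for op, *args in blocks[lbl]:
            if op in ("jmp", "jcc"):
                work.append(args[0])
    return {lbl: b for lbl, b in blocks.items() if lbl in reachable}

blocks = {
    "f":  [("sub", 12, "esp"), ("jmp", "L1")],
    "L1": [("jmp", "L2")],            # jump-only block: can be tunneled through
    "L2": [("leave",), ("ret",)],
    "L3": [("jmp", "L1")],            # never targeted: unreachable
}
optimized = remove_unreachable(tunnel(blocks), "f")
```

After tunneling, nothing jumps to L1 any more, so the reachability pass removes it together with L3.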
