1 / 53

From OODL Interpreters to VM’s and Beyond

From OODL Interpreters to VM’s and Beyond. Jonathan Bachrach MIT AI Lab. Outline. Material mostly from “Lisp In Small Pieces” Metacircular Interpreters CPS Interpreters Fast Interpreters VM’s and Compilation Speeding up VM’s VVM’s In Language VM’s Assignment #2. Lisp In Small Pieces.

Thomas
Télécharger la présentation

From OODL Interpreters to VM’s and Beyond

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. From OODL Interpreters to VM’s and Beyond Jonathan Bachrach MIT AI Lab

  2. Outline • Material mostly from “Lisp In Small Pieces” • Metacircular Interpreters • CPS Interpreters • Fast Interpreters • VM’s and Compilation • Speeding up VM’s • VVM’s • In Language VM’s • Assignment #2

  3. Lisp In Small Pieces • Written by Christian Queinnec • Highly recommend looking at this book • Basis for a fair amount of today’s lecture

  4. Q: Why Interpreters?

  5. A: Why Interpreters? • Semantics of language • is manifest • is measurable • Allows more • Dynamism • Extensibility

  6. Metacircular Interpreters • Write interpreter in same language using own special forms • Complexity of resulting interpreter is a measure of expressivity and simplicity of language • Dream is to get a compiler for free from this!

  7. (dv <frame> <assocs>) (dv <env> (isa <any>)) (slot <env> (env-next <env>)) (slot <env> (env-frame <env>) (isa <frame>)) (dv $empty-env (isa <env>)) (dm env-define-binding! ((env <env>) (name <sym>) (value <any>)) (set (elt (env-frame frame) name) value)) (dm env-extend! ((env <env>) (names <lst>) (values <lst>) => <frame>) (let ((new-env (isa <env> (set env-next env)))) (do2 (fun (name value) (set (elt (env-frame new-env) name) value)) names values) new-env)) (dm env-binding-value ((env <env>) (name <sym>) => <any>) (if (== env $empty-env) nul (let ((value (elt (env-frame env) name))) (if (== value nul) (env-binding-value (env-next env) name) value)))) Simple Environment

  8. Proto in Proto (dm evol ((e <any>) (env <env>)) e) (dm evol ((e ‘()) (env <env>)) e) (dm evol ((e <sym>) (env <env>)) (env-binding-value env e)) (dm evol ((e <lst>) (env <env>)) (evol-list (head e) e env)) (dm evol-list ((_ 'quote) (e <lst>) (env <env>)) (sexpr-text-of-quotation e))

  9. Q: What’s the Problem? • Why might a metacircular interpreter not be acceptable?

  10. A: What’s the Problem? • Creates bootstrapping issues • Very inefficient

  11. Tail Position / Calls • If an expression is in tail position then it is not necessary to restore the environment • (fun (x) (f x)) • (fun (x) (if (f x) 5 6)) • Tail calls are calls that are in tail position and don’t save/restore environment • Bind arguments • Goto function

  12. Continuation Passing Style (CPS) • CPS = Continuation Passing Style • Explicitly telling a computation where to send the result of that computation • Once computation is complete, executor applies receiver to result rather than returning it as a normal value • Example • (fun (x) (f x)) => (fun (x k) (f’ x k))

  13. CPS Interpreter • Control is more explicit by passing continuations through interpreter • Allows one to code complicated control forms without built-in support. • E.g., call/cc

  14. CPS Interpreter Example (dm evol-list ((_ ‘quote) (e <lst>) (env <env>) (k <fun>)) (k (sexpr-text-of-quotation e))) (dm evol-list ((_ 'if) (e <lst>) (env <env>) (k <fun>)) (evol (sexpr-if-test e) env (fun (e) (if e (evol (sexpr-if-then e) env k) (evol (sexpr-if-else e) env k))))) (dm evol-seq+ ((e+ <lst>) (env <env>) (k <fun>)) (if (empty? (tail e*)) (evol (head e*) env k) (evol (head e*) env (fun (e) (evol-seq+ (tail e*) env k))))) (dm evol-seq ((e* <lst>) (env <env>) (k <env>)) (if (empty? e*) (k #f) (evol-seq+ e* env k))) (dm evol-list ((_ 'seq) (e <lst>) (env <env>) (k <fun>)) (evol-seq (sexpr-begin-actions e) env k))

  15. CPS Wins (dm evol-list ((_ 'lab) (e <lst>) (env <env>) (k <fun>)) (let ((exit (make-function '(x) `(,k x) env))) (evol-seq (sexpr-block-body e) (env-extend! env (lst (sexpr-block-name e)) (lst exit)) k)))

  16. CPS Loses • Problem is that some special forms like fin (cf. Lisp’s unwind-protect) are difficult* to implement without resorting to built-in support • *It appears that they are possible with enough continuations • Conses (allocates) lots of closures for continuations during interpretation

  17. Interpreter Pretreatment • Lighten up environment • Invent beginnings of instruction set • Implemented in terms of thunks (zero parameter functions) • Threaded as tree of calls to thunks • Reject environment (rep’d as global *env*) • Keep track of tail position

  18. Globals stored in vector Locals are in list of inlined frames Model environment structure in static environment Resolve environment accesses to fixed or relative offsets Globals are fixed offset Locals are two relative offsets N.B., Can do even better by flattening ribcage environment into a stack Fast Environment

  19. Pretreatment • Pretreatment of programs is handled by the function meaning (compile) • Converts into thunks that will serve as threading mechanism • Each thunk can be thought of as an address to which we simply jump to execute

  20. Inventing Instructions • Use combinators which are like instructions for an abstract machine: (dm CONSTANT (value) (fun () value) ) (dm meaning-list ((_ 'quote) (e <lst>) r t?) (fun () (CONSTANT (sexpr-text-of-quotation e)) ))

  21. (fun (n f k) (if (= n 0) (k 1) (f (- n 1) f (fun (r) (k (* n r)))))) (FIX-CLOSURE (ALTERNATIVE (CALL2 #<=> (SHALLOW-ARGUMENT-REF 0) (CONSTANT 0)) (TR-REGULAR-CALL (SHALLOW-ARGUMENT-REF 2) (STORE-ARGUMENT (CONSTANT 1) (ALLOCATE-FRAME 1) 0) (TR-REGULAR-CALL (SHALLOW-ARGUMENT-REF 1) (STORE-ARGUMENT (CALL2 #<-> (SHALLOW-ARGUMENT-REF 0) (CONSTANT 1)) (STORE-ARGUMENT (SHALLOW-ARGUMENT-REF 1) (STORE-ARGUMENT (FIX-CLOSURE (TR-REGULAR-CALL (DEEP-ARGUMENT-REF 1 2) (STORE-ARGUMENT (CALL2 #<*> (DEEP-ARGUMENT REF 1 0) (SHALLOW-ARGUMENT-REF 0)) (ALLOCATE-FRAME 1) 0)) 1) (ALLOCATE-FRAME 2) 2) 1) 0))) 3) Interpreter Pretreatment Output

  22. Control Forms (dm ALTERNATIVE (m1 m2 m3) (fun () (if (m1) (m2) (m3))) ) (dm meaning-list ((_ 'if) (e <lst>) r t?) (let ((m1 (meaning (sexpr-if-test e) r #f)) (m2 (meaning (sexpr-if-then e) r t?)) (m3 (meaning (sexpr-if-else e) r t?)) ) (ALTERNATIVE m1 m2 m3) ) )

  23. Sequence (dm SEQUENCE (m m+) (fun () (m) (m+)) ) (dm meaning-sequence (e+ r t?) (if (empty? e+) (static-wrong "Illegal syntax: (seq)") (if (empty? (tail e+)) (meaning (head e+) r t?) (meaning*-sequence (head e+) (tail e+) r t?)))) (dm meaning*-sequence (e e+ r t?) (let ((m1 (meaning e r #f)) (m+ (meaning-sequence e+ r t?)) ) (SEQUENCE m1 m+) ) )

  24. Closures (dm FIX-CLOSURE (m+ arity) (let ((arity+1 (+ arity 1))) (fun () (loc ((the-function (v* sr) (if (= (frame-length v*) arity+1) (seq (set *env* (sr-extend* sr v*)) (m+) ) (wrong "Incorrect arity") ) ) (make-closure the-function *env*) ) ) ) (dm meaning-fix-abstraction (n* e+ r t?) (let ((arity (len n*)) (r2 (r-extend* r n*)) (m+ (meaning-sequence e+ r2 #t)) ) (FIX-CLOSURE m+ arity) ) )

  25. Calls (dm TR-REGULAR-CALL (m m*) (fun () (let ((f (m))) (invoke f (m*)) ) ) ) (dm REGULAR-CALL (m m*) (fun () (let ((f (m)) (v* (m*)) (sr *env*) (result (invoke f v*)) ) (set *env* sr) result ) ) ) (dm meaning-application (e e* r t?) (let ((m (meaning e r #f)) (m* (meaning* e* r (len e*) #f)) ) (if t? (TR-REGULAR-CALL m m*) (REGULAR-CALL m m*)) ) )

  26. Argument Processing (dm meaning* (e* r size t?) (if (empty? e*) (meaning-no-argument r size t?) (meaning-some-arguments (head e*) (tail e*) r size t?))) (dm meaning-no-argument (r size t?) (ALLOCATE-FRAME size)) (dm meaning-some-arguments (e e* r size t?) (let ((m (meaning e r #f)) (m* (meaning* e* r size t?)) (rank (- size (+ (len e*) 1)))) (STORE-ARGUMENT m m* rank)))

  27. Q: Where Do We Go From Here?

  28. VM Introduction • Portable • Small • Factors Implementation

  29. VM Design Parameters • Space • Runtime • Delivery • Efficiency • Runtime • Compilation • Analyzability • Inferred types • Control flow info

  30. From Interpreter to VM • Linearize • Invent registers • Invent stack • Code generation through concatenation • Byte code instructions

  31. Byte Code Machine

  32. (SHALLOW-ARGUMENT-REF j) (DEEP-ARGUMENT-REF I j) (DEEP-ARGUMENT-SET! I j m) (CHECKED-GLOBAL-REF I) (CONSTANT V) (SEQUENCE M M+) (FIX-LET M* M+) (CALL1 ADDRESS M1) (CALL3 ADDRESS M1 M2 M3) (REGULAR-CALL M M*) (ALLOCATE-FRAME SIZE) M,m1,m2,m3,m+,v are values Rank,arity,size,I,j are ints Address is predefined funct (PREDEFINED I) (SHALLOW-ARGUMENT-SET J M) (GLOBAL-REF I) (GLOBAL-SET! I M) (ALTERNATIVE M1 M2 M3) (TR-FIX-LET M* M+) (CALL0 ADDRESS) (CALL2 ADDRESS M1 M2) (FIX-CLOSURE M+ ARITY) (TR-REGULAR-CALL M M*) (STORE-ARGUMENT M M* RANK) “Instructions”

  33. Introduce *val* • Break implicit communication between instructions by introducing result register (dm CONSTANT (value) (fun () (set *val* value)))

  34. Inventing Stack (dm STORE-ARGUMENT (m m* rank) (fun () (m) (let ((v *val*)) (m*) (set (frame-argument *val* rank) v)))) => (dm STORE-ARGUMENT (m m* rank) (fun () (m) (stack-push *val*) (m*) (let ((v (stack-pop))) (set (frame-argument *val* rank) v))))

  35. Linearization (dm SHALLOW-ARGUMENT-SET! (j m) (fun () (m) (set (frame-argument! *env* j) *val*))) => (dm SHALLOW-ARGUMENT-SET! (j m) (cat m (SET-SHALLOW-ARGUMENT! J))) (dm SET-SHALLOW-ARGUMENT! (j) (lst (fun () (set (frame-argument! *env* j) *val*)))))

  36. (dm REGULAR-CALL (m m*) (cat m (PUSH-VALUE) m* (POP-FUNCTION) (PRESERVE-ENV) (FUNCTION-INVOKE) (RESTORE-ENV))) (dm PUSH-VALUE () (lst (fun () (stack-push *val*)))) (dm POP-FUNCTION () (lst (fun () (set *fun* (stack-pop)))) (dm PRESERVE-ENV () (lst (fun () (stack-push *env*)))) (dm FUNCTION-INVOKE () (lst (fun () (invoke *fun*)))) (dm RESTORE-ENV () (lst (fun () (set *env* (stack-pop))))) (dm run () (let ((inst (head *pc*))) (set *pc* (tail *pc*)) (inst) (run))) Inventing PC

  37. VM Dispatch Loop (dm run () (let ((instruction (head *pc*))) (set *pc* (tail *pc*)) (instruction) (run))) • Already removed decode overhead by translating incoming bytecodes to direct calls • Still extra overhead • Return from instruction • Loop

  38. Reducing Dispatch Overhead • Convert back to threaded code • Each instruction is responsible for calling next one: (dm CONSTANT (value) (fun () (set *val* value) (let ((instruction (head *pc*))) (set *pc* (tail *pc*)) (instruction)) • Example Push3 on PowerPC takes • 2 instrs to do real push3 • 11 total instrs for pc loop version • 5 total instrs for threaded version • Can do same with fast interpreter • Use CPS threading • Consing not an issue if closures created during pretreatment

  39. Hand Selected Macro Instructions • Still have overhead especially for simple instructions • Can do better by combining oft-occurring sequences of instructions • amortizing the decode/dispatch overhead • customizing resulting instruction to remove any other interpretive overhead • shrinks byte-code size

  40. Automatic Macro Instruction Detection • Can run statistics off-line to determine best macros • Can also run statistics on-line

  41. VVM’s • Tower of VM’s • Record expansions • Concatenate • Peephole optimize at each level • Specialization of instructions • Now that actual arguments are available • Between instruction optimizations

  42. Boxing/Unboxing • Objects all the way down • Integers are objects and must be self-identifying • One way to encode them is to wrap integer data in a “box” which also points to integer prototype • Integer ops must then first unbox before running machine level integer op and finally box result • We’ll be talking about alternative representations later

  43. (fun (x y) (* (+ x 1) y)) => (SHALLOW-ARGUMENT-REF 0 R0) (CONSTANT 1 R1) (I+ R0 R1 R2) (SHALLOW-ARGUMENT-REF 1 R3) (I* R2 R3 R4) (RETURN R4) (SHALLOW-ARGUMENT-REF 0 R0) (CONSTANT 1 R1) (UNBOX-INT R0 R2) (UNBOX-INT R1 R3) (PI+ R2 R3 R4) (BOX-INT R4 R5) (SHALLOW-ARGUMENT-REF 1 R6) (UNBOX-INT R5 R7) (UNBOX-INT R6 R8) (PI* R7 R8 R9) (BOX-INT R9 R10) (RETURN R9) VVM Example

  44. R/VVM’s • Provides extensible VM infrastructure • Permits the running of many different instruction sets customized to different languages • Examples • Greg Sullivan’s DVM • Ian Piumarta’s VVM

  45. Bytecodes can be Fasterthan Direct Execution • When • There is a high price for cache misses • VM can fit inside cache • The interpretive overhead of the bytecodes is small (i.e., they are high-level) • But • There will be cases where interpretive overhead is high • Suggests hybrid approach • Demand driven translation • Analysis determines interpretive overhead

  46. In C VM’s + Portable - Hard to interoperate with native code - Needs own versions of special forms - Needs its own FFI to C

  47. In Language VM’s + Write VM in Same Language - Challenging to get high performance type system can be overly restrictive - Tough Bootstrap

  48. IDVM • In Dylan Virtual Machine was developed at Harlequin by Tony Mann and Eliot Miranda • Consider IPVM

  49. Example (dm is-numeric? ((x <int>)) #t) => (dm IPVM-assign-constant-to-result ((vm-code <vec>) (ip <int>) result (vars …)) (let ((new-result (elt vm-code ip)) (new-ip (+ ip 1)) (next-inst (elt vm-code new-ip))) (apply next-inst vm-code (+ new-ip 1) new-result vars))) (dm IPVM-return ((vm-code <vec>) (ip <int>) result (vars …)) result) (fun ((x <int>)) (let ((ip 0) (vm-code (vec IDVM-assign-constant-to-result #f IDVM-return)) (first-inst (elt vm-code ip))) (first-inst vm-code (+ ip 1) #f a)))

  50. The Trick • Considerable effort was expending optimizing the following idiom: (dm foo (a b c (thingies …)) … (apply somefunction a b c thingies)) • Converted into a tail call with thingies being stack allocated • Each IPVM instruction tail calls the next with code, pc, and locals as arguments.

More Related