Mobility, Security, and Proof-Carrying Code Peter Lee Carnegie Mellon University

Mobility, Security, and Proof-Carrying Code Peter Lee Carnegie Mellon University Lecture 4 July 13, 2001 LF, Oracle Strings, and Proof Tools Lipari School on Foundations of Wide Area Network Programming

“Applets, Not Craplets” A Demo

Cedilla Systems Architecture [diagram labels: Code producer; Host; Java binary; Special J; Ginseng; Native code; Proof; "Written in OCaml"; "~52KB, written in C"]

Cedilla Systems Architecture Java binary Special J Native code VCGen Annotations VC Axioms Proof checker Proof Code producer Host

Cedilla Systems Architecture Java binary Native code Certifying compiler VCGen Annotations VC VCGen Axioms VC Axioms Proof generator Proof checker Proof Code producer Host

Class file Class file Native code Java Virtual Machine Java Verifier Checker Proof-carrying code JVM JNI

Show either the Mandelbrot or NBody3D demo.

Crypto Test Suite Results [Cedilla Systems] [chart: running time in seconds] On average, 158% faster than Java and 72.8% faster than Java with a JIT.

Java Grande Suite v2.0 [Cedilla Systems] [chart: running time in seconds]

Java Grande Bench Suite [Cedilla Systems] [chart: ops]

Ginseng • VCGen: ~15KB, roughly similar to a KVM verifier (but with floating-point). • Checker: ~4KB, generic. • Safety Policy: ~19KB, declarative and machine-generated. • Dynamic loading and cross-platform support: ~22KB, some optional.

Practical Considerations

Trusted Computing Base • The trusted computing base is the software infrastructure that is responsible for ensuring that only safe execution is possible. • Obviously, any bugs in the TCB can lead to unsafe execution. • Thus, we want the TCB to be simple, as well as fast and small.

VCGen’s Complexity • Fortunately, we shall see that proofchecking can be quite simple, small, and fast. • VCGen, at core, is also simple and fast. • But in practice it gets to be quite complicated.

VCGen’s Complexity • Some complications: • If dealing with machine code, then VCGen must parse machine code. • Maintaining the assumptions and current context in a memory-efficient manner is not easy. • Note that Sun’s kVM does verification in a single pass and in only 8KB of RAM!

VC Explosion [flowchart: if a == b then a := x else c := x; if a == c then a := y else c := y; f(a,c)] VC: (a=b => (x=c => safef(y,c) ∧ x<>c => safef(x,y))) ∧ (a<>b => (a=x => safef(y,x) ∧ a<>x => safef(a,y))) Exponential growth in size of the VC is possible.
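The doubling can be reproduced with a small script. The following Python sketch is only an illustration under simplifying assumptions: formulas are plain strings, every statement is an `if l == r then v := e else v' := e'` diamond, and the helper names (`subst`, `wp_if`, `vc`, `chain`, `obligations`) are invented here and are not part of any VCGen. It computes the VC backwards from `safef(a,c)` and counts how many safety obligations appear.

```python
import re

def subst(formula, var, expr):
    """Replace whole-word occurrences of variable `var` in `formula` by `expr`."""
    return re.sub(rf"\b{re.escape(var)}\b", lambda _m: expr, formula)

def wp_if(cond, then_assign, else_assign, post):
    """VC of `if l == r then v := e else v' := e'` followed by `post`."""
    l, r = cond
    then_vc = subst(post, *then_assign)   # post with the then-branch assignment applied
    else_vc = subst(post, *else_assign)   # post with the else-branch assignment applied
    return f"(({l}={r} => {then_vc}) /\\ ({l}<>{r} => {else_vc}))"

def vc(diamonds, post):
    """Walk the program backwards, duplicating `post` at every diamond."""
    for cond, then_assign, else_assign in reversed(diamonds):
        post = wp_if(cond, then_assign, else_assign, post)
    return post

# The two diamonds from the slide: (condition, then-assignment, else-assignment).
SLIDE = [(("a", "b"), ("a", "x"), ("c", "x")),
         (("a", "c"), ("a", "y"), ("c", "y"))]

def chain(n):
    """Hypothetical longer program: n diamonds assigning made-up values v0, v1, ..."""
    return [(("a", "c"), ("a", f"v{i}"), ("c", f"v{i}")) for i in range(n)]

def obligations(formula):
    """Number of safef(...) obligations in a VC string."""
    return formula.count("safef(")

if __name__ == "__main__":
    print(vc(SLIDE, "safef(a,c)"))
    for n in range(1, 7):
        print(n, obligations(vc(chain(n), "safef(a,c)")))
```

Each diamond copies the postcondition into both branches, so a straight-line sequence of n diamonds yields 2^n copies of the final obligation.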

VC Explosion [flowchart: if a == b then a := x else c := x; INV: P(a,b,c,x); if a == c then a := y else c := y; f(a,c)] VC: (a=b => P(x,b,c,x) ∧ a<>b => P(a,b,x,x)) ∧ (∀a’,c’. P(a’,b,c’,x) => a’=c’ => safef(y,c’) ∧ a’<>c’ => safef(a’,y)) Growth can usually be controlled by careful placement of just the right “join-point” invariants.

Stack Slots • Each procedure will want to use the stack for local storage. • This raises a serious problem because a lot of information is lost by VCGen (such as the value) when data is stored into memory. • We avoid this problem by assuming that procedures use up to 256 words of stack as registers.

Exercise • 8. Just as with loop invariants, our actual join-point invariants include a specification of the registers that might be modified since the dominating block. • Why might this be a useful thing to do? Why might it be a bad thing to do?

Callee-save Registers • Standard calling conventions dictate that the contents of some registers be preserved. • These callee-save registers are specified along with the pre/post-conditions for each procedure. • The preservation of their values must be verified at every return instruction.

Introduction to Efficient Representation and Validation of Proofs

High-Level Architecture Code Verification condition generator Checker Explanation Agent Safety policy Host

Goals • We would like a representation for proofs that • is compact, • is fast to check, • requires very little memory to check, • and is “canonical” (in the sense of accommodating many different logics without requiring a total reimplementation of the checker).

Three Approaches • 1. Direct representation of a logic. • 2. Use of a Logical Framework. • 3. Oracle strings. • We will reject (1). • Today we introduce (2) and (3).

Logical Framework • For representation of proofs we use the Edinburgh Logical Framework (LF).

Reynolds’ Example Skip?

Formal Proofs • Write “x is a proof of P” as x:P. • Examples of predicates P: • (for all A, B) A and B => B and A • (for all x, y, z) x < y and y < z => x < z • What do the proofs look like?

Inference Rules • We can write proofs by stitching together inference rules. • An example inference rule: • If we have a proof x of P and a proof y of Q, then x and y together constitute a proof of P ∧ Q. • Or, more compactly: • if x:P, y:Q then (x,y):P*Q. [Callout on (x,y): a proof, written in our compact notation.]

More Inference Rules • Another inference rule: • Assume we have a proof x of P. If we can then obtain a proof b of Q, then we have a proof of P ⇒ Q. • if [x:P] b:Q then fn (x:P) => b : P → Q. • More rules: • if x:P*Q then fst(x):P • if y:P*Q then snd(y):Q

Types and Proofs • So, for example: • fn (x:P*Q) => (snd(x), fst(x)) : P*Q → Q*P • We can develop a full metalanguage based on this principle of “proofs as programs”. • Typechecking gives us proofchecking! • Codified in the LF language.
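To see "typechecking gives us proofchecking" in action, here is a minimal Python sketch of a typechecker for the proof notation used on these slides (pairs, fst, snd, and fn). The tuple encoding and the names `typeof` and `proves` are assumptions made for this illustration; this is not LF itself, only the propositional fragment shown above.

```python
# Propositions: an atom is a string such as "P";
# ("and", A, B) stands for A*B and ("imp", A, B) for A -> B.
# Proof terms: ("var", x), ("pair", m, n), ("fst", m), ("snd", m), ("fn", x, A, body).

def typeof(term, ctx=None):
    """Return the proposition proved by `term`, or raise TypeError."""
    ctx = ctx or {}
    tag = term[0]
    if tag == "var":
        if term[1] not in ctx:
            raise TypeError(f"unbound hypothesis {term[1]}")
        return ctx[term[1]]
    if tag == "pair":                       # if x:P, y:Q then (x,y):P*Q
        return ("and", typeof(term[1], ctx), typeof(term[2], ctx))
    if tag in ("fst", "snd"):               # if x:P*Q then fst(x):P, snd(x):Q
        t = typeof(term[1], ctx)
        if not (isinstance(t, tuple) and t[0] == "and"):
            raise TypeError(f"{tag} applied to a non-pair")
        return t[1] if tag == "fst" else t[2]
    if tag == "fn":                         # if [x:P] b:Q then fn (x:P) => b : P -> Q
        _, x, hyp, body = term
        return ("imp", hyp, typeof(body, {**ctx, x: hyp}))
    raise TypeError(f"unknown proof term {tag}")

def proves(term, prop):
    """True when `term` typechecks and its type is exactly `prop`."""
    try:
        return typeof(term) == prop
    except TypeError:
        return False

# fn (x:P*Q) => (snd(x), fst(x)) : P*Q -> Q*P
SWAP = ("fn", "x", ("and", "P", "Q"),
        ("pair", ("snd", ("var", "x")), ("fst", ("var", "x"))))

if __name__ == "__main__":
    print(proves(SWAP, ("imp", ("and", "P", "Q"), ("and", "Q", "P"))))
```

The checker never searches for a proof. It only walks the term once, which is why the checker can stay small.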

LFi Skip?

LF Example This classic example illustrates how LF is used to represent the terms and rules of a logical system.

LF Example in Elf Syntax
The same example, but using Pfenning’s Elf syntax.
  exp   : type
  pred  : type
  pf    : pred -> type
  true  : pred
  /\    : pred -> pred -> pred
  =>    : pred -> pred -> pred
  all   : (exp -> pred) -> pred
  truei : pf true
  andi  : {P:pred} {R:pred} pf P -> pf R -> pf (/\ P R)
  andel : {P:pred} {R:pred} pf (/\ P R) -> pf P
  impi  : {P:pred} {R:pred} (pf P -> pf R) -> pf (=> P R)
  alli  : {P:exp -> pred} ({X:exp} pf (P X)) -> pf (all P)
  alle  : {P:exp -> pred} {E:exp} pf (all P) -> pf (P E)

LF as a Proof Representation • LF is canonical, in that a single typechecker for LF can serve as a proofchecker for many different logics specified in LF. [See Avron, et al. ‘92] • But the efficiency of the representation is poor.

Size of LF Representation • Proofs in LF are extremely large, due to large amounts of repetition. • Consider the representation of P ⇒ P ∧ P for some predicate P: (=> P (/\ P P)) • The proof of this predicate has the following LF representation: (impi P (/\ P P) ([X:pf P] andi P P X X))

Checking LF • (impi P (/\ P P) ([X:pf P] andi P P X X)) : pf (=> P (/\ P P)) • The nice thing is that typechecking is enough for proofchecking. [The theorem is in the LF paper.] • But the proofs are extremely large.

Implicit LF • A dramatic improvement can be achieved by using a variant of LF, called Implicit LF, or LFi. • In LFi, parts of the proof can be replaced by placeholders. (impi * * ([X:*] andi * * X X)) : pf (=> P (/\ P P))
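The placeholder idea can be tried out on the example proof. The Python sketch below is a simplified illustration, not the LFi algorithm of [Necula & Lee, LICS98]: it handles only truei, andi, impi, and hypotheses, it checks a proof against a known goal so that every `*` can be read off the goal, and the tuple encoding and the names `check`, `agree`, and `size` are assumptions made here.

```python
HOLE = "*"   # placeholder for an omitted subterm, written as on the slide

# Predicates: an atom such as "P", "true", ("and", P, R), ("imp", P, R).
# Proofs: a hypothesis name such as "X", ("truei",), ("andi", P, R, m, n),
#         ("impi", P, R, (x, hyp, body)) where (x, hyp, body) encodes [x:pf hyp] body.

def agree(given, expected):
    """An explicit argument must match the goal; a placeholder always does."""
    if given != HOLE and given != expected:
        raise TypeError(f"expected {expected}, found {given}")

def check(pf, goal, ctx=None):
    """Check proof `pf` against predicate `goal`; raise TypeError on failure."""
    ctx = ctx or {}
    if isinstance(pf, str):                          # use of a hypothesis
        if ctx.get(pf) != goal:
            raise TypeError(f"{pf} does not prove {goal}")
        return
    rule = pf[0]
    if rule == "truei" and goal == "true":
        return
    if rule == "andi" and isinstance(goal, tuple) and goal[0] == "and":
        _, p, r, m, n = pf
        agree(p, goal[1])
        agree(r, goal[2])
        check(m, goal[1], ctx)
        check(n, goal[2], ctx)
        return
    if rule == "impi" and isinstance(goal, tuple) and goal[0] == "imp":
        _, p, r, (x, hyp, body) = pf
        agree(p, goal[1])
        agree(r, goal[2])
        agree(hyp, goal[1])
        check(body, goal[2], {**ctx, x: goal[1]})
        return
    raise TypeError(f"rule {rule} cannot prove {goal}")

def size(term):
    """Number of symbols in a term, not counting placeholders."""
    if term == HOLE:
        return 0
    if isinstance(term, str):
        return 1
    return sum(size(t) for t in term)

GOAL = ("imp", "P", ("and", "P", "P"))
FULL = ("impi", "P", ("and", "P", "P"), ("X", "P", ("andi", "P", "P", "X", "X")))
LFI = ("impi", HOLE, HOLE, ("X", HOLE, ("andi", HOLE, HOLE, "X", "X")))

if __name__ == "__main__":
    check(FULL, GOAL)
    check(LFI, GOAL)
    print(size(FULL), size(LFI))
```

Both encodings are accepted against the same goal, and the placeholder version carries fewer than half the symbols because the omitted arguments are recovered from the goal rather than transmitted.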

Soundness of LFi • The soundness of the LFi type system is given by a theorem that states: • If, in context Γ, a term M has type A in LFi (and Γ and A are placeholder-free), then there is a term M’ such that M’ has type A in LF.

Typechecking LFi • The typechecking algorithm for LFi is given in [Necula & Lee, LICS98]. • A key aspect of the algorithm is that it avoids repeated typechecking of reconstructed terms. • Hence, the placeholders save not only space, but also time.

Effectiveness of LFi • In experiments with PCC, LFi leads to substantial reductions in proof size and checking time. • Improvements increase nonlinearly with proof size.

The Need for Improvement • Despite the great improvement of LFi, in our experiments we observe that LFi proofs are 10%-200% the size of the code.

How Big is a Proof? • A basic question is: how much essential information is in a proof? • In this proof, (impi * * ([X:*] andi * * X X)) : pf (=> P (/\ P P)) • there are only 2 uses of rules, and in each case it was the only rule that could have been used.

Improving the Representation • We will now improve on the compactness of proof representation by making use of the observation that large parts of proofs are deterministically generated from the inference rules.
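One way to picture this observation is a goal-directed search that consults an oracle only when more than one rule applies. The Python sketch below is a toy under stated assumptions and not the encoding used in the actual system: the rule set (truei, andi, impi, hypotheses), the bit layout, and the names `candidates` and `prove` are all invented for illustration.

```python
from math import ceil, log2

def candidates(goal, hyps):
    """All ways to attack `goal`: a matching hypothesis or the introduction rule."""
    cands = [("hyp", name) for name, prop in hyps if prop == goal]
    if goal == "true":
        cands.append(("truei",))
    elif isinstance(goal, tuple) and goal[0] in ("and", "imp"):
        cands.append((goal[0],))
    return cands

def prove(goal, bits, hyps=()):
    """Rebuild a proof of `goal`, reading oracle `bits` (a string of 0/1) only at
    genuine choice points. Returns (proof, unread bits); raises ValueError if stuck."""
    cands = candidates(goal, hyps)
    if not cands:
        raise ValueError(f"no rule applies to {goal!r}")
    width = ceil(log2(len(cands)))          # 0 bits when the step is forced
    if len(bits) < width:
        raise ValueError("oracle string exhausted")
    index = int(bits[:width], 2) if width else 0
    bits = bits[width:]
    if index >= len(cands):
        raise ValueError("oracle selects a rule that does not exist")
    choice = cands[index]
    if choice[0] == "hyp":
        return choice[1], bits
    if choice[0] == "truei":
        return ("truei",), bits
    if choice[0] == "and":
        _, p, r = goal
        m, bits = prove(p, bits, hyps)
        n, bits = prove(r, bits, hyps)
        return ("andi", m, n), bits
    _, p, r = goal                          # remaining case: implication introduction
    name = f"X{len(hyps)}"
    body, bits = prove(r, bits, hyps + ((name, p),))
    return ("impi", (name, body)), bits

if __name__ == "__main__":
    # P => P /\ P: every step is forced, so the empty oracle string suffices.
    print(prove(("imp", "P", ("and", "P", "P")), ""))
    # P => P => P: two hypotheses prove P, so one bit picks between them.
    print(prove(("imp", "P", ("imp", "P", "P")), "1"))
```

In this toy setting the example proof of P ⇒ P ∧ P needs no oracle bits at all, which matches the remark that each rule used was the only one that could have been used.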

Additional References • For LF: • Harper, Honsell, & Plotkin. A framework for defining logics. Journal of the ACM, 40(1), 143-184, Jan. 1993. • Avron, Honsell, Mason, & Pollack. Using typed lambda calculus to implement formal systems on a machine. Journal of Automated Reasoning, 9(3), 309-354, 1992.

Additional References • For Elf: • Pfenning. Logic programming in the LF logical framework. Logical Frameworks, Huet & Plotkin (Eds.), 149-181, Cambridge Univ. Press, 1991. • Pfenning. Elf: A meta-language for deductive systems (system description). 12th International Conference on Automated Deduction, LNAI 814, 811-815, 1994.

Oracle-Based Checking

Necula’s Example: Syntax of Girard’s System F
  ty  : type
  int : ty
  arr : ty -> ty -> ty
  all : (ty -> ty) -> ty
  exp : type
  z   : exp
  s   : exp -> exp
  lam : (exp -> exp) -> exp
  app : exp -> exp -> exp
  of  : exp -> ty -> type

Necula’s Example: Typing Rules for System F
  tz   : of z int
  ts   : {E:exp} of E int -> of (s E) int
  tlam : {E:exp->exp} {T1:ty} {T2:ty} ({X:exp} of X T1 -> of (E X) T2) -> of (lam E) (arr T1 T2)
  tapp : {E1:exp} {E2:exp} {T:ty} {T2:ty} of E1 (arr T2 T) -> of E2 T2 -> of (app E1 E2) T
  tgen : {E:exp} {T:ty->ty} ({T1:ty} of E (T T1)) -> of E (all T)
  tins : {E:exp} {T:ty->ty} {T1:ty} of E (all T) -> of E (T T1)

LF Representation • Consider the lambda expression (λf.(f λx.x) (f 0)) λy.y • It is represented in LF as follows: app (lam [F:exp] app (app F (lam [X:exp] X)) (app F 0)) (lam [Y:exp] Y)
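The translation from the lambda expression to its LF encoding is mechanical. The Python sketch below shows one way to do it, assuming a hypothetical tuple encoding of lambda terms and a hypothetical helper `to_lf`. It prints bound variables in upper case and leaves the numeral 0 as written on the slide.

```python
# Lambda terms: ("var", name), ("num", n), ("lam", name, body), ("app", fun, arg).

def to_lf(term, atom=False):
    """Render a lambda term with the `lam` and `app` constants of the signature.
    With atom=True, compound terms are wrapped in parentheses."""
    tag = term[0]
    if tag == "var":
        return term[1].upper()
    if tag == "num":
        return str(term[1])
    if tag == "lam":
        text = f"lam [{term[1].upper()}:exp] {to_lf(term[2])}"
    elif tag == "app":
        text = f"app {to_lf(term[1], True)} {to_lf(term[2], True)}"
    else:
        raise ValueError(f"unknown term {tag}")
    return f"({text})" if atom else text

# (λf.(f λx.x) (f 0)) λy.y
EXAMPLE = ("app",
           ("lam", "f", ("app",
                         ("app", ("var", "f"), ("lam", "x", ("var", "x"))),
                         ("app", ("var", "f"), ("num", 0)))),
           ("lam", "y", ("var", "y")))

if __name__ == "__main__":
    print(to_lf(EXAMPLE))
```

Binding in the object language becomes binding in LF (`lam [X:exp] ...`), so no separate treatment of variable names or substitution is needed in the encoding.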

Necula’s Example • Now suppose that this term is an applet, with the safety policy that all applets must be well-typed in System F. • One way to make a PCC is to attach a typing derivation to the term.
