Stephanie Weirich Cornell University. Resource Bound Certification. Joint work with Karl Crary, CMU. Problem. Sometimes code runs too long How do we prevent it?. Enforcement Methods. Run the code, and if it takes too long, kill it Look at the code and decide how long it will take
Stephanie Weirich Cornell University Resource Bound Certification Joint work with Karl Crary, CMU
Problem • Sometimes code runs too long • How do we prevent it?
Enforcement Methods • Run the code, and if it takes too long, kill it • Look at the code and decide how long it will take • Annotations on the code can make this process decidable
Dynamic Method • Can decide how long is too long on the fly (even while the code is running) • Code producer does not have to write code in any particular language (or include annotations) • May be expensive to enforce
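The dynamic method can be sketched as a step budget that is charged while the code runs. This is only an illustration: the names `Clock`, `OutOfTime`, `count_down` and `run_with_budget` are hypothetical and do not come from the talk, and charging one step per loop iteration is an assumption borrowed from the cost model introduced on a later slide.

```python
class OutOfTime(Exception):
    """Raised when the step budget is exhausted (hypothetical name)."""


class Clock:
    """A run-time step budget that is charged as the code executes."""

    def __init__(self, budget):
        self.remaining = budget

    def tick(self, steps=1):
        # Kill the computation as soon as it asks for more than is left.
        if steps > self.remaining:
            raise OutOfTime("step budget exceeded")
        self.remaining -= steps


def count_down(n, clock):
    # Untrusted code: charged one step per loop iteration.
    while n > 0:
        clock.tick()
        n -= 1
    return "done"


def run_with_budget(fn, budget, *args):
    """Run fn under a budget; return None if it had to be killed."""
    clock = Clock(budget)
    try:
        return fn(*args, clock)
    except OutOfTime:
        return None
```

The check on every `tick` is the run-time overhead that the slide refers to as "expensive to enforce".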
Static Method • Guarantee to the user that code will run without being prematurely terminated • May provide faster execution • Can encompass dynamic checking • Static verification that the code executes dynamic checks • Makes policy enforcement part of the model
Annotations • What annotations can we use for execution time verification?
Base Language • Type-Safe version of C • all pointer accesses checked for NULL • no pointer arithmetic • safe memory-management (garbage-collection or region-based) • tagged unions
Caveat • Most examples in this talk will look like functional programming • The intention is to verify a low-level language (such as type-safe assembly language) • Don’t have to trust the compiler • Hopefully annotation methodologies are general enough that people can be clever
Certification of Running Time • Simplification: Only count function calls and loop iterations • Functions and loops annotated with the cost of execution int add1(int x)<0> { return x+1; } int add3(int x)<3> { x = add1(x); x = add1(x); return add1(x); }
Example • Function-typed arguments can influence the running time int foo(int f(int x)<3>)<8> { return f(1)*f(3); } • But can restrict how the code is used foo(add1); foo(add3);
Abstract Time • Useful to abstract the time annotation int foo (int f(int x)<k>) <2k+2> { return f(1) * f(3); } foo (add1); // k=0, takes 2 steps + 1 for call foo (add3); // k=3, takes 8 steps + 1 for call
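The annotations on add1, add3 and foo can be cross-checked by instrumenting the calls. The sketch below is not the static checker from the talk; it only counts function calls at run time under the stated simplification. `StepCounter` and `body_cost` are hypothetical helpers, and every function takes the counter as an extra first argument.

```python
class StepCounter:
    """Counts one step per function call, following the slides' cost model."""

    def __init__(self):
        self.steps = 0

    def call(self, f, *args):
        self.steps += 1            # the "+ 1 for call"
        return f(self, *args)      # the callee shares the same counter


def add1(c, x):                    # annotated <0>: the body makes no calls
    return x + 1


def add3(c, x):                    # annotated <3>: the body makes three calls
    x = c.call(add1, x)
    x = c.call(add1, x)
    return c.call(add1, x)


def foo(c, f):                     # annotated <2k+2> when f costs k
    return c.call(f, 1) * c.call(f, 3)


def body_cost(f, *args):
    """Run the body of f on a fresh counter; return (result, steps used)."""
    c = StepCounter()
    return f(c, *args), c.steps
```

With k = 0 the body of foo uses 2 steps, and with k = 3 it uses 8, matching 2k+2; the extra step for calling foo itself is paid by the caller and is not included here.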
Static Dependent Cost • What if the time depends on a non-function argument? • Essential for recursive functions uint fact (uint n)<n>{ if (n == 0) return 1; else return (fact (n-1) * n); } • Note: • Time annotation must be non-negative • Ignore underflow/overflow
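A quick way to see why the annotation on fact is `<n>`: the body of fact(n) makes one call to fact(n-1), which costs 1 for the call plus n-1 for its body. The sketch below counts those calls at run time; the one-element list `counter` and the helper `fact_body_cost` are illustrative assumptions, not part of the talk.

```python
def fact(n, counter):
    """Factorial that records one step per recursive call in counter[0]."""
    if n == 0:
        return 1
    counter[0] += 1                      # cost of the call to fact(n - 1)
    return fact(n - 1, counter) * n


def fact_body_cost(n):
    """Steps used by the body of fact(n), excluding the caller's own call."""
    counter = [0]
    fact(n, counter)
    return counter[0]
```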
Size • What if the argument is not a uint or a function? • Option: Pre-defined mapping from structured data values to their “sizes”
Example - List struct list { const uint val; const struct list* next; } • The size of a list is its length • Simplifying assumption - The members of the struct are const so that we do not have to track aliasing.
Sumf int sumf (uint f(uint)<k>; struct list* x; int acc) <length(x)*(k+2)> { if (x == NULL) return acc; else { int acc2 = acc + f (x->val); struct list* x2 = x -> next; return sumf (f, x2, acc2); } }
Calling Sumf sumf(add3, NULL,0); // 0*(3+2) + 1 struct list* x = new { val = 5; next = NULL }; sumf(add3, x,0); // 1*(3+2) + 1
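The bound length(x)*(k+2) comes from two charges per list element: k+1 for calling f and 1 for the recursive call. A minimal run-time check of that arithmetic follows, assuming lists are modelled as nested `(val, next)` pairs with `None` for NULL; `StepCounter`, `make_list` and `length` are hypothetical helpers for this sketch.

```python
class StepCounter:
    """Counts one step per function call."""

    def __init__(self):
        self.steps = 0

    def call(self, f, *args):
        self.steps += 1
        return f(self, *args)


def make_list(values):
    """Build a linked list of (val, next) pairs; None stands for NULL."""
    node = None
    for v in reversed(values):
        node = (v, node)
    return node


def length(x):
    n = 0
    while x is not None:
        n += 1
        x = x[1]
    return n


def add1(c, x):                    # cost <0>
    return x + 1


def add3(c, x):                    # cost <3>
    return c.call(add1, c.call(add1, c.call(add1, x)))


def sumf(c, f, x, acc):            # cost <length(x) * (k + 2)>
    if x is None:
        return acc
    val, nxt = x
    acc2 = acc + c.call(f, val)            # k + 1 steps
    return c.call(sumf, f, nxt, acc2)      # 1 step plus the rest of the list


def sumf_body_cost(f, values):
    c = StepCounter()
    result = sumf(c, f, make_list(values), 0)
    return result, c.steps
```

For the one-element list from the slide, `sumf_body_cost(add3, [5])` reports 5 steps, which is 1*(3+2); the caller's own call adds the final + 1.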
Size • What if the time of f is dependent? uint sumf (uint f(uint y)<y>; struct list* x; uint acc) <length(x)*(??+2)> { if (x == NULL) return acc; else { uint acc2 = acc + f (x->val); struct list* x2 = x -> next; return sumf (f, x2, acc2); } }
User-defined size • Need a programming language to express the mapping between datatypes and the time to iterate over them • Expressive enough to represent structured data, and functions over that data • Not so expressive that equivalence is undecidable • Need a way to connect dynamic data with a representation in this language
Decidable, Expressive Language • Typed lambda calculus with products, sums and primitive recursion over inductive types • Syntax of the functional programming language ML • Terminology • Dynamic language -- Type-safe C • Static language -- this annotation language
Static Language • Natural numbers (of type nat) and arithmetic operations • 3+4 , 5*x • Higher-order functions • fn (x : nat) => (fn (y : nat) => x+y) • (of type nat -> (nat -> nat)) • Tuples • (3,5) : nat * nat
Static Language • Sums and recursive types notated with datatypes datatype bool = False | True fun not (b:bool) = case b of True => False | False => True
Primitive Recursion • datatypes can recursively mention their own name only in positive positions datatype foo = Bar of foo | Baz of foo * foo datatype foo = Bar of foo -> foo datatype foo = Bar of (foo -> int) -> foo
Primitive Recursion • Recursive functions over these datatypes can only call themselves on subterms of their arguments datatype foo = Bar of foo | Baz of foo * foo fun iter (x : foo) = case x of Bar(w) => iter(w) | Baz(y,z) => iter(y)
List Representation datatype list = Null | Cons of nat * list fun time(m : list) = case m of Null => 0 | Cons(val, next) => val+2+time(next)
Decision Procedure • We must be able to decide if two terms in the static language are equivalent • Algorithm: convert each term to a normal form and compare • Need a reduction system for terms that is confluent and strongly normalizing
Reduction Rules • Sample Reduction rules • 3 + 4 --> 7 • M + 0 --> M • case Ci M of C1 x1 => N1 | C2 x2 => N2 --> Ni [M/xi] • (fun f x => M ) N --> M[N/x, (fun f x => M)/f]
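The "normalize and compare" idea can be illustrated on a tiny fragment of the static language. The sketch below covers only the two arithmetic sample rules (constant folding and M + 0 --> M); the term encoding (ints, variable names as strings, and `("+", a, b)` tuples) is an assumption of this example, and it says nothing about case or function application, confluence, or strong normalization of the full system.

```python
def normalize(t):
    """Reduce a term built from ints, variable names and ("+", a, b)."""
    if isinstance(t, tuple) and t[0] == "+":
        a = normalize(t[1])
        b = normalize(t[2])
        if isinstance(a, int) and isinstance(b, int):
            return a + b               # 3 + 4 --> 7
        if b == 0:
            return a                   # M + 0 --> M
        return ("+", a, b)             # no rule applies: already normal
    return t                           # constants and variables are normal


def equivalent(s, t):
    """Decide equivalence by comparing normal forms."""
    return normalize(s) == normalize(t)
```

For example, `equivalent(("+", "x", ("+", 0, 0)), "x")` holds because the right operand first folds to 0 and the M + 0 rule then applies.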
Connecting the Languages • We must be able to use this static language to describe the values of the dynamic language • Use the type system to enforce that a dynamic term matches a static description
Singleton Types • nat represents unsigned ints • Connect constants in the two languages • If m : nat , form singleton type uint<m> uint<3> x; x = 3; // well-typed x = 4; // rejected: 4 does not have type uint<3>
Using fact • New type of factorial uint fact(uint<m> x)<m>; fact(3); // takes time 3+1 uint<n> x; fact(x); // takes time n+1
Pointer Types • Consider pointer types • int* Either a reference to an int or null • int@ Must be a reference • int<0> Must be a null pointer • Want to refine the type of a variable // x has type int* if (x == NULL) { // x has type int<0> } else { // x has type int@ }
Enforcement types • Static representation of integer pointers datatype ptr = Null | Ptr intptr(m) = case m of Null => int<0> | Ptr => int@ • If x : intptr(Ptr) then we know x is not NULL
Refinement // suppose x : intptr(m) if (x == NULL) { // here we know that x : int<0> // so therefore m must be Null, or // we’d get a contradiction } else { // know that m is Ptr }
List Enforcement Type datatype list = Null | Cons( nat, list) replist(m) = case m of Null => int<0> | Cons(val,next) => struct { const uint<val> val; const replist(next) rest }@
Using Enforcement Types // if x has type replist(m) if (x == NULL) { // again x : int<0> } else { // m must be Cons (val, next) // x:{const uint<val> val; // const replist(next) rest }@ } • We’ve used a comparison in the dynamic code to increase our knowledge of the static representation
User-defined Size • Iterate over list, calculating a nat to represent execution time fun time(m : list) = case m of Null => 0 | Cons(val, next) => val+2+time(next)
Example: Code uint sumf (uint f(uint y)<y>; replist(m) x; uint acc) <time(m)> { if (x == NULL) return acc; else { // m must be Cons(val, next) // call to f takes time val + 1 uint acc2 = acc + f (x->val); struct list* x2 = x -> next; // recursive call takes time(next) + 1 return sumf (f, x2, acc2); } }
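The claim that sumf runs in time(m) can be sanity-checked by running an instrumented version next to the static size function. In this sketch `time` mirrors the slide's definition, while `StepCounter`, `noop`, `burn` (a stand-in for an f that costs exactly y steps on argument y) and `make_list` are hypothetical helpers. It is a run-time comparison, not the type-level proof the talk describes.

```python
class StepCounter:
    """Counts one step per function call."""

    def __init__(self):
        self.steps = 0

    def call(self, f, *args):
        self.steps += 1
        return f(self, *args)


def make_list(values):
    """Linked list of (val, next) pairs; None stands for Null."""
    node = None
    for v in reversed(values):
        node = (v, node)
    return node


def time(m):
    """Static size: Null => 0 | Cons(val, next) => val + 2 + time(next)."""
    if m is None:
        return 0
    val, nxt = m
    return val + 2 + time(nxt)


def noop(c):
    return None


def burn(c, y):
    """Stand-in for f: its body makes exactly y calls, so it costs <y>."""
    for _ in range(y):
        c.call(noop)
    return y


def sumf(c, f, x, acc):
    if x is None:
        return acc
    val, nxt = x
    acc2 = acc + c.call(f, val)            # val + 1 steps
    return c.call(sumf, f, nxt, acc2)      # time(next) + 1 steps


def measured(values):
    c = StepCounter()
    sumf(c, burn, make_list(values), 0)
    return c.steps
```

For the list 2, 0, 3 both `time(make_list([2, 0, 3]))` and `measured([2, 0, 3])` give 11.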
Other Resources • “Effect notation” int f(int)<3> doesn’t generalize to resources that can be recovered (e.g. space) • Alternative: Augment the operational semantics with a virtual clock that winds down as the program executes
Virtual Clocks • Function types specify starting and ending times • (int,12) f(int, 15) starts at time 15 and finishes at time 12 • Use polymorphism to abstract starting times • (int, n) f(int,n+3) runs in 3 steps - it is equivalent to int f(int)<3>
Recoverable Resources Consider: (int, n+12) f (int, n+15) vs. (int, n) f (int,n+3) If the resource is free-space: • the first function may allocate as many as 15 units, as long as it releases 12 of them before returning. • The second function only requires a minimum of 3 units of free-space to execute.
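The difference between the two signatures shows up once the starting amount of free space is small. Below is a minimal simulation, assuming a single counter of free units; `Space`, `OutOfSpace`, `first` and `second` are hypothetical names, and the allocation pattern inside `first` is just one behaviour consistent with its type.

```python
class OutOfSpace(Exception):
    """Raised when an allocation exceeds the free space (hypothetical name)."""


class Space:
    """Tracks free units of a recoverable resource."""

    def __init__(self, free):
        self.free = free

    def alloc(self, units):
        if units > self.free:
            raise OutOfSpace("not enough free space")
        self.free -= units

    def release(self, units):
        self.free += units


def first(space):
    # (int, n+12) f (int, n+15): needs 15 units up front, gives 12 back.
    space.alloc(15)
    space.release(12)


def second(space):
    # (int, n) f (int, n+3): needs only 3 units.
    space.alloc(3)


def can_run(fn, free):
    """Return the free space left after fn, or None if it cannot run."""
    space = Space(free)
    try:
        fn(space)
    except OutOfSpace:
        return None
    return space.free
```

Both functions leave 17 units when started with 20, but with only 10 units available `second` succeeds and `first` does not, even though each has a net consumption of 3.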
Upper Bound • Sometimes it is enough just to know an upper bound of the running time. • It is an approximation to help when static analysis fails. • Add the instruction waste to the language to advance the virtual clock • No run-time effect
Waste example bool member (int x; replist(m) w) <length(m)>{ if (w == NULL) return false; else // m=Cons( val , next) if (x == w->val) { waste <next>; return true; } else { return member( x, w->next ); } }
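The role of waste is to make the early-exit branch cost the same as a full traversal, so that one annotation covers both outcomes. The sketch below simulates that at run time. The amount wasted here, length(next) + 1, is an assumption of this sketch: it is what makes the total equal length(m) when only function calls are charged, and it is not taken from the slide. `StepCounter`, `make_list` and `length` are hypothetical helpers.

```python
class StepCounter:
    """Counts one step per function call, plus explicitly wasted steps."""

    def __init__(self):
        self.steps = 0

    def call(self, f, *args):
        self.steps += 1
        return f(self, *args)

    def waste(self, n):
        # Advances the clock only; does no other work.
        self.steps += n


def make_list(values):
    node = None
    for v in reversed(values):
        node = (v, node)
    return node


def length(x):
    n = 0
    while x is not None:
        n += 1
        x = x[1]
    return n


def member(c, x, w):
    if w is None:
        return False
    val, nxt = w
    if x == val:
        c.waste(length(nxt) + 1)       # assumed amount, see the lead-in
        return True
    return c.call(member, x, nxt)      # 1 step plus the cost of the tail


def member_cost(x, values):
    c = StepCounter()
    found = member(c, x, make_list(values))
    return found, c.steps
```

Whether the element is first, in the middle, or absent, a three-element list costs 3 steps in this model.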
TALres • We have implemented a similar system within the Typed Assembly Language framework • Clock contained in a virtual register, decremented on backwards jumps • TAL already has a sophisticated type constructor language • Added sum and inductive kinds and refinement procedure for comparison instructions • Operations on static natural numbers
Source Language • Prototype implementation: PopCron • Resembles C + timing annotations • No separation between static and dynamic languages • Compiler to TALres • creates representations of datatypes and their enforcement types
The real details • Static and dynamic language are really two levels of the same language • Static language is embedded in the dynamic language as a language of type constructors • Types of dynamic language are constants in the static language • Types of static language are referred to as kinds
More details • Building block for refinement is “virtual case analysis”
Another Example • Dynamic trees union tree { uint Leaf; struct node Node; } struct node { const tree@ left; const tree@ right; } • Static representation datatype tree = Leaf of nat | Node of tree * tree
Tree example fun size (t : tree) = case t of Leaf(x) => x +1 | Node (left, right) => size(left) + size(right) + 2 uint sumf (uint f(uint<k>)<k>; reptree(t) t) <size(t)> { switch t { case Leaf(x): return f(x); case Node(x): return sumf(f, x.left) + sumf(f, x.right); } }
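The size function for trees follows the same accounting as for lists: a leaf costs x + 1 (the call to f plus f's own x steps) and a node costs two recursive calls plus the sizes of both subtrees. A run-time comparison is sketched below, assuming trees are encoded as `("Leaf", x)` and `("Node", left, right)` tuples; `StepCounter`, `noop`, `burn` and `measured` are hypothetical helpers.

```python
class StepCounter:
    """Counts one step per function call."""

    def __init__(self):
        self.steps = 0

    def call(self, f, *args):
        self.steps += 1
        return f(self, *args)


def size(t):
    """Leaf(x) => x + 1 | Node(left, right) => size(left) + size(right) + 2."""
    if t[0] == "Leaf":
        return t[1] + 1
    _, left, right = t
    return size(left) + size(right) + 2


def noop(c):
    return None


def burn(c, k):
    """Stand-in for f: costs exactly k steps on argument k."""
    for _ in range(k):
        c.call(noop)
    return k


def sumf(c, f, t):
    if t[0] == "Leaf":
        return c.call(f, t[1])                         # x + 1 steps
    _, left, right = t
    return c.call(sumf, f, left) + c.call(sumf, f, right)


def measured(t):
    c = StepCounter()
    total = sumf(c, burn, t)
    return total, c.steps
```

For the tree Node(Leaf 2, Node(Leaf 0, Leaf 3)), `size` gives 12 and the instrumented run also uses 12 steps.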
Enforcement type for trees reptree (tree) = case tree of Leaf x => union { uint<x> Leaf; empty Node;} | Node (left, right) => union { empty Leaf; struct { const reptree(left)@ left; const reptree(right)@ right; } Node; }
Virtual Case switch t { case Leaf(x): // Suppose at runtime we could examine the static // representation case t of Leaf (x) => This is the only branch that could occur | Node (x) => …as here x is of type empty
Virtual Case switch t { case Leaf(x): // Since one branch is impossible we don’t need to // specify it vcase t of Leaf(x) => // binds x in following code • This case statement is virtual because we know what branch will be taken -- no run time effect