System F with type equality coercions. Simon Peyton Jones (Microsoft) Manuel Chakravarty (University of New South Wales) Martin Sulzmann (University of Singapore). Type directed compilation. GHC “core” language. GHC compiles Haskell to a typed intermediate language: System F
System F with type equality coercions Simon Peyton Jones (Microsoft) Manuel Chakravarty (University of New South Wales) Martin Sulzmann (University of Singapore)
Type-directed compilation
GHC “core” language
• GHC compiles Haskell to a typed intermediate language: System F
• Two big advantages
  • Some transformations are guided by types
  • Type-checking Core is a strong check on correctness of transformations
[Diagram: Haskell → System F (Optimise) → C/C--/x86]
Type-directed compilation
[Diagram: Haskell, the gorilla (dozens of data types; hundreds of constructors) → System F (two data types, ten constructors; Optimise) → C/C--/x86]
System F: the mighty midget Polymorphic types, function definitions System F
System F: the mighty midget Algebraic data types, pattern matching, list comprehensions, Polymorphic types, function definitions System F + data types
System F: the mighty midget Functional dependencies Type classes Algebraic data types, pattern matching, list comprehensions, Polymorphic types, function definitions System F + data types
System F: the mighty midget
  data T = forall a. MkT a (a → Int)
Example of use:
  case x of MkT v f → f v
Type of MkT is
  MkT :: ∀a. a → (a → Int) → T
Existential data types Functional dependencies Type classes Algebraic data types, pattern matching, list comprehensions, Polymorphic types, function definitions
System F + (existential) data types
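The existential example on this slide can be tried directly in GHC. The following is a minimal sketch, assuming the ExistentialQuantification extension; the names useT and examples are made up for illustration and are not from the talk.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- The constructor hides the type 'a'; a client can only combine the
-- packaged value with the packaged function.
data T = forall a. MkT a (a -> Int)

-- Pattern matching brings the hidden type into scope just long enough
-- to apply f to v.
useT :: T -> Int
useT x = case x of
  MkT v f -> f v

-- Two packages with different hidden types live in the same list.
-- (Hypothetical sample data.)
examples :: [T]
examples = [MkT (3 :: Int) (+ 1), MkT "hello" length]
```

Evaluating map useT examples applies each packaged function to its own packaged value, which is all the type system allows a client to do with a T.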
System F: the fatter midget GADTs Existential data types Functional dependencies Type classes Algebraic data types, pattern matching, list comprehensions, Polymorphic types, function definitions System F + GADTs
The problem with F Associated types GADTs Existential data types Functional dependencies Type classes Algebraic data types, pattern matching, list comprehensions, Polymorphic types, function definitions SPLAT
A practical problem
• GHC uses System F + (existential data types) as its intermediate language
• GADTs are already a Big Thing to add to a typed intermediate language
• Practical problem: exprType :: Expr -> Type doesn’t have a unique answer any more
• Associated types are simply a bridge too far
What to do?
What bits don’t “fit”? • The stuff that doesn’t “fit” is the stuff that involves non-syntactic type equality • Functional dependencies • GADTs • Associated types
GADTs
Also known as “Inductive type families” and “Guarded recursive data types” [Xi POPL’03]
  data Exp a where
    Zero :: Exp Int
    Succ :: Exp Int -> Exp Int
    Pair :: Exp b -> Exp c -> Exp (b, c)

  eval :: Exp a -> a
  eval Zero       = 0
  eval (Succ e)   = eval e + 1
  eval (Pair x y) = (eval x, eval y)
• In the Zero branch, a = Int
• In the Pair branch, a = (b, c)
• But how can we express that in System F?
Functional dependencies [Jones ESOP’00]
  class Collects c e | c -> e where
    empty  :: c
    insert :: e -> c -> c
  instance Eq e => Collects [e] e where ...
  instance Collects BitSet Char where ...
• Originally designed to guide inference; fundeps cause extra unifications (“improvement”) to take place
• Absolutely no impact on intermediate language
• BUT some nasty cases cannot be translated to F
  class Wuggle c where
    nasty :: (Collects c e) => c -> e -> c
  instance Wuggle BitSet where ...Argh!...
    nasty :: forall e. Collects BitSet e => BitSet -> e -> BitSet    (e can only be Char)
Associated types [POPL’05, ICFP’05]
  class Collects c where
    type Elem c
    empty  :: c
    insert :: Elem c -> c -> c
  instance Eq e => Collects [e] where
    type Elem [e] = e; ...
  instance Collects BitSet where
    type Elem BitSet = Char; ...
  foo :: Char -> BitSet
  foo x = insert x empty
• Elem is a type function, mapping the collection type to the associated element type
• The original AT papers gave a translation into F by adding type parameters (a la fundep solution)
• But that is (a) deeply, disgustingly awkward, and (b) suffers from similar nasty cases as fundeps
• Can we support ATs more directly?
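For readers who want to run the associated-type example, here is a minimal sketch using GHC's TypeFamilies extension. The slide elides the instance bodies and never defines BitSet, so the list-backed BitSet newtype and the nub-based insert below are assumptions chosen only to make the example self-contained.

```haskell
{-# LANGUAGE TypeFamilies #-}

import Data.List (nub)

class Collects c where
  type Elem c
  empty  :: c
  insert :: Elem c -> c -> c

-- For lists, the associated element type is the list's element type.
instance Eq e => Collects [e] where
  type Elem [e] = e
  empty = []
  insert x xs = nub (x : xs)   -- assumed body: keep elements unique

-- Hypothetical stand-in for the slide's BitSet: a wrapper around [Char].
newtype BitSet = BitSet [Char] deriving (Eq, Show)

instance Collects BitSet where
  type Elem BitSet = Char
  empty = BitSet []
  insert x (BitSet xs) = BitSet (nub (x : xs))

-- Type-checks only because the checker knows Elem BitSet = Char.
foo :: Char -> BitSet
foo x = insert x empty
```

The point of interest is foo: its argument has type Char, while insert expects an Elem BitSet, so accepting it relies on exactly the non-syntactic type equality that the talk is about.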
The happy answer: FC FC extends System F in a way that... • ... is much more modest than adding GADTs • ...supports GADTs • ...and ATs • ...and the nasty cases of fundeps • ...and perhaps other things besides
Two ingredients • Type-equality coercions: a very well-understood idea in the Types community, but details are interesting. • Abstract type constructors and coercion constants: perhaps not so familiar
FC in action
Source:
  data Exp a where
    Zero :: Exp Int
    Succ :: Exp Int -> Exp Int
    Pair :: Exp b -> Exp c -> Exp (b, c)

  eval :: Exp a -> a
  eval Zero       = 0
  eval (Succ e)   = eval e + 1
  eval (Pair x y) = (eval x, eval y)
In FC:
  data Exp a where
    Zero : ∀a. (a ~ Int) => Exp a
    Succ : ∀a. (a ~ Int) => Exp Int -> Exp a
    Pair : ∀a. ∀b c. (a ~ (b, c)) => Exp b -> Exp c -> Exp a
• Equality constraints express the extra knowledge we can exploit during pattern-matching
• Result type is always Exp a
• Type always starts “∀a”
• This part is very standard
• Some authors use this presentation for the source language (Sheard, Xi, Sulzmann)
FC in action
  data Exp a where
    Zero :: ∀a. (a ~ Int) => Exp a
    Succ :: ∀a. (a ~ Int) => Exp Int -> Exp a
    Pair :: ∀a. ∀b c. (a ~ (b, c)) => Exp b -> Exp c -> Exp a

  zero : Exp Int
  zero = Zero Int (refl Int)
  one : Exp Int
  one = Succ Int (refl Int) zero
Arguments in the definition of one:
• Int is an ordinary type argument
• (refl Int) is a coercion argument: “evidence” that Int ~ Int, since (refl Int) : Int ~ Int
• zero is an ordinary value argument
• We are passing equality evidence around: again, a very standard idea
FC in action
  data Exp a where
    Zero :: ∀a. (a ~ Int) => Exp a
    Succ :: ∀a. (a ~ Int) => Exp Int -> Exp a
    Pair :: ∀a. ∀b c. (a ~ (b, c)) => Exp b -> Exp c -> Exp a

  eval :: Exp a -> a
  eval = Λa. λ(x : Exp a).
    case x of
      Zero (g : a ~ Int) -> 0 ► (sym g)
      ...
• Pattern matching binds a coercion argument: g : a ~ Int, so (sym g) : Int ~ a
• Cast (►) exploits the coercion to change the type: 0 : Int, so (0 ► (sym g)) : a
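The evidence-passing style can be approximated in source Haskell with the equality type from Data.Type.Equality, where castWith plays the role of the cast (►) and sym reverses a proof. This is only a sketch of the idea: in FC the coercions are types and are erased, whereas here the evidence is an ordinary run-time value. The extra constructor field carrying the evidence is an encoding choice made for this sketch, not FC syntax.

```haskell
{-# LANGUAGE GADTs, TypeOperators #-}

import Data.Type.Equality ((:~:)(..), sym, castWith)

-- Every constructor returns 'Exp a' and carries explicit evidence about 'a',
-- mirroring the FC data type with equality constraints.
data Exp a where
  Zero :: (a :~: Int) -> Exp a
  Succ :: (a :~: Int) -> Exp Int -> Exp a
  Pair :: (a :~: (b, c)) -> Exp b -> Exp c -> Exp a

-- Each branch computes at the concrete type, then casts back to 'a'
-- using the reversed evidence, as in  0 ► (sym g).
eval :: Exp a -> a
eval (Zero g)     = castWith (sym g) 0
eval (Succ g e)   = castWith (sym g) (eval e + 1)
eval (Pair g x y) = castWith (sym g) (eval x, eval y)

-- Refl corresponds to (refl Int) on the slide.
zero :: Exp Int
zero = Zero Refl

one :: Exp Int
one = Succ Refl zero
```

For instance, eval (Pair Refl one zero) builds the pair at type (Int, Int) and casts it with a reflexive proof.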
Coercions are types!
A coercion is a type, not a term. e.g. refl Int : Int ~ Int
(refl Int) is a type whose kind is (Int ~ Int)
Reasons:
• Type erasure erases coercions
• Terms would allow bogus coercions: (letrec g = g in g) : Int ~ Bool
  The type language is strongly normalising; no letrec
Weird! A type has a kind, which mentions types...!
Abstract type constructors
Source:
  class Collects c where
    type Elem c
    empty  :: c
    insert :: Elem c -> c -> c
  instance Collects BitSet where
    type Elem BitSet = Char; ...
  instance Eq e => Collects [e] where
    type Elem [e] = e; ...
In FC:
  type Elem : * -> *
  data CollectsD c where
    CD : ∀c. c -> (Elem c -> c -> c) -> CollectsD c
• Abstract type constructor: says only “Elem is a type constructor”
• Class declaration generates a data type declaration as usual
Abstract type constructors
(Source class and instance declarations as on the previous slide.)
  type Elem :: * -> *
  data CollectsD c where
    CD : ∀c. c -> (Elem c -> c -> c) -> CollectsD c

  coercion cBitSet : Elem BitSet ~ Char
  dBitSet : CollectsD BitSet
  dBitSet = CD BitSet (...) (...)
• Instance decl generates a top-level coercion constant, witnessing that Elem BitSet = Char
Abstract type constructors
(Source class and instance declarations as on the previous slides.)
Source:
  foo :: Char -> BitSet
  foo x = insert x empty
In FC:
  type Elem :: * -> *
  data CollectsD c where
    CD : ∀c. c -> (Elem c -> c -> c) -> CollectsD c

  coercion cBitSet : Elem BitSet ~ Char
  dBitSet : CollectsD BitSet
  dBitSet = CD BitSet (...) (...)

  foo : Char -> BitSet
  foo x = insert BitSet dBitSet (x ► (sym cBitSet)) (empty BitSet dBitSet)
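The translation of foo can be imitated by hand in source Haskell by passing the dictionary explicitly. The sketch below makes several assumptions: BitSet is a hypothetical list-backed newtype, the selector names emptyD and insertD are invented, the dictionary fields are given made-up bodies where the slide writes (...), and the coercion constant is modelled as a term-level Refl rather than a type-level coercion as in FC.

```haskell
{-# LANGUAGE TypeFamilies, TypeOperators #-}

import Data.Type.Equality ((:~:)(..), sym, castWith)

-- Abstract type constructor: says only that Elem maps types to types.
type family Elem c

-- The class becomes an ordinary data type holding its methods.
data CollectsD c = CD c (Elem c -> c -> c)

-- Hypothetical selectors standing in for the FC-level 'empty' and 'insert'.
emptyD :: CollectsD c -> c
emptyD (CD e _) = e

insertD :: CollectsD c -> Elem c -> c -> c
insertD (CD _ i) = i

-- Hypothetical stand-in for the slide's BitSet.
newtype BitSet = BitSet [Char] deriving (Eq, Show)

-- The instance contributes the equation Elem BitSet = Char ...
type instance Elem BitSet = Char

-- ... and a witness playing the role of the coercion constant cBitSet.
cBitSet :: Elem BitSet :~: Char
cBitSet = Refl

-- Dictionary for the BitSet instance (field bodies are assumptions).
dBitSet :: CollectsD BitSet
dBitSet = CD (BitSet []) (\x (BitSet xs) -> BitSet (x : xs))

-- Mirrors: insert BitSet dBitSet (x ► (sym cBitSet)) (empty BitSet dBitSet)
foo :: Char -> BitSet
foo x = insertD dBitSet (castWith (sym cBitSet) x) (emptyD dBitSet)
```

In source Haskell the cast is redundant, because GHC's type checker already reduces Elem BitSet to Char; it is kept here only to show where the coercion appears in the FC term.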
Abstract type constructors
(Source class and instance declarations as on the previous slides.)
  type Elem :: * -> *
  data CollectsD c where
    CD : ∀c. c -> (Elem c -> c -> c) -> CollectsD c

  coercion cList : ∀(e:*). Elem [e] ~ e
  dList : ∀e. CollectsD e -> CollectsD [e]
  dList = ...
• cList is a type-parameterised coercion constant
• ...and dList is a type- and value-parameterised dictionary function
A worry
• What is to stop us saying this?
    coercion utterlyBogus : Int ~ Bool
• Result: seg-fault city
• Answer: the top-level coercion constants must be consistent
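Terms: utterly unremarkable [Syntax figure lost. Callouts: “Used for coercion and application”; “Only interesting feature”]
Types [Syntax figure lost. Callouts: “Coercions”; “Ordinary types”]
Types [Syntax figure lost. Callouts: “Abstract types (must be saturated)”; “Data types”; “Full blown type application”]
Types [Syntax figure lost. Callouts: “Various forms of coercions”; “Coercions are types”]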
Slightly more surprising
• If I know (Tree a ~ Tree b) then I know that (a ~ b)
• Absolutely necessary to translate Haskell programs with GADTs
• Only true for injective type constructors
• Not true (in general) for abstract type constructors
• That’s why applications of abstract type constructors must be saturated
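The decomposition rule for injective constructors can be observed in GHC with the equality type from Data.Type.Equality. This is a minimal sketch; the Tree definition and the names decompose and convert are assumptions made for illustration.

```haskell
{-# LANGUAGE GADTs, TypeOperators #-}

import Data.Type.Equality ((:~:)(..), castWith)

-- An ordinary data type; its type constructor is injective.
-- (Hypothetical definition; the slide does not show one.)
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- From evidence that Tree a ~ Tree b, recover evidence that a ~ b.
-- Matching on Refl lets the type checker decompose the equality.
decompose :: (Tree a :~: Tree b) -> (a :~: b)
decompose Refl = Refl

-- Use the recovered evidence to cast a value.
convert :: (Tree a :~: Tree b) -> a -> b
convert g = castWith (decompose g)
```

Writing the same function with a type family such as Elem in place of Tree should be rejected by the type checker, since Elem a ~ Elem b does not imply a ~ b in general.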
Consistency • This says that if a coercion relates two data types then they must be identical • This condition is both necessary and sufficient for soundness of FC (proof in paper)
Operational semantics • Largely standard (good) but with some interesting wrinkles A “cvalue” is a plain value possibly wrapped in a cast Coercions are never evaluated at all
Not so standard Combine the coercions
Not so standard
Move a coercion on the lambda to a coercion on its argument and result
  γ  : (σ1 -> σ2) = (u1 -> u2)
  γ1 : u1 = σ1
  γ2 : σ2 = u2
• Evaluation carries out proof maintenance
• Proof terms get bigger and bigger; but can be simplified:
    sym (sym g) = g
    sym (refl t) = refl t
    etc
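The two simplification rules quoted on this slide can be sketched as a small rewriting function. The Co data type below is a made-up, much-reduced syntax of coercion proofs (types are represented only by their names); it is not the coercion language of FC or of GHC.

```haskell
-- A tiny, hypothetical syntax of coercion proofs.
data Co
  = Refl String   -- refl t, with the type t given by name
  | Var String    -- a coercion variable such as g
  | Sym Co        -- sym g
  deriving (Eq, Show)

-- Simplify bottom-up using the slide's two rules.
simplify :: Co -> Co
simplify (Sym c) =
  case simplify c of
    Sym c' -> c'        -- sym (sym g) = g
    Refl t -> Refl t    -- sym (refl t) = refl t
    c'     -> Sym c'    -- nothing to simplify
simplify c = c
```

Because the argument is simplified first, nested applications collapse as well: three nested sym applications around a variable reduce to a single one.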
Results: FC itself • FC is sound: well-typed FC programs do not go wrong • Type erasure does not change the operational behaviour
Results: consistency • Consistency is a bit like confluence; in general, it’s difficult to prove for a particular program, but whole sub-classes of programs may be consistent by construction • If G contains no coercion constants then G is consistent. This is the GADT case. • If a set of coercion constants in G form a confluent, terminating rewrite system, then G is consistent. This is the AT case.
Translating into FC • The translation of both GADTs and ATs is almost embarrassingly simple • During type inference, any use of equality constraints must be expressed as a coercion in the corresponding FC program
Conclusions • Feels “right” • Next: implement FC in GHC • Implications for the source language? • Type equalities have the same flavour as sharing constraints in ML modules. Discuss. (e.g. could ML modules be translated into FC?) Paper Apr 2006 (not yet rejected by ICFP’06) http://research.microsoft.com/~simonpj