This document explores type theory concepts applied to memory allocation and data layout, focusing on how high-level abstractions relate to low-level memory representations. It discusses the implications of different memory layouts, initialization processes, and the importance of understanding adjacency and indirection in data structures. The text illustrates how ordered type theory manages memory properties, ensuring linearity and adjacency, and how explicit allocation can enhance type safety and manage resources effectively.
A Type Theory for Memory Allocation and Data Layout
Leaf Petersen, Robert Harper, Karl Crary and Frank Pfenning
Carnegie Mellon University
Views of data
• High-level languages
• Abstract view of data, characterized by operations
• e.g. pairs:
• Introduction: (e1,e2) : τ1 × τ2
• Elimination: fst e : τ1 , snd e : τ2
• Low-level languages
• Concrete view of data, characterized by layout in memory
• e.g. C structs:
• Contiguous layout
• Memory size determined by type
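As a rough illustration of the two views, the sketch below uses Python's standard `ctypes` module to contrast an abstract pair (a tuple with `fst`/`snd` projections) with a C-style struct whose size and field offsets are fixed by its type. The names `IntPair`, `fst`, and `snd` and the choice of 32-bit fields are assumptions made for this example; the slides do not prescribe them.

```python
import ctypes
import sys

# High-level view: a pair is characterized by its introduction and elimination forms.
def fst(p):
    return p[0]

def snd(p):
    return p[1]

abstract_pair = (3, 4)

# Low-level view: a hypothetical C-style struct of two 32-bit ints.
class IntPair(ctypes.Structure):
    _fields_ = [("fst", ctypes.c_int32), ("snd", ctypes.c_int32)]

concrete_pair = IntPair(3, 4)
size = ctypes.sizeof(IntPair)                       # determined by the type alone
offsets = (IntPair.fst.offset, IntPair.snd.offset)  # fields sit next to each other
raw = bytes(concrete_pair)                          # the bytes as laid out in memory
second = int.from_bytes(raw[4:8], sys.byteorder, signed=True)
```

The tuple says nothing about where its components live, while `size`, `offsets`, and `raw` expose exactly the layout information that the type theory in these slides aims to track.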
Data layout
• Usually programmers don’t care
• But sometimes have to
• Marshalling, interaction with low-level devices, precise control of initialization, interoperability
• Generally no type safety
• Compilers have to care
• Represent high-level data abstractions
• Allocation and initialization code
[Figure: three memory layouts for (3,(4,5)) : int × (int × int)]
Type theory for data layout
• Expose the fine structure
• Expose memory layout in types
• Implementation choices explicit
• High-level object types defined in terms of low-level memory types
• High-level operations on objects broken down into low-level operations on memory
• What is the fine structure of memory?
Initialization
• Data objects
• Created by initializing raw memory.
• Initialization changes types
• e.g. from ns (the type of raw, un-initialized memory) to int
• Commonly dealt with via linearity
• New memory is linear
• No aliases
• Linear type theory handles re-typing
Adjacency
• Memory provides a primitive notion of adjacent items: e.g. 3 next to 4.
• Large objects composed of adjacent smaller objects
• Sub-objects referenced by offsets or interior pointers.
Associativity
• Adjacency is associative: the same memory layout is described by:
• (3 next to 4) next to 5
• 3 next to (4 next to 5)
• But not commutative!
• 3 next to 4 ≠ 4 next to 3
Indirection
• Not all objects are adjacent
• Memory supports a notion of indirection (pointers or labels).
• Refer to non-adjacent data via indirection
• 3 next to (pointer to (4 next to 5))
Ordered Type Theory
• Linear type theory handles initialization
• Doesn’t capture other memory properties
• Ordered type theory
• Variables used exactly once (linear)
• Variables may not be permuted.
• Adjacent variables remain adjacent
• No weakening, contraction, or exchange.
• Claim: Ordered constructs admit a natural interpretation as adjacency and indirection.
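The difference between the linear and the ordered discipline can be sketched as a check on the sequence of variable uses in a term. This is only a toy model under stated assumptions: `linear_ok` and `ordered_ok` are hypothetical helper names, a context is a list of distinct variable names, and a term is reduced to the left-to-right list of variables it uses. The actual type system enforces these properties through its typing rules, not through a runtime check.

```python
from collections import Counter

def linear_ok(context, uses):
    """Linear discipline: every variable used exactly once, in any order."""
    return Counter(uses) == Counter(context)

def ordered_ok(context, uses):
    """Ordered discipline: every variable used exactly once, in context order."""
    return list(uses) == list(context)

ctx = ["a1", "a2", "a3"]

exchange = ["a2", "a1", "a3"]           # permutes two variables
weakening = ["a1", "a2"]                # drops a3
contraction = ["a1", "a1", "a2", "a3"]  # uses a1 twice
```

In this model `exchange` passes the linear check but fails the ordered one, while `weakening` and `contraction` fail both.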
Variables and Resources
• Typing judgments:
• Ordering of x’s does not matter.
• Unrestricted variables, bound to small objects
• Ordering and usage of a’s does matter.
• Bound to memory
• Adjacent variables bound to adjacent memory
Ordered product
• Ordered product (fuse):
• Ordered products model adjacency
Adjacency
• 3 next to 4
• 3 ∙ 4 : int ∙ int
• 3 next to 4 next to 5
• 3 ∙ (4 ∙ 5) : int ∙ (int ∙ int)
• (3 ∙ 4) ∙ 5 : (int ∙ int) ∙ int
Memory properties
• Associativity:
• (τ1 ∙ τ2) ∙ τ3 and τ1 ∙ (τ2 ∙ τ3) are isomorphic
• Functions witness isomorphism
• Non-commutativity:
• τ1 ∙ τ2 and τ2 ∙ τ1 are not isomorphic
• No function mapping one to the other (in general)
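A minimal sketch of the witnessing functions, assuming nested Python tuples stand in for fused values and a hypothetical `layout` helper flattens them into left-to-right word order. Python tuples are ordinary products, so a swap function can of course be written here; the point of the sketch is only that re-association leaves the layout unchanged while swapping does not.

```python
def assoc_right(t):
    """Map (a . b) . c to a . (b . c)."""
    (a, b), c = t
    return (a, (b, c))

def assoc_left(t):
    """Map a . (b . c) to (a . b) . c."""
    a, (b, c) = t
    return ((a, b), c)

def layout(t):
    """Flatten a nested fuse (modelled as nested tuples) into its word order."""
    if isinstance(t, tuple):
        return [w for part in t for w in layout(part)]
    return [t]

left_nested = ((3, 4), 5)
right_nested = (3, (4, 5))
```

`assoc_right` and `assoc_left` are mutually inverse and preserve `layout`, whereas `layout((3, 4))` and `layout((4, 3))` differ.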
Indirection
• Ordered modality models indirection
• !M : !τ corresponds to a pointer to M
• Non-linear, un-ordered term
[Figure, repeated: three memory layouts for (3,(4,5)) : int × (int × int)]
(3,(4,5)) : int × (int × int)
• int × (int × int) ⇝ !(int ∙ !(int ∙ int))
• (3,(4,5)) ⇝ !(3 ∙ !(4 ∙ 5))
(3,(4,5)) : int × (int × int)
• int × (int × int) ⇝ !(!int ∙ !(!int ∙ !int))
• (3,(4,5)) ⇝ !(!3 ∙ !(!4 ∙ !5))
(3,(4,5)) : int × (int × int)
• int × (int × int) ⇝ !(int ∙ (int ∙ int))
• (3,(4,5)) ⇝ !(3 ∙ (4 ∙ 5))
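The three translations of (3,(4,5)) can be compared in a toy heap model. The sketch assumes a word-addressed heap in which every int and every pointer occupies one word; `Heap`, `Ptr`, and the three builder functions are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ptr:
    addr: int  # index of the first word of the target object

class Heap:
    def __init__(self):
        self.words = []

    def alloc(self, *fields):
        """Place the fields in adjacent words; return a pointer to the first."""
        addr = len(self.words)
        self.words.extend(fields)
        return Ptr(addr)

def nested_boxed():
    """!(3 . !(4 . 5)): the inner pair sits behind a pointer."""
    h = Heap()
    inner = h.alloc(4, 5)
    return h, h.alloc(3, inner)

def fully_boxed():
    """!(!3 . !(!4 . !5)): every component sits behind a pointer."""
    h = Heap()
    p3, p4, p5 = h.alloc(3), h.alloc(4), h.alloc(5)
    inner = h.alloc(p4, p5)
    return h, h.alloc(p3, inner)

def flat():
    """!(3 . (4 . 5)): all three ints are adjacent."""
    h = Heap()
    return h, h.alloc(3, 4, 5)
```

Under these assumptions the same high-level value costs 4 words in the nested layout, 7 in the fully boxed layout, and 3 in the flat one, which is the kind of implementation choice the types are meant to make explicit.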
Explicit Allocation
• Ordered type theory
• Fine structure of data layout
• But not allocation
• For example: !(x ∙ x)
• Each time x is instantiated, new object
• Initialized atomically
• Make allocation explicit
• Remove !M from syntax
• Add allocation primitives to introduce !τ
Memory Allocation
• A well-known allocation protocol for copying garbage collectors:
• Reserve: obtain raw, un-initialized space.
• Initialize: assign values to individual locations.
• Allocate: baptize some or all as valid objects.
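The three steps of the protocol can be sketched with a small bump allocator. Everything here is an assumption made for illustration: the class name `Nursery`, its method names, the `UNINIT` marker standing in for raw memory, and the runtime checks, which play the role that the type system plays in the slides.

```python
class Nursery:
    """Toy allocation area (hypothetical API, not taken from the slides)."""
    UNINIT = object()  # marks raw, un-initialized words

    def __init__(self, size):
        self.words = [Nursery.UNINIT] * size
        self.ap = 0        # first word that is not yet part of a valid object
        self.reserved = 0  # number of reserved words starting at ap

    def reserve(self, n):
        """Reserve: obtain n raw words, or fail if the area is exhausted."""
        if self.ap + n > len(self.words):
            raise MemoryError("a real collector would run a collection here")
        self.reserved = n

    def initialize(self, offset, value):
        """Initialize: write one word inside the reserved space."""
        if not 0 <= offset < self.reserved:
            raise IndexError("write outside the reserved space")
        self.words[self.ap + offset] = value

    def allocate(self, n):
        """Allocate: turn the first n reserved words into a valid object."""
        block = self.words[self.ap:self.ap + n]
        if n > self.reserved or any(w is Nursery.UNINIT for w in block):
            raise ValueError("cannot allocate un-reserved or un-initialized words")
        addr = self.ap
        self.ap += n
        self.reserved -= n
        return addr

nursery = Nursery(8)
nursery.reserve(2)
nursery.initialize(0, 1)
nursery.initialize(1, 2)
pair_addr = nursery.allocate(2)  # address of the pair (1, 2)
```

Calling `allocate` before every word has been initialized raises an error in this model; the type system described in the slides is meant to rule out such programs statically.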
Example: Memory Allocation
[Figure: heap diagram showing the Reserve, Initialize, and Allocate steps for x = (0,(1,2)), with pointers labeled AP and LP]
Memory Allocation
• Type system separates terms and expressions
• Terms M: no effects
• Expressions E: have effects
• Allocation is an effect
• Allocation primitives are expressions
Allocating a Pair
• Allocate (1,2):
• Reserve space at a.
• Create names for parts. Resource a is used up!
• Initialize a1, using it up. Re-introduce b1:int
• Fuse parts and allocate.
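The annotated steps can be mimicked with handles that may be consumed only once. This is a runtime approximation under stated assumptions, not the calculus from the slides: the class `Res` and the functions `reserve`, `split`, `init`, and `fuse_and_alloc` are hypothetical, and the names `a2` and `b2` extend the slide's `a`, `a1`, and `b1` by analogy.

```python
class Res:
    """A linear handle on adjacent words; consuming it twice is an error."""
    def __init__(self, heap, addr, types):
        self.heap, self.addr, self.types = heap, addr, tuple(types)
        self.live = True

    def consume(self):
        if not self.live:
            raise RuntimeError("linear resource used twice")
        self.live = False

def reserve(heap, n):
    """Reserve n raw words; each has type ns."""
    addr = len(heap)
    heap.extend([None] * n)
    return Res(heap, addr, ["ns"] * n)

def split(a):
    """Create names for the parts; the original resource is used up."""
    a.consume()
    return [Res(a.heap, a.addr + i, [t]) for i, t in enumerate(a.types)]

def init(a, value):
    """Initialize one word, using it up and re-introducing it at type int."""
    a.consume()
    a.heap[a.addr] = value
    return Res(a.heap, a.addr, ["int"])

def fuse_and_alloc(*parts):
    """Fuse adjacent, initialized parts and return the object's address."""
    for p in parts:
        p.consume()
    if any(t == "ns" for p in parts for t in p.types):
        raise TypeError("cannot allocate un-initialized memory")
    for left, right in zip(parts, parts[1:]):
        if left.addr + len(left.types) != right.addr:
            raise TypeError("parts are not adjacent in this order")
    return parts[0].addr

heap = []
a = reserve(heap, 2)
a1, a2 = split(a)        # a is now used up
b1 = init(a1, 1)         # a1 is used up, b1 has type int
b2 = init(a2, 2)
pair = fuse_and_alloc(b1, b2)
```

Re-using `a` or `a1` after this point raises an error, and fusing the parts in the wrong order is rejected because they are no longer adjacent in that order.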
Coalescing Reservation
• Allocate two pairs: (1,2) and (3,4)
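One way to read this slide is that two separate reservations can be merged into a single larger one without changing the resulting heap. The sketch below illustrates that reading with a hypothetical `Area` class that counts how many limit checks (reservations) are performed; the class, its methods, and the counter are assumptions made for the example.

```python
class Area:
    """Toy allocation area that counts reservation (limit) checks."""
    def __init__(self, size):
        self.words = [None] * size
        self.ap = 0
        self.reserved_until = 0
        self.limit_checks = 0

    def reserve(self, n):
        self.limit_checks += 1
        if self.ap + n > len(self.words):
            raise MemoryError("out of space")
        self.reserved_until = self.ap + n

    def alloc_pair(self, x, y):
        if self.ap + 2 > self.reserved_until:
            raise RuntimeError("space was not reserved")
        addr = self.ap
        self.words[addr:addr + 2] = [x, y]
        self.ap += 2
        return addr

def separate():
    area = Area(4)
    area.reserve(2)
    p = area.alloc_pair(1, 2)
    area.reserve(2)
    q = area.alloc_pair(3, 4)
    return area, p, q

def coalesced():
    area = Area(4)
    area.reserve(4)           # one reservation covers both pairs
    p = area.alloc_pair(1, 2)
    q = area.alloc_pair(3, 4)
    return area, p, q
```

Both versions end with the words `[1, 2, 3, 4]` and the same two addresses, but the coalesced version performs one limit check instead of two.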
Summary
• Type theory for describing data layout
• Adjacency requirements.
• Precise control over representations.
• Type system for allocation:
• Allocate raw memory.
• Initialize, destructively changing types.
• Ensures correct use of allocation protocol.
• Permits code motion optimizations.
What I’m not telling you
• It’s more subtle than it seems.
• Plain ordered λ-calculus doesn’t work.
• Need notion of size preserving terms, other refinements.
• For details see the paper
• Technical presentation and examples.
• Interpretation of a λ-calculus with pairs.
Current and Future Work
• POPL paper
• Only finite products
• Technical Report:
• Sums, recursive types, ordered functions.
• Extended coercion language.
• Ongoing
• Dynamic extent (arrays)
• Other allocation models
Conclusion
• Ordered type theory is a natural framework for modeling data layout.
• Low-level issues dealt with entirely realistically in a λ-calculus setting.
• Correctness of allocation and initialization protocols can be captured in the type system