Learn about good programming practices for building memory-efficient EDA applications. Discover why industrial code is often inefficient, how optimizing for memory can improve runtime, and key strategies for reducing memory usage. Topics include designing custom data structures, storing objects in a topological order, optional fanout representation, using integers instead of pointers, and avoiding linked lists.
Good Programming Practices for Building Less Memory-Intensive EDA Applications Alan Mishchenko University of California, Berkeley
Outline • Introduction • What is special about programming for EDA • Why much of industrial code is not efficient • Why saving memory also saves runtime • When to optimize for memory • Simplicity wins • Suggestions for improvement • Design custom data-structures • Store objects in a topological order • Make fanout representation optional • Use 4-byte integers instead of 8-byte pointers • Never use linked lists • Conclusions
EDA Programming • Programming for EDA is different from • programming for the web • programming databases, etc. • EDA deals with • Very complex computations (NP-hard problems) • Very large datasets (designs with 100M+ objects) • Programming for EDA requires knowledge of algorithms/data-structures and careful hand-crafting of efficient solutions • Finding an efficient solution is often the result of laborious and time-consuming trial and error
Why Industrial Code Is Often Bad • Heritage code • Designed long ago by somebody who did not know or did not care • Overdesigned code • Designed for the most general case, which rarely or never happens • Underdesigned code • Designed for small netlists, while the size of a typical netlist doubles every few years, making scalability an elusive target
Less Memory = Less Runtime • Although not true in general, in most EDA applications dealing with large datasets, smaller memory results in faster code • Because most of the EDA computations are memory intensive, the effect of CPU cache misses determines their runtime • Keep this in mind when designing new data-structures
When to Optimize Memory? • Optimize memory if we store many similar entries (nodes in a graph, timing objects, placement locations, etc) • For example, when designing a netlist, which typically stores millions of individual objects, the object data-structure is very important • However, if only a few instances of a netlist are used at the same time, the netlist data-structure is less important
Design Custom Data-Structures • Figure out what is needed in each application and design a custom data-structure • The lowest possible memory usage • The fastest possible runtime • Simpler and cleaner code • Often good data-structures can be reused elsewhere • Translation to and from a custom data-structure rarely takes more than 3% of runtime • Example: In a typical synthesis/mapping application, it is enough to have ‘node’ and there is no need for ‘net’, ‘edge’, ‘pin’, etc
Store Objects In a Topo Order • Topological order • When fanins (incoming edges) of a node precede the node itself • Using topological order makes it unnecessary to recompute it when performing local or global changes • Saves runtime • Using topological order reduces CPU cache misses, which occur when computation jumps all over memory • Saves runtime • It is a good idea to have a specialized procedure or command to (re)establish a topo order of the network (graph, etc) in those rare cases when it changes
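As an illustration of the point above, here is a minimal sketch of a traversal that relies on topological order. The `Netlist` layout (fanin IDs concatenated in one array, with per-node offsets) and the function names are assumptions made for this example, not something taken from the slides. Because every fanin has a smaller ID than its node, logic levels can be computed in one forward sweep over contiguous memory, with no recursion and no separate ordering step.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical minimal netlist: the fanins of node i are stored in
// fanins[fanin_begin[i] .. fanin_begin[i + 1]). Inputs have no fanins.
struct Netlist {
    std::vector<int32_t> fanin_begin;  // size = number of nodes + 1
    std::vector<int32_t> fanins;       // concatenated fanin IDs
    int32_t size() const {
        return fanin_begin.empty() ? 0 : (int32_t)fanin_begin.size() - 1;
    }
};

// Returns true if every fanin ID is smaller than the ID of its node,
// i.e. the nodes are stored in a topological order.
bool is_topo_ordered(const Netlist& n) {
    for (int32_t i = 0; i < n.size(); ++i)
        for (int32_t k = n.fanin_begin[i]; k < n.fanin_begin[i + 1]; ++k)
            if (n.fanins[k] >= i) return false;
    return true;
}

// Computes logic levels in a single linear sweep.
// The result is only meaningful if is_topo_ordered(n) holds.
std::vector<int32_t> compute_levels(const Netlist& n) {
    std::vector<int32_t> level(n.size(), 0);
    for (int32_t i = 0; i < n.size(); ++i)
        for (int32_t k = n.fanin_begin[i]; k < n.fanin_begin[i + 1]; ++k)
            level[i] = std::max(level[i], level[n.fanins[k]] + 1);
    return level;
}
```

A check like `is_topo_ordered` is cheap enough to run as a sanity test after any transformation that may have disturbed the order.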
Fanout Representation • Traditionally, each object (node) in a netlist has both fanins (incoming edges) and fanouts (outgoing edges) • In most applications, fanins alone are enough • Reduces memory ~2x • Reduces runtime • Fanouts can be computed on demand • Exercise: Implement computation of required times of all nodes in a combinational netlist without fanouts • In many cases, it’s enough to have “static fanout” • If netlist is fixed, fanouts are never added/removed
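One possible approach to the exercise, sketched under several assumptions: nodes are stored in topological order (fanin IDs smaller than the node ID), each node has a single delay value, and all outputs share one required time. The function name, its parameters, and the flat fanin layout are hypothetical. The idea is to sweep the nodes in reverse order and push each node's required time down to its fanins, so no fanout lists are ever consulted.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <vector>

// Fanins of node i are fanins[fanin_begin[i] .. fanin_begin[i + 1]).
// delay[i] is the assumed delay through node i; outputs lists output node IDs.
std::vector<float> compute_required(const std::vector<int32_t>& fanin_begin,
                                    const std::vector<int32_t>& fanins,
                                    const std::vector<float>& delay,
                                    const std::vector<int32_t>& outputs,
                                    float required_at_outputs) {
    const int32_t n = (int32_t)delay.size();
    // Nodes that reach no output keep an infinite required time.
    std::vector<float> required(n, std::numeric_limits<float>::infinity());
    for (int32_t po : outputs)
        required[po] = std::min(required[po], required_at_outputs);
    // Reverse topological sweep: when node i is visited, all of its
    // fanouts (which have larger IDs) have already updated required[i].
    for (int32_t i = n - 1; i >= 0; --i) {
        const float at_inputs = required[i] - delay[i];
        for (int32_t k = fanin_begin[i]; k < fanin_begin[i + 1]; ++k) {
            const int32_t f = fanins[k];
            required[f] = std::min(required[f], at_inputs);
        }
    }
    return required;
}
```

The sweep touches each fanin edge once, so it costs the same as a fanout-based traversal while using only the fanin arrays.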
Use Integers Instead of Pointers • In the old days, integer (int) and pointer (void *) used the same amount of memory (4 bytes) • In recent years, most EDA companies and their customers have switched to 64-bit platforms • One pointer now takes 8 bytes! • However, much older code uses a lot of pointers • This leads to a 2x memory increase for no reason • Suggestion: Design your code to store attributes of objects as integers, rather than as pointers
Avoiding Pointers (example) • Node points to its fanins • Fanins can be integer IDs, instead of pointers • Instead of a linked list of node pointers, use an array of integer IDs • A linked list uses at least 6x more memory • Iterating through a linked list is slower
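The 6x figure can be checked with a small sketch. The `FaninLink` layout below is an assumed doubly linked list link (one data pointer plus previous and next pointers); real implementations differ, and per-allocation heap overhead is not counted here.

```cpp
#include <cstddef>
#include <cstdint>

struct Node;  // opaque in this sketch; only pointers to it are stored

// Assumed doubly linked list link holding one fanin pointer.
struct FaninLink {
    Node*      fanin;
    FaninLink* prev;
    FaninLink* next;
};

// Bytes needed to store nFanins fanins as linked-list links.
constexpr std::size_t linked_list_bytes(std::size_t nFanins) {
    return nFanins * sizeof(FaninLink);
}

// Bytes needed to store nFanins fanins as an array of 4-byte integer IDs.
constexpr std::size_t id_array_bytes(std::size_t nFanins) {
    return nFanins * sizeof(std::int32_t);
}
```

On a platform with 8-byte pointers each link takes 24 bytes against 4 bytes per integer ID, which matches the 6x estimate. The array is also contiguous, so iterating over it does not chase pointers through memory.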
Integer IDs for Indexing Attributes • Each node in the netlist can have an integer ID • The node structure can be as simple as possible struct Node { int ID; int nFanins; int * pFanins; }; • Any attribute of the node can be represented as an entry in the array with node’s ID used as an index Vec<int> Type; Vec<int> Level; Vec<float> Slack; • Attributes can be allocated/freed on demand, which helps control memory usage • Light-weight basic data-structure makes often-used computations (such as traversals) very fast
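A minimal sketch of on-demand attributes, assuming `std::vector` as a stand-in for the `Vec` container named on the slide. The helper names `attr_alloc`, `attr_free`, and `worst_slack` are hypothetical. The node structure stays fixed while attribute arrays indexed by node ID come and go.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Node structure as on the slide, with fixed-width integers.
struct Node {
    int32_t  ID;
    int32_t  nFanins;
    int32_t* pFanins;  // array of fanin IDs, owned elsewhere
};

// Attribute arrays indexed by Node::ID; each one is empty until needed.
struct NodeAttrs {
    std::vector<int32_t> Type;
    std::vector<int32_t> Level;
    std::vector<float>   Slack;
};

// Allocates one attribute for nNodes nodes, filled with init.
template <typename T>
void attr_alloc(std::vector<T>& attr, std::size_t nNodes, T init) {
    attr.assign(nNodes, init);
}

// Frees one attribute; swapping with a temporary releases the storage.
template <typename T>
void attr_free(std::vector<T>& attr) {
    std::vector<T>().swap(attr);
}

// Example use: look up each node's slack by its ID.
float worst_slack(const std::vector<Node>& nodes, const NodeAttrs& a) {
    float worst = std::numeric_limits<float>::infinity();
    for (const Node& n : nodes) worst = std::min(worst, a.Slack[n.ID]);
    return worst;
}
```

A timing pass would call `attr_alloc` on `Slack` before it starts and `attr_free` when it finishes, so other passes never pay for that memory.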
Avoid Linked Lists • Each link, in addition to user’s data, has previous and next fields • Potentially 3x increase in memory usage • Most linked-list implementations use pointers • Additional 2x increase in memory usage • Another drawback • Allocating numerous links leads to memory fragmentation • Most data-structures can be efficiently implemented without linked lists
Simplicity Wins • Whenever possible keep data-structures simple and light-weight • It is better to have on-demand attributes associated with objects, rather than an overly complex object data-structure, which stores all attributes at all times
Case Study: Storage for Many Similar Entries • Same-size entries (for example, AIG or BDD nodes) are best stored in an array • Node’s index is the place in the array where the node is stored • Different-size entries (for example, nodes in a logic network) are best stored in a custom memory manager • Manager allocates memory in pages (e.g. 1MB / page) • Each page can store entries of different size • Each entry is assigned an integer number (called ID) • There is a vector mapping IDs into pointers to memory for each object
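The page-based manager for different-size entries could look roughly like the sketch below. The class name `PagedStore`, its methods, the 8-byte rounding, and the decision to reject entries larger than a page are all assumptions made for this example; freeing individual entries is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical page-based store: entries of different sizes are carved out
// of fixed-size pages, and each entry is identified by an integer ID.
class PagedStore {
public:
    explicit PagedStore(std::size_t pageBytes = 1 << 20)  // 1MB per page
        : pageBytes_(pageBytes) {}

    // Reserves nBytes (rounded up to a multiple of 8) and returns the new
    // entry's ID, or -1 if the entry cannot fit in a single page.
    int32_t alloc(std::size_t nBytes) {
        nBytes = (nBytes + 7) & ~std::size_t(7);
        if (nBytes > pageBytes_) return -1;
        if (pages_.empty() || used_ + nBytes > pageBytes_) {
            // Current page is full: start a new one.
            pages_.push_back(std::make_unique<unsigned char[]>(pageBytes_));
            used_ = 0;
        }
        unsigned char* p = pages_.back().get() + used_;
        used_ += nBytes;
        idToPtr_.push_back(p);  // vector mapping IDs to entry memory
        return (int32_t)idToPtr_.size() - 1;
    }

    // Returns the memory of the entry with the given ID.
    void* ptr(int32_t id) const { return idToPtr_[id]; }

    std::size_t numPages() const { return pages_.size(); }
    std::size_t numEntries() const { return idToPtr_.size(); }

private:
    std::size_t pageBytes_;
    std::size_t used_ = 0;  // bytes used in the last page
    std::vector<std::unique_ptr<unsigned char[]>> pages_;
    std::vector<unsigned char*> idToPtr_;
};
```

Allocating whole pages keeps entries created together close in memory and avoids one heap allocation per node; the rest of the code refers to entries only by their 4-byte IDs.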
Conclusion • Reviewed several reasons for inefficient memory usage in industrial code • Offered several suggestions and good coding practices • Emphasized the importance of considering memory usage when designing data-structures