DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs. Deming Chen, Jason Cong. ICCAD 2004. Presented by: Wei Chen.
FPGA Architecture • A K-input LUT (look-up table) can implement any Boolean function of K variables. • This property is called completeness.
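The completeness property can be illustrated with a minimal sketch that models a K-input LUT as a table of 2^K output bits, one per input assignment. The helper names `make_lut` and `eval_lut` are hypothetical and chosen only for this illustration; they are not part of the presentation.

```python
from itertools import product

def make_lut(func, k):
    """Program a K-input LUT: store one output bit per input assignment."""
    return [int(bool(func(*bits))) for bits in product((0, 1), repeat=k)]

def eval_lut(table, bits):
    """Look up the output bit; the first input is the most significant index bit."""
    index = 0
    for b in bits:
        index = (index << 1) | b
    return table[index]

# Any 3-variable function fits in a 3-LUT, for example a majority vote.
majority = make_lut(lambda a, b, c: (a + b + c) >= 2, 3)
```

Because the table has 2^K independently programmable bits, a K-LUT can realize each of the 2^(2^K) Boolean functions of K variables.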
FPGA Technology Mapping • Given a circuit modeled as a DAG, partition the graph so that every partition has at most K inputs (and therefore fits in one K-input LUT) while satisfying some optimization objective.
Terminology
• A Boolean network N can be modeled as a DAG.
• Input(v): the set of fanin nodes of gate v.
• A cone rooted at node v (Ov) is a sub-network of N consisting of v and some of its predecessors, such that for any node w ∈ Ov there is a path from w to v that lies entirely in Ov.
• A cut is a partitioning (X, X') of a cone Ov such that X' is a cone of v.
• The cut set of the cut (X, X') consists of the inputs of cone X'.
• The cut size is the number of elements in the cut set.
• The level of a node v is the length of the longest path from any PI node to v.
• The depth of a network is the largest node level in the network.
• A Boolean network is L-bounded if |Input(v)| ≤ L for each node v.
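The definitions above can be checked on a small network. The sketch below is illustrative only: the fanin table `FANINS` is an assumed 6-node, 2-bounded network (gate 4 fed by 1 and 2, gate 5 by 2 and 3, gate 6 by 4 and 5), and the function names are hypothetical.

```python
# Assumed example network: gate -> fanin nodes. Nodes without an entry are PIs.
FANINS = {4: [1, 2], 5: [2, 3], 6: [4, 5]}

def level(node, fanins):
    """Length of the longest path from any PI to the node (PIs have level 0)."""
    if node not in fanins:
        return 0
    return 1 + max(level(u, fanins) for u in fanins[node])

def depth(fanins):
    """Largest node level in the network."""
    return max(level(v, fanins) for v in fanins)

def is_l_bounded(fanins, l):
    """True if every gate has at most l fanins."""
    return all(len(ins) <= l for ins in fanins.values())

def cut_set(x_prime, fanins):
    """Inputs of the cone X': nodes outside X' that feed a node inside X'."""
    return {u for v in x_prime for u in fanins.get(v, []) if u not in x_prime}
```

For this assumed network, the cone X' = {4, 6} has cut set {1, 2, 5}, so the cut size is 3.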
Terminology cont. • Example network (figure with nodes 1 to 6): Input(6) = {4,5}, Level(6) = 2, Depth = 2, 2-bounded. • Figure: a cone rooted at node 6. • Figure: a cut C with Cut set(C) = {1,2,5} and Cut size(C) = 3.
DAOmap overall view
• A cut-enumeration-based method that consists of cut generation and cut selection.
• Cut generation/enumeration: for each node being considered, generate all the K-feasible cuts.
• Cut selection: choose the nodes (and their best cuts) to be implemented as LUTs.
• Objective: create a minimum-area cover under the timing constraint (optimal depth).
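Cut Enumeration
• Guided by the following theorem, where f(K,v) represents all the K-feasible cuts rooted at node v: for each input u of v, take either u itself or one of the cuts in f(K,u), merge one choice per input, and keep only the merged sets with at most K nodes.
• Example (K = 4; node 5 has inputs 1 and 2, node 6 has inputs 3 and 4, node 7 has inputs 5 and 6):
f(4,5) = {1,2}
f(4,6) = {3,4}
f(4,7) = [5 + f(4,5)][6 + f(4,6)] = {5,6} + {5, f(4,6)} + {f(4,5), 6} + {f(4,5), f(4,6)} = {5,6} + {5,3,4} + {1,2,6} + {1,2,3,4}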
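The enumeration in the example can be sketched in a few lines. This is a minimal recursive version written for illustration, not the paper's implementation; the function name `enumerate_cuts` and the dictionary representation of the network are assumptions.

```python
from itertools import product

# The 7-node example: gate -> fanin nodes. Nodes 1 to 4 are PIs.
FANINS = {5: [1, 2], 6: [3, 4], 7: [5, 6]}

def enumerate_cuts(fanins, k):
    """Return {gate: set of K-feasible cuts}, each cut a frozenset of nodes."""
    cuts = {}

    def cuts_of(v):
        if v not in fanins:          # a PI has no cuts below it
            return set()
        if v not in cuts:
            # For each input u: either u itself or any cut rooted at u.
            options = [{frozenset([u])} | cuts_of(u) for u in fanins[v]]
            # Merge one choice per input, then drop cuts larger than k.
            merged = {frozenset().union(*combo) for combo in product(*options)}
            cuts[v] = {c for c in merged if len(c) <= k}
        return cuts[v]

    for v in fanins:
        cuts_of(v)
    return cuts

cuts4 = enumerate_cuts(FANINS, 4)
cuts3 = enumerate_cuts(FANINS, 3)
```

With K = 4, node 7 gets the four cuts listed above; with K = 3, the cut {1,2,3,4} is discarded because it exceeds the size limit.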
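Delay propagation
• Unit delay model: each cut (LUT) on a path contributes one unit of delay.
• The minimum arrival time for node v is Arr_v = min over all cuts c rooted at v of (1 + max of Arr_u over the nodes u in the cut set of c), with Arr = 0 at the PIs.
• X_v = the set of cuts of v that provide the minimum arrival time.
• Example (figure: the network with PIs 1 to 4 and gates 5, 6, 7, all PIs arriving at time 0): Arr_5 = 1, Arr_6 = 1, Arr_7 = 1.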
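Under the unit delay model, the arrival times in the example can be reproduced with a short sketch. It assumes the cuts of each gate are already enumerated and that gates are visited in topological order; the names `arrival_times`, `CUTS` and `ORDER` are hypothetical.

```python
# Cuts of the 7-node example with K = 4 (PIs are nodes 1 to 4).
CUTS = {
    5: [frozenset({1, 2})],
    6: [frozenset({3, 4})],
    7: [frozenset({5, 6}), frozenset({5, 3, 4}),
        frozenset({1, 2, 6}), frozenset({1, 2, 3, 4})],
}
ORDER = [5, 6, 7]  # gates in topological order

def arrival_times(cuts, order):
    """Return (arr, best): minimum arrival time and the cuts achieving it."""
    arr, best = {}, {}
    for v in order:
        def cut_arrival(c):
            # One LUT delay on top of the latest-arriving cut input; PIs arrive at 0.
            return 1 + max(arr[u] if u in cuts else 0 for u in c)
        arr[v] = min(cut_arrival(c) for c in cuts[v])
        best[v] = {c for c in cuts[v] if cut_arrival(c) == arr[v]}
    return arr, best

arr, best = arrival_times(CUTS, ORDER)
```

Node 7 reaches arrival time 1 only through the cut {1,2,3,4}; the other three cuts pass through node 5 or node 6 and would give arrival time 2.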
Area propagation • The area of a cut c is calculated as: [equation and figure not preserved in the source]
Cut selection
• After cut enumeration, we obtain the optimal mapping depth of the network.
• Only critical paths need to use the cuts that lead to minimum delay.
• Cuts on non-critical paths can be reconstructed to search for a better solution in terms of area.
Iterative Cut Selection Procedure • Traverse the network starting from the POs: each PO is implemented as a LUT using its selected cut, and the inputs of every generated LUT are then mapped in turn. • The procedure continues until all the PIs are reached.
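The traversal itself can be sketched as a worklist that starts at the POs. The sketch assumes one cut has already been chosen per gate (the hypothetical `best_cut` mapping) and shows only the covering loop, not the cost heuristics DAOmap uses to choose those cuts.

```python
from collections import deque

def select_cover(best_cut, primary_outputs):
    """Map from the POs toward the PIs; return the set of LUT root nodes."""
    luts = set()
    queue = deque(primary_outputs)
    while queue:
        v = queue.popleft()
        if v in luts or v not in best_cut:   # already mapped, or a PI
            continue
        luts.add(v)                          # implement v as a LUT
        queue.extend(best_cut[v])            # its cut inputs must be mapped next
    return luts

# Two possible choices for node 7 in the 7-node example.
deep_cover = select_cover({5: {1, 2}, 6: {3, 4}, 7: {5, 6}}, [7])
flat_cover = select_cover({5: {1, 2}, 6: {3, 4}, 7: {1, 2, 3, 4}}, [7])
```

Choosing the cut {1,2,3,4} for node 7 covers the example with a single LUT, while the cut {5,6} needs three LUTs and two levels.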
Pick-up Algorithm • Input Sharing • Slack Distribution • Cut Probing
Experimental results • On average, DAOmap is 16.02% better than CutMap in terms of LUT count and runs 24.2X faster when both map to 5-input LUTs.
Conclusion
• This paper presents a technology mapping algorithm, DAOmap, for FPGA architectures that minimizes chip area under timing constraints.
• The algorithm consists of cut enumeration and cut selection.
• Novel heuristics have been designed to capture the mapping cost accurately, taking both local and global optimization information into account.
• Experimental results show that DAOmap produces significant quality and run-time improvements compared to other mapping tools.