Are Floorplan Representations Important in Digital Design?

H. H. Chan, S. N. Adya, I. L. Markov, The University of Michigan

Motivation • Many FP representations have been proposed since sequence pair [Murata et al. '96] • Most emphasize better area optimization results, with appealing properties based on mathematical results • Interconnect is often more important in physical design, but is rarely discussed in the literature Does the choice of FP representation matter in interconnect-driven floorplanning?

Outline of the Talk • Common Practices in FP Research • Background on Representations • Evaluation Framework • Results and Analysis • Conclusions

Current Status Quo • Prevailing algorithm: simulated annealing • Can optimize power, performance, etc. • Most work focuses on FP representations • Many FP reps exist, compared in terms of • Size of solution space • Area optimality (can the rep capture an area-optimal floorplan?) • Asymptotic complexity of "realizing" a floorplan • Complexity of incremental changes in annealing Are these properties relevant in interconnect-driven floorplanning?

No Room For Improvement Left? • MCNC benchmark suite is almost always used • The suite has only 5 benchmarks, with <50 blks • Area-optimal results on apte, xerox and hp • Extremely close results on ami33 and ami49 • Area optimization results are emphasized • Interconnect optimization is often ignored • Temperature schedules are rarely reported • Hard to reproduce results

Contexts for Floorplanning • Outline-free floorplanning • Minimize a combination of area and HPWL • Many floorplanners are evaluated in this framework • Fixed-outline floorplanning • Minimize HPWL subject to a bounding box • More relevant to modern designs • Large scale floorplanning • Excellent area-packing results for >500 blks e.g., 10 – 20 copies of ami49 • Hardly interesting w/o interconnect optimization

Families of FP Representations (1) • Different FP reps can share the same solution space ("equivalent") • Sequence pair, TCG, TCG-S • O-tree, B*-tree • Mosaic reps: CBL, Q-sequence, TBS, etc. • Equivalent FP reps produce similar solutions and differ only in runtime • We seek to compare solution spaces rather than specific representations

Families of FP Representations (2) We study sequence pair and B*-tree, since • They are best studied in the literature • Broad extensions for various constraints e.g. pre-placed blks, rectilinear blks, soft blks • Hierarchical extensions for over 10K blocks (Parquet-in-Capo, MB*-tree) • Two “extremes” of representations • Sequence pair: largest solution space • B*-tree: smallest solution space

Sequence Pair (SP) • Proposed by Murata et al. in 1996 • Encodes a floorplan using two permutations • $O(n^2)$ time to realize a floorplan • $O(n \lg n)$ and $O(n \lg \lg n)$ algorithms exist • The $O(n^2)$-time algorithm is easy to implement and fast in practice • Size of solution space: $(n!)^2$ • Some area-optimal solutions • Equivalent to TCG, TCG-S
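
To make the $O(n^2)$ realization step concrete, here is a minimal Python sketch. It is not Parquet's code; the function name `pack_sequence_pair`, the block names and the block sizes are assumptions chosen for illustration. It uses the standard sequence-pair rule: a block that precedes another in both permutations lies to its left, and a block that follows it in the first permutation but precedes it in the second lies below it.

```python
def pack_sequence_pair(gamma_pos, gamma_neg, sizes):
    """Realize a sequence pair in O(n^2) time.

    gamma_pos, gamma_neg: the two permutations (lists of block names).
    sizes: block name -> (width, height).  All names here are illustrative.
    Returns ({block: (x, y)}, bounding_width, bounding_height).
    """
    pos_index = {b: i for i, b in enumerate(gamma_pos)}
    x, y = {}, {}
    # Every block that is left of or below b precedes b in gamma_neg,
    # so visiting blocks in gamma_neg order guarantees they are already placed.
    for j, b in enumerate(gamma_neg):
        bx = by = 0
        for a in gamma_neg[:j]:
            wa, ha = sizes[a]
            if pos_index[a] < pos_index[b]:
                # a precedes b in both permutations: a is left of b
                bx = max(bx, x[a] + wa)
            else:
                # a follows b in gamma_pos but precedes it in gamma_neg: a is below b
                by = max(by, y[a] + ha)
        x[b], y[b] = bx, by
    width = max(x[b] + sizes[b][0] for b in gamma_neg)
    height = max(y[b] + sizes[b][1] for b in gamma_neg)
    return {b: (x[b], y[b]) for b in gamma_neg}, width, height


# Hypothetical three-block instance
sizes = {"u": (2, 1), "v": (3, 1), "w": (1, 2)}
positions, width, height = pack_sequence_pair(["u", "v", "w"], ["v", "u", "w"], sizes)
print(positions, width, height)
```

The double loop is the quadratic algorithm the slide calls easy to implement; the faster variants replace the inner loop with a longest-common-subsequence computation.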

B*-tree • Proposed by Chang et al. in 2000 • Encodes a floorplan by a binary tree and a permutation • $O(n)$ time to realize a floorplan • Size of solution space: $O(n! \, 2^{2n-2} / n^{1.5})$ • Some area-optimal floorplans • All packings are compacted to the bottom • Equivalent to O-tree
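
A B*-tree packing can be sketched in a few lines of Python under the usual convention: the root sits at the origin, a left child is placed immediately to the right of its parent, a right child is placed above its parent at the same x coordinate, and each block drops to the lowest y allowed by the blocks already placed. The names `pack_bstar_tree`, `left` and `right` are assumptions for illustration. For brevity this sketch scans all placed blocks to find y, which costs $O(n^2)$ overall; the $O(n)$ bound quoted above needs a contour data structure that is omitted here.

```python
def pack_bstar_tree(root, left, right, sizes):
    """Pack a B*-tree by preorder traversal (simplified, O(n^2) y-lookup).

    left / right: dicts mapping a block to its left / right child, if any.
    sizes: block name -> (width, height).  All names here are illustrative.
    Returns ({block: (x, y)}, bounding_width, bounding_height).
    """
    placed = {}
    stack = [(root, 0)]
    while stack:
        block, x = stack.pop()
        w, h = sizes[block]
        # Lowest feasible y: top edge of the tallest placed block overlapping in x
        y = 0
        for other, (ox, oy) in placed.items():
            ow, oh = sizes[other]
            if ox < x + w and x < ox + ow:
                y = max(y, oy + oh)
        placed[block] = (x, y)
        # Push the right child first so the left subtree is visited first
        if block in right:
            stack.append((right[block], x))       # above the parent, same x
        if block in left:
            stack.append((left[block], x + w))    # adjacent to the parent's right edge
    width = max(px + sizes[b][0] for b, (px, _) in placed.items())
    height = max(py + sizes[b][1] for b, (_, py) in placed.items())
    return placed, width, height


# Hypothetical three-block instance: w is right of v, u is above v
sizes = {"u": (2, 1), "v": (3, 1), "w": (1, 2)}
placed, width, height = pack_bstar_tree("v", left={"v": "w"}, right={"v": "u"}, sizes=sizes)
print(placed, width, height)
```

Because every block is pushed down as far as it will go, the result is always compacted to the bottom, which is the property listed on the slide.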

Sequence Pair vs. B*-tree SP captures low-interconnect floorplans that are not captured by B*-tree • SP captures more floorplans than B*-tree • This packing is encoded by SP <a b c> <c a b> • Not captured by B*-tree
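
The gap between the two solution spaces can be checked numerically. The sketch below counts encodings, not distinct packings: $(n!)^2$ sequence pairs versus $n!$ labelings of each of the $C_n$ binary-tree shapes on $n$ nodes, where $C_n$ is the $n$-th Catalan number. That a B*-tree is a labeled binary tree is taken from the slides; the exact count $n! \cdot C_n$ and the function names are assumptions of this sketch.

```python
from math import comb, factorial


def sp_encodings(n):
    """Number of sequence pairs on n blocks: (n!)^2."""
    return factorial(n) ** 2


def bstar_encodings(n):
    """Number of labeled binary trees on n nodes: n! * Catalan(n)."""
    catalan = comb(2 * n, n) // (n + 1)
    return factorial(n) * catalan


for n in (3, 4, 10, 50):
    ratio = sp_encodings(n) / bstar_encodings(n)
    print(n, sp_encodings(n), bstar_encodings(n), f"ratio={ratio:.3g}")
```

The ratio grows quickly with $n$, since $n!$ outpaces the roughly $4^n / n^{1.5}$ growth of the Catalan numbers.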

Evaluation Framework (1) • Floorplanner Parquet [Adya et al., ICCD 2001] • Simulated annealer based on sequence pair • Competitive results in • Outline-free floorplanning • Fixed-outline floorplanning • Handling both hard and soft blocks • Embedded in placer Capo 9.0 for mixed-size placement (standard cells + large macros) [Adya et al., ICCAD 2004]

Evaluation Framework (2) • Replace the sequence pair annealer by B*-tree, with minimal changes • Identical temperature schedule • Probabilities of applying moves are identical, except for those specific to B*-trees • Both versions are open source http://vlsicad.eecs.umich.edu/BK/parquet/ • Report averages over 50 independent starts rather than best results • Runtime breakdown: block-packing vs. WL evaluation
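
The experimental setup above can be pictured as one annealing loop into which either representation is plugged. The following Python sketch is generic and is not Parquet's annealer: the geometric cooling schedule, its parameters, and the toy inversion-count cost are all assumptions chosen so that the example is self-contained. In a floorplanner, `cost` would pack the encoding and return a mix of area and HPWL, and `neighbor` would apply a representation-specific move.

```python
import math
import random


def anneal(initial, cost, neighbor, seed, t0=10.0, t_min=1e-3, alpha=0.95, moves_per_t=50):
    """Generic simulated annealing; schedule parameters are illustrative only."""
    rng = random.Random(seed)
    cur, cur_cost = initial, cost(initial)
    best, best_cost = cur, cur_cost
    t = t0
    while t > t_min:
        for _ in range(moves_per_t):
            cand = neighbor(cur, rng)
            cand_cost = cost(cand)
            # Accept improvements always, uphill moves with Boltzmann probability
            if cand_cost <= cur_cost or rng.random() < math.exp((cur_cost - cand_cost) / t):
                cur, cur_cost = cand, cand_cost
                if cur_cost < best_cost:
                    best, best_cost = cur, cur_cost
        t *= alpha
    return best, best_cost


def swap_two(perm, rng):
    """Move: swap two positions of a permutation (a typical sequence-pair move)."""
    i, j = rng.sample(range(len(perm)), 2)
    out = list(perm)
    out[i], out[j] = out[j], out[i]
    return tuple(out)


def inversions(perm):
    """Toy cost standing in for area/HPWL: number of out-of-order pairs."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])


def average_cost(initial, cost, neighbor, starts=50):
    """Average final cost over independent starts, as opposed to the best run."""
    finals = [anneal(initial, cost, neighbor, seed=s)[1] for s in range(starts)]
    return sum(finals) / len(finals)


print(average_cost((5, 4, 3, 2, 1, 0), inversions, swap_two, starts=5))
```

Keeping `anneal` fixed and swapping only `cost` and `neighbor` mirrors the idea of changing the representation while holding the schedule constant.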

Evaluation Framework (3) Min-cut floorplacement with Capo • Use min-cut partitioning whenever possible • Otherwise, resort to annealing-based packing • Cluster standard cells into soft blocks • Fixed-outline floorplanning on clustered instances • Min-cut resumes on standard cells • Outperforms competitive annealers on large FP instances (> 100 blocks) • Clustered FP instances have up to 300 blks, often <20 • Generates thousands of very different FP benchmarks

What to Expect? • Wirelength evaluation is very CPU-intensive • Count # floating-point ops in floorplan and wirelength evaluation per move (for SP) • Instrument the code by adding counters • # ops in HPWL evaluation • Only count arithmetic ops, not assignments • # ops in FP evaluation • Worst-case complexity $O(n^2)$ is too pessimistic • Runtime of SP evaluation fits $t = c\,n^{1.3}$
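
As an illustration of the instrumentation idea, the sketch below computes half-perimeter wirelength (HPWL) and increments a counter alongside each floating-point operation. The counting convention (comparisons, additions and subtractions count; assignments do not) follows the slide, but the exact bookkeeping, the function name and the data layout are assumptions, not the authors' instrumentation.

```python
def hpwl_with_op_count(nets, pins):
    """Return (total HPWL, number of counted floating-point ops).

    nets: list of nets, each a list of pin names.
    pins: pin name -> (x, y).  All names here are illustrative.
    """
    total = 0.0
    ops = 0
    for net in nets:
        x0, y0 = pins[net[0]]
        xmin = xmax = x0
        ymin = ymax = y0
        for p in net[1:]:
            x, y = pins[p]
            # Four comparisons per additional pin
            if x < xmin:
                xmin = x
            if x > xmax:
                xmax = x
            if y < ymin:
                ymin = y
            if y > ymax:
                ymax = y
            ops += 4
        # Two subtractions and two additions per net
        total += (xmax - xmin) + (ymax - ymin)
        ops += 4
    return total, ops


pins = {"a": (0, 0), "b": (3, 4), "c": (1, 6)}
nets = [["a", "b"], ["a", "b", "c"]]
print(hpwl_with_op_count(nets, pins))
```

The counter grows with the total number of pins, which is why wirelength evaluation scales with netlist size rather than with block count alone.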

Area Optimization (average performance reported, not best) • B*-tree packs better than SP • Similar for fixed-outline FP, where B*-tree has a higher success rate • SP beats B*-tree in evaluation time up to 200 blks, despite their asymptotic complexities: $O(n^2)$ vs. $O(n)$ Outline-free FP (deadspace % / time-per-move)

Area + HPWL Optimization Outline-free FP (deadspace % / HPWL / time-per-move) • Similar performance for SP and B*-tree in area, HPWL, and evaluation time • Runtime dominated by HPWL evaluation • Same trends for fixed-outline FP and soft blocks

Min-Cut Floorplacement • A wide range of FP benchmarks • Instances with both hard and soft blocks • Facilitate a rigorous empirical comparison • Capo on the IBM-MSwPins benchmarks • Similar performance of SP and B*-tree (<1% diff in wirelength, <2.5% in runtime) • Stand-alone Parquet on the generated instances • 2000+ fixed-outline instances with both hard and soft blocks • Block counts range from 1 to ~300, often <20 • Similar performance of SP and B*-tree

Wirelength Evaluation Runtime (1) • In both outline-free and fixed-outline FP, HPWL evaluation dominates runtime (~80%) Outline-free (TPM area-only / area+wire) Consistent with our analysis that WL evaluation should dominate runtime.

Wirelength Evaluation Runtime (2) % time-per-move spent on WL evaluation (analytical estimates versus actual measurements) • Estimates less accurate in larger benchmarks • Larger netlists may not fit into processor cache FP evaluation is never a runtime bottleneck!

Must Improve WL Evaluation Rather than FP Representations! • Ideas for speed-ups • Special-case the evaluation of 2-pin nets: no loops, 2 floating-point comparisons only • Remove inessential nets, whose bounding box contains the outline • Conglomerate 2-pin nets between the same blocks into heavy-weight nets • These techniques lead to 10% speed-up without loss of quality in Parquet 3 • Parquet 4: another 10% speed-up (2x speed-up per move) with improved wirelength (longer temp. schedule) • Better use of cache (float versus double) • Simultaneous computation of min and max (25% fewer comparisons)
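
Three of the speed-up ideas above can be sketched together in Python: merging 2-pin nets between the same pair of blocks into one weighted net, evaluating 2-pin nets without a loop over pins, and finding min and max together with three comparisons per two values instead of four. This is not Parquet's implementation. The function names are hypothetical, and the sketch assumes every pin sits at its block's center, so a net touching a single block contributes zero and is dropped.

```python
from collections import Counter


def merge_two_pin_nets(nets):
    """Split nets into weighted 2-pin nets and the remaining multi-pin nets."""
    weights = Counter()
    multi = []
    for net in nets:
        blocks = tuple(sorted(set(net)))
        if len(blocks) == 2:
            weights[blocks] += 1      # same block pair: just raise the weight
        elif len(blocks) > 2:
            multi.append(blocks)
    return dict(weights), multi


def min_max_pairwise(values):
    """Min and max of a non-empty list using about 3 comparisons per 2 values."""
    n = len(values)
    if n % 2:
        lo = hi = values[0]
        start = 1
    else:
        lo, hi = (values[0], values[1]) if values[0] < values[1] else (values[1], values[0])
        start = 2
    for i in range(start, n, 2):
        a, b = values[i], values[i + 1]
        if a > b:
            a, b = b, a
        if a < lo:
            lo = a
        if b > hi:
            hi = b
    return lo, hi


def fast_hpwl(two_pin, multi, centers):
    """HPWL using the merged representation; centers: block -> (x, y)."""
    total = 0.0
    for (a, b), weight in two_pin.items():
        (xa, ya), (xb, yb) = centers[a], centers[b]
        total += weight * (abs(xa - xb) + abs(ya - yb))   # no loop over pins
    for net in multi:
        xlo, xhi = min_max_pairwise([centers[b][0] for b in net])
        ylo, yhi = min_max_pairwise([centers[b][1] for b in net])
        total += (xhi - xlo) + (yhi - ylo)
    return total


centers = {"a": (0, 0), "b": (3, 4), "c": (1, 6)}
nets = [["a", "b"], ["b", "a"], ["a", "b", "c"], ["a", "a"]]
two_pin, multi = merge_two_pin_nets(nets)
print(two_pin, multi, fast_hpwl(two_pin, multi, centers))
```

Merging happens once, before annealing starts, so its cost is amortized over all moves; the per-move saving comes from the shorter net list.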

Conclusions • We compared the performance of SP and B*-tree in wirelength-driven floorplanning • Surprisingly similar performance! • The main bottleneck is interconnect evaluation: >75% of runtime in realistic FP instances • Parquet-4: improved WL evaluation → tangible speed-up • Asymptotic complexity of FP evaluation has little relevance • Realistic FP instances are small (<200 blks) • Worst-case analysis is too pessimistic • New FP reps seem irrelevant unless they support incremental wirelength evaluation • The status quo in FP literature needs to be changed
