This overview of Ant System Optimization (AS) discusses how ants collaborate on graphs to create feasible solutions, leaving pheromones to influence decision-making. Key strategies include evaporation of pheromones, reinforcement of good solutions, and using both global and local heuristics. The Max-Min Ant System (MMAS) approach addresses challenges like local minima and premature convergence by bounding pheromone strength to enhance exploration. The lecture also covers the effective application of AS in scheduling problems, emphasizing priority list construction and dynamic heuristic adjustments for improved outcomes.
High Level Synthesis CSE 237D: Spring 2008 Topic #6 Professor Ryan Kastner
Ant System Optimization: Overview • Ants work cooperatively on the graph • Each creates a feasible solution • Ants leave pheromones on their trails • Ants make decisions based partly on the amount of pheromone • Global Optimizations • Evaporation: Pheromones dissipate over time • Reinforcement: Update pheromones from good solutions • Quickly converges to good solutions
Solving Design Problems using AS • Problem model • Define the solution space: create decision variables • Pheromone model • Global heuristic: Provides history of search space traversal • Ant search strategy • Local heuristic: Deterministic strategy for individual ant decision making • Solution construction • Probabilistically derive solution from local and global heuristics • Feedback • Evaluate solution quality, Reinforce good solutions (pheromones), Slightly evaporate all decisions (weakens poor solutions)
Max-Min Ant System (MMAS) Scheduling • Problem: Some pheromones can overpower others, leading to local minima (premature convergence) • Solution: Bound the strength of the pheromones between $\tau_{min}$ and $\tau_{max}$ • If $\tau_{min} > 0$, there is always a chance to make any decision • If $\tau_{min} = \tau_{max}$, the decision is based solely on local heuristics, i.e. no past information is taken into account
MMAS RCS Formulation • Idea: Combine ACO and List Scheduling • Ants determine the priority list • The list scheduling framework evaluates the “goodness” of the list • Global heuristic: permutation index • Local heuristic: can use different properties • Instruction mobility (IM) • Instruction depth (ID) • Latency weighted instruction depth (LWID) • Successor number (SN)
RCS: List Scheduling • A simple scheduling algorithm based on greedy strategies • List scheduling algorithm: • Construct a priority list based on some metric (operation mobility, number of successors, etc.) • While not all operations are scheduled • For each available resource, select an operation from the ready list in descending priority order • Assign these operations to the current clock cycle • Update the ready list • Clock cycle ++ • Solution quality depends on the benchmark and on the particular metric
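The loop above can be sketched as follows. This is a minimal illustration, assuming unit-latency operations, a single resource type, and an acyclic dependence graph; the function name `list_schedule` and the tiny example graph are hypothetical and not taken from the lecture.

```python
from typing import Dict, List


def list_schedule(preds: Dict[str, List[str]],
                  priority: List[str],
                  num_resources: int) -> Dict[str, int]:
    """Return the start cycle of each operation (unit latency assumed)."""
    # Earlier position in the priority list means higher priority.
    rank = {op: i for i, op in enumerate(priority)}
    start: Dict[str, int] = {}
    cycle = 0
    while len(start) < len(preds):
        # Ready list: unscheduled operations whose predecessors all
        # started in an earlier cycle (and so have finished by now).
        ready = [op for op in preds
                 if op not in start
                 and all(p in start and start[p] < cycle for p in preds[op])]
        ready.sort(key=rank.__getitem__)
        # One operation per available resource in this clock cycle.
        for op in ready[:num_resources]:
            start[op] = cycle
        cycle += 1
    return start


# Hypothetical four-operation graph: c needs a and b, d needs c.
dfg = {"a": [], "b": [], "c": ["a", "b"], "d": ["c"]}
schedule = list_schedule(dfg, ["a", "b", "c", "d"], num_resources=1)
latency = max(schedule.values()) + 1
```

With a single resource the four operations are serialized; raising `num_resources` to 2 lets `a` and `b` share the first cycle.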
MMAS RCS: Global and Local Heuristics • Global heuristic: pheromones $\tau_{ij}$, the favorableness of assigning operation i to position j, stored in a global pheromone matrix • Local heuristic: local metrics such as instruction mobility, number of successors, etc. • Local decision making: a probabilistic decision • Evaporate pheromone and reinforce good solutions
Pheromone Model for Instruction Scheduling • Each instruction $op_i \in I$ is associated with n pheromone trails $\tau_{ij}$, where j = 1, …, n • Each trail indicates the favorableness of assigning instruction i to position j • Each instruction also has a dynamic local heuristic [Figure: instructions op1–op6 mapped to priority list positions 1–6]
Ant Search Strategy • Each run has multiple iterations • In each iteration, multiple ants independently create their own priority lists • Fill one instruction at a time [Figure: instructions op1–op6 being placed into priority list positions 1–6]
Ant Search Strategy • Each ant keeps a memory of the instructions already selected • At step j the ant has already selected j-1 instructions • The jth instruction is selected probabilistically [Figure: instructions op1–op6 and a partially filled priority list with positions 1–6]
Ant Search Strategy • $\tau_{ij}(k)$: global heuristic (pheromone) for selecting instruction i at position j • $\eta_j(k)$: local heuristic, which can use different properties • Instruction mobility (IM) • Instruction depth (ID) • Latency weighted instruction depth (LWID) • Successor number (SN) • $\alpha$, $\beta$ control the influence of the global and local heuristics
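The selection formula itself did not survive on this slide. The sketch below assumes the standard ACO random-proportional rule, in which the chance of placing instruction i at position j is proportional to pheromone^alpha × local heuristic^beta over the instructions not yet selected. All names (`select_instruction`, `build_priority_list`, `tau`, `eta`) are illustrative, and the local heuristic is held static here for brevity.

```python
import random
from typing import Dict, List


def select_instruction(candidates: List[str], position: int,
                       tau: Dict[str, List[float]], eta: Dict[str, float],
                       alpha: float, beta: float,
                       rng: random.Random) -> str:
    """Pick one candidate with probability proportional to tau^alpha * eta^beta."""
    weights = [(tau[op][position] ** alpha) * (eta[op] ** beta)
               for op in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


def build_priority_list(ops: List[str], tau: Dict[str, List[float]],
                        eta: Dict[str, float], alpha: float = 1.0,
                        beta: float = 1.0, seed: int = 0) -> List[str]:
    """One ant fills the priority list one position at a time."""
    rng = random.Random(seed)
    remaining = list(ops)          # the ant's memory of what is still unselected
    order: List[str] = []
    for position in range(len(ops)):
        op = select_instruction(remaining, position, tau, eta, alpha, beta, rng)
        order.append(op)
        remaining.remove(op)
    return order


ops = ["op1", "op2", "op3"]
uniform_tau = {op: [1.0] * len(ops) for op in ops}
eta = {"op1": 3.0, "op2": 2.0, "op3": 1.0}   # hypothetical local heuristic values
order = build_priority_list(ops, uniform_tau, eta)
```

With uniform pheromones the choice is driven only by the local heuristic; as trails are reinforced, `tau` increasingly biases each position toward instructions that worked well there before.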
Pheromone Update • The constructed lists are evaluated with List Scheduling • Latency $L_h$ for the result from ant h • Evaporation – prevents stagnation and punishes “useless” trails • Reinforcement – rewards trails with better quality
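The update formula is not legible on this slide. A minimal sketch, assuming the classic Ant System rule in which every trail is multiplied by a persistence factor `rho` (evaporation) and each used trail receives a reward of `q / L_h` (reinforcement); `rho`, `q`, and the function name are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def update_pheromones(tau: Dict[str, List[float]],
                      solutions: List[Tuple[List[str], int]],
                      rho: float = 0.98, q: float = 1.0) -> None:
    """Evaporate all trails, then reward the trails used by each ant.

    solutions holds (priority_list, latency) pairs, one per ant.
    """
    # Evaporation: every trail decays, whether it was used or not.
    for trails in tau.values():
        for j in range(len(trails)):
            trails[j] *= rho
    # Reinforcement: shorter latency gives a larger reward.
    for order, latency in solutions:
        for position, op in enumerate(order):
            tau[op][position] += q / latency


tau = {"op1": [1.0, 1.0], "op2": [1.0, 1.0]}
update_pheromones(tau, [(["op2", "op1"], 4)], rho=0.5, q=1.0)
```

After the call, the two trails used by the ant (op2 at position 0, op1 at position 1) hold 0.75, while the unused trails have decayed to 0.5.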
Pheromone Update • Evaporation happens on all trails to avoid stagnation • Reward the used trails based on the solution’s quality [Figure: instructions op1–op6 and the priority list positions 1–6 they were assigned to]
Max-Min Ant System (MMAS) • Risks of Ant System optimization • Positive feedback • Dynamic range of pheromone trails can increase rapidly • Unused trails can be repeatedly punished, which reduces their likelihood even more • Premature convergence • MMAS is designed to address this problem • Built upon the original AS • Idea is to limit the pheromone trails within an evolving bound so that broader exploration is possible • Better balance between exploration and exploitation • Prevent premature convergence
Max-Min Ant System (MMAS) • Limit $\tau(t)$ within $\tau_{min}(t)$ and $\tau_{max}(t)$ • $S^{gb}$ is the best global solution found so far at t-1 • f(.) is the quality evaluation function, i.e. latency in our case • avg is the average size of decision choices • $P_{best} \in (0,1]$ is the controlling parameter • Conditional probability of $S^{gb}$ being selected when all trails in $S^{gb}$ have $\tau_{max}$ and all others have $\tau_{min}$ • Smaller $P_{best}$ → tighter range for more emphasis on exploration • When $P_{best} \to 0$, we set $\tau_{min} \to \tau_{max}$
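The bound formulas are missing from the extracted slide. The sketch below assumes the usual MMAS definitions from the literature: tau_max = 1 / ((1 - rho) · f(S_gb)) and tau_min = tau_max · (1 - P_best^(1/n)) / ((avg - 1) · P_best^(1/n)), with `rho` the persistence factor and `n` the number of decisions per solution. Whether the lecture used exactly this form is an assumption, and the function names are illustrative.

```python
from typing import Tuple


def mmas_bounds(best_latency: float, rho: float, p_best: float,
                n: int, avg: float) -> Tuple[float, float]:
    """Return (tau_min, tau_max) for the current best solution quality.

    Assumes avg > 1 and 0 < p_best <= 1.
    """
    tau_max = 1.0 / ((1.0 - rho) * best_latency)
    root = p_best ** (1.0 / n)
    tau_min = tau_max * (1.0 - root) / ((avg - 1.0) * root)
    # For very small p_best the formula can exceed tau_max;
    # in that case the range collapses to tau_min = tau_max.
    return min(tau_min, tau_max), tau_max


def clamp(value: float, lo: float, hi: float) -> float:
    """Keep a pheromone value inside [lo, hi]."""
    return max(lo, min(hi, value))


lo_wide, hi_wide = mmas_bounds(best_latency=10, rho=0.9, p_best=1.0, n=5, avg=3)
lo_tight, hi_tight = mmas_bounds(best_latency=10, rho=0.9, p_best=1e-9, n=5, avg=3)
```

With `p_best = 1.0` the lower bound is zero (no exploration guarantee); with a very small `p_best` both bounds coincide, so every trail carries the same pheromone and only the local heuristic differentiates choices.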
Other Algorithmic Refinements • Dynamically evolving local heuristics • Example: dynamically adjust Instruction Mobility • Benefit: reduce search space progressively • Taking advantage of topological sorting of the DFG when constructing the priority list • At each step ants select from the ready instructions instead of from all unscheduled instructions • Benefit: greatly reduce the search space
Benchmarks: ExpressDFG • A comprehensive benchmark for TCS/RCS • Classic samples and more modern cases • Comprehensive coverage • Problem sizes • Complexities • Applications • Downloadable from http://express.ece.ucsb.edu/benchmark/
RCS Experimental Results • Heterogeneous RCS – multiple types of resources (e.g. fast and normal multiplier) • ILP (optimal) using CPLEX • List scheduling • Instruction mobility (IM), instruction depth (ID), latency weighted instruction depth (LWID), successor number (SN) • Ant scheduling results using different local heuristics (averaged over 5 runs, each run 100 iterations with 5 ants)
RCS Experimental Results • Homogeneous RCS – all resources have unit delay • New benchmarks (compared to the last slide) are too large for ILP
MMAS RCS: Results • Consistently generates better results over all testing cases • Up to 23.8% better than list scheduler • Average 6.4%, and up to 15% better than force-directed scheduling • Quantitatively closer to known optimal solutions
MMAS TCS Formulation • Idea: Combine ACO and Force Directed Scheduling • Quick FDS review • Uniformly distribute the operations onto the available resources. • Operation probability • Distribution graph • Self force: changes on DG of scheduling an operation • Predecessor/successor force: implicit effects on DG • Schedule an operation to a step with the minimum force
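The operation probability and distribution graph from the FDS review can be sketched as below. It assumes each operation is equally likely to land on any step of its [ASAP, ALAP] window and that all listed operations share one resource type; the function name and the example windows are hypothetical.

```python
from typing import Dict, List, Tuple


def distribution_graph(windows: Dict[str, Tuple[int, int]],
                       num_steps: int) -> List[float]:
    """Expected number of operations per time step for one resource type.

    windows maps each operation to its (ASAP, ALAP) step, inclusive.
    """
    dg = [0.0] * num_steps
    for asap, alap in windows.values():
        # Uniform operation probability over the mobility window.
        p = 1.0 / (alap - asap + 1)
        for step in range(asap, alap + 1):
            dg[step] += p
    return dg


# Three multiplications with different mobility windows over 3 time steps.
windows = {"m1": (0, 0), "m2": (0, 1), "m3": (1, 2)}
dg = distribution_graph(windows, num_steps=3)
```

Step 0 is the most contended here, so a force-based scheduler would prefer to push the mobile operation `m2` toward step 1.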
ACO Formulation for TCS • Initialize pheromone model • While (termination not satisfied) • Create ants • Each ant finds a solution • Evaluate solutions and update pheromone • Report the best result found • Trails $\tau_{ij}$ indicate the favorableness of assigning instruction i to position j [Figure: example data-flow graph with operations v1–v11 between source S and sink E, drawn over time steps 1–4]
ACO Formulation for TCS • Initialize pheromone model • While (termination not satisfied) • Create ants • Each ant finds a solution • Evaluate solutions and update pheromone • Report the best result found • Select operation $op_h$ probabilistically • Select its time step as follows: • Global heuristic: tied to the search experience • Local heuristic: use the inverse of the distribution graph, $1/q_k(j)$ • Here $\alpha$ and $\beta$ are constants
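The time-step selection rule is not legible on this slide. A minimal sketch, assuming the probability of placing an operation at step j is proportional to pheromone^alpha × (1 / distribution graph value)^beta over the steps in its mobility window; the function name and all numeric values are illustrative.

```python
from typing import Dict, List, Tuple


def timestep_probabilities(tau_row: List[float], dg: List[float],
                           window: Tuple[int, int],
                           alpha: float = 1.0,
                           beta: float = 1.0) -> Dict[int, float]:
    """Probability of each feasible time step for one operation.

    tau_row[j] is the pheromone for placing the operation at step j,
    dg[j] is the distribution graph value of its resource type at step j
    (positive inside the window, since the operation itself contributes).
    """
    asap, alap = window
    weights = {j: (tau_row[j] ** alpha) * ((1.0 / dg[j]) ** beta)
               for j in range(asap, alap + 1)}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


# Equal pheromones, so the less crowded step 1 is preferred over step 0.
probs = timestep_probabilities([1.0, 1.0, 1.0], [2.0, 1.0, 0.5], window=(0, 1))
```

Using the inverse of the distribution graph steers ants away from crowded steps, which tends to flatten resource usage in the same spirit as FDS.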
ACO Formulation for TCS • Initialize pheromone model • While (termination not satisfied) • Create ants • Each ant finds a solution • Evaluate solutions and update pheromone • Report the best result found • Pheromone update: reward good partial solutions based on solution quality • Pheromone evaporation
MMAS TCS: Results • MMAS TCS is more stable than FDS, especially when the solution is highly unconstrained • 258 out of 263 test cases are equal to or better than the FDS results • 16.4% fewer resources
Design Space Exploration • DSE challenges to the designer • Ever-increasing design options • Closely related to NP-hard problems • Resource allocation • Scheduling • Conflicting objectives (speed, cost, power, …) • Increasing time-to-market pressure
Our Focus: Timing/Cost • Timing/Cost Tradeoffs • Known application • Known resource types • Known operation/resource mapping • Question: find the optimal timing/cost tradeoffs • Most commonly faced problem • Fundamental to other design considerations
Common Strategies • Usually done in an ad-hoc way • Experience dependent • Or by scanning the design space with Resource Constrained (RCS) or Time Constrained (TCS) scheduling • What’s the problem? • RCS and TCS are dual problems • Can we effectively use information from one to guide the other?
Key Observations • A feasible configuration C covers a beam starting from ($t_{min}$, C) • $t_{min}$ is the RCS result for C
Key Observations • A feasible configuration C covers a beam starting from (tmin, C) • Optimal tradeoff curve L is monotonically non-increasing as deadline increases
Theorem • If C is the optimal TCS result at time t1, then the RCS result t2 of C satisfies t2 <= t1. • More importantly, there is no configuration C′ with a smaller cost that can produce an execution time within [t2, t1].
What does it give us? • It implies that we can construct L: • Start from the rightmost t • Find the TCS solution C • Push it leftwards using the RCS solution of C • Do this iteratively (alternating between TCS and RCS)
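The iteration above can be sketched as follows. The callables `tcs` and `rcs` stand in for whatever time-constrained and resource-constrained schedulers are available; their signatures, the integer deadlines, and the toy cost model are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Tuple


def explore_tradeoff(tcs: Callable[[int], int],
                     rcs: Callable[[int], int],
                     t_left: int, t_right: int) -> List[Tuple[int, int]]:
    """Build the timing/cost curve by alternating TCS and RCS.

    tcs(deadline) returns a configuration (here just a unit count),
    rcs(config) returns the best latency that configuration achieves.
    """
    curve: List[Tuple[int, int]] = []
    t = t_right
    while t >= t_left:
        config = tcs(t)
        # By the theorem the RCS result is <= t for exact schedulers;
        # min() guards against a heuristic RCS doing worse.
        t_achieved = min(rcs(config), t)
        curve.append((t_achieved, config))
        # No cheaper configuration exists in [t_achieved, t]; skip past it.
        t = t_achieved - 1
    return curve


# Toy model: 6 independent unit-latency operations on identical units.
WORK = 6
curve = explore_tradeoff(tcs=lambda t: math.ceil(WORK / t),
                         rcs=lambda c: math.ceil(WORK / c),
                         t_left=1, t_right=6)
```

In this toy case the loop makes four TCS calls instead of the six that scanning every deadline from 6 down to 1 would need, because each RCS step jumps over deadlines that cannot yield a cheaper configuration.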
Experiments • Three DSE approaches • FDS: Exhaustively scanning for TCS • MMAS-TCS: Exhaustively scanning for TCS • MMAS-D: Proposed method leveraging duality * Scanning means that we perform TCS at each deadline of interest
Real Design Complications • Heterogeneous mapping • One operation has many implementations • Different bit-width, e.g. 32-bit multiplier good for mul(24) and mul(32) • Different area and delay • Real technology library extremely sophisticated • Hard to estimate final timing and total area • Sharing depends on the cost of multiplexers • Downstream tools may not generate what we expect • Resource sharing, register sharing • Downstream tools break components’ boundaries • Logic synthesis, placement and routing