
Resource Mapping and Scheduling for Heterogeneous Network Processor Systems

Resource Mapping and Scheduling for Heterogeneous Network Processor Systems. Liang Yang, Tushar Gohad, Pavel Ghosh, Devesh Sinha, Arunabha Sen and Andrea Richa. Arizona State University. Agenda. Network Processor (NP) System Resource Mapping and Scheduling Problem Heuristic Approach



Presentation Transcript


  1. Resource Mapping and Scheduling for Heterogeneous Network Processor Systems Liang Yang, Tushar Gohad, Pavel Ghosh, Devesh Sinha, Arunabha Sen and Andrea Richa Arizona State University

  2. Agenda • Network Processor (NP) System • Resource Mapping and Scheduling Problem • Heuristic Approach • Linear Programming and Randomized Rounding • Resource Contention Issue • Detection and Elimination • Experimental Results • Summary and Future Work

  3. Network Processor Systems • Programmable devices designed to process packets at wire speed • Non-homogeneous real-time systems • Comprise a mix of ASICs, programmable processors and on-chip interconnects • Optimized to support multiple applications such as IPv4, Diffserv, etc.

  4. Resource Mapping and Scheduling Problem in NP • Given a set APP = {APP1, APP2, …, APPk} of applications, each specified by a DAG, where each application APPj has a set of constraints (e.g. timing constraints, area constraints, etc.), find the mapping that minimizes the system cost in dollars while satisfying all the design constraints • Assumes only one application is active at any given time

  5. System Specification • Possible Task-to-Resource Mappings • Several algorithms may be available for execution of a task • Associated with each resource are cost and area parameters • There may be multiple instances of a resource

  6. Integer Linear Programming (ILP) formulation • Objective: • Find a task-to-resource mapping with minimum cost • Constraints: • Board area constraint • Timing constraint • Unique task constraint • Exclusive resource constraint • Communication delay constraint • Task-to-resource mapping constraint • Task dependency constraint • Example design problem with 3 flows: • 800 variables • 2000 constraints
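To make the objective and two of the constraints concrete, here is a minimal Python sketch that finds a minimum-cost mapping by brute-force enumeration instead of an ILP solver. Everything in it is an assumption for illustration: the resource names, costs, areas, execution times and the three-task chain are hypothetical, and only the board area and timing constraints are modelled (communication delay, multiple resource instances and general DAG dependencies are left out).

```python
from itertools import product

# Hypothetical resource catalogue: name -> (cost in dollars, board area)
RESOURCES = {"cpu": (10, 4), "asic": (25, 2), "dsp": (15, 3)}

# Hypothetical execution times (ns) for each allowed task-to-resource mapping
EXEC_TIME = {
    "parse":   {"cpu": 30, "asic": 10},
    "lookup":  {"cpu": 40, "asic": 15, "dsp": 25},
    "forward": {"cpu": 20, "dsp": 12},
}

def cheapest_mapping(deadline, max_area):
    """Enumerate all mappings of a three-task chain; return (cost, mapping) or None."""
    tasks = list(EXEC_TIME)
    best = None
    for choice in product(*(EXEC_TIME[t] for t in tasks)):
        used = set(choice)  # a resource is paid for once, however many tasks share it
        cost = sum(RESOURCES[r][0] for r in used)
        area = sum(RESOURCES[r][1] for r in used)
        # Tasks form a chain, so latency is the sum of the execution times
        latency = sum(EXEC_TIME[t][r] for t, r in zip(tasks, choice))
        if area <= max_area and latency <= deadline:
            if best is None or cost < best[0]:
                best = (cost, dict(zip(tasks, choice)))
    return best
```

Enumeration only works for toy instances; with the 800 variables quoted on the slide, an ILP solver or the heuristic that follows is needed.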

  7. Heuristic Approach: Randomized Rounding • Based on the Linear Programming solution • Traditional search heuristics, e.g. Genetic Algorithms and Simulated Annealing, require a set of feasible solutions as a starting point • An initial feasible set is hard to obtain because of the conflicting constraints (area, time) in the problem

  8. Randomized Rounding • Relax the integrality constraints of the ILP and solve the LP • Fractional values of the binary variables are used as probabilities for rounding them to either 0 or 1 • Variable Randomized Rounding • Randomly select variables from a set of randomly chosen constraints • Round the selected variables • Iterative rounding in case of constraint violation
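The basic rounding idea can be sketched in a few lines of Python. This is a simplified illustration, not the variant described on the slide: it rounds all variables at once and retries the whole draw on a constraint violation, whereas the slide selects variables from randomly chosen constraints and rounds iteratively. The function name, the `feasible` callback and the sample LP values are hypothetical.

```python
import random

def randomized_round(fractional, feasible, max_tries=1000, seed=0):
    """Round fractional LP values in [0, 1] to 0/1, using each value as P(round up).

    `fractional` maps a variable name to its LP value; `feasible` is a
    caller-supplied check of the original integer constraints.
    Returns a 0/1 assignment, or None if no feasible rounding was found.
    """
    rng = random.Random(seed)
    for _ in range(max_tries):
        rounded = {var: 1 if rng.random() < p else 0
                   for var, p in fractional.items()}
        if feasible(rounded):
            return rounded
    return None

# Hypothetical LP solution: x[(task, resource)] is the fractional mapping value
lp_solution = {
    ("t1", "cpu"): 0.7, ("t1", "asic"): 0.3,
    ("t2", "cpu"): 0.5, ("t2", "asic"): 0.5,
}

def each_task_mapped_once(x):
    """Unique task constraint: every task is mapped to exactly one resource."""
    tasks = {task for task, _ in x}
    return all(sum(v for (t, _), v in x.items() if t == task) == 1
               for task in tasks)

mapping = randomized_round(lp_solution, each_task_mapped_once)
```

Variables that the LP already set to 0 or 1 keep their value under this rule, since the draw is compared against a probability of exactly 0 or 1.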

  9. Randomized Rounding (cont.) • Fixing Variables • Reduces the number of variables to be rounded • Fix the variables that take integer values after solving the LP • Iteratively re-solve the LP until the number of integer variables stops increasing • Grouping Variables • Assign priority based on the variable's group affiliation

  10. Randomized Rounding (cont.) • Rollback Point Selection • Roll back only to the last group where a constraint violation occurred • Rounding Step Size • Round one variable or several at a time?

  11. Randomized Rounding Results • Near-optimal solution in a fraction of ILP solution time

  12. Exploration of Solution Space • If the deadline constraint is too strict, the ILP may have no feasible solution for the existing set of resources. • If the deadline is too relaxed, a feasible solution is obtained, but with an increased chance of resource contention. • The solution space is explored using binary search to find a least-cost feasible solution without any resource contention.

  13. Improvement of Solution • A relaxed deadline for packet processing helps to reduce the system cost in dollars. • Packet latency increases, while the line speed is still satisfied. • This approach allows multiple packets to be inside the system simultaneously (packet-level parallelism). • Resource contention may occur if more than one packet tries to access the same resource at the same instant for two different tasks.

  14. Resource Contention Example: • Line rate = 10 Gbps, packet size = 64 bytes • No packet gap • A packet arrives about every 51 ns (64 × 8 bits / 10 Gbps = 51.2 ns)
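The 51 ns figure follows directly from the line rate and packet size. A small Python sketch of the arithmetic (the function name and the optional `gap_bytes` parameter are hypothetical, added only to show where an inter-packet gap would enter):

```python
def packet_interarrival_ns(line_rate_gbps, packet_bytes, gap_bytes=0):
    """Time between back-to-back packet arrivals, in nanoseconds.

    Bits divided by Gbit/s gives nanoseconds directly, because
    1 Gbit/s is exactly 1 bit per nanosecond.
    """
    bits = (packet_bytes + gap_bytes) * 8
    return bits / line_rate_gbps

interval = packet_interarrival_ns(10, 64)  # 512 bits at 10 bits/ns
```

With no packet gap this gives 51.2 ns, which the slide rounds to 51 ns.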

  15. Resource Contention Detection • Packet Flow Graph (PFG) • A visual depiction of the flow of packets through the various resources inside the NP system • G(V, E): V is the set of resources allocated by the ILP, with additional entry and exit nodes, s and t, respectively. • Edge e = (u, v) ∈ E if resources u and v are allocated sequentially. • A weight w(e) is associated with each edge e: w(e) = (x(e), y(e)), where x(e) is the allocation sequence of the resource and y(e) is the execution time on that sequence.

  16. Resource Contention Detection • Resource Cycle Time • Calculated in the PFG • Defined as the maximum time span for which a resource is busy executing the set of tasks for a packet. • A resource is not available until it finishes all the tasks scheduled on it for a packet • Maximum Cycle Time: • Defined as the maximum of all resource cycle times. • Resource contention is detected if the maximum cycle time is greater than the packet arrival interval (the time between consecutive packet arrivals). • A Gantt chart is used to detect resource contention among multiple paths in a task graph
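The detection rule can be sketched in Python as follows. This is one reading of the slide, under stated assumptions: a packet's schedule is given as hypothetical `(resource, start, end)` tuples, and a resource's cycle time is taken as the span from the start of its first task to the end of its last task for that packet, since the resource is held until all of its tasks finish. The sketch covers a single path only, not the Gantt-chart check across multiple paths.

```python
def resource_cycle_times(schedule):
    """Map each resource to the span (ns) it is held while serving one packet."""
    spans = {}
    for resource, start, end in schedule:
        first, last = spans.get(resource, (start, end))
        spans[resource] = (min(first, start), max(last, end))
    return {r: last - first for r, (first, last) in spans.items()}

def has_contention(schedule, arrival_interval):
    """Contention if the maximum cycle time exceeds the packet arrival interval."""
    return max(resource_cycle_times(schedule).values()) > arrival_interval

# Hypothetical schedule: the cpu is used for the first and the last task
shared = [("cpu", 0, 20), ("asic", 20, 35), ("cpu", 35, 60)]
# Same timing, but the last task is moved to a separate resource
split = [("cpu", 0, 20), ("asic", 20, 35), ("dsp", 35, 60)]
```

With a 51.2 ns arrival interval, `shared` holds the cpu for 60 ns and is flagged, while `split` has a maximum cycle time of 25 ns and is not.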

  17. Resource Contention (Single Path) • Example:

  18. Resource Contention (Multiple Paths)

  19. Resource Contention Elimination • A binary search approach speeds up the iterative exploration of the solution space. • The solution found by the ILP is checked for resource contention. • If there is no resource contention, no more work is needed. • Otherwise, search iteratively for a least-cost feasible solution.

  20. Resource Contention Elimination • d is the arrival rate of the packets and l is the maximum diameter of the flow graphs
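The binary search over deadlines might look like the Python sketch below. It rests on assumptions the slides do not spell out: deadlines are integers in a caller-supplied range, tightening the deadline eventually makes the problem infeasible, relaxing it eventually causes contention, and a more relaxed deadline never costs more. `solve` is a hypothetical callback standing in for the ILP solve plus the contention check; the search bounds derived from d and l are not reproduced here.

```python
def search_deadline(solve, lo, hi):
    """Binary-search integer deadlines in [lo, hi].

    `solve(deadline)` returns None if the problem is infeasible, otherwise a
    (cost, contention) pair. Returns (deadline, cost) for the most relaxed
    contention-free deadline found, or None if there is none.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        result = solve(mid)
        if result is None:
            lo = mid + 1   # too strict: relax the deadline
        elif result[1]:
            hi = mid - 1   # contention: tighten the deadline
        else:
            best = (mid, result[0])
            lo = mid + 1   # contention-free: try a more relaxed, cheaper deadline
    return best

def toy_solver(deadline):
    """Hypothetical stand-in: infeasible below 40, contention above 70."""
    if deadline < 40:
        return None
    return (200 - deadline, deadline > 70)

result = search_deadline(toy_solver, 0, 100)
```

Each probe costs one ILP solve, so the search needs only a logarithmic number of solves in the size of the deadline range.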

  21. Experimental Settings • Codesign method applied to a Packet Processing System similar to the Intel IXP2400 network processor • Resource set derived from Intel IXP2400 architecture • Application set derived from the standard benchmarking applications defined by the Network Processing Forum, for which there is a mapping available from Intel • Compared performance of the mapping generated by our approach with the standard mapping specified by Intel as part of the IXA Application Framework

  22. Performance Metrics • End-to-end Packet Latency: the time interval starting when the first bit of a packet enters the input port and ending when the first bit of the packet reaches the output port • Throughput: the number of data bits transferred per unit time, measured at 0% packet loss while varying packet size • Resource Utilization: the ratio of the time a resource was active to the total measurement time
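Two of these metrics reduce to simple ratios, sketched here in Python. The function names, units and sample numbers are hypothetical and do not come from the experiments.

```python
def throughput_gbps(bits_transferred, elapsed_ns):
    """Throughput in Gbit/s: bits per nanosecond equals Gbit per second."""
    return bits_transferred / elapsed_ns

def utilization(busy_intervals, total_time):
    """Fraction of the measurement window during which a resource was active.

    `busy_intervals` is a list of non-overlapping (start, end) pairs.
    """
    return sum(end - start for start, end in busy_intervals) / total_time

# Hypothetical measurement: a resource busy for 20 ns and 25 ns in a 100 ns window
cpu_utilization = utilization([(0, 20), (35, 60)], 100)
```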

  23. Input Task Graphs

  24. Experimental Parameters • Input:

  25. Experimental Results • Output:

  26. Experimental Results

  27. Experimental Results

  28. Conclusion and Future Work • Codesign framework for packet processing systems (PPSs) that considers multiple flows and real-time constraints • The iterative improvement scheme introduces packet-level parallelism into the system • For the task graphs of the benchmark applications, the method produces a solution in a short time and shows performance metrics comparable to existing PPSs • The framework can be extended with: • An object-oriented or modeling language for specification • Effects of caching and multithreading • Dynamic analysis for workload characterization

  29. Thank You • Questions?
