This paper explores advanced scheduling approaches for soft real-time multimedia applications utilizing Simultaneous Multithreading (SMT) processors. It discusses resource sharing algorithms aimed at maximizing throughput and ensuring performance guarantees for multiple threads. Co-scheduling algorithms are analyzed, emphasizing partitioned and global scheduling methods while considering symbiosis factors among tasks. The paper presents experimental results comparing various scheduling strategies, ultimately identifying the optimal algorithm that enhances task schedulability by prioritizing high utilization and symbiosis awareness.
Soft Real-Time Scheduling on Simultaneous Multithreaded Processors
Outline • Introduction • Resource Sharing Algorithms • Co-Scheduling Algorithms • Experimental Results • Conclusions
What is SMT? • Simultaneous multithreading (SMT) • combines wide issue superscalar with multithreading, • issues instructions from several threads simultaneously.
SMT vs. CMP • [Figure: side-by-side diagrams labeled SMT and CMP]
Introduction • SMT (Simultaneous Multithreading) improves processor throughput by processing instructions from multiple threads each cycle. • This paper concerns the use of SMT processors for soft real-time multimedia applications. • Scheduling involves two decisions: • 1. which tasks to co-schedule • 2. how to partition processor resources among co-scheduled tasks
Resource Sharing Algorithms • Two classes: • Maximize overall throughput • Guarantee a level of performance for all threads
Resource Sharing Algorithms • Throughput-driven resource sharing • Resource sharing with performance guarantees • Resources controlled by thread-specific resource sharing algorithms
Co-Scheduling Algorithms • Design space • Prediction • Partition algorithms • Global scheduling algorithms
Design space for co-scheduling algorithms • Partition vs. global scheduling algorithms • Partitioning approach provides for admission control • Symbiosis-aware vs. symbiosis-oblivious algorithms • Symbiosis factor = $\sum_{i=1}^{N} \dfrac{\text{realized IPC of job}_i}{\text{single-threaded IPC of job}_i}$
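The symbiosis factor defined on this slide can be computed directly from measured IPC values. The following is a minimal sketch, not the authors' code; the function name `symbiosis_factor` and the IPC numbers are made up for illustration.

```python
def symbiosis_factor(realized_ipc, single_threaded_ipc):
    """Sum over co-scheduled jobs of realized IPC / single-threaded IPC."""
    if len(realized_ipc) != len(single_threaded_ipc):
        raise ValueError("need one single-threaded IPC per job")
    return sum(r / s for r, s in zip(realized_ipc, single_threaded_ipc))


# Two jobs co-scheduled on the same SMT processor (hypothetical IPC values).
realized = [1.2, 0.9]   # IPC each job achieves while co-scheduled
alone = [2.0, 1.5]      # IPC each job achieves in single-threaded mode
print(symbiosis_factor(realized, alone))
```

Under this definition, each job here runs at about 60% of its single-threaded speed, so the pair together makes more progress per cycle than either job would alone, which is the case a symbiosis-aware scheduler tries to find.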
[Figure: two timelines (0 to 200) comparing an EDF schedule with a symbiosis-aware schedule; the co-scheduled task pairs shown are τ1 & τ3, τ2 & τ4 in one schedule and τ1 & τ2, τ3 & τ4 in the other]
Predicting execution time, utilization, and symbiosis • To predict execution time/utilization, we need to predict both the IPC and the number of instructions • Take average job IPC and average instruction count as the prediction • To predict symbiosis, task IPCs in single-threaded mode are also required.
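The averaging-based prediction above can be sketched as follows. This is an illustration under stated assumptions: the clock frequency parameter `clock_hz`, the function name, and the sample history are hypothetical, and converting cycles to seconds via the clock rate is an assumption the slides do not spell out.

```python
from statistics import mean


def predict_utilization(past_ipcs, past_instr_counts, period_s, clock_hz):
    """Predict a task's execution time and utilization from job history.

    Uses the average job IPC and average instruction count as predictions.
    """
    avg_ipc = mean(past_ipcs)
    avg_instr = mean(past_instr_counts)
    cycles = avg_instr / avg_ipc          # predicted cycles per job
    exec_time_s = cycles / clock_hz       # assumed fixed clock frequency
    return exec_time_s, exec_time_s / period_s


# Hypothetical history: two past jobs of one task, 10 ms period, 1 GHz clock.
exec_time, util = predict_utilization(
    past_ipcs=[2.0, 2.0],
    past_instr_counts=[4_000_000, 6_000_000],
    period_s=0.010,
    clock_hz=1_000_000_000,
)
print(exec_time, util)
```

On an SMT processor the IPC history would need to be collected under the same co-scheduling and resource-sharing conditions the prediction is meant for, since realized IPC depends on the co-runner.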
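Partitioning algorithms (PART) • 1: PART-NOSYM-DYN-b (symbiosis oblivious, dynamic, base) • 2: PART-NOSYM-DYN-e (symbiosis oblivious, dynamic, enhanced) • 3: PART-NOSYM-STAT (symbiosis oblivious, static) • 4: PART-SYM-DYN-b (symbiosis aware, dynamic, base) • 5: PART-SYM-DYN-e (symbiosis aware, dynamic, enhanced)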
PART-NOSYM-DYN-b • First-fit-decreasing-utilization (FFDU) & EDF admission test • PART-NOSYM-DYN-e • Modifies the admission test • Simulates the schedule for a hyperperiod • PART-NOSYM-STAT • FFDU & EDF admission test • Does not need additional approximations for predicting execution time
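First-fit-decreasing-utilization with an EDF admission test can be sketched as below. This is a generic illustration, not the paper's implementation: it assumes each partition corresponds to one hardware context, that tasks have implicit deadlines (so the EDF test is total utilization at most 1 per partition), and that the input utilizations are already the predicted values. The name `ffdu_partition` is hypothetical.

```python
def ffdu_partition(utilizations, num_contexts):
    """Assign tasks to partitions, first-fit in decreasing-utilization order.

    Returns a list of partitions (lists of task indices), or None if some
    task fails the EDF admission test on every partition.
    """
    partitions = [[] for _ in range(num_contexts)]
    loads = [0.0] * num_contexts
    order = sorted(range(len(utilizations)),
                   key=lambda i: utilizations[i], reverse=True)
    for i in order:
        for p in range(num_contexts):
            # EDF admission test (assumed): partition utilization stays <= 1.
            if loads[p] + utilizations[i] <= 1.0 + 1e-9:
                partitions[p].append(i)
                loads[p] += utilizations[i]
                break
        else:
            return None  # task rejected by admission control
    return partitions


print(ffdu_partition([0.5, 0.6, 0.4, 0.3], num_contexts=2))
print(ffdu_partition([0.9, 0.9, 0.9], num_contexts=2))
```

Returning `None` on the first task that does not fit is what gives a partitioning approach its admission control: the task set is rejected before anything runs.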
PART-SYM-DYN-b • Maximizes average symbiosis among tasks in different partitions • PART-SYM-DYN-e • Corrects for the utilization approximation in PART-SYM-DYN-b
Global scheduling algorithms (GLOB) • 6: GLOB-NOSYM-PLAIN (symbiosis oblivious, PLAIN, dynamic) • 7: GLOB-NOSYM-US (symbiosis oblivious, US, dynamic) • 8: GLOB-SYM-PLAIN (symbiosis aware, PLAIN, dynamic) • 9: GLOB-SYM-US (symbiosis aware, US, dynamic)
GLOB-NOSYM-PLAIN • EDF • GLOB-NOSYM-US • Gives the highest priority to high-utilization tasks in the task set • GLOB-SYM-PLAIN • Slightly more complex than EDF • GLOB-SYM-US • Defaults to GLOB-NOSYM-US if Ui > N/(2N-1) for Ti • Defaults to GLOB-SYM-PLAIN otherwise
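The GLOB-SYM-US selection rule can be illustrated with a small sketch. Two assumptions are made here that the slide does not state: N is taken to be the number of hardware contexts, and the rule is read as applying per task. The function name `glob_sym_us_policy` is hypothetical.

```python
def glob_sym_us_policy(utilizations, num_contexts):
    """For each task, name the policy GLOB-SYM-US falls back to.

    A task whose utilization exceeds N / (2N - 1) is handled as in
    GLOB-NOSYM-US; every other task is handled as in GLOB-SYM-PLAIN.
    """
    threshold = num_contexts / (2 * num_contexts - 1)
    return ["GLOB-NOSYM-US" if u > threshold else "GLOB-SYM-PLAIN"
            for u in utilizations]


# With N = 2 the threshold is 2/3 (hypothetical task utilizations).
print(glob_sym_us_policy([0.7, 0.5], num_contexts=2))
# prints ['GLOB-NOSYM-US', 'GLOB-SYM-PLAIN']
```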
Experimental Results - synthetic • metric : success ratio = the percentage of all generated task-sets successfully scheduled by an algorithm • Best algorithm : GLOB-SYM-US
Experimental Results - real • metric : critical serial utilization (CSU) • Serial utilization (S) = ΣCi/Pi • Ci = computation time of τi in single-threaded mode • Pi = period of τi • Best algorithm : GLOB-SYM-US
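Serial utilization as defined above is a plain sum over the task set. A minimal sketch, with a hypothetical function name and made-up task parameters:

```python
def serial_utilization(tasks):
    """S = sum of Ci / Pi, where Ci is the single-threaded computation
    time of task i and Pi is its period (same time unit for both)."""
    return sum(c / p for c, p in tasks)


# Two hypothetical tasks given as (Ci, Pi) pairs in milliseconds.
print(serial_utilization([(10, 40), (20, 80)]))  # prints 0.5
```

Because Ci is measured in single-threaded mode, S can exceed 1 for a task set that is still schedulable on an SMT processor.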
Conclusions • In terms of schedulability, the best algorithm uses global scheduling, exploits symbiosis, prioritizes high-utilization tasks, and uses dynamic resource sharing. • A partitioning algorithm that uses static resource sharing • Provides strict admission control • Requires less profiling
Conclusions • Earliest-deadline-first global algorithm • Does not provide strict admission control • Requires no profiling • Individual design decisions • Dynamic resource sharing • Partitioning algorithm • Symbiosis-awareness