This research explores the challenges of managing continuous media data, focusing on optimizing cost, latency, and throughput. The study scrutinizes conventional wisdom regarding disk latency, throughput, and utilization, particularly in the context of large-scale media applications. Addressing the requirements for just-in-time data delivery to simultaneous users, it highlights innovative solutions such as the "BubbleUp" scheduling strategy to improve performance. By analyzing various disk scheduling policies and their impact on memory utilization, the work proposes effective strategies for efficient media data management.
On Managing Continuous Media Data Edward Chang Hector Garcia-Molina Stanford University
Challenges • Large Volume of Data • MPEG2 100 Minute Movie: 3-4 GBytes • Large Data Transfer Rate • MPEG2: 4 to 6 Mbps • HDTV: 19.2 Mbps • Just-in-Time Data Requirement • Simultaneous Users
...Challenges • Traditional Optimization Objectives: • Maximizing Throughput! • Maximizing Throughput!! • Maximizing Throughput!!! • How about Cost? • How about Initial Latency?
Related Work • IBM T.J. Watson Labs. (P. Yu) • USC (S. Ghandeharizadeh) • UCLA (R. Muntz) • UBC (Raymond Ng) • Bell Labs. (B. Ozden) • etc.
Outline • Server (Single Disk) • Revisiting Conventional Wisdom • Minimizing Cost • Minimizing Initial Latency • Server (Parallel Disks) • Balancing Workload • Minimizing Cost & Initial Latency • Client • Handling VBR • Supporting VCR-like Functions
Conventional Wisdom (for Single Disk) • Reducing Disk Latency leads to Better Disk Utilization • Reducing Disk Latency leads to Higher Throughput • Increasing Disk Utilization leads to Improved Cost Effectiveness
Is Conventional Wisdom Right? • Does Reducing Disk Latency lead to Better Disk Utilization? • Does Reducing Disk Latency lead to Higher Throughput? • Does Increasing Disk Utilization lead to Improved Cost Effectiveness?
[Figure: per-request memory use over a service cycle; the buffer fills at rate TR − DR during each transfer and drains at rate DR between IOs, peaking at S] • Tseek: Disk Latency • TR: Disk Transfer Rate • DR: Display Rate • S: Segment Size (Peak Memory Use per Request) • T: Service Cycle Time
S = DR × T • T = N × (Tseek + S/TR)
Disk Utilization • S = (N × TR × DR × Tseek) / (TR − N × DR) • S is directly proportional to Tseek • Dutil = (S/TR) / (S/TR + Tseek) = N × DR / TR • Dutil is Constant!
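The constancy claim is easy to check numerically. A minimal sketch, with TR, DR, and N as assumed illustrative values (not the paper's case study):

```python
# Sketch: check that disk utilization is independent of seek time.
# TR, DR, N are assumed illustrative values (bytes/s, bytes/s, streams).
def segment_size(N, TR, DR, Tseek):
    # S = N * TR * DR * Tseek / (TR - N * DR)
    return N * TR * DR * Tseek / (TR - N * DR)

def disk_utilization(N, TR, DR, Tseek):
    # Dutil = (S/TR) / (S/TR + Tseek)
    S = segment_size(N, TR, DR, Tseek)
    return (S / TR) / (S / TR + Tseek)

TR, DR, N = 80e6, 5e6, 10
for Tseek in (0.005, 0.010, 0.020):
    # utilization stays at N*DR/TR = 0.625 regardless of Tseek
    print(disk_utilization(N, TR, DR, Tseek))
```

Substituting S into the utilization formula cancels Tseek entirely, leaving N × DR / TR, which is why the printed value never moves as the seek time quadruples.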
Is Conventional Wisdom Right? • Does Reducing Disk Latency lead to Better Disk Utilization? NO! • Does Reducing Disk Latency lead to Higher Throughput? • Does Increasing Disk Utilization lead to Improved Cost Effectiveness?
What Affects Throughput? • Disk Utilization? • Disk Latency? • Memory Utilization?
Memory Requirement • We Examine Two Disk Scheduling Policies’ Memory Requirement • Sweep (Elevator Policy): Enjoys the Minimum Seek Overhead • Fixed-Stretch: Suffers from High Seek Overhead
Per User Peak Memory Use • S = (N × TR × DR × Tseek) / (TR − N × DR)
Sweep (Elevator) • Disk Latency: Minimum • IO Time Variability: Very High [Figure: IOs for streams A and B (B1, A1, A2, B2) ordered by the sweep, not by stream]
Sweep (Elevator) • Memory Sharing: Poor • Total Memory Requirement: 2 * N * Ssweep
Fixed-Stretch • Disk Latency: High (because of the Stretch) • IO Variability: None (because it is Fixed) [Figure: evenly spaced IO slots a, b, a, b, a]
Fixed-Stretch • Memory Sharing: Good • Total Memory Requirement: 1/2 * N * Sfs
Throughput* • Sweep: total memory 2 × N × Ssweep; with 40 MBytes available, N = 40 • Fixed-Stretch: total memory 1/2 × N × Sfs; with 40 MBytes available, N = 42 • Fixed-Stretch achieves Higher Throughput (* Based on a realistic case study using Seagate disks)
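The comparison can be reproduced in miniature. A sketch with assumed disk parameters (not the Seagate figures): Fixed-Stretch budgets a larger worst-case seek per IO, but needs only a 1/2 memory factor versus Sweep's 2.

```python
# Sketch with assumed parameters (bytes/s, bytes, s), not the Seagate
# case study: Sweep needs 2*N*S of memory, Fixed-Stretch only 0.5*N*S,
# but Fixed-Stretch must budget a larger worst-case seek per IO.
def segment_size(N, TR, DR, Tseek):
    return N * TR * DR * Tseek / (TR - N * DR)

def max_streams(memory, factor, TR, DR, Tseek):
    # largest N whose total memory need factor * N * S(N) fits
    N = 0
    while (N + 1) * DR < TR:
        if factor * (N + 1) * segment_size(N + 1, TR, DR, Tseek) > memory:
            break
        N += 1
    return N

TR, DR, memory = 10e6, 0.6e6, 40e6
n_sweep = max_streams(memory, 2.0, TR, DR, Tseek=0.015)  # average seek
n_fs = max_streams(memory, 0.5, TR, DR, Tseek=0.025)     # worst-case seek
print(n_sweep, n_fs)  # Fixed-Stretch admits more streams
```

Even with these made-up numbers the same effect appears: the 4× smaller memory factor more than compensates for the higher per-IO seek overhead.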
What Affects Throughput? • Disk Utilization? • Disk Latency? • Memory Utilization!
Is Conventional Wisdom Right? • Does Reducing Disk Latency lead to Better Disk Utilization? NO! • Does Reducing Disk Latency lead to Higher Throughput? NO! • Does Increasing Disk Utilization lead to Improved Cost Effectiveness?
Per-Stream Cost [Plot: cost per stream vs. number of users, decomposed into a memory cost component and a disk cost component]
Per-Stream Memory Cost • Cm × S = (Cm × N × TR × DR × Tseek) / (TR − N × DR)
Example • Disk Cost: $200 per unit • Memory Cost: $5 per MByte • Supporting N = 40 Requires 60 MBytes of Memory • $200 + $300 = $500 • Supporting N = 50 Requires 160 MBytes of Memory • $200 + $800 = $1,000 • For the same $1,000, it is better to buy 2 Disks and 120 MBytes to support N = 80 Users! • Memory Use is Critical
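The slide's arithmetic, spelled out (the prices and memory requirements are the slide's own figures):

```python
# The slide's cost arithmetic: disks at $200 per unit, memory at $5
# per MByte; memory requirements per N are taken from the slide.
DISK, MEM_PER_MB = 200, 5

def config_cost(disks, mbytes):
    return disks * DISK + mbytes * MEM_PER_MB

print(config_cost(1, 60), config_cost(1, 60) / 40)    # N = 40: $500, $12.50/stream
print(config_cost(1, 160), config_cost(1, 160) / 50)  # N = 50: $1000, $20.00/stream
print(config_cost(2, 120), config_cost(2, 120) / 80)  # N = 80: $1000, $12.50/stream
```

Pushing a single disk from 40 to 50 users raises the per-stream cost by 60%, because the memory requirement grows super-linearly in N; buying a second disk instead keeps it flat.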
Is Conventional Wisdom Right? • Does Reducing Disk Latency lead to Better Disk Utilization? NO! • Does Reducing Disk Latency lead to Higher Throughput? NO! • Does Increasing Disk Utilization lead to Improved Cost Effectiveness? NO!
Outline • Server (Single Disk) • Revisiting Conventional Wisdom • Minimizing Cost • Minimizing Initial Latency • Server (Parallel Disks) • Balancing Workload • Minimizing Cost & Initial Latency • Client • Handling VBR • Supporting VCR-like Functions
Initial Latency • What is it? • The time from when a request arrives at the server to when the data is available in the server's main memory • Where is it important? • Interactive applications (e.g., video games) • Interactive features (e.g., fast-scan)
Fixed-Stretch • Space Out IOs [Figure: after each seek, a transfer fills the buffer up to S while the playback point drains it; IO slots a, b, c are evenly spaced in time]
Fixed-Stretch [Figures: streams a, b, c served in fixed, evenly spaced slots, fetching segments S1, S2, S3]
Our Contribution: BubbleUp • Fixed-Stretch Achieves Good Throughput • BubbleUp Modifies Fixed-Stretch to Minimize Initial Latency
Schedule Office Work • 8am: Host a Visitor • 9am: Do Email • 10am: Write Paper • 11am: Write Paper • Noon: Lunch
BubbleUp [Figure: segments S1, S2, S3 scheduled with the empty slot bubbled up to be next in time]
BubbleUp • Empty Slots are Always Next in Time • No additional Memory Required • Fill the Buffer up to the Segment Size • No additional Disk Bandwidth Required • The Disk Is Idle Otherwise
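The bullets above can be sketched in a few lines. This is a hedged reading of the slide, not the paper's implementation; the class and method names are ours. The key property is that an empty slot is always next in time, so a newly admitted stream's first IO lands in the very next slot.

```python
# Hedged sketch of the BubbleUp property (names are ours, not the
# paper's): empty slots stay next in time, so a new stream's first IO
# lands in the very next slot of the cycle.
from collections import deque

class BubbleUpSchedule:
    def __init__(self, n_slots):
        self.empty = n_slots   # unclaimed slots in the cycle
        self.active = deque()  # admitted streams in service order

    def admit(self, stream):
        if self.empty == 0:
            return False       # cycle is full
        self.empty -= 1
        self.active.appendleft(stream)  # served in the very next slot
        return True

    def serve_next(self):
        if not self.active:
            return None        # disk idles through an empty slot
        s = self.active.popleft()
        self.active.append(s)  # s comes around again next cycle
        return s
```

With two slots, admitting "a", serving it, then admitting "b" makes "b" the next stream served, since its slot is next in time; thereafter the two alternate each cycle.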
Evaluation [Plot: initial latency in seconds vs. N; BubbleUp remains below Sweep]
Fast-Scan [Figures: segment sequence S1, S2, S3, with S4 fetched ahead during fast-scan]
Data Placement Policies • Please refer to our publications
Chunk Allocation • Allocate Memory in Chunks • A Chunk = k × S • Replicate the Last Segment of Each Chunk at the Beginning of the Next Chunk • Example • Chunk 1: s1, s2, s3, s4, s5 • Chunk 2: s5, s6, s7, s8, s9
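The chunk layout above can be generated mechanically. A small sketch (the function name is ours) reproducing the slide's example:

```python
# Sketch of the chunk layout: chunks of k segments, with each chunk's
# last segment replicated at the start of the next chunk, as in the
# slide's example (function name is ours).
def chunks(segments, k):
    out, i = [], 0
    while True:
        out.append(segments[i:i + k])
        if i + k >= len(segments):
            break
        i += k - 1  # step back one so the boundary segment repeats
    return out

segs = [f"s{j}" for j in range(1, 10)]  # s1 .. s9
print(chunks(segs, 5))  # chunk 1: s1..s5, chunk 2: s5..s9
```

Stepping forward by k − 1 rather than k is what duplicates the boundary segment, so a stream crossing a chunk boundary never needs two chunks resident at once.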
Chunk Allocation • Largest-Fit First • Best Fit (Last Chunk)
Segment Placement [Figure: free disk regions holding 18, 4, 16, and 8 segments]
Largest-Fit First [Figure: segments s1 through s16 allocated to the largest regions first; regions of size 4 and 8 shown]
Best Fit [Figure: the last chunk (s16, s17, s18) placed into the best-fitting region, after s1 through s15]
Outline • Server (Single Disk) • Revisiting Conventional Wisdom • Minimizing Cost • Minimizing Initial Latency • Server (Parallel Disks) • Balancing Workload • Minimizing Cost & Initial Latency • Client • Handling VBR • Supporting VCR-like Functions
Unbalanced Workload [Figure: one disk holds a hot video while the other disks hold cold videos]
Balanced Workload [Figure: hot and cold videos spread so each disk carries a share of the hot load]
Per-Stream Memory Use (Using M Disks Independently) • S = (N × TR × DR × Tseek) / (TR − N × DR) • Total Users: M × N
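A sketch of the "M disks used independently" point, with assumed parameter values: per-stream peak memory keeps its single-disk form, while only the total user count scales with M.

```python
# Sketch with assumed values (bytes/s, bytes/s, s): M independent disks
# each serve N streams with the single-disk segment size, so per-stream
# peak memory is unchanged and total capacity scales to M * N users.
def segment_size(N, TR, DR, Tseek):
    return N * TR * DR * Tseek / (TR - N * DR)

TR, DR, Tseek, N = 10e6, 0.6e6, 0.015, 10
for M in (1, 2, 4):
    print(M * N, segment_size(N, TR, DR, Tseek))  # users grow, S does not
```

Because each disk runs its own independent cycle, adding disks multiplies capacity without changing per-stream memory; the workload-balancing problem on the preceding slides is what makes this independence hard to exploit for skewed (hot/cold) video popularity.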