This document explores the physical disk structure and performance metrics critical for file management systems, particularly in multimedia environments. It discusses key aspects such as seek time, latency, and transfer rates, alongside various file allocation methods like contiguous, linked, constrained, and striping approaches. The text further addresses the requirements for storing and retrieving multimedia files, scheduling algorithms, and admission control policies to ensure efficient data handling. Understanding these principles is vital for optimizing multimedia file systems and enhancing performance.
MM File Management • Karrie Karahlaios and Brian P. Bailey • Spring 2007
Physical Disk Structure • Components: sector, track, platter, cylinder, R/W head • Example: 16 heads x 1400 cylinders x 16 sectors/track x 512 bytes/sector = 183.5 MB
Measures of Performance • Seek time (ms) • time to move disk arm to a specific track • Latency (ms) • time for sector to rotate under disk arm • Transfer rate (Mbps) • data that can be read in one time unit
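The three measures combine into a rough service time for a single request. A minimal sketch, assuming the average rotational latency is half a revolution; the function name and the sample figures (8 ms seek, 7200 RPM, 320 Mbps, 512,000-bit request) are illustrative and not taken from the slides:

```python
def access_time_ms(seek_ms, rpm, transfer_mbps, request_bits):
    """Rough time to service one request: seek + rotational latency + transfer."""
    # Average rotational latency: half of one revolution, in milliseconds.
    latency_ms = 0.5 * 60_000 / rpm
    # Time to move the requested bits at the given rate (Mbps = 10^6 bits/s).
    transfer_ms = request_bits / (transfer_mbps * 1_000_000) * 1000
    return seek_ms + latency_ms + transfer_ms

# Illustrative values only.
print(access_time_ms(seek_ms=8.0, rpm=7200, transfer_mbps=320, request_bits=512_000))
```

For small requests the seek and latency terms dominate, which is why the placement and scheduling methods in later slides try to reduce them.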
Zoned Bit Recording • Utilize larger, outer tracks • early disks could not handle a varying number of sectors per track • so they reduced the density of outer sectors • Each zone (set of tracks) has its own number of sectors per track • outer zones can hold more data and support higher transfer rates
File System • Mapped onto physical disk structure • want to match user’s conceptual model • Collection of files and directories • file is logical storage unit • directories contain information about files (names, type, location, size, protection, etc.) • Basic operations • create, write, read, reposition, delete • sequential and random access
Allocation Methods • Contiguous • Linked • Constrained • Striping • … and many others
Contiguous • Occupy contiguous set of blocks • Strengths • minimizes seek time • supports sequential and random access • Weaknesses • suffers external fragmentation
Linked • Stored as a linked list of blocks • Strengths • eliminates external fragmentation • supports files of arbitrary length • Weaknesses • random access slow, overhead of pointers • susceptible to block errors
Constrained • Linked structure, but allocate next block based on “distance” from previous one • distance = predicted seek and latency • Strengths • improves sequential access • minimizes seek time • Weaknesses • increases algorithm complexity
Striping (RAID-0) • Stripe file across an array of N disks • divide file into stripes, divide each stripe into units, assign each unit to a different disk • Strengths • reduces disk access time by up to a factor of N • Weaknesses • susceptible to failure of any one disk • p(failure) ≈ N * p(any one disk failing), for small p
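The unit placement and the failure estimate can be sketched in a few lines. This assumes round-robin assignment of stripe units and independent disk failures; both function names are hypothetical, and the slides do not specify the mapping:

```python
def locate_unit(unit_index, num_disks):
    """Round-robin striping: return (disk, offset on that disk) for a stripe unit."""
    return unit_index % num_disks, unit_index // num_disks

def array_failure_probability(p_disk, num_disks):
    """Probability that at least one of N disks fails (assumes independent failures)."""
    return 1 - (1 - p_disk) ** num_disks

# Eight consecutive units spread over four disks.
print([locate_unit(u, 4) for u in range(8)])
# Exact value next to the N * p approximation from the slide.
print(array_failure_probability(0.01, 4), 4 * 0.01)
```

The exact probability is slightly below N * p, so the slide's expression is a close upper estimate when p is small.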
MM File System Requirements • Storing/retrieving multimedia files • large size; continuous periodic requests • Maintain high throughput • Support RT and non RT requests • Guarantee a sustained level of service
Meeting the Requirements • Methods of placing data on disk • Scheduling algorithms • Admission control policies • Maximize the share of disk time spent transferring data (rather than seeking or rotating)
Zipf's Law • Probability of occurrence of the kth most common word is proportional to 1/k • applies to many observable events • More generally, Pi = k / i^α, where • i is the rank of the ith most popular item; k is a constant; α (alpha) is close to 1
Apply to File Allocation • For multimedia, assume that • alpha=1 • Sum(Pi)=1 • Compute the probability of each multimedia file being accessed • use for layout and prefetching
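With alpha = 1 and the probabilities constrained to sum to 1, the constant k is fixed by normalization. A minimal sketch (the function name is hypothetical):

```python
def zipf_probabilities(num_files, alpha=1.0):
    """Access probability of each file, ordered from most to least popular."""
    weights = [1 / (i ** alpha) for i in range(1, num_files + 1)]
    k = 1 / sum(weights)  # normalizing constant so that Sum(Pi) = 1
    return [k * w for w in weights]

probs = zipf_probabilities(4)
print(probs)
```

The resulting ranking could then guide layout (for example, placing the most popular files where transfer rates are highest) and prefetching.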
Scheduling Algorithms • FCFS • SSTF • SCAN and C-SCAN • EDF • SCAN-EDF • Understand each algorithm and weigh advantages and disadvantages
FCFS • Serve requests based on incoming order • Inherently fair • Does not consider location of requests • can lead to high overhead
SSTF • Select request closest to current position • minimizes seek time/overhead • May cause starvation of some requests
SCAN and C-SCAN • Serves all requests in current direction • reverses when no more requests • serves middle tracks better than edges • C-SCAN scans across disk in cycles • more fair to the edge tracks
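The position-based policies can be compared by total head movement on one request queue. A sketch under simplifying assumptions: all requests are queued up front, SCAN and C-SCAN sweep toward higher track numbers first and turn around at the last pending request, and the track numbers are illustrative:

```python
def fcfs(head, requests):
    """Serve in arrival order."""
    return list(requests)

def sstf(head, requests):
    """Repeatedly serve the pending request closest to the current head position."""
    pending, order = list(requests), []
    while pending:
        nearest = min(pending, key=lambda track: abs(track - head))
        pending.remove(nearest)
        order.append(nearest)
        head = nearest
    return order

def scan(head, requests):
    """Sweep upward, then reverse and serve the remaining requests on the way back."""
    up = sorted(t for t in requests if t >= head)
    down = sorted((t for t in requests if t < head), reverse=True)
    return up + down

def c_scan(head, requests):
    """Sweep upward, jump back to the lowest pending track, and sweep upward again."""
    up = sorted(t for t in requests if t >= head)
    wrapped = sorted(t for t in requests if t < head)
    return up + wrapped

def head_movement(head, order):
    """Total number of tracks the head travels when serving requests in this order."""
    total = 0
    for track in order:
        total += abs(track - head)
        head = track
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]
for policy in (fcfs, sstf, scan, c_scan):
    print(policy.__name__, head_movement(53, policy(53, queue)))
```

Head movement is only one criterion; fairness and starvation, noted on the slides, do not show up in this number.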
EDF • Attach deadlines to each request • select request with earliest deadline • can have high overhead
SCAN-EDF • SCAN-EDF selects • earliest deadline, or if same deadline • select request closest to the disk’s center • Use EDF, but perturb deadlines • Di = Di + f(Ni); where f(Ni) = Ni / Nmax • Consider direction?
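The deadline perturbation can be sketched as a sort key. This assumes Ni is the track number of request i, Nmax is the highest track number, and deadlines are integers, so that the added term (at most 1) mainly orders requests that share a deadline; the slides do not define Ni, and the function name is hypothetical:

```python
def scan_edf_order(requests, max_track):
    """Order (deadline, track) requests by deadline perturbed with track position."""
    # Di + Ni / Nmax: equal deadlines end up sorted by track number.
    return sorted(requests, key=lambda r: r[0] + r[1] / max_track)

pending = [(2, 500), (1, 900), (1, 100), (2, 50)]
print(scan_edf_order(pending, max_track=1000))
```

This version always sweeps in one direction within a deadline group; taking the current scan direction into account, as the slide asks, would need a different f.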
Admission Control • Based on the admission control policy discussed in the paper: • C. Martin, P.S. Narayan, B. Ozden, R. Rastogi, and A. Silberschatz. The Fellini Multimedia Storage System, Journal of Digital Libraries, 1997.
Mathematical Setup • Client requests received in cycles of duration T • T is referred to as the common period of the system • assumes circular (C-SCAN) scan of the disk • consumption rate of each real-time client is ri • Data retrieved for each client in a period must be at least T*ri • Ensure that the file system in each period T can retrieve T*ri bits for each client
Setup (cont.) • Serve both real and non-real time clients • Serve real-time clients using fraction of T • Use to serve real-time clients • Use to serve non real-time clients • To retrieve T*ri bits for each client, the controller must ensure time to retrieve T*r1, …, T*rn bits does not exceed
Number of Disk Blocks • If b is block size, then maximum number of disk blocks to be retrieved for ri is
Latency • Retrieval of a disk block involves a seek to the track containing the block, a settle time delay, and a rotational delay • Let tseek, trot, and tsettle be the worst case times for each measure
Maximum Latency • Thus, the maximum latency for servicing clients r1, r2, …, rq is
Transfer Time • If the transfer rate from the innermost track of the disk is rdisk, then the time to transfer T*ri bits of data for request ri is T*ri / rdisk
Admission for Real-Time Clients • Thus, the total time to retrieve T*r1, …, T*rq bits for requests R1, …, Rq is the sum of the latency and transfer times • Admit new client, if on adding it, this equation is still satisfied
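From the definitions above, the real-time admission test can be sketched as follows. This is a conservative reconstruction under stated assumptions, not necessarily the exact form used in the Fellini paper: α (the fraction of T reserved for real-time clients) and n_i (blocks per client per period) are names introduced here, every block is charged a full worst-case seek, settle, and rotation, and block alignment effects are ignored.

```latex
n_i = \left\lceil \frac{T\, r_i}{b} \right\rceil
\qquad
\underbrace{\sum_{i=1}^{q} n_i \,\bigl(t_{seek} + t_{settle} + t_{rot}\bigr)}_{\text{latency}}
\;+\;
\underbrace{\sum_{i=1}^{q} \frac{T\, r_i}{r_{disk}}}_{\text{transfer}}
\;\le\; \alpha\, T
```

A C-SCAN sweep amortizes seeks across requests, so a tighter latency bound is possible; the inequality above is sufficient but pessimistic.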
Admission for Non RT Clients • Remainder of the period is for requests from non real-time clients • Let di be the data requested from Ci • Number of blocks is
Admission for Non RT Clients • For each request, latency plus transfer time is • Over all requests p, this becomes • Admit new non RT client, if on adding it, above equation is still satisfied
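The same reasoning gives a sketch of the non-real-time test. As before, this is a reconstruction under assumptions rather than the paper's exact formula: α is a name introduced here for the fraction of T reserved for real-time clients, m_i is the block count for request i, and each block is charged the full worst-case latency.

```latex
m_i = \left\lceil \frac{d_i}{b} \right\rceil
\qquad
\sum_{i=1}^{p} \left[ m_i \,\bigl(t_{seek} + t_{settle} + t_{rot}\bigr) + \frac{d_i}{r_{disk}} \right]
\;\le\; (1 - \alpha)\, T
```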
Example • Transfer rate (rdisk) = 100 KB / sec • Cycle time (T) = 10ms • Max latency = 1ms • Client A data rate (r1) = 45 KB/sec • Client B data rate (r2) = 40 KB/sec • Are the two real-time clients admissible? • If so, what proportion of the cycle time is needed to serve these clients?
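The example can be worked through numerically. The slide does not say whether the 1 ms maximum latency is paid once per client or once per cycle, nor what fraction of T is reserved for real-time clients, so this sketch computes both latency readings and assumes the whole cycle is available; the function names are hypothetical:

```python
def service_time_ms(cycle_ms, disk_rate, client_rates, latency_ms, per_client_latency=True):
    """Latency plus transfer time needed in one cycle (rates share one unit, e.g. KB/s)."""
    # Each client consumes cycle * r_i of data per cycle, read at disk_rate.
    transfer_ms = sum(cycle_ms * rate / disk_rate for rate in client_rates)
    # Assumption: the worst-case latency is paid once per client, or once per cycle.
    charges = len(client_rates) if per_client_latency else 1
    return charges * latency_ms + transfer_ms

def admissible(cycle_ms, needed_ms, rt_fraction=1.0):
    """Admit only if the needed time fits in the real-time share of the cycle."""
    return needed_ms <= rt_fraction * cycle_ms

rates = [45, 40]  # KB/s for clients A and B
per_client = service_time_ms(10, 100, rates, 1)
per_cycle = service_time_ms(10, 100, rates, 1, per_client_latency=False)
print(per_client, admissible(10, per_client))
print(per_cycle, admissible(10, per_cycle), per_cycle / 10)
```

Transfer alone takes 4.5 ms + 4.0 ms = 8.5 ms. If each client pays the 1 ms latency, 10.5 ms is needed and the pair does not fit in a 10 ms cycle; if the latency is paid once per cycle, 9.5 ms is needed, which is 95% of the cycle.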