This presentation critiques the paper "Failure Trends in a Large Disk Drive Population". It argues that the data, collected 5-10 years earlier when all the drives studied were Parallel ATA, may be less relevant now that SATA is the prevalent technology, and that aspects such as power cycles are not covered. It questions the assumption that drives fail independently, on which the annualized failure rates (AFRs) rest, and the coarseness of the failure definition and of the utilization and temperature measurements, which are weekly averages. It also raises concerns about statistical significance, the lack of a usable empirical model, the presentation of the figures, and the disagreement with manufacturers' findings.
Shiva Srivastava and Vaibhav Rastogi offend Failure Trends in a Large Disk Drive Population
Relevance • Work conducted 5-10 years ago • Disk drives have changed since • All hard disks were Parallel ATA • Prevalent technology today is SATA • Some aspects not covered • Power cycles
Assumptions • Drives fail independently of each other • Enables AFRs
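To make concrete what the independence assumption buys, here is a minimal sketch of an AFR calculation, assuming AFR is defined as failures per drive-year of observed service. The function name and the numbers are hypothetical and are not taken from the paper. A single rate like this only summarizes the fleet well if each drive's failure is independent of the others.

```python
def annualized_failure_rate(failures: int, drive_days: float) -> float:
    """Failures per drive-year, given the total observed drive-days.

    Assumes failures are independent events, so one pooled rate is meaningful.
    """
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365.0
    return failures / drive_years


# Illustrative numbers only: 1000 drives observed for one year, 30 failures.
afr = annualized_failure_rate(failures=30, drive_days=1000 * 365)
print(f"AFR = {afr:.1%}")
```

If failures are correlated (for example, drives in the same rack failing together), the pooled rate hides that clustering.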
Coarse measurements • Definition of failures • Too coarse-grained • When do the disks get replaced? • Utilization • Weekly averages • Do not have anything better • Same for temperature
Statistical correctness • What was the size of the fleet? • How does it compare with others? • Patterns like those in Figure 3 may be random • Difference between 2% and 4% is not much
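Whether a gap between 2% and 4% matters depends on the fleet size, which is why the question about it is relevant. The sketch below uses a standard pooled two-proportion z-test. It is an illustration under assumed group sizes, not the paper's method, and the function name is hypothetical.

```python
import math


def two_proportion_z(fail_a: int, n_a: int, fail_b: int, n_b: int):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a, p_b = fail_a / n_a, fail_b / n_b
    pooled = (fail_a + fail_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical group sizes: the same 2% vs 4% rates at two fleet sizes.
z_small, p_small = two_proportion_z(2, 100, 4, 100)
z_large, p_large = two_proportion_z(200, 10_000, 400, 10_000)
print(f"100 drives per group:    z = {z_small:.2f}, p = {p_small:.3f}")
print(f"10,000 drives per group: z = {z_large:.2f}, p = {p_large:.2e}")
```

With 100 drives per group the difference is indistinguishable from noise, while with 10,000 per group it is highly significant, so the fleet and subgroup sizes need to be reported alongside the rates.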
Usefulness • No good empirical model • Perhaps the measurements are too coarse • How am I supposed to use them? • Can they be different for different data centers, for different usage patterns?
Improper presentation • Where is the control in Figures 8 and 11? • Why do the confidence levels decrease so much in Figure 11? • Shows there is a lot of variance? Why?
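One possible reason for widening uncertainty, offered only as an assumption and not confirmed by the paper, is that some subgroups contain far fewer drives. The sketch below computes a Wilson score interval for a failure proportion and shows how the interval widens as the sample shrinks. The function name and the counts are hypothetical.

```python
import math


def wilson_interval(failures: int, n: int, z: float = 1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = failures / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Same observed 3% failure rate, very different sample sizes (made-up counts).
for failures, n in [(3, 100), (30, 1_000), (300, 10_000)]:
    lo, hi = wilson_interval(failures, n)
    print(f"n = {n:>6}: [{lo:.4f}, {hi:.4f}]  width = {hi - lo:.4f}")
```

If the per-group drive counts were shown next to the figure, a reader could tell whether the variance reflects small samples or genuinely erratic behavior.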
Comparison with others • Your findings do not corroborate manufacturers’ findings • Does it not go against you? • Others have used large numbers of disks • How do you compare with them?