Advanced Storage Solutions Using JASMine in Linux-Based Servers
This presentation describes the implementation and performance of JASMine, the distributed data management system used for mass storage at Jefferson Lab. It covers the architecture, including the tape and disk data movers, the cache servers, and read-only user access through NFS. Performance tests on the disk and tape systems compare RAID configurations and measure throughput. Problems with RAID controllers and disk failures are discussed, along with the workarounds adopted.
Presentation Transcript
Linux Servers with JASMine • K. Edwards, A. Kowalski, S. Philpott • HEPiX, May 21, 2003
JASMine • JLab’s Mass Storage System • Comparable to CASTOR, Enstore, … • Distributed Servers • Data Movers (tape and disk) • Two tape drives per Data Mover • 600+GB of staging disk space (3 9840B tapes) • Need fast access to/from disk to keep up with the 9940B tape drives and gigabit Ethernet • Cache Servers (disk) • 1-2TB file servers • JASMine manages the files • Copies from Data Movers via JASMine’s jcp protocol • User access via NFS (read-only)
Latest Data Mover • Operating System • RedHat 7.3, kernel 2.4.20-xfs • XFS File System • Hardware • Dual 2.2GHz Xeon CPUs • SuperMicro P4DPE Motherboard • 2 GBytes RAM • 2 LSI Logic MegaRAID 320-2 RAID controllers • 14 Seagate 73GB disk drives (hot swap) • QLogic 2342 dual-port Fibre Channel card ($$) • 2 9940B tape drives • Intel PRO/1000XT Server Ethernet Card • 3U Chassis with N+1 power supplies • $14,200.00 US (without the 2 9940B tape drives)
Disk Performance Tests • Used standard tests (Disktest, Bonnie++, IOZone) • 4GB file size used • Wanted to try the Fermi test, but lacked the time • Parameters tested • Write-through vs. write-back cache policy • Optimum disk read/write block sizes • RAID-5 vs. RAID-50 performance • RAID-5 array done in hardware (1 RAID card) • RAID-50 • 2 RAID-5 arrays done in hardware (1 per RAID card) • RAID-0 array across them done in software
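The block-size parameter sweep listed above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Disktest, Bonnie++, or IOZone tools actually used; the function names `write_throughput`, `read_throughput`, and `sweep` are hypothetical. Note that a read right after a write is usually served from the page cache unless the file is much larger than RAM, which is presumably one reason a 4GB test file is useful on a machine with 2 GBytes of RAM.

```python
import os
import tempfile
import time


def write_throughput(path, block_size, total_bytes):
    """Write about total_bytes to path in block_size chunks; return MB/s."""
    block = b"\0" * block_size
    written = 0
    start = time.perf_counter()
    with open(path, "wb", buffering=0) as f:
        while written < total_bytes:
            written += f.write(block)
        os.fsync(f.fileno())  # count the time needed to reach the disk
    elapsed = time.perf_counter() - start
    return written / elapsed / 1e6


def read_throughput(path, block_size):
    """Read path sequentially in block_size chunks; return MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total / elapsed / 1e6


def sweep(directory, block_sizes, total_bytes):
    """Return {block_size: (write MB/s, read MB/s)} for files in directory."""
    results = {}
    for block_size in block_sizes:
        fd, path = tempfile.mkstemp(dir=directory)
        os.close(fd)
        try:
            w = write_throughput(path, block_size, total_bytes)
            r = read_throughput(path, block_size)
            results[block_size] = (w, r)
        finally:
            os.remove(path)  # do not leave test files on the array
    return results
```

Calling `sweep("/path/to/array", [4096, 32768, 262144], 4 * 10**9)` on the file system under test (the path here is a placeholder) gives one write and one read figure per block size, which is enough to see where throughput levels off.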
Issues/Problems Discovered • LSI Logic MegaRAID 320-2 RAID controllers • Vendor support only if you use standard RedHat kernels • These do not have XFS support • RAID monitor software from LSI Logic • Causes SCSI bus resets • Occurs every 20 seconds (not changeable) • Throughput drops to 4-5MB/sec while this happens, as it resets the bus and flushes the cache • Workaround • Turn off RAID monitoring • With monitoring off, there is no real way to monitor the status of the disks and RAID hardware • Disk failures go unnoticed • Looking into Adaptec 2200S RAID cards
Disk Test Results • Disk Results • Use write-back cache on RAID card • 32K block sizes are optimum • RAID-50 was fastest (no real surprise) • Idle system (1 reader or 1 writer) • 210MB/sec disk read throughput • 140MB/sec write throughput • Busy system (8 readers and 8 writers) • 40MB/sec aggregate read throughput • 110MB/sec aggregate write throughput
Tape Performance Testing • Used JASMine test program (Java) • Double-buffered • Threads simultaneously reading from and writing to the buffer • Calculates/verifies file checksum • Moves file between disk and tape • Used real raw data from the experiments • 2GB files • Hall A and Hall C data in CODA format • Does not compress • CLAS data in BOS format • Does compress
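The double-buffered design described above can be illustrated with a short sketch. It is written in Python for brevity, whereas the real test program is Java, and it is not JASMine code. The function name `double_buffered_copy`, the 256KB block size, and the use of CRC32 as the checksum are all assumptions made for the example; the slides do not say which checksum JASMine uses.

```python
import queue
import threading
import zlib


def double_buffered_copy(src, dst, block_size=256 * 1024):
    """Copy src to dst while a second thread reads ahead.

    src and dst are binary file-like objects. Returns (bytes_copied, crc32).
    """
    slots = queue.Queue(maxsize=2)  # two buffers: one filling, one draining
    read_errors = []

    def reader():
        try:
            while True:
                block = src.read(block_size)
                slots.put(block)  # blocks when both buffers are full
                if not block:     # empty block marks end of file
                    return
        except Exception as exc:
            read_errors.append(exc)
            slots.put(b"")        # unblock the writer

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    crc = 0
    total = 0
    while True:
        block = slots.get()
        if not block:
            break
        crc = zlib.crc32(block, crc)  # checksum computed on the fly (CRC32 assumed)
        dst.write(block)
        total += len(block)

    thread.join()
    if read_errors:
        raise read_errors[0]
    return total, crc
```

Because the reader can fill one buffer while the writer drains the other, the slower device sets the pace and the checksum work overlaps with I/O, which is consistent with the extra CPU use noted in the tape results.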
Tape Test Results • No issues or problems • QLogic 2342 dual-port Fibre Channel card works well with Linux • Some extra CPU required for checksums • Hyper-Threading really helps the performance here • 9940B results as expected • Direction does not matter (read/write) • 30MB/sec if file is not compressible • Up to 45MB/sec if file is compressible • Depends on the compressibility of the file • Two simultaneous copies • 30MB/sec each if file is not compressible (no change) • Expected 37.5MB/sec each for a compressible file read from tape; observed 30MB/sec each
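To put these rates in terms of the 2GB test files, the arithmetic can be sketched as follows. The helper names are hypothetical, and the example assumes decimal units (2GB = 2000MB), which the slides do not specify.

```python
def transfer_seconds(size_mb, rate_mb_per_s):
    """Time to move a file of size_mb at a sustained rate."""
    return size_mb / rate_mb_per_s


def shortfall(expected, observed):
    """Fraction by which the observed rate falls short of the expected rate."""
    return (expected - observed) / expected


FILE_MB = 2000  # one 2GB test file, decimal units assumed

for rate in (30, 45):
    print(f"{rate}MB/sec: {transfer_seconds(FILE_MB, rate):.1f} s per file")

# Two simultaneous compressible reads: 37.5MB/sec expected, 30MB/sec observed
print(f"shortfall: {shortfall(37.5, 30):.0%}")
```

Under these assumptions a file takes roughly 67 seconds at 30MB/sec and 44 seconds at 45MB/sec, and the observed rate for two simultaneous compressible reads is 20% below the expected one.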
Latest Cache Server • Operating System • RedHat 7.3, kernel 2.4.18-xfs • XFS File System • Hardware • Dual 2.0GHz Xeon CPUs • SuperMicro P4DPE Motherboard • 2 GBytes RAM • 2 3ware 7850 IDE/ATA RAID controllers (RAID-5) • 16 Hot Swap Disk Drives • Maxtor 160GB ATA133 • Western Digital 180GB ATA100 • Intel PRO/1000XT Server Ethernet Card • 4U Chassis with N+1 power supplies • $9,000.00 US
Issues/Problems Discovered • Western Digital 180GB/200GB ATA100 Drives • Drives go offline/idle (WD feature) • 3ware card thinks the drive died • Solution • Get Disk Firmware Version 63.13F70 from Western Digital • Use Maxtor 160GB ATA133 drives
Experience with IDE/ATA Drives in General • High failure rates during the first two months of use • 1-3 per week • Need a longer burn-in period • Failure rates decrease after two months of use • 1 every 6-8 weeks • Marginal drives gone? • They still fail more often than SCSI disks • Then again, we lost 2 SCSI disks today • Number of disks by type used in servers • 191 SCSI • 320 ATA