
Storage Performance 2013


Presentation Transcript


  1. Storage Performance 2013 Joe Chang www.qdpma.com

  2. About Joe • SQL Server consultant since 1999 • Query Optimizer execution plan cost formulas (2002) • True cost structure of SQL plan operations (2003?) • Database with distribution statistics only, no data 2004 • Decoding statblob/stats_stream • writing your own statistics • Disk IO cost structure • Tools for system monitoring, execution plan analysis See ExecStats on www.qdpma.com

  3. Storage Performance Chain • All elements must be correct • No weak links • Perfect on 6 out of 7 elements and 1 not correct = bad IO performance. Chain: SQL Server Engine → SQL Server Extent → SQL Server File → Direct Attach/SAN → SAS/FC → Pool/RAID Group → SAS HDD/SSD

  4. Storage Performance Overview • System Architecture • PCI-E, SAS, HBA/RAID controllers • SSD, NAND, Flash Controllers, Standards • Form Factors, Endurance, ONFI, Interfaces • SLC, MLC Performance • Storage system architecture • Direct-attach, SAN • Database • SQL Server Files, FileGroup

  5. Sandy Bridge EN & EP
  EN: Xeon E5-2400, Socket B2, 1356 pins • 1 QPI 8 GT/s, 3 DDR3 memory channels, 24 PCI-E 3.0 8GT/s lanes, DMI2 (x4 @ 5GT/s)
  E5-2470 8 core 2.3GHz 20M 8.0GT/s (3.1)
  E5-2440 6 core 2.4GHz 15M 7.2GT/s (2.9)
  E5-2407 4c/4t 2.2GHz 10M 6.4GT/s (n/a)
  EP: Xeon E5-2600, Socket R, 2011 pins • 2 QPI, 4 DDR3, 40 PCI-E 3.0 8GT/s lanes, DMI2
  Model, cores, clock, LLC, QPI, (Turbo):
  E5-2690 8 core 2.9GHz 20M 8.0GT/s (3.8)*
  E5-2680 8 core 2.7GHz 20M 8.0GT/s (3.5)
  E5-2670 8 core 2.6GHz 20M 8.0GT/s (3.3)
  E5-2667 6 core 2.9GHz 15M 8.0GT/s (3.5)*
  E5-2665 8 core 2.4GHz 20M 8.0GT/s (3.1)
  E5-2660 8 core 2.2GHz 20M 8.0GT/s (3.0)
  E5-2650 8 core 2.0GHz 20M 8.0GT/s (2.8)
  E5-2643 4 core 3.3GHz 10M 8.0GT/s (3.5)*
  E5-2640 6 core 2.5GHz 15M 7.2GT/s (3.0)
  [CPU block diagrams omitted: cores + LLC, memory interfaces, QPI links, PCIe x8 ports, DMI2 to PCH]
  2-socket EP: 80 PCI-E gen 3 lanes + 8 gen 2 possible. Dell T620: 4 x16, 2 x8, 1 x4. Dell R720: 1 x16, 6 x8. HP DL380 G8p: 2 x16, 3 x8, 1 x4. Supermicro X9DRX+F: 10 x8, 1 x4 g2. Disable cores in BIOS/UEFI?

  6. Xeon E5-4600
  Xeon E5-4600, Socket R, 2011 pins • 2 QPI, 4 DDR3, 40 PCI-E 3.0 8GT/s lanes, DMI2
  Model, cores, clock, LLC, QPI, (Turbo):
  E5-4650 8 core 2.70GHz 20M 8.0GT/s (3.3)*
  E5-4640 8 core 2.40GHz 20M 8.0GT/s (2.8)
  E5-4620 8 core 2.20GHz 16M 7.2GT/s (2.6)
  E5-4617 6c/6t 2.90GHz 15M 7.2GT/s (3.4)
  E5-4610 6 core 2.40GHz 15M 7.2GT/s (2.9)
  E5-4607 6 core 2.20GHz 12M 6.4GT/s (n/a)
  E5-4603 4 core 2.00GHz 10M 6.4GT/s (n/a)
  [4-socket block diagram omitted: 4 CPUs with cores + LLC, QPI links, PCIe ports, DMI2]
  High-frequency 6-core gives up HT; no high-frequency 4-core. Dell R820: 2 x16, 4 x8, 1 internal. HP DL560 G8p: 2 x16, 3 x8, 1 x4. Supermicro X9QR: 7 x16, 1 x8. 160 PCI-E gen 3 lanes + 16 gen 2 possible.

  7. 2 PCI-E, SAS & RAID Controllers

  8. PCI-E gen 1, 2 & 3 • PCIe 1.0 & 2.0 encoding scheme 8b/10b • PCIe 3.0 encoding scheme 128b/130b • Simultaneous bi-directional transfer • Protocol Overhead • Sequence/CRC, Header • 22 bytes, (20%?) Adaptec Series 7: 6.6GB/s, 450K IOPS

  9. PCI-E Packet Net realizable bandwidth appears to be 20% less (1.6GB/s of 2.0GB/s)
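  A rough back-of-the-envelope check of these numbers (a sketch in Python; the flat 20% packet overhead is the figure quoted on this slide, not an exact TLP calculation, and the function name is just illustrative):

```python
# Rough PCIe net-bandwidth estimate from lane count, transfer rate,
# encoding scheme and the ~20% packet overhead cited on the slide.

def pcie_net_GBps(lanes, gt_per_s, enc_num, enc_den, packet_overhead=0.20):
    """Per-direction net bandwidth in GB/s.
    gt_per_s: raw transfer rate per lane (GT/s)
    enc_num/enc_den: encoding, e.g. 8/10 for gen1/2, 128/130 for gen3
    packet_overhead: fraction lost to headers, sequence numbers, CRC
    """
    raw_GBps = lanes * gt_per_s * (enc_num / enc_den) / 8  # bits -> bytes
    return raw_GBps * (1 - packet_overhead)

print(pcie_net_GBps(4, 5.0, 8, 10))     # PCIe 2.0 x4: 2.0 GB/s raw -> ~1.6 GB/s net
print(pcie_net_GBps(8, 5.0, 8, 10))     # PCIe 2.0 x8: ~3.2 GB/s net
print(pcie_net_GBps(8, 8.0, 128, 130))  # PCIe 3.0 x8: ~6.3 GB/s net
```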

  10. PCIe Gen2 & SAS/SATA 6Gbps • SATA 6Gbps – single lane, net BW 560MB/s • SAS 6Gbps, x4 lanes, net BW 2.2GB/s • Dual-port, SAS protocol only • Not supported by SATA [Diagram: HBA with PCIe g2 x8 (3.2GB/s) upstream and two SAS x4 6G (2.2GB/s) downstream ports, paths A and B] Some bandwidth mismatch is OK, especially on downstream side
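  To see why downstream mismatch is acceptable, a small sketch comparing host-side and device-side bandwidth of an HBA (net figures taken from this slide and the PCIe estimate above; the helper name is illustrative):

```python
# Which side of an HBA is the bottleneck: the PCIe link to the host,
# or the sum of its SAS x4 ports?

SAS_X4_6G  = 2.2   # GB/s net per x4 6Gbps port (from the slide)
PCIE_G2_X8 = 3.2   # GB/s net
PCIE_G3_X8 = 6.4   # GB/s net

def hba_bottleneck(upstream_GBps, ports, per_port_GBps):
    downstream = ports * per_port_GBps
    return min(upstream_GBps, downstream), downstream

print(hba_bottleneck(PCIE_G2_X8, 2, SAS_X4_6G))  # (3.2, 4.4): host link limits, extra SAS BW is harmless
print(hba_bottleneck(PCIE_G3_X8, 2, SAS_X4_6G))  # (4.4, 4.4): with gen3, the SAS side becomes the limit
```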

  11. PCIe 3 & SAS • 12Gbps – coming soon? Slowly? • Infrastructure will take more time [Diagrams: PCIe g3 x8 HBA with SAS x4 6G ports to SAS expanders; PCIe g3 x8 HBA with SAS x4 12G ports to expanders fanning out to SAS x4 6Gb] A PCIe 3.0 x8 HBA can drive 2 SAS x4 12Gbps ports, or 4 SAS x4 6Gbps ports if the HBA can support 6GB/s

  12. PCIe Gen3 & SAS 6Gbps

  13. LSI 12Gbps SAS 3008

  14. PCIe RAID Controllers? • 2 x4 SAS 6Gbps ports (2.2GB/s per x4 port) • 1st generation PCIe 2 – 2.8GB/s? • Adaptec: PCIe g3 can do 4GB/s • 3 x4 SAS 6Gbps is a bandwidth match for PCIe 3.0 x8 • 6 x4 SAS 6Gbps – Adaptec Series 7, PMC • 1 chip: x8 PCIe g3 and 24 SAS 6Gbps lanes • Because they could [Diagram: HBA with PCIe g3 x8 upstream and six SAS x4 6G ports]

  15. 2 SSD, NAND, Flash Controllers

  16. SSD Evolution • HDD replacement • using existing HDD infrastructure • PCI-E card form factor lacks expansion flexibility • Storage system designed around SSD • PCI-E interface with HDD-like form factor? • Storage enclosure designed for SSD • Rethink computer system memory & storage • Re-do the software stack too!

  17. SFF-8639 & Express Bay SCSI Express – storage over PCI-E, NVMe

  18. New Form Factors – NGFF Enterprise 10K/15K HDD: 15mm. An SSD storage enclosure could be 1U with 75 x 5mm devices?

  19. SATA Express Card (NGFF) Crucial mSATA M2

  20. SSD – NAND Flash • NAND • SLC, MLC regular and high-endurance • eMLC could mean endurance or embedded – the two differ • Controller interfaces NAND to SATA or PCI-E • Form Factor • SATA/SAS interface in 2.5in HDD or new form factor • PCI-E interface and FF, or HDD-like FF • Complete SSD storage system

  21. NAND Endurance Intel – High Endurance Technology MLC

  22. NAND Endurance – Write Performance. Endurance cost structure: MLC = 1, MLC EE = 1.3, SLC = 3 (process dependent: 34nm, 25nm, 20nm). [Chart: relative write performance of SLC, MLC-e, MLC]

  23. NAND P/E - Micron 34 or 25nm MLC NAND is probably good Database can support cost structure

  24. NAND P/E - IBM 34 or 25nm MLC NAND is probably good Database can support cost structure

  25. Write Endurance Vendors commonly cite a single spec for a range of models, 120, 240, 480GB. Should vary with raw capacity? Depends on over-provisioning? 3 year life is OK for MLC cost structure, maybe even 2 year. MLC 20TB / life = 10GB/day for 2000 days (5 years+), 20GB/day – 3 years. Vendors now cite 72TB write endurance for 120-480GB capacities?
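  The lifetime arithmetic behind those figures, as a minimal sketch (TBW ratings and daily write rates are the ones quoted on the slide; the function name is illustrative):

```python
# Translate a vendor total-bytes-written (TBW) rating into drive lifetime
# at a given daily write volume.

def lifetime_days(tbw_TB, writes_per_day_GB):
    return tbw_TB * 1000 / writes_per_day_GB

print(lifetime_days(20, 10))   # 20 TB rating at 10 GB/day -> 2000 days (5+ years)
print(lifetime_days(20, 20))   # 20 TB rating at 20 GB/day -> 1000 days (~3 years)
print(lifetime_days(72, 40))   # 72 TB rating at 40 GB/day -> 1800 days (~5 years)
```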

  26. NAND • SLC – fast writes, high endurance • eMLC – slow writes, medium endurance • MLC – medium writes, low endurance • MLC cost structure of $1/GB @ 25nm • eMLC 1.4X, SLC 2X?

  27. ONFI Open NAND Flash Interface organization • 1.0 2006 – 50MB/s • 2.0 2008 – 133MB/s • 2.1 2009 – 166 & 200MB/s • 3.0 2011 – 400MB/s • Micron has 200 & 333MHz products ONFI 1.0 – 6 channels to support 3Gbps SATA, 260MB/s ONFI 2.0 – 4+ channels to support 6Gbps SATA, 560MB/s

  28. NAND write performance MLC 85MB/s per 4-die channel (128GB) 340MB/s over 4 channels (512GB)?

  29. Controller Interface PCIe vs. SATA Some bandwidth mismatch/overkill is OK. ONFI 2 – 8 channels at 133MHz to SATA 6Gbps (560 MB/s) is a good match. But ONFI 3.0 overwhelms SATA 6Gbps? 6-8 channels at 400MB/s to match 2.2GB/s x4 SAS? 16+ channels at 400MB/s to match 6.4GB/s x8 PCIe 3. CPU access efficiency and scaling: Intel & NVM Express [Diagram: NAND packages on multiple channels behind a controller with a PCIe or SATA host interface]
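  The channel-matching arithmetic, sketched below (per-channel rates are the approximate ONFI figures from these slides; the helper is illustrative, not a controller design rule):

```python
# How many NAND channels are needed to saturate a given host interface?
import math

def channels_needed(interface_GBps, channel_MBps):
    return math.ceil(interface_GBps * 1000 / channel_MBps)

print(channels_needed(0.56, 133))  # SATA 6Gbps with ONFI 2 channels -> 5 (8 is a comfortable match)
print(channels_needed(2.2, 400))   # SAS x4 6Gbps with ONFI 3 channels -> 6
print(channels_needed(6.4, 400))   # PCIe 3.0 x8 -> 16
```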

  30. Controller Interface PCIe vs. SATA [Diagram: SATA controller and PCIe controller, each with DRAM and multiple NAND channels] PCIe NAND controller vendors: IDT – 32 channels, x8 Gen3, NVMe; Micron – 32 channels, x8 Gen2; Fusion-IO – 3x4?, x8 Gen2?

  31. SATA & PCI-E SSD Capacities 64 Gbit MLC NAND die, 150mm², 25nm. 2 x 32 Gbit at 34nm; 1 x 64 Gbit at 25nm; 1 x 64 Gbit at 29nm. 8 x 64 Gbit die in 1 package = 64GB. SATA controller – 8 channels, 8 packages x 64GB = 512GB. PCI-E controller – 32 channels x 64GB = 2TB
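  The capacity arithmetic from die to drive, as a small sketch (one package per channel assumed, as on the slide; the function name is illustrative):

```python
# SSD raw capacity from die density, dies per package, and channel count.

def ssd_capacity_GB(die_gbit, dies_per_package, channels, packages_per_channel=1):
    package_GB = die_gbit * dies_per_package / 8          # Gbit -> GB
    return package_GB * channels * packages_per_channel

print(ssd_capacity_GB(64, 8, 8))    # SATA controller, 8 channels  -> 512 GB
print(ssd_capacity_GB(64, 8, 32))   # PCI-E controller, 32 channels -> 2048 GB (2TB)
```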

  32. PCI-E vs. SATA/SAS • SATA/SAS controllers have 8 NAND channels • No economic benefit in fewer channels? • 8 ch. Good match for 50MB/s NAND to SATA 3G • 3Gbps – approx 280MB/s realizable BW • 8 ch also good match for 100MB/s to SATA 6G • 6Gbps – 560MB/s realizable BW • NAND is now at 200 & 333MB/s • PCI-E – 32 channels practical – 1500 pins • 333MHz good match to gen 3 x8 – 6.4GB/s BW

  33. Crucial/Micron P400m & P400e; P410m SAS specs slightly different. EE MLC: higher endurance, write perf not lower than MLC? Preliminary – need to update

  34. Crucial m4 & m500 Preliminary – need to update

  35. Micron & Intel SSD Pricing (2013-02) Need corrected P400m pricing P400m raw capacities are 168, 336 and 672GB (pricing retracted) Intel SSD DC S3700 pricing $235, 470, 940 and 1880 (800GB) respectively

  36. 4K Write K IOPS Need corrected P400m pricing P400m raw capacities are 168, 336 and 672GB (pricing retracted) Intel SSD DC S3700 pricing $235, 470, 940 and 1880 (800GB) respectively

  37. SSD Summary • MLC is possible with careful write strategy • Partitioning to minimize index rebuilds • Avoid full database restore to SSD • Endurance (HET) MLC – write perf? • Standard DB practices work • But avoid frequent index defrags? • SLC – only extreme write intensive? • Lower volume product – higher cost

  38. 3 Direct Attach Storage

  39. Full IO Bandwidth • 10 PCIe g3 x8 slots possible – Supermicro only • HP, Dell systems have 5-7 x8+ slots + 1 x4? • 4GB/s per slot with 2 x4 SAS, 6GB/s with 4 x4 • Mixed SSD + HDD – reduce wear on MLC [Diagram: 2-socket system, 192 GB per socket, QPI, 10 PCIe x8 slots with RAID controllers to SSDs and HDDs, plus InfiniBand, 10GbE and misc] Misc devices on 2 x4 PCIe g2: internal boot disks, 1GbE or 10GbE, graphics

  40. System Storage Strategy Dell & HP only have 5-7 slots; 4 controllers @ 4GB/s each is probably good enough? Few practical products can use PCIe G3 x16 slots. [Diagram: 2-socket system, 192 GB per socket, 4 RAID controllers in PCIe x8 slots each with SSD + HDD, plus 10GbE and IB] • Capable of 16GB/s with initial capacity • 4 HBA, 4-6GB/s each • with allowance for capacity growth • And mixed SSD + HDD
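  A minimal sketch of the aggregate-bandwidth sizing behind this strategy (the per-slot cap reuses the ~6.4 GB/s PCIe 3.0 x8 net estimate from earlier; figures and the helper name are illustrative):

```python
# Aggregate sequential bandwidth from several HBAs, capped per slot by PCIe.

PCIE_G3_X8_NET = 6.4  # GB/s, from the earlier PCIe estimate

def aggregate_GBps(hba_count, per_hba_GBps):
    return hba_count * min(per_hba_GBps, PCIE_G3_X8_NET)

print(aggregate_GBps(4, 4.0))  # 16 GB/s: 4 controllers with 2 x4 SAS ports each
print(aggregate_GBps(4, 6.0))  # 24 GB/s: 4 controllers with 4 x4 SAS ports each
```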

  41. Clustered SAS Storage Dell MD3220 supports clustering, up to 4 nodes without an external switch (extra nodes not shown) [Diagram: Node 1 and Node 2, 192 GB per socket, each with 4 HBAs connected to 4 MD3220 enclosures; each enclosure has dual controllers with 2GB cache, IOC, PCIe switch, SAS expander, SAS host ports, SSDs and HDDs]

  42. Alternate SSD/HDD Strategy • Primary system: all SSD for data & temp, logs may be HDD • Secondary system: HDD for backup and restore testing [Diagram: primary 2-socket system with RAID/HBA controllers to SSDs, linked over IB to a backup system with HBAs to HDDs]

  43. System Storage Mixed SSD + HDD Each RAID group/volume should not exceed the 2GB/s BW of x4 SAS; 2-4 volumes per x8 PCIe G3 slot. SATA SSD read 350-500MB/s, write 140MB/s+; 8 per volume allows for some overkill; 16 SSD per RAID controller; 64 SATA/SAS SSDs to deliver 16-24GB/s. The 4-HDD-per-volume rule does not apply. HDD for local database backup, restore tests, and DW flat files. SSD & HDD on shared channel – simultaneous bi-directional IO [Diagram: 4 HBAs, each with multiple 8-SSD volumes plus HDD volumes]
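  A quick sketch of the sizing on this slide: how many SSDs reach the 16-24GB/s target, and why 8 SSDs per x4 SAS volume is deliberate overkill (per-SSD rates are the slide's figures; function names are illustrative):

```python
import math

SAS_X4_LIMIT = 2.2   # GB/s net per x4 6Gbps port

def ssds_for_target(target_GBps, per_ssd_MBps):
    return math.ceil(target_GBps * 1000 / per_ssd_MBps)

def volume_GBps(ssds, per_ssd_MBps):
    # a volume can never deliver more than its x4 SAS port
    return min(ssds * per_ssd_MBps / 1000, SAS_X4_LIMIT)

print(ssds_for_target(16, 350))  # ~46 SSDs at 350 MB/s; 64 gives headroom
print(ssds_for_target(24, 400))  # 60 SSDs at 400 MB/s
print(volume_GBps(8, 400))       # 8 x 400 MB/s = 3.2 GB/s raw, capped at 2.2 GB/s by x4 SAS
```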

  44. SSD/HDD System Strategy • MLC is possible with careful write strategy • Partitioning to minimize index rebuilds • Avoid full database restore to SSD • Hybrid SSD + HDD system, full-duplex signalling • Endurance (HET) MLC – write perf? • Standard DB practices work, avoid index defrags • SLC – only extreme write intensive? • Lower volume product – higher cost • HDD – for restore testing

  45. SAS Expander Disk Enclosure expansion ports not shown 2 x4 to hosts 1 x4 for expansion 24 x1 for disks

  46. Storage Infrastructure – designed for HDD 15mm • 2 SAS expanders for dual-port support • 1 x4 upstream (to host), 1 x4 downstream (expansion) • 24 x1 for bays 2U

  47. Mixed HDD + SSD Enclosure 2U, 15mm bays • Current: 24 x 15mm = 360mm + spacing • Proposed: 16 x 15mm = 240mm + 16 x 7mm = 120

  48. Enclosure 24x15mm and proposed [Diagram: two hosts, 384 GB each, PCIe x8 HBAs, SAS x4 6Gbps (2.2GB/s) and SAS x4 12Gbps (4GB/s) links to SAS expanders; 2 RAID groups for SSD, 2 for HDD; 1 SSD volume on path A, 1 SSD volume on path B] Current 2U enclosure: 24 x 15mm bays – HDD or SSD; 2 SAS expanders – 32 lanes each: 4 lanes upstream to host, 4 lanes downstream for expansion, 24 lanes for bays. New SAS 12Gbps: 16 x 15mm + 16 x 7mm bays; 2 SAS expanders – 40 lanes each: 4 lanes upstream to host, 4 lanes downstream for expansion, 32 lanes for bays

  49. Enclosure 24x15mm and proposed [Diagram: same two-host configuration as slide 48, without the link-speed annotations] Current 2U enclosure: 24 x 15mm bays – HDD or SSD; 2 SAS expanders – 32 lanes each (4 upstream, 4 downstream, 24 for bays). New SAS 12Gbps: 16 x 15mm + 16 x 7mm bays; 2 SAS expanders – 40 lanes each (4 upstream, 4 downstream, 32 for bays)

  50. Alternative Expansion [Diagram: host with PCIe x8 HBA connecting via SAS x4 links to Enclosures 1-4, each with dual expanders] Each SAS expander – 40 lanes: 8 lanes upstream to host with no expansion, or 4 lanes upstream and 4 lanes downstream for expansion, 32 lanes for bays
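  A minimal check of the expander lane budgets used on these enclosure slides (lane counts are the slide's figures; the helper is illustrative):

```python
# Verify that upstream + downstream + bay lanes fit within an expander.

def lanes_ok(total, upstream, downstream, bays):
    used = upstream + downstream + bays
    return used <= total, used

print(lanes_ok(32, 4, 4, 24))  # current 2U enclosure expander:   (True, 32)
print(lanes_ok(40, 4, 4, 32))  # proposed 12Gbps enclosure:       (True, 40)
print(lanes_ok(40, 8, 0, 32))  # alternative, 8 lanes to host, no expansion: (True, 40)
```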
