Design and Evaluation of Architectures for Commercial Applications



  1. Design and Evaluation of Architectures for Commercial Applications Part I: benchmarks Luiz André Barroso

  2. Why should architects learn about commercial applications? • Because they are very different from typical benchmarks • Because they are demanding on many interesting architectural features • Because they are driving the sales of mid-range and high-end systems UPC, February 1999

  3. Shortcomings of popular benchmarks • SPEC • uniprocessor-oriented • small cache footprints • exacerbates impact of CPU core issues • SPLASH • small cache footprints • extremely optimized sharing • STREAMS • no real sharing/communication • mainly bandwidth-oriented UPC, February 1999

  4. SPLASH vs. Online Transaction Processing (OLTP) Compared with an OLTP app, a typical SPLASH app has: > 3x the issue rate, ~26x fewer cycles spent in memory barriers, 1/4 the TLB miss ratio, < 1/2 the fraction of cache-to-cache transfers, a ~22x smaller instruction cache miss ratio, and ~1/2 the L2$ miss ratio. UPC, February 1999

  5. But the real reason we care? $$$! • Server market: • Total: > $50 billion • Numeric/scientific computing: < $2 billion • Remaining $48 billion? • OLTP • DSS • Internet/Web • Trend is for numerical/scientific to remain a niche UPC, February 1999

  6. Relevance of server vs. PC market • High profit margins • Performance is a differentiating factor • If you sell the server you will probably sell: • the client • the storage • the networking infrastructure • the middleware • the service • ... UPC, February 1999

  7. Need for speed in the commercial market • Applications pushing the envelope • Enterprise resource planning (ERP) • Electronic commerce • Data mining/warehousing • ADSL servers • Specialized solutions • Intel splitting Pentium line into 3-tiers • Oracle’s raw iron initiative • Network Appliances’ machines UPC, February 1999

  8. Seminar disclaimer • Hardware centric approach: • target is to build better machines, not better software • focus on fundamental behavior, not on software “features” • Stick to general purpose paradigm • Emphasis on CPU+memory system issues • Lots of things missing: • object-relational and object-oriented databases • public domain/academic database engines • many others UPC, February 1999

  9. Overview • Day I: Introduction and workloads • Background on commercial applications • Software structure of a commercial RDBMS • Standard benchmarks • TPC-B • TPC-C • TPC-D • TPC-W • Cost and pricing trends • Scaling down TPC benchmarks UPC, February 1999

  10. Overview (2) • Day II: Evaluation methods/tools • Introduction • Software instrumentation (ATOM) • Hardware measurement & profiling • IPROBE • DCPI • ProfileMe • Tracing & trace-driven simulation • User-level simulators • Complete machine simulators (SimOS) UPC, February 1999

  11. Overview (3) • Day III: Architecture studies • Memory system characterization • Out-of-order processors • Simultaneous multithreading • Final remarks UPC, February 1999

  12. Background on commercial applications • Database applications: • Online Transaction Processing (OLTP) • massive number of short queries • read/update indexed tables • canonical example: banking system • Decision Support Systems (DSS) • smaller number of complex queries • mostly read-only over large (non-indexed) tables • canonical example: business analysis UPC, February 1999

  13. Background (2) • Web/Internet applications • Web server • many requests for small/medium files • Proxy • many short-lived connection requests • content caching and coherence • Web search index • DSS with a Web front-end • E-commerce site • OLTP with a Web front-end UPC, February 1999

  14. Background (3) • Common characteristics • Large amounts of data manipulation • Interactive response times required • Highly multithreaded by design • suitable for large multiprocessors • Significant I/O requirements • Extensive/complex interactions with the operating system • Require robustness and resiliency to failures UPC, February 1999

  15. Database performance bottlenecks • I/O-bound until recently (Thakkar, ISCA’90) • Many improvements since then • multithreading of DB engine • I/O prefetching • VLM (very large memory) database caching • more efficient OS interactions • RAIDs • non-volatile DRAM (NVDRAM) • Today’s bottlenecks: • Memory system • Processor architecture UPC, February 1999

  16. Structure of a database workload [Diagram: clients formulate and issue DB queries → application server (optional; simple logic checks) → database server (executes queries)] UPC, February 1999

  17. Who is who in the database market? • DB engine: • Oracle is dominant • other players: Microsoft, Sybase, Informix • Database applications: • SAP is dominant • other players: Oracle Apps, PeopleSoft, Baan • Hardware: • players: Sun, IBM, HP and Compaq UPC, February 1999

  18. Who is who in the database market? (2) • Historically, mainly mainframe proprietary OS • Today: • Unix: 40% • NT: 8% • Proprietary: 52% • In two years: • Unix 46% • NT 19% • Proprietary 35% UPC, February 1999

  19. Overview of a RDBMS: Oracle8 • Similar in structure to most commercial engines • Runs on: • uniprocessors • SMP multiprocessors • NUMA multiprocessors* • For clusters or message passing multiprocessors: • Oracle Parallel Server (OPS) UPC, February 1999

  20. The Oracle RDBMS • Physical structure • Control files • basic info on the database, its structure and status • Data files • tables: actual database data • indexes: sorted list of pointers to data • rollback segments: keep data for recovery upon a failed transaction • Log files • compressed storage of DB updates UPC, February 1999

  21. Index files • Critical in speeding up access to data by avoiding expensive scans • The more selective the index, the faster the access • Drawbacks: • Very selective indexes may occupy lots of storage • Updates to indexed data are more expensive UPC, February 1999
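The selectivity trade-off above can be sketched in a few lines of Python. This is a toy illustration under assumed data (not Oracle code): the "index" is a sorted list of (key, row position) pairs kept alongside an unsorted table, so a lookup is a binary search instead of a full scan.

```python
# Toy sketch: an index as a sorted (key, row_position) list over an
# unsorted table.  Names and data are illustrative assumptions.
import bisect

table = [("bob", 210), ("ann", 500), ("eve", 75), ("carl", 120)]

# Building the index costs storage and makes updates more expensive...
index = sorted((name, pos) for pos, (name, _) in enumerate(table))
keys = [k for k, _ in index]

def lookup_indexed(name):
    """O(log n) via the index instead of an O(n) full-table scan."""
    i = bisect.bisect_left(keys, name)
    if i < len(keys) and keys[i] == name:
        return table[index[i][1]]
    return None

def lookup_scan(name):
    """The expensive alternative the index avoids: scan every row."""
    return next((row for row in table if row[0] == name), None)
```

Both paths return the same row; the point is only that the indexed path touches O(log n) keys while the scan touches every row, and that every table update must also maintain `index`.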

  22. Files or raw disk devices • Most DB engines can directly access disks as raw devices • Idea is to bypass the file system • Manageability/flexibility somewhat compromised • Performance boost not large (~10-15%) • Most customer installations use file systems UPC, February 1999

  23. Transactions & rollback segments • Single transaction can access/update many items • Atomicity is required: • transaction either happens or not • old value of balance(X) is kept in a rollback segment • on failure, rollback: old values restored, all locks released • Example: bank transfer. Transaction A (accounts X, Y; value M) { read account balance(X); subtract M from balance(X); add M to balance(Y); commit } UPC, February 1999
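The bank-transfer example can be made concrete with a minimal sketch (a hypothetical model, not Oracle's implementation): old values are copied into a "rollback segment" before the update, so a failure before commit restores them and the transaction appears never to have run.

```python
# Minimal rollback-segment sketch (illustrative, not Oracle internals).
balances = {"X": 100, "Y": 50}

def transfer(src, dst, amount, fail=False):
    # Save old values in the rollback segment before touching anything.
    rollback_segment = {src: balances[src], dst: balances[dst]}
    try:
        balances[src] -= amount
        if fail:
            raise RuntimeError("crash before commit")
        balances[dst] += amount
        # commit: rollback entries can now be discarded
    except RuntimeError:
        # rollback: restore old values (and release all locks)
        balances.update(rollback_segment)

transfer("X", "Y", 30)              # commits
transfer("X", "Y", 30, fail=True)   # rolls back: balances unchanged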

  24. Transactions & log files • A transaction is only committed after its side effects are in stable storage • Writing all modified DB blocks would be too expensive • random disk writes are costly • a whole DB block has to be written back • no coalescing of updates • Alternative: write only a log of modifications • sequential I/O writes (enables NVDRAM optimizations) • batching of multiple commits • Background process periodically writes dirty data blocks out UPC, February 1999

  25. Transactions & log files (2) • When a block is written to disk, its log file entries are deleted • If the system crashes: • in-memory dirty blocks are lost • Recovery procedure: • goes through the log files and applies all updates to the database UPC, February 1999
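The logging and recovery scheme of the last two slides can be sketched as follows (assumed behavior, not Oracle internals): a commit appends redo records sequentially; dirty in-memory blocks are lost in a crash, and recovery replays the log against the on-disk blocks.

```python
# Write-ahead redo logging sketch (illustrative model only).
disk_blocks = {"acct_1": 100}      # stable storage
buffer_cache = dict(disk_blocks)   # in-memory copies
redo_log = []                      # also stable storage, written sequentially

def committed_update(block, new_value):
    buffer_cache[block] = new_value           # dirty the in-memory block
    redo_log.append((block, new_value))       # sequential write at commit

committed_update("acct_1", 170)
committed_update("acct_1", 240)

# Crash: in-memory dirty blocks are lost...
buffer_cache = dict(disk_blocks)

# ...recovery goes through the log and applies all updates.
for block, value in redo_log:
    disk_blocks[block] = value
```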

  26. Transactions & concurrency control • Many transactions in-flight at any given time • Locking of data items is required • Lock granularity: table, block, or row (coarser grain means less overhead, finer grain means more concurrency) • Efficient row-level locking is needed for high transaction throughput UPC, February 1999

  27. Row-level locking • Each new transaction is assigned a unique ID • A transaction table keeps track of all active transactions • Lock: write the transaction ID in the data block’s directory entry for the row • Unlock: remove the ID from the transaction table • this releases all of the transaction’s row locks simultaneously [Diagram: a transaction table of active IDs and a data block whose row directory entries hold the IDs of locking transactions] UPC, February 1999
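The lock/unlock protocol above can be modeled in a few lines (a toy model; the data structures are our own, not Oracle's on-disk format): a row's directory entry holds the locking transaction's ID, and a row is locked only while that ID is still in the transaction table, so dropping the ID releases every row at once.

```python
# Toy model of row-level locking via transaction IDs (illustrative).
transaction_table = set()   # IDs of all active transactions
row_directory = {}          # row -> ID written in its directory entry

def lock_row(txn_id, row):
    owner = row_directory.get(row)
    if owner in transaction_table:   # row held by a live transaction
        return False
    row_directory[row] = txn_id      # write our ID into the entry
    return True

def commit(txn_id):
    # Removing the ID releases all of its row locks simultaneously:
    # stale directory entries no longer match any active transaction.
    transaction_table.discard(txn_id)

transaction_table.add(233)
lock_row(233, "row_4")
lock_row(233, "row_7")
```

Note the design choice this models: unlock never walks the data blocks, so releasing many locks is O(1) at the cost of leaving stale IDs in directory entries.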

  28. Transaction read consistency • A transaction that reads a full table should see a consistent snapshot • For performance, reads shouldn’t lock a table • Problem: intervening writes • Solution: leverage rollback mechanism • intervening write saves old value in rollback segment UPC, February 1999
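The snapshot trick above can be sketched as follows (our simplification, using assumed names): a reader remembers when its scan started, and for any row written after that point it reconstructs the old value from the rollback segment instead of taking a lock.

```python
# Read-consistency sketch: a logical clock per write, with old values
# recoverable from the rollback segment (illustrative model only).
clock = 0
table = {"r1": (0, 10), "r2": (0, 20)}   # row -> (write_time, value)
rollback = {}                             # row -> value before last write

def write(row, value):
    global clock
    clock += 1
    rollback[row] = table[row][1]         # save old value first
    table[row] = (clock, value)

def snapshot_read(snapshot_time):
    # Rows written after the snapshot started are read from rollback.
    return {row: (rollback[row] if t > snapshot_time else v)
            for row, (t, v) in table.items()}

snap = clock             # reader starts its full-table read here
write("r2", 99)          # intervening write, no reader lock needed
view = snapshot_read(snap)
```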

  29. Oracle: software structure • Server processes • actual execution of transactions • DB writer • flush dirty blocks to disk • Log writer • writes redo logs to disk at commit time • Process and system monitors • misc. activity monitoring and recovery • Processes communicate through SGA and IPC UPC, February 1999

  30. Oracle: software structure(2) • SGA (System Global Area): • shared memory segment mapped by all processes • Block buffer area • cache of database blocks • larger portion of physical memory • Metadata area • where most communication takes place • synchronization structures • shared procedures • directory information [Diagram: SGA layout along increasing virtual addresses: fixed region, metadata area (shared pool, data dictionary, redo buffers), block buffer area] UPC, February 1999

  31. Oracle: software structure(3) • Hiding I/O latency: • many server processes/processor • large block buffer area • Process dynamics: • server reads/updates database • (allocates entries in the redo buffer pool) • at commit time server signals Log writer and sleeps • Log writer wakes up, coalesces multiple commits and issues log file write • after log is written, Log writer signals suspended servers UPC, February 1999
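The commit-time dynamics above amount to group commit, which can be sketched as follows (a simplification with invented names, not Oracle's process model): servers queue their commit records and sleep; the log writer coalesces everything queued into one sequential log write, then wakes them all.

```python
# Group-commit sketch (illustrative model of the log-writer dynamics).
pending_commits = []   # records queued by sleeping server processes
log_writes = []        # each element is one physical log-file write

def server_commit(record):
    # Server appends its redo record, signals the log writer, sleeps.
    pending_commits.append(record)

def log_writer_flush():
    if pending_commits:
        # One sequential I/O covers every queued commit...
        log_writes.append(list(pending_commits))
        # ...after which all suspended servers are woken.
        pending_commits.clear()

for txn in ("T1", "T2", "T3"):
    server_commit(txn)
log_writer_flush()
```

Three commits cost one log write here; that batching is what makes the sequential redo log so much cheaper than writing back whole blocks per transaction.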

  32. Oracle: NUMA issues • Single SGA region complicates NUMA localization • Single log writer process becomes a bottleneck • Oracle8 is incorporating NUMA-friendly optimizations • Current large NUMA systems use OPS even on a single address space UPC, February 1999

  33. Oracle Parallel Server (OPS) • Runs on clusters of SMPs/NUMAs • Layered on top of RDBMS engine • Shared data through disk • Performance very dependent on how well data can be partitioned • Not supported by most application vendors UPC, February 1999

  34. Running Oracle: other issues • Most memory allocated to block buffer area • Need to eliminate OS double buffering • Best performance attained by limiting process migration • In large SMPs, dedicating one processor to I/O may be advantageous UPC, February 1999

  35. TPC Database Benchmarks • Transaction Processing Performance Council (TPC) • Established about 10 years ago • Mission: define representative benchmark standards for vendors (hardware/software) to compare their products • Focus on both performance and price/performance • Strict rules about how the benchmark is run • The only widely used database benchmarks UPC, February 1999

  36. TPC pricing rules • Must include • All hardware • server, I/O, networking, switches, clients • All software • OS, any middleware, database engine • 5-year maintenance contract • Can include usual discounts • Audited components must be products UPC, February 1999

  37. TPC history of benchmarks • TPC-A • First OLTP benchmark • Based on Jim Gray’s Debit-Credit benchmark • TPC-B • Simpler version of TPC-A • Meant as a stress test of the server only • TPC-C • Current TPC OLTP benchmark • Much more complex than TPC-A/B • TPC-D • Current TPC DSS benchmark • TPC-W • New Web-based e-commerce benchmark UPC, February 1999

  38. The TPC-B benchmark • Models a bank with many branches • 1 transaction type: account update • begin transaction; update account balance; write entry in history table; update teller balance; update branch balance; commit • Metrics: • tpsB (transactions/second) • $/tpsB • Scale requirement: • 1 tpsB needs 100,000 accounts [Diagram: schema with Branch, Teller, Account, and History tables; each branch has 10 tellers and 100,000 accounts] UPC, February 1999
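The account-update transaction can be rendered as a runnable miniature using Python's built-in sqlite3 as a stand-in for a real DB engine (table and column names are our own illustration, not text from the TPC-B specification):

```python
# Miniature TPC-B-style account-update transaction on sqlite3.
# Schema names are illustrative assumptions, not from the spec.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER);
    CREATE TABLE teller  (id INTEGER PRIMARY KEY, balance INTEGER);
    CREATE TABLE branch  (id INTEGER PRIMARY KEY, balance INTEGER);
    CREATE TABLE history (account_id INTEGER, delta INTEGER);
    INSERT INTO account VALUES (1, 1000);
    INSERT INTO teller  VALUES (1, 0);
    INSERT INTO branch  VALUES (1, 0);
""")

def account_update(account, teller, branch, delta):
    with db:  # one transaction: commits on success, rolls back on error
        db.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                   (delta, account))
        db.execute("INSERT INTO history VALUES (?, ?)", (account, delta))
        db.execute("UPDATE teller SET balance = balance + ? WHERE id = ?",
                   (delta, teller))
        db.execute("UPDATE branch SET balance = balance + ? WHERE id = ?",
                   (delta, branch))

account_update(1, 1, 1, 50)
```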

  39. TPC-B: other requirements • System must be ACID • (A)tomicity • transactions either commit or leave the system as if they were never issued • (C)onsistency • transactions take the system from one consistent state to another • (I)solation • concurrent transactions execute as if in some serial order • (D)urability • results of committed transactions are resilient to faults UPC, February 1999

  40. The TPC-C benchmark • Current TPC OLTP benchmark • Moderately complex OLTP • Models a wholesale supplier managing orders • Workload consists of five transaction types • Users and database scale linearly with throughput • Specification was approved July 23, 1992 UPC, February 1999

  41. TPC-C: schema [Diagram: tables with cardinalities and one-to-many relationships: Warehouse (W), District (W*10), Customer (W*30K), History (W*30K+), Order (W*30K+), New-Order (W*5K), Order-Line (W*300K+), Stock (W*100K), Item (100K, fixed); each warehouse has 10 districts, each district 3K customers; each customer has 1+ orders; each order has 10-15 order lines and 0-1 new-order entries; a secondary index exists on Customer] UPC, February 1999

  42. TPC-C: transactions • New-order: enter a new order from a customer • Payment: update customer balance to reflect a payment • Delivery: deliver orders (done as a batch transaction) • Order-status: retrieve status of customer’s most recent order • Stock-level: monitor warehouse inventory UPC, February 1999

  43. TPC-C: transaction flow • 1. Select txn from menu: New-Order 45%, Payment 43%, Order-Status 4%, Delivery 4%, Stock-Level 4% (measure menu response time) • 2. Input screen (keying time) • 3. Execute txn (measure txn response time), then output screen (think time) • Go back to 1 UPC, February 1999

  44. TPC-C: other requirements • Transparency • tables can be split horizontally and vertically provided it is hidden from the application • Skew • 1% of new-order txn are to a random remote warehouse • 15% of payment txn are to a random remote warehouse • Metrics: • performance: new-order transactions/minute (tpmC) • cost/performance: $/tpmC UPC, February 1999

  45. TPC-C: scale • Maximum of 12 tpmC per warehouse • Consequently: • A quad-Xeon system today (~20,000 tpmC) needs • at least 1,667 warehouses • over 1 TB of disk storage!! • That’s a VERY expensive benchmark to run! UPC, February 1999
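The scale arithmetic behind the quad-Xeon example is just a ceiling division: the spec caps reportable throughput at 12 tpmC per warehouse, so a target throughput fixes a minimum warehouse count.

```python
# TPC-C scale arithmetic: warehouses needed for a target throughput,
# given the 12 tpmC-per-warehouse cap stated on the slide.
import math

def min_warehouses(target_tpmC, tpmC_per_warehouse=12):
    return math.ceil(target_tpmC / tpmC_per_warehouse)

warehouses = min_warehouses(20_000)   # the ~20,000 tpmC quad-Xeon example
```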

  46. TPC-C: side effects of the skew rules • Very small fraction of transactions go to remote warehouses • Transparency rules allow data partitioning • Consequence: • Clusters of powerful machines show exceptional numbers • Compaq has current TPC-C record of over 100 KtpmC with an 8-node memory channel cluster • Skew rules are expected to change in the future UPC, February 1999

  47. The TPC-D benchmark • Current DSS benchmark from TPC • Moderately complex decision support workload • Models a worldwide reseller of parts • Queries ask real world business questions • 17 ad hoc DSS queries (Q1 to Q17) • 2 update queries UPC, February 1999

  48. TPC-D: schema [Diagram: tables with cardinalities: Region (5), Nation (25), Customer (SF*150K), Supplier (SF*10K), Part (SF*200K), PartSupp (SF*800K), Order (SF*1500K), LineItem (SF*6000K)] UPC, February 1999

  49. TPC-D: scale • Unlike TPC-C, scale not tied to performance • Size determined by a Scale Factor (SF) • SF = {1,10,30,100,300,1000,3000,10000} • SF=1 means a 1GB database size • The majority of current results are in the 100GB and 300GB range • Indices and temporary tables can significantly increase the total disk capacity required (3-5x is typical) UPC, February 1999

  50. TPC-D example query • Forecasting Revenue Query (Q6) • This query quantifies the amount of revenue increase that would have resulted from eliminating company-wide discounts in a given percentage range in a given year. Asking this type of “what if” query can be used to look for ways to increase revenues • Considers all line-items shipped in a year • Query definition: SELECT SUM(L_EXTENDEDPRICE*L_DISCOUNT) AS REVENUE FROM LINEITEM WHERE L_SHIPDATE >= DATE ‘[DATE]’ AND L_SHIPDATE < DATE ‘[DATE]’ + INTERVAL ‘1’ YEAR AND L_DISCOUNT BETWEEN [DISCOUNT] - 0.01 AND [DISCOUNT] + 0.01 AND L_QUANTITY < [QUANTITY] UPC, February 1999
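Q6's logic can be restated in plain Python over a handful of invented line items (the data and field names below are our own; the real query runs over the LINEITEM table at SF-determined scale): sum extendedprice * discount for items shipped within the year whose discount falls in the +/- 0.01 band and whose quantity is below the threshold.

```python
# Plain-Python rendering of Q6's predicate and aggregate
# (illustrative data; field names are assumptions).
from datetime import date

lineitems = [
    # (shipdate, extendedprice, discount, quantity)
    (date(1994, 3, 1), 1000.0, 0.06, 10),
    (date(1994, 7, 9),  500.0, 0.05, 20),
    (date(1995, 1, 2),  800.0, 0.06, 10),   # outside the one-year window
    (date(1994, 5, 5),  300.0, 0.02, 10),   # discount out of range
]

start, discount, quantity = date(1994, 1, 1), 0.06, 24

revenue = sum(
    price * disc
    for ship, price, disc, qty in lineitems
    if start <= ship < date(start.year + 1, start.month, start.day)
    and discount - 0.01 <= disc <= discount + 0.01
    and qty < quantity
)
```

Only the first two rows qualify, so revenue is 1000*0.06 + 500*0.05 = 85.0; the "what if" reading is that this is the extra revenue had those discounts not been given.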
