
Netezza’s Deep Dive: Getting Your Data Warehouse Up and Running in 24 hours

Netezza’s Deep Dive: Getting Your Data Warehouse Up and Running in 24 hours. Chi-Chung Hui Consulting I/T Specialist Information Management Software, IBM HK. Simplicity, Flexibility, Choice IBM Data Warehouse & Analytics Solutions. Custom Solution. True Appliance.


Presentation Transcript


  1. Netezza’s Deep Dive: Getting Your Data Warehouse Up and Running in 24 Hours. Chi-Chung Hui, Consulting I/T Specialist, Information Management Software, IBM HK

  2. Simplicity, Flexibility, Choice: IBM Data Warehouse & Analytics Solutions. Custom Solution, True Appliance, Flexible Integrated System. Netezza, IBM Smart Analytics System, IBM InfoSphere Warehouse, Warehouse Accelerators, Information Management Portfolio (Information Server, MDM, Streams, etc.). Simplicity, Flexibility: the right mix of simplicity and flexibility.

  3. Netezza Value Proposition • Speed: Price/performance leader using hardware-based data streaming • Simplicity: Black-box appliance with no tuning or storage administration provides low TCO and fast time to value • Scalability: True MPP enables customers to run rapid queries and analytics on petabyte-sized data warehouses • Smart: Built-in advanced analytics pushed deep into the database deliver analytics to the masses

  4. Netezza and ISAS (IBM Smart Analytics System). Choose Netezza: • If the best price/performance is required • If the customer cannot afford much tuning and administration • If the customer needs the fastest time to value • If the customer does not want to pay for a separate database software license. Choose ISAS: • If AIX is preferred • If SAN and remote mirroring are required • If the customer requires the warehouse to conform to the data center infrastructure standard • If the customer prefers a more customized warehouse • If the customer needs very specific/deep tuning techniques

  5. Agenda • Netezza Solution Highlight • What is it? • Why is it good? • Netezza Tour • Specialized hardware and architecture • How does Netezza run database queries? • Netezza Simplicity • Netezza Performance

  6. Netezza – Solution Highlight Summary • True Appliance • Hardware, software and storage pre-built for data warehousing • Hardware specially designed for high-performance advanced analytics operations • Hardware compression based on table columns • Very fast • Usually 10x to 100x faster than a traditional database • Minimal administration and tuning • Low TCO

  7. Why Is Netezza Good?

  8. Legacy DWH Architectures: Moving large amounts of data becomes the bottleneck! [Diagram labels: Query, Results, Server, RDBMS SW, Storage, Data] • Large amounts of data are moved from disk, causing a bottleneck • Data is moved to memory, then the SQL is processed

  9. Netezza Performance Server: We are better in EDW operations with complex BI queries! [Diagram labels: Query, Results, SMP Host (2-4 CPU), Netezza Performance Server™] • CPU: 2% of existing systems • Network traffic: 1% of existing systems • Data is processed as streams from disk before it is moved to memory

  10. Agenda • Netezza Solution Highlight • What is it? • Why is it good? • Netezza Tour • Specialized hardware and architecture • How does Netezza run database queries? • Netezza Simplicity • Netezza Performance

  11. The IBM Netezza TwinFin™ Appliance • Disk Enclosures: slice of user data; swap and mirror partitions; high-speed data streaming • SMP Hosts: SQL compiler; query plan; optimize; admin • Snippet Blades™ (S-Blades™): processor & streaming DB logic; high-performance database engine; streaming joins, aggregations, sorts, etc.

  12. The S-Blade™: CPU Blade + FPGA sidecar

  13. S-Blade™ Components SAS Expander Module SAS Expander Module Dual-Core FPGA DRAM Intel Quad-Core Netezza DB Accelerator IBM BladeCenter Server

  14. The IBM Netezza AMPP™ Architecture [Diagram labels: Advanced Analytics, BI, ETL, Loader, Applications; ODBC/JDBC; Host, Hosts; Network Fabric; S-Blades™ (each with FPGA, CPU, Memory); Disk Enclosures; Netezza Appliance]

  15. How Does Netezza Operate in Database Queries?

  16. Our Secret Sauce. Query: select DISTRICT, PRODUCTGRP, sum(NRX) from MTHLY_RX_TERR_DATA where MONTH = '20091201' and MARKET = 509123 and SPECIALTY = 'GASTRO' • Input: slice of table MTHLY_RX_TERR_DATA (compressed) • FPGA Core: Uncompress; Project (select DISTRICT, PRODUCTGRP, sum(NRX)); Restrict, Visibility (where MONTH = '20091201' and MARKET = 509123 and SPECIALTY = 'GASTRO') • CPU Core: Complex ∑, Joins, Aggs, etc. (sum(NRX))
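
The division of labor on slide 16 can be illustrated with a small Python sketch. This is only a software analogy under stated assumptions, not how the appliance is implemented: the function names `restrict_and_project` and `aggregate`, the sample rows, and the extra `PRESCRIBER` column are hypothetical, and grouping by DISTRICT and PRODUCTGRP is assumed because the query on the slide shows no GROUP BY clause. The first stage plays the role the slide gives the FPGA (drop non-matching rows and unneeded columns as the data streams past), and the second stage plays the role of the CPU (aggregate what is left).

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

Row = Dict[str, object]


def restrict_and_project(rows: Iterable[Row]) -> Iterator[Tuple[str, str, int]]:
    """Stage 1 (FPGA role on the slide): filter rows and keep only needed columns."""
    for row in rows:
        # Restrict: apply the WHERE predicates from the slide's query.
        if (row["MONTH"] == "20091201"
                and row["MARKET"] == 509123
                and row["SPECIALTY"] == "GASTRO"):
            # Project: pass on only the three columns the query needs.
            yield row["DISTRICT"], row["PRODUCTGRP"], row["NRX"]


def aggregate(stream: Iterable[Tuple[str, str, int]]) -> Dict[Tuple[str, str], int]:
    """Stage 2 (CPU role on the slide): sum NRX per (DISTRICT, PRODUCTGRP)."""
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    for district, productgrp, nrx in stream:
        totals[(district, productgrp)] += nrx
    return dict(totals)


# Hypothetical sample slice of MTHLY_RX_TERR_DATA (invented rows, for illustration only).
sample_slice = [
    {"DISTRICT": "D1", "PRODUCTGRP": "P1", "MONTH": "20091201", "MARKET": 509123,
     "SPECIALTY": "GASTRO", "NRX": 10, "PRESCRIBER": "a"},
    {"DISTRICT": "D1", "PRODUCTGRP": "P1", "MONTH": "20091201", "MARKET": 509123,
     "SPECIALTY": "GASTRO", "NRX": 5, "PRESCRIBER": "b"},
    {"DISTRICT": "D2", "PRODUCTGRP": "P1", "MONTH": "20091101", "MARKET": 509123,
     "SPECIALTY": "GASTRO", "NRX": 7, "PRESCRIBER": "c"},   # wrong month
    {"DISTRICT": "D2", "PRODUCTGRP": "P2", "MONTH": "20091201", "MARKET": 509123,
     "SPECIALTY": "GASTRO", "NRX": 3, "PRESCRIBER": "d"},
    {"DISTRICT": "D1", "PRODUCTGRP": "P1", "MONTH": "20091201", "MARKET": 509123,
     "SPECIALTY": "CARDIO", "NRX": 100, "PRESCRIBER": "e"},  # wrong specialty
]

result = aggregate(restrict_and_project(sample_slice))
print(result)  # {('D1', 'P1'): 15, ('D2', 'P2'): 3}
```

Because the first stage is a generator, the second stage never sees the rows or columns that were filtered out, which is the point the slide makes about reducing the data before it reaches the CPU and memory.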

  17. Netezza Eliminates the I/O Bottleneck • Move the SQL to the hardware… to where the data lives • “Just send the Answer, not Raw Data”

  18. Agenda • Netezza Solution Highlight • What is it? • Why is it good? • Netezza Tour • Specialized hardware and architecture • How does Netezza run database queries? • Netezza Simplicity • Netezza Performance

  19. Why traditional database systems are not enough: Endless tuning. Business person: “Query performance is slow.”

  20. Why traditional database systems are not enough: Endless tuning. Technical person: “I’ll add an index.”

  21. Why traditional database systems are not enough: Endless tuning. Business person: “Load performance is slow. When can I access my data?”

  22. Why traditional database systems are not enough: Endless tuning. Technical person: “I’ll investigate and get back to you …”

  23. Why traditional database systems are not enough: Endless tuning. Technical person: “Okay… I will add an aggregate table to pre-calculate so that the report will run faster.”

  24. Why traditional database systems are not enough: Endless tuning. Business person: “I want my report to be refreshed every hour.”

  25. Why traditional database systems are not enough: Endless tuning. Technical person: “Oh… that is impossible… The report will be updated once every day after the nightly batch…”

  26. Why traditional database systems are not enough: Wasted effort

  27. Solving the data load and query performance problem. “We act out the market every day to capitalize on opportunities. Complex merchandize reports that had taken days to process on the old platform now take five minutes on the new one. Simpler queries are even faster.” -- Chief Information Officer at a large US retailer

  28. Netezza Loads Data at 2.5TB per Hour

  29. Netezza is Simple to Deploy Since it is so Fast • Operations • Simply load and go… it’s an appliance • Minimal DBA tuning • No configuration or physical modeling • No indexes: out-of-the-box performance • ETL Developers • No aggregate tables needed -> less ETL logic • Faster load and transformation times • Business Analysts • Train-of-thought analysis: 10 to 100x faster • True ad hoc queries: no tuning, no indexes • Ask complex queries against large datasets

  30. Traditional Complexity … Netezza Simplicity. Traditional: 0. CREATE DATABASE TEST LOGFILE 'E:\OraData\TEST\LOG1TEST.ORA' SIZE 2M, 'E:\OraData\TEST\LOG2TEST.ORA' SIZE 2M, 'E:\OraData\TEST\LOG3TEST.ORA' SIZE 2M, 'E:\OraData\TEST\LOG4TEST.ORA' SIZE 2M, 'E:\OraData\TEST\LOG5TEST.ORA' SIZE 2M EXTENT MANAGEMENT LOCAL MAXDATAFILES 100 DATAFILE 'E:\OraData\TEST\SYS1TEST.ORA' SIZE 50 M DEFAULT TEMPORARY TABLESPACE temp TEMPFILE 'E:\OraData\TEST\TEMP.ORA' SIZE 50 M UNDO TABLESPACE undo DATAFILE 'E:\OraData\TEST\UNDO.ORA' SIZE 50 M NOARCHIVELOG CHARACTER SET WE8ISO8859P1; 1. Oracle* table and indexes 2. Oracle tablespace 3. Oracle datafile 4. Veritas file 5. Veritas file system 6. Veritas striped logical volume 7. Veritas mirror/plex 8. Veritas sub-disk 9. SunOS raw device 10. Brocade SAN switch 11. EMC Symmetrix volume 12. EMC Symmetrix striped meta-volume 13. EMC Symmetrix hyper-volume 14. EMC Symmetrix remote volume (replication) 15. Days/weeks of planning meetings. Netezza: Low (ZERO) Touch: CREATE DATABASE my_db;

  31. Netezza Delivers Simplicity • Up and running 6 months before being trained • 200X faster than Oracle system • ROI in less than 3 months. “Allowing the business users access to the Netezza box was what sold it.” -- Steve Taff, Executive Dir. of IT Services

  32. Agenda • Netezza Solution Highlight • What is it? • Why is it good? • Netezza Tour • Specialized hardware and architecture • How does Netezza run database queries? • Netezza Simplicity • Netezza Performance

  33. POC - A Telco Company • Environment • Netezza TwinFin 12 full rack • Raw Data volume • Call Level Detail : 3TB (9 billion rows) • Financial Bill : 600GB (5.4 billion rows) • Customer Info : 60GB (91.1 million rows)
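
For a sense of scale, the raw volumes on this slide can be combined with the 2.5 TB per hour load rate quoted on slide 28. The sketch below is a back-of-envelope calculation only, not a POC result: it assumes decimal units (1 TB = 10^12 bytes), assumes the quoted rate applies uniformly to all three tables, and the variable names are made up for illustration.

```python
# Decimal units are an assumption; the slides do not say whether TB means 10**12 or 2**40 bytes.
GB = 10**9
TB = 10**12

# (raw size in bytes, row count) per table, taken from the POC slide.
tables = {
    "Call Level Detail": (3 * TB, 9_000_000_000),
    "Financial Bill": (600 * GB, 5_400_000_000),
    "Customer Info": (60 * GB, 91_100_000),
}

# Nominal load rate quoted on slide 28 (2.5 TB per hour).
LOAD_RATE_BYTES_PER_HOUR = 2_500 * GB

total_bytes = sum(size for size, _rows in tables.values())
nominal_load_hours = total_bytes / LOAD_RATE_BYTES_PER_HOUR

# Average raw bytes per row for each table.
avg_row_bytes = {name: size / rows for name, (size, rows) in tables.items()}

print(f"Total raw volume: {total_bytes / TB:.2f} TB")
print(f"Nominal load time at 2.5 TB/h: {nominal_load_hours:.2f} hours")
for name, value in avg_row_bytes.items():
    print(f"{name}: about {value:.0f} bytes per row")
```

Under these assumptions the three tables total 3.66 TB, which at the nominal rate is roughly an hour and a half of loading; actual times depend on data shape, source systems, and network.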

  34. POC: Data Maintenance

  35. POC: Enquiries (With NO Indexes)

  36. Catalina Marketing: Building loyalty one customer at a time • Marketing to a segment of one: 195 million US loyalty program members • Every coupon printed is unique to the individual customer • Customized based on three years' worth of purchase history • Increased staff productivity: from 50 to 600 new models per year • Increased efficiency: from 4 hours to score a model to 60 seconds

  37. Netezza’s Deep Dive: Getting Your Data Warehouse Up and Running in 24 Hours. Chi-Chung Hui, Consulting I/T Specialist, Information Management Software, IBM HK
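
The two improvement figures on this slide can be expressed as ratios. The short sketch below only restates the slide's numbers as arithmetic; the variable names are illustrative.

```python
SECONDS_PER_HOUR = 3600

# Model scoring time: from 4 hours down to 60 seconds.
scoring_before_s = 4 * SECONDS_PER_HOUR
scoring_after_s = 60
scoring_speedup = scoring_before_s / scoring_after_s

# New models per year: from 50 up to 600.
models_before = 50
models_after = 600
model_throughput_gain = models_after / models_before

print(f"Scoring speedup: {scoring_speedup:.0f}x")          # Scoring speedup: 240x
print(f"Models per year gain: {model_throughput_gain:.0f}x")  # Models per year gain: 12x
```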
