
ETL Job Performance Optimization Steps for Teradata

Learn the steps to optimize ETL job performance in Teradata, including indexing, SQL join strategies, table statistics, the Explain command, and TTU usage.



  1. ETL Job Performance Tuning Training

  2. Agenda • ETL job performance tuning steps • Indexes • SQL join strategies • Table statistics • Explain • TTU usage

  3. Agenda • ETL job performance tuning steps • Indexes • SQL join strategies • Table statistics • Explain • TTU usage

  4. Teradata Performance Tuning Steps [cycle diagram, fed from the ETL Repository or DBQL; first create a tuning environment] • Step 1: Identify tuning opportunities • Step 2: Find suspect queries • Step 3: Determine possible solutions • Step 4: Choose the best solution • Step 5: Regression test • Step 6: Quantify the performance gain • Step 7: Write up and develop the final solution

  5. ETL Job Performance Analysis (1) • Current state of ETL jobs • Daily jobs • Tuning metric: run time • 290 jobs run longer than 5 minutes, 196 longer than 10 minutes, 91 longer than 20 minutes, 49 longer than 30 minutes, and 18 longer than 1 hour; the longest job runs 10 hours 10 minutes 49 seconds • Monthly jobs • 601 jobs run longer than 5 minutes, 447 longer than 10 minutes, 314 longer than 20 minutes, 236 longer than 30 minutes, 131 longer than 1 hour, 55 longer than 2 hours, 30 longer than 3 hours, and 19 longer than 4 hours; the longest job runs 8 hours 19 minutes 51 seconds

  6. ETL Job Performance Analysis (2) • ETL job tuning recommendations • Daily jobs • Tuning metric: run time • Optimize the jobs running longer than 10 minutes (38 of 196) • Monthly jobs • Optimize the jobs running longer than 30 minutes (67 of 236)

  7. ETL Job Performance Tuning Steps • Obtain the ETL job SQL and identify the longest-running statements; • Optimize the SQL and determine the best tuning approach: • Is the PI skewed? • Have statistics, including join statistics, been collected? • Is the SQL too long and better implemented in steps? • Use the Explain command to: • Check whether index columns are used and whether they take effect; • Check the join strategy the SQL uses; • Modify and test the SQL according to the chosen approach; if the improvement is small, return to step 2 until the best approach is found; • Modify the ETL script, test it, and send it to the relevant people for deployment.
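Step 1, finding the long-running SQL, is typically done against DBQL. A minimal sketch, assuming query logging is enabled; the DBC.QryLogV view and its columns vary by Teradata release, and the 10-minute threshold is illustrative:

```sql
-- Yesterday's queries that ran longer than 10 minutes, worst first
SELECT  UserName,
        QueryID,
        StartTime,
        FirstRespTime,
        (FirstRespTime - StartTime) DAY(2) TO SECOND AS RunTime
FROM    DBC.QryLogV
WHERE   CAST(StartTime AS DATE) = CURRENT_DATE - 1
  AND   (FirstRespTime - StartTime) DAY(2) TO SECOND > INTERVAL '10' MINUTE
ORDER BY RunTime DESC;
```

The QueryText column of the same view (or DBC.QryLogSQLV for full text) then yields the SQL to tune in step 2.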

  8. Agenda • ETL job performance tuning steps • Indexes • SQL join strategies • Table statistics • Explain • TTU usage

  9. 索引 • Primary Index(PI) • Secondary Index(SI) • Partitioned Primary Index(PPI)

  10. Primary Index: Storing Rows [diagram: Table A rows and Table B rows spread across four AMPs] • The rows of every table are distributed among all AMPs. • Each AMP is responsible for a subset of the rows of each table. • Ideally, each table will be evenly distributed among all AMPs. • Evenly distributed tables result in evenly distributed workloads. • The uniformity of distribution of the rows of a table depends on the choice of the Primary Index.

  11. Primary Index Values [diagram: the PE passes the PI value through the hashing algorithm, which determines row assignment and row access on an AMP] • The value of the Primary Index for a specific row determines the AMP assignment for that row. • This is done using a hashing algorithm. • Other table access techniques: • Secondary index access • Full table scans • Accessing a row by its Primary Index value is: • always a one-AMP operation • the most efficient way to access a row
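The hash-to-AMP mapping can be inspected directly with Teradata's hash functions (HASHROW, HASHBUCKET, HASHAMP). A sketch, using the PI columns of the example table defined on slide 22 (any table and its PI columns can be substituted):

```sql
-- Which AMP does a given PI value hash to?
SELECT HASHAMP(HASHBUCKET(HASHROW(1004938))) AS amp_no;

-- Row count per AMP for an existing table, showing how evenly the PI distributes
SELECT  HASHAMP(HASHBUCKET(HASHROW(ACCT_ID, USER_ID, CUST_ID))) AS amp_no,
        COUNT(*) AS row_cnt
FROM    DBNAME.TABLENAME
GROUP BY 1
ORDER BY row_cnt DESC;
```

A large spread between the highest and lowest row_cnt indicates a skewed Primary Index choice.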

  12. Primary Index Characteristics Primary Indexes (UPI and NUPI) • A Primary Index may be different than the Primary Key. • Every table has exactly one Primary Index. • A Primary Index may contain null(s). • Single-value access uses ONE AMP and, typically, one I/O. Unique Primary Index (UPI) • Involves a single base table row at most. • No spool file is ever required. • The system automatically enforces uniqueness on the index value. Non-Unique Primary Index (NUPI) • May involve multiple base table rows. • A spool file is created when needed. • Duplicate values go to the same AMP and the same data block. • Only one I/O is needed if all the rows fit in a single data block. • A duplicate-row check for a SET table is required if there is no USI on the table.

  13. NUPI Duplicate Row Check Limit NUPI rows per value to rows per block whenever possible. To avoid NUPI duplicate row checks: • Enforce uniqueness through a USI on other column(s) such as the PK. • Create the table as a MULTISET table. The chart below illustrates the additional I/O overhead; for n rows per value, the cumulative number of logical row reads is n(n-1)/2.

  Row number     Rows that must be       Cumulative number
  to be inserted logically read first    of logical row reads
  1              0                       0
  2              1                       1
  3              2                       3
  4              3                       6
  5              4                       10
  6              5                       15
  7              6                       21
  8              7                       28
  9              8                       36
  10             9                       45
  20             19                      190
  50             49                      1,225
  100            99                      4,950
  200            199                     19,900
  500            499                     124,750
  1000           999                     499,500
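Both remedies can be applied in DDL; a sketch with illustrative database, table, and column names:

```sql
-- Remedy 1: enforce uniqueness through a USI on the PK column, so the
-- SET table's duplicate-row check is satisfied via the index instead
-- of row-by-row comparisons.
CREATE UNIQUE INDEX (cust_pk) ON MyDB.Customer;

-- Remedy 2: copy the definition as a MULTISET table, which skips the
-- duplicate-row check entirely.
CREATE MULTISET TABLE MyDB.Customer_ms AS MyDB.Customer WITH NO DATA;
```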

  14. Multi-Column Primary Indexes Advantage: more columns = more uniqueness • The number of distinct values increases. • Rows/value decreases. • Selectivity increases. Disadvantage: more columns = less usability • The PI can only be used when values for all PI columns are provided in the SQL statement. • Partial values cannot be hashed.

  15. Primary Index Choice Criteria ACCESS: maximize one-AMP operations • Choose the column(s) most frequently used for access. Consider both join and value access. DISTRIBUTION: optimize parallel processing • Choose the column(s) that provide good distribution. VOLATILITY: reduce maintenance resource overhead (I/O) • Choose the column(s) with stable data values. Note: data distribution has to be balanced against access usage when choosing a PI. Formula for the table skew factor: LOCK dbc.TableSize FOR ACCESS SELECT (100 - (AVG(CurrentPerm)/MAX(CurrentPerm)*100)) AS SkewFactor FROM dbc.TableSize WHERE DataBaseName='xxx' AND TableName='xxx';
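To see where the skew actually sits, the same DBC.TableSize view can be read per AMP (Vproc); the 'xxx' names are placeholders, as in the formula above:

```sql
-- Per-AMP perm space for one table; a large spread between the
-- largest and smallest CurrentPerm values means a skewed PI.
LOCK dbc.TableSize FOR ACCESS
SELECT  Vproc,
        CurrentPerm
FROM    dbc.TableSize
WHERE   DataBaseName = 'xxx'
  AND   TableName    = 'xxx'
ORDER BY CurrentPerm DESC;
```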

  16. Secondary Index There are 3 general ways to access a table: • Primary Index access (one-AMP access) • Secondary Index access (two-AMP or all-AMP access) • Full Table Scan (all-AMP access) • A Secondary Index provides an alternate path to the rows of a table. • A table can have from 0 to 32 secondary indexes. • Secondary Indexes: • Do not affect table distribution. • Add overhead, both in terms of disk space and maintenance. • May be added or dropped dynamically as needed. • Are chosen to improve table performance.
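Adding and dropping secondary indexes dynamically is plain DDL; a sketch with illustrative names:

```sql
-- USI: the system enforces uniqueness on the indexed value
CREATE UNIQUE INDEX (Social_Security) ON MyDB.Customer;

-- Named NUSI on a frequently used set-selection column
CREATE INDEX idx_lname (Last_Name) ON MyDB.Customer;

-- Drop a NUSI the Optimizer no longer uses, reclaiming its subtable space
DROP INDEX idx_lname ON MyDB.Customer;
```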

  17. Secondary Index Considerations • Secondary Indexes consume disk space for their subtables. • INSERTs, DELETEs, and (sometimes) UPDATEs cost double the I/Os. • Choose Secondary Indexes on frequently used set selections. • Secondary Index use is typically based on an equality search. • A NUSI may have multiple rows per value. • The Optimizer may not use a NUSI if it is too weakly selective. • Avoid choosing Secondary Indexes with volatile data values. • Weigh the impact on batch maintenance and OLTP applications. • USI changes are Transient Journaled; NUSI changes are not. • NUSIs are generally useful in decision support applications. • Remove or drop NUSIs that are not used. Data demographics change over time. Revisit ALL index (Primary and Secondary) choices regularly. Make sure they are still serving you well.

  18. Comparison of Primary and Secondary Indexes

  Index Feature                    Primary    Secondary
  Required?                        Yes        No
  Number per Table                 1          0 - 32
  Max Number of Columns (V2R5+)    64         64
  Unique or Non-unique             Both       Both
  Affects Row Distribution         Yes        No
  Created/Dropped Dynamically      No         Yes
  Improves Access                  Yes        Yes
  Multiple Data Types              Yes        Yes
  Separate Physical Structure      No         Sub-table
  Extra Processing Overhead        No         Yes
  May be ordered by value          No         Yes (NUSI)
  May be Partitioned (V2R5)        Yes        No

  19. Partitioned Primary Indexes (PPI) What is a “Partitioned Primary Index” or PPI? • An indexing mechanism in Teradata for use in physical database design. • Data rows are grouped into partitions at the AMP level. • Partitions are just an ordering of the rows on an AMP. What advantages does a PPI provide? • Increases the available options to improve the performance of certain types of queries, specifically range-constrained queries. • Only the rows of the partitions qualified by a query need to be accessed, avoiding full table scans. How is a PPI created and managed? • A PPI is easy to create and manage. • The CREATE TABLE and ALTER TABLE statements contain options to create and/or alter partitions. • As always, data is distributed among AMPs and automatically placed within partitions.

  20. Potential Disadvantages of PPI Potential disadvantages of a PPI include: • PPI rows are 2 bytes longer. Table uses more PERM space. • Also increases secondary index subtable row size by 2 bytes for each referencing rowid in the index. • A PI access can be degraded if the partitioning column is not part of the PI. • A query specifying a PI value, but no value for the partitioning column, must look in each partition for that value. • Joins to non-partitioned tables with the same PI may be degraded. • If one of the tables is partitioned, the rows won't be ordered the same, and the task, in effect, becomes a set of sub-joins, one for each partition of the PPI table. • The PI can’t be defined as unique when the partitioning column is not part of the PI.

  21. PPI Considerations A table can have up to 65,535 partitions (this limit also applies with multi-level partitioning). • Partitioning columns do not have to be columns in the primary index. • There are numerous options for partitioning. • As rows are inserted into the table, the partitioning expression is evaluated to determine the proper partition placement for each row.

  22. CREATE TABLE Considerations • CREATE MULTISET TABLE DBNAME.TABLENAME ,NO FALLBACK • ( • DEAL_DATE DATE FORMAT 'YYYY-MM-DD' TITLE '处理日期', • ACCT_ID DECIMAL(14,0) TITLE '帐户ID' NOT NULL, • USER_ID DECIMAL(14,0) TITLE '用户ID' NOT NULL, • CUST_ID DECIMAL(14,0) TITLE '客户ID' NOT NULL, • ...) • PRIMARY INDEX ( ACCT_ID ,USER_ID ,CUST_ID ) • PARTITION BY RANGE_N(DEAL_DATE BETWEEN DATE '2005-01-01' AND DATE '2010-12-31' EACH INTERVAL '1' DAY ) • INDEX(USER_ID); Note: 1. Create MULTISET tables. 2. Create NO FALLBACK tables (a FALLBACK table consumes twice the storage space).
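With the table above, a range constraint on the partitioning column lets the Optimizer scan only the qualifying partitions; the exact EXPLAIN wording varies by Teradata release:

```sql
-- Range-constrained query on the partitioning column DEAL_DATE
EXPLAIN
SELECT COUNT(*)
FROM   DBNAME.TABLENAME
WHERE  DEAL_DATE BETWEEN DATE '2010-01-01' AND DATE '2010-01-31';
-- The plan should mention a scan of 31 partitions of DBNAME.TABLENAME
-- rather than an all-rows scan of the whole table.
```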

  23. Agenda • ETL job performance tuning steps • Indexes • SQL join strategies • Table statistics • Explain • TTU usage

  24. Join Processing Rows must be on the same AMP to be joined. • If necessary, the system creates spool copies of one or both rows and moves them to a common AMP. • Join processing NEVER moves or changes the original table rows. Typical kinds of joins are: • Merge Join • Product Join • Nested Join • Exclusion Join The Optimizer chooses the best join strategy based on: • Available Indexes • Demographics (COLLECTed STATISTICS or Dynamic Sample) EXPLAIN shows what kind of join a query uses.

  25. Merge Join [diagram: spool rows from both tables sorted into join-column row-hash sequence on a common AMP] Rows must be on the same AMP to be joined. • Merge Join reads blocks from both tables only once. • Usually chosen for an equality join condition. • Generally more efficient than a product join. Merge join process: • Identify the smaller table. • If necessary: • Put qualifying data of one or both tables into spool(s). • Move the spool rows to AMPs based on the join column hash. • Sort the spool rows into join column hash sequence. • Compare the rows with matching join column row hash values. This causes significantly fewer comparisons than a product join.

  26. Nested Joins This is a special join case. • This is the only join that doesn't always use all of the AMPs. • It is the most efficient in terms of system resources. • It is the best choice for OLTP applications. • To choose a Nested Join, the Optimizer must have: – An equality value for a unique index (UPI or USI) on Table1. – A join on a column of that single row to any index on Table2. • The system retrieves the single row from Table1. • It hashes the join column value to access matching Table2 row(s). Example tables: Employee (Enum PK/UPI, Name, Dept FK) with rows 1 BROWN 200, 2 SMITH 310, 3 JONES 310, 4 CLAY 400, 5 PETERS 150, 6 FOSTER 400, 7 GRAY 310, 8 BAKER 310; Department (Dept PK/UPI, Name) with rows 150 PAYROLL, 200 FINANCE, 310 MFG., 400 EDUCATION. Example: SELECT E.Name, D.Name FROM Employee E INNER JOIN Department D ON E.Dept = D.Dept WHERE E.Enum = 5;

  27. Product Join [diagram: every qualifying row of one table compared to every qualifying row of the other] Rows must be on the same AMP to be joined. • Does not sort the rows. • May re-read blocks from one table if AMP memory size is exceeded. • It compares every qualifying Table1 row to every qualifying Table2 row. • Those that match the WHERE condition are saved in spool. • It is called a Product Join because: Total Compares = # Qualified Rows in Table1 * # Qualified Rows in Table2 • The internal compares become very costly when there are more rows than AMP memory can hold at one time. • They are generally unintentional and often give meaningless output. • Product Join process: • Identify the smaller table and duplicate it in spool on all AMPs. • Join each spool row of the smaller table to every row of the larger table.

  28. Cartesian Product This is an unconstrained Product Join. • Each row of Table1 is joined to every row in Table2. • Cartesian Product Joins consume significant system resources. Table row count is critical: Table1 (50K rows) x Table2 (300K rows) = 15,000,000,000 rows. The number of tables is even more critical: T1 (50 rows) x T2 (50 rows) x ... x T8 (50 rows) = 50^8 = 39,062,500,000,000 rows. • Cartesian Product Joins rarely have practical business use. • The Teradata Database supports them for ANSI compatibility. • Cartesian Product Joins frequently occur when: • A join condition is missing or there are too few join conditions. • Join conditions are not based on equality. • A referenced table is not named in any join condition. • Table aliases are incorrectly used. • The transaction aborts if it exceeds the user's spool limit.
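The most common cause, a missing join condition, looks like this (table and column names are illustrative):

```sql
-- Accidental Cartesian product: no join condition, so
-- 50,000 x 300,000 = 15,000,000,000 spool rows are produced.
SELECT t1.col_a, t2.col_b
FROM   tab1 t1, tab2 t2;

-- Intended query: an equality join condition bounds the result.
SELECT t1.col_a, t2.col_b
FROM   tab1 t1
JOIN   tab2 t2
  ON   t1.key_col = t2.key_col;
```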

  29. Hash Join [diagram: the smaller table held in cache memory in join-column row-hash sequence; larger-table spool rows probe it] This optimizer technique effectively places the smaller table in cache memory and joins it to the larger table in unsorted spool. Row Hash Join process: • Identify the smaller table. • Redistribute or duplicate the smaller table in memory across the AMPs. • Sort the cache memory into join column row hash sequence. • Hold the rows in memory. • Use the join column row hash of the larger table to binary search memory for a match. This join eliminates the sorting, and possible redistribution or copying, of the larger table. EXPLAIN plans will contain terminology such as “Single Partition Hash Join”.

  30. Join Distribution Strategies MERGE JOIN (rows must be in JOIN COLUMN row-hash sequence): • M1: Do nothing if the Primary Indexes match and are the join columns. • M2: REDISTRIBUTE one or both sides (depending on the Primary Indexes used in the join) and SORT on join column row hash. • M3: DUPLICATE the smaller table on all AMPs and SORT on join column row hash; a spool copy of the larger table is built LOCALLY and SORTed on join column row hash. PRODUCT JOIN (rows do not have to be in any sequence): • P1: DUPLICATE the smaller table on all AMPs. • Join costs rise with the number of rows that are moved and sorted. • Join plans for the same tables change as the demographics change.

  31. Exclusion Joins • Find rows that DON'T have a match. • May be done as merge or product joins. • Are caused by NOT IN subqueries and EXCEPT operations. • Use 3-value logic (=, <>, unknown) on nullable columns. • Define NOT IN columns as NOT NULL in the CREATE TABLE if possible. • Use WHERE colname IS NOT NULL in queries against nullable join columns. Example: Set_A {1, 2, 3, 4} NOT IN Set_B {1, 3, 5} gives {2, 4}; but Set_A {1, 2, 3, 4} NOT IN Set_B {1, 3, 5, NULL} gives an empty (unknown) result.
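The NULL pitfall above in SQL form (table and column names are illustrative):

```sql
-- Risky: if even one B.key_col is NULL, NOT IN returns no rows at all,
-- because every comparison against NULL evaluates to unknown.
SELECT * FROM A WHERE key_col NOT IN (SELECT key_col FROM B);

-- Per the slide: filter NULLs from the nullable join column.
SELECT * FROM A
WHERE  key_col NOT IN (SELECT key_col FROM B WHERE key_col IS NOT NULL);

-- Alternative: NOT EXISTS uses 2-value logic and is unaffected by NULLs.
SELECT * FROM A
WHERE  NOT EXISTS (SELECT 1 FROM B WHERE B.key_col = A.key_col);
```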

  32. Join Processing Summary Inefficient joins result from: • Poor physical design choices • Lack of indexes • Inappropriate indexes • Stale or missing Collected Statistics • Inefficient SQL code • Poor SQL code can degrade performance on a good database design. • Good SQL code cannot compensate for a poor database design. The system bases join planning on: • Primary and Secondary Indexes • Estimated number of rows in each subtable • Estimated ratio of table rows per index value COLLECTed STATISTICS may improve join performance. The fastest merge joins are based on matching Primary Indexes. Data demographics change over time. Join plans for the same tables change as demographics change. Revisit ALL index (Primary and Secondary) choices regularly. Make sure they are still serving you well.

  33. Agenda • ETL job performance tuning steps • Indexes • SQL join strategies • Table statistics • Explain • TTU usage

  34. Optimizer Statistics The optimizer needs information to create the best execution plan for a query. Environment information: • Number of nodes • Number of AMPs • Number and type of CPUs • Disk Array information • Interconnect (BYNET) information • Amount of memory available Data demographics: • Number of rows in the table • Row size • Column demographics • Skew • Range of values in the table for the column • Number of rows per value • Index demographics Statistics are either collected by the user or obtained by Random AMP Sampling.

  35. Statistics • Statistics basically tell the Optimizer how many rows/value there are. • The Optimizer uses statistics to plan the best way to access data. • Usually improves performance of complex queries and joins. • The parser is more aggressive with collected statistics. • Stale statistics may mislead the Optimizer into poor decisions. • Helpful in accessing a column or index with uneven value distribution. • NUSI Bit Mapping is much more likely to be considered if there are collected statistics. • Statistics remain valid across a reconfiguration of the system. • COLLECT STATISTICS and DROP STATISTICS commands are DDL statements and typically are not executed during production hours. • COLLECT/DROP STATISTICS places an access lock on the data table and a row-hash write lock on DBC.TVFields or DBC.Indexes. • DBC.TVFields – holds statistics collected for single column or single column index • DBC.Indexes – holds statistics collected for multi-column or multi-column index
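The corresponding DDL, using the Customer table from the HELP STATISTICS example in this deck (pre-TD14 syntax shown; newer releases also accept a COLLECT STATISTICS COLUMN (...) ON table form):

```sql
-- Single-column statistics (stored in DBC.TVFields)
COLLECT STATISTICS ON Customer COLUMN (Customer_Number);

-- Index statistics (stored in DBC.Indexes); the NUSI on Last_Name must exist
COLLECT STATISTICS ON Customer INDEX (Last_Name);

-- Refresh everything previously collected on the table
COLLECT STATISTICS ON Customer;

-- Remove statistics that are no longer wanted
DROP STATISTICS ON Customer COLUMN (Customer_Number);
```

As the slide notes, these are DDL statements and are best scheduled outside production hours.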

  36. Viewing Statistics HELP STATISTICS tablename; displays information about current column or index statistics. HELP STATISTICS Customer;

  Date       Time      Unique Values  Column Names
  ---------  --------  -------------  ---------------
  07/12/13   22:10:05  7,000          Customer_Number
  07/12/13   22:11:23  7,000          Social_Security

DATE and TIME show when statistics were last collected or refreshed. HELP INDEX tablename; returns information for every index on a table. HELP INDEX Customer;

  Unique?  Primary or  Column Names     Index  Approximate  Name  Ordered or
           Secondary?                   Id     Count              Partitioned?
  -------  ----------  ---------------  -----  -----------  ----  ------------
  Y        P           Customer_Number  1      7000         ?     H
  N        S           Last_Name        4      100          ?     H

This command displays the index number and the approximate number of distinct values. Questions: How is the count of 7000 for Customer_Number determined? How is the count of 100 for Last_Name determined?

  37. Agenda • ETL job performance tuning steps • Indexes • SQL join strategies • Table statistics • Explain • TTU usage

  38. The Explain Facility May be used on any SQL statement, except EXPLAIN itself. Translates Optimizer output (the execution plan) into English. • Provides direct feedback on what the system intends to do. It is a good way to learn about the system and SQL. Use it consistently to analyze joins and long-running, complex queries. Time estimates are relative, not absolute. • Assumes the query will run stand-alone; doesn't take a loaded system into account. • Time estimates come from cost formulas based on the hardware configuration. EXPLAIN Facility output: • The timings and spool sizes shown are ESTIMATES ONLY. • Spool sizes are based on dynamic sampling or statistics. • Use them as “figures of merit” for comparison purposes only. • Know what the Request is supposed to do before EXPLAINing it.

  39. EXPLAIN Terminology Most EXPLAIN text is easy to understand. The following additional definitions may help: • ... (Last Use) … A spool file is no longer needed and will be released when this step completes. • ... with no residual conditions … All applicable conditions have been applied to the rows. • ... END TRANSACTION … Transaction locks are released, and changes are committed. • ... eliminating duplicate rows ... Duplicate rows only exist in spool files, not set tables. Doing a DISTINCT operation. • ... by way of the sort key in spool field1 (dbname.tablename.colname) … Field1 is created to allow a tag sort. Teradata 12.0 includes the column name used for the sort. • ... we do an ABORT test … Caused by an ABORT or ROLLBACK statement. • ... by way of a traversal of index #n extracting row ids only … A spool file is built containing the Row IDs found in a secondary index (index #n).

  40. EXPLAIN Terminology (cont.) • ... we do a SMS (set manipulation step) … Combining rows using a UNION, MINUS, or INTERSECT operator. • ... we do a BMSMS (bit map set manipulation step) … Doing a NUSI Bit Map operation. • ... which is redistributed by hash code to all AMPs (dbname.tablename.colname) … Redistributing data (in SPOOL) in preparation for a join. Teradata 12.0 includes the column name. • ... which is duplicated on all AMPs … Duplicating data (in SPOOL) from the smaller table in preparation for a join. • ... (one_AMP) or (group_AMPs) … Indicates one AMP or a subset of AMPs will be used instead of all AMPs. • ... ("NOT (table_name.column_name IS NULL)") … Feature where optimizer realizes that the column being joined to is NOT NULL or has referential integrity.

  41. Understanding Row and Time Estimates (Part 1) The EXPLAIN facility may express “confidence” for a retrieve from a table. Some of the phrases used are: . . . with high confidence . . . – Restricting conditions exist on index(es) or column(s) that have collected statistics. . . . with low confidence . . . – Restricting conditions exist on index(es) or column(s) having no statistics, but estimates can be based upon a dynamic or random AMP sampling. – Sample statistics have been collected (“Collected Statistics Using Sample”). – Restricting conditions exist on index(es) or column(s) that have collected statistics but are AND-ed together with conditions on non-indexed columns. – Restricting conditions exist on index(es) or column(s) that have collected statistics but are OR-ed together with other conditions. . . . with no confidence . . . – Conditions outside the above. For a retrieve from a spool, the confidence is the same as that of the step generating the spool.

  42. Understanding Row and Time Estimates (Part 2) The following are “confidence” phrases for a join: . . . with index join confidence . . . – A join condition via a primary index. . . . with high confidence . . . – One input relation has high confidence and the other has high or index join confidence. . . . with low confidence . . . – One input relation has low confidence and the other has low, high, or index join confidence. . . . with no confidence . . . – One input relation has no confidence. – Statistics do not exist for either join field. Notes: • Low and no confidence may indicate a need to collect statistics on indexes or columns involved in restricting conditions. • Otherwise, consider a closer examination of the query's conditions for possible changes that may improve the confidence. • Explain plans show the “low-end” time, rows, and/or bytes associated with each step. • Estimates too large to display are shown as three asterisks (***).

  43. Example 1 – EXPLAIN of a SELECT (FTS) QUERY EXPLAIN SELECT * FROM daily_sales ORDER BY 1; EXPLANATION 12.0 EXPLAIN -------------------------------------------------------------------------------------------------------------------------------- 1) First, we lock a distinct TFACT."pseudo table" for read on a RowHash to prevent global deadlock for TFACT.daily_sales. 2) Next, we lock TFACT.daily_sales for read. 3) We do an all-AMPs RETRIEVE step from TFACT.daily_sales by way of an all-rows scan with no residual conditions into Spool 1 (group_amps), which is built locally on the AMPs. Then we do a SORT to order Spool 1 by the sort key in spool field1 (TFACT.daily_sales.Item_id). The input table will not be cached in memory, but it is eligible for synchronized scanning. The result spool file will not be cached in memory. The size of Spool 1 is estimated with high confidence to be 76,685 rows (2,530,605 bytes). The estimated time for this step is 0.09 seconds. 4) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request. -> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.09 seconds. Notes: • Statistics were collected on the Primary Index of this table. • Spool file byte count and column name used as sort key are Teradata 12.0 Explain enhancements.

  44. EXPLAIN Summary • Make EXPLAIN output an integral part of all design reviews and formal system documentation. • EXPLAIN results can expose inefficiencies in query structures. • Data Demographics change over time. • Retain EXPLAIN listings to facilitate periodic index re-evaluation. • Keep formal documentation of the query structure rationale. • Know what the Request is supposed to do before EXPLAINing it.

  45. Agenda • ETL job performance tuning steps • Indexes • SQL join strategies • Table statistics • Explain • TTU usage

  46. SQL Assistant (1) • Executing SQL statements • Data import and export • Export/import configuration • Data import

  47. SQL Assistant (2) • Data export

  48. BTEQ (1) • Executing SQL statements • Data import and export • Data import (import.btq): • .LOGON tdp1/user1,passwd1 • .IMPORT VARTEXT ',' FILE = custdata_csv • .QUIET ON • .REPEAT * • USING ( in_custno VARCHAR(11), • in_lname VARCHAR(30), • in_fname VARCHAR(20), • in_ssn VARCHAR(9) ) • INSERT INTO Customer • VALUES (:in_custno, :in_lname, :in_fname, :in_ssn); • .QUIET OFF • .QUIT Run: bteq < import.btq

  49. BTEQ (2) • Data export (export.btq): • .LOGON tdp1/user1,passwd1 • .OS rm custdata_csv • .EXPORT REPORT FILE = custdata_csv • SELECT CAST(Customer_Number AS VARCHAR(11)) ||','|| • CAST(Last_Name AS VARCHAR(30)) ||','|| • CAST(First_Name AS VARCHAR(20)) ||','|| • CAST(Social_Security AS VARCHAR(9)) (TITLE '') • FROM Customer • SAMPLE 100; • .EXPORT RESET • .QUIT Run: bteq < export.btq

  50. FastLoad • FastLoad is the Teradata database's bulk-loading utility • FastLoad can only load empty tables • One FastLoad job loads exactly one empty table • The target table cannot have Secondary Indexes, Join Indexes, Hash Indexes, or Referential Integrity constraints • FastLoad does not load duplicate rows • INSERT is the only DML function allowed
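A minimal FastLoad script mirroring the BTEQ import example above (same illustrative tdpid, credentials, and input file; Customer must be empty, and the error-table names are illustrative):

```sql
SESSIONS 4;
LOGON tdp1/user1,passwd1;

BEGIN LOADING Customer
      ERRORFILES Customer_err1, Customer_err2;

SET RECORD VARTEXT ",";

DEFINE in_custno (VARCHAR(11)),
       in_lname  (VARCHAR(30)),
       in_fname  (VARCHAR(20)),
       in_ssn    (VARCHAR(9))
FILE = custdata_csv;

INSERT INTO Customer
VALUES (:in_custno, :in_lname, :in_fname, :in_ssn);

END LOADING;
LOGOFF;
```

Run: fastload < customer.fld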
