APM Best Practice: i3 FocalPoint and Performance Warehouse. Steve Cunnew, APM Specialist
Best Practices Agenda • FocalPoint Evolution • FocalPoint Planning • FocalPoint Maintenance • Customer Case Study • Summary • Optimising your FocalPoint • Q&A
About The FocalPoint • FocalPoint: the i3 “kernel” • GUI • Datastore • Data processor • FocalPoint: impact • GUI performance • Efficiency of data collection • Scalability of i3 environment • Let’s look at how i3 has evolved…
From Precise/SQL & Pulse!… • GUI - ‘Fat client’ • For both the ‘Tuning’ and ‘Alerting’ tools • All data stored on the monitored server • Flat files store ‘Recent’ activity • Alerting managed on monitored server • Performance Warehouse an ‘option’ • Defaulted to storing data in the monitored DB • A remote DB could be specified
To i3 v2… • GUI – mostly ‘fat’ clients • Alerting now using a thin client • Tuning tool still requires fat client • Most data stored on the monitored servers • Flat files store ‘Recent’ activity • Alerting managed by a FocalPoint • Metric data stored in a central DB • Performance Warehouse an ‘option’ • Default to store data in the monitored DB • A remote DB could be specified
Then i3 v6… • GUI – mixed ‘fat’ and ‘thin’ clients • Alerting, I4J2EE and other tools using thin client • Tuning tools still require fat client • All data stored on the monitored servers • Flat files store ‘Recent’ activity • All products managed by a FocalPoint • Performance Warehouse an ‘option’ for I4O • A central DB now used for all products in i3 suite
And finally, i3 v7 • GUI – ‘thin’ clients • All tools web based (but using ActiveX controls) • All data stored in the Performance Warehouse • Data gathered from flat files • All products managed by a FocalPoint • Performance Warehouse compulsory • All data seen in the GUI comes from PW
The ‘Load Balance’ During Evolution: Pre i3 v2 [Diagram labels: Load, Monitored Server, FP/PW, Client]
The ‘Load Balance’ During Evolution: i3 v2 [Diagram labels: Load, Monitored Server, FP/PW, Client]
The ‘Load Balance’ During Evolution: i3 v6 [Diagram labels: Load, Monitored Server, FP/PW, Client]
The ‘Load Balance’ During Evolution: i3 v7 [Diagram labels: Load, Monitored Server, FP/PW, Client]
The Result of the i3 Evolution • Less overhead in monitoring servers and applications But… • Performance of i3 FocalPoint server now much more significant • Performance of the Performance Warehouse database now much more significant
Getting the Best from the FocalPoint Rule #1 • Plan the installation. Carefully. • Consider: • How many users will there be? • How many servers/instances are there? • Will the number of servers/instances increase? • Which product FocalPoints are required? • Which product agents are required on each server? • Etc… Rule #2 • Follow Rule #1
Hardware Requirements for the FP • Minimum requirements for a FocalPoint • On Windows: at least a dual Pentium IV 2GHz (32-bit) CPU, 2 GB of memory, and 100 GB of disk space • On Sun Solaris: at least a dual UltraSPARC IIIi 1GHz CPU, 2 GB of memory, and 100 GB of disk space • ‘Minimum requirements’ means suitable for a small installation.
The Ideal FocalPoint Setup • The i3 FocalPoint should be separated from the PW DB server • Ensure sufficient memory to reduce or eliminate ‘swap’ • Faster CPUs are preferable • 4 x 2GHz CPUs better than 8 x 1GHz • Ensure disks can handle I/O demands • i3 is handling many log and trace files • Many small files used in data gather/load process
Plan the PW DB Instance • Separate Data/Index (Oracle) or Data/Log (MSSQL) files onto different ‘spindles’ • Install Indepth for Oracle/MSSQL against the PW instance, and Insight OS agent on the server(s) For Oracle: • Place Redo log files on fast disks • RAID 0+1 or 10, not RAID 5 • Settings for init.ora parameters…
Oracle PW init.ora • Current guidelines for a new 9i instance: • db_block_size = 16KB (8KB min) • db_cache_size = 256MB (minimum) • log_buffer = ~1MB • log_checkpoint_timeout = 950 • shared_pool_size = 1/4 of physical memory, up to 450 MB • pga_aggregate_target = 1/8 of the physical memory • session_cached_cursors = 300 • open_cursors = 300 • processes = 300 • These parameters may need to be tuned later
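The two memory-relative guidelines above can be turned into concrete numbers for a given server. The following is a minimal sketch, assuming the 450 MB cap applies only to shared_pool_size and that physical memory is given in MB; the function name and the dictionary keys are illustrative and are not part of i3 or Oracle.

```python
def pw_memory_guidelines(physical_mb: int) -> dict:
    """Derive the memory-relative init.ora guidelines from physical RAM (in MB)."""
    return {
        # shared_pool_size: 1/4 of physical memory, up to 450 MB
        "shared_pool_size_mb": min(physical_mb // 4, 450),
        # pga_aggregate_target: 1/8 of physical memory
        "pga_aggregate_target_mb": physical_mb // 8,
        # db_cache_size: fixed guideline of 256 MB as a minimum
        "db_cache_size_min_mb": 256,
    }


if __name__ == "__main__":
    # Example server sizes; 2048 MB matches the minimum FocalPoint memory.
    for ram_mb in (1024, 2048, 8192):
        print(ram_mb, pw_memory_guidelines(ram_mb))
```

These are starting values only; as the slide notes, the parameters may need to be tuned later against the observed workload.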
A Few ‘Tips’ for FP Planning • ‘InformPoints’ are only required on the FocalPoint and monitored servers where custom metrics are being used • An installed InformPoint that is not running still places load on the FocalPoint • If installing Indepth/J2EE, use Oracle instead of MySQL if possible • Find out if any patches are required, and apply them during the installation
Techniques for Analysis and Remediation • Evaluate data collection requirements • Are all agents really required? • Look into the usage of the server • Are there any CPU bound processes? • Review PW processes • How long are they taking to run? • Are any of them failing? • Analyse PW DB instance • How much time is spent ‘waiting’?
Customer Case Study • Customer complains about slow GUI response • Initial analysis showed an inadequate i3 system: • Slow GUI • PW Loads failing • I4Web FP is 100% CPU bound • 30,000 files in the web sum directory • Restarting PW takes 1 hour!! • “Most likely the FP server cannot scale”??? • Apply the Techniques…
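A file backlog like the 30,000 files above is easy to spot early with a simple count. The sketch below is a hedged illustration: the directory to check and the threshold are assumptions to be supplied by the administrator, and neither function is part of i3.

```python
import os


def count_files(directory: str) -> int:
    """Count regular files directly inside `directory` (no recursion)."""
    with os.scandir(directory) as entries:
        return sum(1 for entry in entries if entry.is_file())


def backlog_exceeded(directory: str, threshold: int) -> bool:
    """Return True when the directory holds more files than `threshold`."""
    return count_files(directory) > threshold


if __name__ == "__main__":
    # The current directory and the threshold of 1000 are placeholders only.
    print(count_files("."), backlog_exceeded(".", 1000))
```

Run on a schedule, a check of this kind shows whether files are accumulating faster than the load process consumes them.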
Are All Agents Required? InformPoint is required on FP?
Recommendations • Remove unnecessary InformPoints • Change the Daily Maintenance process to run every 3 days instead of every day • Adjust the start time of the Daily Maintenance process The results? • Improvement of Oracle performance
Recommendations • Analyse The I4J2EE Schema • Indepth for J2EE schema is not maintained by the Performance Warehouse processes • ‘Analyse’ scripts supplied with i3 • <i3>/products/j2ee/db/oracle/analyzeJ2EETables.sql • <i3>/products/j2ee/db/mysql/analyzeJ2EETables.sql • Run this weekly • Can be set up from within ‘Alerts’ to run as a scheduled job if no other task scheduler is available
Recommendations • Need to reduce ‘Redo Log Buffer Wait’ • Changes to the log files and buffers • Increase redo log files to 1GB • Move log files from RAID 5 storage • Reduce log_buffer in init.ora to 1MB • Reduce commit/rollback activity • I4J2EE aggregate interval changed from 5 to 15 mins
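The effect of the interval change can be estimated with simple arithmetic. This is a rough sketch, assuming each aggregation cycle issues about the same number of commits regardless of interval length; the function name is illustrative.

```python
MINUTES_PER_DAY = 24 * 60


def cycles_per_day(interval_minutes: int) -> int:
    """Number of aggregation cycles that start in one day."""
    return MINUTES_PER_DAY // interval_minutes


if __name__ == "__main__":
    before = cycles_per_day(5)   # 288 cycles per day
    after = cycles_per_day(15)   # 96 cycles per day
    print(before, after, before / after)
```

Under that assumption, moving from 5 to 15 minutes cuts the number of aggregation cycles, and the commit activity they generate, to one third.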
Recommendations (continued) • Change initialisation parameters to reduce I/O and buffer wait • Used Oracle 9i advice features • Several iterations resulted in the following changes: • large_pool = 40M • log_buffer = 1M • db_file_multiblock_read_count = 8 • sort_area_size = 2M • pga_aggregate_target = 1.5 GB • db_cache_size = 1.2 GB
I/O system problem? • Redo waits were reduced by moving redo log files to local devices • Buffer Busy wait • I/O wait • Investigation by the storage team found problem with battery backed cache • battery was dead so no caching!
So What Happened? • Due to a large number of instances, job runs and errors in the PW Processes, the Job History table grew quite quickly • When the Weekly Maintenance process ran, the rollback segment was too small for the amount of data being ‘deleted’, so the delete failed, and the table grew in size each week • A manual ‘truncate’ was performed on the table to resolve the problem
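The case study resolved the problem with a manual truncate. A different, commonly used way to keep a single large delete from exhausting rollback space is to delete in small batches and commit after each one. The sketch below illustrates that alternative only; it uses SQLite as a stand-in for the PW database, and the table name `job_history`, the column `run_date`, and the function name are hypothetical, not the real PW schema or an i3 procedure.

```python
import sqlite3


def purge_in_batches(conn, cutoff, batch_size=1000):
    """Delete rows older than `cutoff` in small transactions.

    Returns the total number of rows deleted. Table and column names
    are hypothetical stand-ins for a job history table.
    """
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM job_history WHERE rowid IN ("
            "SELECT rowid FROM job_history WHERE run_date < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()  # each batch is its own small transaction
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE job_history (run_date TEXT)")
    conn.executemany(
        "INSERT INTO job_history VALUES (?)", [("2000-01-01",)] * 2500
    )
    conn.commit()
    print(purge_in_batches(conn, "2020-01-01"))
```

Because each batch commits separately, the undo needed at any one time is bounded by the batch size rather than by the total number of rows being purged.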
Implemented Recommendations • Altered the Daily Maintenance PW Process • Analysed the I4J2EE Schema • Altered the redo log files • Tuned the init.ora parameters • Cleaned the PW_PWJH_JOB_HISTORY table • What difference has this made?
The Results Of The Tuning • CPU usage reduced on the server • All PW loads OK • GUI response time much improved • Restarting the PW takes less than 5 mins • Increased scalability • Avoided/deferred hardware purchase • Another happy customer!