
Getting Value from Application Performance Metrics

Getting Value from Application Performance Metrics. Michael Sydor, Engineering Services Architect, author of APM Best Practices. Agenda: Why so many metrics with APM? “Big Data”? What we are learning with CA-ABA (analytics). How to find KPIs. What’s new for the CA-APM 9.6 release.


Presentation Transcript


  1. Getting Value from Application Performance Metrics. Michael Sydor, Engineering Services Architect; Author: APM Best Practices

  2. Agenda • Why so many metrics with APM? • “Big Data”? • What we are learning with CA-ABA (analytics) • How to find KPIs • What’s new for CA-APM 9.6 Release

  3. Typical APM Cluster
     • Dozens to hundreds of applications
     • 2800 JVMs/CLRs
     • Up to 5M metrics, every 15 seconds
     • Large applications span multiple data centers
     • 2-8 APM clusters, typical
     • 30-70 EM Collectors for a nationwide portal application
     • 12M to 28M metrics, every 15 seconds
     … certainly sounds like big data!!!
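The volumes on this slide can be restated per second and per day with a little arithmetic. The sketch below assumes every live metric reports exactly once per 15-second interval, which is a simplification; the function and constant names are illustrative and not part of CA APM.

```python
REPORTING_INTERVAL_S = 15          # reporting interval quoted on the slide
SECONDS_PER_DAY = 24 * 60 * 60


def data_points(metrics: int, interval_s: int = REPORTING_INTERVAL_S) -> dict:
    """Return data points per second and per day for a live metric count.

    Assumes each metric reports once per interval (a simplification).
    """
    intervals_per_day = SECONDS_PER_DAY // interval_s
    return {
        "per_second": metrics / interval_s,
        "per_day": metrics * intervals_per_day,
    }


if __name__ == "__main__":
    for label, count in [
        ("single cluster (5M)", 5_000_000),
        ("portal, low end (12M)", 12_000_000),
        ("portal, high end (28M)", 28_000_000),
    ]:
        rates = data_points(count)
        print(f"{label}: {rates['per_second']:,.0f}/s, {rates['per_day']:,} per day")
```

Under that assumption, a 5M-metric cluster produces 28.8 billion data points per day, and the 12M to 28M portal range works out to roughly 69 to 161 billion per day.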

  4. What is Big Data???
     • APM information is “big”… but it is not “big data” without enrichment
     [Diagram: “5M Metrics that you don’t fully understand” OR, via ENRICHMENT, Anomalies, Trends, Insights, Correlation. Enrichment sources shown: Version Control, Trouble Management, Time of ____ Constraints, AP News Updates, Weather Forecast, Air Traffic Advisories, Marketing Campaigns]

  5. Challenges for Big Data
     • Data Variety – different sources give different perspectives. Does your data have a significant perspective?
     • Validation – is the data source meaningful/predictive?
     • Consistency – are the values trustworthy?
     • Data Structure and Nomenclature – Mapping, Transformation
     • Temporal Impedance Mismatch
       - APM: real-time with 15 second reporting interval
       - Trouble Management: +15-30 minutes later
       - Stock Ticker: +15-30 minutes later
       - Air Traffic Advisories: +30-60 minutes later
       - Version Control: days to weeks in advance
       - Marketing Campaign Assessment: 2-4 weeks later
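One way to picture the temporal impedance mismatch: a record that arrives late has to be mapped back to the 15-second APM intervals it might describe. The sketch below is only an illustration. The lag ranges are the ones listed on the slide, while the source keys, the function names, and the idea of a "candidate window" are assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import Tuple

APM_INTERVAL = timedelta(seconds=15)

# Reporting lag (min, max) relative to the underlying event, from the slide.
# The keys are illustrative names, not identifiers from any product.
REPORTING_LAG = {
    "trouble_management": (timedelta(minutes=15), timedelta(minutes=30)),
    "stock_ticker": (timedelta(minutes=15), timedelta(minutes=30)),
    "air_traffic_advisories": (timedelta(minutes=30), timedelta(minutes=60)),
}


def candidate_window(source: str, reported_at: datetime) -> Tuple[datetime, datetime]:
    """Return the window in which the underlying event may have occurred."""
    min_lag, max_lag = REPORTING_LAG[source]
    return reported_at - max_lag, reported_at - min_lag


def apm_intervals_in_window(start: datetime, end: datetime) -> int:
    """Count the 15-second APM reporting intervals that fit in the window."""
    return (end - start) // APM_INTERVAL


if __name__ == "__main__":
    reported = datetime(2014, 1, 1, 12, 0, 0)  # arbitrary example timestamp
    start, end = candidate_window("trouble_management", reported)
    print(start.time(), end.time(), apm_intervals_in_window(start, end))
```

With these lags, a trouble ticket logged at 12:00 maps to the window 11:30 to 11:45, which spans 60 APM reporting intervals. Any of them might hold the matching anomaly.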

  6. KPI Management Maturity
     [Chart: VALUE versus KPI MATURITY]
     • EKB (Transaction): Errors, Key Resource Performance, Business Transaction Survey
     • APC (Platform): Availability, Performance, Capacity
     • SGCM (Application): Stalls, GC Settings, Concurrency, Memory Management Trends

  7. What We are Learning with CA-ABA

  8. ABA Logical Architecture
     [Diagram: APM Cluster (5M Metrics) → 100k Metrics (via RegEx) → Anomaly Engine → Anomalies, Alerts]
     Why only 100k Metrics??? Why not 5M???

  9. RegEx == Regular Expression
     • analytics.metricfeed.process.3 = Custom Metric Host (Virtual)\\|Custom Metric Process (Virtual)\\|Custom Business Application Agent (Virtual)
     • analytics.metricfeed.metric.3 = By Business Service\\|[^|]+\\|[^|]+\\|[^|]+:.+

  10. RegEx is hard… but easy to validate
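A quick way to validate an expression like the metric feed on slide 9 is to run it against a few metric paths before deploying it. The sketch below assumes the configuration file follows Java properties escaping, so `\\|` in the file stands for the regex `\|` (a literal pipe), and it assumes the whole path must match. The sample metric paths are invented for the example.

```python
import re

# The metric expression from slide 9 with properties-file escaping removed:
# "\\|" in the file becomes "\|" here, i.e. a literal pipe character.
METRIC_PATTERN = re.compile(r"By Business Service\|[^|]+\|[^|]+\|[^|]+:.+")

# Hypothetical metric paths, invented for this example.
SAMPLES = [
    "By Business Service|Checkout|Submit Order|via Chrome:Average Response Time (ms)",
    "By Business Service|Checkout|Submit Order:Average Response Time (ms)",
    "Frontends|Apps|Checkout:Responses Per Interval",
]


def matches(path: str) -> bool:
    """True when the whole metric path matches the feed expression."""
    return METRIC_PATTERN.fullmatch(path) is not None


if __name__ == "__main__":
    for path in SAMPLES:
        print("MATCH" if matches(path) else "skip ", path)
```

Only the first sample matches: the expression requires exactly three pipe-delimited segments after `By Business Service`, followed by a colon and a metric name.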

  11. Metricfeed.3: broader collection of metrics, but only 87/500 == 17.4% are generally known as useful

  12. Suspects Identified via Baseline Technique: 100% useful metrics, ready for validation: 47/43625 == 0.1%

  13. Metric Count TypeView

  14. What is an Application?
      • Front-ends
        - Browser? Webservice? Messaging?
      • Back-ends
        - Databases, Webservices, Messaging, Mainframes, Trading_Partners
      • Muck-in-the-Middle
        - Software quality, stability and scalability
      • We want to identify KPIs for each of these elements
        - helps us build a useful dashboard for Operations
        - helps expose what the resources are really doing
        - helps us define acceptance criteria, to act proactively
        - helps us to triage really effectively

  15. How to Find KPIs

  16. Capacity KPIs – “Tree Rings”

  17. Performance KPIs High Volume + Significant Response Time

  18. Create a Simple Alert and Threshold (ConnectionStatus)

  19. Create a Simple Alert, Find Restart and threshold (MetricCount) “UP” – but not actually doing anything!!!
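The two alerts on slides 18 and 19 can be sketched as simple threshold checks. This is a minimal illustration, not CA APM alert configuration: the class, field names, and state labels are invented, the thresholds are arbitrary starting points, and treating a sharp drop in metric count as a restart hint is an assumption of the example.

```python
from dataclasses import dataclass


@dataclass
class AgentSample:
    """One reporting interval for one agent (field names are illustrative)."""
    connected: bool     # stands in for the ConnectionStatus metric
    metric_count: int   # number of live metrics reported by the agent


def availability_state(sample: AgentSample, min_metrics: int) -> str:
    """Classify an agent as DOWN, IDLE (up but reporting too little), or OK."""
    if not sample.connected:
        return "DOWN"
    if sample.metric_count < min_metrics:
        return "IDLE"
    return "OK"


def find_restarts(counts: list, drop_ratio: float = 0.5) -> list:
    """Return indexes where the metric count falls sharply versus the prior interval.

    A sudden drop is treated here as a hint of a restart; the 50% ratio is an
    arbitrary starting point, not a product default.
    """
    return [
        i for i in range(1, len(counts))
        if counts[i - 1] > 0 and counts[i] < counts[i - 1] * drop_ratio
    ]


if __name__ == "__main__":
    print(availability_state(AgentSample(connected=True, metric_count=40), 500))
    print(find_restarts([1200, 1210, 300, 800, 1200]))
```

The `IDLE` state is the case the slide calls “UP” but not actually doing anything: the agent is connected, yet its live metric count sits below the expected floor.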

  20. Understanding Your Environment
      • Identify the KPIs
        - Availability
          - Agent ConnectionStatus
          - Number of Live Metrics (Metric Count)
        - Performance
          - High Volume components with significant response time
          - NOT “Top 10 Response Time”
        - Capacity
          - Highest Volume Components
      • Don’t Wait for Production!!!
        - Make it part of your pre-production review
        - Manage the application lifecycle by trending KPIs
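The difference between "high volume with significant response time" and a plain "Top 10 Response Time" list is easy to see in a few lines of code. The sketch below is one possible reading, not the author's procedure: the thresholds, the ranking by volume times response time, and all component names and numbers are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    invocations: int        # responses per interval (volume)
    avg_response_ms: float  # average response time


def performance_kpis(components, min_invocations, min_response_ms):
    """High-volume components whose response time is also significant."""
    picked = [
        c for c in components
        if c.invocations >= min_invocations and c.avg_response_ms >= min_response_ms
    ]
    # Order by total time spent, so busy-and-slow components come first.
    return sorted(picked, key=lambda c: c.invocations * c.avg_response_ms, reverse=True)


def capacity_kpis(components, top_n=3):
    """Highest-volume components, regardless of response time."""
    return sorted(components, key=lambda c: c.invocations, reverse=True)[:top_n]


def top_response_time(components, top_n=3):
    """The 'Top N Response Time' view the slide warns against, for contrast."""
    return sorted(components, key=lambda c: c.avg_response_ms, reverse=True)[:top_n]


if __name__ == "__main__":
    sample = [  # invented numbers
        Component("nightly_report", 2, 9000.0),
        Component("login", 5000, 120.0),
        Component("search", 8000, 450.0),
        Component("static_asset", 20000, 3.0),
    ]
    print([c.name for c in performance_kpis(sample, 1000, 100.0)])
    print([c.name for c in capacity_kpis(sample, 2)])
    print([c.name for c in top_response_time(sample, 1)])
```

In this invented data set, the slowest component by response time is a report that runs twice per interval, while the performance KPI candidates are `search` and `login`, which are both busy and measurably slow.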

  21. KPI Evolution
      • Platform: coarse information … but not really APM
      • Application, Transactions, Resources: the APM advantage

  22. What’s New in CA APM 9.6: Simplified, automated, and built on CA APM strengths.
      Faster, Easier APM
      • Intelligent Deep Transaction Trace is now dynamic, automated, and requires less developer involvement for deep dives into apps supporting the transactions
      • Simplified Triage with easier drill down with Application Triage Map including Socket Grouping
      • Improved response times with software-based Transaction Impact Monitor (end-user experience)
      • Expanding APM’s scope with Java 7 EM & Agents
      Seamless Mainframe Awareness
      • Increased insight by adding DB2 details to transaction traces
      • Greater awareness with CA SYSVIEW MQ alerts & complete status in APM
      • Driving further cross-enterprise depth with CTG traces to fully expand backend calls
      • Other mainframe-based enhancements

  23. Preparing to Upgrade
      • HealthCheck the existing cluster prior to any upgrade
      • Good:
        - Do a clean install of the APM Cluster, alongside the existing cluster version.
        - Manually duplicate management modules, domains.xml, etc.
        - Bring down the old version, then bring up the new
      • Better:
        - Install the new version in a separate environment, reduced size
        - Migrate a few applications to the new environment for validation
        - Upgrade the primary environment after validation is achieved
      • Best:
        - Install a new GOLD environment in production, separate from the original cluster
        - Migrate agents, as schedules permit, until the original cluster may be decommissioned
        - This provides an opportunity to introduce pre-production review and generally correct any bad deployment habits

  24. Resources
      • Community Site
        - Cookbook: APM HealthCheck
        - Understanding Which Metrics Matter (KPI discussion)
        - Cookbook: Application Audit
        - more details on the baseline techniques and process
      • APM Best Practices – Realizing Application Performance Management
        - available on Amazon.com and Apress.com
        - Baselines, Test Plans, App Audits, Triage, Firefighting
        - Organizational Models, Service Catalogs

  25. Questions and Answers
