The Role of Predictive Methods in Autonomic Computing April 27, 2005. Ric Telford Director of Architecture and Development, Autonomic Computing. Agenda. Autonomic Computing overview AC Problem Determination Technologies Customer Results The Self-Healing Vision Summary.
The Role of Predictive Methods in Autonomic Computing
April 27, 2005
Ric Telford, Director of Architecture and Development, Autonomic Computing
Agenda
• Autonomic Computing overview
• AC Problem Determination Technologies
• Customer Results
• The Self-Healing Vision
• Summary
Today’s Complex Infrastructure
• IT asset utilisation is too low
• Management of complex, heterogeneous environments is too difficult
• Privacy, security and business continuity
• Swamped by the proliferation of technology and platforms to support
• Operational speed too slow; IT flexibility too limited
• Inability to manage the infrastructure seamlessly
Focus on business value, not infrastructure
Autonomic Computing delivers intelligent open systems that:
• Adapt to unpredictable conditions
• Continuously tune themselves
• Prevent and recover from failures
• Provide a safe environment
Sense and respond to ever-changing environments
Providing customer value:
• Increased return on IT investment
• Improved flexibility, resiliency and quality of service
• Accelerated time to value
“IBM’s autonomic computing initiative will become its most important cross-product initiative (as the foundation of On Demand Business).” (Thomas Bittman, Gartner)
IBM Autonomic Computing Structure
Autonomic Computing Architecture
• Autonomic Computing Control Loop
• Autonomic Computing Architecture Blueprint
Products delivering autonomic features
• 50 products with 415+ features
• Partner solutions
Autonomic Computing Common Components
• Log/Trace Analyzer
• Generic Log Adapter
• Solution installation & dependency checking
• Common Console
• Autonomic Management Engine
Open Standards
• Common log format
• Solution installation schema
[Diagram labels: Problem Determination, Provisioning, Workload Mgt, Admin Console, Management Engine, Installation]
The Pain Point…
[Diagram: "You" facing a multi-tier infrastructure of firewalls, load balancers, network routers/switches, edge servers, security servers, HTTP servers, application servers, data servers, managing servers, LDAP registries, backup servers and policy servers]
Today’s Approach… Internal SWAT Team – The Manual Process
Requires:
• Key resources across the IT staff to get the breadth of skills to understand the end-to-end problem
• Deep understanding of log file formats
• Deep understanding of system components
Result:
• Multiple man-hours/days/weeks of effort
• Political issues – passing the blame
• Insufficient / inadequate data can cause this approach to fail
• Customers are repeating this step today for every major IT outage
[Graphic label: Blame Storming]
Problem determination: Common Base Event, an OASIS standard
Log format today:
• Disparate pieces and parts
• Tools focused on individual products
• No common interfaces among tools
• No synergies in building tools OR in creating log entries
Log format tomorrow:
• Generic log adapter
• Common format for log files
• Common set of tools
• Common interfaces among tools
[Diagram: applications, application servers, servers, databases, storage devices and networks feed their logs through adapters into a common base event]
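To make the adapter idea concrete, here is a minimal sketch of two adapters that map different log formats onto one shared record. It is an illustration only: the `CommonEvent` class and its four fields are a hypothetical, heavily simplified stand-in for a Common Base Event, and the regular expressions assume the classic Apache 1.3/2.0 error-log layout and a BSD-style syslog line. None of this is the Generic Log Adapter's actual rule format.

```python
import re
from dataclasses import dataclass
from datetime import datetime


@dataclass
class CommonEvent:
    """Hypothetical, simplified stand-in for a Common Base Event record."""
    creation_time: datetime
    severity: str
    source_component: str
    msg: str


# Assumed layout: [Wed Oct 11 14:32:52 2000] [error] message text
APACHE_ERROR = re.compile(r"^\[(?P<ts>[^\]]+)\] \[(?P<level>\w+)\] (?P<msg>.*)$")
# Assumed layout: Apr 27 09:15:02 host tag: message text
SYSLOG = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d\d:\d\d:\d\d) (?P<host>\S+) (?P<tag>[^:]+): (?P<msg>.*)$"
)


def adapt_apache_error(line: str) -> CommonEvent:
    """Map one Apache error-log line onto the common record."""
    m = APACHE_ERROR.match(line)
    if m is None:
        raise ValueError(f"not an Apache error-log line: {line!r}")
    ts = datetime.strptime(m["ts"], "%a %b %d %H:%M:%S %Y")
    return CommonEvent(ts, m["level"].upper(), "apache_httpd", m["msg"])


def adapt_syslog(line: str, year: int) -> CommonEvent:
    """Map one syslog line onto the common record.

    Classic syslog timestamps carry no year, so the caller supplies it.
    """
    m = SYSLOG.match(line)
    if m is None:
        raise ValueError(f"not a syslog line: {line!r}")
    ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
    # This syslog layout has no severity field, so it is left unknown.
    return CommonEvent(ts, "UNKNOWN", m["tag"], m["msg"])
```

Once every source is reduced to the same record, downstream tools (viewers, correlators, analyzers) need to understand only one format, which is the point of the "log format tomorrow" column.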
Supported Log Formats (Feb 2005)
• AIX errpt log
• AIX syslog
• Apache HTTP Server access log
• Apache HTTP Server error log
• CICS Transaction Server for z/OS System message log
• Common Base Event XML log
• ESS (Shark) Problem log
• IBM Communications Server log
• IBM DB2 Express diagnostic log
• IBM DB2 Universal Database Cli Trace log
• IBM DB2 Universal Database JDBC trace log
• IBM DB2 Universal Database SVC Dump on z/OS
• IBM DB2 Universal Database Trace log
• IBM DB2 Universal Database diagnostic log
• IBM HTTP Server access log
• IBM HTTP Server error log
• IBM WebSphere Application Server activity log
• IBM WebSphere Application Server for z/OS error log
• IBM WebSphere Application Server plugin log
• IBM WebSphere Application Server trace log
• IBM WebSphere Commerce Server ecmsg log
• IBM WebSphere Commerce Server ecmsg, stdout, stderr log
• IBM WebSphere InterChange Server log
• IBM WebSphere MQ FDC log
• IBM WebSphere MQ error log
• IBM WebSphere MQ for z/OS Joblog
• IBM WebSphere Portal Server appserver_err log
• IBM WebSphere Portal Server appserverout log
• IBM WebSphere Portal Server run-time information log
• IBM WebSphere Portal Server systemerr log
• IBM WebSphere Portal Server systemout log
• IBM WebSphere Edge Server log
• Javacore log
• Logging Utilities XML log
• Microsoft Windows Application log
• Microsoft Windows Security log
• Microsoft Windows System log
• Oracle JDBC trace log
• Oracle alert log
• Oracle listener log
• Oracle server log
• Rational TestManager log
• RedHat syslog
• SAN File System log
• SAN Volume Controller error log
• SAP system log
• Squadrons-S Problem log
• SunOS syslog
• SunOS vold log
• TXSeries CICS Console/CSMT log
• z/OS Component trace
• z/OS GTF trace
• z/OS Joblog
• z/OS Logrec
• z/OS System log (SYSLOG)
• z/OS System trace
• z/OS master trace
Log Correlation – Generating the End-to-End View
With Correlation IDs in place, or Correlation methods identified:
• Implement a Correlation Engine in the Log Analyzer
• Generate a sequence diagram showing the log interactions and sequence of events
Help the IT staff home in on where the problem occurred:
• Identify quickly where to concentrate efforts
• Transition from trying to understand log formats to identifying ways to analyze the overall data and the end-to-end view
Move the mindset from monitoring to analysis
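The correlation step can be sketched in a few lines, assuming every event already carries a correlation ID and a comparable timestamp. The `LogEvent` type, its field names, and the two helper functions are hypothetical names chosen for this illustration; they are not the Log Analyzer's API.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class LogEvent:
    """Hypothetical normalized log event (field names are assumptions)."""
    timestamp: float        # seconds since some shared epoch
    component: str          # which tier wrote the entry
    correlation_id: str     # ID propagated with the request
    msg: str


def correlate(events: Iterable[LogEvent]) -> Dict[str, List[LogEvent]]:
    """Group events by correlation ID and order each group by time."""
    groups: Dict[str, List[LogEvent]] = defaultdict(list)
    for ev in events:
        groups[ev.correlation_id].append(ev)
    for evs in groups.values():
        evs.sort(key=lambda e: e.timestamp)
    return dict(groups)


def sequence_lines(evs: List[LogEvent]) -> List[str]:
    """Render an ordered group as text arrows, one per hop between entries."""
    return [
        f"{prev.component} -> {cur.component}: {cur.msg}"
        for prev, cur in zip(evs, evs[1:])
    ]
```

Sorting by timestamp assumes the clocks of the participating machines are reasonably synchronized; where they are not, a correlation method based on explicit parent/child IDs would be the safer choice.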
End Results…
• From multiple IT-skilled resources to a single PD-skilled resource
• From multiple man-hours / days / weeks of analysis to root cause identification in hours / minutes
• From an unstructured SWAT team approach with success unknown to a repeatable process with a reusable set of tools
Self-Healing - Customer Results
[Chart of customer-reported improvements; figures in the order shown]
• From several hours/days to less than one hour
• 60% Improvement
• 85% Improvement
• 50% Improvement – IBM’s SAP Deployment
• 70% Improvement
• 60% Improvement
• 50% Improvement
• 10 to 30% Savings in IT Support Costs
• 20 to 30% Improvement
• 10 to 20% improvement in operational staff productivity – IBM Software Delivery and Fulfillment
• 75% Improvement
• From 3 people 2 hours to 1 person 15 min
• 40% Improvement
[Chart label: New in 2005]
Self-Healing Roadmap
[Diagram: stages Capture (Event Representation, Adapters), Analysis (Knowledge Representation, Event Correlation and Analysis), Remediation (Action Representation) and Self Healing (Business Policy, Continuous Availability); axis labels Knowledge Accumulation and Knowledge Sharing; adoption labels IBM Deployers, Partner Deployers, Customer Pull]
2007
• Business policies guide self-healing system
• Preemptive diagnostics automatically recognize and resolve problems
• Call home facilities are integrated as part of self-healing solutions
• Symptom data made available to customers, ISVs, partners
2006
• Standardize data model for change requests, change plans
• Standardize grammar to describe change requests and constraints
• Allow analysis and planning when uncertainty is present
• Allow human to determine recovery action
• High-profile customer deployments and references
2004-2005
• Standardize data model for symptom analysis
• Transport & correlate events from all components in IT infrastructure
• Predictive Analysis Constructs
• ARM Correlation
2004
• Standard data model for common situation and event reporting
• Tooling for easy adoption of standard
• Commitments from IBM brands and IBM Partners to support the data model
Self-Healing Vision
[Diagram: managed resources (Win, SS, AIX, DB2, MQ, zOS) emit CBEs through adapters; several Monitor / Analyze / Plan / Execute loops, each with Knowledge, a Sensor and an Effector, exchange Symptom, Change Plan, Policy, Config and Action data; Call Home; IT Professionals use human-based MAs and associated tooling for correlation, analysis and viewing; arrow labelled Increased Embedded Self-Management Function]
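The Monitor / Analyze / Plan / Execute loop in the vision can be sketched as a small class. This is a toy illustration under stated assumptions: `AutonomicManager`, `Knowledge`, the dictionary-shaped events, and the idea of a symptom as a predicate paired with a named action are all hypothetical simplifications, not IBM's Autonomic Management Engine or its symptom data model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple

Event = dict  # assumed shape: a plain dict with at least a "msg" key


@dataclass
class Knowledge:
    """Symptom catalog: symptom name -> (event predicate, recommended action)."""
    symptoms: Dict[str, Tuple[Callable[[Event], bool], str]] = field(default_factory=dict)


class AutonomicManager:
    """Toy Monitor-Analyze-Plan-Execute loop over shared knowledge."""

    def __init__(self, knowledge: Knowledge,
                 sensor: Callable[[], Iterable[Event]],
                 effector: Callable[[str], None]) -> None:
        self.knowledge = knowledge
        self.sensor = sensor        # reads events from the managed resource
        self.effector = effector    # applies an action to the managed resource

    def monitor(self) -> List[Event]:
        return list(self.sensor())

    def analyze(self, events: List[Event]) -> List[str]:
        # A symptom is recognized if any observed event matches its predicate.
        return [name for name, (matches, _) in self.knowledge.symptoms.items()
                if any(matches(e) for e in events)]

    def plan(self, symptoms: List[str]) -> List[str]:
        # The change plan is simply the recommended action for each symptom.
        return [self.knowledge.symptoms[name][1] for name in symptoms]

    def execute(self, change_plan: List[str]) -> None:
        for action in change_plan:
            self.effector(action)

    def run_once(self) -> List[str]:
        change_plan = self.plan(self.analyze(self.monitor()))
        self.execute(change_plan)
        return change_plan
```

In the roadmap's earlier stages a human would sit between `plan` and `execute` and decide on the recovery action; removing that step is what the later stages describe.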
Summary
• IBM’s Autonomic Computing initiative has helped deliver the right “hygiene” to enable better Problem Determination across the industry
• Predictive technologies can capitalize on this hygiene to help automate the “Problem Determination” process
• We need continued research and cooperation across IBM and the industry at large to make the vision of Self-Healing systems a reality!