From Consolidated Operations to Service Management with the Netcool Suite

From Consolidated Operations to Service Management with the Netcool Suite. General Session Doug McClure Sr. Manager, Service and Technology Monitoring, EarthLink October 14, 2004. Agenda. EarthLink Overview Innovation, Technology, and Change

Presentation Transcript


  1. From Consolidated Operations to Service Management with the Netcool Suite General Session Doug McClure Sr. Manager, Service and Technology Monitoring, EarthLink October 14, 2004

  2. Agenda • EarthLink Overview • Innovation, Technology, and Change • And the need for open, flexible, adaptable monitoring solutions • IT Operations and Business Maturity • Challenges facing EarthLink and Roadmap to Improvement • EarthLink Service and Technology Monitoring • Improving Service, Customer, and Business Performance and Availability • Enabling ITIL Best Practices with the Micromuse Suite • Service Management Database, Change Management Dashboard • Linking IT Operations with the Business • Business Process/Activity Monitoring and Dashboards • Continuous Improvement

  3. EarthLink Overview • One of the Nation’s Largest ISPs • Headquarters in Atlanta, GA • Key facilities in Dallas, Pasadena, San Jose, Knoxville, and Seattle • Profitable, strong balance sheet • Largest DSL footprint • First-to-market with products that provide the best possible Internet experience • Customer Advocacy: Fighting SPAM, Abuse, and Fraud (Phishers) • Technical solutions • Litigation • Legislative support • Industry collaboration • Consumer education • 10th Anniversary (1994-2004) • http://www.redefineyourworld.com

  4. EarthLink Overview 5.25M Customers • ~4M Dialup (Premium ~3.5M, Value ~500K) • ~1.2M Broadband (Cable, xDSL) • ~160K Web Hosting (Unix, Windows) • ~50K Wireless (Blackberry, PDA, Laptops, Wi-Fi) • Dial Access Coverage > 90% of US Population • ~16K Local Dial Access Numbers • ~500K Active Modem Ports (~50% ELNK, ~50% Outsourced) • ~250 PoPs (18 Core Backbone PoPs, four data centers) • Broadband Coverage • ~200 Markets with Broadband Offerings Large and Diverse Infrastructure • ~2300 Network Elements • ~1600 Server Elements • Thousands of Access Circuits, Hundreds of WAN Circuits

  5. EarthLink Overview Access Technology Innovation • Premium and Value Dial-up • Broadband (Cable, xDSL, Satellite) • Voice (Converged Devices, VoIP, SIP) • Wireless (WiFi, CDMA, Blackberry, PDA) • Broadband over Power Lines (BPL) • IP Services (Triple Play) Value Added Service and Product Innovation • Blocker Family: spamBlocker, POP-UP Blocker, ScamBlocker, Virus Blocker, Spyware Blocker (www.blockoftheday.com) • Parental Controls • Webmail, Web Accelerator

  6. EarthLink Overview Exceptional Customer Service • 2004 J.D. Power and Associates Customer Satisfaction Award for High-Speed and Dial-Up Internet Service • 2003 PC Magazine Readers' Choice Awards for both high-speed and dial-up services • 2003 highest ranking in customer satisfaction for the second year in a row for high-speed Internet service by J.D. Power and Associates in its Internet Service Provider Residential Customer Satisfaction Study (SM) • 2003 CNET Editors' Choice award

  7. Innovation, Technology, and Change “A company can't outgrow its competitors unless it can out-innovate them.” Source: Gary Hamel and Gary Getz, in ‘Funding Growth in an Age of Austerity’

  8. Innovation = Constant Change Drivers • Customer Retention – Decrease Churn • Speed to Market, Competition – Do more, faster • Quality, Performance, Support Costs • Compliance - Sarbanes-Oxley, Visa CISP Operational Challenges • Release Management • Change Management • Service Level Management • Enterprise Security

  9. Leading Edge Technology = Constant Change Drivers • Voice – SIP • Broadband • Wireless (WiFi, Regulated, Unregulated) • Content, Rich Internet Applications • End-to-End Services • Custom Applications Operational Challenges • Fault, Performance, Availability, Utilization Monitoring • Vendor Lag in Support • Lack of a Standard Fault, Performance, Availability, Utilization API

  10. IT Operations and Business Maturity “It is not the strongest of the species that survive, nor the most intelligent, but the one most responsive to change.” Source: Charles Darwin

  11. Operations Maturity: Growing Up, Focused on Four Areas Service Level Management • All Tier 1, 2, 3 Support Groups in Operations • Set and manage expectations internal/external to Operations related to responsiveness and resolution of production issues Change Management • Provide oversight and control of the production environment • Minimize risk and impact from change activities Release Management • Development → Operations • Minimize poor quality production releases Enterprise Security • Compliance, control, audit

  12. Operations Maturity: Common Language and Best Practices Production Improvement Program (PIP) • Foundation in IT Service Management, ITIL, CobIT • Focusing on four main areas: Service Level Mgmt, Change Mgmt, Release Mgmt, and Production Security • Over the past four months, 20% of Operations staff have now attended ITIL Training • 1 Master Level Certified (two more pending results) • 12 Practitioner Level Trained in CCR Quadrant • 8 Change Management Practitioner Certified (more pending results) • 4 Configuration Management Practitioner Certified • Over 130 Foundation Level Trained and Certified

  13. Production Improvement Program [Diagram: RFC lifecycle across Change Mgt, Release Mgt, and Prod Sec. Request types: Corp Project, Ops Project, Non-Project. Flow: REQUEST FOR CHANGE (RFC) • STATUS CHANGE (1) Prioritization, Risk Assessment and Forward Schedule of Change • STATUS CHANGE (2) Change Approval and Proj. Service Availability • STATUS CHANGE (3) Final Change Approval and Implementation • STATUS CHANGE (4) Review Changes • CLOSED RFC. Release Mgt labels: Release Planning; Dev / Procurement; Release Design, Build; Release Acceptance; Roll-out Planning; Comm, Prep, Training; Distribution/Installation; Policy, Procedures, Standards & Guidelines; Metrics & Reporting. Prod Sec labels: Security Consulting; Security Assessment; Security Test & Sign off; Security Monitoring.] Mutual Benefit from EarthLink’s Innovation and Advanced Use of Micromuse Products: Micromuse OMNIbus, Impact, Webtop, RAD Source: EarthLink SLM Group

  14. EarthLink Service and Technology Monitoring “Creativity involves breaking out of established patterns in order to look at things in a different way.” Source: Edward de Bono

  15. EarthLink and Micromuse Facts Very Early Netcool Adopter • EarthLink (Mindspring) was Micromuse’s first US customer • Began evaluating Micromuse Netcool in 1996, official customer April 1997 Early Innovation • Early joint innovation and development helped build foundation for many of Micromuse’s key products Driving 3rd Party Vendor Integration & Partnerships • Much more than just “sending SNMP traps”: EarthLink requires in-depth integration with the Micromuse suite Current Deployment • Netcool OMNIbus, Internet Service Monitors, SM Reporter, Desktop Clients, Webtop, Impact, numerous Gateways, Probes, Data Source Adaptors • Preparing for OMNIbus v7 migration, RAD 2.0 • Plan to evaluate Precision

  16. Moving Beyond “MoM” and Apple Pie EarthLink’s Early Micromuse Netcool Deployment • Focused on Netcool as the “Manager of Managers” or “MoM” • Needed during EarthLink’s rapid growth and expansion • Enabled event management and eliminated the “swivel chair NOC” “Apple Pie” is Event Correlation and Deduplication • The Netcool sweet spot was providing EarthLink with event correlation and deduplication • Enables Tier 1 and Tier 2 break/fix support groups to operate efficiently Focus now on End-to-End Service Management • Netcool Suite allows EarthLink to manage the entire service • We can understand service relationships, service levels, and service impact; perform service modeling and service discovery • Enables impact assessment, prioritization, understanding full service delivery chain • Eliminate “needle in the haystack” approach of event management
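
The deduplication idea on slide 16 can be illustrated with a minimal sketch: repeated raw events that share a key are folded into one row with a running count. This is not Netcool's implementation or its ObjectServer schema; the `Event` fields, the `deduplicate` helper, and the sample identifiers are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Event:
    identifier: str  # hypothetical dedup key, e.g. "node:alert-type"
    severity: int
    first_seen: int  # epoch seconds of the earliest occurrence
    last_seen: int   # epoch seconds of the latest occurrence
    tally: int = 1   # number of raw events folded into this row


def deduplicate(raw_events):
    """Fold (identifier, severity, timestamp) tuples into one Event per identifier."""
    table = {}
    for identifier, severity, ts in raw_events:
        row = table.get(identifier)
        if row is None:
            table[identifier] = Event(identifier, severity, ts, ts)
        else:
            row.tally += 1
            row.first_seen = min(row.first_seen, ts)
            row.last_seen = max(row.last_seen, ts)
            # Keep the worst severity seen so far (an assumption of this sketch).
            row.severity = max(row.severity, severity)
    return table


# Illustrative data only: four raw events collapse into two rows.
raw = [
    ("router1:LinkDown", 4, 100),
    ("router1:LinkDown", 4, 105),
    ("mail3:DiskFull", 5, 106),
    ("router1:LinkDown", 5, 110),
]
events = deduplicate(raw)
```

With this shape, an operator console shows one line per distinct problem plus a count, rather than one line per raw trap.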

  17. The Service IS Important End-to-End Service Management and Monitoring • End-to-End service monitoring is my team’s #1 goal! • Provided that all layers (L1-L7) of the infrastructure are thoroughly instrumented, real-time monitoring of the true end-to-end service is possible • Service discovery, topology, dependency mapping, and change control ARE REQUIRED for highly accurate service monitoring • “Intimate Service and Infrastructure Knowledge” can be instrumented • Developers and support staff have deep understanding of how our services operate and their unique operational characteristics and dependencies • This knowledge can be programmatically instrumented and monitored, correlated, analyzed, and presented in real-time • Immediate notification to support groups when service infrastructure capabilities or performance degrades
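
Why dependency mapping matters for service monitoring can be shown with a small sketch: given a map of which component depends on which, a failed element can be traced upward to every service it affects. The component names and the `DEPENDS_ON` map below are hypothetical, not EarthLink's topology, and the traversal is a plain breadth-first search rather than any Netcool feature.

```python
from collections import deque

# Hypothetical dependency map: each key depends on the elements in its list.
DEPENDS_ON = {
    "webmail": ["imap-cluster", "web-frontend"],
    "imap-cluster": ["mail-storage"],
    "pop3": ["mail-storage"],
    "web-frontend": ["core-router-1"],
    "mail-storage": ["core-router-1"],
}


def impacted_by(failed, depends_on):
    """Return every component that directly or transitively depends on `failed`."""
    # Invert the map so we can walk from an element up to its dependents.
    dependents = {}
    for component, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(component)

    impacted = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in impacted:
                impacted.add(parent)
                queue.append(parent)
    return impacted


storage_impact = impacted_by("mail-storage", DEPENDS_ON)
router_impact = impacted_by("core-router-1", DEPENDS_ON)
```

If the map is stale because a change was not recorded, the impact set is wrong, which is one reading of why the slide lists change control alongside discovery and topology.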

  18. Service Management Complexity [Diagram: infrastructure events flowing to Netcool from a layered service stack. Layers: Client Applications (any web browser, mail client, Palm client; HTML), Presentation Layer, Application Services Layer, Core Services Layer, Infrastructure Layer. Other labels: APIs (API 1 through API 7), Tickets, SMTP, IMAP, POP3, Storage, To Other Systems; individual components are labeled S80 through S112. Callouts: Good Customer Experience? Performance?] Source: EarthLink Product Group

  19. Service Management Complexity [Chart: infrastructure events plotted against Number of Components, Time (24x7x365), and Infrastructure Changes] • Event volume grows exponentially with the number of components, time (growth), and infrastructure changes • Over 1500 Servers, 2300 Network Elements, and 20K Interfaces/Circuits • Netcool/ObjectServer is a must-have for effectively managing the service event stream end-to-end • Impact 3.0’s cluster capability will greatly improve our ability to analyze, enrich, suppress, and manage the event stream regardless of our growth Source: EarthLink Product Group

  20. The Customer IS Important Customer Experience Management and Monitoring • The Micromuse Netcool Suite enables consolidation and understanding of proactive, real-time monitoring of the customer’s experience for core EarthLink services • Proactive, real-time monitoring of the customer’s experience • Traditional Infrastructure Monitoring (SNMP, System Agents, Service Port Monitoring) • Synthetic transaction monitoring • Customer agent-based monitoring • Agentless application, transaction, and customer performance monitoring (Emerging) • Becomes the “glue” that ties infrastructure monitoring together • Powerful information when customer experience and infrastructure monitoring data is correlated, analyzed, and presented in real-time • Immediate notification to support groups when customer’s experience degrades

  21. The Business IS Important Business Activity Monitoring and Management • Expands IT Operations visibility vertically and horizontally • Ties IT Operations data and Business data together • System Downtime vs. Contact Center Call Volume • Real-Time Customer Subscriptions vs. Sales Forecasts • Almost any process can be instrumented and monitored in real-time, have policies applied to it, and be presented in a dashboard or portal • Enables Real Time Monitoring and Management of Business and IT processes • Change and Downtime Management • Customer Registration Management
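
As a rough illustration of "have policies applied to it", the sketch below compares real-time subscription counts against a forecast and flags hours that deviate beyond a tolerance. The function name, the 20% tolerance, and the hourly figures are all invented for the example; the slides do not describe the actual policies or data.

```python
def check_against_forecast(actual, forecast, tolerance=0.2):
    """Classify actual vs. forecast as 'ok', 'below', or 'above' the tolerance band."""
    if forecast <= 0:
        raise ValueError("forecast must be positive")
    deviation = (actual - forecast) / forecast
    if deviation < -tolerance:
        return "below"
    if deviation > tolerance:
        return "above"
    return "ok"


# Illustrative (hour, actual subscriptions, forecast) samples, not real data.
hourly = [(9, 104, 100), (10, 70, 100), (11, 95, 100), (12, 130, 100)]

alerts = []
for hour, actual, forecast in hourly:
    status = check_against_forecast(actual, forecast)
    if status != "ok":
        alerts.append((hour, status))
```

The same pattern applies to the other pairing on the slide: swap in contact center call volume and a baseline, and raise an event when the two diverge.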

  22. Enabling ITIL Best Practices with the Micromuse Suite “If you always do what you've always done, you'll always get what you always got.” Source: From a speech, unattributed

  23. Enabling ITIL Best Practices Incident and Problem Management • IM: Low level event classification, service dependencies, full integration with Remedy, Service Management DB (SMDB) • PM: Long-term historical event database for trend research, Service Management DB (SMDB) Change and Release Management • CM: Change Management System (CMS/RFC), Service Management DB (SMDB), service dependencies, impact on infrastructure from changes or downtimes • RM: Monitoring can greatly help in the development, test, and staging environments PRIOR to release to production Performance and Availability Management • PM/AM: Continuous low-level element and system level testing and data collection, trending, reporting, and alerting Capacity Management • CM: Continuous low-level element and system data collection, trending, reporting, and alerting

  24. Service Management Database: ITIL/PIP & Service Management • SMDB • Information about end-to-end service, service dependencies, relationships, topology, elements, production status, etc. • Self-serve customer interfaces into the service management and monitoring process • Auto-provision monitoring on all applications to reduce administrative overhead • Not a low-level configuration management database (CMDB), but could be the virtual high-level CMDB • SMDB Modules • Change Management System (CMS) / Downtime Request (DTR) • All RFCs/DTRs managed from within the SMDB complex, full lifecycle management, full risk and approval matrices, service management policies, interested parties • Impact of changes/downtimes immediately known within infrastructure through Impact 3.0 integration, policy creation, and event management • Element Management (network, server, application), ISM Creation, Agent Configuration, etc. • Service Management Policies • Information about customer and business defined service management policies, SLA/OLAs, etc.

  25. Service Management Database: ITIL/PIP & Service Management Source: EarthLink Service and Technology Monitoring

  26. Business Process Monitoring – ITIL Change Management “What gets measured, gets done!” Source: Tom Peters

  27. Overview – Controlling Change and Benefits Drivers • Adoption of ITIL/COBIT Best Practices for Change Management • Significant change for many groups – Fear, Uncertainty, Doubt (FUD) • No Real-Time Visibility into Change/Downtime Management Activities • Business Process • Who, What, When, Where, Why, and How, Cost, Risk, and Impact • Workflow – Monitor Lifecycle, SLAs, Bottlenecks – Is the process enabling Operations or is it a bottleneck? • Impact on Infrastructure – False Positives, Contact Center Call Volume (COGS) • Drive out False Positives from Production Monitoring Systems • Huge burden on NOC and other support staff • Desire to have Automated Remedy Trouble Ticket Creation • Reduce time to address problems and reduce MTTR

  28. Enabling Change Management with Netcool Suite Solution • Provide Real-Time Visibility into Change/Downtime Process • Create Actionable Information • Ensure Business Rules are Guiding/Enabling the Process – Not Hindering It • Eliminate FUD • Report (dashboards, reports) on Process and Impact • NOC and other support groups know what’s happening during change and downtime windows • Management has oversight and visibility • Business understands impact of change and downtime activity

  29. Source: EarthLink Service and Technology Monitoring

  30. Source: EarthLink Service and Technology Monitoring

  31. Source: EarthLink Service and Technology Monitoring

  32. Business Activity Monitoring Source: EarthLink Service and Technology Monitoring

  33. RAD 2.0 Presentation Source: EarthLink Service and Technology Monitoring

  34. Netcool Event Management [Screenshot callouts: Change/Downtime Request Events; Change/Downtime ID; Change/Downtime Status; Suppressed Change/Downtime Activity Events; Event Suppressed by Change/Downtime] Source: EarthLink Service and Technology Monitoring
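
The suppression behavior called out on slide 34 can be approximated as follows: an event is tagged with a change/downtime ID when it arrives for an element inside an approved window. This is a sketch under assumptions, not the Netcool/Impact policy used at EarthLink; the class names, the "approved" status value, and the sample IDs are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DowntimeWindow:
    change_id: str
    element: str
    start: int   # epoch seconds
    end: int     # epoch seconds
    status: str  # hypothetical lifecycle value, e.g. "approved" or "pending"


@dataclass
class Event:
    element: str
    timestamp: int
    summary: str
    suppressed_by: Optional[str] = None  # change/downtime ID, if suppressed


def apply_windows(events, windows):
    """Tag each event with the first approved window that covers it."""
    for event in events:
        for window in windows:
            covers = window.start <= event.timestamp <= window.end
            if window.status == "approved" and window.element == event.element and covers:
                event.suppressed_by = window.change_id
                break
    return events


windows = [
    DowntimeWindow("DTR-1001", "mail3", 1000, 2000, "approved"),
    DowntimeWindow("RFC-2002", "router1", 1000, 2000, "pending"),
]
events = apply_windows(
    [
        Event("mail3", 1500, "Service port 25 down"),
        Event("mail3", 2500, "Service port 25 down"),
        Event("router1", 1500, "LinkDown"),
    ],
    windows,
)
```

Events outside the window, or covered only by a change that is not yet approved, stay visible to the NOC, which is what keeps suppression from hiding real outages.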

  35. Future Enhancements Planned Netcool/Impact Policies • Impact on EarthLink • COGS: Assess support cost impact due to change and downtime activities within Operations and Customer Support in Real-Time • Tier 1, 2, 3 Support Cycles • Better Change and Release Management Planning • Data Gap Management • A common question: Why does my chart or graph have gaps? • The solution: Annotate graphs, charts, portals, etc. with the reason for data gaps caused by planned change/downtime activities • How: Integrate change and downtime event information with all performance, utilization, and capacity monitoring solutions via Impact 3.0
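
The data gap annotation described on slide 35 might look like the sketch below: find gaps in a sampled series, then label each gap with any planned change/downtime window that overlaps it. The helper names, the 60-second polling interval, and the window tuple layout are assumptions for illustration; the slide only states the goal, not the mechanism.

```python
def find_gaps(timestamps, interval):
    """Return (previous_sample, next_sample) pairs more than one interval apart."""
    gaps = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > interval:
            gaps.append((prev, cur))
    return gaps


def annotate_gaps(gaps, windows):
    """Label each gap with the IDs of overlapping windows, or 'unexplained'.

    `windows` is a list of (change_id, start, end) tuples (hypothetical layout).
    """
    annotated = []
    for gap_start, gap_end in gaps:
        reasons = [
            change_id
            for change_id, start, end in windows
            if start < gap_end and end > gap_start  # intervals overlap
        ]
        annotated.append((gap_start, gap_end, reasons or ["unexplained"]))
    return annotated


# Illustrative samples polled every 60 seconds, with two holes in the data.
samples = [0, 60, 120, 420, 480, 900, 960]
planned = [("DTR-1001", 150, 400)]

annotations = annotate_gaps(find_gaps(samples, 60), planned)
```

A charting layer could then shade the first gap with its change ID and leave the second flagged for investigation.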

  36. RAD 2.0 Joint Development Business Activity Monitoring: Real-Time Customer Registration Dashboard Source: EarthLink Service and Technology Monitoring

  37. Continuous Improvement “We have a ‘strategic plan’. It’s called doing things.” Source: Herb Kelleher

  38. Continuous Improvement • Making Applications “Monitoring Aware and Netcool Ready” • Work with developers on getting a monitoring API embedded into applications • Every application and tier linked into Netcool directly (not through server agent) • Discovery, Topology, Dependency Modeling • Monitoring accuracy and root cause depend on this! • Need solution for Layer 1-7, likely two solutions (L1-3 & L4-7) • Application, Transaction and Customer Performance Monitoring • Synthetic transactions only get us so far…but will continue to evolve • Don’t forget about client-server – everything isn’t web enabled! • Agentless technologies are emerging to accurately map out application and transaction flows, relationships, and topology • Next (2nd/3rd) Generation Quality, Performance, Capacity, Utilization solution needed • Services, Applications, Servers, Storage, Network

  39. Continuous Improvement Building better Network and Systems Management • Founded Atlanta Network and Systems Management Technical User Group (ANSMTUG) in January 2004 • http://www.ansmtug.org • Metro-Atlanta Fortune 100, Service Providers, Enterprise, Media, and Emerging Technology Companies • BellSouth, The Home Depot, EarthLink, Southern Company, N2 Broadband, eDeltacom, Delta, CNN, Cingular, E*Trade, Knology Broadband, Cox Communications • Customers helping Customers • Use Micromuse and other NSM products better • Collectively drive product requirements and features into Micromuse and other NSM vendors

  40. Closing and Questions and Answers • EarthLink is a happy Micromuse customer • EarthLink depends on the Netcool suite’s openness, flexibility and adaptability to keep up with innovation, technology, and constant change • EarthLink will continue to push the Netcool suite beyond the sales and marketing slick • EarthLink’s infrastructure, service, customer, and business performance and availability continue to improve because of our advanced use of the Netcool suite • Q&A
