Datacenter Availability: Product Strategy & Future Vision

Presentation Transcript


  1. Datacenter Availability: Product Strategy & Future Vision. Poulomi Damany, Director, HA Product Management. January 2006

  2. Steelers Stats • AFC Champs (14-5) • On their way to their 5th Super Bowl • First 6th-ranked team going to the playoffs • Won 3 post-season games in a row on the road… VCS Stats • Revenue up 114% quarterly & 55% yearly over 2004 • The leading independently available, cross-platform clustering product, per IDC • Customers standardizing on VCS for all their mission-critical apps (Fidelity, BellSouth, CSC) • Best-in-class technology according to customers

  3. Major VCS 5.0 Release Themes Standardization, Integration, Automation • Centralized cluster management • Broader app migration support (platforms, apps, DBs) • Support for advanced HA/DR architectures • Automated DR testing • Extensive support for virtualized server environments • Competitive pricing/packaging changes

  4. Achieving All Levels of Availability Symantec Strategy • Deliver a full spectrum of availability solutions • Drive down HA/DR costs with innovative new architectures • Enable standardization with broad environment support & mgmt [Diagram: availability vs. cost spectrum, rising from backup, to local mirroring/snapshots, to local clustering, to remote replication, to remote clustering]

  5. Traditional Challenges with High Availability • Challenge: Expensive, passive hardware. Solution: N+1, N+M clusters; no server or storage restrictions • Challenge: Complex to configure. Solution: Easy installation; cluster simulation • Challenge: Separate tools for HA & DR. Solution: Integrated clustering & replication; automated recovery • Challenge: Storage downtime affects apps. Solution: All storage operations on-line; dynamic multi-pathing included

  6. New Challenges for Datacenter Availability • Standardize HA/DR tools to simplify • Reduce DR cost & improve reliability • Leverage virtual machines in production

  7. The Challenge: Heterogeneity is Complex • Tools needed: 50+, including ServiceGuard, Sun Cluster, MSCS, TrueCluster, HACMP, ClusterFrame, PolyServe, GeoSpan, JFS2, ReiserFS, UFS, GPFS, QFS, SAM-FS, ZFS, Ext3, JFS, SAN-FS, OCFS, SVC, LVM, LDM, SVM, SDS, EVM, ASM, DLM, TrueCopy, MirrorView, SnapShot, SAN Copy, SnapView, Snap, FlashCopy, InstantImage, PPRC, ShadowImage, SRDF, TimeFinder, ShadowCopy, SNDR, Data Replication Manager, MirrorDisk-UX, DoubleTake, RepliStor, MPxIO, MPIO, HDLM, PowerPath, SecurePath

  8. The Solution: Standardize and Simplify • Storage Foundation HA

  9. Our Strategy: Support Your Environment & Provide Centralized Management • Support all my platforms (5.0): support within 90 days of an O/S release; Windows, Linux, Solaris, AIX, & HP-UX • Support all my applications (now): pre-tested agents for major apps; agent dev framework for custom apps • Support all my databases (now): Oracle, DB2, Sybase, SQL Server; Storage Foundations for Oracle RAC • Support my local, metro, and long-distance architectures (now): full Cluster Server support; full Storage Foundations support • Enable me to manage centrally & consistently (5.0): multi-cluster management included; operational mgmt & uptime reporting

  10. Standardization: Support All Platforms • Q4: 5.0 • Planning MPs to support: OS updates, platform parity, Oracle RAC 10g R2, SLES10 and RHEL5, SFRAC on Solaris x86 • Longhorn

  11. Standardization: Support All Apps & DBs [Table: application and database version support, today vs. 5.0 (mid '06), including Oracle 10gR2 and SAP NetWeaver, plus added SLES support]

  12. Standardization: Support All Architectures • Local Clustering (LAN) • Metropolitan Disaster Recovery (MAN): remote mirroring (remote mirror, SAN attached, Fibre, DWDM, ESCON) • Wide Area Disaster Recovery (WAN): replication over IP • Built on VERITAS Cluster Server and VERITAS Storage Foundation + Volume Replicator or 3rd-party replication
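
To make the WAN architecture concrete: a service group becomes a candidate for wide-area failover by listing the clusters it can run in. The sketch below is illustrative only; the group and cluster names are hypothetical, and it assumes the Global Cluster Option is licensed and the cluster-to-cluster connection has already been configured (for example via the gcoconfig wizard).

    # Illustrative sketch: declare appsg a global group spanning the
    # hypothetical clusters nyc_clus (priority 0) and lon_clus (priority 1)
    haconf -makerw
    hagrp -modify appsg ClusterList nyc_clus 0 lon_clus 1
    hagrp -modify appsg ClusterFailOverPolicy Manual   # cross-cluster failover stays operator-driven
    haconf -dump -makero

    # Show the group's state across systems
    hagrp -state appsg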

  13. Multi-Site Cluster Management: Veritas Cluster Management Console (VCMC) • A completely integrated console • Superset of VCS, CCA GUI functionality • Consistent look and feel • Legacy cluster support (VCS 4.0 and greater) • Enterprise-wide cluster/group status • Reporting • Centralized policies • Completely free, part of base VCS license!

  14. VCMC Platform Coverage GUI Support • Windows: 2000 (x86); 2003 (x86, IA-64, x86-64) • Solaris: 2.8 – 2.10 (SPARC) • Linux: RHEL 4.0 (x86, IA-64, x86-64); SLES 9 (x86, IA-64, x86-64) Cluster Monitor Support • Windows: 2000 (x86); 2003 (x86, IA-64, x86-64) • Solaris: 2.7 – 2.10 (SPARC) • HP-UX: 11iv2 (PA-RISC, IA-64) • AIX: 5.1, 5.2 • Linux: RHEL 4.0 (x86, IA-64, x86-64); SLES 9 (x86, IA-64, x86-64)

  15. Standardization: Centralized Cluster Mgmt • Web-based console enables management from anywhere on the network • Manage multiple VCS versions on multiple platforms, at multiple remote locations • Send commands or make configuration changes to multiple managed clusters, groups, and resources • Quickly locate specific managed objects by filtering on key attributes

  16. Centralized DR Management • At-a-glance status, both operational (online/offline) and readiness (Fire Drill) • With a single operation, migrate all apps from a primary site to the DR site • With a single operation, online all global groups at the DR site when the primary fails
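
The console drives these operations graphically, but the same single-operation site migration can also be expressed against a global service group at the command line. A minimal sketch, assuming a hypothetical global group appsg, a DR cluster named lon_clus, and GCO already configured:

    # Planned migration of the global group to the DR cluster in one operation
    hagrp -switch appsg -any -clus lon_clus

    # Bring the global group online at the DR cluster (e.g. after a primary-site outage)
    hagrp -online appsg -any -clus lon_clus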

  17. Standardization: Historical View of Cluster Operations • Records operational history (events) of all managed clusters and clustered applications • Track activity by individual user account as a way to audit cluster operations • Filter events by severity, type, and time to quickly locate items of interest

  18. Standardization: Policy-Based Alert Management • Define centralized policies to monitor state changes and notify via visible alerts, email, or SNMP traps • Built-in mechanism provides a consistent means to acknowledge and resolve alerts • High-level alert summary visible from any view, with links to the detailed alerts view • Filter alerts by severity, time, and other key attributes

  19. Standardization: Integrated Operational Reporting • View output from current or cached reports • Filter jobs and output views • Run reports manually or on an automatic schedule • Create reports based on pre-defined report types • Raw historical data available via ODBC for 3rd-party report generators

  20. New Challenges for Datacenter Availability • Standardize HA/DR tools to simplify • Reduce DR cost & improve reliability • Leverage virtual machines in production

  21. Advanced HA/DR Architectures: Drive Down Cost, Improve Availability • Multi-Tier Failover: Remote Group Agent allows dependencies across clusters; coupled with Fire Drill, allows realistic full-application testing • Multi-Site Failover: Stretch Clusters with System Zones (5.0MP1), free with VERITAS Storage Foundations; Global Clusters with GCO (now free!) • 3-Site DR: sync replication at async distances; VVR Bunker Mirror (less storage required, free); SRDF/STAR support for EMC shops • Dual-Use DR: VCS + Virtual Machines (CYQ3); use the secondary for dev/test and performance tuning; entire 3-tier environments contained in a single physical machine
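
As an illustration of the Remote Group Agent used for multi-tier failover, the hedged sketch below adds a RemoteGroup resource to a local web-tier group so it tracks a database group running in another cluster. The resource name, group names, address, and credentials are hypothetical; only the attribute names come from the bundled RemoteGroup agent.

    # Illustrative sketch: track remote group "dbsg" from local group "websg"
    haconf -makerw
    hares -add db_tracker RemoteGroup websg
    hares -modify db_tracker IpAddress 10.1.1.10       # a node of the remote cluster
    hares -modify db_tracker Port 14141                # default VCS engine port
    hares -modify db_tracker GroupName dbsg            # remote service group to track
    hares -modify db_tracker VCSSysName ANY            # accept any remote system
    hares -modify db_tracker ControlMode MonitorOnly   # monitor only; do not online/offline remotely
    hares -modify db_tracker Username admin
    hares -modify db_tracker Password <encrypted>      # generate with vcsencrypt -agent
    hares -modify db_tracker Enabled 1
    haconf -dump -makero

With ControlMode set to MonitorOnly the resource only reports the remote group's state; OnOff or OnlineOnly would let the local cluster drive the remote group as part of failover.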

  22. VERITAS Stretch Cluster (Primary Site / Secondary Site) • Traditional Stretch Cluster: data can become inconsistent during failure scenarios; "crossed grows" of volumes can prevent failover • Stretch Clusters with Remote Mirroring: maintains data consistency; prevents "crossed grows"; reduces site-to-site bandwidth through preferred plex reads

  23. Stretch Clusters with System Zones • Allows auto failover within a system zone and manual failover across system zones • Users configure all systems in one data center in one zone, and the other data center's systems in another system zone • If a local failure happens, we fail over; for data-center failover, we have a human in the loop • Upside: a single cluster config to maintain • Downside: slightly higher complexity on cluster startup and an increased load on the interconnect • Recommended when two data centers have very little local failover and mostly center-to-center failover • Slated for 5.0MP1
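
For reference, the zone preference is expressed through ordinary group attributes. The sketch below is illustrative only (group and node names are hypothetical, and the 5.0MP1 behavior described above may differ in detail): it places the two data center A systems in zone 0 and the two data center B systems in zone 1, so VCS prefers in-zone failover targets, while a cross-site move is issued explicitly by an operator.

    # Illustrative sketch: two systems per data center, one zone per data center
    haconf -makerw
    hagrp -modify appsg SystemList nodeA1 0 nodeA2 1 nodeB1 2 nodeB2 3
    hagrp -modify appsg SystemZones nodeA1 0 nodeA2 0 nodeB1 1 nodeB2 1
    haconf -dump -makero

    # Manual data-center-to-data-center failover
    hagrp -switch appsg -to nodeB1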

  24. Global Availability Increasingly Important • Recent events illustrate that metropolitan-only protection may not be enough • Asynchronous replication over the WAN to a geographically distant location provides enhanced protection • Position VCS and async replication to either: • Add a distant 3rd site • Migrate existing 2nd site further away

  25. Bunker Replication: Any Distance & Zero Data Loss • Traditional multi-hop approach (primary site, bunker site, secondary site): 5X storage requirement; storage hardware lock-in; cascaded (more dependencies); heavy-weight bandwidth requirements; data is 3 generations old • VERITAS Bunker Replication approach: reduces storage requirements; reduced bandwidth requirements; zero RPO over any distance; little or no application impact

  26. Physical-to-Virtual HA/DR • Physical servers are used for normal production workload and VMs are used for recovery • VMs required to stay running on target physical node, with full SF stack • Provides server consolidation benefits above traditional N+1/N+M, with minimal complexity • Provides an entry path to virtualization for customers wanting to stay physical in production • OpForce can facilitate the initial migration from Physical-to-Virtual machines (P2V)

  27. Dual-Use DR with Virtual Machines (Primary Site / DR Site) • VMs provide server consolidation benefits at both data centers • Replication of boot images simplifies change management • DR site can be fully utilized while the primary site is operational • Production applications take priority during site failover • VMware first, then HP, then AIX

  28. Extended 3rd Party Replication Support

  29. Integrated, Automated DR Testing • Fire Drill with VVR: automated test/validation of DR applications while in production, without disruption; minimizes storage requirements/costs • Fire Drill with hardware replication: automated test/validation of DR applications while in production, without disruption; leverages existing replication investments • Multi-Tier Fire Drill with RGA: closely simulates real-world testing; provides meaningful RTO, RPO results • Virtual Fire Drill: automated test/validation of VCS HA/DR infrastructure (resources, licenses, storage); works in both local and remote data centers

  30. Physical Fire Drill • Measure DR performance against SLA objectives (RPO/RTO) • Determine actual RP/RT with a detailed time breakdown to identify bottlenecks • Real-world testing through a successful connection to the Fire Drill application instance • Benchmark DR performance for continuous improvement over time
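
Operationally, a physical fire drill usually comes down to bringing a dedicated fire drill service group (a copy of the production group that mounts point-in-time snapshots of the replicated data) online at the DR site, checking the application, and taking it offline again. A minimal sketch with hypothetical names; the fire drill group itself is normally generated by the fire drill setup tooling rather than written by hand:

    # Start the fire drill instance on a DR node (does not touch production)
    hagrp -online appsg_fd -sys dr_node1

    # Confirm the application came up from the snapshot data
    hagrp -state appsg_fd

    # End the drill and release the snapshot resources
    hagrp -offline appsg_fd -sys dr_node1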

  31. Virtual Fire Drill • Verify availability of required storage and licenses (e.g. VxVM) • Verify network connectivity • Verify availability of mount points and file system licenses (e.g. VxFS) • Verify application-specific configuration, binary locations, and licensing

  32. Clustering 5.0 Themes and Features • Themes & features: synchronous release for all operating systems; advanced HA support for virtual machines; centralized cluster management; enhanced DR support (VERITAS and 3rd party); enhanced storage infrastructure support (VERITAS and 3rd party); broader app migration support (platforms, apps, DBs); enhanced large cluster configuration & management; DR configuration, reporting, SLA; FireDrill with all major 3rd party HW replication (EMC, Hitachi, Oracle, IBM, NetApp); extensive support for virtualized server environments • Compatibility: Solaris SPARC 8, 9, 10; Solaris x64 10; AIX (P4, P5) 5.1, 5.2, 5.3; HP-UX 11iv2 (Itanium & RISC); Linux RedHat 3, 4; SuSE 8, 9, 10; Windows Server 2000, 2003; Longhorn Beta

  33. Questions? poulomi_damany@symantec.com

  34. Appendix

  35. Bunker Replication How it works: • Mirror or replicate the log to the bunker • Use IP or SAN for the bunker • Async replication to the secondary • Add a bunker to existing RVGs • Supports UDP or TCP • RTO depends on the size of the buffer (and time to drain) • Supported with 3-way GCO • Bunker can be sync, sync-override, or async [Diagram: Primary Site with SRL, sync over IP or fibre channel to Bunker Site, asynchronous over IP to DR Site; non-dedicated IP link used only to drain the bunker SRL]

  36. What Is SRDF/STAR? • A three-site DR solution enabler using both SRDF/S and SRDF/A • Extends a 2-site sync setup with a third async site • Benefit: enables resumption of SRDF/A protection differentially at either secondary (R2) site if the primary (R1) site fails

  37. [Diagram: SRDF/STAR topology. Primary Site (A), the production site, replicates via SRDF/Synchronous to Local Site (B) and via SRDF/Asynchronous to Remote Site (C); SRDF/A MSC (Multi-Session Consistency) with a Consistency Group and z/OS host; a BCV is kept at Site B or Site C; SRDF/A can be resumed differentially over the inactive link]

  38. [Diagram: the same SRDF/STAR topology, showing SRDF/A resumed differentially over the alternate link between the R2 sites]

  39. STAR Supported Operational Scenarios • Planned site failover • Site A to Site B failover: workload restarted after SRDF reconfig and data resync completed • Ending state: STAR configuration resumed B->A, B->C • Site A to Site C failover: workload restarted after SRDF reconfig and data resync completed • Ending state: data SRDF/A protected C->B • Unplanned site failover • Site A failure: restart at Site B using B data • Ending state: SRDF/A B->C • Site A failure: restart at Site B using C data • Ending state: SRDF/A B->C • Site A failure: restart at Site C using C data • Ending state: SRDF/A C->B • Link resumption • SRDF/S link resumed after a failure • SRDF/A link resumed after a failure

  40. STAR Requirements/Restrictions • STAR is defined by the MSC group definition • Can be a subset of a Congroup definition • One STAR per MSC task • No ESCON support for SRDF/A link • RDF groups must be predefined for B->C link • B->C Link must be in place • R2 devices must be protected • Gatekeeper must be outside of ConGroup • BCV or Full Volume SNAP (CLONE) required at site B or C • SRDF HC must be available on R2 site host at recovery time

  41. VVR Bunker Replication vs. SRDF/Star • VVR Pros: • Minimal storage required at bunker site • Wide area site can use inexpensive storage • Replication performance • Application isolation – protection from misbehaving apps • VVR Cons: • Greater load on primary to handle multiple replication targets • STAR Pros: • Cascade of replication reduces load on primary (doesn’t replicate to both bunker and async target) • STAR Cons: • Requires expensive, fully provisioned storage at bunker site • Still uses array cache for replication – no application isolation

  42. VVR versus 3rd Party Asynchronous Replication
