
Multi-Site Clustering with Windows Server 2008 Enterprise

Symon Perriman, Program Manager, Microsoft Corporation. Session WSV316.


Presentation Transcript


  1. Multi-Site Clustering with Windows Server 2008 Enterprise • Symon Perriman, Program Manager, Microsoft Corporation • WSV316

  2. Multi-Site Clustering • Benefits • Deployment • Replication • Networking • Faster Failover • Quorum • Best Practices

  3. Benefits of a Multi-Site Cluster • Protects Against Loss of an Entire Datacenter • Power outage, fires, hurricanes, floods, earthquakes, terrorism • Automates Failover • Reduced downtime • Lower complexity of disaster recovery plan • Reduces Administrative Overhead • Automatically synchronize application and cluster changes • Easier to keep consistent than unclustered servers • What is the primary reason why disaster recovery solutions fail? Dependence on People

  4. Multi-Site Clustering Checklist • http://technet.microsoft.com/en-us/library/dd197546.aspx • Organized multi-site cluster deployment guide

  5. Multi-Site Clustering • Benefits • Deployment • Replication • Networking • Faster Failover • Quorum • Best Practices

  6. Multi-Site Clustering Basics • 2+ physically separate sites • 1+ node at each site • Storage at each site with data replication • Application moves during a failover • [Diagram: Site A and Site B, each with its own SAN]

  7. Redundancy Everywhere • 2 or more computers (nodes) • 2 NICs • 3rd NIC for iSCSI • HBA • Fibre Channel (FC) • Serial Attached-SCSI (SAS) • Multipath IO (MPIO) • Redundant Storage Interconnects • Replicated Storage • OS, Service or Application HA Roles

  8. Mix and Match Hardware • You Can Use Any Hardware Configuration if • Each component has a Windows Server 2008 / R2 logo • Servers, Storage, HBAs, MPIO, etc… • It passes Validate • It’s That Simple! • Connect your Windows Server 2008 / R2 logo’d hardware • Pass every test in Validate • It is now supported! • If you make a change, just run Validate again • Details: http://go.microsoft.com/fwlink/?LinkID=119949

  9. FCCP • Failover Cluster Configuration Program • Windows Server 2008 / R2 • Buy validated solutions • “Validated by Microsoft Failover Cluster Configuration Program” • Not required for Microsoft support, must be logo’d • More information: http://www.microsoft.com/windowsserver2008/en/us/clustering-program.aspx

  10. demo Introduction to Multi-Site Clustering

  11. Cluster Validation and Replication • Multi-Site clusters are not required to pass the Storage tests to be supported • Validation guide and policy: • http://go.microsoft.com/fwlink/?LinkID=119949

  12. Multi-Site Clustering • Benefits • Deployment • Replication • Networking • Faster Failover • Quorum • Best Practices

  13. Why is Replication Needed? • Loss of a site won’t cause complete data loss • Data must exist on other site after a failover • Different storage needs than local clusters • Multiple storage arrays, independent on each site • Nodes usually access local site’s storage first • [Diagram: changes are made on Site A and replicated to the replica on Site B]

  14. Replication Solutions • A data replication mechanism between sites is needed • Replication Levels • Hardware (block level) storage-based replication • Software (file system level) host-based replication • Application-based replication • Exchange Server 2007 CCR • Replication Types • Synchronous • Asynchronous

  15. Synchronous Replication • Host receives “write complete” response from the storage after the data is successfully written on both storage devices • [Diagram labels: Write Request, Primary Storage, Replication, Secondary Storage, Acknowledgement, Write Complete]

  16. Asynchronous Replication • Host receives “write complete” response from the storage after the data is successfully written to the primary storage device • [Diagram labels: Write Request, Primary Storage, Write Complete, Replication, Secondary Storage]

  17. Synchronous vs. Asynchronous
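
The difference between slides 15 and 16 comes down to when the host sees "write complete". The following is a minimal Python sketch of that idea only. The class `ReplicatedVolume` and its methods are hypothetical names made up for illustration and do not correspond to any vendor's replication API.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicatedVolume:
    """Toy model (an assumption) of a primary array replicating to a secondary."""
    synchronous: bool
    primary: list = field(default_factory=list)
    secondary: list = field(default_factory=list)
    pending: list = field(default_factory=list)  # acknowledged, not yet replicated

    def write(self, block: str) -> str:
        """Store a block and return when the host would see 'write complete'."""
        self.primary.append(block)
        if self.synchronous:
            # Synchronous: the secondary holds the block before the acknowledgement.
            self.secondary.append(block)
        else:
            # Asynchronous: acknowledge after the primary write, replicate later.
            self.pending.append(block)
        return "write complete"

    def drain(self) -> None:
        """Ship queued asynchronous writes to the secondary array."""
        self.secondary.extend(self.pending)
        self.pending.clear()

    def lost_if_primary_fails(self) -> list:
        """Blocks the host believes are written but the secondary does not hold."""
        return [b for b in self.primary if b not in self.secondary]
```

In this model a synchronous volume never reports acknowledged blocks missing from the secondary, while an asynchronous volume does until `drain()` runs. That window is the data that could be lost if the primary site fails.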

  18. What About DFS-Replication? • DFS-R performs replication on file close • Some file types stay open for a very long time • VHDs for Virtual Machines • Databases for SQL Server • Data could be lost during a failover if it had not yet replicated • Using DFS-R to replicate the cluster disk’s data in a multi-site Failover Cluster is not supported

  19. Resource Dependencies • Group determines smallest unit of failover • Establishes start order timing • [Diagram: a Resource Group whose resources are linked by “depends on” arrows: Workload Resource (example File Server), Network Name Resource, IP Address Resources*, Disk Resource, Custom Resource (manages replication)]

  20. Multi-Site Clustering • Benefits • Deployment • Replication • Networking • Faster Failover • Quorum • Best Practices

  21. Network Considerations • Cluster nodes can reside in different subnets (2008/R2) • No need to connect nodes with VLANs • [Diagram: Site A and Site B connected by a Public Network and a Separate Network; addresses shown: 10.10.10.1, 20.20.20.1, 30.30.30.1, 40.40.40.1]

  22. Stretching the Network • Longer distance means greater network latency • Too many missed health checks can cause false failover • Health checks are fully configurable in 2008/R2 • Failover Clustering has NO DISTANCE & NO SUBNET LIMITATIONS • Check if your vendor’s hardware / replication has limitations • SameSubnetDelay (default = 1 second) • How often heartbeats are sent to nodes on the same subnet • SameSubnetThreshold (default = 5 heartbeats) • Missed heartbeats before an interface on the same subnet is considered down • CrossSubnetDelay (default = 1 second) • How often heartbeats are sent to nodes on dissimilar subnets • CrossSubnetThreshold (default = 5 heartbeats) • Missed heartbeats before an interface on a dissimilar subnet is considered down • Command Line: Cluster.exe /prop • PowerShell (R2): Get-Cluster | fl *
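
The delay and threshold settings on slide 22 combine into a rough detection window: delay multiplied by threshold. The sketch below is a simplified Python approximation. The function names are hypothetical, and the real cluster service timing may differ in detail.

```python
def detection_time_seconds(delay_s: float, threshold: int) -> float:
    """Approximate time before an interface is considered down."""
    if delay_s <= 0 or threshold <= 0:
        raise ValueError("delay_s and threshold must be positive")
    return delay_s * threshold


def survives_outage(outage_s: float, delay_s: float, threshold: int) -> bool:
    """True if a brief network outage misses fewer heartbeats than the threshold.

    Assumption: one heartbeat is missed per full delay interval of outage.
    """
    missed = int(outage_s // delay_s)
    return missed < threshold


# Defaults from the slide: 1 second delay, 5 missed heartbeats.
default_window = detection_time_seconds(1.0, 5)
```

Under these assumptions the default window is about 5 seconds, so a 6 second WAN interruption would be treated as a failure, while doubling the delay to 2 seconds would ride it out. This is the kind of adjustment the slide suggests for high-latency links.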

  23. Security Over the WAN • Improved Security • Prevent Clients from Connecting to Networks • Encrypt Intra-cluster Traffic • 0 = clear text • 1 = signed (default) • 2 = encrypted

  24. Enhanced Dependencies: OR • Network Name resource stays up if either IP Address Resource A OR IP Address Resource B is up • [Diagram: Network Name Resource depends on IP Address Resource A OR IP Address Resource B]

  25. Resource Dependencies • [Diagram: Workload Resource (example File Server), Network Name Resource, Disk Resource and Custom App (replication); the Network Name Resource depends on IP Address Resources A OR IP Address Resources B; IP Address Resources A come online on site A, IP Address Resources B come online on site B]
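
The OR dependency from slides 24 and 25 can be illustrated with a small truth-table style sketch in Python. The function name and its parameters are hypothetical. It models only the rule stated on the slide, not the cluster's dependency engine.

```python
def network_name_can_stay_up(ip_a_online: bool, ip_b_online: bool,
                             use_or_dependency: bool) -> bool:
    """Evaluate a Network Name's dependency on two IP Address resources."""
    if use_or_dependency:
        # OR: either site's IP address is enough.
        return ip_a_online or ip_b_online
    # AND: every IP address must be online.
    return ip_a_online and ip_b_online


# After a failover to Site B, only the Site B address can be online.
after_failover_with_or = network_name_can_stay_up(False, True, use_or_dependency=True)
after_failover_with_and = network_name_can_stay_up(False, True, use_or_dependency=False)
```

With nodes in different subnets, only one site's address can be online at a time. The AND form would therefore keep the Network Name offline after a cross-site failover, which is why the OR form matters for multi-site clusters.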

  26. Multi-Site Clustering • Benefits • Deployment • Replication • Networking • Faster Failover • Quorum • Best Practices

  27. DNS Updates • Nodes in dissimilar subnets • Failover changes resource’s IP Address • Clients need that new IP Address from DNS to reconnect • [Diagram: Site A and Site B with DNS Server 1 and DNS Server 2 linked by DNS Replication; steps labeled Record Created, Record Updated, Record Obtained; the FS record is shown as 10.10.10.111 and as 20.20.20.222]

  28. Network Name Properties • RegisterAllProvidersIP (default = 0 for FALSE) • Determines if all IP Addresses for a Network Name will be registered by DNS • TRUE (1): IP Addresses can be online or offline and will still be registered • Ensure application is set to try all IP Addresses, so clients can come online quicker • HostRecordTTL (default = 1200 seconds) • Controls time the DNS record lives on client for a cluster network name • Shorter TTL: DNS records for clients updated sooner • Exchange Server 2007 recommends a value of five minutes (300 seconds)
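
When RegisterAllProvidersIP is TRUE, DNS returns every IP address for the Network Name, so the client application has to try each one. The following Python sketch shows that client-side loop. `connect_to_first_reachable` and the `try_connect` callback are hypothetical names, and the addresses are the example values from slide 27.

```python
from typing import Callable, Iterable, Optional


def connect_to_first_reachable(addresses: Iterable[str],
                               try_connect: Callable[[str], bool]) -> Optional[str]:
    """Return the first address that accepts a connection, or None."""
    for address in addresses:
        if try_connect(address):
            return address
    return None


# Both records are registered, but only the Site B address is online.
registered = ["10.10.10.111", "20.20.20.222"]
online = {"20.20.20.222"}
connected_to = connect_to_first_reachable(registered, lambda ip: ip in online)
```

Passing the connection attempt in as a callback keeps the sketch independent of any real network. In practice each failed attempt costs a connection timeout, which is one reason the slide pairs this setting with a shorter HostRecordTTL.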

  29. Local Failover First • Local failover first • No change in IP Address • Cross-site failover for disaster recovery • [Diagram: Site A and Site B with DNS Server 1 and DNS Server 2; addresses 10.10.10.111 and 20.20.20.222; FS record shown as 10.10.10.111 and as 20.20.20.222]

  30. Failover Order • Preferred Owners • Local failover first • Possible Owners Always Enforced • Resource will not start on non-possible owner • AntiAffinityClassNames • Groups with same AACN try to avoid moving to same node • http://msdn.microsoft.com/en-us/library/aa369651(VS.85).aspx
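
One way to picture how Preferred Owners and Possible Owners interact is the simplified Python sketch below. It is an assumption-laden model rather than the cluster service's actual placement logic. The function name and the node names A1, A2, B1, B2 are hypothetical.

```python
def failover_candidates(preferred_owners, possible_owners, nodes_up, current_node):
    """Return the nodes to try, in order, when a group must move.

    Possible owners are always enforced; preferred owners are tried first.
    """
    allowed = [n for n in possible_owners
               if n in nodes_up and n != current_node]
    ordered = [n for n in preferred_owners if n in allowed]
    ordered += [n for n in allowed if n not in ordered]
    return ordered


# Site A nodes listed first so that a local failover is attempted first.
preferred = ["A1", "A2", "B1", "B2"]
possible = ["A1", "A2", "B1", "B2"]
local_first = failover_candidates(preferred, possible, {"A2", "B1", "B2"}, "A1")
site_lost = failover_candidates(preferred, possible, {"B1", "B2"}, "A1")
```

Listing the local site's nodes first in the preferred list gives "local failover first"; only when the whole site is unavailable does the group move across the WAN. A node left out of `possible_owners` is never returned, matching "resource will not start on non-possible owner".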

  31. Virtual LAN (VLAN) • Deploying a VLAN minimizes client reconnection times • Can be harder to configure • Required for SQL & live migration • [Diagram: Site A and Site B joined by a VLAN, with DNS Server 1 and DNS Server 2; the address 10.10.10.111 is used at both sites and FS = 10.10.10.111]

  32. demo Multi-Site Clustering Groups and Settings

  33. Multi-Site Clustering • Benefits • Deployment • Replication • Networking • Faster Failover • Quorum • Best Practices

  34. Quorum Overview • Majority is greater than 50% • Possible Voters: • Nodes (1 each), Disk Witness (1 max), File Share Witness (1 max) • 4 Quorum Types • Node majority • Node and File Share majority • Node and Disk majority • Disk only (not recommended)

  35. Node and Disk Majority • Nodes get 1 vote each and the disk gets 1 vote • Loss of disk or node OK if majority is maintained • Do not use in multi-site clusters unless directed by vendor • [Diagram: node votes and a questionable (“?”) disk vote on replicated storage from vendor]

  36. Node Majority • 5 Node Cluster: Majority = 3 • Majority in Primary Site • Cross site network connectivity broken! • Nodes in the majority site: Can I communicate with majority of the nodes in the cluster? Yes, then Stay Up • Nodes in the other site: Can I communicate with majority of the nodes in the cluster? No, drop out of Cluster Membership • [Diagram: Site A and Site B, each with a SAN]

  37. Node Majority • 5 Node Cluster: Majority = 3 • Majority in Primary Site • Disaster at Site 1 • Surviving nodes: Can I communicate with majority of the nodes in the cluster? No, drop out of Cluster Membership • We are down! • [Diagram: Site A and Site B, each with a SAN]
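
The two Node Majority scenarios on slides 36 and 37 reduce to the same arithmetic: a partition stays up only if it can see more than half of all votes. Below is a minimal Python sketch of that check. The function name is hypothetical, and the 3 + 2 split of the 5 nodes is an assumption consistent with "majority in primary site".

```python
def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """Majority means strictly more than 50% of all votes."""
    if not 0 <= reachable_votes <= total_votes:
        raise ValueError("reachable_votes must be between 0 and total_votes")
    return 2 * reachable_votes > total_votes


TOTAL = 5          # 5 node cluster, so majority = 3
PRIMARY_NODES = 3  # assumed split: 3 nodes in the primary site
SECONDARY_NODES = 2

# Slide 36: the cross-site link breaks and each site sees only its own nodes.
primary_stays_up = has_quorum(PRIMARY_NODES, TOTAL)
secondary_stays_up = has_quorum(SECONDARY_NODES, TOTAL)

# Slide 37: the primary site is lost and only the secondary nodes remain.
survivors_have_quorum = has_quorum(SECONDARY_NODES, TOTAL)
```

The second scenario is the weakness of Node Majority across two sites: losing the site that holds the majority takes the whole cluster down until quorum is forced.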

  38. Forcing Quorum • Always understand why quorum was lost • Used to bring cluster online without quorum • Cluster starts in a special “forced” state • Once majority achieved, no more “forced” state • Command line: • net start clussvc /forcequorum (or /fq) • PowerShell (R2): • Start-ClusterNode -FixQuorum (or -fq)

  39. Multi-Site With File Share Witness • Complete resiliency and automatic recovery from the loss of any 1 site • [Diagram: Site A and Site B, each with a SAN and replicated storage from vendor, connected over the WAN to a File Share Witness (\\Foo\Cluster1) at Site C]

  40. Multi-Site With File Share Witness • Complete resiliency and automatic recovery from the loss of any 1 site • [Diagram: Site A and Site B, each with a SAN and replicated storage from vendor, connected over the WAN to a File Share Witness (\\Foo\Cluster1) at Site C]

  41. Multi-Site With File Share Witness • Complete resiliency and automatic recovery from the loss of the File Share Witness • [Diagram: Site A and Site B, each with a SAN and replicated storage from vendor, connected over the WAN to a File Share Witness (\\Foo\Cluster1) at Site C]
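
The "loss of any 1 site" claim on slides 39 to 41 can be checked with simple vote counting. The Python sketch below assumes 2 nodes at each of Site A and Site B plus 1 File Share Witness vote at Site C. The node counts and the function name are illustrative assumptions, and the model assumes the surviving sites can still reach each other.

```python
def survives_site_loss(votes_per_site: dict, lost_site: str) -> bool:
    """True if the remaining sites together still hold a majority of votes."""
    total = sum(votes_per_site.values())
    remaining = total - votes_per_site[lost_site]
    return 2 * remaining > total


with_fsw = {"Site A": 2, "Site B": 2, "Site C (FSW)": 1}
without_fsw = {"Site A": 2, "Site B": 2}

# Check every possible single-site loss for both layouts.
fsw_results = {site: survives_site_loss(with_fsw, site) for site in with_fsw}
no_fsw_results = {site: survives_site_loss(without_fsw, site) for site in without_fsw}
```

With the witness at a third site the total is 5 votes, so losing any single site leaves at least 3. Without it, an even split of 4 votes means losing either site leaves exactly half, which is not a majority.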

  42. FSW Considerations • Simple Windows File Server • Needs to be in the same forest • Running Windows Server® 2003, 2008 or 2008 R2 • Recommended to be at 3rd separate site • Single file server can serve as a witness for multiple clusters • Each cluster requires its own share • Can be clustered in a second cluster • FSW cannot be on a node in the same cluster • It is an additional voter for free (almost)

  43. demo Quorum on a Multi-Site Cluster

  44. Quorum Model Summary • No Majority: Disk Only • Not Recommended • Only use as directed by vendor • Node and Disk Majority • Only use as directed by vendor • Node Majority • Odd number of nodes • Node and File Share Majority • Best availability solution • Recommended for Exchange Server 2007 CCR

  45. Multi-Site Clustering • Benefits • Deployment • Replication • Networking • Faster Failover • Quorum • Best Practices

  46. Cluster your Branch Offices • Cluster several standalone File Servers from branch offices • Keep network traffic low • High-Availability for the files • Redundancy for the data • [Diagram: Site A, with clients primarily accessing applications in Site A, and Site B, with clients primarily accessing applications in Site B]

  47. Multi-Site Across the Enterprise • More distributed cluster nodes & clusters give higher availability • Complete resiliency and automatic failover • Remember your quorum model • Loss of any single site should not bring down the cluster • File Share Witness • 1 File Server hosts all File Share Witnesses for multiple clusters • Make it highly-available • Separate site • Not a node in that same cluster • [Diagram labels: Cluster 1, Site 1; Cluster 1, Site 2; Cluster 2, Main Office; Cluster 2, Branch 1; Cluster 2, Branch 2; Cluster 3, Many FSWs]

  48. Multi-Site Clustering Review • 4, 6, 8… nodes + FSW = odd # votes • Local failover first (preferred owner) • Site failover second (possible owner) • AntiAffinityClassNames • Faster DNS Updates • Register all IPs for a Network Name • Shorten client’s DNS record TTL • Ensure application tries all IPs • Encrypt WAN traffic for security • Adjust health checks for latency • Configure ‘OR’ dependencies • [Diagram: Site A and Site B, each with a SAN and replicated storage from vendor, connected over the WAN to a File Share Witness at Site C]

  49. Session Summary • Multi-Site Failover Clustering has many benefits • Variety of hardware options & configurations • Redundancy is needed everywhere • Understand your replication needs • Compare VLANs with multiple subnets • Plan your quorum model & nodes before deployment • Follow the checklist and best practices • http://technet.microsoft.com/en-us/library/dd197546.aspx
