
Exchange server 2010 high availability deep dive

Presentation Transcript


  1. SESSION CODE: EXL407 • Scott Schnoll, Principal Technical Writer, Microsoft Corporation • Exchange Server 2010 High Availability Deep Dive (c) 2011 Microsoft. All rights reserved.

  2. Agenda • Exchange Server 2010 High Availability Deep Dive • Database Availability Group Networks • Active Manager • Best Copy Selection • Datacenter Activation Coordination Mode

  3. Exchange Server 2010 High Availability Deep Dive: Database Availability Group Networks

  4. DAG Networks • A DAG network is a collection of one or more subnets • There are two types of DAG networks • MAPI Network - connects DAG members to network resources (Active Directory, other Exchange servers, DNS, etc.) • Registered in DNS / DNS configured • Uses default gateway • Client for Microsoft Networks/File and Print Sharing enabled • Replication Network - used for/by continuous replication (log shipping and seeding) • Not registered in DNS / DNS not configured • Typically no default gateway • Client for Microsoft Networks/File and Print Sharing disabled

  5. DAG Networks • All DAGs must have: • Exactly one MAPI network • Zero or more Replication networks • Separate network(s) on separate subnet(s) • LRU determines which replication network is used with multiple replication networks • DAG networks automatically created when Mailbox server is added to DAG • Based on cluster’s enumeration of networks • Cluster enumeration based on subnet • One cluster network is created for each subnet
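
The "one network per subnet" rule on this slide can be sketched as follows. This is only an illustrative model, not the cluster's actual enumeration code: the server names, addresses, and the `enumerate_networks` helper are hypothetical, and the sequential `DAGNetwork01`, `DAGNetwork02`, ... numbering is an assumption borrowed from the names used later in the deck.

```python
import ipaddress


def enumerate_networks(interfaces):
    """Group (server, 'address/prefix') pairs by subnet: one network per subnet."""
    by_subnet = {}  # insertion-ordered: subnets are numbered in order of first appearance
    for server, cidr in interfaces:
        subnet = ipaddress.ip_interface(cidr).network
        by_subnet.setdefault(subnet, []).append(server)
    return {
        f"DAGNetwork{i:02d}": (str(subnet), sorted(servers))
        for i, (subnet, servers) in enumerate(by_subnet.items(), start=1)
    }


# Hypothetical two-member DAG spanning two sites; each member has two interfaces.
interfaces = [
    ("EX1", "192.168.0.10/24"),
    ("EX1", "10.0.0.10/24"),
    ("EX2", "192.168.1.10/24"),
    ("EX2", "10.0.1.10/24"),
]
networks = enumerate_networks(interfaces)
for name, (subnet, servers) in networks.items():
    print(name, subnet, servers)
```

With four distinct subnets, the sketch yields four networks, which is the kind of layout an administrator would then collapse into one MAPI network and one Replication network.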

  6. DAG Networks • Maximum round-trip latency between all DAG members must be 500 ms or less • Regardless of the latency of the solution, customers should validate that the network between all DAG members is capable of satisfying the data protection and availability goals of the deployment • May need to investigate increasing the number of databases or decreasing the number of mailboxes per database to achieve desired goals

  7. DAG Networks

  8. DAG Networks

  9. DAG Networks • Collapse subnets into two DAG networks and disable replication for the MAPI network:
     Set-DatabaseAvailabilityGroupNetwork -Identity DAG2\DAGNetwork01 -Subnets 192.168.0.0,192.168.1.0 -ReplicationEnabled:$false
     Set-DatabaseAvailabilityGroupNetwork -Identity DAG2\DAGNetwork02 -Subnets 10.0.0.0,10.0.1.0
     Remove-DatabaseAvailabilityGroupNetwork -Identity DAG2\DAGNetwork03
     Remove-DatabaseAvailabilityGroupNetwork -Identity DAG2\DAGNetwork04

  11. DAG Networks • Automatic detection occurs only when members are added to the DAG • If networks are added after a member is added, you must perform discovery: Set-DatabaseAvailabilityGroup -DiscoverNetworks • DAG network configuration is persisted in the cluster registry • HKLM\Cluster\Exchange\DAG Network • DAG networks include built-in encryption and compression • Encryption: Kerberos SSP EncryptMessage/DecryptMessage APIs • Compression: Microsoft XPRESS, based on the LZ77 algorithm

  12. DAG Networks • Block cross-network communication to minimize heartbeat traffic (Diagram: Subnets 1 through 4, with paths between them marked Allowed or Blocked.)

  13. DAG Networks • If using iSCSI storage, configure DAG and cluster to ignore iSCSI networks • Set-DatabaseAvailabilityGroupNetwork -Identity <DAGNetworkName> -ReplicationEnabled:$false -IgnoreNetwork:$true • Cluster network <ClusterNetworkName> /prop Role=0

  14. DAG Networks • When a DAG spans multiple subnets you need an IP address on the MAPI network for each subnet • Use DHCP in site resilience configurations to assign IP addresses to Replication network • Enables delivery of the typically required static routes • If using static IP addresses, use netsh to configure static routes • Configure a DNS TTL on service access connection records that is consistent with your SLA, e.g. ~5 minutes for a one hour RTO SLA

  15. Exchange Server 2010 High Availability Deep Dive: Active Manager

  16. Active Manager • What are the three Active Manager roles? • Standalone • PAM (Primary Active Manager) • SAM (Standby Active Manager) • Transition of role state logged into Microsoft-Exchange-HighAvailability/Operational event log (Crimson Channel)

  17. Active Manager Functionality • Mount and Dismount Databases • Provide Database Availability Information • Provide Interface for Administrative Tasks • Monitor for Failures • Maintains Database and Server State Information

  18. AutoMount on DAG Members • In a DAG, all AutoMount operations are coordinated through the PAM • AutoMount operations occur: • When the first server in the DAG is initialized • When the ownership of the PAM role is changed

  19. AutoMount on DAG Members • Checks msExchMasterServerOrAvailabilityGroup to determine all databases hosted on the DAG • Checks if database can be mounted on startup • If msExchEDBOffline is TRUE, stop processing • If msExchEDBOffline is FALSE, proceed with processing

  20. AutoMount on DAG Members • Checks persistent database information stored in cluster registry • Determines if database is mounted on another DAG member • If the database is mounted on another server, take no action • If the database is not mounted on another server, proceed

  21. AutoMount on DAG Members • Checks AdminDismount in cluster registry: • If AdminDismount is TRUE, take no action • If AdminDismount is FALSE, proceed • Checks persistent database state information in cluster registry for server on which database was last mounted • If server available, issue mount request to Information Store on that server • If server not available or property not set, issue mount request to next server in sorted list

  22. AutoMount on DAG Members • If AutoMount operation succeeds: • Update persistent database state information stored in cluster database • Propagate information to all other DAG members
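
The AutoMount checks on slides 19 through 22 can be read as a short decision chain. The sketch below is a simplified model under stated assumptions, not Active Manager's implementation: `DatabaseState` and `automount_target` are hypothetical names, the Active Directory and cluster registry lookups are replaced by plain fields, and "next server in sorted list" is interpreted as the first available server in that list.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Set


@dataclass
class DatabaseState:
    name: str
    edb_offline: bool           # stands in for msExchEDBOffline
    mounted_elsewhere: bool     # from persistent state in the cluster registry
    admin_dismount: bool        # stands in for AdminDismount
    last_mounted_server: Optional[str] = None


def automount_target(db: DatabaseState,
                     available_servers: Set[str],
                     sorted_servers: List[str]) -> Optional[str]:
    """Return the server that should receive the mount request, or None for no action."""
    if db.edb_offline:          # database must not be mounted on startup
        return None
    if db.mounted_elsewhere:    # already mounted on another DAG member
        return None
    if db.admin_dismount:       # an administrator dismounted it deliberately
        return None
    if db.last_mounted_server in available_servers:
        return db.last_mounted_server
    # Last server unavailable or not recorded: fall back to the sorted list (assumed reading).
    for server in sorted_servers:
        if server in available_servers:
            return server
    return None


db1 = DatabaseState("DB1", edb_offline=False, mounted_elsewhere=False,
                    admin_dismount=False, last_mounted_server="EX1")
order = ["EX1", "EX2", "EX3"]
print(automount_target(db1, {"EX1", "EX2"}, order))
print(automount_target(db1, {"EX2", "EX3"}, order))
print(automount_target(replace(db1, admin_dismount=True), {"EX1", "EX2"}, order))
```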

  23. Mount / Dismount Database Copy • Mount Database • An administrator action invoked through a task • The last part of a move operation • Dismount Database • An administrator action invoked through a task • The first part of a move operation

  24. Mount Database – DAG Member • Initiate RPC to member of the DAG • If the server contacted is not the PAM, the task is referred to the PAM • If the server is the PAM, continue with no referral • Checks the msExchMasterServerOrAvailabilityGroup to ensure database is hosted in the DAG • If database is hosted in DAG, proceed • If database is not hosted in DAG, error out

  25. Mount Database – DAG Member • Checks if the database is already mounted • If already mounted, task fails • If not already mounted, task continues • PAM invokes callback • This invokes a pre-check for the database mount operation • Persistent database state updated to show mount Initiated

  26. Mount Database – DAG Member • PAM invokes RPC call to Information Store to mount database • If mount fails, task fails • If mount succeeds, task completes successfully • Persistent database state updated to record results of operation and propagated to other members

  27. Dismount Database – DAG Member • Task initiates call to PAM or is referred to PAM • PAM checks that msExchMasterServerOrAvailabilityGroup value matches the DAG • PAM verifies that database is mounted in the DAG by checking persistent database state information stored in registry • If database is mounted, task proceeds • If database is dismounted, task fails

  28. Dismount Database – DAG Member • PAM updates persistent state information in cluster database to show state Initiated • PAM makes RPC call to Information Store on DAG member and invokes dismount • If dismount operation succeeds, persistent database state information stored in cluster database is updated • If dismount operation fails, task fails
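
The mount and dismount task flows on slides 24 through 28 share the same shape: refer to the PAM, verify the database belongs to the DAG, check current state, record an "initiated" state, call the Information Store, then record the result. A toy model is sketched below. The class, the state strings, and the `store_rpc` callable are hypothetical stand-ins, and the state recorded after a failed operation is an assumption, since the slides only say that the task fails.

```python
class PrimaryActiveManager:
    """Toy model of the PAM-side checks for mount and dismount tasks."""

    def __init__(self, name, dag_databases, store_rpc):
        self.name = name
        self.dag_databases = set(dag_databases)
        self.store_rpc = store_rpc   # callable(action, db) -> bool; stands in for the Information Store RPC
        self.state = {}              # persistent database state: db -> status string

    def _check_hosted(self, db):
        if db not in self.dag_databases:
            raise LookupError(f"{db} is not hosted in this DAG")

    def mount(self, db):
        self._check_hosted(db)
        if self.state.get(db) == "Mounted":
            return False             # already mounted: task fails
        self.state[db] = "MountInitiated"
        ok = self.store_rpc("mount", db)
        self.state[db] = "Mounted" if ok else "MountFailed"
        return ok

    def dismount(self, db):
        self._check_hosted(db)
        if self.state.get(db) != "Mounted":
            return False             # not mounted: task fails
        self.state[db] = "DismountInitiated"
        ok = self.store_rpc("dismount", db)
        # Assumption: a failed dismount leaves the database mounted.
        self.state[db] = "Dismounted" if ok else "Mounted"
        return ok


def run_task(contacted_server, pam, action, db):
    """A task sent to a non-PAM member is referred to the PAM before it runs."""
    referred = contacted_server != pam.name
    result = getattr(pam, action)(db)
    return referred, result


pam = PrimaryActiveManager("EX1", ["DB1"], store_rpc=lambda action, db: True)
first = run_task("EX2", pam, "mount", "DB1")      # referred, then mounted
second = run_task("EX1", pam, "mount", "DB1")     # fails: already mounted
third = run_task("EX1", pam, "dismount", "DB1")   # dismounted
print(first, second, third, pam.state)
```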

  29. Auto Dismount – DAG Member • Occurs when a DAG loses quorum • All DAG members are running (but may not be participating in the cluster) • Databases dismounted as quickly as possible to avoid split-brain • Information Store service is terminated

  30. Auto Dismount – DAG Member • Dismount operation should attempt to update database state information in cluster database • This is the only case where a database operation occurs on a server other than the PAM

  31. Active Manager – Move Database • Move Database • An administrator action invoked by a task • Automatic operation initiated by the PAM (failover) • Begins with a Dismount operation and ends with a Mount operation

  32. Exchange Server 2010 High Availability Deep Dive: Best Copy Selection

  33. Best Copy Selection • Process of finding the best copy of an individual database to activate, given a list of potential copies for activation and their status • Active Manager selects the “best” copy to become the new active copy when the existing active copy fails or when an administrator performs a targetless switchover

  34. Best Copy Selection – RTM • Sorts copies by copy queue length to minimize data loss, using activation preference as a secondary sorting key if necessary • Selects from the sorted list based on which set of criteria each copy meets • Attempt Copy Last Logs (ACLL) runs and attempts to copy missing log files from the previous active copy

  35. Best Copy Selection – SP1 • Sorts copies by activation preference when the auto database mount dial is set to Lossless • Otherwise, sorts copies by copy queue length, with activation preference used as a secondary sorting key if necessary • Selects from the sorted list based on which set of criteria each copy meets • Attempt Copy Last Logs (ACLL) runs and attempts to copy missing log files from the previous active copy
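
The difference between the RTM and SP1 sort orders comes down to the sort key. The sketch below illustrates that under stated assumptions: the `Copy` type and `sort_copies` function are hypothetical, and the copy queue lengths and activation preference values are invented so that the results line up with the server orderings shown in the RTM and SP1 examples later in the deck.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    server: str
    copy_queue_length: int
    activation_preference: int


def sort_copies(copies, version="SP1", mount_dial="Lossless"):
    """Order candidate copies the way the RTM and SP1 slides describe."""
    if version == "SP1" and mount_dial == "Lossless":
        # SP1 with a Lossless dial: activation preference alone decides the order.
        return sorted(copies, key=lambda c: c.activation_preference)
    # RTM, or SP1 with any other dial: copy queue length first, preference as tie-breaker.
    return sorted(copies, key=lambda c: (c.copy_queue_length, c.activation_preference))


# Hypothetical values for the three passive copies of DB1.
copies = [
    Copy("Server2", copy_queue_length=4, activation_preference=2),
    Copy("Server3", copy_queue_length=2, activation_preference=3),
    Copy("Server4", copy_queue_length=15, activation_preference=4),
]
rtm_order = [c.server for c in sort_copies(copies, version="RTM")]
sp1_order = [c.server for c in sort_copies(copies, version="SP1", mount_dial="Lossless")]
print(rtm_order)  # ['Server3', 'Server2', 'Server4']
print(sp1_order)  # ['Server2', 'Server3', 'Server4']
```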

  36. Best Copy Selection • Is database mountable? • Is copy queue length <= AutoDatabaseMountDial? • If Yes, database is marked as current active and mount request is issued • If not, next best database tried (if one is available) • During best copy selection, any servers that are unreachable or “activation blocked” are ignored
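
The mountability test on this slide compares a copy's queue length with the AutoDatabaseMountDial setting and skips servers that are unreachable or activation blocked. A minimal sketch follows. The threshold values (0, 6, and 12 log files) come from Microsoft's documentation of the dial settings rather than from this deck, and the dictionary layout and `pick_copy` helper are hypothetical.

```python
# Maximum number of missing log files tolerated per dial setting
# (values from Microsoft's AutoDatabaseMountDial documentation, not from the slides).
MOUNT_DIAL_MAX_LOST_LOGS = {"Lossless": 0, "GoodAvailability": 6, "BestAvailability": 12}


def pick_copy(sorted_copies, mount_dial):
    """Return the first server whose copy is eligible and mountable, or None."""
    limit = MOUNT_DIAL_MAX_LOST_LOGS[mount_dial]
    for copy in sorted_copies:
        if not copy["reachable"] or copy["activation_blocked"]:
            continue  # unreachable or activation-blocked servers are ignored
        if copy["copy_queue_length"] <= limit:
            return copy["server"]
        # Too far behind for this dial setting: try the next best copy.
    return None


# Hypothetical, already-sorted candidate list.
candidates = [
    {"server": "Server2", "reachable": True, "activation_blocked": True, "copy_queue_length": 0},
    {"server": "Server3", "reachable": True, "activation_blocked": False, "copy_queue_length": 9},
    {"server": "Server4", "reachable": True, "activation_blocked": False, "copy_queue_length": 3},
]
for dial in MOUNT_DIAL_MAX_LOST_LOGS:
    print(dial, pick_copy(candidates, dial))
```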

  37. Best Copy Selection

  38. Best Copy Selection – RTM • Four copies of DB1 • DB1 currently active on Server1 (Diagram: Server1, Server2, Server3, and Server4 each hold a copy of DB1; one copy is marked with an X.)

  39. Best Copy Selection – RTM • Sort list of available copies by Copy Queue Length (using Activation Preference as secondary sort key if necessary): • Server3\DB1 • Server2\DB1 • Server4\DB1

  40. Best Copy Selection – RTM • Only two copies meet the first set of criteria for activation (CQL < 10; RQL < 50; CI = Healthy): • Server3\DB1 • Server2\DB1 • Server4\DB1 (Diagram annotation: lowest copy queue length, tried first.)
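
The first criteria set quoted on this slide (CQL < 10, RQL < 50, CI = Healthy) can be applied to the sorted list as a simple filter. The queue lengths below are invented for illustration, including the assumption that Server4 is the copy that fails the test. Only this first criteria set is modeled because it is the only one the slide states.

```python
def meets_first_criteria(copy):
    """First criteria set from the slide: CQL < 10, RQL < 50, content index Healthy."""
    return (
        copy["copy_queue_length"] < 10
        and copy["replay_queue_length"] < 50
        and copy["content_index"] == "Healthy"
    )


# Hypothetical values, listed in the sorted order from the previous slide.
sorted_copies = [
    {"server": "Server3", "copy_queue_length": 2, "replay_queue_length": 10, "content_index": "Healthy"},
    {"server": "Server2", "copy_queue_length": 4, "replay_queue_length": 20, "content_index": "Healthy"},
    {"server": "Server4", "copy_queue_length": 15, "replay_queue_length": 60, "content_index": "Healthy"},
]
qualifying = [c["server"] for c in sorted_copies if meets_first_criteria(c)]
first_try = qualifying[0] if qualifying else None
print(qualifying, first_try)
```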

  41. Best Copy Selection – SP1 • Four copies of DB1 • DB1 currently active on Server1 • Auto database mount dial set to Lossless (Diagram: Server1, Server2, Server3, and Server4 each hold a copy of DB1; one copy is marked with an X.)

  42. Best Copy Selection – SP1 • Sort list of available copies by Activation Preference: • Server2\DB1 • Server3\DB1 • Server4\DB1

  43. Best Copy Selection – SP1 • Sort list of available copies by Activation Preference: • Server2\DB1 • Server3\DB1 • Server4\DB1 (Diagram annotation: lowest preference value, tried first.)

  44. Best Copy Selection • After Active Manager determines the best copy to activate • The Replication service on the target server attempts to copy missing log files from the source (ACLL) • If successful, then the database will mount with zero data loss • If unsuccessful (lossy failure), then the database will mount based on the AutoDatabaseMountDial setting • If data loss is outside of dial setting, next copy will be tried
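
The activation sequence on this slide (run ACLL, mount losslessly if it succeeds, otherwise compare the loss with the dial and fall back to the next copy) can be sketched as a loop. This is an illustrative model only: `activate` is a hypothetical function, ACLL is replaced by a callable that reports how many log files are still missing, and the dial is passed in as a plain log-file count.

```python
def activate(sorted_servers, acll, max_lost_logs):
    """Try each candidate in order; return (server, outcome) for the first that can mount.

    acll(server) stands in for Attempt Copy Last Logs and returns the number of
    log files still missing on that server after the copy attempt.
    """
    for server in sorted_servers:
        missing = acll(server)
        if missing == 0:
            return server, "mounted with zero data loss"
        if missing <= max_lost_logs:
            return server, f"mounted with {missing} log files lost"
        # Loss is outside the dial setting: try the next copy.
    return None, "no copy could be mounted"


# Hypothetical lossy failure: the source is unreachable, so some logs stay missing.
still_missing = {"Server2": 9, "Server3": 3, "Server4": 0}
lossy = activate(["Server2", "Server3", "Server4"], still_missing.get, max_lost_logs=6)
lossless = activate(["Server2", "Server3", "Server4"], lambda s: 0, max_lost_logs=6)
print(lossy)
print(lossless)
```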

  45. Best Copy Selection • If an activated database copy is mounted • It will generate new log files (using the same log generation sequence) • Transport Dumpster requests will be initiated for the mounted database to recover lost messages • When original server or database recovers, it will run through divergence detection and either perform an incremental resync or require a full reseed

  46. Exchange Server 2010 High Availability Deep Dive: Datacenter Activation Coordination Mode

  47. Datacenter Activation Coordination Mode • DAC mode is a property of a DAG • Acts as an application-level form of quorum • Controls whether or not a Mailbox server attempts to mount its active databases on startup • Designed to prevent multiple copies of same database mounting on different members due to loss of network (split brain) • Also enables use of Site Resilience tasks • Stop-DatabaseAvailabilityGroup • Restore-DatabaseAvailabilityGroup • Start-DatabaseAvailabilityGroup

  48. Datacenter Activation Coordination Mode • RTM: Enable DAC mode only for DAGs with three or more members that are extended to two Active Directory sites • Don’t enable for two-member DAGs where each member is in a different AD site or DAGs where all members are in the same AD site • SP1: DAC mode can be enabled for all DAGs • If using Third Party Replication (TPR) mode, check with your vendor for guidance on DAC mode

  49. Datacenter Activation Coordination Mode • Uses Datacenter Activation Coordination Protocol (DACP) • A bit in memory (in MSExchangeRepl.exe) set to either: • 0 = can’t mount • 1 = can mount
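
The DACP bit acts as a gate on mount requests. The sketch below models that gate. The slide states only that the bit is 0 or 1. The rule for flipping it (a starting member may set its bit to 1 if another member already reports 1, or if it can reach every other member) follows Microsoft's public DAC mode documentation and is an assumption relative to this deck. The class and method names are hypothetical.

```python
class ActiveManagerDacp:
    """Toy model of the in-memory DACP bit held by one DAG member."""

    def __init__(self, name):
        self.name = name
        self.dacp_bit = 0  # 0 = can't mount (assumed value at startup), 1 = can mount

    def evaluate(self, peer_bits, all_peers):
        """peer_bits maps each peer that answered to the DACP bit it reported."""
        if any(bit == 1 for bit in peer_bits.values()):
            self.dacp_bit = 1   # another member already asserts it can mount
        elif set(peer_bits) == set(all_peers):
            self.dacp_bit = 1   # every other member was reachable
        return self.dacp_bit

    def can_mount(self):
        return self.dacp_bit == 1


peers = ["EX2", "EX3", "EX4"]

isolated = ActiveManagerDacp("EX1")
isolated.evaluate({"EX2": 0}, peers)             # reaches one peer, which reports 0

joined = ActiveManagerDacp("EX1")
joined.evaluate({"EX2": 0, "EX3": 1}, peers)     # a reachable peer reports 1

print(isolated.can_mount(), joined.can_mount())
```

In the isolated case the member leaves its databases dismounted, which is the split-brain protection that DAC mode is designed to provide.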
