Last month I promised a series of blog posts on the replication and HA options for Exchange, because it seems that everywhere I go, people think they’re stuck with Database Availability Groups (DAGs). I started off with a level set on replication options in general, and ended with the notion that you don’t necessarily need DAG replication and multiple copies to achieve HA and remote disaster recovery with Exchange 2010. As Chad Sakac says, I’m agreeing to disagree with Microsoft on this point. Microsoft contends that native Exchange approaches for protection, HA, and DR cover all needs, at all scales, and always at the lowest TCO. I contend that sometimes they do, but not always.
So this is Option 1 in the series. I'll be presenting three more options over the coming weeks.
DAG is actually pretty cool kit when you dig into it, but like all techniques and technologies, it has its advantages and disadvantages. Let’s take a look.
So here’s an illustration of what we’d call Native DAG Replication:
As you can see, it’s characterized by the use of DAG as the replication engine, and Active Manager as the failover manager. You might be wondering why I distinguish between DAG members and physical copies. That’s because when you use third-party replication (enabled by the Third Party Replication API in Exchange 2010 in conjunction with EMC’s Replication Enabler for Exchange), you can have multiple DAG members accessing the same physical disk (very similar to single copy clusters in previous versions of Exchange). Although it’s not shown here, this option can be deployed in conjunction with a virtualization platform like VMware or Hyper-V.
As a point of comparison, I’ll assign cost factors for storage and network. This option is fairly expensive in terms of storage: you need at least two copies locally. You can have one, two, or more copies at the remote site, but if you have only one, you should have enough bandwidth to comfortably reseed a full copy of all the databases on the server, since any number of situations can cause database divergence. If you don’t have the bandwidth, you can reduce the number of situations where a WAN reseed would be needed by putting a second copy at the DR site (as illustrated here). Operationally, though, network cost is low, because only the transaction logs are replicated.
So the factors:
- Storage: 4
- Network: 1
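To put rough numbers behind those two factors, here is a minimal back-of-the-envelope sketch in Python. It is not Exchange tooling: the function names, the 2 TB database total, the 100 Mbps link, and the 70% usable-bandwidth figure are all hypothetical values chosen for illustration, and the arithmetic ignores compression, seeding throttles, and log replay time.

```python
def reseed_hours(total_db_gb: float, wan_mbps: float,
                 usable_fraction: float = 0.7) -> float:
    """Estimate the hours needed to copy total_db_gb across a WAN link.

    usable_fraction is an assumed share of the link that a reseed can
    actually consume; it is a guess, not a measured value.
    """
    if total_db_gb < 0 or wan_mbps <= 0 or not 0 < usable_fraction <= 1:
        raise ValueError("sizes and bandwidth must be positive")
    gigabits = total_db_gb * 8                        # GB -> Gb (decimal units)
    effective_gbps = wan_mbps * usable_fraction / 1000
    return gigabits / effective_gbps / 3600           # seconds -> hours


def raw_storage_gb(total_db_gb: float, local_copies: int,
                   remote_copies: int) -> float:
    """Raw disk needed when every copy holds a full set of databases."""
    return total_db_gb * (local_copies + remote_copies)


if __name__ == "__main__":
    db_gb = 2000  # hypothetical: 2 TB of mailbox databases on the server
    print(f"Full WAN reseed: {reseed_hours(db_gb, wan_mbps=100):.1f} hours")
    print(f"Raw storage, 2 local + 2 remote: {raw_storage_gb(db_gb, 2, 2):.0f} GB")
```

With these assumed inputs, a full reseed over the WAN works out to a couple of days, which is the kind of number that makes a second copy at the DR site look attractive despite the extra disk.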
Typically, people would use this option when the following apply:
- A need for uptime while patching
- Plenty of bandwidth between sites
- Enough budget to acquire and maintain disk for 3-4 copies of the databases
- No need to coordinate Exchange site to site failover with other applications
- No need for zero data loss disaster recovery between sites
- No need to maintain consistency between Exchange and other line of business apps[1]
- The Best Copy Selection (BCS) process meets the requirements of the business unit
[1] An example of this might be a procurement application that leverages email or public folders for workflow