This is the fourth in a series of posts about the various options to achieve HA and DR with Exchange 2010. In the first, I broke the DAG into its basic components (Active Manager and DAG replication). In the second, I gave a quick overview of Native DAG. In the third, I covered a hybrid approach that combined DAG replication and Active Manager for local HA, and array/SAN based replication for remote site recovery.
I call this option “Virtualized Host Clustering” because, like the Virtualized Local DAG option, it runs on a hypervisor, but here the HA function comes from the hypervisor itself. The Exchange mailbox role is deployed in a VM as a standalone server. This is by far the least expensive option from all perspectives: acquisition, operation, complexity, footprint, and power.
As you can see, there are some clear cost benefits to this option. You are replicating only one copy of the database, and since the HA function is handled by the hypervisor, a second copy of the database is unnecessary. It’s also the most flexible option – the full suite of workload management tools provided by the hypervisor can be used, and we add a fourth potential replication engine – VPLEX.
I suppose it’s also worth noting that one does not necessarily need a virtualization layer to accomplish this. With the right operational recovery plan to replace the server and restore from backup, three nines (99.9% availability) could easily be achieved without any HA facility whatsoever (either at the application or hypervisor layer).
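To put the "nines" in concrete terms, here is a minimal sketch that converts an availability target into a yearly downtime budget. The function name and the 365-day year are my own assumptions for illustration; they are not part of any Exchange or hypervisor tooling.

```python
# Assumption: a 365-day year (leap years ignored) for a simple estimate.
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_budget_minutes(availability_pct: float) -> float:
    """Return the minutes of downtime per year allowed by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


if __name__ == "__main__":
    for target in (99.9, 99.99):
        budget = downtime_budget_minutes(target)
        print(f"{target}% allows about {budget:.1f} minutes of downtime per year")
```

Three nines works out to a little under nine hours a year, which is why a well-rehearsed replace-and-restore plan can plausibly fit inside it.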
Here are the cost factors broken down:
- Storage: 2
- Network: 2 (SRDF, MirrorView), 0.5 (RecoverPoint)
Administrators and managers will typically choose this option when they want to:
- Minimize complexity of the deployment
- Use advanced virtualization features such as Live Migration/vMotion, DRS, etc.
- Achieve consistency with other line of business applications
- Control failover with scripts or tools like VMware Site Recovery Manager
- Control their RPO from zero data loss to minutes
- Have multiple recovery points at each site
- Control bandwidth utilized by replication
- Meet failover requirements not achievable with Exchange’s native Best Copy Selection
This solution is not without its drawbacks, however. Here are a couple of things to consider:
- This is the only option where the administrator cannot do non-disruptive patching. However, boot times are pretty quick on virtual machines, and reboots can be scheduled for non-peak hours. It’s very possible to achieve four nines (99.99% availability) with this solution despite the lack of live patching.
- Since only one copy of the data is available at each site, a rapid recovery mechanism for logical and physical failure modes is well advised. This is usually achieved through hardware-based snapshots or bookmarks at minimal cost. It’s usually a good idea to have a rapid recovery scheme outside of the context of the application anyway, for a variety of reasons.
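The four nines claim in the patching point above can be sanity-checked with some quick arithmetic. This is a rough sketch only: the reboot count and per-reboot outage are hypothetical numbers I picked for illustration, and it assumes planned reboots count against availability, which depends on how your SLA is written.

```python
# Assumption: a 365-day year (leap years ignored) for a simple estimate.
MINUTES_PER_YEAR = 365 * 24 * 60


def achieved_availability_pct(reboots_per_year: int, minutes_per_reboot: float) -> float:
    """Availability (in percent) if patch reboots are the only source of downtime."""
    downtime = reboots_per_year * minutes_per_reboot
    return 100 * (1 - downtime / MINUTES_PER_YEAR)


if __name__ == "__main__":
    # Hypothetical schedule: one patch reboot a month, four minutes of outage each.
    pct = achieved_availability_pct(reboots_per_year=12, minutes_per_reboot=4)
    print(f"Estimated availability: {pct:.4f}%")
    print("Meets four nines" if pct >= 99.99 else "Misses four nines")
```

With those assumed numbers the schedule lands just above 99.99%, leaving only a few minutes a year for unplanned outages. That thin margin is another reason the rapid recovery mechanism in the second point matters.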