How to deal with backup when you switch to hyperconverged infrastructure
- 16 April, 2019 14:05
Companies migrating to hyperconverged infrastructure (HCI) systems are usually doing so to simplify their virtualisation environment. Since backup is one of the most complicated parts of virtualisation, they often look to simplify it as well as part of the migration to HCI.
Other customers have chosen HCI simply to reduce hardware complexity, while keeping a traditional backup approach for operational and disaster recovery. Here’s a look at both scenarios.
Different types of HCI
Sometimes an HCI system is a pre-integrated collection of hardware on which you run your favourite hypervisor. This is the most common type of HCI offering and includes products like Nutanix, SimpliVity, Datrium and VxRail.
Each HCI vendor offers a hardware configuration using components supported by the virtualisation vendors it wishes to support. Since the system comes pre-built you can be assured that all the hardware components will work together and will work with any supported hypervisors. Any incompatibilities between the various components will be handled by the HCI vendor.
Some HCI vendors also offer their own hypervisors. The best example of this would be Nutanix with their Acropolis hypervisor. Typically such a hypervisor will offer tighter integration with the HCI hardware and integrated data-protection features. Often, the built-in hypervisor is also less expensive than traditional hypervisors, especially if you take advantage of the native data-protection features.
The final type of HCI vendor supports neither VMware nor Hyper-V, nor does it build its own hypervisor. Scale Computing, for example, uses the open-source KVM hypervisor. Like Nutanix, it does this to reduce customers’ TCO while offering much of the same functionality that VMware offers, and it also provides integrated data protection.
Integrated data protection
All three types of HCI vendors mentioned above offer integrated data protection. These companies recognise that backup and recovery of traditional servers is complicated enough; virtual servers can be even worse due to the scarcity of I/O resources. So one of the problems that HCI vendors attempt to solve for their customers is the simplification of backup and recovery.
Traditionally, this is done through the use of snapshots in the HCI product. The snapshots are taken at the storage level, with some type of integration into the hypervisor so that the right thing happens inside the operating system when an HCI snapshot is taken.
For example, most hypervisors integrate with the Windows Volume Shadow Copy Service (VSS). When it's time for an HCI snapshot, the hypervisor first tells each VM to create a VSS snapshot. This presents an application-consistent view of the operating system to the hypervisor or HCI data-protection system, which then takes a snapshot of that snapshot. Once the hypervisor or HCI snapshot has been created, the VSS snapshot can be released.
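The quiesce-snapshot-release sequence above can be sketched as follows. This is an illustrative model only: the `VM` class and the method names (`quiesce_with_vss`, `release_vss_snapshot`, `take_storage_snapshot`) are assumptions for the sake of the example, not any vendor's real API.

```python
# Hypothetical sketch of an application-consistent HCI snapshot:
# freeze each guest via VSS, take one storage-level snapshot, then thaw.

class VM:
    def __init__(self, name):
        self.name = name
        self.vss_frozen = False

    def quiesce_with_vss(self):
        # Ask the in-guest VSS writers to flush and freeze application I/O.
        self.vss_frozen = True

    def release_vss_snapshot(self):
        # Thaw the guest once the storage-level snapshot exists.
        self.vss_frozen = False


def consistent_snapshot(vms, take_storage_snapshot):
    """Quiesce every VM, take one storage-level snapshot, then thaw."""
    for vm in vms:
        vm.quiesce_with_vss()
    try:
        snapshot_id = take_storage_snapshot([vm.name for vm in vms])
    finally:
        # Always release, even on failure: guests must not stay frozen
        # longer than the few seconds the process is meant to take.
        for vm in vms:
            vm.release_vss_snapshot()
    return snapshot_id
```

The `try`/`finally` matters: if the storage snapshot fails, the guests are still thawed, so applications never stay frozen.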
The entire snapshot-creation process may take only a few seconds from end to end, since it isn’t moving much data, but it is important to understand that at this point it is only a virtual copy of the data. The snapshot needs to be replicated to another system in order to be an actual backup, and all HCI vendors do just that. Once the bytes specific to that snapshot have been replicated to another system, you have a complete backup of the latest version of all VMs in the snapshot.
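"Replicating only the bytes specific to that snapshot" can be illustrated with a minimal sketch. The dict-of-blocks model here is an assumption for illustration, not any vendor's on-disk format.

```python
# Minimal sketch of snapshot replication: only the blocks that changed
# since the last replicated snapshot cross the wire.

def changed_blocks(prev_snapshot, new_snapshot):
    """Return only the blocks that differ between two snapshots.

    Each snapshot is modelled as {block_number: bytes}.
    """
    return {
        blk: data
        for blk, data in new_snapshot.items()
        if prev_snapshot.get(blk) != data
    }


def replicate(prev_snapshot, new_snapshot, remote):
    """Apply the changed blocks to the remote copy; return bytes sent."""
    delta = changed_blocks(prev_snapshot, new_snapshot)
    remote.update(delta)  # remote now matches new_snapshot
    return sum(len(data) for data in delta.values())
```

After `replicate` runs, the remote copy holds the latest version of every block, even though only the delta was transferred.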
Some customers replicate their secondary copy to another HCI system inside their data center, and others send it to the cloud. Customers who store their secondary copy on a local HCI system may also replicate that copy to the cloud. That way they have an onsite and an offsite copy of all systems.
Some HCI vendors with integrated data protection can then replicate the VMs into the cloud and spin them up for DR purposes. Where that is possible, customers have both a local recovery option and a cloud recovery option without spending any money on a third-party data-protection system.
Integrated data protection is not new
The best-known vendor to argue that data stored on its systems didn’t need to be backed up in the traditional way is NetApp. (Many other storage vendors have since followed NetApp’s example, but NetApp was clearly the early leader in this space.)
Their customers were the first to experience a more modern style of recovery, which was much faster and easier to use than anything else offered at the time. The key to their story was snapshots: every volume on a NetApp filer can hold up to 255 snapshots without a performance impact.
This gives customers far more restore points than are typically possible via normal backup means, and restores that are much faster than a typical restore from backup. Snapshots are a virtual copy of the file system that relies on the original volume, however, so they need to be replicated to another system to provide complete backup and recovery.
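A rolling retention policy under a fixed per-volume snapshot cap can be sketched as below. The 255 limit mirrors the cap mentioned above; the data structure and scheduling are illustrative assumptions, not NetApp's actual implementation.

```python
# Sketch of rolling snapshot retention: keep at most N restore points
# per volume, expiring the oldest first as new snapshots are taken.

from collections import deque

MAX_SNAPSHOTS = 255  # illustrative per-volume cap


def take_snapshot(snapshots, name, limit=MAX_SNAPSHOTS):
    """Append a snapshot, expiring the oldest once the limit is hit."""
    snapshots.append(name)
    while len(snapshots) > limit:
        snapshots.popleft()  # oldest restore point is dropped
    return snapshots
```

With, say, hourly snapshots and a 255-snapshot cap, a volume retains more than ten days of hourly restore points before the oldest roll off.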
Many NetApp customers go "all in" with a NetApp-only data-protection solution. They replicate their primary filer to a secondary filer on-site and a tertiary filer off-site, thus providing both on-site and off-site recovery without ever using a traditional backup product – and certainly without using any tapes.
Others are critical of this solution, as they wonder if there could ever be some sort of bug in the software that would corrupt all copies of the data. These customers prefer to have at least one copy of their data on some system other than NetApp. They might hope they never need to use that copy, but they have it just in case.
This customer response is similar to the response to the integrated data-protection features of HCI vendors. Some see it as a way to offer quicker recovery for most recovery scenarios, while others do not like the “all their eggs in one basket” approach. These customers prefer to supplement the integrated data protection features with a more traditional backup approach.
There are many backup solutions that are hypervisor-friendly. Such solutions need to integrate with the data-protection features of the hypervisor, such as VMware’s vStorage APIs for Data Protection (VADP).
They also need to do everything they can to minimise the I/O impact of backups, as this is the Achilles’ heel of any traditional backup solution that simply pretends the VMs are physical servers. A backup product that does that and continues doing occasional full backups will probably fail miserably. To be welcome in a hypervisor world, backup products need to support features like block-level incremental-forever backups and changed-block tracking.
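The incremental-forever approach described above can be sketched simply: one initial full copy, then every later backup copies only the blocks the hypervisor's change tracking has flagged. The tracking structure here is an illustrative assumption, not a real CBT API.

```python
# Sketch of changed-block-tracking (CBT) style incremental-forever
# backup: one full copy, then deltas only, with tracking reset
# after each successful backup.

def full_backup(disk):
    """Copy every block once; all later backups are incrementals."""
    return dict(disk)  # {block_number: bytes}


def incremental_backup(backup, disk, changed):
    """Copy only the blocks listed in the change-tracking set."""
    for blk in changed:
        backup[blk] = disk[blk]
    changed.clear()  # tracking is reset after each backup
    return backup
```

Because the backup copy is updated in place, every incremental leaves a complete, current image without ever re-reading unchanged blocks, which is what keeps the I/O impact low.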
It’s your choice
You can choose an HCI product with integrated data protection and eliminate traditional backup approaches, as long as you make sure to replicate those snapshots to another system and/or offsite. You can ignore your HCI product’s native data-protection features and simply use a virtualisation-friendly backup product. Or you can use both in a belt-and-suspenders approach. Just make sure you understand the benefits and drawbacks of each before you decide.