What are Virtual Volumes?

Anyone who uses VMware and has taken a look at vSphere 6 is probably familiar with the buzz around the new Virtual Volume feature. The Virtual Volume (VVol for short) is a new concept for handling virtual machine storage.

Previously, virtual machines (VMs) resided in a datastore of a set size, the datastore having specific capabilities associated with it. VVols create and manipulate VMs (and the virtual disks associated with them) as individual objects. The capabilities, policies, and features of each of these objects are controlled separately, rather than being lumped with everything else on the datastore.

Because VMware administrators still need to do some operations at a datastore level, VVols introduces a storage container as a logical grouping of VMs and VVols. A VVols datastore is now a storage container, no longer represented as a SCSI logical unit (LU) or NFS filesystem. Instead, it’s simply a way of grouping VVols from the VMware administrator’s perspective, while allowing VVols to be independent from a storage perspective.
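As a rough illustration of the difference between datastore-level and per-VVol capabilities, here is a minimal Python sketch. The class names and the capability fields (min_iops, encrypted, replicated) are hypothetical and chosen only for illustration; this is not the vSphere API or any vendor's object model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Capabilities:
    """Hypothetical capability set; the field names are illustrative only."""
    min_iops: int = 0
    encrypted: bool = False
    replicated: bool = False


@dataclass
class TraditionalDatastore:
    """Traditional model: one capability set shared by every VM stored on it."""
    name: str
    capabilities: Capabilities
    vms: List[str] = field(default_factory=list)

    def capabilities_for(self, vm: str) -> Capabilities:
        if vm not in self.vms:
            raise KeyError(vm)
        # Every VM gets whatever the datastore offers.
        return self.capabilities


@dataclass
class VVol:
    """One virtual volume carrying its own capabilities."""
    name: str
    capabilities: Capabilities


@dataclass
class StorageContainer:
    """VVols model: a logical grouping only; it holds no capabilities itself."""
    name: str
    vvols: Dict[str, VVol] = field(default_factory=dict)

    def add(self, vvol: VVol) -> None:
        self.vvols[vvol.name] = vvol

    def capabilities_for(self, name: str) -> Capabilities:
        # Capabilities are looked up on the individual object.
        return self.vvols[name].capabilities


# Traditional: both VMs inherit the same datastore-wide settings.
legacy = TraditionalDatastore("ds1", Capabilities(min_iops=1000), vms=["db", "web"])

# VVols: objects in the same container can differ.
container = StorageContainer("prod")
container.add(VVol("db-data", Capabilities(min_iops=5000, encrypted=True)))
container.add(VVol("web-os", Capabilities()))
```

In this sketch the container is purely a grouping: giving "db-data" different settings from "web-os" does not require a second container.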


Why is there a need for a new storage concept?

VMware saw room for improvement in storage and designed VVols to address a handful of issues that arise when using traditional storage techniques in a virtualized environment. The new features also had to be designed in a way that made them attractive to storage vendors.

Ultimately, though, VVols needs to offer something of value to the virtualization customer. Here’s a quick summary from all three perspectives of why VVols is a good thing.

Comparison of traditional virtual storage and Virtual Volumes (Image courtesy of Rawlinson Rivera, VMware)

What drove VVols at VMware?

VMware has worked continuously with a number of storage vendors, each of which had specific features to offer to its consumers. Some of these features (like Quality of Service, encryption, replication, and snapshotting) overlapped with capabilities of a VMware datastore available through NFS or the VMware file system (VMFS). However, nothing allowed a storage system to understand what was being stored in an opaque VMFS on SCSI storage or in an NFS filesystem. The individual VMs and virtual disks being stored weren’t identifiable as individual objects, and therefore storage vendors weren’t able to apply their secret herbs and spices to VMs or their component pieces.

Advancing and standardizing new storage protocols is an arduous process. Additionally, storage vendors sometimes package their distinguishing features into the storage protocols. In any case, vendors aren’t going to be thrilled about implementing a completely new storage protocol. VVols is designed to be storage-protocol independent: it extends the existing SCSI and NFS protocols and incorporates an out-of-band control path. This eliminates the time it would take to shake out a new standardized protocol and gain acceptance. VMware took this approach rather than limit the features available to VVols to what the SCSI or NFS protocol commands can express.

As a distributed file system, VMFS exposes the problem of metadata contention among all of the vSphere hosts accessing the file system. VMware introduced its vSphere Storage APIs for Array Integration (VAAI) several years ago to alleviate the contention problem seen with SCSI reservations. Even so, contention still limits VMFS datastores to several tens of VMs each, meaning you need to continue to create new datastores as the number of VMs grows. VVols removes this limitation and uses a logical storage container to represent a datastore to vSphere. As a logical concept, there’s no contention between VMs and objects in the storage container. Because VVols are created and manipulated individually, they’re no longer contained in a file system. This allows as many VMs as a consumer wants in a storage container, without an arbitrary limit of “too many.”

Another VVols design consideration is eliminating wasted storage capacity. A VMFS datastore must be laid out on a fixed-size SCSI LU. These datastores are often over-provisioned for the amount of space needed (even though VMFS can be extended), so a VMFS datastore that has reached its limit because of metadata contention can still hold unused space. No storage space needs to be set aside for VVols before they are created, so there is no reserved but dormant capacity.
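To put the stranded-capacity argument in numbers, here is a small back-of-the-envelope sketch. Every figure in it (LU size, VMs per datastore, average VM size) is a made-up assumption for illustration, not a VMware limit or a measured result.

```python
import math


def stranded_capacity_gib(total_vms, vms_per_datastore, lu_size_gib, avg_vm_gib):
    """Capacity reserved in fixed-size LUs but never used by VM data.

    Assumes every datastore sits on an LU of the same size and that the
    per-datastore VM limit is reached before the LU fills up.
    """
    datastores = math.ceil(total_vms / vms_per_datastore)
    provisioned = datastores * lu_size_gib
    used = total_vms * avg_vm_gib
    return provisioned - used


# Hypothetical environment: 200 VMs of 40 GiB each,
# at most 30 VMs per datastore, 2048 GiB per LU.
traditional_waste = stranded_capacity_gib(200, 30, 2048, 40)

# With VVols, nothing is set aside ahead of time in this model,
# so there is no reserved-but-unused capacity.
vvols_waste = 0

print(f"Stranded capacity, traditional: {traditional_waste} GiB")
print(f"Stranded capacity, VVols:       {vvols_waste} GiB")
```

With these assumed inputs, seven datastores are needed and 6336 GiB of the 14336 GiB provisioned sits idle. Different inputs change the figure, but not the mechanism.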

A final motivation for VMware was the value of storage network bandwidth. Using the vSphere storage stack as a data mover consumes network bandwidth every time a VM object is copied or moved. Performing storage operations such as cloning and snapshotting entirely at the storage level potentially saves a lot of valuable storage network resources. On top of this, VVol capabilities are now handled on an individual basis, and not at the datastore level. This provides the ability to change individual VVol capabilities without requiring Storage vMotion, saving more bandwidth. The need for Storage Distributed Resource Scheduler (SDRS), another big consumer of storage bandwidth, is also eliminated with VVols. This is because VVols allow storage to work with vSphere to “rebind” storage connections based on storage capability requirements, without needing to move the VM data from one datastore to another.
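The bandwidth difference when changing a capability can be sketched as follows. This is a toy model under stated assumptions: the "tier" label, the function names, and the datastore names are hypothetical, and the model counts only data copied over the storage network, not any work the array does internally.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class VirtualDisk:
    name: str
    size_gib: int
    tier: str       # hypothetical capability label, e.g. "silver" or "gold"
    location: str   # datastore or storage container name


def retier_traditional(disk: VirtualDisk, datastore_tiers: Dict[str, str], new_tier: str) -> int:
    """Capabilities belong to the datastore, so a change means moving the data.

    Returns the GiB copied over the storage network.
    """
    if datastore_tiers[disk.location] == new_tier:
        return 0
    target = next((ds for ds, tier in datastore_tiers.items() if tier == new_tier), None)
    if target is None:
        raise ValueError(f"no datastore offers tier {new_tier!r}")
    disk.location = target   # stands in for a Storage vMotion
    disk.tier = new_tier
    return disk.size_gib


def retier_vvol(disk: VirtualDisk, new_tier: str) -> int:
    """Capabilities belong to the VVol, so the policy changes in place.

    Returns the GiB copied over the storage network (none in this model).
    """
    disk.tier = new_tier
    return 0


tiers = {"ds-silver": "silver", "ds-gold": "gold"}

legacy_disk = VirtualDisk("db-data", 500, "silver", "ds-silver")
moved_traditional = retier_traditional(legacy_disk, tiers, "gold")

vvol_disk = VirtualDisk("db-data", 500, "silver", "container-1")
moved_vvol = retier_vvol(vvol_disk, "gold")

print(moved_traditional, legacy_disk.location)
print(moved_vvol, vvol_disk.location)
```

In the traditional branch the whole 500 GiB disk crosses the storage network to reach a datastore with the right capabilities. In the VVol branch the disk stays in its container and only the policy changes.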

Why VVols are useful for storage vendors

The primary reason storage vendors embrace VVols is that they offer the vendor the ability to tout proprietary features at a per-VM and per-VVol level. Being able to guarantee QoS for a particular VM is a much more compelling story than doing so at the datastore level. The ability to clone, snapshot, and replicate only makes sense with virtual disks or VMs. VMFS provides no visibility of virtual disks, so storage vendors would never be able to offload these capabilities to a meaningful extent. The ability to apply encryption, compression, and deduplication for each VVol means that separate datastores aren’t needed for each level of capability.

There’s a lot for storage vendors to like about VVols. While there is some implementation work, there’s no need to completely revamp their storage system to support an entirely new storage protocol. This means elaborate multipath capabilities and failover techniques don’t have to change, since changing them might put availability or throughput at risk. Less wasted storage space and less demand on storage bandwidth can make sales into virtualized environments easier.

Why VVols are valuable to our customers

The most obvious advantage is the ability to manage VM resources more easily and with better granularity. Storage capabilities are configured and assigned for each VVol. Storage containers can be used to group VMs and VVols logically, rather than by capability. Storage containers don’t need to be added simply because a datastore is at capacity, either in terms of space or contention limits. This allows a VMware administrator to manage VMs more effectively, without having to know the underlying storage configuration to understand the grouping. VVols is also about exposing storage capabilities that are consumed in different ways for the benefit of storage administrators, vSphere administrators, and application builders alike.

A less obvious but equally compelling advantage is the increase in efficiency. Better use of storage capacity and storage bandwidth means less unnecessary overhead, and potentially less infrastructure cost, both in terms of storage and storage network capacity. VVols are also more efficient because they can be placed far more flexibly, rather than in one of only a small number of datastores. Further efficiency comes from the lack of filesystem contention. Together, these gains should offer significant improvements in scale for customers as well.

Watch this technical demo of VVols with SolidFire Quality of Service to learn more or contact SolidFire directly for a live demo.

Andy Banta