The distributed storage myth that CloudScaling’s Randy Bias alludes to in his recent white paper is perpetuated primarily by marketing departments eager to stretch their systems far beyond the use cases those architectures were originally designed for. The myth is the idea that a single pooled storage system can capably consolidate all three tiers of storage. While you can certainly have a distributed storage architecture for each tier, expecting a single architecture to span all tiers will likely end badly. In this respect we agree with Randy’s position that one size doesn’t fit all and that you need to choose the right tool for the job.
We’re big fans of open source at SolidFire, both using and contributing to many projects, including OpenStack and CloudStack. However, any discussion about distributed storage solutions for cloud should include commercial options as well. In the case of cloud storage for performance-sensitive applications, the options provided by open source as well as legacy storage vendors are significantly lacking.
There are very few production-quality distributed storage systems available today. Popular open source storage solutions like Ceph and Gluster were architected for capacity-optimized storage use cases such as file servers and unstructured data. When it comes to performance-optimized workloads, however, these solutions were simply not built with this use case in mind. To help identify, as Randy puts it, “the right tool for the job”, we have created a list of key considerations for anyone evaluating performance-optimized distributed storage for cloud infrastructures:
- Consistent performance – Tier 1 applications generally expect consistent latency and throughput from storage systems. Achieving this in a multi-tenant legacy storage system is challenging enough, but in a complex distributed system it becomes an even larger problem.
- Performance control – Without the ability to provision performance separately from capacity and to dial in performance allocations for each application, a storage system will quickly be dominated by the “noisiest neighbor,” starving more critical applications of resources.
- Data reduction – By definition, Tier 1 storage is going to utilize faster, more expensive media – either fast disk or, preferably, SSDs. Inline deduplication and compression that do not impact performance are critical for making the system cost-effective and achieving maximum density in a cloud environment.
- Manageability – APIs are an often overlooked component of block storage in cloud environments. A robust API that lends itself to automation of all aspects of the storage system is imperative to achieve the promised operational benefits of cloud.
- Professional testing and support – Tier 1 applications are called mission critical for a reason. Ensuring the storage hardware and software you use is thoroughly tested and supported helps minimize the downtime and errors encountered when these platforms are deployed in production environments.
- Qualified hardware – Consuming storage in an appliance form factor has real, measurable benefits. Vendors bear the burden of ongoing qualification of the hardware and software while providing a single resource for support without finger-pointing. Firmware bugs in commodity storage controllers and drives are a very real problem, and system vendors are in the best position to identify and correct or work around these issues. Why resource an effort so far outside of your core competence when your vendor will aggressively ride the hardware cost curve for you?
- Flash aware – With cost declining at a rapid pace, flash is now appropriate for a large percentage of Tier 1 use cases, particularly when combined with data reduction. However, plugging SSDs into a storage system designed around disk is a recipe for problems. Disk-based architectures can’t deliver the maximum IOPS from flash, while wear and endurance are real concerns due to write amplification. Only native flash storage architectures can deliver both the performance and endurance required for Tier 1 applications.
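To make the performance control item above more concrete, here is a minimal sketch of one common way to cap a noisy neighbor: a token bucket per volume. This is an illustration only, not a description of how any particular product (ours included) implements QoS. The `VolumeQoS` class, its fields, and the `admitted` helper are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class VolumeQoS:
    """Hypothetical per-volume performance allocation (illustrative names only)."""
    max_iops: float           # sustained I/Os per second this volume may issue
    burst_ios: float          # bucket capacity: I/Os allowed back-to-back
    tokens: float = 0.0       # I/O credits currently available
    last_refill: float = 0.0  # time (seconds) of the previous refill

    def __post_init__(self):
        # Start with a full bucket so a fresh volume can burst immediately.
        self.tokens = self.burst_ios

    def try_io(self, now: float) -> bool:
        """Admit one I/O at time `now` (seconds) if the volume has credit left."""
        elapsed = max(0.0, now - self.last_refill)
        # Refill at the provisioned rate, never beyond the burst capacity.
        self.tokens = min(self.burst_ios, self.tokens + elapsed * self.max_iops)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # over its allocation: the caller would queue or delay the I/O


def admitted(volume: VolumeQoS, requests: int, duration: float) -> int:
    """Offer `requests` evenly spaced I/Os over `duration` seconds; count admissions."""
    step = duration / requests
    return sum(volume.try_io(i * step) for i in range(requests))


# A noisy tenant offering 1,000 I/Os in one second against a 100 IOPS allocation,
# and a critical application offering 200 I/Os against a 500 IOPS allocation.
noisy = VolumeQoS(max_iops=100, burst_ios=10)
critical = VolumeQoS(max_iops=500, burst_ios=50)
noisy_admitted = admitted(noisy, requests=1000, duration=1.0)
critical_admitted = admitted(critical, requests=200, duration=1.0)
```

Because each volume draws from its own bucket, the noisy tenant is held close to its provisioned rate no matter how much load it offers, while the critical application, which stays under its allocation, sees every I/O admitted. The allocation is expressed in IOPS rather than gigabytes, which is what provisioning performance separately from capacity means in practice.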
In crafting this list we decided not to tackle the most commonly assumed traits of distributed storage systems: availability and scalability. These traits should be viewed as table stakes in any tier of storage, but they should still be thoroughly vetted. Instead, we focused on some of the key attributes unique to Tier 1 storage that are seldom delivered by capacity-optimized systems. After reading through the list, it should be clear that certain tools, while good for other things, simply weren’t intended for performance-optimized use cases in your cloud storage infrastructure.
Guaranteed performance is a central pillar of the next-generation data center. Don’t know where to start? Try our Definitive Guide to Guaranteeing Storage Performance.