In the initial post of our series on tiering we covered the merits of a proactive, performance-driven approach to tiering relative to the more traditional capacity-centric discussions. Today we take a closer look at some of the less obvious cost implications of “automated” tiering. On the surface, the promise of tiering looks like a clear win – SSD performance with spinning-disk capacity and cost. However, the true economics of this type of solution are not nearly as compelling as some vendors would lead you to believe. Considered in the context of the unique burdens faced by cloud service providers, the proposed value proposition is even less appealing.
To start with, the “SSD performance” part of the catchy tagline above must be caveated by the fact that it only holds if the data actually resides in the SSD tier. Easier said than done. Guaranteeing SSD performance in a tiered architecture requires a substantial SSD tier and/or extremely accurate data placement algorithms. Rightsizing the former skews the proposed economics of a tiered solution substantially, while the latter has been long on promise but short on delivery for at least three generations of marketing executives. Before the industry marketed this functionality as Automated Tiering it was known as Information Lifecycle Management (ILM), and a few years before that it was Hierarchical Storage Management (HSM). Regardless of what you call it, tiering has always been impaired by the inability to accurately predict and automate the movement of data between tiers. In cloud environments, the significant scale requirements and extremely low application-level visibility make solving this challenge even more difficult.
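To make the placement problem concrete, here is a minimal sketch of a naive “heat map” promotion policy in Python. Everything here (the `NaiveTier` class, its slot count, the promotion pass on every access) is hypothetical and far simpler than any shipping implementation, but it illustrates the structural weakness: placement is driven entirely by past access counts, so a shift in the workload is served at disk speed until the heat map catches up.

```python
from collections import Counter

class NaiveTier:
    """Toy hot/cold placement: keep only the most-accessed blocks on SSD.

    ssd_slots is the number of blocks the SSD tier can hold. Because
    placement decisions look only at historical access counts, newly
    hot data always starts out on the slow tier.
    """
    def __init__(self, ssd_slots):
        self.ssd_slots = ssd_slots
        self.heat = Counter()   # per-block access counts
        self.ssd = set()        # blocks currently on the SSD tier

    def access(self, block):
        self.heat[block] += 1
        hit = block in self.ssd  # True: SSD speed, False: disk speed
        # Promotion/demotion pass: retain only the hottest blocks on SSD.
        self.ssd = {b for b, _ in self.heat.most_common(self.ssd_slots)}
        return hit

tier = NaiveTier(ssd_slots=2)
for _ in range(10):          # steady workload: blocks 1 and 2 stay hot
    tier.access(1)
    tier.access(2)
assert tier.access(1)        # established hot block: SSD hit
assert not tier.access(3)    # workload shift: disk miss until heat builds
```

Real systems move data in coarser chunks and on longer timers than this per-access pass, which only widens the window during which shifted workloads run at disk speed.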
It’s also important to consider the flash media requirements of a tiered solution. The write patterns in the flash layer of a tiered architecture require a higher-grade flash solution to withstand the impact of write amplification and churn. Vendors are forced to use the most expensive SLC flash to ensure adequate media endurance. The cost impact of even modest amounts of SLC flash destroys the economic advantage of a tiered architecture relative to an all-MLC design. In many examples we’ve seen, the “combined” $/GB of a storage solution that incorporates SLC flash, 15k SAS and SATA is actually higher than that of an all-flash MLC solution with similar raw capacity. Importantly, this price advantage for MLC over tiered storage is achieved before factoring in the favorable impact of compression and deduplication for the all-flash solution, making the flash design even more compelling.
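The $/GB comparison is just weighted-average arithmetic. The sketch below uses hypothetical list prices and tier proportions (illustrative assumptions, not vendor quotes) to show how even a 10% SLC tier can push the blended cost of a tiered pool above an all-MLC pool, before any data reduction is applied.

```python
# Hypothetical list prices in $/GB (illustrative assumptions, not quotes).
PRICE = {"slc": 45.00, "sas_15k": 1.50, "sata": 0.40, "mlc": 5.00}

def blended_cost_per_gb(mix):
    """mix maps media type -> raw GB; returns total $ / total GB."""
    total_gb = sum(mix.values())
    total_cost = sum(PRICE[media] * gb for media, gb in mix.items())
    return total_cost / total_gb

# A 100 TB tiered pool with a 10% SLC tier vs. 100 TB of raw MLC.
tiered = blended_cost_per_gb({"slc": 10_000, "sas_15k": 20_000, "sata": 70_000})
all_mlc = PRICE["mlc"]

# With these assumed prices the tiered blend already costs more per GB,
# and any compression/dedupe ratio only widens the gap in MLC's favor.
assert tiered > all_mlc
```

The exact crossover point obviously depends on prices and tier proportions at any given moment, but the structure of the calculation shows why a small, expensive SLC tier dominates the blend.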
Tiering also hurts capacity utilization and controller performance. To ensure data is in the right place at the right time, it is constantly being promoted and demoted between the flash and disk tiers. A capacity buffer is needed to accommodate this movement, and there is also a controller processing cost to keep up with all this activity. Most legacy systems have limited CPU and controller memory relative to their overall capacity, making the overhead of tiered storage processing one more burden for them to manage. Even complex tiering requires only a fraction of the processing power and memory needed for in-line data reduction features like compression and deduplication, which is why those features are seldom found on legacy primary storage controllers. A recent article from TechWorld references a Forrester Research report by Andrew Reichman (@ReichmanIT) that expands on the data management burden of a tiered storage topology.
The issues outlined above are just a few examples of the hidden costs embedded in an “automated” tiering solution. In some cases these deficiencies may be acceptable in smaller IT environments. However, in a large-scale multi-tenant cloud infrastructure the capital and management costs of these shortcomings are magnified. The hyper-competitive nature of the service provider business model necessitates a more efficient approach.