Couchbase is a high-performance, distributed NoSQL database server. NoSQL deployments are growing in storage capacity and performance requirements, as well as in visibility and importance. However, these deployments often rely on internal drives within the servers, with network-based replication providing the redundancy layer. This architecture has several disadvantages as it scales to meet additional demand: compute and storage cannot scale independently, storage management becomes complex, performance suffers under failure conditions, and replication and failure models increase cost and complexity.
The good news is that the NetApp EF560 provides a number of significant advantages over internal drives for Couchbase deployments. These advantages include robust storage management capabilities, dramatically improved reliability and high availability, ease of storage expansion and scaling, and limited performance degradation under failure conditions such as disk failures. The EF560 also delivers strong performance with very low latency while handling the most demanding NoSQL workloads.
NetApp recently tested a simulated Couchbase environment with the EF560. The tested environment is depicted in figure 1.
The test environment used a multiserver Couchbase cluster and several client servers running the Yahoo Cloud Serving Benchmark (YCSB) to generate appropriate workloads for the combined architecture. In every test, YCSB was configured for either 100% random reads or 100% random updates to documents within the database, using a uniform distribution across the entire dataset. The EF560 under test was configured with a single Dynamic Disk Pool (DDP) of 24 400GB SSDs with a preservation capacity of 2 drives, offering ~6TiB of usable capacity.
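As an illustration of the two workload mixes described above, the following Python sketch renders YCSB CoreWorkload property files for a 100% read run and a 100% update run with a uniform request distribution. The property names are standard CoreWorkload settings. The helper name `ycsb_properties` and the record and operation counts are placeholders chosen for this example, not values from the NetApp test.

```python
def ycsb_properties(read_fraction, record_count, operation_count):
    """Render YCSB CoreWorkload properties for a read/update-only mix.

    read_fraction=1.0 gives 100% reads, read_fraction=0.0 gives 100% updates.
    record_count and operation_count are illustrative placeholders.
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    props = {
        "recordcount": record_count,
        "operationcount": operation_count,
        "readproportion": read_fraction,
        "updateproportion": round(1.0 - read_fraction, 6),
        "scanproportion": 0,
        "insertproportion": 0,
        # Uniform: every document is equally likely to be chosen.
        "requestdistribution": "uniform",
    }
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


# Placeholder dataset sizes; the source does not state the real ones.
READ_ONLY = ycsb_properties(1.0, record_count=1_000_000, operation_count=10_000_000)
UPDATE_ONLY = ycsb_properties(0.0, record_count=1_000_000, operation_count=10_000_000)

if __name__ == "__main__":
    print(READ_ONLY)
    print(UPDATE_ONLY)
```

Each string could be written to a file and passed to YCSB as a workload file. A uniform distribution avoids hot keys, so cache hits on a small popular subset do not mask storage latency.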
The test data showed that 8 cluster servers using the EF560 produced nearly 100,000 read operations per second, with read latency, as measured at the application, of under 5ms. Similarly, for update operations, 8 cluster servers approached 60,000 update operations per second with a maximum update latency of under 8ms.
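To put these figures in per-server terms, the short Python sketch below divides the aggregate rates by the 8 cluster servers and applies Little's law (operations in flight = throughput × latency) to estimate how many requests were outstanding at once. The inputs are the rounded ceilings quoted above ("nearly" 100,000, "under" 5ms, and so on), so the results are approximate upper bounds rather than measured values. Little's law strictly uses mean latency, so using the maximum update latency overstates the update figure.

```python
SERVERS = 8
READ_OPS_PER_SEC = 100_000    # "nearly 100,000" in the text
UPDATE_OPS_PER_SEC = 60_000   # "approached 60,000" in the text
READ_LATENCY_MS = 5           # "under 5ms"
UPDATE_LATENCY_MS = 8         # "maximum ... under 8ms"


def per_server(ops_per_sec, servers):
    """Average operations per second handled by each cluster server."""
    return ops_per_sec / servers


def in_flight(ops_per_sec, latency_ms):
    """Little's law: concurrent operations = throughput * latency."""
    return ops_per_sec * latency_ms / 1000


if __name__ == "__main__":
    print("reads per server:  ", per_server(READ_OPS_PER_SEC, SERVERS))
    print("updates per server:", per_server(UPDATE_OPS_PER_SEC, SERVERS))
    print("reads in flight:   ", in_flight(READ_OPS_PER_SEC, READ_LATENCY_MS))
    print("updates in flight: ", in_flight(UPDATE_OPS_PER_SEC, UPDATE_LATENCY_MS))
```

On these numbers, each server handled roughly 12,500 reads or 7,500 updates per second, and the cluster as a whole kept at most about 500 reads or 480 updates outstanding at any moment.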
In a typical Couchbase deployment with internal drives within the server, there is no redundancy at the controller level. A failure of the internal controller would be similar to a complete server failure and would require rebalancing of the data across the remaining cluster servers. But with the EF560 and the redundancy provided through the dual redundant RAID controller design, no rebalancing is required because all volumes simply transition to the remaining controller.
Figure 2 shows a test in which a controller within the EF560 was failed under an active workload. It covers the sequence before, during, and after the controller failure. A dip can be seen while paths fail over to the surviving controller, but performance then climbs back to approximately 50,000 operations per second. The dip lasted less than 2 seconds.
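One way to quantify a dip like this from benchmark output is to scan a throughput time series for the longest continuous stretch below some fraction of the baseline. The Python sketch below does that. The function name `dip_duration`, the 50% threshold, and the sample data are all assumptions made for illustration. The samples are synthetic and are not the measurements behind figure 2.

```python
def dip_duration(samples, baseline, fraction=0.5):
    """Longest continuous time (seconds) spent below fraction * baseline.

    samples: list of (timestamp_seconds, ops_per_second), sorted by time.
    A dip starts at the first sample below the threshold and ends at the
    first later sample at or above it.
    """
    threshold = baseline * fraction
    longest = 0.0
    start = None
    for timestamp, ops in samples:
        if ops < threshold:
            if start is None:
                start = timestamp          # dip begins
        elif start is not None:
            longest = max(longest, timestamp - start)
            start = None                   # dip ends
    if start is not None:
        # Series ended while still in a dip; count up to the last sample.
        longest = max(longest, samples[-1][0] - start)
    return longest


# Synthetic half-second samples shaped like a brief failover dip.
SYNTHETIC = [
    (0.0, 50_000), (0.5, 50_000), (1.0, 12_000), (1.5, 9_000),
    (2.0, 30_000), (2.5, 49_000), (3.0, 50_000),
]

if __name__ == "__main__":
    print(dip_duration(SYNTHETIC, baseline=50_000))
```

The resolution of the result is limited by the sampling interval, so a claim such as "less than 2 seconds" needs samples taken at sub-second intervals.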
In another test, a single drive failure and rebuild on one of the internal drives in a Couchbase cluster server significantly reduced the cluster’s ability to process client requests: the operations-per-second rate dropped by over 90%. With the EF560 and DDP, however, the impact was limited, and normal service was restored approximately 15 minutes after the initial disk failure.