The synergies of third platform technologies (social, mobile, analytics, and the cloud) have made it possible to unlock new value in big data. Connected devices, also known as the Internet of Things (IoT), provide yet another opportunity to gain business insights from big data repositories. However, to use these new data sources effectively, you need storage solutions that provide high performance at scale with enterprise-grade reliability.

 

NetApp data analytics solutions for Splunk, Hadoop, and NoSQL can help. NetApp EF-Series all-flash arrays and E-Series storage systems can boost the performance of your analytics applications and provide significant benefits when compared with data repositories based on commodity servers:

    • More than 50 percent performance improvement for data analytics applications 1
    • 33 percent reduction of IT infrastructure costs 1
    • Improved visibility of system health and performance
    • Professional services to speed deployments
    • Premium support options for production environments

 

Faster searches, more visibility for Splunk

Getting top performance from Splunk when ingesting and searching data requires a correspondingly fast storage platform. NetApp EF/E-Series solutions for Splunk enable faster searches while making deployment simpler and more efficient. Benefits include:

    • 69 percent improved search performance vs. commodity servers with internal disks 1
    • 30 percent static performance gain over commodity servers with internal DAS under a Splunk cluster node failure stress test 1
    • Optimized performance and capacity buckets with one architecture for Splunk’s hot, warm, cold, and frozen data tiers

 

Figure 1) EF/E-Series performance vs. commodity/white box servers with internal drives. EF and E-Series results are based on lab validation by Function1, Inc.

The NetApp SANtricity Performance App for Splunk Enterprise provides visibility into the health and performance of NetApp EF/E-Series storage systems. The free plug-and-play app can be downloaded from the Splunk website.

 

In addition, distributor Arrow Enterprise Computing Systems provides preconfigured and prevalidated NetApp and Splunk bundled solutions. These bundles use high-performing EF/E-Series storage to optimize performance in enterprise Splunk deployments requiring small, medium, large, and huge data ingest rates.

 

More control, more insights from Hadoop

The NetApp Storage Solution for Hadoop provides a ready-to-deploy, enterprise-class infrastructure for the Hadoop platform so you can control and gain insights from your big data. Brian Garrett, vice-president for ESG Lab, recommends evaluating this solution “to accelerate the delivery of insight to your business with an enterprise-class big data analytics infrastructure.”

Validated reference architectures deliver reliable Hadoop clusters and seamless integration of Hadoop with existing infrastructures. Benefits include:

    • Reduced infrastructure (nodes, power, space) costs by 33 percent 1
    • Support for all Apache-compatible distributions, including Cloudera, Hortonworks, and MapR

 

Figure 2) Impact of Hadoop replication factor. Data based on lab validation report with E-Series.

There are two main types of Hadoop reference designs available for deployment:

    • NetApp Storage Solution for Hadoop. For businesses that already have their own servers and networking and need enterprise-class storage validated with a Hadoop distribution
    • FlexPod Select with Hadoop. Joint reference architecture featuring pre-sized, enterprise-class components for Hadoop; validated on NetApp storage and Cisco UCS servers, with Cloudera and Hortonworks distributions of Hadoop

For FAS systems, you can use the NetApp NFS Connector for Hadoop to run big data analytics natively on NFSv3 data and start analyzing existing data immediately, without the need to move data, create a separate analytics silo, or set up a Hadoop cluster.


Faster performance for NoSQL databases

Leverage NetApp all-flash storage for NoSQL to gain the performance and reliability you need to handle new types of data. Benefits include:

    • Reconstruct a 400GB SSD in 15 minutes vs. 10 hours for a NoSQL database 1
    • An HDD or SSD failure lengthens completion time by 1 percent instead of 104 percent 1
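
To put the figures in the list above in perspective, here is a minimal arithmetic sketch. It uses only the numbers quoted there (400GB, 15 minutes, 10 hours, 1 percent, 104 percent). The function names, the decimal unit conversion, and the 60-minute baseline job are assumptions made for illustration and are not taken from the lab report.

```python
# Figures quoted in the list above (vs. commodity servers with internal drives).
SSD_CAPACITY_GB = 400
REBUILD_MINUTES_FAST = 15          # quoted reconstruction time
REBUILD_MINUTES_SLOW = 10 * 60     # quoted 10 hours, in minutes


def rebuild_speedup(fast_minutes, slow_minutes):
    """How many times faster the shorter reconstruction is."""
    return slow_minutes / fast_minutes


def effective_rebuild_rate_mb_s(capacity_gb, minutes):
    """Average rebuild rate in MB/s.

    Assumes decimal units (1 GB = 1000 MB) and that the full
    drive capacity is reconstructed; both are illustrative assumptions.
    """
    return capacity_gb * 1000 / (minutes * 60)


def degraded_completion_minutes(baseline_minutes, overhead_percent):
    """Job completion time when a drive failure adds the given overhead."""
    return baseline_minutes * (1 + overhead_percent / 100)


if __name__ == "__main__":
    speedup = rebuild_speedup(REBUILD_MINUTES_FAST, REBUILD_MINUTES_SLOW)
    print(f"Rebuild speedup: {speedup:.0f}x")

    rate = effective_rebuild_rate_mb_s(SSD_CAPACITY_GB, REBUILD_MINUTES_FAST)
    print(f"Average rebuild rate: {rate:.0f} MB/s")

    # Hypothetical 60-minute job, chosen only to make the percentages concrete.
    for overhead in (1, 104):
        minutes = degraded_completion_minutes(60, overhead)
        print(f"{overhead} percent overhead: {minutes:.1f} minutes")
```

Read this way, the quoted reconstruction times differ by a factor of 40, and a 104 percent overhead means a job takes roughly twice as long during a drive failure, while a 1 percent overhead is barely noticeable.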

The NoSQL databases currently validated are Couchbase and MongoDB. Other NoSQL databases NetApp storage supports include Cassandra, HBase, and MarkLogic.

 

Figure 3) EF560 performance scaling for Couchbase.

 

For more information, access the NetApp big data analytics page and watch this short video from industry analyst ESG. For more information on NetApp services and support, and AutoSupport technology, visit NetApp.com/Support.

 

1 Results when compared with commodity servers using internal disk drives based upon third-party lab report and NetApp internal testing.


Mike McNamara

Mike McNamara is a senior manager of product and solution marketing at NetApp with 25 years of storage and data management marketing experience. Before joining NetApp more than 9 years ago, Mike worked at Adaptec, EMC, and Digital Equipment Corporation. Mike was a key leader driving the launch of the industry’s first unified scale-out storage system (NetApp), iSCSI and SAS storage system (Adaptec), and Fibre Channel storage system (EMC CLARiiON). In addition to his past role as marketing chairperson for the Fibre Channel Industry Association, he is a member of the Ethernet Technology Summit Conference Advisory Board, a member of the Ethernet Alliance, a regular contributor to industry journals, and a frequent speaker at events.