Mapping Technology Trends to Enterprise Product Innovation

Scope: Focusses on enterprise platform software: Big Data, Cloud platforms, software-defined, micro-services, DevOps.
Why: We are living in an era of continuous change and low barriers to entry. Net result: a lot of noise!
What: Sharing my expertise gained over nearly two decades in the skill of extracting the signal from the noise! More precisely, identifying shifts in ground realities before they become cited trends and pain-points.
How: NOT based on reading tea leaves! Instead, synthesizing technical and business understanding of the domain at 500 ft., 5000 ft., and 50K ft.

(Disclaimer: Personal views not representing my employer)

Sunday, July 8, 2012

What does Cloud Storage mean for Private Clouds?

Cloud Computing has a lot of buzz, and the term is sometimes over-used in marketing literature. So, what does Cloud Computing mean to a technologist? It is indicative of a fundamental shift in the way IT is provisioned, managed, and consumed by enterprises. By "fundamental shift," I am not only referring to enterprises putting their data on the public cloud -- the more interesting piece is the transformation of existing data-centers, also referred to as the private cloud. The drivers for this shift are IT budgets, business agility, and newly emerging web-scale applications, to name a few. As CIOs and IT architects compare their operating costs to those offered by public cloud providers, the obvious question is whether the Cloud providers have a "secret sauce" besides economies of scale.

Why care about Cloud Storage? Well, it is one of the key ingredients of the secret sauce! Storage is a key contributor to both the capex and opex for IT -- typically, managed storage has a TCO of $1.5-2/GB/year. Business-critical apps today run on centralized networked storage, with SAN, NAS, and iSCSI as the popular options. These enterprise storage boxes come at a premium (10-20X the basic hardware cost). With the growing data deluge, fueled by Big Data and Analytics becoming business differentiators, Cloud Storage represents a new paradigm in storage architectures -- existing Cloud providers and NoSQL databases are essentially the early adopters.
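To put the $1.5-2/GB/year figure in perspective, here is a back-of-the-envelope calculation. The 100 TB footprint is a hypothetical example, not a number from any particular deployment, and the sketch assumes decimal units (1 TB = 1000 GB).

```python
def annual_tco_range(capacity_gb, low_per_gb=1.5, high_per_gb=2.0):
    """Return the (low, high) yearly TCO in dollars for a managed capacity.

    The per-GB rates default to the $1.5-2/GB/year range quoted above.
    """
    return capacity_gb * low_per_gb, capacity_gb * high_per_gb


# Hypothetical footprint: 100 TB of managed storage, with 1 TB = 1000 GB.
low, high = annual_tco_range(100 * 1000)
print(f"${low:,.0f} to ${high:,.0f} per year")
```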

Finally, what is Cloud Storage? It represents a scale-out, shared-nothing storage architecture running on commodity-scale storage components on compute nodes. In other words, local storage from individual server nodes is pooled together to create a single global namespace, similar to centralized storage. Under the covers, the intelligence for data layout, redundancy, caching, data coherency, fault-tolerance, and load balancing all resides within the software layer. In fact, this is not too different from the internals of a typical enterprise storage array today, which uses commodity x86 components running specialized software. The difference, though, is that server-based scale-out storage is a less controlled environment w.r.t. heterogeneous hardware performance, network connectivity, and failure rates. Also, with cluster scales of 100s to 1000s of nodes, failures are the norm rather than the exception. The overall system is loosely coupled, with nodes joining and leaving the cluster on an ongoing basis.
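To make the "intelligence in the software layer" point concrete, here is a minimal sketch of how data layout and redundancy could be decided purely in software. It uses rendezvous (highest-random-weight) hashing, which is one of several possible placement schemes and not necessarily what any given product uses. The node names, the block key, and the replica count of 3 are all hypothetical.

```python
import hashlib


def _score(node, key):
    """Deterministic pseudo-random weight for a (node, key) pair."""
    digest = hashlib.sha256(f"{node}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def place_replicas(key, nodes, replicas=3):
    """Pick the nodes that hold copies of `key`.

    Every node is ranked by its hash score for this key and the top
    `replicas` nodes win. No central directory is needed: any client
    that knows the cluster membership computes the same answer.
    """
    if replicas > len(nodes):
        raise ValueError("not enough nodes for the requested replica count")
    ranked = sorted(nodes, key=lambda node: _score(node, key), reverse=True)
    return ranked[:replicas]


# Hypothetical 10-node cluster and block identifier.
cluster = [f"node-{i}" for i in range(10)]
block = "volume-7/block-42"
before = place_replicas(block, cluster)

# Simulate the primary replica's node leaving the cluster.
survivors = [node for node in cluster if node != before[0]]
after = place_replicas(block, survivors)

print("before failure:", before)
print("after failure: ", after)
```

The property worth noticing is that when a node leaves, the surviving replicas stay where they are and only the lost copy needs a new home. That matters in a loosely coupled cluster where membership changes all the time.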

To summarize, scale-out shared-nothing storage architectures are already present within Hadoop (i.e., HDFS) and NoSQL databases such as Cassandra, MongoDB, Voldemort, CouchDB, etc. The difference from Cloud Storage is that these architectures are designed for a particular set of workload characteristics and data model requirements. Cloud Storage, on the other hand, needs to be more generic and support a wide variety of storage workloads, as well as traditional file and block interfaces in addition to CRUD.
