Mapping Technology Trends to Enterprise Product Innovation

Scope: Focusses on enterprise platform software: Big Data, Cloud platforms, software-defined, micro-services, DevOps.
Why: We are living in an era of continuous change and a low barrier to entry. Net result: a lot of noise!
What: Sharing my expertise gained over nearly two decades in the skill of extracting the signal from the noise! More precisely, identifying shifts in ground realities before they become cited trends and pain-points.
How: NOT based on reading tea leaves! Instead, synthesizing technical and business understanding of the domain at 500 ft., 5000 ft., and 50K ft.

(Disclaimer: Personal views not representing my employer)

Tuesday, September 10, 2013

How is data persisted within a Scale-out Storage System?


In a shared-nothing scale-out storage system, data is distributed across multiple nodes. Each node persists its share of the data on its physical disk/flash resources. There are different design patterns available for accomplishing this persistence:

  • Local Filesystem (Vanilla): This is the most common pattern -- the data on each node is tracked as files in a local filesystem such as ext3, ext4, XFS, or VMFS. HDFS/GFS, Lustre, PVFS, and Azure Storage are some of the systems that use this pattern. Typically, there is no 1:1 mapping between a file in the scale-out namespace and a local file, because objects are striped across nodes -- for instance, in HDFS, a file is striped across nodes in 64MB chunks, and each chunk is persisted as a file in ext3.
  • Local Filesystem (Record-oriented): In the vanilla model, the local file stores only the actual data. However, a simple file-per-stripe layout is typically not optimal for most enterprise workloads. Instead, the data is tracked as a collection of records within a small number of large files.
    • Log Structured: The individual updates of objects are persisted as records within a file, combined with some form of B+-tree or hash table to track the associated records. A log-structured layout allows high performance on disk-based storage by converting random writes into sequential ones.
    • SSTables (Sorted String Tables): In Cassandra, the file data is stored as immutable SSTables.
    • Multiple stripes-per-file: In PVFS, in contrast to HDFS's file-per-stripe, all the stripes associated with a specific file/object are persisted as a single local file.
  • Physical disks: IBM's GPFS, Microsoft's Flat Datacenter Storage, etc. allocate and track data directly on physical disks. This provides full control over the layout of the data within the disk, and removes a layer of indirection in the IO path. Alternatively, logical volumes can be used in a similar fashion instead of physical disks.
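
To make the log-structured pattern above concrete, here is a minimal sketch in Python. It is not taken from any of the systems named above: the LogStore class, the record layout (two 4-byte lengths followed by key and value), and the file name are all assumptions chosen for illustration. An in-memory dict stands in for the hash table that tracks records, and every update is appended to the end of one large file rather than written in place.

```python
import os
import struct
import tempfile

# Hypothetical record layout: 4-byte key length, 4-byte value length, key, value.
HEADER = struct.Struct(">II")


class LogStore:
    """Toy log-structured store: one append-only file plus an in-memory hash index."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # key -> (offset of value in the log, value length)
        # "a+b" creates the file if needed; every write lands at the end of the file.
        self.log = open(path, "a+b")

    def put(self, key, value):
        """Append a record; older versions of the key are never overwritten in place."""
        self.log.seek(0, os.SEEK_END)
        start = self.log.tell()
        self.log.write(HEADER.pack(len(key), len(value)) + key + value)
        self.log.flush()
        # Point the index at the newest version of this key.
        self.index[key] = (start + HEADER.size + len(key), len(value))

    def get(self, key):
        """Find the latest record through the index and read it with a single seek."""
        offset, length = self.index[key]
        self.log.seek(offset)
        return self.log.read(length)

    def close(self):
        self.log.close()


def demo():
    """Write two objects, update one of them, and return (latest value, log size)."""
    with tempfile.TemporaryDirectory() as tmp:
        store = LogStore(os.path.join(tmp, "node0.log"))
        store.put(b"obj1", b"version-1")
        store.put(b"obj2", b"hello")
        store.put(b"obj1", b"version-2")  # the update is appended, not rewritten
        latest = store.get(b"obj1")
        size = os.path.getsize(store.path)
        store.close()
        return latest, size


if __name__ == "__main__":
    print(demo())
```

Note that the log keeps growing because the stale copy of obj1 stays in the file. A real system would also need compaction and a way to rebuild or persist the index after a restart, both of which this sketch leaves out.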



