Mapping Technology Trends to Enterprise Product Innovation

Scope: Focuses on enterprise platform software: Big Data, Cloud platforms, software-defined, microservices, DevOps.
Why: We are living in an era of continuous change and low barriers to entry. Net result: a lot of noise!
What: Sharing my expertise gained over nearly two decades in the skill of extracting the signal from the noise! More precisely, identifying shifts in ground realities before they become cited trends and pain-points.
How: NOT based on reading tea leaves! Instead, synthesizing technical and business understanding of the domain at 500 ft., 5,000 ft., and 50K ft.

(Disclaimer: Personal views not representing my employer)

Monday, September 2, 2013

Masterless taxonomy continued... 

In this post, we continue with the design patterns employed in various workflows in a Masterless scale-out storage system:


  • Data locking  
    • The node responsible for the key-space manages the associated locks requested by the clients. The granularity of locks can be byte-range, file/object-level, or directory-level (as in the other taxonomies).
    • The lock state is typically maintained in-memory, and refreshed periodically using client-side leases. If the node crashes, the clients need to request the locks again.
  • Distributed Transactions 
    • In a Masterless system, a transaction across multiple objects requires coordination among the individual nodes responsible for the key-spaces. This is much more difficult to accomplish than in a Master-based taxonomy, so most scale-out systems provide transaction guarantees only at the granularity of a single row or object.
  • Replication
    • Given the key-based routing in Masterless systems, the replicas have to be placed such that their locations can be computed from the primary copy, i.e., replica keyID = f(primary keyID).
    • Dynamo's Ring-based approach places replicas on the successor and the successor's successor in the key-space, i.e., if the primary node is down, the client communicates with the node responsible for the next contiguous key-space.
  • Failure Detection
    • Node failures are typically discovered using gossip-based techniques, instead of a centralized coordinator heartbeating the individual nodes. 
  • Distributed Node Recovery
    • As mentioned earlier, a node in a Masterless system is responsible for both the data and the metadata.
    • Replication (with quorum semantics) takes care of the data. However, there might be transient in-flight operations at the time of the crash, e.g., space allocation for a new file/object.
    • Typically, the in-flight operations are logged in a WAL (Write Ahead Log) that is replicated (similar to normal data). At the time of recovery, the task of parsing the WAL records can be distributed among the nodes (will be covered in future posts).   
  • Maintenance Daemons
    • In a Masterless system, there is no global state; errors in the data and metadata would otherwise be discovered only when clients access them. As such, background daemon threads are used to constantly scrub data, load-balance, and create new replicas as required. Note that background daemons are also used in Master-based taxonomies; for instance, load balancing can be done by the Master, since it knows the load on the individual nodes and can easily redistribute the data-to-node assignment.
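
The lease-based locking described in the first bullet can be sketched as follows. This is a minimal, illustrative in-memory lock table for the key-space one node owns; the class name, parameters, and injected clock are assumptions for the example, not any particular product's API:

```python
import time

class LeaseLockManager:
    """In-memory lock table for the key-space this node owns (sketch).

    Locks are granted with a lease; a client must renew before the lease
    expires, otherwise the lock is silently reclaimed. All state lives in
    memory, so if the node crashes, clients must re-acquire their locks.
    """

    def __init__(self, lease_seconds=30.0, clock=time.monotonic):
        self.lease_seconds = lease_seconds
        self.clock = clock                 # injectable for testing
        self.locks = {}                    # key -> (client_id, expiry_time)

    def acquire(self, key, client_id):
        holder = self.locks.get(key)
        now = self.clock()
        # Grant if free, expired, or re-acquired by the same client.
        if holder is None or holder[1] <= now or holder[0] == client_id:
            self.locks[key] = (client_id, now + self.lease_seconds)
            return True
        return False  # held by another client with a live lease

    def renew(self, key, client_id):
        holder = self.locks.get(key)
        if holder and holder[0] == client_id and holder[1] > self.clock():
            self.locks[key] = (client_id, self.clock() + self.lease_seconds)
            return True
        return False

    def release(self, key, client_id):
        holder = self.locks.get(key)
        if holder and holder[0] == client_id:
            del self.locks[key]
```

The same pattern works for byte-range or directory locks; only the key granularity changes.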
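
The single-row/object transaction guarantee mentioned above is often exposed as a versioned compare-and-swap, which needs no cross-node coordination. The class and method names below are hypothetical, a sketch of the primitive rather than any specific store's API:

```python
import threading

class ObjectStoreNode:
    """A single node offering per-object atomicity only (sketch).

    A multi-object transaction would require coordination (e.g., 2PC)
    across the nodes owning each key-space, which is what Masterless
    systems typically avoid.
    """

    def __init__(self):
        self._data = {}                # key -> (value, version)
        self._lock = threading.Lock()  # serializes updates on this node

    def compare_and_swap(self, key, expected_version, new_value):
        # Atomic only for one object on one node.
        with self._lock:
            value, version = self._data.get(key, (None, 0))
            if version != expected_version:
                return False, version            # stale caller
            self._data[key] = (new_value, version + 1)
            return True, version + 1
```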
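
The replica-placement relation replica keyID = f(primary keyID) can be sketched for a Dynamo-style ring, where replicas land on the distinct successors of the primary. The hash choice and names here are illustrative assumptions:

```python
import bisect
import hashlib

class Ring:
    """Dynamo-style ring placement (sketch): replicas go to the next
    distinct successor nodes of the primary, so replica locations are
    computable from the key alone."""

    def __init__(self, nodes):
        # Node position on the ring derived from its name (illustrative).
        self._ring = sorted(
            (int(hashlib.md5(n.encode()).hexdigest(), 16), n) for n in nodes
        )

    def preference_list(self, key, n_replicas=3):
        kid = int(hashlib.md5(key.encode()).hexdigest(), 16)
        idx = bisect.bisect(self._ring, (kid, ""))
        out = []
        for i in range(len(self._ring)):
            node = self._ring[(idx + i) % len(self._ring)][1]
            if node not in out:
                out.append(node)
            if len(out) == n_replicas:
                break
        return out  # out[0] is the primary; the rest are successors
```

If the primary fails, re-running `preference_list` against the surviving membership yields the next contiguous key-space owner, which is exactly the failover behavior described above.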
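
A toy version of the gossip-based failure detection bullet, loosely following a heartbeat-counter scheme: each node bumps its own counter, merges tables with random peers, and suspects any peer whose counter stops advancing. All names and thresholds here are assumptions for illustration:

```python
class GossipNode:
    """Simplified gossip-based failure detection (sketch)."""

    def __init__(self, name, peers, fail_after=5):
        self.name = name
        self.fail_after = fail_after
        # peer -> [heartbeat counter, rounds since counter last advanced]
        self.heartbeats = {p: [0, 0] for p in peers}
        self.heartbeats[name] = [0, 0]

    def tick(self):
        """One local round: bump own counter, age all peer entries."""
        self.heartbeats[self.name][0] += 1
        for p, entry in self.heartbeats.items():
            if p != self.name:
                entry[1] += 1

    def merge(self, other_table):
        """Merge a gossiped table: newer counters reset staleness."""
        for p, (counter, _) in other_table.items():
            mine = self.heartbeats.setdefault(p, [counter, 0])
            if counter > mine[0]:
                mine[0], mine[1] = counter, 0

    def suspected_failed(self):
        return [p for p, (c, stale) in self.heartbeats.items()
                if p != self.name and stale >= self.fail_after]
```

There is no central coordinator here: every node reaches the same suspicion independently, just by exchanging tables.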
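
Recovery of in-flight operations from the replicated WAL might look like the following sketch: operations that reached a commit record are re-applied, and operations without one are discarded. The record fields (`txid`, `op`, etc.) are hypothetical:

```python
def replay_wal(wal_records, store):
    """Replay a node's Write-Ahead Log after a crash (sketch).

    Each record describes an in-flight operation, e.g., space allocation
    for a new object. Operations whose "commit" record made it into the
    log are redone; uncommitted ones are implicitly rolled back.
    """
    committed = {r["txid"] for r in wal_records if r["op"] == "commit"}
    for r in wal_records:
        # Redo only allocations that were committed before the crash.
        if r["op"] == "alloc" and r["txid"] in committed:
            store[r["object"]] = {"blocks": r["blocks"]}
    return store
```

Since the WAL is replicated like normal data, this replay can itself be sharded, each surviving node parsing the records for the key-spaces it takes over.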
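
A single pass of a background scrub daemon could look like this sketch. `replica_fetch` and `repair` are hypothetical callbacks into the storage layer; the point is that corruption is found proactively, not at client access time:

```python
import hashlib

def scrub_pass(objects, replica_fetch, repair):
    """One background scrub pass (sketch).

    For each object, recompute the checksum of every replica and compare
    it against the stored checksum; mismatched replicas are re-created
    from a healthy copy via `repair`.
    """
    repaired = []
    for obj, expected in objects.items():
        for node, data in replica_fetch(obj):
            if hashlib.sha256(data).hexdigest() != expected:
                repair(obj, node)          # rebuild from a good replica
                repaired.append((obj, node))
    return repaired
```

The same loop structure serves the other maintenance tasks (re-replication after node loss, load balancing), just with different per-object checks.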
