Multi-Master w/ Coordinators Taxonomy
In this blog post, we cover the Multi-Master taxonomy. As the name suggests, it is a variant of the Master-based taxonomy. Key examples of this taxonomy include Ceph, Lustre, IBM's GPFS, PVFS, and xFS.
To understand this taxonomy, let's revisit the limitations of the Master-based taxonomy: a single Master coordinates the metadata operations of the entire namespace and also performs cluster management activities. To clarify, the data operations (i.e., reads and in-place updates) are distributed among the cluster nodes, with the clients caching the location metadata after their initial interaction with the Master. In contrast, the Multi-Master taxonomy divides the namespace such that both the data and metadata operations related to a subset of the namespace are assigned to a node within the cluster. Further, some nodes are assigned special coordinator roles for cluster-wide activities: cluster management (e.g., monitor nodes tracking the heartbeats of nodes within the Ceph filesystem), filesystem management (e.g., in GPFS, a filesystem manager node that tracks disk space allocation, quotas, ACLs, etc.), or arbitration for leader election and configuration management (e.g., ZooKeeper for configuration bootstrapping, lock management, and tracking Master allocations).
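To make the namespace division concrete, here is a minimal sketch of how a path could be routed to the node that owns its subtree. It assumes subtree-prefix partitioning with longest-prefix matching, which is only one possible scheme; the `NamespaceMap` class, its methods, and the node names are hypothetical and not taken from any of the systems named above.

```python
from dataclasses import dataclass, field


@dataclass
class NamespaceMap:
    """Hypothetical table mapping namespace subtrees to the node that owns them."""

    owners: dict = field(default_factory=dict)  # subtree prefix -> node id

    def assign(self, prefix: str, node: str) -> None:
        # Normalize "/home/" to "/home", but keep the root as "/".
        self.owners[prefix.rstrip("/") or "/"] = node

    def lookup(self, path: str) -> str:
        """Return the node owning the longest subtree prefix that contains path."""
        best = None
        for prefix in self.owners:
            contains = (
                prefix == "/"
                or path == prefix
                or path.startswith(prefix + "/")
            )
            if contains and (best is None or len(prefix) > len(best)):
                best = prefix
        if best is None:
            raise KeyError(f"no node owns {path}")
        return self.owners[best]


# Example: three nodes, each serving data and metadata for one subtree.
nsmap = NamespaceMap()
nsmap.assign("/", "node-a")
nsmap.assign("/home", "node-b")
nsmap.assign("/home/alice", "node-c")

print(nsmap.lookup("/home/alice/notes.txt"))  # node-c
print(nsmap.lookup("/home/bob/todo.txt"))     # node-b
print(nsmap.lookup("/etc/hosts"))             # node-a
```

In this sketch a client could cache the table and contact the owning node directly for both data and metadata operations, going back to a coordinator only when the assignment changes.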
So how is Multi-Master different from Masterless? After all, they both divide the data and metadata responsibility for the sharded namespace among the cluster nodes. In Masterless, the nodes are truly symmetric, and cluster-wide management is accomplished without any centralized component (i.e., through cooperative, consensus-based management algorithms). In contrast, Multi-Master is a descendant of the Master-based taxonomy: the cluster nodes are not symmetric, and a subset of them performs additional coordinator roles for cluster-wide activities.
So why consider the Multi-Master taxonomy? Depending on the workload and infrastructure properties, this taxonomy can offer the best of both worlds, i.e., the scalability and availability of Masterless with the performance and data consistency guarantees of Master-based. To illustrate, in Multi-Master, detecting a node failure within the cluster can be faster than with the gossip-based techniques used in Masterless (where multicast limits scale), and load reassignment is much more flexible (compared with the limited key-based routing semantics). At the same time, there is no SPOS (single point of saturation), since the activities of the Master are distributed among multiple nodes.
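The failure detection and load reassignment described above can be sketched as follows. This is an illustrative toy, assuming a single coordinator that receives heartbeats directly and moves the shards of a failed node to the least-loaded live node; the `Coordinator` class, the timeout value, and the shard and node names are all hypothetical, and real systems such as Ceph or GPFS use their own, more involved protocols.

```python
class Coordinator:
    """Hypothetical coordinator that tracks heartbeats and shard ownership."""

    def __init__(self, timeout: float):
        self.timeout = timeout   # seconds of silence before a node is marked failed
        self.last_seen = {}      # node id -> time of last heartbeat
        self.assignments = {}    # shard id -> owning node id

    def heartbeat(self, node: str, now: float) -> None:
        self.last_seen[node] = now

    def assign(self, shard: str, node: str) -> None:
        self.assignments[shard] = node

    def failed_nodes(self, now: float) -> list:
        # A node has failed if its last heartbeat is older than the timeout.
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout)

    def reassign_failed(self, now: float) -> dict:
        """Move shards owned by failed nodes to the least-loaded live node."""
        failed = set(self.failed_nodes(now))
        live = sorted(n for n in self.last_seen if n not in failed)
        if not live:
            raise RuntimeError("no live nodes available")
        load = {n: 0 for n in live}
        for node in self.assignments.values():
            if node in load:
                load[node] += 1
        moved = {}
        for shard in sorted(self.assignments):
            if self.assignments[shard] in failed:
                # Any live node is a valid target; ties are broken by node name.
                target = min(live, key=lambda n: (load[n], n))
                self.assignments[shard] = target
                load[target] += 1
                moved[shard] = target
        return moved


coord = Coordinator(timeout=5.0)
for node in ("a", "b", "c"):
    coord.heartbeat(node, now=0.0)
for shard, node in (("s1", "a"), ("s2", "a"), ("s3", "b"), ("s4", "c")):
    coord.assign(shard, node)

# Nodes b and c keep reporting; node a goes silent.
coord.heartbeat("b", now=10.0)
coord.heartbeat("c", now=10.0)
print(coord.failed_nodes(now=10.0))     # ['a']
print(coord.reassign_failed(now=10.0))  # {'s1': 'b', 's2': 'c'}
```

Because the coordinator sees every heartbeat, it can declare a failure after one missed timeout, and it is free to place the orphaned shards on any live node instead of being bound to a fixed key-to-node mapping.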