Early catch: Before ground realities become trends!

Multi-Master w/ Coordinators continued...

In this post, we continue the discussion of the Multi-Master taxonomy.

As discussed in the previous post, some nodes are assigned special coordinator roles for cluster-wide activities such as consensus management, group membership tracking, lock management, recovery, and load balancing. These nodes are selected dynamically within the cluster. In addition to coordinator roles, the assignment of nodes to serve subsets of the namespace is also dynamic. Selection can use a group-based consensus protocol such as Paxos, or a ZooKeeper-style scheme in which each node nominates itself for the responsibility and the first one to update the ZooKeeper entry is assigned it. Several other selection heuristics exist as well.
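To make the "first one to update the entry wins" idea concrete, here is a minimal in-memory sketch. It is not the ZooKeeper API: `CoordinatorRegistry`, `try_claim`, `release`, and `elect` are hypothetical names invented for illustration, and a `threading.Lock` stands in for the atomic create-if-absent guarantee that a real coordination service would provide.

```python
import threading


class CoordinatorRegistry:
    """Hypothetical in-memory stand-in for a coordination-service entry."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owners = {}  # role name -> id of the node that claimed it

    def try_claim(self, role, node_id):
        """Atomically claim `role`; only the first caller succeeds."""
        with self._lock:
            if role in self._owners:
                return False
            self._owners[role] = node_id
            return True

    def owner(self, role):
        with self._lock:
            return self._owners.get(role)

    def release(self, role, node_id):
        """Drop the claim (e.g. after the owner fails) so others can nominate."""
        with self._lock:
            if self._owners.get(role) == node_id:
                del self._owners[role]
                return True
            return False


def elect(registry, role, node_ids):
    """Every node nominates itself concurrently; return the ids that won."""
    winners = []
    winners_lock = threading.Lock()

    def nominate(node_id):
        if registry.try_claim(role, node_id):
            with winners_lock:
                winners.append(node_id)

    threads = [threading.Thread(target=nominate, args=(n,)) for n in node_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners
```

Calling `elect(registry, "lock-manager", ["n1", "n2", "n3"])` returns a list with exactly one node id, though which node wins depends on thread scheduling. A real deployment would also need failure detection so that a dead coordinator's claim is released; the sketch leaves that to an explicit `release` call.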

Example roles of the Coordinator:
We list a few common design patterns for cluster-wide coordinator roles:

Coordinator for Cluster Management: It is becoming increasingly common to separate cluster management responsibilities from object/filesystem management. Cluster management involves monitoring machine status (heartbeating), updating cluster membership, tracking resource usage, dealing with machine failures, and potentially scheduling jobs.
Coordinator for Namespace Configuration Management: In this case, a node is assigned to ensure that there is at most one active master at any time and to store the filesystem superblock or equivalent, the data model schema, access control lists, etc. This approach radically simplifies selecting the Master or dynamically assigning coordinator responsibilities among the nodes. The coordinator node in this case can be implemented using ZooKeeper.
Coordinator for Lock Management: The coordinator is assigned the responsibility of lock arbitration and lease management. Locking is mostly handled in a distributed fashion across the individual master nodes. However, for objects/files shared by multiple clients, the centralized lock manager arbitrates the recall process for reassigning fine-grained byte-range locks.
Coordinator for Space/Quota Management: In GPFS, the coordinator initializes free space statistics by reading the allocation map when the file system is mounted. The statistics are kept loosely up-to-date via periodic messages in which each node reports the net amount of disk space allocated or freed during the last period. Instead of every node individually searching for regions that still contain free space, a node asks the space coordinator for a region to try whenever it runs out of disk space in the region it is currently using. To the extent possible, the space coordinator (the allocation manager, in GPFS terms) prevents lock conflicts between nodes by directing different nodes to different regions.
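The space coordinator's behavior described above can be sketched in a few lines. This is an illustrative toy under stated assumptions, not GPFS code: the `SpaceCoordinator` class, its method names, and the "prefer unused regions, then most free space" tie-break are all hypothetical choices made for the example.

```python
class SpaceCoordinator:
    """Toy model of a coordinator that hands out allocation regions."""

    def __init__(self, free_blocks_by_region):
        # Approximate free-block counts, as if read from the allocation
        # map at mount time.
        self.free = dict(free_blocks_by_region)
        self.assigned = {}  # node id -> region the node is currently using

    def report(self, region, net_allocated):
        """Apply a periodic report: blocks allocated minus blocks freed."""
        self.free[region] -= net_allocated

    def next_region(self, node_id):
        """Suggest a region to try, steering nodes away from each other."""
        in_use = {r for n, r in self.assigned.items() if n != node_id}
        current = self.assigned.get(node_id)
        candidates = [r for r, f in self.free.items() if f > 0 and r != current]
        if not candidates:
            return None  # the statistics show no free space anywhere else
        # Prefer regions no other node is using, then the most free space.
        best = min(candidates, key=lambda r: (r in in_use, -self.free[r]))
        self.assigned[node_id] = best
        return best
```

Because the statistics are only loosely up to date, a suggested region is a hint: the node still has to check the region itself and ask again if it turns out to be full.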

Mapping Technology Trends to Enterprise Product Innovation

Tuesday, September 10, 2013
