ZooKeeper is a distributed, open-source coordination service for distributed applications.
It exposes a simple set of primitives that distributed applications can build upon to implement higher-level services for synchronization, configuration maintenance, and groups and naming.
At the heart of ZooKeeper is an atomic messaging system that keeps all of the servers in sync.
FLP proved that consensus cannot be achieved in asynchronous distributed systems if failures are possible. To ensure that we achieve consensus in the presence of failures, we use timeouts.
ZAB is not Paxos; it is primarily designed for primary-backup systems, like ZooKeeper, rather than for state machine replication.
Clients can set watches on znodes. A change to a watched znode triggers the watch and then clears it. When a watch triggers, ZooKeeper sends the client a notification.
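The trigger-then-clear behaviour described above can be illustrated with a toy in-memory model. This is only a sketch: `ToyZnode`, `set_watch`, and `set_data` are hypothetical names made up for illustration, not the ZooKeeper client API.

```python
from typing import Callable, List


class ToyZnode:
    """Toy stand-in for a znode; it only models one-shot watch semantics."""

    def __init__(self, data: bytes = b"") -> None:
        self.data = data
        self._watches: List[Callable[[str], None]] = []

    def set_watch(self, callback: Callable[[str], None]) -> None:
        # Register a watch; it will fire at most once.
        self._watches.append(callback)

    def set_data(self, data: bytes) -> None:
        self.data = data
        # Clear the watches before notifying, so each one fires only once.
        fired, self._watches = self._watches, []
        for callback in fired:
            callback("data_changed")


events: List[str] = []
node = ToyZnode()
node.set_watch(events.append)
node.set_data(b"v1")  # triggers the watch and clears it
node.set_data(b"v2")  # no watch is registered any more, so nothing fires
```

In this model, a client that wants to keep observing the znode has to set a new watch after each notification.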
The data stored at each znode in a namespace is read and written atomically.
Ephemeral znodes exist as long as the session that created them is active.
When creating a znode, you can also request that ZooKeeper append a monotonically increasing counter to the end of the path. This counter is unique to the parent znode.
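As a rough sketch of how such names come out, the toy model below keeps one counter per parent and appends it to the requested prefix. `ToyParent` and `create_sequential` are hypothetical names for illustration only; the 10-digit zero-padded format follows what the ZooKeeper documentation describes for sequence nodes.

```python
from typing import List


class ToyParent:
    """Toy model of a parent znode that hands out sequence suffixes."""

    def __init__(self) -> None:
        self._next_seq = 0  # one counter per parent znode
        self.children: List[str] = []

    def create_sequential(self, prefix: str) -> str:
        # The counter is appended as 10 zero-padded digits.
        name = f"{prefix}{self._next_seq:010d}"
        self._next_seq += 1
        self.children.append(name)
        return name


parent = ToyParent()
first = parent.create_sequential("node-")
second = parent.create_sequential("node-")
```

Because the suffix is zero-padded, later znodes also sort after earlier ones when the names are compared as strings.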
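Because there are only two roles, two znodes are enough: master and slave.
When the master node fails, the former slave is promoted to master, so use ephemeral znodes.
After the slave becomes master it becomes the new replication source, and data conflicts may occur. The node therefore writes the time at which it became master into its znode, which makes it easier to repair conflicting data manually.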
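The cluster shares one parent znode, and each node in the cluster creates its own znode under the parent.
When the Leader node fails, the cluster node holding the lowest-numbered znode becomes the new Leader, so use ephemeral_sequential znodes.
Any data can be written to the znodes, as the business requires.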
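The "lowest-numbered znode wins" rule can be sketched as a pure function over the parent's child names. This is a minimal illustration, assuming names end in the 10-digit sequence suffix; `sequence_number`, `elect_leader`, and the `node-` prefix are hypothetical, and a real client would fetch the child list from ZooKeeper.

```python
from typing import List


def sequence_number(child_name: str) -> int:
    # Sequential znode names end with a 10-digit zero-padded counter.
    return int(child_name[-10:])


def elect_leader(children: List[str]) -> str:
    """Return the child znode with the smallest sequence number."""
    if not children:
        raise ValueError("no candidates: the parent znode has no children")
    return min(children, key=sequence_number)


children = ["node-0000000007", "node-0000000003", "node-0000000012"]
leader = elect_leader(children)

# When the leader's session ends, its ephemeral znode disappears
# and the next-smallest znode takes over.
survivors = [c for c in children if c != leader]
new_leader = elect_leader(survivors)
```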
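All nodes in the cluster share a single leader znode, which is essentially a distributed lock.
When the Leader node fails, the remaining nodes all try to create the leader znode, and whichever one succeeds becomes the Leader, so use an ephemeral znode.
Any data can be written to the znode, as the business requires.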
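The race to create the single leader znode can be simulated in memory as follows. Everything here is a toy sketch under stated assumptions: `ToyNamespace`, `NodeExistsError`, `try_become_leader`, and the `/leader` path are hypothetical names, and the only behaviour modelled is "create fails if the path exists" plus "ephemeral znodes vanish with their session".

```python
from typing import Dict, Optional


class NodeExistsError(Exception):
    """Raised when the znode path is already taken (toy equivalent)."""


class ToyNamespace:
    def __init__(self) -> None:
        self._owners: Dict[str, str] = {}  # path -> owning session

    def create_ephemeral(self, path: str, session: str) -> None:
        if path in self._owners:
            raise NodeExistsError(path)
        self._owners[path] = session

    def close_session(self, session: str) -> None:
        # Ephemeral znodes vanish with the session that created them.
        self._owners = {p: s for p, s in self._owners.items() if s != session}

    def owner(self, path: str) -> Optional[str]:
        return self._owners.get(path)


def try_become_leader(ns: ToyNamespace, session: str) -> bool:
    # Whoever creates the znode first holds the "lock" and is the leader.
    try:
        ns.create_ephemeral("/leader", session)
        return True
    except NodeExistsError:
        return False


ns = ToyNamespace()
results = [try_become_leader(ns, s) for s in ("a", "b", "c")]
ns.close_session("a")               # the leader fails
retry = try_become_leader(ns, "b")  # a remaining node wins the new race
```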
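The cluster shares one parent znode, and each node in the cluster creates its own znode under the parent.
When the Leader node fails, the cluster node holding the lowest-numbered znode becomes the "judge" that decides the election, so use ephemeral_sequential znodes.
Any data can be written to the znodes, as the business requires, for example the transaction ID of the latest data the node currently stores.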
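One way the judge scheme could work is sketched below. The election rule used here (highest stored transaction ID wins, ties broken by the smaller sequence number) is an assumption chosen for illustration, not a rule stated in these notes; `judge_election` and the sample znode names are likewise hypothetical.

```python
from typing import Dict, Tuple


def sequence_number(child_name: str) -> int:
    # Sequential znode names end with a 10-digit zero-padded counter.
    return int(child_name[-10:])


def judge_election(candidates: Dict[str, int]) -> Tuple[str, str]:
    """Map of znode name -> latest transaction ID; returns (judge, leader)."""
    if not candidates:
        raise ValueError("no candidates: the parent znode has no children")
    # The holder of the lowest-numbered znode acts as the judge.
    judge = min(candidates, key=sequence_number)
    # Assumed rule: the most up-to-date node wins; ties go to the
    # candidate with the smaller sequence number.
    leader = min(candidates, key=lambda n: (-candidates[n], sequence_number(n)))
    return judge, leader


candidates = {
    "node-0000000004": 120,
    "node-0000000002": 95,
    "node-0000000009": 120,
}
judge, leader = judge_election(candidates)
```

Note that the judge and the elected leader need not be the same node: here the judge holds the lowest-numbered znode but has stale data, so it selects another node.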
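Election scheme | Implementation complexity | Election flexibility | Use case |
---|---|---|---|
Smallest node wins | Low | Low | Compute clusters |
Race to create a unique node | Low | Low | Compute clusters |
Judge's decision | High | High; complex election algorithms and rules can be designed to meet business requirements | Storage clusters |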