A very important feature of ZooKeeper is the "single system image". It is a guarantee that a client sees the same view of the system no matter which server it connects to. Once a client has seen newer data, it will not be shown older data.
Let's try to understand what exactly a single system image is in ZooKeeper with an interaction diagram.
Let us take a simple example of an ensemble. An ensemble is a group of machines participating in ZooKeeper in order to provide high availability. In this example, the ensemble has a total of four machines: three followers and one leader.
On top of every node, what you see is a zxid, or ZooKeeper transaction ID. It is essentially a number that represents how recent the content on a particular node is. The node that got the latest update has the highest zxid. This ID keeps increasing as we update the content. Using this ID, we can quickly identify which machines have fallen behind and which have caught up with the leader. The leader is always a node with the highest zxid. There might be followers that have the same zxid as the leader.
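To make the comparison concrete, here is a minimal sketch of how zxids can be used to spot nodes that are behind the leader. The node names, the dictionary layout and the value 11 are assumptions made for illustration; this is not ZooKeeper's own code.

```python
# Hypothetical snapshot of the ensemble: node name -> zxid.
# The names and the value 11 are illustrative, not read from a real ensemble.
zxids = {"leader": 11, "follower1": 11, "follower2": 11, "follower3": 10}


def lagging_nodes(zxids, leader="leader"):
    """Return the names of nodes whose zxid is lower than the leader's."""
    leader_zxid = zxids[leader]
    return sorted(name for name, zxid in zxids.items() if zxid < leader_zxid)


print(lagging_nodes(zxids))  # prints ['follower3']
```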
When a client sends a write request to a follower, the request is forwarded to the leader. In the diagram, T1 represents the write request.
Examples of a write request are creating a znode, deleting a znode, or updating the data of a znode.
After receiving the request, the leader saves the data and also sends the update to all followers. Whenever a node saves the data, its zxid increases. Here T2, T3 and T4 represent the updates sent by the leader to the followers.
Afterwards, when it is confirmed that a majority of the nodes have saved the data, the confirmation goes to the client. Since there are four nodes in total, a majority means three nodes. Had there been three nodes, the majority would be two. Here follower 1, follower 2 and the leader have saved the data, so the client gets the confirmation that the write request was successful. Please note that the leader has not yet received an acknowledgment from follower 3. This could be for many reasons, such as a network outage, or maybe follower 3 has some software or hardware issue. Notice that in the diagram the zxid of follower 1, follower 2 and the leader has increased, but follower 3 is still at a zxid of 10.
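The majority arithmetic above can be written down in a few lines. This is only a sketch; the function names `majority` and `write_confirmed` are made up for this example.

```python
def majority(ensemble_size):
    """Smallest number of nodes that is more than half of the ensemble."""
    return ensemble_size // 2 + 1


def write_confirmed(saved_count, ensemble_size):
    """True once enough nodes (leader included) have saved the update."""
    return saved_count >= majority(ensemble_size)


print(majority(4))            # prints 3
print(majority(3))            # prints 2
print(write_confirmed(3, 4))  # prints True: leader, follower 1 and follower 2
print(write_confirmed(2, 4))  # prints False: two out of four is not a majority
```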
At this point, the client is allowed to read the data from follower 1, follower 2 or the leader.
But the client is not allowed to read data from follower 3, because follower 3 is still lagging behind and would return older data. Notice that the zxid at the top of the follower 3 block still equals 10. I hope this clears up the single system image guarantee of ZooKeeper.
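The read rule can be pictured as a simple comparison between the newest zxid the client has already seen and the zxid of the server it wants to read from. The sketch below is a simplified model under that assumption; the function name `can_serve` and the value 11 for the client are illustrative and are not taken from ZooKeeper's implementation.

```python
def can_serve(client_last_zxid, server_zxid):
    """A server may serve the client only if it is at least as up to date
    as the newest state the client has already seen."""
    return server_zxid >= client_last_zxid


# Assumption: after the confirmed write, the client has seen zxid 11.
client_last_zxid = 11
servers = {"leader": 11, "follower1": 11, "follower2": 11, "follower3": 10}

for name, zxid in servers.items():
    print(name, can_serve(client_last_zxid, zxid))
```

With these values, only follower 3 comes out as False, which matches the diagram.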