Notes: https://pdos.csail.mit.edu/6.824/notes/l-zookeeper.txt
Video: https://youtu.be/HYTDDLo2vSE
Prep:
-
https://pdos.csail.mit.edu/6.824/papers/zookeeper.pdf
ZooKeeper- Wait-free coordination for Internet-scale systems
Widely-used system
high perf
async
no strong consistnecy
coordination servers
Replicated Sate Machine

https://excalidraw.com/#json=Wplfj28TgFAdWqlX4_D8N,w0yYEyjqR9zSRF7gn8quzg
ZAB (ZK Atomic Broadcast) is like Raft

https://excalidraw.com/#json=tODZrJXsJk1ZNFxWyk6U6,Kcfa-tyvRrqlbpskPaIE-w
leader receives operation put from client, async-ly sends to followers
once quorum is met, operation is successful
how many puts per sec?
1 network roundtrip, Leader to Followers
~1 ms (across same region DataCenters)
2 writes, one at followers (overlapping time considering async), one at leader
~2ms for 1 write to SSD
~4ms for 2 writes
total ~5ms for one op ⇒ 200 puts per sec
ZK’s Throughput
21 ops/s with 0% of reads ⇒ all are writes
-
async + batched
- 1 write for entire batch result
-
reads can be processed by any server (not only leader)
This violates linearizability

Stale & Past Reads
Clients can read stale values if they connected to stale follower
at the same time, if a client is connected to updated follower can read up to date value, and in next session, if connected to a stale follower, it’ll read past values
Linearizability
distributed system behaves as a single machine
- total order of operations
- order matches real-time
- read op returns value of latest write
ZK provides
- linearizable writes but not reads
- FIFO client order
- writes in client order
- observe last write from same client when reading
- prefix of log when other clients
client req.s will have zxid which is like log index
zxid helps determine if data is up to date if not server will wait until it’s committed till there