Notes: https://pdos.csail.mit.edu/6.824/notes/l-raft.txt

Video: https://youtu.be/h3JiQ_lnkE8

Prep:


Log Divergence (resumed)

rules

  • any new leader needs votes from a majority, and every majority includes at least one node that took part in the previous leader’s majority
  • a follower only grants its vote when the candidate’s log is at least as up to date as its own
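
A minimal sketch of the "at least as up to date" comparison, following the rule in the Raft paper (higher last term wins; on a tie, the longer log wins). The function name and the plain `int` parameters are assumptions made for illustration, not the lab's API.

```go
package main

// candidateIsUpToDate reports whether the candidate's log is at least as
// up to date as the voter's own log. Hypothetical helper for illustration.
func candidateIsUpToDate(candLastTerm, candLastIndex, myLastTerm, myLastIndex int) bool {
	// Different last terms: the log whose last entry has the higher term wins.
	if candLastTerm != myLastTerm {
		return candLastTerm > myLastTerm
	}
	// Same last term: the longer (or equally long) log wins.
	return candLastIndex >= myLastIndex
}
```

A voter would grant its vote only when this returns true (and it has not already voted for someone else in the same term).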

Log Catchup

image 24.png

https://excalidraw.com/#json=nvZm5sB8LpLhg9mNZ1ZwT,TogbNdeWHBa5Et2WhTqibg

In the unoptimized version, the leader keeps two indices per follower:

  • nextIndex is optimistic: it assumes followers are up to date, so it starts just past the leader’s last log index
    • when the leader learns it’s wrong (the follower rejects), it decrements
  • matchIndex is pessimistic: it assumes followers are not up to date (don’t even have any entries), so it starts at 0
    • when the leader learns the follower matches, it is updated to the highest index known to be replicated on that follower
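
The bookkeeping above can be sketched as follows. This is an illustrative sketch only: the `leaderState` type, its method names, and the `sentUpTo` parameter (the index of the last entry included in the request) are assumptions, and log indices are taken to start at 1.

```go
package main

// leaderState holds the per-follower indices a leader tracks.
// Hypothetical type for illustration.
type leaderState struct {
	nextIndex  []int // optimistic: next entry to send to each follower
	matchIndex []int // pessimistic: highest entry known to be replicated
}

// newLeaderState initializes nextIndex just past the leader's last entry
// and matchIndex to 0 for every peer.
func newLeaderState(peers, lastLogIndex int) *leaderState {
	ls := &leaderState{
		nextIndex:  make([]int, peers),
		matchIndex: make([]int, peers),
	}
	for i := range ls.nextIndex {
		ls.nextIndex[i] = lastLogIndex + 1
	}
	return ls
}

// onAppendReply updates the indices for one follower after a reply.
func (ls *leaderState) onAppendReply(peer int, success bool, sentUpTo int) {
	if success {
		// Never move matchIndex backwards on a stale reply.
		if sentUpTo > ls.matchIndex[peer] {
			ls.matchIndex[peer] = sentUpTo
		}
		ls.nextIndex[peer] = ls.matchIndex[peer] + 1
		return
	}
	// Rejected: back up one entry and retry (unoptimized version).
	if ls.nextIndex[peer] > 1 {
		ls.nextIndex[peer]--
	}
}
```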

Erasing Log Entries

entries from earlier terms can only be committed after the leader has committed at least one entry in its own term
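
One way to read this rule, following the commit condition in the Raft paper: the leader advances its commit index only to an entry that is stored on a majority and that carries the leader's current term; older entries then become committed along with it. The sketch below assumes `logTerms[i]` is the term of the entry at index `i` (index 0 is a placeholder) and that `matchIndex` includes the leader's own last index. All names are hypothetical.

```go
package main

import "sort"

// advanceCommitIndex returns the new commit index for a leader.
// Hypothetical helper for illustration.
func advanceCommitIndex(logTerms, matchIndex []int, currentTerm, commitIndex int) int {
	sorted := append([]int(nil), matchIndex...)
	sort.Ints(sorted)
	// With the slice sorted ascending, the element at (n-1)/2 is the
	// highest index stored on a majority of servers.
	n := sorted[(len(sorted)-1)/2]
	// Only commit by counting replicas if the entry is from the current term.
	if n > commitIndex && logTerms[n] == currentTerm {
		return n
	}
	return commitIndex
}
```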

Catch Up quickly

In the optimized version, instead of the follower rejecting repeatedly while the leader decrements nextIndex one entry at a time,

the follower replies with the index at which the conflict starts. The leader then sends entries from that index.
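
A sketch of the follower side of this optimization, under stated assumptions: the hint is the first index of the conflicting term, or the follower's log length when its log is too short. The `Entry` type, the function name, and the placeholder entry at index 0 are hypothetical; the lecture notes describe more than one variant of the hint.

```go
package main

// Entry is a log entry; only the term matters for conflict detection.
type Entry struct{ Term int }

// conflictHint checks the leader's (prevLogIndex, prevLogTerm) against the
// follower's log. It returns ok=true on a match; otherwise it returns the
// index the leader should resend from. log[0] is a placeholder entry.
func conflictHint(log []Entry, prevLogIndex, prevLogTerm int) (bool, int) {
	if prevLogIndex >= len(log) {
		// Follower's log is too short: resend from its end.
		return false, len(log)
	}
	if log[prevLogIndex].Term == prevLogTerm {
		return true, 0
	}
	// Walk back to the first entry of the conflicting term.
	t := log[prevLogIndex].Term
	i := prevLogIndex
	for i > 1 && log[i-1].Term == t {
		i--
	}
	return false, i
}
```

On a rejection the leader would set its nextIndex for that follower to the returned hint instead of decrementing by one.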

Persistence

strategies

  • rejoins ⇒ replay log
  • start from persistent state
    • votedFor, log[], currentTerm (all kept on disk)
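
A minimal sketch of saving and restoring that persistent state, assuming the standard library's `encoding/gob` for serialization (the course labs use their own wrapper, so this is a stand-in). The struct and function names are hypothetical, and writing the bytes to stable storage is left out.

```go
package main

import (
	"bytes"
	"encoding/gob"
)

// LogEntry is one replicated command with the term it was created in.
type LogEntry struct {
	Term    int
	Command string
}

// PersistentState is what must survive a crash and restart.
type PersistentState struct {
	CurrentTerm int
	VotedFor    int // -1 means no vote cast in CurrentTerm
	Log         []LogEntry
}

// encodeState serializes the state so it can be written to disk.
func encodeState(s PersistentState) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(s); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeState rebuilds the state from bytes read back after a restart.
func decodeState(data []byte) (PersistentState, error) {
	var s PersistentState
	err := gob.NewDecoder(bytes.NewReader(data)).Decode(&s)
	return s, err
}
```

The state would need to be saved before replying to any RPC that changed it, so that a restarted node never contradicts a promise it made earlier.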

Service Recovery

  1. replay log ⇒ recreate state
  2. snapshot
    1. contains the effect of all ops up to some index i; the log is then truncated up to and including i
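
A small sketch of the truncation step, assuming each entry carries its own absolute index so the log can be cut without losing track of positions. The `Entry` type and `compact` name are hypothetical.

```go
package main

// Entry is a log entry tagged with its absolute position in the log.
type Entry struct {
	Index int
	Term  int
	Op    string
}

// compact drops every entry already covered by a snapshot taken at
// snapshotIndex and returns the remaining tail of the log.
func compact(log []Entry, snapshotIndex int) []Entry {
	// Copy into a fresh slice so the discarded prefix can be garbage collected.
	kept := make([]Entry, 0, len(log))
	for _, e := range log {
		if e.Index > snapshotIndex {
			kept = append(kept, e)
		}
	}
	return kept
}
```

On recovery, the service would load the snapshot first and then replay only the entries that remain in the log.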

Using Raft

image 1 16.png

https://excalidraw.com/#json=dV42j4bd8ElJLwCHmKNsW,vmBS9ae4VN2VSpGW5OAHXg

A clerk acts as a middleman between clients and the service: clients communicate with the service through the clerk.

The clerk maintains IDs for operations and cluster membership details.

The clerk is part of the client.
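
A rough sketch of what such a clerk might hold, assuming operation IDs are a (client ID, sequence number) pair and membership is a list of server addresses tried in rotation. Every name here is hypothetical, and the RPC itself is omitted.

```go
package main

// Op is a client request tagged with an ID the service can use to
// recognize retries of the same operation.
type Op struct {
	ClientID int64
	Seq      int64
	Key      string
	Value    string
}

// Clerk lives inside the client and hides the cluster from it.
type Clerk struct {
	servers  []string // cluster membership
	clientID int64
	nextSeq  int64
	current  int // index of the server to try next
}

// NewClerk creates a clerk for the given cluster.
func NewClerk(servers []string, clientID int64) *Clerk {
	return &Clerk{servers: servers, clientID: clientID}
}

// NewOp stamps a request with a fresh, increasing sequence number.
func (ck *Clerk) NewOp(key, value string) Op {
	ck.nextSeq++
	return Op{ClientID: ck.clientID, Seq: ck.nextSeq, Key: key, Value: value}
}

// NextServer returns the server to try and rotates to the following one,
// which is useful when the contacted server turns out not to be the leader.
func (ck *Clerk) NextServer() string {
	s := ck.servers[ck.current]
	ck.current = (ck.current + 1) % len(ck.servers)
	return s
}
```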

Linearizability

  1. total order of execution

  2. match real time

  3. read returns results of last write

difference from serializability: serializability doesn’t have to match real time (transactions?)