replication

10 Dec, 2024

Some notes from reading DDIA's replication chapter, the chapter kicked off with terms that I didn't understand like "read-your-writes" and "monotonic reads guarantees", and that was enough motivation to read the chapter and figure out what they ment!

leader based replication, examples include:
- message brokers
- distributed file systems (whats the usecase of these? sounds interesting)
- replicated block devices (not sure what these are)
questions/problems with leader base replication
- how do we know the write actually got to the leader
- what happens when leader goes out?
asynchronous replication is widely used
- despite weak durability (leader dies, writes are loss cause they may not get broadcast-ed and replicated)
  - "chain replication" (synchronous replication variant but addresses durability problem)
- pro is leader can continuously process writes
WAL shipping (don't really get this, should reread)
row based log replication is what we did for ocaml_db?
"triggers" and "stored procedures"
- triggers allow you to run code on the database system when performing a write/modification
  - for now i'm thinking of this as the tee bash command, we can redirect part of the information elsewhere
"read-your-writes" (read-after-write consistency)
- users read the changes they submitted
monotonic reads
- when you read data you can read not the latest data, but guarantees that you won't read less stale data
- in other words, you can't go back in time and read old data after reading newer datag
- one way to do this is to make each user always read from the same replica
  - upon replica failure, we have to reroute users to a new replica (might cause problems here, such as reading stale data)
"consistent prefix reads"
- guarantees that if writes happen in a particular order, anyone reading the writes will see them in that order
- typically needed in partitions/sharded databases (where writes could be out of order due to lag of partition updates)
  - when do we want to use partitions (industry practice to use them) ?
when working with eventually consistent systems, its good to ask how the application behaves if lag increased (to minutes or hours)
multi-leader replication's major drawback is conflict resolution (how to resolve write conflicts)
- offline mode with each device being a leader, and once it's online device data then replicated (CouchDB designed for this)
- avoiding conflicts (force writes to go through a single leader) is the simplest solution
- last write wins is another strategy (prone to data loss)
- conflict resolution research
  - crdt (two-way merge function)
  - mergeable persistent data structures (three-way merge function)
  - operational transformation (used by collaborative editors like google docs)
there are many multi-leader topologies with different trade-offs
- circular and star - single point of failure could halt data propogating to other replicas
- all-to-all - could have race conditions resulting in conflicts between replicas
  - version vectors (use this instead of timestamps/clocks to determine order)
conflict detection is hard and its best to thoroughly read your db's documentation and test your db to make sure it provides the guarantees you want
leaderless replication (became popular again because of dynamo, and is often called dynamo-style (don't confuse this with dynamodb which is single leader))
- typically defined by two mechanisms (not all dynamo systems implement both)
  - read repair - detect stale data (typically with version number) and writes new data to replica with old data
  - anti-entropy process - background process that monitors for inconsistent or missing data (does not copy writes in any particular order and may have long delays)
quorum read and writes (n - the number of nodes; r - number of reads to confirm; w - number of writes to confirm)
- lower numbers for r/w means lower latency and higher availability, but also higher chance of stale data
- important to monitor staleness in dynamo systems (harder than leader systems)
  - leader systems can track follower logs vs leader logs to see how far they lag
  - leaderless systems there is no write order making it hard to track lag (Note: research area)
  - sloppy quorums - optional in most dynamo implementations to increase write availability (makes system more durable)
    - when you lose access to nodes that normally hold the data, but you have access to other nodes, temporarily write to those nodes
    - when you have access to the correct nodes again, use "hinted handoff"
concurrent write detection using version numbers (bit confused on this)
extend version numbers to version vectors (vector clocks)