CRDT – CONFLICT‐FREE REPLICATED DATA TYPES
Distributed Systems (Hans‐Arno Jacobsen) 1
Pixabay.com
CRDTs Units
• Eventual consistency, informally
• State‐based objects
• Eventual consistency, more formally • Conflict‐free replicated data types
Distributed Systems (Hans‐Arno Jacobsen) 2
EVENTUAL CONSISTENCY, INFORMALLY
Distributed Systems (Hans‐Arno Jacobsen)
3
Pixabay.com
Eventual Consistency
• Eventualconsistencyisdesirableforlarge‐scale distributed systems where high availability is important
• Tendstobecheaptoimplement(e.g.,viagossip)but may serve stale data
• Constitutesachallengeforenvironmentswhere stronger consistency is important
Distributed Systems (Hans‐Arno Jacobsen) 4
Handling Concurrent Writes
• Premise for eventual consistency were scenarios with few (no) concurrent writes to the same key (cf. client‐centric consistency)
• However, we do need a mechanism to handle concurrent writes should they so happen
• If there were a way to handle concurrent writes, we could support eventual consistency more broadly
• Would “only” need to guarantee that after processing all writes for a key, all replicas converge, no matter what order the writes are processed (e.g., assuming gossip)
Distributed Systems (Hans‐Arno Jacobsen) 5
Max register L1: 0 W(4) (4) W(2) (4)
Examples
Growth‐only counter (G‐counter)
L1: 0 W(+5) (5) W(+2) (7) W(+1) 8
L2: 0 W(+2) (2) W(+5) (7) W(+1) 8 Writes propagate to L2, L1, respectively
✔
Different locations (replicas)
merge(5) 5 L2: 0 W(5) (5) W(3) (5) merge(4) ✔
State propagate to L2, L1 via periodic merging
Distributed Systems (Hans‐Arno Jacobsen)
6
5
Self‐study Questions
• Think of a few basic data structures, like lists, sets, counters, binary trees, heaps, maps, etc., and visualize for yourself what happens if replicated instances of these structures are updated via gossip.
• Does their state converge, no matter the update sequence?
• What happens if update operations are lost or duplicated?
• What mechanisms we know other than gossip could be used to keep these replicated structures updated without violating their convergence.
• What are pros and cons of these mechanisms?
Distributed Systems (Hans‐Arno Jacobsen) 7
Distributed Systems (Hans‐Arno Jacobsen) 8
CRDT – FROM STATE‐BASED OBJECTS TO REPLICATED STATE‐BASED OBJECTS
Distributed Systems (Hans‐Arno Jacobsen)
9
Pixabay.com
State‐based objects Mostly plain old objects
• Offerupdateandqueryrequeststoclients
• Maintaininternalstate
• Processclientrequests
• Perform merge requests amongst each other • Periodicallymerge(supportinfrastructure)
Distributed Systems (Hans‐Arno Jacobsen) 10
State‐based Object
• What we commonly know as object • Comprised of
– Internal state
– One or more query methods – One or more update methods – A merge method
Distributed Systems (Hans‐Arno Jacobsen) 11
class Avg(object): def __init__(self):
def update(self, x): self.sum += x self.cnt += 1
self.sum = 0
self.cnt = 0
def query(self): if self.cnt != 0:
def merge(self, avg): self.sum += avg.sum self.cnt += avg.cnt
return
else: return 0
self.sum /
self.cnt
Class Average Running Example
Distributed Systems (Hans‐Arno Jacobsen) 12
Average
State‐based object representing a running average
• Internalstate
– self.sum and self.cnt
• Query returns average
• Update updates average with a new value x
• Merge merges one Avg instance into another one
Distributed Systems (Hans‐Arno Jacobsen) 13
Replicated State‐based Object
• State‐based object replicated across multiple nodes
• E.g., replicate Avg across two nodes
• Both nodes have a copy of state‐based object
• Clients send query and update to a single node
• Nodes periodically send their copy of state‐based object to other nodes for merging
Distributed Systems (Hans‐Arno Jacobsen) 14
Node a
a0 Timeline Update
Unique
Causal history based on operation identifiers
operation identifier
Each state represents a snapshot of object in time that results from updates applied
state
query
history
State
a1
a0 sum:0, cnt:0
0
a1 sum:1, cnt:1
1 0
a2 sum:4, cnt:2 Distributed Systems (Hans‐Arno Jacobsen)
2 0,1
State
15
Operation identifier is unique across replicas
Each state represents a snapshot of object in time that results from updates applied
state
query
history
Timeline
a0 sum:0, cnt:0
0
a1 sum:1, cnt:1
1 0
a2 sum:4, cnt:2
2 0,1
Distributed Systems (Hans‐Arno Jacobsen)
16
States and Causal Histories
If y = x.update(…) where the update has identifier i, then the causal history of y is the causal history of x union { i }.
a0 sum:0, a1 sum:1, b0 sum:0, b1 sum:2, b2 sum:6,
cnt:0 cnt:1 cnt:0 cnt:1 cnt:2
0 1 0 2 3
{} {0} {} {1} {1,2}
Distributed Systems (Hans‐Arno Jacobsen)
17
state
query ()
history
a0 sum:0,
cnt:0 cnt:1 cnt:0 cnt:1 cnt:2
0{} 2 {0} 0{} 4 {1}
a1 sum:2,
b0 sum:0,
b1 sum:4,
update 2 update 4
state
query () history
Merge
a2 sum:6,
Distributed Systems (Hans‐Arno Jacobsen)
3 {0,1}
18
Nodes Periodically Propagate Their State
Distributed Systems (Hans‐Arno Jacobsen) 19
Self‐study Questions
• Think of a few basic data structures, like lists, sets, counters, binary trees, heaps, maps, etc., and visualize for yourself what happens if replicated instances of these structures are updated via gossip.
• For the above data structures, specify merge operations that merge the state of two instances of a given structure.
• Assume merge happens periodically, does your replicated structures’ state converge?
Distributed Systems (Hans‐Arno Jacobsen) 20
Distributed Systems (Hans‐Arno Jacobsen) 21
CRDT –
EVENTUAL CONSISTENCY, MORE FORMALLY
Distributed Systems (Hans‐Arno Jacobsen)
22
Pixabay.com
Eventual Consistency (EC)
• A replicated state‐based object is
– eventually consistent if whenever two replicas of the state‐based object have the same causal history, they eventually (not necessarily immediately) converge to the same internal state
Distributed Systems (Hans‐Arno Jacobsen) 23
Strong Eventual Consistency (SEC)
• A replicated state‐based object is
– strongly eventually consistent if whenever two replicas of the state‐based object have the same causal history, they (immediately) have the same internal state
• Strong eventual consistency implies eventual consistency
Distributed Systems (Hans‐Arno Jacobsen) 24
– NoMergeAverage – BMergeAverage – MaxAverage
EC or SEC
That is the question?
• Variants of our Average object, defined next – Average
• Note that some of these objects do not represent realistic functionality (i.e., needed functionality)
• These objects are meant to illustrate convergence concepts only
Distributed Systems (Hans‐Arno Jacobsen) 25
Average
a, b attain the same causal history but do not converge to the same internal state – they do not converge at all!
state
query history
Neither eventually
b0 sum:0, cnt:0 b1 sum:4, cnt:1
0
consistent, nor
4