Vogels-CACM 2009

Eventually Consistent

Vogels

database replication consistency data management gossip

@article{vogels:cacm2009,
  title="Eventually Consistent",
  author="Werner Vogels",
  journal="Communications of the ACM",
  volume="52",
  number="1",
  month="January",
  year="2009",
  pages="40--44"
}

Internet infrastructure services require high security, scalability, availability, performance, and cost-effectiveness while serving trillions of requests from around the globe

  • With that many requests, even very low-probability failures are bound to happen; at one-in-a-billion odds, a trillion requests still produce about a thousand failures

Per the CAP theorem, systems can meet only two of three properties: data consistency, system availability, and tolerance to network partitions

  • Developers need to decide priorities for their system and trade off consistency, durability, availability, performance, and cost

Traditional, ideal consistency model: All users see all updates as soon as they are made

  • Provides distribution transparency
  • Many systems would rather fail entirely than give up that guarantee

In other systems, availability is more important than consistency

From the client's perspective, several consistency models exist:

  • Strong consistency: After an update completes, any subsequent read (from any node) returns the updated value
  • Weak consistency: Reads may return stale data until certain conditions are met, e.g., enough time passes; the period until then is the inconsistency window
  • Eventual consistency: Specific form of weak consistency; if no new updates are made, all nodes eventually return the most up-to-date value
  • Causal consistency: If a process informs another of an update, the second process will retrieve the new value, and its subsequent writes supersede the earlier one
  • Read-your-writes: Special case of causal; after making an update, the client always retrieves that value (or a newer one)
  • Session consistency: Within a continued interaction (session), the client always retrieves its most recent updates
  • Monotonic reads: After seeing a particular value, the client never retrieves an older value
  • Monotonic writes: Updates from one client are serialized in the order they were issued

Not providing monotonic reads and read-your-writes generally makes systems very unmanageable
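
A minimal sketch of how a client can get monotonic reads and read-your-writes on top of an eventually consistent store: remember the highest version seen per key and refuse older answers. The Replica and SessionClient classes and the per-client version counter are illustrative assumptions, not an API from the paper:

class Replica:
    """Toy replica; a real system would sync these via anti-entropy."""
    def __init__(self):
        self.data = {}                      # key -> (value, version)

    def put(self, key, value, version):
        cur = self.data.get(key)
        if cur is None or version > cur[1]:
            self.data[key] = (value, version)

    def get(self, key):
        return self.data.get(key, (None, 0))

class SessionClient:
    def __init__(self, replicas):
        self.replicas = replicas
        self.seen = {}                      # key -> highest version observed
        self.clock = 0                      # toy version counter for writes

    def put(self, key, value):
        self.clock += 1
        self.replicas[0].put(key, value, self.clock)   # one replica; rest lag
        self.seen[key] = self.clock         # remember our own write

    def get(self, key):
        # Rejecting replicas older than anything this session has seen
        # yields monotonic reads; remembering our own writes in put()
        # yields read-your-writes.
        for r in self.replicas:
            value, version = r.get(key)
            if version >= self.seen.get(key, 0):
                self.seen[key] = version
                return value
        raise RuntimeError("no replica fresh enough for this session")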

The server side is characterized by several parameters:

  • N: Number of nodes storing replicas; W: Number of replicas that must acknowledge a write before it completes; R: Number of replicas contacted for a read
  • If W+R > N, the write set and read set overlap and strong consistency is provided (see the sketch after this list)
    • But if W is too high, writes fail whenever too few replicas are reachable
  • Systems focusing on fault tolerance usually use N=3, W=2, R=2
  • Systems built for high performance scale N as necessary
    • R usually set to 1
    • W sometimes set to 1 as well: the write is durable on one node, albeit with some risk of loss, and a lazy/epidemic (gossip) process disseminates it to the other replicas over time
  • If W < (N+1)/2, write sets may not overlap, so writes may conflict
  • Eventual consistency occurs when W+R <= N
    • Setting R=1 (and thus giving up the quorum overlap) only makes sense if the weaker guarantee is intentional
    • Happens when scaling for fast reads, or when data access is not trivial, e.g., returning sets of objects
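
To make the quorum arithmetic concrete, here is a small sketch (my formulation of the rules above, not code from the paper) that classifies an (N, W, R) configuration:

def classify(N, W, R):
    """Apply the overlap rules above: W+R > N and W >= (N+1)/2."""
    notes = []
    if W + R > N:
        notes.append("read/write sets overlap: strong consistency")
    else:
        notes.append("reads can miss writes: eventual consistency")
    if W < (N + 1) / 2:
        notes.append("write sets may not overlap: conflicting writes possible")
    return notes

for n, w, r in [(3, 2, 2),    # common fault-tolerant configuration
                (10, 10, 1),  # fast reads: R=1, but a write needs every node
                (10, 1, 1)]:  # fast, risky writes plus lazy propagation
    print((n, w, r), classify(n, w, r))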

How well eventual consistency works in practice depends a lot on client "stickiness"

  • If a client generally keeps hitting the same server, things probably work out OK
  • Sessions make this connection explicit; a sticky-routing sketch follows this list
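
One common way to get that stickiness is to route every request from a session to the same replica by hashing a session ID. This is a hypothetical sketch, not a mechanism from the paper; pick_server, the server names, and the session ID format are all illustrative:

import hashlib

def pick_server(session_id, servers):
    # Hash the session ID so the same session always lands on the same server
    h = int(hashlib.sha1(session_id.encode()).hexdigest(), 16)
    return servers[h % len(servers)]

servers = ["node-a", "node-b", "node-c"]
print(pick_server("session-42", servers))   # stable across repeated calls

A real deployment would use consistent hashing instead, so that adding or removing a server only remaps a fraction of the sessions.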

Of note:

  • DeCandia et al. Dynamo: Amazon's Highly Available Key-value Store. SOSP 2007.
  • Gilbert and Lynch. Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services. SIGACT News, 2002.