Tuesday, January 5, 2016

Jepsen fails to bust RethinkDB

Kyle Kingsbury's remarkable Jepsen tool has a correspondingly remarkable track record of success.

Time and again, when the tool is turned loose on one or another data management system, it finds holes, bugs, inaccuracies, crashes.

Recently, though, Kingsbury turned Jepsen's laser beam on RethinkDB, and the results are different this time:

As far as I can ascertain, RethinkDB’s safety claims are accurate. You can lose updates if you write with anything less than majority, and see assorted read anomalies with single or outdated reads, but majority/majority appears linearizable.

Rethink’s defaults prevent lost updates (offering linearizable writes, compare-and-set, etc), but do allow dirty and stale reads. In many cases this is a fine tradeoff to make, and significantly improves read latency. On the other hand, dirty and stale reads create the potential for lost updates in non-transactional read-modify-write cycles. If one, say, renders a web page for a user based on dirty reads, the user could take action based on that invalid view of the world, and cause invalid data to be written back to the database. Similarly, programs which hand off state to one another through RethinkDB could lose or corrupt state by allowing stale reads. Beware of sidechannels.

Where these anomalies matter, RethinkDB users should use majority reads. There is no significant availability impact to choosing majority reads, though latencies rise significantly. Conversely, if read availability, latency, or throughput are paramount, you can use outdated reads with essentially the same safety guarantees as single–though you’ll likely see continuous, rather than occasional, read anomalies.
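To make the read-modify-write hazard in that passage concrete, here is a minimal sketch in plain Python. It is a toy simulation, not RethinkDB's API: `ToyStore`, its lagging `replica`, and both increment helpers are hypothetical names invented for illustration. It shows a counter increment that silently loses an update when it is based on a stale read, and the same increment guarded by compare-and-set, which notices the stale view and retries.

```python
class ToyStore:
    """Hypothetical single-key store: an authoritative value plus a lagging replica."""

    def __init__(self, value):
        self.primary = value
        self.replica = value  # only catches up when sync() is called

    def read(self, stale=False):
        # A "stale" read is served from the replica, which may be out of date.
        return self.replica if stale else self.primary

    def write(self, value):
        # Unconditional write: overwrites whatever is currently stored.
        self.primary = value

    def compare_and_set(self, expected, new):
        # Write only if the authoritative value is still what the caller saw.
        if self.primary != expected:
            return False
        self.primary = new
        return True

    def sync(self):
        self.replica = self.primary


def increment_blind(store, stale):
    # Non-transactional read-modify-write: nothing checks that the read was current.
    store.write(store.read(stale=stale) + 1)


def increment_cas(store, stale):
    # Same cycle, but the write is conditional; on failure, re-read fresh and retry.
    seen = store.read(stale=stale)
    while not store.compare_and_set(seen, seen + 1):
        seen = store.read(stale=False)


def lost_update_demo():
    blind = ToyStore(0)
    increment_blind(blind, stale=False)  # primary becomes 1, replica still 0
    increment_blind(blind, stale=True)   # reads 0 from the replica, writes 1 again

    guarded = ToyStore(0)
    increment_cas(guarded, stale=False)  # primary becomes 1
    increment_cas(guarded, stale=True)   # CAS on the stale 0 fails, retry writes 2

    return blind.primary, guarded.primary


if __name__ == "__main__":
    print(lost_update_demo())
```

Two increments are applied in each case. The blind version ends at 1 because the second write is computed from an outdated value, while the guarded version ends at 2. This only illustrates the general pattern the quote warns about; how stale a real read can be depends on the read mode and the cluster's state.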

Although this sounds like nuanced, even faint, praise, by Jepsen's standards it is as strong an endorsement as the tool (and Kingsbury) has ever delivered, so this is certainly interesting.

Somewhat as an aside, while reading Kingsbury's article, I was interested to read about Gavin Lowe's Testing for Linearizability work, which I hadn't seen before. This, too, will be fun to dig into and absorb.
