
Lessons Learned from 2+ Years of Nightly Jepsen Tests

Blog post from Cockroach Labs

Post Details
Company: Cockroach Labs
Date Published:
Author: Ben Darnell
Word Count: 1,514
Language: English
Hacker News Points: 11
Summary

CockroachDB is tested nightly with Jepsen to check that the database stays correct in the presence of failures. Two years after the initial testing, those runs uncovered a bug in already-released code. The tests run in a complex environment that depends on the network and on cloud VMs, so many failures turned out not to be database bugs. The real bug surfaced in the register test with the split nemesis and pointed to an inconsistency in the handling of pipelined writes: a transaction incorrectly signaled completion, so its intents were resolved even though its processing was incomplete. After some initial misdirection, the investigation found that a simple code fix resolved this significant issue. The experience underscored both Jepsen's efficacy and its limitations as a testing tool, and the need for more sensitive test workloads that can catch such errors in the future.
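
The failure pattern described in the summary can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyTxn`, `pipeline_write`, `commit_buggy`, and `commit_checked` are hypothetical names invented for illustration, and none of this is CockroachDB's actual code or the actual fix. It shows why a transaction that reports completion without confirming its pipelined writes can look committed while a write is missing.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTxn:
    """Toy transaction with pipelined writes (hypothetical, not CockroachDB code)."""

    # Maps each written key to whether that write eventually succeeded.
    in_flight: dict = field(default_factory=dict)

    def pipeline_write(self, key: str, succeeded: bool) -> None:
        # The caller gets control back before the write's outcome is confirmed.
        self.in_flight[key] = succeeded

    def commit_buggy(self) -> str:
        # Bug pattern: signal completion without checking in-flight writes.
        return "committed"

    def commit_checked(self) -> str:
        # Safer pattern: every pipelined write must be confirmed before commit.
        failed = [k for k, ok in self.in_flight.items() if not ok]
        return "aborted" if failed else "committed"


txn = ToyTxn()
txn.pipeline_write("a", succeeded=True)
# Assumption for illustration: this write was lost, e.g. during a range split.
txn.pipeline_write("b", succeeded=False)
print(txn.commit_buggy())    # committed
print(txn.commit_checked())  # aborted
```

In this model, a register-style check that later reads key "b" would see a committed transaction whose write never landed, which is the kind of anomaly a more sensitive workload is meant to surface.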