Company
Date Published
Author
David Eisenstat
Word count
1293
Language
English
Hacker News points
None

Summary

CockroachDB has recently added support for Unicode collation, which lets strings be ordered according to the expectations of a particular language or culture, though it is not yet fully compatible with PostgreSQL in this regard. Implementing collated strings as a first-class type presents challenges, particularly in distinguishing string types by their collation locale and in storing collated strings as keys in SQL tables. By default, strings are ordered by their UTF-8 encoding; collation allows more nuanced orderings, such as ones that ignore case or punctuation. There are limitations, including certain string functions and operators that do not yet work with collated strings, owing to the complexity of the SQL type system and to performance considerations in Go. The current design prioritizes clarity and bug prevention, and leaves room for later changes that do not compromise backward compatibility. As the system evolves, developers are encouraged to report issues with collated strings on GitHub.
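
The contrast between the default UTF-8 order and a collated order can be illustrated with a small Python sketch. This is only a toy stand-in and not CockroachDB's implementation or the Unicode Collation Algorithm: the `toy_collation_key` function and the sample words are hypothetical, chosen to show how a derived sort key can ignore case and punctuation.

```python
import unicodedata


def toy_collation_key(s: str) -> str:
    """Hypothetical sort key: drop punctuation, then fold case.

    A real collation key comes from locale-specific Unicode collation
    rules; this simplification only illustrates the idea.
    """
    kept = (ch for ch in s if not unicodedata.category(ch).startswith("P"))
    return "".join(kept).casefold()


words = ["banana", "Cherry", "apple", "co-op", "Coop"]

# Default order: compare the UTF-8 bytes, so uppercase ASCII letters
# sort before lowercase ones.
byte_order = sorted(words, key=lambda s: s.encode("utf-8"))

# Collation-like order: compare the derived keys instead. sorted() is
# stable, so strings with equal keys keep their input order.
collated = sorted(words, key=toy_collation_key)

print(byte_order)  # ['Cherry', 'Coop', 'apple', 'banana', 'co-op']
print(collated)    # ['apple', 'banana', 'Cherry', 'co-op', 'Coop']
```

In this sketch "co-op" and "Coop" map to the same key, which is the kind of equivalence a collation that ignores case and punctuation would produce.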