Scalable, Distributed Secondary Indexing in ScyllaDB
Blog post from ScyllaDB
ScyllaDB and Apache Cassandra implement secondary indexes differently. Secondary indexes are what make it efficient to retrieve data by a column that is not part of the partition key. Apache Cassandra uses local indexing: the index is stored on the same node as the data it covers. ScyllaDB uses global indexing: for each index it creates a Materialized View whose partition key is the indexed column. An indexed query therefore runs in two parts: it first queries the index table to find the matching rows, then retrieves those rows from the indexed table. This improves read scalability, but writes become slower because of the overhead of maintaining the index view. Secondary indexes are generally transparent to applications, and they carry less storage overhead than Materialized Views while leaving the application free to query different columns. Which to use depends largely on the application's requirements: Materialized Views give the best performance when the set of queried columns is fixed, whereas secondary indexes adapt better to queries over varying sets of columns.
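The two-part read path and the write-side cost of maintaining the index can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions, not ScyllaDB's implementation: the class `IndexedTable` and its methods are hypothetical names, a plain dictionary stands in for the index view, and nothing here is distributed across nodes.

```python
from collections import defaultdict


class IndexedTable:
    """Toy model: a base table plus one index keyed by the indexed column.

    Hypothetical illustration only; real index views live in a distributed
    database, not in local dictionaries.
    """

    def __init__(self, indexed_column):
        self.indexed_column = indexed_column
        self.base = {}                  # primary key -> row (the indexed table)
        self.index = defaultdict(set)   # indexed value -> primary keys (the index table)

    def write(self, pk, row):
        # Extra work on every write: remove the stale index entry, if any...
        old = self.base.get(pk)
        if old is not None:
            old_value = old.get(self.indexed_column)
            if old_value is not None:
                self.index[old_value].discard(pk)
                if not self.index[old_value]:
                    del self.index[old_value]
        # ...then store the row and add the new index entry.
        self.base[pk] = dict(row)
        new_value = row.get(self.indexed_column)
        if new_value is not None:
            self.index[new_value].add(pk)

    def query_by_index(self, value):
        # Part 1: query the index table for the matching primary keys.
        pks = sorted(self.index.get(value, ()))
        # Part 2: retrieve the full rows from the indexed table.
        return [self.base[pk] for pk in pks]


if __name__ == "__main__":
    users = IndexedTable("city")
    users.write(1, {"name": "Ann", "city": "Oslo"})
    users.write(2, {"name": "Bob", "city": "Rome"})
    users.write(3, {"name": "Cy", "city": "Oslo"})
    print([row["name"] for row in users.query_by_index("Oslo")])
```

Note that the index holds only primary keys, while the rows themselves stay in the base table. In this model that is why the index needs less storage than a full copy of the data, and also why a read needs the second lookup.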