Richard Low's blog – Algorithms, distributed systems and other computer science things

The sweet spot for Cassandra secondary indexing

Secondary indexes

Secondary indexes have been in Cassandra since 0.7 and can be incredibly useful. For example, if you were implementing a user accounts database, you might have the schema

CREATE TABLE user_accounts (
    username text PRIMARY KEY,
    email text,
    password text,
    last_visited timestamp,
    country text
);

The only key you can look up on is… Continue reading The sweet spot for Cassandra secondary indexing
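As a rough illustration of what a secondary index buys you for a table like user_accounts, here is a toy in-memory model in plain Python. It is only a sketch of the concept, not how Cassandra implements indexes: the class name UserAccounts, its methods, and the choice of country as the indexed column are all assumptions made for this example.

```python
from collections import defaultdict


class UserAccounts:
    """Toy table keyed by username, with a hypothetical index on 'country'."""

    def __init__(self):
        # Primary-key lookup: username -> row (a dict of column values).
        self.rows = {}
        # Secondary index: country value -> set of usernames with that value.
        self.by_country = defaultdict(set)

    def upsert(self, username, **columns):
        """Insert or update a row, keeping the country index in sync."""
        old = self.rows.get(username)
        if old is not None and "country" in old:
            # Remove the stale index entry before the row changes.
            self.by_country[old["country"]].discard(username)
        row = dict(old or {}, **columns)
        self.rows[username] = row
        if "country" in row:
            self.by_country[row["country"]].add(username)

    def get(self, username):
        """Lookup by primary key."""
        return self.rows.get(username)

    def find_by_country(self, country):
        """Lookup by the indexed column; returns usernames in sorted order."""
        return sorted(self.by_country.get(country, ()))


accounts = UserAccounts()
accounts.upsert("alice", email="alice@example.com", country="UK")
accounts.upsert("bob", email="bob@example.com", country="US")
accounts.upsert("bob", country="UK")  # the update moves bob's index entry
print(accounts.find_by_country("UK"))
```

Without the by_country mapping, answering "which users are in the UK?" would mean scanning every row; the index trades extra work on each write for a direct lookup on read.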

Tombstoning Cassandra’s super columns

High-dimensional data

Cassandra’s native data model is two-dimensional: rows and columns. This is great for data that is naturally grouped together, e.g. the fields of a message. However, some uses require more dimensions – maybe you want to group your messages by recipient too, creating something like this: “alice”: { “ccd17c10-d200-11e2-b7f6-29cc17aeed4c”: { “sender”: “bob”, “sent”:… Continue reading Tombstoning Cassandra’s super columns

Counting keys in Cassandra

Today a colleague asked me: how can I find out how many keys I just inserted into Cassandra? You’d expect any half-decent database to be able to tell you how much stuff it has got. Cassandra being (somewhat better than) half-decent can, but there are many subtleties that are worth understanding. Firstly, a clarification on… Continue reading Counting keys in Cassandra

Welcome

Welcome to my blog! I’m going to be writing about some of my thoughts on distributed systems, algorithms and whatever else comes to me during my daily business. Please check back soon when I’ve written some up. Meanwhile, here is a picture of a monkey.