The random partitioner is implemented by org.apache.cassandra.dht.RandomPartitioner and is Cassandra’s default. It uses a BigIntegerToken with an MD5 hash applied to it to determine where to place the keys on the node ring. This has the advantage of spreading your keys evenly across your cluster, because the distribution is random. It has the disadvantage of causing inefficient range queries, because keys within a specified range might be placed in a variety of disparate locations in the ring, and key range queries will return data in an essentially random order.
The order-preserving partitioner is implemented by org.apache.cassandra.dht.OrderPreservingPartitioner, which implements IPartitioner<StringToken>. Using this type of partitioner, the token is a UTF-8 string, based on a key. Rows are therefore stored by key order, aligning the physical structure of the data with your sort order. Configuring your column family to use order-preserving partitioning (OPP) allows you to perform range slices.
It’s worth noting that OPP isn’t more efficient for range queries than random partitioning—it just provides ordering. It has the disadvantage of creating a ring that is potentially very lopsided, because real-world data typically is not written to evenly. As an example, consider the value assigned to letters in a Scrabble game. Q and Z are rarely used, so they get a high value. With OPP, you’ll likely eventually end up with lots of data on some nodes and much less data on other nodes. The nodes on which lots of data is stored, making the ring lopsided, are often referred to as “hot spots.” Because of the ordering aspect, users are commonly attracted to OPP early on. However, using OPP means that your operations team will need to manually rebalance nodes periodically using Nodetool’s loadbalance or move operations.
If you want to perform range queries from your clients, you must use an order-preserving partitioner or a collating order-preserving partitioner.
This partitioner orders keys according to a United States English locale (EN_US). Like OPP, it requires that the keys are UTF-8 strings. Although its name might imply that it extends the OPP, it doesn’t. Instead, this class extends AbstractByteOrderedPartitioner. This partitioner is rarely employed, as its usefulness is limited.
New for 0.7, the team added ByteOrderedPartitioner, which is an order-preserving partitioner that treats the data as raw bytes, instead of converting them to strings the way the order-preserving partitioner and collating order-preserving partitioner do. If you need an order-preserving partitioner that doesn’t validate your keys as being strings, BOP is recommended for the performance improvement.