I'm concerned by this note in Riak's documentation:
N=3 simply means that three copies of each piece of data will be stored in the cluster. That is, three different partitions/vnodes will receive copies of the data. There are no guarantees that the three replicas will go to three separate physical nodes; however, the built-in functions for determining where replicas go attempts to distribute the data evenly.
https://docs.basho.com/riak/kv/2.1.3/learn/concepts/replication/#so-what-does-n-3-really-mean
I have a cluster of 6 physical servers with N=3. I want to be 100% sure that the total loss of one or two nodes will not lose any data. As I understand the caveat above, Riak cannot guarantee that. It appears that some (admittedly small) portion of my data could have all 3 copies stored on the same physical server.
In practice, this means that for a sufficiently large data set, I'm all but guaranteed to lose some records completely if a single node fails catastrophically (gremlins eat/degauss the drive or something).
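To make the concern concrete, here is a minimal sketch of the check I have in mind. It is a simplified model, not Riak's actual claim algorithm. It assumes that replicas of a key land on N consecutive partitions of the ring, and that partitions are handed out to nodes round-robin. The function names, the 64-partition ring size, and the round-robin ownership are all my own assumptions for illustration.

```python
from typing import List, Tuple


def round_robin_ring(num_partitions: int, num_nodes: int) -> List[int]:
    """Hypothetical ownership map: partition i is owned by node i % num_nodes.

    This is an assumption for illustration, not Riak's real claim function.
    """
    return [i % num_nodes for i in range(num_partitions)]


def under_replicated_preflists(ring: List[int], n_val: int) -> List[Tuple[int, int]]:
    """Find windows of n_val consecutive partitions (wrapping around the ring)
    that are owned by fewer than n_val distinct nodes.

    Returns a list of (start_partition, distinct_node_count) pairs.
    """
    size = len(ring)
    bad = []
    for start in range(size):
        owners = {ring[(start + k) % size] for k in range(n_val)}
        if len(owners) < n_val:
            bad.append((start, len(owners)))
    return bad


if __name__ == "__main__":
    for nodes in (3, 6):
        ring = round_robin_ring(64, nodes)  # 64 partitions is an assumed ring size
        print(nodes, "nodes:", under_replicated_preflists(ring, 3))
```

Under this toy model, 6 nodes with 64 partitions yields no problem windows, while 3 nodes yields two windows at the wrap-around point that span only 2 distinct nodes. Whether a real ring behaves this way depends on the ownership Riak actually computed, which is what I'd like to be able to verify or constrain.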
Is there a Riak configuration that avoids this concern?
An unfortunate complication: I'm on an old version of Riak (1.4.2).