One of our Riak nodes recently started eating up all of its disk space over the course of a couple of weeks.
Here's what we noticed:
- Two of the twenty or so partitions on the node were 5 to 10 times the size of the others. The average partition size was between 20 and 30 GB, yet these two partitions were 160 GB and 210 GB.
- The logs showed that we had run out of open files, even though the riak user is set to a max_open_files of 100k. As it turns out, during the hardware maintenance I had started Riak from a sudo -i session, which gave the shell the default max_open_files setting of 1024.
- After restarting Riak with the correct max_open_files setting, we noticed a lot of zero-byte bitcask files, which we removed, as well as some invalid bitcask hint files, which we cleaned up.
- Once all the invalid bitcask files were cleaned up, we realized that every merge against the two large partitions failed, implying either that some bitcask files were corrupt or that the merge process was timing out.
- Rather than rebuild the whole node, we decided to rebuild just the two affected partitions.
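The first and third observations can be checked from a shell. Here is a minimal sketch, assuming a POSIX shell and a bitcask data directory with one subdirectory per partition. The function names and the example path in the comments are hypothetical, not part of Riak.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a bitcask data directory.
# Pass your own path, e.g. /{riak_datadir}/riak/bitcask.

# Print each partition directory's size in KB, smallest first,
# so oversized partitions show up at the bottom of the list.
partition_sizes() {
    du -sk "$1"/*/ 2>/dev/null | sort -n
}

# List zero-byte files under the data directory.
# This only lists them; review the output before removing anything.
zero_byte_files() {
    find "$1" -type f -size 0
}

# Example (path is a placeholder):
#   partition_sizes /{riak_datadir}/riak/bitcask
#   zero_byte_files /{riak_datadir}/riak/bitcask
```

Listing rather than deleting is deliberate: it lets you confirm that the empty files really are leftovers before you remove them.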
Here's the process we used for rebuilding the specific partitions:
- Stop Riak
# riak stop
- Move the bad partitions elsewhere for backup purposes
# mv /{riak_datadir}/riak/bitcask/{partition} /{backup_dir}/
- Start Riak
# riak start
- Wait for the riak_kv process to start
# riak-admin wait-for-service riak_kv riak@{riak_node_name}
- Attach to Riak and start the repair process
# riak attach
Note: to quit the riak attach shell, use Ctrl-D, not Ctrl-C (otherwise you will stop Riak)
(riak@{node}) 1> Partitions = [{part 1}, {part 2}, ... {part n}].
(riak@{node}) 2> [riak_kv_vnode:repair(P) || P <- Partitions].
- Check status of the repair process
# riak-admin transfers
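Since the whole problem started with Riak being launched from a shell with a low open-file limit, it may be worth checking the limit in the shell that will run riak start. The following is only a sketch: nofile_ok is a hypothetical helper, and the 100000 threshold is an assumption based on the 100k setting mentioned above.

```shell
#!/bin/sh
# Assumed minimum, taken from the 100k limit configured for the riak user.
REQUIRED_NOFILE=100000

# Hypothetical helper: succeeds if the given limit is at least the required one.
# "unlimited" is treated as sufficient.
nofile_ok() {
    current=$1
    required=$2
    [ "$current" = "unlimited" ] && return 0
    [ "$current" -ge "$required" ]
}

# ulimit -n reports the open-file limit of the current shell,
# which is what a process started from this shell inherits.
if nofile_ok "$(ulimit -n)" "$REQUIRED_NOFILE"; then
    echo "open-file limit looks fine; ok to run: riak start"
else
    echo "open-file limit is $(ulimit -n), expected at least $REQUIRED_NOFILE" >&2
fi
```

Running this from a sudo -i session like the one described above would have flagged the 1024 default before Riak was started.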
All in all, Riak recovers quite nicely, and it wasn't terribly difficult to find out what was going on.
On a side note, Basho does a great job if you have the benefit of using their support.