There is a known issue with how riak_ensemble currently syncs ensemble state to disk: 5 times per second for each running ensemble. On slow disks (e.g. non-SSDs) with a ring size of 64 partitions or more, riak_ensemble will overwhelm the I/O system and fall behind; message queues will grow, and Erlang GC will kick in. This manifests as Riak using 100% CPU and increasing amounts of RAM, with slow performance.
With fast disks and/or small rings, things work without any issue.
This is a well-understood, easy-to-fix bug that needs to be fixed before we ship 2.0. Until then, those previewing the strong consistency feature are advised to use machines with SSDs or small rings (e.g. 16 partitions), and not to run multiple nodes on the same machine.
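
To give a rough sense of scale, the sync load can be estimated with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes one ensemble per partition and that all of those ensembles sync to the same disk (a worst case, such as several nodes sharing one machine). Neither assumption is stated above, and the function name is purely illustrative.

```python
# Sync rate taken from the description above: 5 syncs per second per ensemble.
SYNCS_PER_ENSEMBLE_PER_SEC = 5


def syncs_per_second(ensembles: int, rate: int = SYNCS_PER_ENSEMBLE_PER_SEC) -> int:
    """Aggregate disk syncs per second when `ensembles` ensembles share one disk.

    Assumption (not from the issue text): every ensemble counted here
    syncs to the same physical disk.
    """
    if ensembles < 0 or rate < 0:
        raise ValueError("ensembles and rate must be non-negative")
    return ensembles * rate


if __name__ == "__main__":
    # Assumption: one ensemble per partition, so ring size == ensemble count.
    for ring_size in (16, 64, 128):
        print(f"ring size {ring_size}: ~{syncs_per_second(ring_size)} syncs/sec")
```

Under these assumptions a 64-partition ring asks for four times as many syncs per second as a 16-partition ring, which is consistent with the advice to keep rings small on slow disks.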
/cc basho/riak#536