commit log 最好写到单独的磁盘，ssd 该最好。
However, it's recommaned that you store that datafiles and the commit logs on sparate hard disks for maximum performance.
concurrent_reads 最好设置为 CPU 核数 * 2
you might wang to update the concurrent_reads setting immediately before you start your server. That's because the concurrent_reads setting is optimal at two threads per processor core. By default, this setting is 8, assuming a fore-core box.
the concurrent_writes setting behaves somewhat differently. This should match the number of clients that will write concurrently to the server. If cassandra is backing a web application server, you can tune this setting from its default 32 to match the number of threads the application server has available to connect to Cassandra.
相比 bitcask 把索引整个放在内存里，cassandra 把它视为一个缓存，可以设置缓存键的比率或者数量。
The keys_cached setting indicates the number of key locations—not key values—that will be saved in memory. This can be specified as a fractional value (a number between 0 and 1) or as an integer. If you use a fraction, you’re indicating a percentage of keys to cache, and an integer value indicates an absolute number of keys whose locations will be cached.
This setting will consume considerable memory, but can be a good trade-off if your locations are not hot already.
使用 MMap 读取磁盘对性能提升意义不大。不过 sqlite4 的 lsm 模块说 mmap() 相比 read() 要快的多，可能是因为 sqlite4 lsm 没有自己的缓存管理机制： LSM_CONFIG_MMAP If LSM is running on a system with a 64-bit address space, this option may be set to either 1 (true) or 0 (false). On a 32-bit platform, it is always set to 0. If it is set to true, the entire database file is memory mapped. Or, if it is false, data is accessed using ordinary OS file read and write primitives. Memory mapping the database file can significantly improve the performance of read operations, as database pages do not have to be copied from operating system buffers into user space buffers before they can be examined.
The purpose of disk_access_mode is to enable memory mapped files so that the oper- ating system can cache reads, thus reducing the load on Cassandra’s internal caches. This sounds great, but in practice, disk_access_mode is one of the less-useful settings, and at this point doesn’t work exactly as was originally envisioned. This may be im- proved in the future, but it is just as likely that the setting will be removed. Certainly feel free to play around with it, but you might not see much difference.
- rows_cached 缓存行数据本身- 如果读远多于写，那么如果设置它过高，会消耗过多资源- 如果读小于写，但行比较大，最好权衡一下- 除非某些行的读写频率远大于其它行，那么很难得到明显的性能提升
The rows_cached setting specifies the number of rows that will be cached. ... You’ll want to use this setting carefully, however, as this can easily get out of hand. If your column family gets far more reads than writes, then setting this number very high will needlessly consume considerable server resources. If your column family has a lower ratio of reads to writes, but has rows with lots of data in them (hundreds of columns), then you’ll need to do some math before setting this number very high. And unless you have certain rows that get hit a lot and others that get hit very little, you’re not going to see much of a boost here.
Do not use the Serial GC with Cassandra.
However, do not simply set your JVM to use as much memory as you have available up to 4GB. There are many factors involved here, such as the amount of swap space and memory fragmentation. Simply increasing the size of the heap using -Xmx will not help if you don’t have any swap available.