We were recently testing Flashcache on one of our servers at Directi. Drawing on various resources from around the web, plus a little personal experience, I'm writing this post about flashcache.
Flashcache, for those who haven't come across it before, is by definition a block cache for Linux. In plain terms, it's a way of extending the Linux block cache with SSDs, which saves a few bucks by avoiding half a TB of RAM just for caching. The performance you get depends on various factors. At a high level, I'd say these are the important ones -
- the size of the SSDs (best performance is obtained when your SSDs are larger than the active set of blocks; with perfect caching, performance approaches that of running your entire system on SSDs - see the rough sizing sketch after this list),
- the expected mix of reads and writes,
- flush frequency (when writes are flushed immediately, as with fsync-heavy filesystem and database operations, blocks are cached only briefly and performance will barely be affected by the presence of flashcache).
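To make the sizing point concrete, here's a back-of-envelope sketch in Python. All the numbers are hypothetical placeholders; the only idea it encodes is the one above - if the active set fits in the SSD, most accesses should be cache hits.

```python
# Back-of-envelope check: does the active set fit in the caching SSD?
# The figures below are made up -- measure your own workload
# (e.g. with iostat/blktrace) before trusting any of this.

GiB = 1024 ** 3

ssd_size = 160 * GiB          # usable capacity of the caching SSD
active_set = 120 * GiB        # blocks the workload actually touches regularly
metadata_overhead = 0.05      # rough allowance for cache metadata, ~5%

effective_cache = ssd_size * (1 - metadata_overhead)

if active_set <= effective_cache:
    print("Active set fits: expect near-SSD performance once the cache is warm.")
else:
    ratio = effective_cache / active_set
    print(f"Only ~{ratio:.0%} of the active set fits: expect misses "
          "and performance closer to the backing disks.")
```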
Some internals:
Block cache - The Linux block cache caches accessed blocks, NOT files. As long as you're not giving your program / some KVM guest direct access to these block devices, the Linux block cache stays in play. If you do provide direct access, however, the answer to "will this work?" becomes less clear.
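As an illustration of what "direct access" means here, the sketch below opens a block device with O_DIRECT, which bypasses the kernel's block/page cache entirely. The device path is a placeholder and this is Linux-only; it's just a minimal demonstration of the flag, nothing flashcache-specific.

```python
import os
import mmap

# Hypothetical device path -- substitute a block device you can actually read.
DEVICE = "/dev/sdb"

# O_DIRECT asks the kernel to skip its own caching for this file descriptor,
# so reads go straight to the device (or whatever layer sits underneath it).
fd = os.open(DEVICE, os.O_RDONLY | os.O_DIRECT)

try:
    # O_DIRECT requires aligned buffers; an anonymous mmap is page-aligned,
    # which satisfies that requirement.
    buf = mmap.mmap(-1, 4096)
    os.preadv(fd, [buf], 0)   # read the first 4 KiB without touching the cache
    print(buf[:16].hex())
finally:
    os.close(fd)
```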
Quoting from the flashcache docs (flashcache - modes of caching), it provides three modes of caching: writeback, writethrough, and writearound. What happens when your SSD fails or disappears depends on your caching mode. Follow the above-mentioned link for more info.
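To keep the three modes straight, here is a toy model of where a write goes in each one. This is only a sketch of the general write-back / write-through / write-around semantics, not flashcache's actual implementation; the "ssd" and "disk" dictionaries are stand-ins for real devices.

```python
# Toy model of the three write policies. "ssd" and "disk" are plain dicts
# standing in for the cache device and the backing store.

class ToyCache:
    def __init__(self, mode):
        assert mode in ("writeback", "writethrough", "writearound")
        self.mode = mode
        self.ssd = {}     # cache contents
        self.disk = {}    # backing store contents
        self.dirty = set()

    def write(self, block, data):
        if self.mode == "writeback":
            # Write hits only the SSD; the disk copy is updated later.
            self.ssd[block] = data
            self.dirty.add(block)
        elif self.mode == "writethrough":
            # Write goes to both the SSD and the disk before completing.
            self.ssd[block] = data
            self.disk[block] = data
        else:  # writearound
            # Write skips the SSD entirely; only reads would populate the cache.
            self.disk[block] = data

    def flush(self):
        # Only meaningful for writeback: push dirty blocks down to disk.
        for block in self.dirty:
            self.disk[block] = self.ssd[block]
        self.dirty.clear()

cache = ToyCache("writeback")
cache.write(7, b"hello")
print(cache.disk)   # {} -- the disk has not seen the write yet
cache.flush()
print(cache.disk)   # {7: b'hello'} -- now it has
```

The model also makes the failure question above visible: in writeback mode, a dirty block lives only on the SSD until it is flushed, which is exactly why losing the SSD matters most in that mode.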
During the initial stages of testing, we were handling a large number of writes per second, which resulted in periodic saturation of the primary storage. Since writes dominated, we decided on write-back caching mode, and the performance increase was significant. As our read volume grew, though, performance dropped off, and I'm now scratching my head for a way to get around this.