Wednesday, July 13, 2011

Koda: disposable means reusable

This is one feature I forgot to mention in my previous post: Koda supports disposable caches. What is a disposable cache? Imagine that you need to create an in-memory cache holding several million (or billion, if you have enough RAM :) ) entries, process the entries, and then discard (dispose of) them. Now imagine you need to do this many times per second, in other words, very often. Releasing cache entries explicitly puts a huge load on the native memory allocator and takes time (probably several million deallocations per second). Instead of deallocating (releasing) cache entries explicitly, Koda puts them into a so-called reuse pool, from which they can be reused (realloc'ed) later for new cache entries. This lets Koda dispose of (and reuse) caches almost instantly. A nice feature, by the way, and it has some very interesting applications.
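Here is a minimal sketch of the reuse-pool idea (hypothetical code, not Koda's actual implementation): disposing an entry's buffer pushes it onto a free list instead of freeing it, and the next allocation of the same size pops it back off, so no call to the native allocator is needed.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Sketch of a reuse pool for fixed-size off-heap entries.
public class ReusePool {
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();
    private final int entrySize;
    private int freshAllocations = 0;

    public ReusePool(int entrySize) { this.entrySize = entrySize; }

    // Take a buffer from the pool, or allocate a fresh one if the pool is empty.
    public ByteBuffer allocate() {
        ByteBuffer b = free.pollFirst();
        if (b == null) {
            freshAllocations++;
            return ByteBuffer.allocateDirect(entrySize);
        }
        b.clear(); // reset position/limit for reuse
        return b;
    }

    // "Disposing" an entry just returns its buffer to the pool; nothing is freed.
    public void dispose(ByteBuffer b) { free.addFirst(b); }

    public int freshAllocations() { return freshAllocations; }

    public static void main(String[] args) {
        ReusePool pool = new ReusePool(1024);
        ByteBuffer a = pool.allocate();   // fresh allocation
        pool.dispose(a);                  // goes to the pool, not back to the allocator
        ByteBuffer b = pool.allocate();   // same buffer comes back
        assert a == b;
        assert pool.freshAllocations() == 1;
    }
}
```

Disposing a whole cache this way is O(number of entries returned to the pool) with no allocator traffic at all, which is why it can be nearly instant.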

PS I love Google. I do not promote my blog at all, but if you search for "infinispan benchmark" you will find my blog on the first page in the blog category. 
PPS The same is true for a "Terracotta BigMemory benchmark" search. First page in the blog category. 

Tuesday, July 5, 2011

What is Koda (in one sentence)?

A key-code-value, in-memory data store. Here is the promised feature set:

1. Scalable off-heap storage that can use the whole server's RAM. Tested with up to 30G of memory.
2. Very low and predictable latencies. With cache size = 30G: avg latency = 3.5 microseconds, median = 2.5 microseconds, 99% = 25 microseconds, 99.9% = 50 microseconds, 99.99% < 300 microseconds.
3. Max latencies are in the 100s of milliseconds even for very large caches.
4. Very fast (up to 7x faster than BigMemory): 3.5M requests per second (90/10 get/put) with LRU eviction and cache size = 28G.
5. Very fast (4x faster than on-heap Ehcache).
6. Hard limit on the maximum cache instance size (in bytes) and on overall allocated memory.
7. Low overhead per cache entry (24 bytes).
8. Compact String representation in memory.
9. Fast serialization using Kryo.
10. Custom serialization through the Writable and Serializer interfaces.
11. Multiple eviction policies: LRU, FIFO, LFU, RANDOM.
12. Two types of secondary cache indexes (one optimized for scans, one optimized for lookups).
13. Very low index memory overhead (< 10 bytes per cache entry).
14. Lightweight indexes (creation performance: 10-20M cache entries per second).
15. Indexes are off-heap as well, so you do not have to worry about Java OOM.
16. The Execute/ExecuteForUpdate API runs Java code inside the server process, allowing more efficient data processing algorithms (no need to lock-load-store-unlock data on the client side).
17. Direct Memory Access (DMA) allows implementing rich off-heap data structures such as lists, sets, trees, etc.
18. Query API.
19. Queries are performed on the serialized versions of objects (without materializing the objects themselves, using DMA).
20. Queries are very fast (scans of tens of millions of cache entries per second per server without indexes).
21. Supported OS: Linux, FreeBSD, NetBSD, Mac OS X, Solaris 10/11. 64-bit only.
22. Infinispan 5.0 integration.
23. Ehcache integration.
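As an illustration of the LRU policy from item 11, here is what LRU eviction semantics look like in a minimal on-heap sketch (this is not Koda's code; Koda implements eviction over off-heap storage, this is just a plain-JDK illustration of the policy):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LinkedHashMap with accessOrder=true keeps entries in least-recently-used
// order; removeEldestEntry evicts the LRU entry once capacity is exceeded.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LruCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true -> LRU iteration order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, byte[]> cache = new LruCache<>(2);
        cache.put("a", new byte[500]);
        cache.put("b", new byte[500]);
        cache.get("a");                  // touch "a" so "b" becomes eldest
        cache.put("c", new byte[500]);   // capacity exceeded: "b" is evicted
        assert cache.containsKey("a") && cache.containsKey("c");
        assert !cache.containsKey("b");
    }
}
```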

Sunday, July 3, 2011

Introducing Koda - Key-Code-Value in memory data store

OK, it's hard to begin. I have been working on Koda for the last six or seven months. This is my spare-time project, and now I think it's time to introduce Koda to the general public. It's better to begin with some benchmarks to give you an impression of how fast it is and what you can do with Koda. But before I continue with benchmark results, I want to talk a little bit about Java garbage collection issues on large and extra-large heaps (> 6-8GB).

If you are a Java programmer like myself, you probably know that GC issues (read: long pauses) can ruin any server application that tries to utilize more than 4-8GB of heap memory. Some people think there are magic combinations of HotSpot configuration parameters that can minimize GC pauses and even make them negligible. In some cases (read: types of server app load) this can help to some extent; in many others, it can't. There is no magic bullet yet. Therefore, some companies have decided to go an unusual way: hide allocated memory from the Java GC. Meet Terracotta's BigMemory. The idea is pretty simple: passivate Java objects into off-heap memory, thus hiding them from the garbage collector. This keeps the Java heap (and GC pauses) small even if we cache gigabytes of data. I have run some tests with cache sizes up to 30G and can tell you that it works. You will see BigMemory performance numbers later on. But there is one downside to BigMemory: performance (request throughput). It is not on par with a pure Java cache (which keeps all objects in heap memory and does not do any serialization/deserialization voodoo). Despite this limitation, there are definitely use cases that require deterministic, predictable GC behavior, and BigMemory will find its customers.
The idea behind Koda is the same: keep Java objects in off-heap memory. The project started as an attempt to build something similar to BigMemory, but better (faster). Now I can say that Koda has many more features than BigMemory, and some of them are quite unique (I will describe them in my next posts). OK, let's get down to business: the performance benchmark.
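To make the off-heap idea concrete, here is a minimal sketch (an illustration only, not BigMemory's or Koda's actual implementation): values are copied into a direct ByteBuffer, so the garbage collector only ever sees a small on-heap index of offsets, never the cached data itself.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Toy off-heap store: data lives in one direct buffer outside the Java heap.
public class OffHeapStore {
    private final ByteBuffer arena = ByteBuffer.allocateDirect(1 << 20); // 1 MB arena
    private final Map<String, int[]> index = new HashMap<>();            // key -> {offset, length}

    public void put(String key, byte[] value) {
        int offset = arena.position();
        arena.put(value);                 // bytes are copied off heap
        index.put(key, new int[]{offset, value.length});
    }

    public byte[] get(String key) {
        int[] loc = index.get(key);
        if (loc == null) return null;
        byte[] out = new byte[loc[1]];
        ByteBuffer view = arena.duplicate(); // independent position, same memory
        view.position(loc[0]);
        view.get(out);                       // copy back onto the heap on read
        return out;
    }

    public static void main(String[] args) {
        OffHeapStore store = new OffHeapStore();
        store.put("k1", "hello off-heap".getBytes());
        assert new String(store.get("k1")).equals("hello off-heap");
        assert store.get("missing") == null;
    }
}
```

The price of this design is exactly the downside mentioned above: every read and write pays a copy (and, in a real cache, serialization/deserialization), which is where the throughput gap with a pure on-heap cache comes from.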

  Benchmark description

This is a simple use scenario: multi-threaded (16 threads) put/get of key-value pairs into a Java cache. The read/write ratio is 90/10. The key size is approximately 20 bytes; the value size varies between 500 and 1000 bytes. Both keys and values are byte arrays. The cache eviction policy is LRU for all caches. The test measures: total request throughput, maximum request latency (maximum GC pause duration), average latency, median latency, and a number of percentiles: 99%, 99.9%, 99.99%. All measurements are taken after 30 minutes of test execution.
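For reference, the percentile figures reported below can be computed from raw latency samples roughly like this (a sketch only; the actual benchmark harness is not published, and the nearest-rank method is assumed here):

```java
import java.util.Arrays;

// Nearest-rank percentiles over recorded latency samples.
public class Percentiles {
    // Smallest sample that is >= p percent of all samples.
    public static double percentile(double[] sortedLatencies, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sortedLatencies.length) - 1;
        return sortedLatencies[Math.max(rank, 0)];
    }

    public static void main(String[] args) {
        // Hypothetical latency samples in microseconds.
        double[] micros = {2, 3, 3, 4, 5, 6, 8, 12, 25, 300};
        Arrays.sort(micros);
        System.out.printf("median=%.1f p99=%.1f%n",
                percentile(micros, 50), percentile(micros, 99));
    }
}
```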
  Our contestants 
  1.   Ehcache 2.4.2 (enterprise edition)
  2.   Infinispan 5.0 RC6
  3.   Ehcache 2.4.2 + BigMemory
  4.   Koda (sorry, no link yet)
  5.   Infinispan 5.0 RC6 + Koda   
Number 5 is BigMemory for Infinispan, implemented as a custom DataContainer (which is Koda). I replaced the default Infinispan DataContainer implementation with a Koda-based one, which stores cache entries in off-heap memory, similar to BigMemory.


  Test configuration

2x Intel Xeon (8 CPU cores), 32G RAM
OS: RHEL 5.5
Java: 1.6.0_23

  Ehcache 2.4.2 (enterprise edition)

HotSpot options: java -server -Xms28G -Xmx28G -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:SurvivorRatio=16 

RPS                             = 1.1M     - requests per second
Max latency                = 65 sec   - longest GC pause duration
Avg latency                 = 0.014 ms (14 microseconds)
Median latency            = 0.002 ms (2 microseconds)
99%   latency              < 0.084 ms
99.9%   latency           < 0.120 ms 
99.99%  latency          < 4.3 ms
Number of cache items: 30M

Ehcache 2.4.2 (enterprise edition) + BigMemory

HotSpot  options: java -server -XX:MaxDirectMemorySize=28G -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:NewSize=64M -XX:SurvivorRatio=16 

RPS                             = 0.5 M     - requests per second
Max latency                = 0.5 sec   - longest GC pause duration
Avg latency                 = 0.030 ms (30 microseconds)
Median latency            = 0.011 ms (11 microseconds)
99%   latency              < 0.064 ms
99.9%   latency           < 4.4 ms 
99.99%  latency          < 13.5 ms

  Infinispan 5.0 RC6

HotSpot options: java -server -Xms28G -Xmx28G -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:SurvivorRatio=16 

Infinispan failed to produce any meaningful results. It ran very well until it reached the maximum number of cache items, then it got stuck for good. I killed the process when the average number of requests per second dropped below 100K.


  Koda

HotSpot options: java -server 
28G of memory was allocated for the off-heap cache.

RPS                             = 3.5 M     - requests per second
Max latency                = 0.11 sec (110 ms)   - longest GC pause duration
Avg latency                 = 0.0045 ms (4.5 microseconds)
Median latency            = 0.0035 ms (3.5 microseconds)
99%   latency              < 0.020 ms
99.9%   latency           < 0.043 ms 
99.99%  latency          < 0.5 ms
Number of cache items: 42M