OK, it's hard to begin. I have been working on Koda for the last six or seven months. It is my spare-time project, and I think it's time to introduce Koda to the general public. It's best to begin with some benchmarks to give you an impression of how fast it is and what you can do with it. But before I get to the benchmark results, I want to talk a little about Java garbage collection issues on large and extra-large heaps (> 6-8GB).
If you are a Java programmer like me, you probably know that GC issues (read: long pauses) can ruin any server application that tries to use more than 4-8GB of heap memory. Some people think there are magic combinations of HotSpot configuration parameters that can minimize GC pauses and even make them negligible. In some cases (read: types of server load) tuning helps to some extent; in many others it can't. There is no magic bullet yet. Therefore, some companies have decided to go an unusual way: hide allocated memory from the Java GC. Meet Terracotta's BigMemory.
The idea is pretty simple: passivate Java objects into off-heap memory, thus hiding them from the garbage collector. This helps keep the Java heap (and GC pauses) small even if we cache gigabytes of data. I have run some tests with cache sizes up to 30G and can tell you that it works. You will get BigMemory performance numbers as well later on. But there is one downside to BigMemory: performance (request throughput). It is not on par with a pure Java cache, which keeps all objects in heap memory and does not do any serialization/de-serialization voodoo. Despite this limitation, there are definitely use cases that require deterministic, predictable GC behavior, and BigMemory will find its customers.
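To make the off-heap idea concrete, here is a minimal sketch using a direct ByteBuffer from the standard JDK. The class name OffHeapSlab and its methods are my own illustration, not BigMemory's or Koda's actual API or design. The sketch is append-only (no eviction, no space reuse) and keeps its index on the heap, which a real implementation would want to avoid.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not how BigMemory or Koda is implemented. */
final class OffHeapSlab {
    // Direct buffers live outside the Java heap, so the GC does not scan their contents.
    private final ByteBuffer slab;
    // key -> {offset, length}; kept on-heap here purely for simplicity.
    private final Map<String, int[]> index = new HashMap<>();

    OffHeapSlab(int capacityBytes) {
        this.slab = ByteBuffer.allocateDirect(capacityBytes);
    }

    /** Copies the value into the slab; returns false when there is no room left. */
    synchronized boolean put(String key, byte[] value) {
        if (slab.remaining() < value.length) {
            return false;
        }
        int offset = slab.position();
        slab.put(value);                       // advances the slab's write position
        index.put(key, new int[] {offset, value.length});
        return true;
    }

    /** Copies the value back onto the heap, or returns null if the key is unknown. */
    synchronized byte[] get(String key) {
        int[] entry = index.get(key);
        if (entry == null) {
            return null;
        }
        byte[] out = new byte[entry[1]];
        ByteBuffer view = slab.duplicate();    // independent position, shared memory
        view.position(entry[0]);
        view.get(out);
        return out;
    }
}
```

Every get copies bytes from the slab back into a fresh heap array. That copy (plus serialization for anything richer than a byte array) is the price of hiding the data from the collector.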
The idea behind Koda is the same: keep Java objects in off-heap memory. The project started as an attempt to build something similar to BigMemory, but better (faster). Now I can say that Koda has many more features than BigMemory, and some of them are quite unique (I will describe them in my next posts). OK, let's get down to business: the performance benchmark.
Benchmark description
This is a simple multi-threaded scenario (16 threads): put and get key-value pairs into and from a Java cache. The read/write ratio is 90/10. The key size is approximately 20 bytes; the value size varies between 500 and 1000 bytes. Both keys and values are byte arrays. The cache eviction policy is LRU for all caches. The test measures total request throughput, maximum request latency (maximum GC pause duration), average latency, mean latency and a number of percentiles: 99%, 99.9%, 99.99%. All measurements are taken after 30 minutes of test execution.
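For readers who want to reproduce this kind of measurement, here is a rough sketch of how the average, maximum and percentile figures can be derived from raw latency samples. This is not the actual benchmark code. The class LatencyStats and its method names are hypothetical, and the nearest-rank percentile definition is an assumption, since there are several common ones.

```java
import java.util.Arrays;

/** Hypothetical helper; names are illustrative and not taken from the benchmark code. */
final class LatencyStats {
    /** Arithmetic average of the samples, in the same unit as the input. */
    static double average(long[] samples) {
        long sum = 0;
        for (long s : samples) {
            sum += s;
        }
        return (double) sum / samples.length;
    }

    /** Largest sample, i.e. the worst-case latency observed. */
    static long max(long[] samples) {
        long m = samples[0];
        for (long s : samples) {
            m = Math.max(m, s);
        }
        return m;
    }

    /** Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it. */
    static long percentile(long[] samples, double pct) {
        long[] sorted = samples.clone();       // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

In a multi-threaded run, each thread would typically record its own samples (for example with System.nanoTime() around each request), and the arrays would be merged before computing the statistics.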
Our contestants
- Ehcache 2.4.2 (enterprise edition)
- Infinispan 5.0 CR6
- Ehcache 2.4.2 + BigMemory
- Koda (sorry, no link yet)
- Infinispan 5.0 CR6 + Koda
The fifth contestant is a BigMemory-like setup for Infinispan, implemented as a custom DataContainer backed by Koda. I replaced the default Infinispan DataContainer implementation with a Koda-based one, which stores cache entries in off-heap memory, similar to BigMemory.
Hardware/Software
2x Intel Xeon (8 CPU cores), 32G RAM
OS: RHEL 5.5
Java: 1.6.23
Ehcache 2.4.2 (enterprise edition)
HotSpot options: java -server -Xms28G -Xmx28G -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:SurvivorRatio=16
RPS = 1.1M - requests per second
Max latency = 65 sec - longest GC pause duration
Avg latency = 0.014 ms (14 microseconds)
Mean latency = 0.002 ms (2 microseconds)
99% latency < 0.084 ms
99.9% latency < 0.120 ms
99.99% latency < 4.3 ms
Number of cache items: 30M
Ehcache 2.4.2 (enterprise edition) + BigMemory
HotSpot options: java -server -XX:MaxDirectMemorySize=28G -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:NewSize=64M -XX:SurvivorRatio=16
RPS = 0.5 M - requests per second
Max latency = 0.5 sec - longest GC pause duration
Avg latency = 0.030 ms (30 microseconds)
Mean latency = 0.011 ms (11 microseconds)
99% latency < 0.064 ms
99.9% latency < 4.4 ms
99.99% latency < 13.5 ms
Infinispan 5.0 CR6
HotSpot options: java -server -Xms28G -Xmx28G -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:SurvivorRatio=16
Infinispan failed to produce any meaningful results. It ran very well until it reached the maximum number of cache items, then it got stuck for good. I killed the process when the average number of requests per second dropped below 100K.
Koda
HotSpot options: java -server
I allocated 28G of memory for the off-heap cache.
RPS = 3.5 M - requests per second
Max latency = 0.11 sec (110 ms) - longest GC pause duration
Avg latency = 0.0045 ms (4.5 microseconds)
Mean latency = 0.0035 ms (3.5 microseconds)
99% latency < 0.020 ms
99.9% latency < 0.043 ms
99.99% latency < 0.5 ms
Number of cache items: 42M