Designed to scale (and fly :): Introducing Koda - Key-Code-Value in memory data store

OK, it's hard to begin. I have been working on Koda for the last six or seven months. It is a spare-time project, and I think it's time to introduce Koda to the general public. It's best to begin with some benchmarks, to give you an impression of how fast it is and what you can do with it. But before I get to the benchmark results, I want to talk a little about Java garbage collection issues on large and extra-large heaps (> 6-8GB).

If you are a Java programmer like me, you probably know that GC issues (read: long pauses) can ruin any server application that tries to utilize more than 4-8GB of heap memory. Some people think that there are magic combinations of HotSpot configuration parameters that can minimize GC pauses or even make them negligible. In some cases (read: types of server application load) tuning helps to some extent; in many others it does not. There is no magic bullet yet. Therefore, some companies have decided to go an unusual way: hide allocated memory from the Java GC. Meet Terracotta's BigMemory. The idea is pretty simple: passivate Java objects into off-heap memory, thus hiding them from the garbage collector. This keeps the Java heap (and GC pauses) small even if we cache gigabytes of data. I have run some tests with cache sizes up to 30G and can tell you that it works. You will get BigMemory performance numbers later on as well. But there is one downside to BigMemory: performance (request throughput). It is not on par with a pure Java cache (which keeps all objects in heap memory and does not do any serialization/deserialization voodoo). Despite this limitation, there are definitely use cases that require deterministic, predictable GC behavior, and BigMemory will find its customers.
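To make the off-heap idea concrete, here is a minimal sketch in plain Java. It is not how BigMemory (or Koda) is implemented; the class name OffHeapStoreSketch and its methods are hypothetical, and the sketch assumes an append-only store with no eviction. Values are copied into a direct ByteBuffer, whose contents live outside the Java heap, while only a small index stays on the heap.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not the BigMemory or Koda implementation. */
public class OffHeapStoreSketch {
    // Direct buffer: its contents live outside the Java heap, so the GC does not scan them.
    private final ByteBuffer data;
    // Small on-heap index: key -> {offset, length} of the value inside the direct buffer.
    private final Map<ByteBuffer, int[]> index = new HashMap<>();

    public OffHeapStoreSketch(int capacityBytes) {
        this.data = ByteBuffer.allocateDirect(capacityBytes);
    }

    /** Copies the value off heap. Append-only: no eviction or space reuse in this sketch. */
    public synchronized boolean put(byte[] key, byte[] value) {
        if (data.remaining() < value.length) {
            return false; // a real cache would evict entries here
        }
        int offset = data.position();
        data.put(value);
        index.put(ByteBuffer.wrap(key.clone()), new int[] {offset, value.length});
        return true;
    }

    /** Copies the value back onto the heap, or returns null if the key is absent. */
    public synchronized byte[] get(byte[] key) {
        int[] loc = index.get(ByteBuffer.wrap(key));
        if (loc == null) {
            return null;
        }
        byte[] out = new byte[loc[1]];
        ByteBuffer view = data.duplicate(); // shares content, has its own position
        view.position(loc[0]);
        view.get(out);
        return out;
    }

    public static void main(String[] args) {
        OffHeapStoreSketch store = new OffHeapStoreSketch(1 << 20);
        store.put("user:1".getBytes(StandardCharsets.UTF_8),
                  "hello".getBytes(StandardCharsets.UTF_8));
        byte[] v = store.get("user:1".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(v, StandardCharsets.UTF_8));
    }
}
```

Every get and put copies bytes between the heap and the direct buffer. That copy (plus object serialization in a real cache) is the price paid for keeping the heap small.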

The idea behind Koda was the same: to keep Java objects in off-heap memory. The project started as an attempt to build something similar to BigMemory, but better (faster). Now I can say that Koda has many more features than BigMemory, and some of them are quite unique (I will describe them in my next posts). OK, let's get down to business: the performance benchmark.

Benchmark description

This is a simple multi-threaded scenario (16 threads) that puts key-value pairs into a Java cache and gets them back. The read/write ratio is 90/10. The key size is approximately 20 bytes; the value size varies between 500 and 1000 bytes. Both keys and values are byte arrays. The cache eviction policy is LRU for all caches. The test measures total request throughput, maximum request latency (maximum GC pause duration), average latency, mean latency and a number of percentiles: 99%, 99.9%, 99.99%. All measurements are taken after 30 minutes of test execution.
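The workload described above can be sketched roughly as follows. This is not the actual benchmark code: the class CacheBenchSketch, the key format, the key-space size and the stand-in cache (a synchronized, access-ordered LinkedHashMap acting as an LRU cache) are all assumptions made for illustration, and the run is far shorter than 30 minutes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical harness; the stand-in cache is not any of the benchmarked products. */
public class CacheBenchSketch {

    /** Stand-in LRU cache: access-ordered LinkedHashMap that evicts the eldest entry. */
    public static Map<ByteBuffer, byte[]> lruCache(final int maxItems) {
        return Collections.synchronizedMap(new LinkedHashMap<ByteBuffer, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<ByteBuffer, byte[]> eldest) {
                return size() > maxItems;
            }
        });
    }

    /** Nearest-rank percentile of an ascending-sorted array, p in (0, 100]. */
    public static long percentile(long[] sorted, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    /** Runs the workload and returns all per-request latencies in ns, sorted ascending. */
    public static long[] run(int threads, final int opsPerThread, final int keySpace, int maxItems) {
        final Map<ByteBuffer, byte[]> cache = lruCache(maxItems);
        final long[][] latencies = new long[threads][opsPerThread];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final long[] mine = latencies[t];
            workers[t] = new Thread(() -> {
                ThreadLocalRandom rnd = ThreadLocalRandom.current();
                for (int i = 0; i < opsPerThread; i++) {
                    // 20-byte key: "key-" plus a 16-digit zero-padded number.
                    byte[] key = String.format("key-%016d", rnd.nextInt(keySpace))
                            .getBytes(StandardCharsets.US_ASCII);
                    long start = System.nanoTime();
                    if (rnd.nextInt(100) < 10) {
                        // 10% writes, value size between 500 and 1000 bytes
                        cache.put(ByteBuffer.wrap(key), new byte[rnd.nextInt(500, 1001)]);
                    } else {
                        cache.get(ByteBuffer.wrap(key)); // 90% reads
                    }
                    mine[i] = System.nanoTime() - start;
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("benchmark interrupted", e);
        }
        long[] all = new long[threads * opsPerThread];
        for (int t = 0; t < threads; t++) {
            System.arraycopy(latencies[t], 0, all, t * opsPerThread, opsPerThread);
        }
        Arrays.sort(all);
        return all;
    }

    public static void main(String[] args) {
        long begin = System.nanoTime();
        long[] lat = run(16, 50_000, 100_000, 50_000);
        double seconds = (System.nanoTime() - begin) / 1e9;
        System.out.printf("RPS = %.0f%n", lat.length / seconds);
        System.out.printf("max = %d ns%n", lat[lat.length - 1]);
        System.out.printf("avg = %.0f ns%n", Arrays.stream(lat).average().orElse(0));
        for (double p : new double[] {99, 99.9, 99.99}) {
            System.out.printf("%s%% < %d ns%n", p, percentile(lat, p));
        }
    }
}
```

Because a GC pause stalls whichever requests are in flight, the maximum observed request latency is a reasonable proxy for the longest pause, which is how the maximum latency figure is read in this post.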

Our contestants

Ehcache 2.4.2 (enterprise edition)
Infinispan 5.0 RC6
Ehcache 2.4.2 + BigMemory
Koda (sorry, no link yet)
Infinispan 5.0 CR6 + Koda

The fifth contestant is, in effect, BigMemory for Infinispan, implemented as a custom DataContainer backed by Koda. I replaced the default Infinispan DataContainer implementation with a Koda-based one, which stores cache entries in off-heap memory, similar to BigMemory.

Hardware/Software

2x Intel Xeon (8 CPU cores), 32G RAM

OS: RHEL 5.5

Java: 1.6.23

Ehcache 2.4.2 (enterprise edition)

HotSpot options: java -server -Xms28G -Xmx28G -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:SurvivorRatio=16

RPS = 1.1M - requests per second

Max latency = 65 sec - longest GC pause duration

Avg latency = 0.014 ms (14 microseconds)

Mean latency = 0.002 ms (2 microseconds)

99% latency < 0.084 ms

99.9% latency < 0.120 ms

99.99% latency < 4.3 ms

Number of cache items: 30M

Ehcache 2.4.2 (enterprise edition) + BigMemory

HotSpot options: java -server -XX:MaxDirectMemorySize=28G -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:NewSize=64M -XX:SurvivorRatio=16

RPS = 0.5 M - requests per second

Max latency = 0.5 sec - longest GC pause duration

Avg latency = 0.030 ms (30 microseconds)

Mean latency = 0.011 ms (11 microseconds)

99% latency < 0.064 ms

99.9% latency < 4.4 ms

99.99% latency < 13.5 ms

Infinispan 5.0 RC6

HotSpot options: java -server -Xms28G -Xmx28G -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC -XX:SurvivorRatio=16

Infinispan failed to produce any meaningful results. It ran very well until it reached the maximum number of cache items, then it got stuck. I killed the process when the average number of requests per second dropped below 100K.

Koda

HotSpot options: java -server

I allocated 28G of memory for the off-heap cache.

RPS = 3.5 M - requests per second

Max latency = 0.11 sec (110 ms) - longest GC pause duration

Avg latency = 0.0045 ms (4.5 microseconds)

Mean latency = 0.0035 ms (3.5 microseconds)

99% latency < 0.020 ms

99.9% latency < 0.043 ms

99.99% latency < 0.5 ms

Number of cache items: 42M

Infinispan 5.0 RC6 + Koda

HotSpot options: java -server

I allocated 28G of memory for the off-heap cache.

RPS = 2.65 M - requests per second

Max latency = 0.10 sec (100 ms) - longest GC pause duration

Avg latency = 0.0059 ms (5.9 microseconds)

Mean latency = 0.0047 ms (4.7 microseconds)

99% latency < 0.020 ms

99.9% latency < 0.051 ms

99.99% latency < 0.75 ms

Number of cache items: 42M

As you can see, even with serialization/deserialization of Java objects to and from off-heap memory, Koda is able to significantly outperform pure Java Ehcache as well as the Ehcache + BigMemory combination. Vanilla Infinispan 5 RC6 is not usable yet (unfortunately), but the combination of Infinispan and Koda beats Ehcache + BigMemory by a factor of about 5 in throughput (2.65M vs. 0.5M RPS), which is not bad, and it works. In my next post I will give a brief description of Koda's feature set. Stay tuned.
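For reference, the throughput ratios behind these claims follow directly from the RPS figures reported above. The small sketch below just does the arithmetic; the class and method names are hypothetical. It gives roughly 3.2x for Koda over Ehcache, 7x for Koda over Ehcache + BigMemory, and 5.3x for Infinispan + Koda over Ehcache + BigMemory.

```java
import java.util.Locale;

/** Hypothetical helper: ratios of the throughput numbers reported in this post. */
public class ThroughputRatios {

    /** How many times faster a is than b, given two throughput figures in the same unit. */
    public static double ratio(double a, double b) {
        return a / b;
    }

    public static void main(String[] args) {
        // Reported throughput, in millions of requests per second.
        double ehcache = 1.1;
        double ehcacheBigMemory = 0.5;
        double koda = 3.5;
        double infinispanKoda = 2.65;

        System.out.printf(Locale.ROOT, "Koda vs Ehcache: %.1fx%n", ratio(koda, ehcache));
        System.out.printf(Locale.ROOT, "Koda vs Ehcache + BigMemory: %.1fx%n",
                ratio(koda, ehcacheBigMemory));
        System.out.printf(Locale.ROOT, "Infinispan + Koda vs Ehcache + BigMemory: %.1fx%n",
                ratio(infinispanKoda, ehcacheBigMemory));
    }
}
```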

Sunday, July 3, 2011
