While I was looking for a good, simple key-value store for my current project, friends recommended KyotoCabinet. A bit of further research showed that for our application (AIDA, source available), which does lots of random reads, KyotoCabinet outperforms the other solutions I considered on random-read performance.
Getting KyotoCabinet on my Mac was straightforward. I’m using MacPorts, and a simple
port install kyotocabinet
did the trick.
However, AIDA is written in Java, which complicated things a bit. I downloaded the latest Java driver, but configure ran into two problems. First, it did not find the MacPorts KyotoCabinet installation (which makes sense and is easily remedied). Second, it could not find the JNI header, jni.h. My solution was as follows:
tar xzvf kyotocabinet-java-1.24.tar.gz
cd kyotocabinet-java-1.24
CPPFLAGS="-I/System/Library/Frameworks/JavaVM.framework/Versions/A/Headers" ./configure --with-kc=/opt/local/
If you are using Homebrew or installed KyotoCabinet directly from source, don’t add the --with-kc=… parameter to configure.
Still, make did not find jni.h because it was looking in the wrong places. There is probably a less hacky solution than this, but I simply appended an include flag (-I) for the directory containing jni.h
to the CPPFLAGS parameter in the Makefile that configure created. Now make will work, and you can install as usual:
make
(sudo) make install
Integrating KyotoCabinet in Eclipse
I’m working in Eclipse, and the Java driver of KyotoCabinet needs to be able to find its native library (for example via LD_LIBRARY_PATH). Eclipse offers a convenient solution here. In the project build path, add kyotocabinet.jar (now located in /usr/local/lib). Next, click the triangle to the left of the newly added jar and set the “Native library location” to the same path, /usr/local/lib. This should do the trick: you can now create a kyotocabinet.DB object and work with it.
You can still run the Java application from the command line by setting java.library.path with the -D parameter (append the name of your main class to the command):
java -cp .:kyotocabinet.jar -Djava.library.path=.:/usr/local/lib
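If the driver still fails to load its native code, it can help to check what the JVM actually sees. The following is a minimal diagnostic sketch, not part of KyotoCabinet: the class name NativeLibCheck and the method findLibrary are made up for illustration, and the library name "jkyotocabinet" is an assumption about what the Java driver installs (check the file names in /usr/local/lib if in doubt).

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

public class NativeLibCheck {

    // Returns the first directory entry in pathList (entries separated by
    // File.pathSeparator) that contains the platform-specific file for
    // libName, or an empty Optional if no entry contains it.
    static Optional<Path> findLibrary(String libName, String pathList) {
        // e.g. "foo" becomes "libfoo.so" on Linux; the suffix differs per platform
        String fileName = System.mapLibraryName(libName);
        for (String dir : pathList.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, fileName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String libPath = System.getProperty("java.library.path", "");
        System.out.println("java.library.path = " + libPath);

        // Assumption: the driver's native library is named "jkyotocabinet".
        Optional<Path> hit = findLibrary("jkyotocabinet", libPath);
        System.out.println(hit.map(p -> "found " + p)
                .orElse("native library not found on java.library.path"));
    }
}
```

Run it with the same -Djava.library.path value as your application; if it reports that nothing was found, the path (or the assumed library name) is the first thing to check.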
Setting the Database Type
KyotoCabinet offers two different implementations of the actual storage, and they behave differently. One is a hash-backed storage, the other a B+ tree storage. For lots of random access the hash-based one is preferable, but which one to choose really depends on the needs of your application. I wanted a hash, so the big question was how to tell the DB object to actually create a hash-based DB and not a tree-based one. After searching the Web unsuccessfully for a while, I finally found a blog post detailing this. It all depends on the file extension of the filename you pass when opening the DB (the path goes to the DB object's open method rather than to the constructor). Using .kch as the extension creates a hash-based DB, using .kct a tree-based one. You can even make KyotoCabinet completely in-memory by using ‘+’ or ‘-’ as the filename: minus creates a hash-based DB, plus a tree-based one.
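The naming convention can be captured in a small helper. This is only a sketch: the class KcPath, the enum Storage and the method pathFor are hypothetical names invented here, not part of the KyotoCabinet API. The actual driver calls are shown as comments, so the helper compiles without kyotocabinet.jar; uncomment them once the jar and the native library are set up.

```java
public class KcPath {

    // Hypothetical names for the four variants described in the text.
    enum Storage { FILE_HASH, FILE_TREE, MEMORY_HASH, MEMORY_TREE }

    // Maps a storage variant to the filename KyotoCabinet expects:
    // ".kch" for a file hash DB, ".kct" for a file tree DB,
    // "-" for an in-memory hash DB, "+" for an in-memory tree DB.
    static String pathFor(Storage storage, String baseName) {
        switch (storage) {
            case FILE_HASH:   return baseName + ".kch";
            case FILE_TREE:   return baseName + ".kct";
            case MEMORY_HASH: return "-";
            case MEMORY_TREE: return "+";
            default: throw new IllegalArgumentException("unknown storage: " + storage);
        }
    }

    public static void main(String[] args) {
        // "entities" is just an example base name.
        String path = pathFor(Storage.FILE_HASH, "entities");
        System.out.println(path); // prints "entities.kch"

        // With kyotocabinet.jar on the classpath, the path is handed to open():
        //
        // kyotocabinet.DB db = new kyotocabinet.DB();
        // if (!db.open(path, kyotocabinet.DB.OWRITER | kyotocabinet.DB.OCREATE)) {
        //     System.err.println("open error: " + db.error());
        // }
        // db.set("key", "value");
        // String value = db.get("key");
        // db.close();
    }
}
```

Keeping the suffix choice in one place makes it easy to switch between the hash and tree variants later, or to use the in-memory variants in tests.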
I won’t give any concrete examples, but I mainly intend to use KyotoCabinet together with Google’s protobuf. Protocol buffers let you specify nested data structures that serialize nicely into a byte array. The serialized bytes can then be stored as a value in KyotoCabinet. This should make for a really nice way to store more complex, nested data that does not play well with relational databases.