LevelDB on Kyoto Tycoon

ID: 30
creation date: 2011/06/10 02:41
modification date: 2011/06/10 02:41
owner: fallabs

Google has released LevelDB, which is a fast persistent key-value store. Although I think that typical use cases of Kyoto Cabinet and LevelDB are different, I have no doubt that LevelDB is promising and useful.

BTW, Kyoto Tycoon can embed arbitrary DBM-style databases through its plug-in mechanism. In this article, I explain how to use LevelDB on Kyoto Tycoon.

Installation of LevelDB

As preparation, you have to install LevelDB. You can check out the source code by using SVN.

$ svn checkout http://leveldb.googlecode.com/svn/trunk/ leveldb-read-only

For now, you have to modify the Makefile before building LevelDB in order to make the object code "position-independent", which is required for linking it into a shared library. Add the "-fPIC" option to the value of the CFLAGS macro.

-CFLAGS = -c -I. -I./include $(PLATFORM_CFLAGS) $(OPT)
+CFLAGS = -c -I. -I./include $(PLATFORM_CFLAGS) $(OPT) -fPIC

Moreover, if you use Ubuntu 11.04, you have to modify "port/port_posix.h" so that it includes the modern header file.

-#include <cstdatomic>
+#include <atomic>

Then, build the library and install the static library and the header files.

$ make
$ sudo cp libleveldb.a /usr/local/lib
$ sudo cp -r include/leveldb /usr/local/include

Build and Use the Pluggable Database

When you download the latest version of Kyoto Tycoon and expand the archive file, you will see the directory "lab/leveldb". Let's compile the source file there after building and installing Kyoto Tycoon itself.

$ cd kyototycoon-x.y.z
$ ./configure
$ make
$ sudo make install
$ cd lab/leveldb
$ make
$ sudo cp ktplugdblevel.so /usr/local/lib

Run the KT server with the LevelDB pluggable database.

$ ktserver -pldb /usr/local/lib/ktplugdblevel.so casket.ldb

Set some records from another terminal.

$ ktremotemgr set japan tokyo
$ ktremotemgr set korea seoul
$ ktremotemgr set china beijing

Check the records.

$ ktremotemgr list -pv

That's all. You can use LevelDB over the network. You can combine LevelDB with almost all features of Kyoto Tycoon, including HTTP RESTful commands, the memcached pluggable server, asynchronous replication, hot backup, the Lua extension, and so forth.

Implementation

Let's see the source code of the pluggable database. I quote the core implementation.

class LevelDB : public kt::PluggableDB {
  ...
  bool accept_impl(const char* kbuf, size_t ksiz, Visitor* visitor, bool writable) {
    size_t lidx = kc::hashmurmur(kbuf, ksiz) % RLOCKSLOT;
    if (writable) {
      rlock_.lock_writer(lidx);
    } else {
      rlock_.lock_reader(lidx);
    }
    std::string key(kbuf, ksiz);
    std::string value;
    lv::Status status = db_->Get(lv::ReadOptions(), key, &value);
    const char* rbuf;
    size_t rsiz;
    if (status.ok()) {
      rbuf = visitor->visit_full(kbuf, ksiz, value.data(), value.size(), &rsiz);
    } else {
      rbuf = visitor->visit_empty(kbuf, ksiz, &rsiz);
    }
    bool err = false;
    if (rbuf == kc::BasicDB::Visitor::REMOVE) {
      lv::WriteOptions wopts;
      if (autosync_) wopts.sync = true;
      status = db_->Delete(wopts, key);
      if (!status.ok()) {
        set_error(_KCCODELINE_, Error::SYSTEM, "DB::Delete failed");
        err = true;
      }
    } else if (rbuf != kc::BasicDB::Visitor::NOP) {
      lv::WriteOptions wopts;
      if (autosync_) wopts.sync = true;
      std::string rvalue(rbuf, rsiz);
      status = db_->Put(wopts, key, rvalue);
      if (!status.ok()) {
        set_error(_KCCODELINE_, Error::SYSTEM, "DB::Put failed");
        err = true;
      }
    }
    rlock_.unlock(lidx);
    return !err;
  }
  ...
};

As you see, this implementation is not so efficient and it doesn't bring out the genuine performance of LevelDB. Every pluggable database must implement the "accept" interface, which supports arbitrary callback functions. So, even when the "set" method is called by the client, the existing record is retrieved first and the result is just discarded. Moreover, external synchronization is performed to guard the callback function. I'm thinking of a more efficient interface by specializing each public method.

Conclusion

Anyway, I think that now you know how to use LevelDB as a pluggable database. If you are interested in LevelDB and other DBM-style implementations, it can be a good idea to start by writing a pluggable database for Kyoto Tycoon.

25 comments
aris : can I use levelDB using KT polyDB? (2011/07/20 08:37)
fallabs : Yes. This article explains how to do that. KC's PolyDB is KT's TimedDB. (2011/07/20 23:08)
aris : Is this mean if I willl use LevelDB in KC's PolyDB, then I must implement (wrap) LevelDB API myself? (2011/07/21 06:36)
fallabs : You want to use LevelDB on KT, don't you? If so, you don't have to write your own wrapper. Please use my wrapper for LevelDB on KT. If you want to use LevelDB via PolyDB of KC, you have to write your own wrapper implementing BasicDB. (2011/07/21 08:50)
aris : I want to use LevelDB via (embedded) PolyDB of KC, so yes, I need to write my own wrapper. (2011/07/21 13:31)
fallabs : I see. In that case, you will register your wrapper object by calling PolyDB::set_internal_db. (2011/07/21 13:58)
maxpert : Can you give me some brief answer why Google used TreeDB and not HashDB for benchmark comparison; isn't HashDB faster than TreeDB (and as far as I know I can iterate too) (2011/07/30 04:20)
aris : Hi maxpert, in my opinion, it's not about who is faster than, HashDB or TreeDB. It's depend on our needs. TreeDB will be faster for sequential op (eg: sorting, incremental insert) and HashDB will be (may) faster if we use random operation. (2011/08/01 10:06)
fallabs : maxpert: I guess the reason that LevelDB provides ordered access and TreeDB does too but HashDB doesn't. Although HashDB is faster in many cases, it cannot replace TreeDB and something like those ordered structures. (2011/08/01 11:05)
fallabs : aris: I totally agree with you. thanks. (2011/08/01 11:07)
RNZ : excelent use case! (2011/08/01 22:45)
maxpert : I did my own benchmarks here is a totally different result http://maxpert.tumblr.com/post/8330476086/leveldb-vs-kyoto-cabinet-my-findings (2011/08/01 23:30)
fallabs : maxpert: thanks. It's encouraging to me. (2011/08/02 00:07)
fallabs : As we all know, the result of any benchmark test depends on its access pattern and data set. It's no use quarreling about which product is superior. The only matter is whether each product reaches the criteria within each use case. (2011/08/02 00:12)
maxpert : I am not creating quarrel at all I am just adding it as contribution to world and you are right about the the criteria thing; but here is the deal, Basho is adopting LevelDB (atleast heard so) instead of Innostore. Don't you think its time for you to write a distribution system on top of KT (may be create new project); I am pretty sure KC can out perform many of distributed KV stores out there. (2011/08/02 02:51)
RNZ : excelent use case! (2011/08/02 03:15)
fallabs : maxpert: sorry, it's not for you. I just worried about impetuous tendency of some people in the tech industry. (2011/08/02 10:14)
fallabs : Writing distributed storage is interesting to me too. However, I don't have enough time to do so. I have to learn English more in order to survive in my current company. (2011/08/02 10:19)
tim : When morons cannot understand complex and/or abstract concepts they complain that the English language is not employed correctly in order for them to grasp whatever it's important. Your work is excellent, while your English is okay. A lot of morons don't understand Shakespeare either, while they complain about his English, too... The art of surviving in big companies is to avoid morons, and to be careful not to label them accordingly.... I'd also seen Linus Torvalds telling a bunch of Google software engineers that they are morons and that he didn't trust at all their programming.... (2011/08/15 00:34)
fallabs : Hmm, although I don't know the details, swearing at people and things wastes a lot of time and mind. Take it easy and have fun to write programs. (2011/08/16 23:44)
stanley : how can one intergrate the LuxIO storage into Kyoto Tycoon http://luxio.sourceforge.net/ ? (2011/09/09 20:29)
fallabs : Of course, you can integrate LuxIO by the same way for LevelDB. However, you have to consider mutual exclusion control by yourself if LuxIO is not thread-safe. (2011/09/12 18:06)
jms : Hi fallabs, what is your suggestion for a fast, read-only, not-ordered, persistent database? Size of db on-disk should preferably be minimal (because then it's more probable to be cached by the OS). I am considering both tc (b+tree) and leveldb. (2012/01/15 11:08)
bir : Have you reviewed Fractal Tree indexes (TokuDB) ? (2012/06/29 06:34)
saurabh : I just would like to understand the difference in performance of KyotoTycoon with KyotoCabinet vs LevelDB. Have you compared that ? (2014/09/01 21:52)