Kyoto Cabinet 1.0.0 was released!

ID: 7
creation date: 2010/05/25 03:03
modification date: 2010/05/25 03:03
owner: mikio

I released the first stable version of Kyoto Cabinet on 25 May 2010. This entry is an introduction to Kyoto Cabinet.

image:1:1274915061-kclayers.png

Features

Now that everything I planned has been implemented, Kyoto Cabinet has the following features. Windows support is especially notable.

  • time efficiency: Update throughput is more than 100 million queries per second.
  • space efficiency: The footprint of each record is 8-16 bytes in the hash DB and 2-4 bytes in the tree DB.
  • concurrency: The hash DB uses a read-write lock for each record. The tree DB uses a read-write lock for each page.
  • usability: Generic database operations are provided through an interface modeled on the "Visitor" pattern.
  • robustness: Manual transaction, auto transaction, and auto recovery are provided.
  • portability: UNIX-like systems (Linux, FreeBSD, Solaris, Mac OS X) and Windows (VC++) are supported.
  • language bindings: C++, C, Java, Python, Ruby, and Perl are supported.
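
To put the space-efficiency figures in perspective, here is a small back-of-the-envelope calculation in Python. It assumes that "footprint" means per-record overhead on top of the key and value bytes, and the record count is a hypothetical example, not a measurement.

```python
# Per-record footprint ranges quoted in the feature list, in bytes.
FOOTPRINT = {"hash DB": (8, 16), "tree DB": (2, 4)}


def overhead_bytes(records, kind):
    """Return the (low, high) overhead in bytes for `records` records."""
    low, high = FOOTPRINT[kind]
    return records * low, records * high


records = 10_000_000  # hypothetical data set size
for kind in FOOTPRINT:
    low, high = overhead_bytes(records, kind)
    print(f"{kind}: {low / 1e6:.0f}-{high / 1e6:.0f} MB of overhead")
```

Under those assumptions, ten million records cost roughly 80-160 MB of overhead in the hash DB and 20-40 MB in the tree DB.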

Compared with Tokyo Cabinet (TC), KC is superior in concurrency, usability, and portability. Although TC has better single-threaded time efficiency, I recommend KC from now on because multi-core and many-core CPUs have become common. However, I will keep maintaining TC and fixing any bugs that are found.

Language Bindings

I recommend Java first. I expect the typical use cases of KC to be in the backends of large Web services, and I think Java is the most popular language there apart from C++. Because concurrency is a major selling point of KC, runtime environments that support concurrency are preferred.

Python and Ruby come second. Although each has a GIL (global interpreter lock) that guards its API against race conditions, KC provides a "concurrent mode" that uses the interpreter's API to release the GIL temporarily while native functions are called.

Perl is also supported. However, the Perl binding is not thread-safe under ithreads (Perl's thread mechanism). Because ithreads are "green threads" implemented in user land, native threading primitives such as locking and thread-local storage do not work.

Getting Started

Download the latest version from the homepage. Then read the installation and tutorial sections. Choose your favorite language and write some sample code.

Basic Example

Although all of the following sample programs also appear in the documentation for each language, I place them here so that they can be compared side by side. All bindings conform to a common interface defined in IDL.

C++

#include <kcpolydb.h>

using namespace std;
using namespace kyotocabinet;

// main routine
int main(int argc, char** argv) {

  // create the database object
  PolyDB db;

  // open the database
  if (!db.open("casket.kch", PolyDB::OWRITER | PolyDB::OCREATE)) {
    cerr << "open error: " << db.error().name() << endl;
  }

  // store records
  if (!db.set("foo", "hop") ||
      !db.set("bar", "step") ||
      !db.set("baz", "jump")) {
    cerr << "set error: " << db.error().name() << endl;
  }

  // retrieve a record
  string* value = db.get("foo");
  if (value) {
    cout << *value << endl;
    delete value;
  } else {
    cerr << "get error: " << db.error().name() << endl;
  }

  // traverse records
  DB::Cursor* cur = db.cursor();
  cur->jump();
  pair<string, string>* rec;
  while ((rec = cur->get_pair(true)) != NULL) {
    cout << rec->first << ":" << rec->second << endl;
    delete rec;
  }
  delete cur;

  // close the database
  if (!db.close()) {
    cerr << "close error: " << db.error().name() << endl;
  }

  return 0;
}

Java

import kyotocabinet.*;

public class KCDBEX1 {
  public static void main(String[] args) {

    // create the object
    DB db = new DB();

    // open the database
    if (!db.open("casket.kch", DB.OWRITER | DB.OCREATE)){
      System.err.println("open error: " + db.error());
    }

    // store records
    if (!db.set("foo", "hop") ||
        !db.set("bar", "step") ||
        !db.set("baz", "jump")){
      System.err.println("set error: " + db.error());
    }

    // retrieve records
    String value = db.get("foo");
    if (value != null){
      System.out.println(value);
    } else {
      System.err.println("get error: " + db.error());
    }

    // traverse records
    Cursor cur = db.cursor();
    cur.jump();
    String[] rec;
    while ((rec = cur.get_str(true)) != null) {
      System.out.println(rec[0] + ":" + rec[1]);
    }
    cur.disable();

    // close the database
    if(!db.close()){
      System.err.println("close error: " + db.error());
    }

  }
}

Python

from kyotocabinet import *
import sys

# create the database object
db = DB()

# open the database
if not db.open("casket.kch", DB.OWRITER | DB.OCREATE):
    print("open error: " + str(db.error()), file=sys.stderr)

# store records
if not db.set("foo", "hop") or \
        not db.set("bar", "step") or \
        not db.set("baz", "jump"):
    print("set error: " + str(db.error()), file=sys.stderr)

# retrieve records
value = db.get_str("foo")
if value:
    print(value)
else:
    print("get error: " + str(db.error()), file=sys.stderr)

# traverse records
cur = db.cursor()
cur.jump()
while True:
    rec = cur.get_str(True)
    if not rec: break
    print(rec[0] + ":" + rec[1])
cur.disable()

# close the database
if not db.close():
    print("close error: " + str(db.error()), file=sys.stderr)

Ruby

require 'kyotocabinet'
include KyotoCabinet

# create the database object
db = DB::new

# open the database
unless db.open('casket.kch', DB::OWRITER | DB::OCREATE)
  STDERR.printf("open error: %s\n", db.error)
end

# store records
unless db.set('foo', 'hop') and
    db.set('bar', 'step') and
    db.set('baz', 'jump')
  STDERR.printf("set error: %s\n", db.error)
end

# retrieve records
value = db.get('foo')
if value
  printf("%s\n", value)
else
  STDERR.printf("get error: %s\n", db.error)
end

# traverse records
cur = db.cursor
cur.jump
while rec = cur.get(true)
  printf("%s:%s\n", rec[0], rec[1])
end
cur.disable

# close the database
unless db.close
  STDERR.printf("close error: %s\n", db.error)
end

Perl

use KyotoCabinet;

# create the database object
my $db = new KyotoCabinet::DB;

# open the database
if (!$db->open('casket.kch', $db->OWRITER | $db->OCREATE)) {
    printf STDERR ("open error: %s\n", $db->error);
}

# store records
if (!$db->set('foo', 'hop') ||
    !$db->set('bar', 'step') ||
    !$db->set('baz', 'jump')) {
    printf STDERR ("set error: %s\n", $db->error);
}

# retrieve records
my $value = $db->get('foo');
if (defined($value)) {
    printf("%s\n", $value);
} else {
    printf STDERR ("get error: %s\n", $db->error);
}

# traverse records
my $cur = $db->cursor;
$cur->jump;
while (my ($key, $value) = $cur->get(1)) {
    printf("%s:%s\n", $key, $value);
}
$cur->disable;

# close the database
if (!$db->close) {
    printf STDERR ("close error: %s\n", $db->error);
}

Visitor Pattern

All database classes of KC have methods for operating on records as in an associative array; "set", "remove", and "get" are typical. More complex methods such as "increment" and "cas" are also provided by default. However, you may want to define your own operations, and the "visitor" pattern suits that purpose. Define a visitor object and pass it to the "accept" method; the callback method defined by the visitor is then executed atomically on the record's data.
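
The mechanism can be sketched without the library. The following toy model is not Kyoto Cabinet code: `ToyDB`, its dict storage, and the `NOP` and `REMOVE` sentinels are stand-ins invented for illustration. They loosely mirror the `visit_full` and `visit_empty` callbacks used in the real examples in this entry.

```python
import threading

NOP = object()     # sentinel: leave the record unchanged
REMOVE = object()  # sentinel: delete the record


class ToyDB:
    """In-memory stand-in for a database with a visitor-style accept()."""

    def __init__(self):
        self._records = {}
        self._lock = threading.Lock()  # one lock for the whole toy store

    def accept(self, key, visitor):
        # The callback runs while the lock is held, so the read, the
        # decision, and the write form one atomic step.
        with self._lock:
            if key in self._records:
                result = visitor.visit_full(key, self._records[key])
            else:
                result = visitor.visit_empty(key)
            if result is REMOVE:
                self._records.pop(key, None)
            elif result is not NOP:
                self._records[key] = result

    def get(self, key):
        with self._lock:
            return self._records.get(key)


class AppendVisitor:
    """User-defined operation: append a suffix, or create the record."""

    def __init__(self, suffix):
        self.suffix = suffix

    def visit_full(self, key, value):
        return value + self.suffix

    def visit_empty(self, key):
        return self.suffix


db = ToyDB()
db.accept("foo", AppendVisitor("hop"))    # record is missing: created
db.accept("foo", AppendVisitor(",step"))  # record exists: extended
print(db.get("foo"))
```

The visitor's return value decides what happens to the record, so a single `accept` call can express "append", "create if missing", or "delete" without any extra method on the database class.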

Atomicity is a key feature in a multi-threaded environment. While one thread is incrementing a record value through several method calls such as "get" and "set", another thread may update the same record at the same time. In that case, the earlier update is silently overwritten. KC solves the problem with the "accept" method, which is guarded by record locking.
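
The lost update is easy to reproduce in miniature. The sketch below uses a hypothetical in-memory class, `ToyCounterDB`, rather than the Kyoto Cabinet API, and it guards the whole store with a single lock where KC locks per record. The first half replays the bad interleaving step by step; the second half runs real threads through a locked `accept`.

```python
import threading


class ToyCounterDB:
    """In-memory stand-in; not the Kyoto Cabinet API."""

    def __init__(self):
        self._records = {"hits": 0}
        self._lock = threading.Lock()

    def get(self, key):
        return self._records[key]

    def set(self, key, value):
        self._records[key] = value

    def accept(self, key, func):
        # Read, compute, and write back while holding the lock.
        with self._lock:
            self._records[key] = func(self._records[key])


# Lost update, replayed step by step: both clients read before either writes.
db = ToyCounterDB()
seen_by_a = db.get("hits")     # A reads 0
seen_by_b = db.get("hits")     # B reads 0
db.set("hits", seen_by_a + 1)  # A writes 1
db.set("hits", seen_by_b + 1)  # B writes 1, overwriting A's update
lost = db.get("hits")

# The same kind of work through accept(): each increment is one atomic step.
db = ToyCounterDB()


def worker():
    for _ in range(1000):
        db.accept("hits", lambda value: value + 1)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
safe = db.get("hits")

print(lost, safe)  # prints "1 4000"
```

Two increments through "get" and "set" leave the counter at 1, while 4000 increments through the locked `accept` are all counted.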

The following are examples in Java and Ruby. Other language bindings also support the visitor pattern.

Java

import kyotocabinet.*;

public class KCDBEX2 {
  public static void main(String[] args) {

    // create the object
    DB db = new DB();

    // open the database
    if (!db.open("casket.kch", DB.OREADER)) {
      System.err.println("open error: " + db.error());
    }

    // define the visitor
    class VisitorImpl implements Visitor {
      public byte[] visit_full(byte[] key, byte[] value) {
        System.out.println(new String(key) + ":" + new String(value));
        return NOP;
      }
      public byte[] visit_empty(byte[] key) {
        System.err.println(new String(key) + " is missing");
        return NOP;
      }
    }
    Visitor visitor = new VisitorImpl();

    // retrieve a record with visitor
    if (!db.accept("foo".getBytes(), visitor, false) ||
        !db.accept("dummy".getBytes(), visitor, false)) {
      System.err.println("accept error: " + db.error());
    }

    // traverse records with visitor
    if (!db.iterate(visitor, false)) {
      System.err.println("iterate error: " + db.error());
    }

    // close the database
    if(!db.close()){
      System.err.println("close error: " + db.error());
    }

  }
}

Ruby

require 'kyotocabinet'
include KyotoCabinet

# create the database object
db = DB::new

# open the database
unless db.open('casket.kch', DB::OREADER)
  STDERR.printf("open error: %s\n", db.error)
end

# define the visitor
class VisitorImpl < Visitor
  # call back function for an existing record
  def visit_full(key, value)
    printf("%s:%s\n", key, value)
    return NOP
  end
  # call back function for an empty record space
  def visit_empty(key)
    STDERR.printf("%s is missing\n", key)
    return NOP
  end
end
visitor = VisitorImpl::new

# retrieve a record with visitor
unless db.accept("foo", visitor, false) and
    db.accept("dummy", visitor, false)
  STDERR.printf("accept error: %s\n", db.error)
end

# traverse records with visitor
unless db.iterate(visitor, false)
  STDERR.printf("iterate error: %s\n", db.error)
end

# close the database
unless db.close
  STDERR.printf("close error: %s\n", db.error)
end

Popular scripting languages provide "closure" mechanisms, and you can use a closure as a visitor instead of an object derived through class inheritance. The following is an example in Python.

from kyotocabinet import *
import sys

# define the functor
def dbproc(db):

  # store records
  db[b'foo'] = b'step'  # bytes is fundamental
  db['bar'] = 'hop'     # string is also ok
  db[3] = 'jump'        # number is also ok

  # retrieve a record value
  print("{}".format(db['foo'].decode()))

  # update records in transaction
  def tranproc():
      db['foo'] = 2.71828
      return True
  db.transaction(tranproc)

  # multiply a record value
  def mulproc(key, value):
      return float(value) * 2
  db.accept('foo', mulproc)

  # traverse records by iterator
  for key in db:
      print("{}:{}".format(key.decode(), db[key].decode()))

  # upcase values by iterator
  def upproc(key, value):
      return value.upper()
  db.iterate(upproc)

  # traverse records by cursor
  def curproc(cur):
      cur.jump()
      def printproc(key, value):
          print("{}:{}".format(key.decode(), value.decode()))
          return Visitor.NOP
      while cur.accept(printproc):
          cur.step()
  db.cursor_process(curproc)

# process the database by the functor
DB.process(dbproc, 'casket.kch')

Conclusion

Kyoto Cabinet is a powerful tool for operating a persistent associative array or key-value store. Its time and space efficiency are excellent. KC is easy to use from most popular languages, and the visitor pattern makes it extremely flexible. Please try it; you will probably take to it.

27 comments
ahfu : It is so good to release KyotoCabinet 1.0. Congratunations! By the way, I send mail to you many times to discuss tt/tc, but you never reply me.Whatever, I wish I can get a reply from you, my mail is ahfuzhang@gmail.com,thanks! (2010/05/25 20:15)
dpavlin : Thanks for your great work, I just wonder do you have any estimate when Kyoto Tyrant (networking) and Kyto Dystopia (full-text search) can be expected? (2010/05/25 21:59)
mikio : ahfu: I sent mail to you. (2010/05/26 06:45)
mikio : dpavlin: I'll undertake a project like Kyoto Tyrant in this year but I can't say the delivery time clearly. (2010/05/26 06:48)
mark : I am very happy with TC/TT and will keep using, but it is very exciting to see the release of KC! I will investigate when I get some spare time. (2010/05/26 11:53)
phil : Congratulations! I have been a (silent) and long time follower and fan of your work, from the time of hyperestraier. Now the fact that all of Kyoto Cabinet is GPL3 makes it hard to reuse in another non-GPL open source project. Would you consider for instance to make the clients bindings available under a non-GPL license such as LGPL or BSD, so that Kyoto Cabinet can also be used there? Mikio, thank you for your kind consideration on this request! (2010/05/26 20:52)
phil : just to add that I think that the GPL only license of KC may be a deterrent for folks like me that use Tokyo Cabinet to switch from Tokyo Cabinet to Kyoto Cabinet . Thanks again for your kind consideration of this request. (2010/05/26 20:55)
rj : I use tokyocabinet for my upcoming project. Kyoto cabinate seem to be interesting but it is under GPL which means we can't package and redistribute it. What there any reason to change from LGPL to GPL? (2010/05/27 01:04)
mikio : I'll provide GPL linking exception. But, the concrete terms are not determined yet. I want combine the two of usability as OSS and feeding myself by commercial license. (2010/05/27 07:41)
rj : Infact we wanted to use as XDMS server and get a commercial support from you. We don't use any open source project unless it is commercially supported because our customers are world's largest telco operators. I will wait to see more clarification on licensing term. Also it would be easy if you had some company website which give more information on commercial support. (2010/05/27 08:11)
neithere : Great news! Congrats. Would you please upload the Python bindings to PyPI so they could be easily installed? For now related software cannot officially depend on them. I'd like to make a Kyoto backend for PyModels. (2010/05/27 15:19)
mikio : As for now, KC supports Python 3 only. After I write the Python2 binding, I'll do it. (2010/05/28 21:11)
trax100 : Wow :) Good news. (2010/05/31 15:18)
angus : It's gret! Keeping walking!! (2010/06/22 20:29)
wburdick : Our small company is interested in using kyoto cabinet with our proprietary code. Please contact me at bill dot burdick at gmail dot com. (2010/06/29 00:37)
johan : Hi Mikio, I think Tokyo Cabinet have something called narrowing conditions, will there be something like that in Kyoto Cabinet in the near future? (2010/07/18 05:20)
eric : I'd like to consider Kyoto Cabinet for some of my projects, but I have a static linking requirement. Please let us know your plans for a commercial license. (2010/07/21 14:25)
eric : I'd like to consider Kyoto Cabinet for some of my projects, but I have a static linking requirement. Please let us know your plans for a commercial license. (2010/07/21 14:48)
eric : What type of model are you thinking of for commercial licensing? (subscription, per developer, royalty, something else) (2010/07/23 13:33)
mikio : "per developer" is simple and preferable for us. (2010/07/23 17:47)
kurokikaze : Whoa, this looks promising. Looking forward for Kyoto Tyrant. (2010/07/27 22:35)
eric : Is there anyway to do forward matching on keys like in Tokyo Cabinet? I've been digging through the API and can't find a similar feature. (2010/08/17 10:59)
mikio : No. But, you can define such a function easily by using a cursor. (2010/08/17 11:43)
jdictos : Any more information on the GPL linking exception? (2010/08/27 04:38)
mikio : Now, I'm consulting a lawyer to provide the license. I'll blog about the detail information in a few weeks. (2010/08/28 20:48)
jdictos : Can you create a kyoto cabinet database file on linux, then read it on windows? Is the file format the same between platforms? (2010/08/30 03:00)
mikio : Yes. The database format is portable between Linux and Windows, and between big endian and little endian. (2010/08/30 08:49)