Exception in Python-C extension

ID: 8
creation date: 2010/06/07 10:45
modification date: 2010/06/07 10:45
owner: mikio

I'm improving the Python binding of Kyoto Cabinet. I hit the wall as to its implementation, so I ask a favor of Python/C hackers.

Background

By default, each method of the Python binding does not report its error by raising exception. So, you should check the error code every time.

db = DB()
if db.open("casket", DB.OWRITER):
    if not db.put("japan", "tokyo"):
        print("put error:" + str(db.error()))
    if not db.close():
        print("close error:" + str(db.error()))
else:
    print("open error:" + str(db.error()))

The default behavior has a purpose to comply with the common language interface. However, I know that many users prefer the style using exception mechanism for error reporting. So, I want to add an option mode for exception reporting errors of the Python binding. Like this.

db = DB(DB.EXCEPTIONAL)   # enable the exceptional mode
try:
    db.open("casket", DB.OWRITER):
    db.put("japan", "tokyo")
except Error as e:
    print("error:" + str(e))
finally:
    try:
        db.close()
    except Error as e:
        print("error:" + str(e))

For that purpose, I think I should define the "Error" class as a child of the built-in "Exception" or "RuntimeError" or something. I wrote a prototype implementation to do it (see the C code and the package).

Problem

I found there's a memory leak problem in the code. The following test displays that the memory usage (resident size shown by "top") is growing unboundedly.

for i in range(1, 10000000000):
    try:
        print("try: " + str(i))
        sys.stdout.flush()
        raise Error("1: myerror")
    except RuntimeError as e:
        print("catch: " + str(e))
        sys.stdout.flush()

I think that I made some mistakes as to the way of inheritance to the built-in exception class. The current definition of the Error class is implemented as the following.

static bool define_err() {
  static PyTypeObject type_err = { PyVarObject_HEAD_INIT(NULL, 0) };
  size_t zoff = offsetof(PyTypeObject, tp_name);
  std::memset((char*)&type_err + zoff, 0, sizeof(type_err) - zoff);
  type_err.tp_name = "kyotocabinet.Error";
  size_t psiz = ((PyTypeObject*)PyExc_RuntimeError)->tp_basicsize;
  size_t rem = psiz % sizeof(void*);
  if (rem > 0) psiz += sizeof(void*) - rem;
  type_err.tp_basicsize = psiz + sizeof(Error_data);
  type_err.tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE;
  type_err.tp_doc = "Error data.";
  type_err.tp_new = err_new;
  type_err.tp_dealloc = (destructor)err_dealloc;
  type_err.tp_init = (initproc)err_init;
  type_err.tp_repr = (unaryfunc)err_repr;
  type_err.tp_str = (unaryfunc)err_str;
  type_err.tp_richcompare = (richcmpfunc)err_richcmp;
  static PyMethodDef err_methods[] = {
    ...
    { NULL, NULL, 0, NULL }
  };
  type_err.tp_methods = err_methods;
  type_err.tp_base = (PyTypeObject*)PyExc_RuntimeError;
  if (PyType_Ready(&type_err) != 0) return false;
  cls_err = (PyObject*)&type_err;
  ...
  Py_INCREF(cls_err);
  if (PyModule_AddObject(mod_kc, "Error", cls_err) != 0) return false;
  return true;
}

static PyObject* err_new(PyTypeObject* pytype, PyObject* pyargs, PyObject* pykwds) {
  Error_data* data = (Error_data*)pytype->tp_alloc(pytype, 0);
  if (!data) return NULL;
  Py_INCREF(Py_None);
  data->pycode = Py_None;
  Py_INCREF(Py_None);
  data->pymessage = Py_None;
  return (PyObject*)data;
}

static void err_dealloc(Error_data* data) {
  Py_DECREF(data->pymessage);
  Py_DECREF(data->pycode);
  Py_TYPE(data)->tp_free((PyObject*)data);
}

I'm not assured about several points. First, I wondered the assignment to the "tp_basicsize" member is correct or not. Now it is calculated by adding the size of the parent to the size of the self object.

  size_t psiz = ((PyTypeObject*)PyExc_RuntimeError)->tp_basicsize;
  size_t rem = psiz % sizeof(void*);
  if (rem > 0) psiz += sizeof(void*) - rem;
  type_err.tp_basicsize = psiz + sizeof(Error_data);

Second, I wondered the assignment of "PyExc_RuntimeError" to the "tp_base" member is a decent way to declare inheritance.

  type_err.tp_base = (PyTypeObject*)PyExc_RuntimeError;

Third, now the "err_init" function assigned to the "tp_init" member does not explicitly call the initializer function of the parent class. I don't know how to do it and whether to do so.

Help

If you find some flows in my implementation or my design, please tell me as a comment or via e-mail to hirarin@gmail.com. Any suggestion is appreciated. Thanks.

7 comments
mikio : I got an advice to read Modules/xxsubtype.c in the CPython sourcecode. Thanks. (2010/06/09 10:23)
mikio : Reading the source, I found some points. The Error_data structure should have the PyException_HEAD macro as the base members. tp_size should simply be sizeof(Error_data). the err_dealloc function should clear the reference counts of the base members manually. (2010/06/09 16:30)
mikio : As the result, there's no memory leak or corruption. So, I'll release the next version in a few days. (2010/06/09 16:32)
riceball : what's time to add the page lock to tcb(tokyocabinet)? (2010/07/09 16:52)
zdavatz : excellent documentation! Thank you! (2010/07/14 23:44)
lekma : why didn't you use PyErr_NewException (http://docs.python.org/c-api/exceptions.html#PyErr_NewException) (2010/07/28 19:13)
mikio : The kyotocabinet.Error class had already been implemented then. I wanted to change the class hierarchy only. (2010/07/28 19:54)
riddle for guest comment authorization:
Where is the capital city of Japan? ...