Extension Modules

Have you ever needed a part of your very high level code to run faster? What some developers don't realize is that you don't need to write an entire application in a single language. The overall framework of a large application tends to take a considerable amount of work to write, yet it requires the least CPU time. It's often all those nasty little features that require a lot of processing power. The good news is that you can write the program in a scripting language and implement some of the heavier features in a more efficient language.

There are a few advantages to this. Development time is much shorter. There tend to be far fewer bugs. You can get a prototype working quickly, and it will be easy to change. You only have to work in a lower level language for critical functions. It's usually not too difficult to write a single function in, say, C. Some other advantages of wrapping low level modules in a high level language are:

  • low level machine access in a high level language
  • code reuse of the low level module
  • difficult code can be made easily accessible to others
  • access to features and libraries not supported by the scripting language
  • learning the scripting language in depth

I want to cover a quick comparison of Ruby and Python extension modules. This isn't a tutorial; I'm only going over a few basic points. There are Python and Ruby tutorials out there.

In Python, you start by including Python.h and writing an initialization function, PyMODINIT_FUNC initastman(void). The name must start with 'init', and the rest of the name is the name of the module. In Ruby, you include ruby.h and write a similar initialization function, void Init_rb_astman(). In this case, the function starts with 'Init_' and the remaining text is the module name. Be careful; case matters in Ruby. The 'I' in 'Init_' is uppercase, and the name of any class or module you define from C must begin with a capital letter.

It gets more interesting after that. Ruby extension modules tend to look just like Ruby code, whereas Python extension modules have a bit of overhead. For example, you have to tell Python about your module-level functions and your class's methods.

static PyMethodDef astman_mod_methods[] = {
    {"astman",astman_new,METH_VARARGS,"New astman."},
    {NULL,NULL,0,NULL}
};
static PyMethodDef astman_methods[] = {
    {"connect",connect,METH_VARARGS,"Connect to Asterisk Manager."},
    {"login",login,METH_VARARGS,"Login to Asterisk Manager."},
    {"logoff",logoff,METH_VARARGS,"Logoff of Asterisk Manager."},
    {"close",_close,METH_VARARGS,"Close the connection."},
    {"list",list,METH_VARARGS,"List Manager commands."},
    {"command",command,METH_VARARGS,"Execute Asterisk command."},
    {NULL,NULL,0,NULL}
};

You also need some special structures. Often when wrapping some code or API, there is data that needs to persist between function calls. The first structure below is a wrapper for your variables. Notice there is nothing in it except for the required PyObject_HEAD and the struct from my code. I could have put anything I wanted in there, but I didn't need to because I already organized my data neatly into conn.

The next structure tells Python about all the special functions a class can implement, such as __str__, __call__, a destructor, etc. You'll have to implement __getattr__, which should at a minimum look up the function in the astman_methods array above. Fortunately, I only had to implement a few functions, so it's not too much work if you're just doing something simple.

typedef struct {
    PyObject_HEAD
    struct ast_connection *conn;
} astman;

PyTypeObject astman_Type = {
  PyObject_HEAD_INIT(&PyType_Type)
  0,
  "astman",                 /* char *tp_name; */
  sizeof(astman),           /* int tp_basicsize; */
  0,                        /* int tp_itemsize;        not used much */
  astman_dealloc,           /* destructor tp_dealloc; */
  0,                        /* printfunc  tp_print;   */
  astman_getattr,           /* getattrfunc  tp_getattr;  __getattr__ */
  0,                        /* setattrfunc  tp_setattr;   __setattr__ */
  0,                        /* cmpfunc  tp_compare;   __cmp__ */
  0,                        /* reprfunc  tp_repr;     __repr__ */
  NULL,                     /* PyNumberMethods *tp_as_number; */
  NULL,                     /* PySequenceMethods *tp_as_sequence; */
  NULL,                     /* PyMappingMethods *tp_as_mapping; */
  NULL,                     /* hashfunc tp_hash;      __hash__ */
  NULL,                     /* ternaryfunc tp_call;   __call__ */
  NULL,                     /* reprfunc tp_str;       __str__ */
};
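The minimum job of __getattr__, finding a name in a table that ends with an all-NULL row, can be sketched in plain C. This is a toy model, not the real Python API: toy_method, toy_find_method and the two toy functions are hypothetical names made up for illustration, and the functions take an int instead of Python objects.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a method implementation. */
typedef int (*toy_func)(int);

/* One table row: a method name and the function it maps to. */
struct toy_method {
    const char *name;
    toy_func    func;
};

int toy_login(int x)  { return x + 1; }
int toy_logoff(int x) { return x - 1; }

/* Like astman_methods, the table ends with an all-NULL sentinel row. */
const struct toy_method toy_methods[] = {
    {"login",  toy_login},
    {"logoff", toy_logoff},
    {NULL,     NULL}
};

/* Walk the table until the sentinel; return the match, or NULL if none. */
toy_func toy_find_method(const struct toy_method *table, const char *name)
{
    for (; table->name != NULL; table++) {
        if (strcmp(table->name, name) == 0)
            return table->func;
    }
    return NULL;
}
```

The sentinel row is what lets the lookup work without a separate length field, which is why each table above ends with {NULL,NULL,0,NULL}.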

Python also requires you to manage reference counts by hand. You call Py_INCREF whenever you keep or return a reference you don't already own (even if you're just returning None), and Py_DECREF when you're done with one. Ruby takes a different approach to garbage collection, so you don't need to worry about incrementing or decrementing references. In Python, if an exception is raised, the extension function is responsible for returning NULL so the error propagates to the caller. In Ruby, exceptions longjmp out of the code. This means most of the work is handled for you. It makes the C extension code look more like Ruby code itself, and there's less room for mistakes.
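The difference between the two error conventions can be sketched without either interpreter. The following is a toy model in plain C, assuming nothing from Python.h or ruby.h: every name in it (toy_error, checked_half, jumping_half and so on) is hypothetical, and "odd input" stands in for whatever would raise an exception.

```c
#include <setjmp.h>
#include <stddef.h>

/* Style 1 (Python-like): failure is signalled by a NULL return. */

const char *toy_error = NULL;      /* stands in for the pending exception */

/* Halve `in` into *out; return out, or NULL with toy_error set. */
int *checked_half(int in, int *out)
{
    if (in % 2 != 0) {
        toy_error = "odd input";
        return NULL;
    }
    *out = in / 2;
    return out;
}

/* Each caller must test for NULL and pass the failure upward itself. */
int *checked_quarter(int in, int *out)
{
    int half;
    if (checked_half(in, &half) == NULL)
        return NULL;
    return checked_half(half, out);
}

/* Style 2 (Ruby-like): failure jumps straight to a handler. */

jmp_buf toy_handler;               /* set by whoever wants to catch errors */

int jumping_half(int in)
{
    if (in % 2 != 0)
        longjmp(toy_handler, 1);   /* does not return */
    return in / 2;
}

/* No error checks here; a failure skips this frame entirely. */
int jumping_quarter(int in)
{
    return jumping_half(jumping_half(in));
}

/* Entry points that turn either style into "value or fallback". */
int checked_quarter_or(int in, int fallback)
{
    int result;
    return checked_quarter(in, &result) != NULL ? result : fallback;
}

int jumping_quarter_or(int in, int fallback)
{
    if (setjmp(toy_handler) != 0)
        return fallback;           /* reached via longjmp */
    return jumping_quarter(in);
}
```

Note how checked_quarter needs an explicit NULL test after each call, while jumping_quarter has none. The catch with the jumping style is that any cleanup code after the failing call is skipped too.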

Arguments are handled a little differently in each language. Both have self as the first parameter to the extension function. Python only has one additional parameter, args. The function is responsible for parsing the args tuple and checking for the correct number of parameters. In Ruby, you must say how many parameters each function has when you register it (the last argument to rb_define_method), before the function is ever called. Then you declare the function with that number of parameters (plus one for self).
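Where the argument count gets checked is the real difference, and a toy model in plain C may make it concrete. Nothing here is the real Python or Ruby API: toy_args, toy_entry and the py_style/rb_style functions are hypothetical names, the arguments are plain ints, and the dispatcher only handles two-argument functions.

```c
#include <stddef.h>

/* A toy "args tuple": a counted array of ints. */
struct toy_args {
    size_t     count;
    const int *items;
};

/* Python-like: the function receives (self, args) and must check and
   unpack args itself. Returns 0 on success, -1 on a wrong count. */
int py_style_add(void *self, const struct toy_args *args, int *result)
{
    (void)self;                      /* unused in this toy */
    if (args->count != 2)
        return -1;
    *result = args->items[0] + args->items[1];
    return 0;
}

/* Ruby-like: the function is declared with self plus exactly the number
   of parameters that was registered for it. */
int rb_style_add(void *self, int a, int b)
{
    (void)self;
    return a + b;
}

/* Registration record: the arity is stored next to the function.
   This toy only supports two-argument functions. */
struct toy_entry {
    size_t arity;
    int  (*func)(void *self, int a, int b);
};

const struct toy_entry rb_add_entry = { 2, rb_style_add };

/* The dispatcher, not the function, rejects a wrong argument count. */
int rb_style_call(const struct toy_entry *entry, void *self,
                  const struct toy_args *args, int *result)
{
    if (args->count != entry->arity)
        return -1;
    *result = entry->func(self, args->items[0], args->items[1]);
    return 0;
}

/* Convenience wrappers: return the sum, or fallback on a bad count. */
int py_style_sum_or(const int *items, size_t count, int fallback)
{
    struct toy_args args = { count, items };
    int result;
    return py_style_add(NULL, &args, &result) == 0 ? result : fallback;
}

int rb_style_sum_or(const int *items, size_t count, int fallback)
{
    struct toy_args args = { count, items };
    int result;
    return rb_style_call(&rb_add_entry, NULL, &args, &result) == 0
               ? result : fallback;
}
```

In the first style every function repeats its own count check; in the second the check lives in one place and the function body stays as short as rb_style_add.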

And last, there is the issue of building the modules. Both languages have a special module for building extensions (mkmf for Ruby, distutils for Python). The only issue I had was the way you have to tell Ruby about library functions. You need to specify each function individually. This is irritating because if you're building the library and the wrapper simultaneously, you have to add a new line to the extensions file for each new library function. Then you have to regenerate the makefile before you can run make.

The following two snippets are the extension build files for Ruby and Python respectively. The comments specify how to execute the code. In this case, the library is a static archive (.a).

require 'mkmf'
dir_config("ast_man")
have_library("ast_man","ast_connect")
have_library("ast_man","ast_login")
have_library("ast_man","ast_list_commands")
have_library("ast_man","ast_absolute_timeout")
have_library("ast_man","ast_ping")
have_library("ast_man","ast_logoff")
create_makefile("rb_astman")
#ruby extconf.rb --with-ast_man-lib=/home/goldfita/pyastman
#do not use ~/pyastman

#python setup.py build_ext --inplace

from distutils.core import setup, Extension
setup(name='astman',
      version='0.0.1',
      ext_modules=[Extension('astman', ['astman.c'],library_dirs=['..'],libraries=['ast_man'])],
      )

These are the only two languages for which I have built extension modules. I know Java has native functions, and I would assume Perl lets you build extensions. Matlab has mex functions, which are written in C. If you ever feel you need to optimize a scripting language, consider writing an extension module. Or, on the other hand, if you feel development will take too long in a language like C, try building the frame of the application in a scripting language. Combining languages is a powerful way of dealing with tradeoffs like development time vs. efficiency.

