Friday, 29 June 2018

pybind11 and python sub-modules

In my last post, I introduced pybind11 and some basic examples. In this post I want to show how to use python sub-modules with your exported bindings. In particular, I want to show how we structured our bindings in sub-modules when the C++ code was in different libraries in our main project.

Python sub-module

A python sub-module is accessible from python like:

>>> from ork import peon
>>> peon.work_work()
I'm not that kind of Orc

In this example, ork is the main module and peon is the sub-module. In pure Python these might have the folder structure

ork
    __init__.py
    peon
        __init__.py
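
As a rough, self-contained sketch of that layout, the snippet below creates the two packages in a temporary directory and imports the sub-module. The body of work_work (returning the string rather than printing it) and the use of a temporary directory are assumptions made purely for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build the ork/peon layout in a throwaway directory.
root = Path(tempfile.mkdtemp())
peon_dir = root / "ork" / "peon"
peon_dir.mkdir(parents=True)
(root / "ork" / "__init__.py").write_text('"""Orks live here."""\n')

# Hypothetical implementation of work_work, for illustration only.
(peon_dir / "__init__.py").write_text(
    "def work_work():\n"
    "    return \"I'm not that kind of Orc\"\n"
)

# Make the directory that contains ork/ importable, then import the sub-module.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
from ork import peon

message = peon.work_work()
print(message)
```

The important part is that each directory with an __init__.py becomes a package, so peon is reachable as ork.peon.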

C++ Layout

For our code, we had a project that has multiple C++ libraries and we wanted to expose bindings from each library as a different sub-module of ork in Python:

ork
    CMakeLists.txt
    peon
        CMakeLists.txt
        include/
        src/
    grunt
        CMakeLists.txt
        include/
        src/
    warlock
        CMakeLists.txt
        include/
        src/

So we wanted to export some functionality from ork.peon, ork.grunt, and ork.warlock to python.

Exporting the bindings

Single Shared Object

The easiest way to do this is to create a single shared object using pybind11. This will include all exported bindings for all the libraries you want to export.

A simplified example would be to add a new ork_bindings.cpp after your other libraries:

#include <pybind11/pybind11.h>
// Also include the headers that declare Peon, Grunt and Warlock.

namespace py = pybind11;

// The first argument must match the name of the built module (ork).
PYBIND11_MODULE(ork, mymodule) {
    mymodule.doc() = "Orks live here";

    py::module peon = mymodule.def_submodule("peon", "A peon is a submodule of 'ork'");
    peon.def("work_work", &Peon::work_work, "Do some work");

    py::module grunt = mymodule.def_submodule("grunt", "A grunt is a submodule of 'ork'");
    grunt.def("work_work", &Grunt::work_work, "Do some work");

    py::module warlock = mymodule.def_submodule("warlock", "A warlock is a submodule of 'ork'");
    warlock.def("work_work", &Warlock::work_work, "Do some work");
}

In your CMakeLists.txt you export the module as:

pybind11_add_module(ork ork_bindings.cpp)
target_link_libraries(ork
    PRIVATE
        peon
        grunt
        warlock
)

This works fine for a small number of libraries and a small amount of exported code. However, I didn't like this approach because it moves the export code away from the code it binds. That makes it easy to forget to add a new function to the bindings, which lets inconsistencies creep into the project.

Multiple Shared Objects

Our chosen approach was to use a separate bindings library for each C++ library, each exported as a sub-module, and then use a pure Python package as the main module.

To do this we added the following to our code:

ork/
    CMakeLists.txt
    peon/
        CMakeLists.txt
        include/
        src/
            bindings.cpp
    grunt/
        CMakeLists.txt
        include/
        src/
            bindings.cpp
    warlock/
        CMakeLists.txt
        include/
        src/
            bindings.cpp

An example bindings.cpp for peon is:

#include <pybind11/pybind11.h>
// Also include the header that declares Peon.

PYBIND11_MODULE(pypeon, m)
{
    // rename to a submodule
    m.attr("__name__") = "ork.pypeon";
    m.doc() = "A peon is a submodule of 'ork'";
    m.def("work_work", &Peon::work_work, "Do some work");
}

These bindings were added in each CMakeLists.txt as:

pybind11_add_module(pypeon src/bindings.cpp)
target_link_libraries(pypeon
    PRIVATE
        peon
)

When you install the library, the installed layout is:

ork/
    __init__.py
    pypeon.[python_info].so
    pygrunt.[python_info].so
    pywarlock.[python_info].so

One of the main problems with this approach is the naming of the sub-modules. As the C++ libraries are called peon, grunt, and warlock, it is not possible to have another CMake target with the same name. Therefore, you have to have a slightly different name. In the above example, I have chosen to add py as a prefix for the sub-module names.
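
One possible way to soften the prefix on the Python side, which is not something we did, is to re-export each prefixed module under its short name from the package's __init__.py. The sketch below is only an illustration: ork_demo is a hypothetical package name, pypeon.py is a pure-Python stand-in for the compiled pypeon.[python_info].so, and the contents of __init__.py are an assumption.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "ork_demo"  # hypothetical package name for this sketch
pkg.mkdir()

# Pure-Python stand-in for the compiled pypeon extension module.
(pkg / "pypeon.py").write_text(
    "def work_work():\n"
    "    return 'work work'\n"
)

# Hypothetical __init__.py: expose the prefixed module under the short name.
(pkg / "__init__.py").write_text(
    "import sys\n"
    "from . import pypeon as peon\n"
    "# Register the alias so 'import ork_demo.peon' also resolves.\n"
    "sys.modules[__name__ + '.peon'] = peon\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

from ork_demo import peon
import ork_demo.peon as same_peon

print(peon.work_work())
```

The module still reports its real name (here ork_demo.pypeon) in __name__, so the prefix remains visible in help output and tracebacks.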

Conclusions

So far we have found our approach to work, even with the downside of having a prefix on the sub-module names. It keeps the bindings code as close as possible to our C++ code, and it lets us choose which modules to export using CMake options.

Tuesday, 26 June 2018

Using C++ code from Python with pybind11

I have recently been working on a large C++ code base and needed to make certain parts of the code available in Python for use by our other teams. We looked at a number of different frameworks for this and eventually decided to go with pybind11. In this blog post I will describe the basic usage of pybind11 and how to call your exported code from python. For a full look at the code check out my example repository from here

Basic Example

The examples below can be considered part of a single library that is being made available. An example cpp file is available here

Method

Consider the following function that you want to make available to Python:

int add(int x, int y) {
    return x + y;
}

To make this available to python you add the following binding:

PYBIND11_MODULE(pybindings, mymodule) {
    using namespace pybind11::literals; // for _a literal to define arguments
    mymodule.doc() = "example module to export code";
    mymodule.def("add",
          &add,
          "Add 2 numbers together",
          "x"_a,
          "y"_a);
}

To explain the above a little:

PYBIND11_MODULE(pybindings, mymodule) - This defines a module named pybindings, with the variable mymodule used to reference it.

mymodule.doc() - Sets the doc string for the module, which is displayed when using the help function in Python.

mymodule.def - Defines the function you want to export to Python. The arguments are, in order: the name of the function in Python, the C++ function to export, the doc string for the function, and the names of the function's arguments. Naming the arguments with _a lets callers pass them as keywords, for example add(x=1, y=2).

Once built, this function can be used as:

>>> import pybindings
>>> pybindings.add(1, 2)
3

Class

Consider the class:

class Adder {
public:
    Adder(int x) : x_(x) {};
    int add(int y) {
        return x_ + y;
    }
 
    void setAddition(int x) {
        x_ = x;
    }
    int getAddition() {
        return x_;
    }
private:
    int x_;
};

To export this, under your PYBIND11_MODULE add the following:

pybind11::class_<Adder>(mymodule, "Adder")
    .def(pybind11::init<int>())
    .def("add", &Adder::add)
    .def_property("addition", &Adder::getAddition, &Adder::setAddition);

This is similar to the method example above except we now define the class using pybind11::class_. The class type is passed as a template argument to this and functions are defined as part of the class instead of the module.

The constructor is defined by calling pybind11::init, and you can add access to class variables as properties using the def_property function.
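
To give a feel for what def_property exposes, here is a pure-Python analogue of the bound class. It is only an illustration of the behaviour seen from Python, not what pybind11 generates; the attribute name _x and the type hints are assumptions.

```python
class Adder:
    """Pure-Python analogue of the bound C++ Adder class (illustration only)."""

    def __init__(self, x: int) -> None:
        self._x = x

    def add(self, y: int) -> int:
        return self._x + y

    @property
    def addition(self) -> int:
        # Plays the role of Adder::getAddition.
        return self._x

    @addition.setter
    def addition(self, x: int) -> None:
        # Plays the role of Adder::setAddition.
        self._x = x


a = Adder(1)
first = a.add(2)    # uses the value passed to the constructor
a.addition = 4      # assignment goes through the setter
second = a.add(2)   # uses the updated value
print(first, second, a.addition)
```

Reading a.addition calls the getter and assigning to it calls the setter, which is the same split that def_property sets up between getAddition and setAddition.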

Once built it can be run using:

>>> import pybindings
>>> a = pybindings.Adder(1)
>>> a.add(2)
3
>>> a.addition = 4
>>> a.add(2)
6

STL

pybind11 seamlessly supports using STL containers within Python. To do this, include the pybind11/stl.h header to make the conversions available.

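#include <pybind11/stl.h>  // enables the list <-> std::vector conversion
#include <string>
#include <vector>

std::string join(std::vector<char> tojoin) {
    std::string ret;
    ret.reserve(tojoin.size());
    for (auto c : tojoin) {
        ret += c;
    }
    return ret;
}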

To export it, add the function definition to your module:

mymodule.def("join",
      &join,
      "Join a list of characters into a string",
      "tojoin"_a);

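This is very similar to the add function from our first example: the binding itself needs no extra code for the STL container. pybind11 converts the Python list to a std::vector<char> by copying it when the function is called.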

To call this from python:

>>> import pybindings
>>> s = pybindings.join(['a', 'b', 'c'])
>>> print(s)
abc
>>> type(s)
<class 'str'>

Building the Example

pybind11 has good support for CMake and can be easily integrated into a CMake project. To do this, add the following to your CMakeLists.txt:

find_package(pybind11)
pybind11_add_module(pybindings bindings.cpp)

This will add a target to build a library pybindings.[python-information].so.

NOTE: The name of the library must match the name of the module you defined in your C++ code.

Using the binding

To use the bindings, add the location of the shared object above to your PYTHONPATH. Assuming you built your code in /data/code/build and have a Python file /data/code/bindings.py, you can run:

PYTHONPATH=/data/code/build/ python3 /data/code/bindings.py

The source of bindings.py can be found here
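
The PYTHONPATH mechanism can be tried without building anything. In the sketch below, a temporary directory stands in for /data/code/build, a pure-Python pybindings.py stands in for the compiled shared object, and the script contents are hypothetical rather than the bindings.py linked above.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Stand-in for the build directory holding pybindings.[python-information].so.
build_dir = Path(tempfile.mkdtemp())
(build_dir / "pybindings.py").write_text(
    "def add(x, y):\n"
    "    return x + y\n"
)

# Hypothetical script that uses the module; it lives in a different directory.
script = Path(tempfile.mkdtemp()) / "bindings.py"
script.write_text("import pybindings\nprint(pybindings.add(1, 2))\n")

# Run the script with PYTHONPATH pointing at the build directory.
env = dict(os.environ, PYTHONPATH=str(build_dir))
result = subprocess.run(
    [sys.executable, str(script)],
    env=env,
    capture_output=True,
    text=True,
    check=True,
)
output = result.stdout.strip()
print(output)
```

Without the PYTHONPATH entry the import would fail, because the script's own directory does not contain the module.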