.. highlightlang:: c

.. _cporting-howto:

*************************************
Porting Extension Modules to Python 3
*************************************

:author: Benjamin Peterson


.. topic:: Abstract

   Although changing the C-API was not one of Python 3's objectives,
   the many Python-level changes made leaving Python 2's API intact
   impossible.  In fact, some changes such as :func:`int` and
   :func:`long` unification are more obvious on the C level.  This
   document describes the incompatibilities and how they can
   be worked around.


Conditional compilation
=======================

The easiest way to compile some code only for Python 3 is to check
whether :c:macro:`PY_MAJOR_VERSION` is greater than or equal to 3. ::

   #if PY_MAJOR_VERSION >= 3
   #define IS_PY3K
   #endif

API functions that are missing in one major version can be aliased to their
equivalents inside such conditional blocks.


Changes to Object APIs
======================

Python 3 merged together some types with similar functions while cleanly
separating others.


str/unicode Unification
-----------------------

Python 3's :func:`str` type is equivalent to Python 2's :func:`unicode`; the C
functions are called ``PyUnicode_*`` for both.  The old 8-bit string type has
become :func:`bytes`, with C functions called ``PyBytes_*``.  Python 2.6 and
later provide a compatibility header, :file:`bytesobject.h`, mapping ``PyBytes``
names to ``PyString`` ones.  For best compatibility with Python 3,
:c:type:`PyUnicode` should be used for textual data and :c:type:`PyBytes` for
binary data.  It's also important to remember that :c:type:`PyBytes` and
:c:type:`PyUnicode` in Python 3 are not interchangeable like :c:type:`PyString`
and :c:type:`PyUnicode` are in Python 2.  The following example shows best
practices with regard to :c:type:`PyUnicode`, :c:type:`PyString`, and
:c:type:`PyBytes`. ::

   /* Python.h should come before any standard headers. */
   #include "Python.h"
   #include "bytesobject.h"
   #include <stdlib.h>

   /* text example */
   static PyObject *
   say_hello(PyObject *self, PyObject *args) {
       PyObject *name, *result;

       /* "U" accepts only a unicode (text) object */
       if (!PyArg_ParseTuple(args, "U:say_hello", &name))
           return NULL;

       result = PyUnicode_FromFormat("Hello, %S!", name);
       return result;
   }

   /* just a forward */
   static char * do_encode(PyObject *);

   /* bytes example */
   static PyObject *
   encode_object(PyObject *self, PyObject *args) {
       char *encoded;
       PyObject *result, *myobj;

       if (!PyArg_ParseTuple(args, "O:encode_object", &myobj))
           return NULL;

       /* do_encode() is expected to return a malloc()ed, NUL-terminated
          buffer, or NULL with an exception set */
       encoded = do_encode(myobj);
       if (encoded == NULL)
           return NULL;
       result = PyBytes_FromString(encoded);
       free(encoded);
       return result;
   }


long/int Unification
--------------------

Python 3 has only one integer type, :func:`int`.  But it actually
corresponds to Python 2's :func:`long` type; the :func:`int` type
used in Python 2 was removed.  In the C-API, ``PyInt_*`` functions
are replaced by their ``PyLong_*`` equivalents.


Module initialization and state
===============================

Python 3 has a revamped extension module initialization system.  (See
:pep:`3121`.)  Module state should be stored in an interpreter-specific
structure instead of in globals.  Creating modules that
act correctly in both Python 2 and Python 3 is tricky.  The following
simple example demonstrates how. ::

   #include "Python.h"

   struct module_state {
       PyObject *error;
   };

   #if PY_MAJOR_VERSION >= 3
   #define GETSTATE(m) ((struct module_state*)PyModule_GetState(m))
   #else
   #define GETSTATE(m) (&_state)
   static struct module_state _state;
   #endif

   /* METH_NOARGS functions still receive a second (unused) argument */
   static PyObject *
   error_out(PyObject *m, PyObject *unused) {
       struct module_state *st = GETSTATE(m);
       PyErr_SetString(st->error, "something bad happened");
       return NULL;
   }

   static PyMethodDef myextension_methods[] = {
       {"error_out", (PyCFunction)error_out, METH_NOARGS, NULL},
       {NULL, NULL}
   };

   #if PY_MAJOR_VERSION >= 3

   static int myextension_traverse(PyObject *m, visitproc visit, void *arg) {
       Py_VISIT(GETSTATE(m)->error);
       return 0;
   }

   static int myextension_clear(PyObject *m) {
       Py_CLEAR(GETSTATE(m)->error);
       return 0;
   }


   static struct PyModuleDef moduledef = {
           PyModuleDef_HEAD_INIT,
           "myextension",
           NULL,
           sizeof(struct module_state),
           myextension_methods,
           NULL,
           myextension_traverse,
           myextension_clear,
           NULL
   };

   #define INITERROR return NULL

   PyMODINIT_FUNC
   PyInit_myextension(void)

   #else
   #define INITERROR return

   void
   initmyextension(void)
   #endif
   {
       /* declared first so the code also builds with C89 compilers */
       struct module_state *st;
   #if PY_MAJOR_VERSION >= 3
       PyObject *module = PyModule_Create(&moduledef);
   #else
       PyObject *module = Py_InitModule("myextension", myextension_methods);
   #endif

       if (module == NULL)
           INITERROR;
       st = GETSTATE(module);

       st->error = PyErr_NewException("myextension.Error", NULL, NULL);
       if (st->error == NULL) {
           Py_DECREF(module);
           INITERROR;
       }

   #if PY_MAJOR_VERSION >= 3
       return module;
   #endif
   }


CObject replaced with Capsule
=============================

The :c:type:`Capsule` object was introduced in Python 3.1 and 2.7 to replace
:c:type:`CObject`.  CObjects were useful,
but the :c:type:`CObject` API was problematic: it didn't permit distinguishing
between valid CObjects, which allowed mismatched CObjects to crash the
interpreter, and some of its APIs relied on undefined behavior in C.
(For further reading on the rationale behind Capsules, please see :issue:`5630`.)

If you're currently using CObjects, and you want to migrate to 3.1 or newer,
you'll need to switch to Capsules.
:c:type:`CObject` was deprecated in 3.1 and 2.7 and completely removed in
Python 3.2.  If you only support 2.7, or 3.1 and above, you
can simply switch to :c:type:`Capsule`.  If you need to support Python 3.0,
or versions of Python earlier than 2.7,
you'll have to support both CObjects and Capsules.
(Note that Python 3.0 is no longer supported, and it is not recommended
for production use.)

The following example header file :file:`capsulethunk.h` may
solve the problem for you.  Simply write your code against the
:c:type:`Capsule` API and include this header file after
:file:`Python.h`.  Your code will automatically use Capsules
in versions of Python with Capsules, and switch to CObjects
when Capsules are unavailable.

:file:`capsulethunk.h` simulates Capsules using CObjects.  However,
:c:type:`CObject` provides no place to store the capsule's "name".  As a
result the simulated :c:type:`Capsule` objects created by :file:`capsulethunk.h`
behave slightly differently from real Capsules.  Specifically:

  * The name parameter passed in to :c:func:`PyCapsule_New` is ignored.

  * The name parameter passed in to :c:func:`PyCapsule_IsValid` and
    :c:func:`PyCapsule_GetPointer` is ignored, and no error checking
    of the name is performed.

  * :c:func:`PyCapsule_GetName` always returns NULL.

  * :c:func:`PyCapsule_SetName` always raises an exception and
    returns failure.  (Since there's no way to store a name
    in a CObject, noisy failure of :c:func:`PyCapsule_SetName`
    was deemed preferable to silent failure here.  If this is
    inconvenient, feel free to modify your local
    copy as you see fit.)

You can find :file:`capsulethunk.h` in the Python source distribution
as :source:`Doc/includes/capsulethunk.h`.  We also include it here for
your convenience:

.. literalinclude:: ../includes/capsulethunk.h



Other options
=============

If you are writing a new extension module, you might consider `Cython
<http://cython.org/>`_.  It translates a Python-like language to C.  The
extension modules it creates are compatible with Python 3 and Python 2.
    258