.. highlightlang:: c

.. _cporting-howto:

*************************************
Porting Extension Modules to Python 3
*************************************

:author: Benjamin Peterson


.. topic:: Abstract

   Although changing the C-API was not one of Python 3's objectives,
   the many Python-level changes made leaving Python 2's API intact
   impossible.  In fact, some changes such as :func:`int` and
   :func:`long` unification are more obvious on the C level.  This
   document describes the incompatibilities and how they can
   be worked around.


Conditional compilation
=======================

The easiest way to compile only some code for Python 3 is to check
if :c:macro:`PY_MAJOR_VERSION` is greater than or equal to 3. ::

   #if PY_MAJOR_VERSION >= 3
   #define IS_PY3K
   #endif

API functions that are not present can be aliased to their equivalents within
conditional blocks.


Changes to Object APIs
======================

Python 3 merged together some types with similar functions while cleanly
separating others.


str/unicode Unification
-----------------------

Python 3's :func:`str` type is equivalent to Python 2's :func:`unicode`; the C
functions are called ``PyUnicode_*`` for both.  The old 8-bit string type has
become :func:`bytes`, with C functions called ``PyBytes_*``.  Python 2.6 and
later provide a compatibility header, :file:`bytesobject.h`, mapping
``PyBytes`` names to ``PyString`` ones.  For best compatibility with Python 3,
:c:type:`PyUnicode` should be used for textual data and :c:type:`PyBytes` for
binary data.  It's also important to remember that :c:type:`PyBytes` and
:c:type:`PyUnicode` in Python 3 are not interchangeable like
:c:type:`PyString` and :c:type:`PyUnicode` are in Python 2.  The following
example shows best practices with regard to :c:type:`PyUnicode`,
:c:type:`PyString`, and :c:type:`PyBytes`.
::

   /* Python.h must be included before any standard headers. */
   #include "Python.h"
   #include "bytesobject.h"
   #include <stdlib.h>

   /* text example */
   static PyObject *
   say_hello(PyObject *self, PyObject *args) {
       PyObject *name, *result;

       if (!PyArg_ParseTuple(args, "U:say_hello", &name))
           return NULL;

       result = PyUnicode_FromFormat("Hello, %S!", name);
       return result;
   }

   /* just a forward */
   static char * do_encode(PyObject *);

   /* bytes example */
   static PyObject *
   encode_object(PyObject *self, PyObject *args) {
       char *encoded;
       PyObject *result, *myobj;

       if (!PyArg_ParseTuple(args, "O:encode_object", &myobj))
           return NULL;

       encoded = do_encode(myobj);
       if (encoded == NULL)
           return NULL;
       result = PyBytes_FromString(encoded);
       free(encoded);
       return result;
   }


long/int Unification
--------------------

Python 3 has only one integer type, :func:`int`.  But it actually
corresponds to Python 2's :func:`long` type; the :func:`int` type
used in Python 2 was removed.  In the C-API, ``PyInt_*`` functions
are replaced by their ``PyLong_*`` equivalents.


Module initialization and state
===============================

Python 3 has a revamped extension module initialization system.  (See
:pep:`3121`.)  Module state should be stored in an interpreter-specific
structure instead of in globals.  Creating modules that
act correctly in both Python 2 and Python 3 is tricky.  The following
simple example demonstrates how.
::

   #include "Python.h"

   /* Per-module state; on Python 3 it lives in the module object. */
   struct module_state {
       PyObject *error;
   };

   #if PY_MAJOR_VERSION >= 3
   #define GETSTATE(m) ((struct module_state*)PyModule_GetState(m))
   #else
   /* Python 2 has no per-module state, so fall back to a static struct. */
   #define GETSTATE(m) (&_state)
   static struct module_state _state;
   #endif

   static PyObject *
   error_out(PyObject *m) {
       struct module_state *st = GETSTATE(m);
       PyErr_SetString(st->error, "something bad happened");
       return NULL;
   }

   static PyMethodDef myextension_methods[] = {
       {"error_out", (PyCFunction)error_out, METH_NOARGS, NULL},
       {NULL, NULL}
   };

   #if PY_MAJOR_VERSION >= 3

   static int myextension_traverse(PyObject *m, visitproc visit, void *arg) {
       Py_VISIT(GETSTATE(m)->error);
       return 0;
   }

   static int myextension_clear(PyObject *m) {
       Py_CLEAR(GETSTATE(m)->error);
       return 0;
   }


   static struct PyModuleDef moduledef = {
           PyModuleDef_HEAD_INIT,
           "myextension",                /* m_name */
           NULL,                         /* m_doc */
           sizeof(struct module_state),  /* m_size */
           myextension_methods,          /* m_methods */
           NULL,                         /* m_reload */
           myextension_traverse,         /* m_traverse */
           myextension_clear,            /* m_clear */
           NULL                          /* m_free */
   };

   #define INITERROR return NULL

   PyMODINIT_FUNC
   PyInit_myextension(void)

   #else
   #define INITERROR return

   void
   initmyextension(void)
   #endif
   {
   #if PY_MAJOR_VERSION >= 3
       PyObject *module = PyModule_Create(&moduledef);
   #else
       PyObject *module = Py_InitModule("myextension", myextension_methods);
   #endif

       if (module == NULL)
           INITERROR;
       struct module_state *st = GETSTATE(module);

       st->error = PyErr_NewException("myextension.Error", NULL, NULL);
       if (st->error == NULL) {
           Py_DECREF(module);
           INITERROR;
       }

   #if PY_MAJOR_VERSION >= 3
       return module;
   #endif
   }


CObject replaced with Capsule
=============================

The :c:type:`Capsule` object was introduced in Python 3.1 and 2.7 to replace
:c:type:`CObject`.
CObjects were useful,
but the :c:type:`CObject` API was problematic: it didn't permit distinguishing
between valid CObjects, which allowed mismatched CObjects to crash the
interpreter, and some of its APIs relied on undefined behavior in C.
(For further reading on the rationale behind Capsules, please see :issue:`5630`.)

If you're currently using CObjects, and you want to migrate to 3.1 or newer,
you'll need to switch to Capsules.
:c:type:`CObject` was deprecated in 3.1 and 2.7 and completely removed in
Python 3.2.  If you only support 2.7, or 3.1 and above, you
can simply switch to :c:type:`Capsule`.  If you need to support Python 3.0,
or versions of Python earlier than 2.7,
you'll have to support both CObjects and Capsules.
(Note that Python 3.0 is no longer supported, and it is not recommended
for production use.)

The following example header file :file:`capsulethunk.h` may
solve the problem for you.  Simply write your code against the
:c:type:`Capsule` API and include this header file after
:file:`Python.h`.  Your code will automatically use Capsules
in versions of Python with Capsules, and switch to CObjects
when Capsules are unavailable.

:file:`capsulethunk.h` simulates Capsules using CObjects.  However,
:c:type:`CObject` provides no place to store the capsule's "name".  As a
result the simulated :c:type:`Capsule` objects created by :file:`capsulethunk.h`
behave slightly differently from real Capsules.  Specifically:

* The name parameter passed in to :c:func:`PyCapsule_New` is ignored.

* The name parameter passed in to :c:func:`PyCapsule_IsValid` and
  :c:func:`PyCapsule_GetPointer` is ignored, and no error checking
  of the name is performed.

* :c:func:`PyCapsule_GetName` always returns NULL.

* :c:func:`PyCapsule_SetName` always raises an exception and
  returns failure.

Since there's no way to store a name in a CObject, noisy failure of
:c:func:`PyCapsule_SetName` was deemed preferable to silent failure.
If this is inconvenient, feel free to modify your local
copy as you see fit.

You can find :file:`capsulethunk.h` in the Python source distribution
as :source:`Doc/includes/capsulethunk.h`.  We also include it here for
your convenience:

.. literalinclude:: ../includes/capsulethunk.h



Other options
=============

If you are writing a new extension module, you might consider `Cython
<http://cython.org/>`_.  It translates a Python-like language to C.  The
extension modules it creates are compatible with Python 3 and Python 2.