ctypes
— A foreign function library for Python¶
Source code: Lib/ctypes
ctypes
is a foreign function library for Python. It provides C compatible
data types, and allows calling functions in DLLs or shared libraries. It can be
used to wrap these libraries in pure Python.
ctypes tutorial¶
Note: The code samples in this tutorial use doctest
to make sure that
they actually work. Since some code samples behave differently under Linux,
Windows, or macOS, they contain doctest directives in comments.
Note: Some code samples reference the ctypes c_int
type. On platforms
where sizeof(long) == sizeof(int)
it is an alias to c_long
.
So, you should not be confused if c_long
is printed if you would expect
c_int
— they are actually the same type.
Loading dynamic link libraries¶
ctypes
exports the cdll, and on Windows windll and oledll
objects, for loading dynamic link libraries.
You load libraries by accessing them as attributes of these objects. cdll
loads libraries which export functions using the standard cdecl
calling
convention, while windll libraries call functions using the stdcall
calling convention. oledll also uses the stdcall
calling convention, and
assumes the functions return a Windows HRESULT
error code. The error
code is used to automatically raise an OSError
exception when the
function call fails.
Changed in version 3.3: Windows errors used to raise WindowsError
, which is now an alias
of OSError
.
Here are some examples for Windows. Note that msvcrt
is the MS standard C
library containing most standard C functions, and uses the cdecl
calling
convention:
>>> from ctypes import *
>>> print(windll.kernel32)
<WinDLL 'kernel32', handle ... at ...>
>>> print(cdll.msvcrt)
<CDLL 'msvcrt', handle ... at ...>
>>> libc = cdll.msvcrt
>>>
Windows appends the usual .dll
file suffix automatically.
Note
Accessing the standard C library through cdll.msvcrt
will use an
outdated version of the library that may be incompatible with the one
being used by Python. Where possible, use native Python functionality,
or else import and use the msvcrt
module.
On Linux, it is required to specify the filename including the extension to
load a library, so attribute access can not be used to load libraries. Either the
LoadLibrary()
method of the dll loaders should be used,
or you should load the library by creating an instance of CDLL by calling
the constructor:
>>> cdll.LoadLibrary("libc.so.6")
<CDLL 'libc.so.6', handle ... at ...>
>>> libc = CDLL("libc.so.6")
>>> libc
<CDLL 'libc.so.6', handle ... at ...>
>>>
Accessing functions from loaded dlls¶
Functions are accessed as attributes of dll objects:
>>> libc.printf
<_FuncPtr object at 0x...>
>>> print(windll.kernel32.GetModuleHandleA)
<_FuncPtr object at 0x...>
>>> print(windll.kernel32.MyOwnFunction)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "ctypes.py", line 239, in __getattr__
func = _StdcallFuncPtr(name, self)
AttributeError: function 'MyOwnFunction' not found
>>>
Note that win32 system dlls like kernel32
and user32
often export ANSI
as well as UNICODE versions of a function. The UNICODE version is exported with
a W
appended to the name, while the ANSI version is exported with an A
appended to the name. The win32 GetModuleHandle
function, which returns a
module handle for a given module name, has the following C prototype, and a
macro is used to expose one of them as GetModuleHandle
depending on whether
UNICODE is defined or not:
/* ANSI version */
HMODULE GetModuleHandleA(LPCSTR lpModuleName);
/* UNICODE version */
HMODULE GetModuleHandleW(LPCWSTR lpModuleName);
windll does not try to select one of them by magic, you must access the
version you need by specifying GetModuleHandleA
or GetModuleHandleW
explicitly, and then call it with bytes or string objects respectively.
Sometimes, dlls export functions with names which aren’t valid Python
identifiers, like "??2@YAPAXI@Z"
. In this case you have to use
getattr()
to retrieve the function:
>>> getattr(cdll.msvcrt, "??2@YAPAXI@Z")
<_FuncPtr object at 0x...>
>>>
On Windows, some dlls export functions not by name but by ordinal. These functions can be accessed by indexing the dll object with the ordinal number:
>>> cdll.kernel32[1]
<_FuncPtr object at 0x...>
>>> cdll.kernel32[0]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "ctypes.py", line 310, in __getitem__
func = _StdcallFuncPtr(name, self)
AttributeError: function ordinal 0 not found
>>>
Calling functions¶
You can call these functions like any other Python callable. This example uses
the rand()
function, which takes no arguments and returns a pseudo-random integer:
>>> print(libc.rand())
1804289383
On Windows, you can call the GetModuleHandleA()
function, which returns a win32 module
handle (passing None
as single argument to call it with a NULL
pointer):
>>> print(hex(windll.kernel32.GetModuleHandleA(None)))
0x1d000000
>>>
ValueError
is raised when you call an stdcall
function with the
cdecl
calling convention, or vice versa:
>>> cdll.kernel32.GetModuleHandleA(None)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: Procedure probably called with not enough arguments (4 bytes missing)
>>>
>>> windll.msvcrt.printf(b"spam")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: Procedure probably called with too many arguments (4 bytes in excess)
>>>
To find out the correct calling convention you have to look into the C header file or the documentation for the function you want to call.
On Windows, ctypes
uses win32 structured exception handling to prevent
crashes from general protection faults when functions are called with invalid
argument values:
>>> windll.kernel32.GetModuleHandleA(32)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OSError: exception: access violation reading 0x00000020
>>>
There are, however, enough ways to crash Python with ctypes
, so you
should be careful anyway. The faulthandler
module can be helpful in
debugging crashes (e.g. from segmentation faults produced by erroneous C library
calls).
None
, integers, bytes objects and (unicode) strings are the only native
Python objects that can directly be used as parameters in these function calls.
None
is passed as a C NULL
pointer, bytes objects and strings are passed
as pointer to the memory block that contains their data (char* or
wchar_t*). Python integers are passed as the platform’s default C
int type, their value is masked to fit into the C type.
Before we move on calling functions with other parameter types, we have to learn
more about ctypes
data types.
Fundamental data types¶
ctypes
defines a number of primitive C compatible data types:
ctypes type |
C type |
Python type |
---|---|---|
_Bool |
bool (1) |
|
char |
1-character bytes object |
|
|
1-character string |
|
char |
int |
|
unsigned char |
int |
|
short |
int |
|
unsigned short |
int |
|
int |
int |
|
|
int |
|
|
int |
|
|
int |
|
|
int |
|
unsigned int |
int |
|
|
int |
|
|
int |
|
|
int |
|
|
int |
|
long |
int |
|
unsigned long |
int |
|
__int64 or long long |
int |
|
unsigned __int64 or unsigned long long |
int |
|
|
int |
|
|
int |
|
|
int |
|
float |
float |
|
double |
float |
|
long double |
float |
|
char* (NUL terminated) |
bytes object or |
|
wchar_t* (NUL terminated) |
string or |
|
void* |
int or |
The constructor accepts any object with a truth value.
All these types can be created by calling them with an optional initializer of the correct type and value:
>>> c_int()
c_long(0)
>>> c_wchar_p("Hello, World")
c_wchar_p(140018365411392)
>>> c_ushort(-3)
c_ushort(65533)
>>>
Since these types are mutable, their value can also be changed afterwards:
>>> i = c_int(42)
>>> print(i)
c_long(42)
>>> print(i.value)
42
>>> i.value = -99
>>> print(i.value)
-99
>>>
Assigning a new value to instances of the pointer types c_char_p
,
c_wchar_p
, and c_void_p
changes the memory location they
point to, not the contents of the memory block (of course not, because Python
string objects are immutable):
>>> s = "Hello, World"
>>> c_s = c_wchar_p(s)
>>> print(c_s)
c_wchar_p(139966785747344)
>>> print(c_s.value)
Hello World
>>> c_s.value = "Hi, there"
>>> print(c_s) # the memory location has changed
c_wchar_p(139966783348904)
>>> print(c_s.value)
Hi, there
>>> print(s) # first object is unchanged
Hello, World
>>>
You should be careful, however, not to pass them to functions expecting pointers
to mutable memory. If you need mutable memory blocks, ctypes has a
create_string_buffer()
function which creates these in various ways. The
current memory block contents can be accessed (or changed) with the raw
property; if you want to access it as NUL terminated string, use the value
property:
>>> from ctypes import *
>>> p = create_string_buffer(3) # create a 3 byte buffer, initialized to NUL bytes
>>> print(sizeof(p), repr(p.raw))
3 b'\x00\x00\x00'
>>> p = create_string_buffer(b"Hello") # create a buffer containing a NUL terminated string
>>> print(sizeof(p), repr(p.raw))
6 b'Hello\x00'
>>> print(repr(p.value))
b'Hello'
>>> p = create_string_buffer(b"Hello", 10) # create a 10 byte buffer
>>> print(sizeof(p), repr(p.raw))
10 b'Hello\x00\x00\x00\x00\x00'
>>> p.value = b"Hi"
>>> print(sizeof(p), repr(p.raw))
10 b'Hi\x00lo\x00\x00\x00\x00\x00'
>>>
The create_string_buffer()
function replaces the old c_buffer()
function (which is still available as an alias). To create a mutable memory
block containing unicode characters of the C type wchar_t
, use the
create_unicode_buffer()
function.
Calling functions, continued¶
Note that printf prints to the real standard output channel, not to
sys.stdout
, so these examples will only work at the console prompt, not
from within IDLE or PythonWin:
>>> printf = libc.printf
>>> printf(b"Hello, %s\n", b"World!")
Hello, World!
14
>>> printf(b"Hello, %S\n", "World!")
Hello, World!
14
>>> printf(b"%d bottles of beer\n", 42)
42 bottles of beer
19
>>> printf(b"%f bottles of beer\n", 42.5)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ctypes.ArgumentError: argument 2: TypeError: Don't know how to convert parameter 2
>>>
As has been mentioned before, all Python types except integers, strings, and
bytes objects have to be wrapped in their corresponding ctypes
type, so
that they can be converted to the required C data type:
>>> printf(b"An int %d, a double %f\n", 1234, c_double(3.14))
An int 1234, a double 3.140000
31
>>>
Calling variadic functions¶
On a lot of platforms calling variadic functions through ctypes is exactly the same as calling functions with a fixed number of parameters. On some platforms, and in particular ARM64 for Apple Platforms, the calling convention for variadic functions is different than that for regular functions.
On those platforms it is required to specify the argtypes
attribute for the regular, non-variadic, function arguments:
libc.printf.argtypes = [ctypes.c_char_p]
Because specifying the attribute does not inhibit portability it is advised to always
specify argtypes
for all variadic functions.
Calling functions with your own custom data types¶
You can also customize ctypes
argument conversion to allow instances of
your own classes be used as function arguments. ctypes
looks for an
_as_parameter_
attribute and uses this as the function argument. The
attribute must be an integer, string, bytes, a ctypes
instance, or an
object with an _as_parameter_
attribute:
>>> class Bottles:
... def __init__(self, number):
... self._as_parameter_ = number
...
>>> bottles = Bottles(42)
>>> printf(b"%d bottles of beer\n", bottles)
42 bottles of beer
19
>>>
If you don’t want to store the instance’s data in the _as_parameter_
instance variable, you could define a property
which makes the
attribute available on request.
Specifying the required argument types (function prototypes)¶
It is possible to specify the required argument types of functions exported from
DLLs by setting the argtypes
attribute.
argtypes
must be a sequence of C data types (the printf()
function is
probably not a good example here, because it takes a variable number and
different types of parameters depending on the format string, on the other hand
this is quite handy to experiment with this feature):
>>> printf.argtypes = [c_char_p, c_char_p, c_int, c_double]
>>> printf(b"String '%s', Int %d, Double %f\n", b"Hi", 10, 2.2)
String 'Hi', Int 10, Double 2.200000
37
>>>
Specifying a format protects against incompatible argument types (just as a prototype for a C function), and tries to convert the arguments to valid types:
>>> printf(b"%d %d %d", 1, 2, 3)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ctypes.ArgumentError: argument 2: TypeError: 'int' object cannot be interpreted as ctypes.c_char_p
>>> printf(b"%s %d %f\n", b"X", 2, 3)
X 2 3.000000
13
>>>
If you have defined your own classes which you pass to function calls, you have
to implement a from_param()
class method for them to be able to use them
in the argtypes
sequence. The from_param()
class method receives
the Python object passed to the function call, it should do a typecheck or
whatever is needed to make sure this object is acceptable, and then return the
object itself, its _as_parameter_
attribute, or whatever you want to
pass as the C function argument in this case. Again, the result should be an
integer, string, bytes, a ctypes
instance, or an object with an
_as_parameter_
attribute.
Return types¶
By default functions are assumed to return the C int type. Other
return types can be specified by setting the restype
attribute of the
function object.
The C prototype of time()
is time_t time(time_t *)
. Because time_t
might be of a different type than the default return type int, you should
specify the restype
attribute:
>>> libc.time.restype = c_time_t
The argument types can be specified using argtypes
:
>>> libc.time.argtypes = (POINTER(c_time_t),)
To call the function with a NULL
pointer as first argument, use None
:
>>> print(libc.time(None))
1150640792
Here is a more advanced example, it uses the strchr()
function, which expects
a string pointer and a char, and returns a pointer to a string:
>>> strchr = libc.strchr
>>> strchr(b"abcdef", ord("d"))
8059983
>>> strchr.restype = c_char_p # c_char_p is a pointer to a string
>>> strchr(b"abcdef", ord("d"))
b'def'
>>> print(strchr(b"abcdef", ord("x")))
None
>>>
If you want to avoid the ord("x")
calls above, you can set the
argtypes
attribute, and the second argument will be converted from a
single character Python bytes object into a C char:
>>> strchr.restype = c_char_p
>>> strchr.argtypes = [c_char_p, c_char]
>>> strchr(b"abcdef", b"d")
b'def'
>>> strchr(b"abcdef", b"def")
Traceback (most recent call last):
ctypes.ArgumentError: argument 2: TypeError: one character bytes, bytearray or integer expected
>>> print(strchr(b"abcdef", b"x"))
None
>>> strchr(b"abcdef", b"d")
b'def'
>>>
You can also use a callable Python object (a function or a class for example) as
the restype
attribute, if the foreign function returns an integer. The
callable will be called with the integer the C function returns, and the
result of this call will be used as the result of your function call. This is
useful to check for error return values and automatically raise an exception:
>>> GetModuleHandle = windll.kernel32.GetModuleHandleA
>>> def ValidHandle(value):
... if value == 0:
... raise WinError()
... return value
...
>>>
>>> GetModuleHandle.restype = ValidHandle
>>> GetModuleHandle(None)
486539264
>>> GetModuleHandle("something silly")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 3, in ValidHandle
OSError: [Errno 126] The specified module could not be found.
>>>
WinError
is a function which will call Windows FormatMessage()
api to
get the string representation of an error code, and returns an exception.
WinError
takes an optional error code parameter, if no one is used, it calls
GetLastError()
to retrieve it.
Please note that a much more powerful error checking mechanism is available
through the errcheck
attribute;
see the reference manual for details.
Passing pointers (or: passing parameters by reference)¶
Sometimes a C api function expects a pointer to a data type as parameter, probably to write into the corresponding location, or if the data is too large to be passed by value. This is also known as passing parameters by reference.
ctypes
exports the byref()
function which is used to pass parameters
by reference. The same effect can be achieved with the pointer()
function,
although pointer()
does a lot more work since it constructs a real pointer
object, so it is faster to use byref()
if you don’t need the pointer
object in Python itself:
>>> i = c_int()
>>> f = c_float()
>>> s = create_string_buffer(b'\000' * 32)
>>> print(i.value, f.value, repr(s.value))
0 0.0 b''
>>> libc.sscanf(b"1 3.14 Hello", b"%d %f %s",
... byref(i), byref(f), s)
3
>>> print(i.value, f.value, repr(s.value))
1 3.1400001049 b'Hello'
>>>
Structures and unions¶
Structures and unions must derive from the Structure
and Union
base classes which are defined in the ctypes
module. Each subclass must
define a _fields_
attribute. _fields_
must be a list of
2-tuples, containing a field name and a field type.
The field type must be a ctypes
type like c_int
, or any other
derived ctypes
type: structure, union, array, pointer.
Here is a simple example of a POINT structure, which contains two integers named x and y, and also shows how to initialize a structure in the constructor:
>>> from ctypes import *
>>> class POINT(Structure):
... _fields_ = [("x", c_int),
... ("y", c_int)]
...
>>> point = POINT(10, 20)
>>> print(point.x, point.y)
10 20
>>> point = POINT(y=5)
>>> print(point.x, point.y)
0 5
>>> POINT(1, 2, 3)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: too many initializers
>>>
You can, however, build much more complicated structures. A structure can itself contain other structures by using a structure as a field type.
Here is a RECT structure which contains two POINTs named upperleft and lowerright:
>>> class RECT(Structure):
... _fields_ = [("upperleft", POINT),
... ("lowerright", POINT)]
...
>>> rc = RECT(point)
>>> print(rc.upperleft.x, rc.upperleft.y)
0 5
>>> print(rc.lowerright.x, rc.lowerright.y)
0 0
>>>
Nested structures can also be initialized in the constructor in several ways:
>>> r = RECT(POINT(1, 2), POINT(3, 4))
>>> r = RECT((1, 2), (3, 4))
Field descriptors can be retrieved from the class, they are useful for debugging because they can provide useful information:
>>> print(POINT.x)
<Field type=c_long, ofs=0, size=4>
>>> print(POINT.y)
<Field type=c_long, ofs=4, size=4>
>>>
Warning
ctypes
does not support passing unions or structures with bit-fields
to functions by value. While this may work on 32-bit x86, it’s not
guaranteed by the library to work in the general case. Unions and
structures with bit-fields should always be passed to functions by pointer.
Structure/union alignment and byte order¶
By default, Structure and Union fields are aligned in the same way the C
compiler does it. It is possible to override this behavior by specifying a
_pack_
class attribute in the subclass definition.
This must be set to a positive integer and specifies the maximum alignment for the fields.
This is what #pragma pack(n)
also does in MSVC.
It is also possible to set a minimum alignment for how the subclass itself is packed in the
same way #pragma align(n)
works in MSVC.
This can be achieved by specifying a _align_
class attribute
in the subclass definition.
ctypes
uses the native byte order for Structures and Unions. To build
structures with non-native byte order, you can use one of the
BigEndianStructure
, LittleEndianStructure
,
BigEndianUnion
, and LittleEndianUnion
base classes. These
classes cannot contain pointer fields.
Bit fields in structures and unions¶
It is possible to create structures and unions containing bit fields. Bit fields
are only possible for integer fields, the bit width is specified as the third
item in the _fields_
tuples:
>>> class Int(Structure):
... _fields_ = [("first_16", c_int, 16),
... ("second_16", c_int, 16)]
...
>>> print(Int.first_16)
<Field type=c_long, ofs=0:0, bits=16>
>>> print(Int.second_16)
<Field type=c_long, ofs=0:16, bits=16>
>>>
Arrays¶
Arrays are sequences, containing a fixed number of instances of the same type.
The recommended way to create array types is by multiplying a data type with a positive integer:
TenPointsArrayType = POINT * 10
Here is an example of a somewhat artificial data type, a structure containing 4 POINTs among other stuff:
>>> from ctypes import *
>>> class POINT(Structure):
... _fields_ = ("x", c_int), ("y", c_int)
...
>>> class MyStruct(Structure):
... _fields_ = [("a", c_int),
... ("b", c_float),
... ("point_array", POINT * 4)]
>>>
>>> print(len(MyStruct().point_array))
4
>>>
Instances are created in the usual way, by calling the class:
arr = TenPointsArrayType()
for pt in arr:
print(pt.x, pt.y)
The above code print a series of 0 0
lines, because the array contents is
initialized to zeros.
Initializers of the correct type can also be specified:
>>> from ctypes import *
>>> TenIntegers = c_int * 10
>>> ii = TenIntegers(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
>>> print(ii)
<c_long_Array_10 object at 0x...>
>>> for i in ii: print(i, end=" ")
...
1 2 3 4 5 6 7 8 9 10
>>>
Pointers¶
Pointer instances are created by calling the pointer()
function on a
ctypes
type:
>>> from ctypes import *
>>> i = c_int(42)
>>> pi = pointer(i)
>>>
Pointer instances have a contents
attribute which
returns the object to which the pointer points, the i
object above:
>>> pi.contents
c_long(42)
>>>
Note that ctypes
does not have OOR (original object return), it constructs a
new, equivalent object each time you retrieve an attribute:
>>> pi.contents is i
False
>>> pi.contents is pi.contents
False
>>>
Assigning another c_int
instance to the pointer’s contents attribute
would cause the pointer to point to the memory location where this is stored:
>>> i = c_int(99)
>>> pi.contents = i
>>> pi.contents
c_long(99)
>>>
Pointer instances can also be indexed with integers:
>>> pi[0]
99
>>>
Assigning to an integer index changes the pointed to value:
>>> print(i)
c_long(99)
>>> pi[0] = 22
>>> print(i)
c_long(22)
>>>
It is also possible to use indexes different from 0, but you must know what you’re doing, just as in C: You can access or change arbitrary memory locations. Generally you only use this feature if you receive a pointer from a C function, and you know that the pointer actually points to an array instead of a single item.
Behind the scenes, the pointer()
function does more than simply create
pointer instances, it has to create pointer types first. This is done with the
POINTER()
function, which accepts any ctypes
type, and returns a
new type:
>>> PI = POINTER(c_int)
>>> PI
<class 'ctypes.LP_c_long'>
>>> PI(42)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: expected c_long instead of int
>>> PI(c_int(42))
<ctypes.LP_c_long object at 0x...>
>>>
Calling the pointer type without an argument creates a NULL
pointer.
NULL
pointers have a False
boolean value:
>>> null_ptr = POINTER(c_int)()
>>> print(bool(null_ptr))
False
>>>
ctypes
checks for NULL
when dereferencing pointers (but dereferencing
invalid non-NULL
pointers would crash Python):
>>> null_ptr[0]
Traceback (most recent call last):
....
ValueError: NULL pointer access
>>>
>>> null_ptr[0] = 1234
Traceback (most recent call last):
....
ValueError: NULL pointer access
>>>
Type conversions¶
Usually, ctypes does strict type checking. This means, if you have
POINTER(c_int)
in the argtypes
list of a function or as the type of
a member field in a structure definition, only instances of exactly the same
type are accepted. There are some exceptions to this rule, where ctypes accepts
other objects. For example, you can pass compatible array instances instead of
pointer types. So, for POINTER(c_int)
, ctypes accepts an array of c_int:
>>> class Bar(Structure):
... _fields_ = [("count", c_int), ("values", POINTER(c_int))]
...
>>> bar = Bar()
>>> bar.values = (c_int * 3)(1, 2, 3)
>>> bar.count = 3
>>> for i in range(bar.count):
... print(bar.values[i])
...
1
2
3
>>>
In addition, if a function argument is explicitly declared to be a pointer type
(such as POINTER(c_int)
) in argtypes
, an object of the pointed
type (c_int
in this case) can be passed to the function. ctypes will apply
the required byref()
conversion in this case automatically.
To set a POINTER type field to NULL
, you can assign None
:
>>> bar.values = None
>>>
Sometimes you have instances of incompatible types. In C, you can cast one type
into another type. ctypes
provides a cast()
function which can be
used in the same way. The Bar
structure defined above accepts
POINTER(c_int)
pointers or c_int
arrays for its values
field,
but not instances of other types:
>>> bar.values = (c_byte * 4)()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: incompatible types, c_byte_Array_4 instance instead of LP_c_long instance
>>>
For these cases, the cast()
function is handy.
The cast()
function can be used to cast a ctypes instance into a pointer
to a different ctypes data type. cast()
takes two parameters, a ctypes
object that is or can be converted to a pointer of some kind, and a ctypes
pointer type. It returns an instance of the second argument, which references
the same memory block as the first argument:
>>> a = (c_byte * 4)()
>>> cast(a, POINTER(c_int))
<ctypes.LP_c_long object at ...>
>>>
So, cast()
can be used to assign to the values
field of Bar
the
structure:
>>> bar = Bar()
>>> bar.values = cast((c_byte * 4)(), POINTER(c_int))
>>> print(bar.values[0])
0
>>>
Incomplete Types¶
Incomplete Types are structures, unions or arrays whose members are not yet specified. In C, they are specified by forward declarations, which are defined later:
struct cell; /* forward declaration */
struct cell {
char *name;
struct cell *next;
};
The straightforward translation into ctypes code would be this, but it does not work:
>>> class cell(Structure):
... _fields_ = [("name", c_char_p),
... ("next", POINTER(cell))]
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 2, in cell
NameError: name 'cell' is not defined
>>>
because the new class cell
is not available in the class statement itself.
In ctypes
, we can define the cell
class and set the
_fields_
attribute later, after the class statement:
>>> from ctypes import *
>>> class cell(Structure):
... pass
...
>>> cell._fields_ = [("name", c_char_p),
... ("next", POINTER(cell))]
>>>
Let’s try it. We create two instances of cell
, and let them point to each
other, and finally follow the pointer chain a few times:
>>> c1 = cell()
>>> c1.name = b"foo"
>>> c2 = cell()
>>> c2.name = b"bar"
>>> c1.next = pointer(c2)
>>> c2.next = pointer(c1)
>>> p = c1
>>> for i in range(8):
... print(p.name, end=" ")
... p = p.next[0]
...
foo bar foo bar foo bar foo bar
>>>
Callback functions¶
ctypes
allows creating C callable function pointers from Python callables.
These are sometimes called callback functions.
First, you must create a class for the callback function. The class knows the calling convention, the return type, and the number and types of arguments this function will receive.
The CFUNCTYPE()
factory function creates types for callback functions
using the cdecl
calling convention. On Windows, the WINFUNCTYPE()
factory function creates types for callback functions using the stdcall
calling convention.
Both of these factory functions are called with the result type as first argument, and the callback functions expected argument types as the remaining arguments.
I will present an example here which uses the standard C library’s
qsort()
function, that is used to sort items with the help of a callback
function. qsort()
will be used to sort an array of integers:
>>> IntArray5 = c_int * 5
>>> ia = IntArray5(5, 1, 7, 33, 99)
>>> qsort = libc.qsort
>>> qsort.restype = None
>>>
qsort()
must be called with a pointer to the data to sort, the number of
items in the data array, the size of one item, and a pointer to the comparison
function, the callback. The callback will then be called with two pointers to
items, and it must return a negative integer if the first item is smaller than
the second, a zero if they are equal, and a positive integer otherwise.
So our callback function receives pointers to integers, and must return an
integer. First we create the type
for the callback function:
>>> CMPFUNC = CFUNCTYPE(c_int, POINTER(c_int), POINTER(c_int))
>>>
To get started, here is a simple callback that shows the values it gets passed:
>>> def py_cmp_func(a, b):
... print("py_cmp_func", a[0], b[0])
... return 0
...
>>> cmp_func = CMPFUNC(py_cmp_func)
>>>
The result:
>>> qsort(ia, len(ia), sizeof(c_int), cmp_func)
py_cmp_func 5 1
py_cmp_func 33 99
py_cmp_func 7 33
py_cmp_func 5 7
py_cmp_func 1 7
>>>
Now we can actually compare the two items and return a useful result:
>>> def py_cmp_func(a, b):
... print("py_cmp_func", a[0], b[0])
... return a[0] - b[0]
...
>>>
>>> qsort(ia, len(ia), sizeof(c_int), CMPFUNC(py_cmp_func))
py_cmp_func 5 1
py_cmp_func 33 99
py_cmp_func 7 33
py_cmp_func 1 7
py_cmp_func 5 7
>>>
As we can easily check, our array is sorted now:
>>> for i in ia: print(i, end=" ")
...
1 5 7 33 99
>>>
The function factories can be used as decorator factories, so we may as well write:
>>> @CFUNCTYPE(c_int, POINTER(c_int), POINTER(c_int))
... def py_cmp_func(a, b):
... print("py_cmp_func", a[0], b[0])
... return a[0] - b[0]
...
>>> qsort(ia, len(ia), sizeof(c_int), py_cmp_func)
py_cmp_func 5 1
py_cmp_func 33 99
py_cmp_func 7 33
py_cmp_func 1 7
py_cmp_func 5 7
>>>
Note
Make sure you keep references to CFUNCTYPE()
objects as long as they
are used from C code. ctypes
doesn’t, and if you don’t, they may be
garbage collected, crashing your program when a callback is made.
Also, note that if the callback function is called in a thread created
outside of Python’s control (e.g. by the foreign code that calls the
callback), ctypes creates a new dummy Python thread on every invocation. This
behavior is correct for most purposes, but it means that values stored with
threading.local
will not survive across different callbacks, even when
those calls are made from the same C thread.
Accessing values exported from dlls¶
Some shared libraries not only export functions, they also export variables. An
example in the Python library itself is the Py_Version
, Python
runtime version number encoded in a single constant integer.
ctypes
can access values like this with the in_dll()
class methods of
the type. pythonapi is a predefined symbol giving access to the Python C
api:
>>> version = ctypes.c_int.in_dll(ctypes.pythonapi, "Py_Version")
>>> print(hex(version.value))
0x30c00a0
An extended example which also demonstrates the use of pointers accesses the
PyImport_FrozenModules
pointer exported by Python.
Quoting the docs for that value:
This pointer is initialized to point to an array of
_frozen
records, terminated by one whose members are allNULL
or zero. When a frozen module is imported, it is searched in this table. Third-party code could play tricks with this to provide a dynamically created collection of frozen modules.
So manipulating this pointer could even prove useful. To restrict the example
size, we show only how this table can be read with ctypes
:
>>> from ctypes import *
>>>
>>> class struct_frozen(Structure):
... _fields_ = [("name", c_char_p),
... ("code", POINTER(c_ubyte)),
... ("size", c_int),
... ("get_code", POINTER(c_ubyte)), # Function pointer
... ]
...
>>>
We have defined the _frozen
data type, so we can get the pointer
to the table:
>>> FrozenTable = POINTER(struct_frozen)
>>> table = FrozenTable.in_dll(pythonapi, "_PyImport_FrozenBootstrap")
>>>
Since table
is a pointer
to the array of struct_frozen
records, we
can iterate over it, but we just have to make sure that our loop terminates,
because pointers have no size. Sooner or later it would probably crash with an
access violation or whatever, so it’s better to break out of the loop when we
hit the NULL
entry:
>>> for item in table:
... if item.name is None:
... break
... print(item.name.decode("ascii"), item.size)
...
_frozen_importlib 31764
_frozen_importlib_external 41499
zipimport 12345
>>>
The fact that standard Python has a frozen module and a frozen package
(indicated by the negative size
member) is not well known, it is only used
for testing. Try it out with import __hello__
for example.
Surprises¶
There are some edges in ctypes
where you might expect something other
than what actually happens.
Consider the following example:
>>> from ctypes import *
>>> class POINT(Structure):
... _fields_ = ("x", c_int), ("y", c_int)
...
>>> class RECT(Structure):
... _fields_ = ("a", POINT), ("b", POINT)
...
>>> p1 = POINT(1, 2)
>>> p2 = POINT(3, 4)
>>> rc = RECT(p1, p2)
>>> print(rc.a.x, rc.a.y, rc.b.x, rc.b.y)
1 2 3 4
>>> # now swap the two points
>>> rc.a, rc.b = rc.b, rc.a
>>> print(rc.a.x, rc.a.y, rc.b.x, rc.b.y)
3 4 3 4
>>>
Hm. We certainly expected the last statement to print 3 4 1 2
. What
happened? Here are the steps of the rc.a, rc.b = rc.b, rc.a
line above:
>>> temp0, temp1 = rc.b, rc.a
>>> rc.a = temp0
>>> rc.b = temp1
>>>
Note that temp0
and temp1
are objects still using the internal buffer of
the rc
object above. So executing rc.a = temp0
copies the buffer
contents of temp0
into rc
‘s buffer. This, in turn, changes the
contents of temp1
. So, the last assignment rc.b = temp1
, doesn’t have
the expected effect.
Keep in mind that retrieving sub-objects from Structure, Unions, and Arrays doesn’t copy the sub-object, instead it retrieves a wrapper object accessing the root-object’s underlying buffer.
Another example that may behave differently from what one would expect is this:
>>> s = c_char_p()
>>> s.value = b"abc def ghi"
>>> s.value
b'abc def ghi'
>>> s.value is s.value
False
>>>
Note
Objects instantiated from c_char_p
can only have their value set to bytes
or integers.
Why is it printing False
? ctypes instances are objects containing a memory
block plus some descriptors accessing the contents of the memory.
Storing a Python object in the memory block does not store the object itself,
instead the contents
of the object is stored. Accessing the contents again
constructs a new Python object each time!
Variable-sized data types¶
ctypes
provides some support for variable-sized arrays and structures.
The resize()
function can be used to resize the memory buffer of an
existing ctypes object. The function takes the object as first argument, and
the requested size in bytes as the second argument. The memory block cannot be
made smaller than the natural memory block specified by the objects type, a
ValueError
is raised if this is tried:
>>> short_array = (c_short * 4)()
>>> print(sizeof(short_array))
8
>>> resize(short_array, 4)
Traceback (most recent call last):
...
ValueError: minimum size is 8
>>> resize(short_array, 32)
>>> sizeof(short_array)
32
>>> sizeof(type(short_array))
8
>>>
This is nice and fine, but how would one access the additional elements contained in this array? Since the type still only knows about 4 elements, we get errors accessing other elements:
>>> short_array[:]
[0, 0, 0, 0]
>>> short_array[7]
Traceback (most recent call last):
...
IndexError: invalid index
>>>
Another way to use variable-sized data types with ctypes
is to use the
dynamic nature of Python, and (re-)define the data type after the required size
is already known, on a case by case basis.