
Commit 461483f

Added Cython guided exercise example using numerical integration
1 parent f9037fd commit 461483f


7 files changed, +258 -0 lines changed

cython/integrate/.gitignore

@@ -0,0 +1,3 @@
*.c
*.html

cython/integrate/Readme.md

@@ -0,0 +1,139 @@
# Faster code via static typing and Cython
Cython is a Python compiler. This means that it can compile normal Python code without changes (with a few
obvious exceptions of some as-yet unsupported language features). However, for performance critical code,
it is often helpful to add static type declarations, as they will allow Cython to step out of the dynamic
nature of the Python code and generate simpler and faster C code - sometimes faster by orders of magnitude.

It must be noted, however, that type declarations can make the source code more verbose and thus less
readable. It is therefore discouraged to use them without good reason, such as where benchmarks prove that
they really make the code substantially faster in a performance critical section. Typically a few types in
the right spots go a long way.

All C types are available for type declarations: integer and floating point types, complex numbers, structs,
unions and pointer types. Cython can automatically and correctly convert between the types on assignment.
This also includes Python’s arbitrary size integer types, where value overflows on conversion to a C type
will raise a Python OverflowError at runtime. (It does not, however, check for overflow when doing
arithmetic.) The generated C code will handle the platform dependent sizes of C types correctly and safely
in this case.

Types are declared via the cdef keyword.

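The two overflow behaviours described above (checked on conversion, unchecked in arithmetic) can be imitated
with the standard library alone. The sketch below is not Cython and makes no claim about the C code Cython
generates; `array` and `ctypes` are stand-ins chosen for illustration, and `to_c_int` is a hypothetical helper.

```python
import array
import ctypes


def to_c_int(value):
    """Store a Python int in a C int slot; raises OverflowError if it does not fit."""
    return array.array("i", [value])[0]


# A small value converts without trouble.
print(to_c_int(123))

# A value far outside the C integer range is rejected at conversion time.
try:
    to_c_int(2**70)
except OverflowError as exc:
    print("conversion rejected:", exc)

# ctypes performs no overflow checking, so one past the largest 32-bit signed
# value silently wraps around, much like unchecked fixed-width arithmetic.
wrapped = ctypes.c_int32((2**31 - 1) + 1).value
print(wrapped)
```

The practical lesson for the exercise is to pick C integer types that are wide enough for the values the loop
will actually produce.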
## Exercise

The file **integrate.py** contains pure Python code for numerically integrating a function. The file
**cyintegrate.pyx** contains an exact copy of the code in **integrate.py**. The **build_cython.sh** script
builds the Cython code present in **cyintegrate.pyx** using the configuration in **setup.py**.

The **cython_speedup.py** file is set up to import both the pure Python and the Cython version and to
benchmark their performance using a common configuration.

NOTE: You need to rebuild your Cython code any time you make changes to **cyintegrate.pyx**.

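Before optimizing, it can help to confirm what the integration code computes. The sketch below repeats the
left-endpoint rectangle rule from **integrate.py** (with `cos` inlined) and compares it with the exact
integral, using the fact that the antiderivative of cos is sin. The chosen values of N are arbitrary.

```python
from math import cos, sin, pi


def integrate_f(a, b, N):
    """Left-endpoint rectangle rule, same arithmetic as integrate.py with f = cos."""
    s = 0.0
    dx = (b - a) / N
    for i in range(N):
        s += cos(a + i * dx)
    return s * dx


a, b = 0.0, pi / 2
exact = sin(b) - sin(a)  # exact value of the integral of cos from a to b

for N in (10, 100, 1000, 10000):
    approx = integrate_f(a, b, N)
    print("N={:>6}  approx={:.6f}  error={:.2e}".format(N, approx, abs(approx - exact)))
```

The error shrinks as N grows, which is why the benchmark needs a fairly large N and why the loop body is the
part worth speeding up.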
## Step 1 - See how things work to start with
Cython will give some performance benefit even when compiling Python code without any static type
declarations.

Build the Cython code:

```bash
python setup.py build_ext --inplace
```

Run **cython_speedup.py** to compare the two implementations at this point.

```bash
python cython_speedup.py
```

On my system, even though we have done **nothing** to change the pure Python code in any way, Cython
provides about a 60% speedup.

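The benchmark script reports the speedup as a ratio ("times"), while the figure above is a percentage. A small
sketch of how the two relate, assuming that "60% speedup" means the pure Python version takes 1.6 times as long
as the Cython version. The helper names and the timings are hypothetical, not measurements.

```python
def speedup_factor(py_time, cy_time):
    """Ratio reported by cython_speedup.py: Python time divided by Cython time."""
    return py_time / cy_time


def percent_faster(py_time, cy_time):
    """The same comparison expressed as a percentage improvement."""
    return (speedup_factor(py_time, cy_time) - 1.0) * 100.0


# Hypothetical average timings in seconds, chosen only to illustrate the arithmetic.
py_avg, cy_avg = 1.6e-3, 1.0e-3

print("speedup: {:.2g} times".format(speedup_factor(py_avg, cy_avg)))
print("speedup: {:.0f}%".format(percent_faster(py_avg, cy_avg)))
```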
## Step 2 - Look at the HTML annotation file
The **setup.py** file contains this code, which tells Cython to create an HTML annotation file:

```python
import Cython.Compiler.Options
Cython.Compiler.Options.annotate = True
```

This is equivalent to calling cythonize with the **-a** flag at the command line.

Open the **cyintegrate.html** file in the web browser of your choice. Yellow lines show code which requires
interaction with the Python interpreter. The darker the yellow, the more interactions with the Python
interpreter. Any interaction with the Python interpreter slows down Cython's generated C/C++ code. White lines
indicate no interaction with the Python interpreter (pure C code).

What you want to do is get rid of as much yellow as possible and end up with as much white as possible. This
matters particularly inside loops. The main way you get rid of these interactions with the Python interpreter
is to declare optional C static types, so Cython can use them to generate fast C code.

## Step 3 - Typing Variables
As we saw, simply compiling this code with Cython gives only about a 60% speedup. This is better than nothing,
but adding some static types can make a much larger difference.

Types for function arguments can be added simply by prefacing the parameter name with a C type, such as:

```python
def f(double x):
    return cos(x)
```

Types for local variables can be added by declaring them with **cdef**:

```python
cdef int i
```

Try adding static types for all function arguments and all local variables. I recommend using **double** for
floating point types since the Python **float** type corresponds to a C **double**. Once you have done
this, recompile your Cython code and re-run **cython_speedup.py**.

This results in a TBD times speedup over the pure Python version.

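The recommendation to use **double** can be checked from plain Python. The sketch below uses the standard
`struct` and `sys` modules to show that a Python float round-trips exactly through a C double but not through
a C float. It illustrates the type correspondence only and involves no Cython.

```python
import struct
import sys

# Size in bytes of a C double as seen by struct, and the number of mantissa
# bits in a Python float.
print(struct.calcsize("d"))
print(sys.float_info.mant_dig)

x = 0.1
# Packing into a C double and back preserves the value exactly.
as_double = struct.unpack("d", struct.pack("d", x))[0]
# Packing into a C float (single precision) loses precision.
as_float = struct.unpack("f", struct.pack("f", x))[0]

print(as_double == x)
print(as_float == x)
```

Declaring a variable as a C float instead would therefore change the numerical result of the integration
slightly, in addition to any effect on speed.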
## Step 4 - Typing Functions
Python function calls can be expensive – in Cython doubly so because one might need to convert to and from
Python objects to do the call. In our example above, the argument is assumed to be a C double both inside
f() and in the call to it, yet a Python float object must be constructed around the argument in order to
pass it.

Therefore Cython provides a syntax for declaring a C-style function, the cdef keyword:

```python
cdef double f(double x) except? -2:
    return cos(x)
```

Some form of except-modifier should usually be added, otherwise Cython will not be able to propagate
exceptions raised in the function (or a function it calls). The except? -2 means that an error will be
checked for if -2 is returned (though the ? indicates that -2 may also be used as a valid return value).
Alternatively, the slower except * is always safe. An except clause can be left out if the function returns
a Python object or if it is guaranteed that an exception will not be raised within the function call.

A side-effect of cdef is that the function is no longer available from Python-space, as Python wouldn’t
know how to call it. It is also no longer possible to change f() at runtime.

Using the cpdef keyword instead of cdef, a Python wrapper is also created, so that the function is
available both from Cython (fast, passing typed values directly) and from Python (wrapping values in Python
objects). In fact, cpdef does not just provide a Python wrapper, it also installs logic to allow the
method to be overridden by Python methods, even when called from within Cython. This does add a tiny
overhead compared to cdef methods.

Speedup: TBD times over pure Python.

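The cost of a Python-level function call, which motivates the cdef function above, can be observed even
without Cython. The sketch below times the same loop with and without the extra call to `f`. The helper names,
loop size and repeat count are arbitrary choices, and the timings will vary between machines.

```python
import timeit
from math import cos


def f(x):
    return cos(x)


def sum_via_f(n):
    """Accumulate cos values through the Python wrapper function f."""
    s = 0.0
    for i in range(n):
        s += f(i * 0.001)
    return s


def sum_inline(n):
    """Accumulate the same values by calling cos directly."""
    s = 0.0
    for i in range(n):
        s += cos(i * 0.001)
    return s


n = 10000
# Both loops perform identical arithmetic; only the extra call differs.
t_f = timeit.timeit(lambda: sum_via_f(n), number=20)
t_inline = timeit.timeit(lambda: sum_inline(n), number=20)
print("via f():  {:.3g} s".format(t_f))
print("inline:   {:.3g} s".format(t_inline))
```

This shows only the interpreter's call overhead. In compiled Cython code there is the additional cost of
converting between C doubles and Python float objects, which a cdef function avoids.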
## Step 5 - Replacing Python standard library calls with C library calls
The call to **cos(x)** still requires interaction with the Python interpreter. The C standard library
also has a pure C implementation of the cosine function. Wouldn't it be great if we could just call that
function instead?

Well, we can! Cython supports this sort of syntax:

```python
from libc.math cimport cos
```

We can replace the "from math import cos" line with the above for further performance improvements.

Speedup: TBD times over pure Python.

cython/integrate/build_cython.sh

@@ -0,0 +1 @@
python setup.py build_ext --inplace

cython/integrate/cyintegrate.pyx

@@ -0,0 +1,29 @@
# coding=utf-8
"""
Pure Python code for numerically integrating a function.
"""
from math import cos


def f(x):
    """Example function in one variable.

    :param x: float - point we wish to evaluate the function at
    :return: float - function value at point x
    """
    return cos(x)


def integrate_f(a, b, N):
    """Numerically integrate function f starting at point a and going to point b, using N rectangles.

    :param a: float - starting point
    :param b: float - ending point
    :param N: int - number of points to use in the rectangular approximation to the integral
    :return: float - approximation to the true integral, which improves as N increases
    """
    s = 0.0
    dx = (b-a)/N
    for i in range(N):
        s += f(a+i*dx)
    return s * dx

cython/integrate/cython_speedup.py

@@ -0,0 +1,46 @@
#!/usr/bin/env python
# coding=utf-8
""" Python wrapper to time the Cython implementation versus the pure Python implementation.

This is for the example of numerical integration.
"""
from integrate import integrate_f as pyintegrate_f
from cyintegrate import integrate_f as cyintegrate_f
from math import pi, isclose

if __name__ == '__main__':
    import sys
    import timeit

    a = 0
    b = pi/2
    # Optional first argument: number of rectangles (defaults to 10000)
    N = 10000
    try:
        N = int(sys.argv[1])
    except (IndexError, ValueError):
        pass

    # Optional second argument: number of timing repetitions (defaults to 100)
    number_of_times = 100
    try:
        number_of_times = int(sys.argv[2])
    except (IndexError, ValueError):
        pass

    # Sanity check: both implementations must agree before we time them
    int_py = pyintegrate_f(a, b, N)
    int_cy = cyintegrate_f(a, b, N)
    if not isclose(int_py, int_cy):
        raise ValueError("Cython result {} differs from Python result {}".format(int_cy, int_py))

    py_tot = timeit.timeit("integrate_f({}, {}, {})".format(a, b, N),
                           setup="from integrate import integrate_f",
                           number=number_of_times)
    cy_tot = timeit.timeit("integrate_f({}, {}, {})".format(a, b, N),
                           setup="from cyintegrate import integrate_f",
                           number=number_of_times)
    py_avg = py_tot / number_of_times
    cy_avg = cy_tot / number_of_times

    print("int({}, {}, {}) = {}".format(a, b, N, int_py))
    print("Python average time: {0:.2g}".format(py_avg))
    print("Cython average time: {0:.2g}".format(cy_avg))
    print("Cython speedup: {0:.2g} times".format(py_avg/cy_avg))

cython/integrate/integrate.py

@@ -0,0 +1,29 @@
# coding=utf-8
"""
Pure Python code for numerically integrating a function.
"""
from math import cos


def f(x):
    """Example function in one variable.

    :param x: float - point we wish to evaluate the function at
    :return: float - function value at point x
    """
    return cos(x)


def integrate_f(a, b, N):
    """Numerically integrate function f starting at point a and going to point b, using N rectangles.

    :param a: float - starting point
    :param b: float - ending point
    :param N: int - number of points to use in the rectangular approximation to the integral
    :return: float - approximation to the true integral, which improves as N increases
    """
    s = 0.0
    dx = (b-a)/N
    for i in range(N):
        s += f(a+i*dx)
    return s * dx

cython/integrate/setup.py

@@ -0,0 +1,11 @@
# coding=utf-8
from setuptools import setup
from Cython.Build import cythonize
import Cython.Compiler.Options

Cython.Compiler.Options.annotate = True

setup(
    name="cyintegrate",
    ext_modules=cythonize('cyintegrate.pyx', compiler_directives={'embedsignature': True}),
)
