Scientific Tools for Python
NumPy, SciPy and Matplotlib
| Authors: |
Stéfan van der Walt (DSP Lab, Univ of Stellenbosch)
Albert Strasheim (DSP Lab, Univ of Stellenbosch)
Neilen Marais (CEMAGG, Univ of Stellenbosch) |
| Date: |
09/09/2006 |
- Part 1: Overview
- Part 2: Wrapping C code using ctypes
- Part 3: Wrapping Fortran code using f2py
Normally I give these talks to convince people to move away from
MATLAB to Python. However, in this case, you already know that
Python is the best thing since sliced bread. Further, you are not
all scientists, so I'd rather proceed to give you a quick overview
of the different packages we use, with some emphasis on wrapping C
and Fortran code. If you are interested in more detail, you are
welcome to talk to me afterwards.
Pieces of the Puzzle
- Algorithms
- NumPy -- N-dimensional array
- SciPy -- Scientific toolkit
Being the array class underlying all other scientific packages,
NumPy needs to be relatively small and easy to build. It therefore
contains no Fortran code (although there are algorithms in C that
derive from Fortran implementations). It also makes use of BLAS,
LAPACK and ATLAS if those are installed.
SciPy, on the other hand, contains a lot of Fortran code.
- Visualisation
- Matplotlib -- Plotting
- mayavi (visual data inspection)
- Visual (3D animations)
- Interactive computation
-
- Software frontend
- wxPython (Graphical user interfaces)
SciPy
- Contains:
- Statistics
- Optimisation
- Numeric integration
- Linear algebra
- Fourier transforms
- Signal and image processing
- Genetic algorithms
- ODE solvers
- Special functions (Bessel etc.)
Fortran wrappers around FFTPACK, FITPACK, MINPACK, ODEPACK, QUADPACK etc.
Pretty pictures with matplotlib
- History of matplotlib
- High-quality graphs
- Supports LaTeX captions
- Roughly follows MATLAB (R) interface
- Can be embedded in applications
- Generates output in various formats, including png, jpeg, eps
- 2D so far, with some work on 3D capabilities
More pretty pictures
Basic array operations
Constructing an array:
import numpy as N
x = N.array([1,2,3])
x = N.array([1,2,3], dtype=int)
x = N.zeros((10,10,3), dtype=float)
x = N.empty((10,10,3), dtype=float)
Slicing over known dimensions:
>>> import numpy as N
>>> x = N.random.random((5,5))
>>> x
array([[ 0.36155788, 0.94092246, 0.87766115, 0.38876014, 0.14242888],
[ 0.41110453, 0.85805215, 0.77545163, 0.81612129, 0.68321265],
[ 0.8418767 , 0.16841387, 0.7550585 , 0.99356193, 0.65077119],
[ 0.67116546, 0.97694933, 0.37857987, 0.71123056, 0.43778407],
[ 0.52968691, 0.8107153 , 0.65152697, 0.97804839, 0.27858883]])
>>> x[0,:]
array([ 0.36155788, 0.94092246, 0.87766115, 0.38876014, 0.14242888])
Slicing over all unknown dimensions:
x = N.random.rand((5,5,3))
x[0,1,1]
x[1,...]
Further array manipulation
Reshaping:
>>> x = N.arange(12)
>>> x
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
>>> x.reshape((4,3))
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
Convenient axis-wise operators:
>>> import numpy as N
>>> x = N.ones(12).reshape((4,3))
>>> x
array([[ 1., 1., 1.],
[ 1., 1., 1.],
[ 1., 1., 1.],
[ 1., 1., 1.]])
>>> x.sum()
12.0
>>> x.sum(axis=0)
array([ 4., 4., 4.])
>>> x.sum(axis=1)
array([ 3., 3., 3., 3.])
Record arrays
Example:
In [1]: import numpy as N
In [2]: dt = N.dtype([('id',int),('score',float)])
In [3]: N.zeros(15,dtype=dt)
Out[3]:
array([(0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0),
(0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0), (0, 0.0),
(0, 0.0), (0, 0.0), (0, 0.0)],
dtype=[('id', '<i4'), ('score', '<f8')])
In [4]: users = N.zeros(15,dtype=dt)
In [5]: users['id']
Out[5]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
In [6]: users['score']
Out[6]:
array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0.])
Arrays in memory
Changing dtype and viewing memory:
>>> import numpy as N
>>> x = N.array([255,255], N.uint8)
>>> x.astype(N.uint16)
array([255, 255], dtype=uint16)
>>> x.view(N.uint16)
array([65535], dtype=uint16)
Array interface
Python-side:
In [4]: x = N.array([1,2,3])
In [5]: x.__array_interface__
Out[5]:
{'data': (137220152, False),
'descr': [('', '<i4')],
'shape': (3,),
'strides': None,
'typestr': '<i4',
'version': 3}
C-side:
typedef struct {
int two;
int nd;
char typekind;
int itemsize;
int flags;
Py_intptr_t *shape;
Py_intptr_t *strides;
void *data;
PyObject *descr;
} PyArrayInterface;
Extending numpy
- Needed under special circuimstances only
- f2py for Fortran code
- ctypes, SWIG, pyrex for C code
- Weave for inline C code
- Native C API