These files may be downloaded from http://mentat.za.net/numpy/kittens
Please do explore beyond the problems given, and feel free to ask questions at any time.
Note: Solutions to some problems are provided in the source tree as problem_name_solution.py. Do not look at these until you've made an attempt yourself!
What is the maximum number of dimensions a NumPy array can have? Use one of the array constructors (np.zeros, np.empty, np.random.random, etc.) to find out.
Construct the following two arrays:
x = np.array([[1, 2], [3, 4]], order='C', dtype=np.uint8)
y = np.array([[1, 2], [3, 4]], order='F', dtype=np.uint8)
Compare the bytes they store in memory by using
[ord(c) for c in x.data]        # Python 2
list(x.tobytes(order='A'))      # Python 3; order='A' follows the array's own memory layout
Note that, even though these arrays store data in different memory order, they are identical from the user's perspective.
print(x)
print(y)
Examine the bytes stored by the following array (using the "ord" trick shown above).
x = np.array([[1, 2], [3, 4]], dtype=np.uint32)
Note that, on most laptops, the byte order will be little-endian, i.e. least significant byte first.
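To make the byte-order remark concrete, here is a minimal sketch that spells out both byte orders explicitly, so the result does not depend on the machine it runs on. The array values and variable names are chosen only for illustration.

```python
import sys
import numpy as np

# Explicit byte orders, independent of the machine running the code.
little = np.array([1, 2], dtype='<u4')  # little-endian uint32
big = np.array([1, 2], dtype='>u4')     # big-endian uint32

little_bytes = list(little.tobytes())
big_bytes = list(big.tobytes())

print(little_bytes)  # least significant byte of each element comes first
print(big_bytes)     # most significant byte of each element comes first

# np.uint32 uses the machine's native order, which sys.byteorder reports.
native_bytes = list(np.array([1, 2], dtype=np.uint32).tobytes())
print(sys.byteorder, native_bytes)
```

On a little-endian machine the native bytes should match the first list.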
Create a 3x3 ndarray called x. Slice out the first row and call that y. Convince yourself that y's base pointer is x.
y.base is x
Modify y and see whether x changes.
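One possible way to run this experiment is sketched below. It assumes x is created directly (so that x owns its data) and contrasts the view with an explicit copy; the fill values are arbitrary.

```python
import numpy as np

x = np.zeros((3, 3))
y = x[0]            # basic slicing returns a view, not a copy

print(y.base is x)  # the view keeps a reference to the array that owns the data
print(np.shares_memory(x, y))

y[:] = 7            # write through the view
print(x)            # the first row of x now holds the new values

c = x[0].copy()     # an explicit copy owns its own data
c[:] = -1           # does not touch x
```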
Advanced: Attempt the Fortran-ordering quiz.
Reproduce z from the following snippet, using broadcasting instead of mgrid. Hint: Use ogrid.
x, y = np.mgrid[:10, :5]
z = x + y
In our solution, broadcasting is used "behind the scenes". To see what happens more clearly, apply np.broadcast_arrays on the x and y from ogrid. This should correspond to the x and y produced by mgrid.
Benchmark the two approaches (mgrid vs ogrid), using IPython's %timeit magic. Can you explain the difference in execution time?
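The relationship between the two grid types can be sketched as follows. The variable names are illustrative; the shapes in the comments follow from the slices used in the exercise.

```python
import numpy as np

xm, ym = np.mgrid[:10, :5]   # dense grids, both of shape (10, 5)
xo, yo = np.ogrid[:10, :5]   # open grids, shapes (10, 1) and (1, 5)

z_dense = xm + ym
z_open = xo + yo             # broadcasting expands (10, 1) + (1, 5) to (10, 5)

# Make the implicit expansion visible.
xb, yb = np.broadcast_arrays(xo, yo)
print(xb.shape, yb.shape)
```

The open grids hold 15 elements in total, while the dense grids hold 100, which is a useful hint when thinking about the timing difference.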
Given a list of 3-dimensional coordinates,
[[1, 2, 10],
 [3, 4, 20],
 [5, 6, 30],
 [7, 8, 40]]
Normalise each coordinate by dividing by its Z (3rd) element. For example, the first row becomes:
[1/10, 2/10, 10/10]
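A minimal sketch of one way to do this with broadcasting is given below. It assumes the coordinates are stored as a floating-point array; the names coords and normalised are illustrative.

```python
import numpy as np

coords = np.array([[1, 2, 10],
                   [3, 4, 20],
                   [5, 6, 30],
                   [7, 8, 40]], dtype=float)

# coords[:, 2] has shape (4,), which does not broadcast against (4, 3).
# Adding a trailing axis gives shape (4, 1), so each row is divided by its own Z.
normalised = coords / coords[:, 2, np.newaxis]
print(normalised)
```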
Create a 3x3 ndarray. Use fancy indexing to slice out the diagonal elements.
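As a hedged sketch of the idea: passing the same index array for rows and columns pairs the indices element by element, which selects the diagonal. The array contents here are arbitrary.

```python
import numpy as np

x = np.arange(9).reshape((3, 3))
i = np.arange(3)

# Index arrays are paired element-wise: (0, 0), (1, 1), (2, 2).
diag = x[i, i]
print(diag)
```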
Predict and verify the shape of the following slicing operation. Remember: index arrays are broadcast first, then come slices.
x = np.empty((10, 8, 6, 5, 4))
idx0 = np.zeros((3, 8)).astype(int)
idx1 = np.zeros((3, 1)).astype(int)
x[1:2, idx0, 1:3, 3:4, idx1]
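The placement rule can be tried on a smaller array first. The sketch below uses a hypothetical 3-dimensional array and a single index array, and compares index arrays that sit next to each other with index arrays separated by a slice.

```python
import numpy as np

a = np.empty((4, 5, 6))
i = np.zeros((2, 3), dtype=int)

# Adjacent index arrays: the broadcast dimensions replace the indexed axes in place.
adjacent = a[:, i, i]
# Index arrays separated by a slice: the broadcast dimensions move to the front.
separated = a[i, :, i]

print(adjacent.shape, separated.shape)
```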
Advanced: This is not strictly speaking a question on indexing, but it's a fun exercise either way.
Construct an array
x = np.arange(12, dtype=np.int32).reshape((3, 4))
so that x is
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
Now, provide to np.lib.stride_tricks.as_strided the strides necessary to view a sliding 2x2 window over this array. The output should be
array([[[[ 0,  1],
         [ 4,  5]],

        [[ 1,  2],
         [ 5,  6]],

        [[ 2,  3],
         [ 6,  7]]],


       [[[ 4,  5],
         [ 8,  9]],

        [[ 5,  6],
         [ 9, 10]],

        [[ 6,  7],
         [10, 11]]]], dtype=int32)
The code is of the form
z = as_strided(x, shape=(2, 3, 2, 2),
               strides=(..., ..., ..., ...))
This sort of stride manipulation is very handy when applying region based statistics or operators.
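To show how strides produce overlapping windows without giving away the 2-D answer, here is a sketch of the 1-dimensional analogue. The array a and the window length are chosen for illustration. Note that as_strided performs no bounds checking, so the shape and strides must stay within the original buffer.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(6, dtype=np.int32)
step = a.strides[0]   # bytes between neighbouring elements (4 for int32)

# 4 windows of length 3. Moving to the next window and moving to the next
# element inside a window both advance by one element in memory.
windows = as_strided(a, shape=(4, 3), strides=(step, step))
print(windows)

# A region-based statistic: the mean of every window, with no data copied.
window_means = windows.mean(axis=1)
```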
Design a data-type for storing the following record:
Use it to represent the following data:
x = np.array([(100, (0, 0.5)),
              (200, (0, 10.3)),
              (300, (5.5, 15.1))], dtype=XXX)
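A nested data-type that accepts the sample data might look like the sketch below. The field names (key, pair, first, second) and the chosen numeric types are assumptions made purely for illustration; they are inferred from the shape of the sample data, not from a record specification.

```python
import numpy as np

# Hypothetical field names and types, inferred from the sample data only.
inner = np.dtype([('first', np.float64), ('second', np.float64)])
record = np.dtype([('key', np.uint64), ('pair', inner)])

x = np.array([(100, (0, 0.5)),
              (200, (0, 10.3)),
              (300, (5.5, 15.1))], dtype=record)

print(x['key'])             # one column across all records
print(x['pair']['second'])  # a field of the nested record
```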
Consider structured_arrays/data.txt. Modify load_txt_template.py to load the data in this file. This requires specifying a data-type that encapsulates a record such as
# name x y block - 2x3 ints
aaaa 1.0 8.0 1 2 3 4 5 6
Create two structured arrays of your choosing. Use the np.savez command to save these to a single data-file. Load the data-file using np.load and confirm whether the data survived the round-trip. (Saving data using save or savez is highly recommended over pickling.)
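The round-trip could be checked along the following lines. The two structured arrays, their field names, and the temporary file location are all illustrative choices.

```python
import os
import tempfile
import numpy as np

points = np.array([(1, 2.5), (2, 3.5)],
                  dtype=[('id', np.int32), ('value', np.float64)])
names = np.array([('a', 1), ('b', 0)],
                 dtype=[('label', 'U1'), ('flag', np.uint8)])

# Keyword arguments become the names under which the arrays are stored.
path = os.path.join(tempfile.mkdtemp(), 'records.npz')
np.savez(path, points=points, names=names)

with np.load(path) as data:
    points_back = data['points']
    names_back = data['names']

# Both the data-type and the values should survive.
round_trip_ok = (points_back.dtype == points.dtype
                 and bool((points_back == points).all())
                 and names_back.dtype == names.dtype
                 and bool((names_back == names).all()))
print(round_trip_ok)
```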
Documentation for NumPy's __array_interface__ may be found in the online docs.
An author of a foreign package (array_interface/mutable_str.py) provides a string class that allocates its own memory:
In [1]: from mutable_str import MutableString
In [2]: s = MutableString('abcde')
In [3]: print s
abcde
You'd like to view these mutable strings as ndarrays, in order to manipulate the underlying memory.
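The general mechanism can be sketched with a stand-in class. OwnedBuffer below is hypothetical and is not the MutableString class from mutable_str.py; it only illustrates how an object that owns its memory can describe that memory to NumPy through __array_interface__.

```python
import ctypes
import numpy as np

class OwnedBuffer:
    """Hypothetical stand-in for a class that allocates its own memory."""

    def __init__(self, text):
        # Exactly len(text) bytes, owned by this object.
        self._buf = ctypes.create_string_buffer(text, len(text))

    @property
    def __array_interface__(self):
        return {
            'version': 3,
            'shape': (len(self._buf),),
            'typestr': '|u1',  # unsigned 8-bit integers
            'data': (ctypes.addressof(self._buf), False),  # (address, read-only flag)
        }

    def __str__(self):
        return self._buf.raw.decode('ascii')

s = OwnedBuffer(b'abcde')
arr = np.asarray(s)   # wraps the object's memory instead of copying it
arr[0] = ord('z')     # modify the underlying memory through the array
print(s)
```

Because the array shares memory with the object, the change made through arr shows up when the object is printed.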