Problem set for SciPy2010 NumPy Tutorial

P1: The NumPy N-dimensional array
P2: Broadcasting
P3: Indexing
P4: Structured Arrays
P5: Universal Functions
P6: Array Interface
P7: Optimisation: Demo Only

These files may be downloaded from http://mentat.za.net/numpy/kittens

Please do explore beyond the problems given, and feel free to ask questions at any time.

Note

Solutions to some problems are provided in the source tree as problem_name_solution.py. Do not look at these until you've made an attempt yourself!

P1: The NumPy N-dimensional array

What is the maximum number of dimensions a NumPy array can have? Use one of the array constructors (np.zeros, np.empty, np.random.random, etc.) to find out.
Construct the following two arrays:
```
x = np.array([[1, 2], [3, 4]], order='C', dtype=np.uint8)
y = np.array([[1, 2], [3, 4]], order='F', dtype=np.uint8)
```
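Compare the bytes they store in memory by using
```
[ord(c) for c in x.data]      # Python 2
list(x.tobytes(order='A'))    # Python 3: x.data is a memoryview, so request the bytes in memory order
```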
Note that, even though these arrays store data in different memory order, they are identical from the user's perspective.
```
print(x)
print(y)
```
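Besides the raw bytes, the memory order also shows up in an array's flags and strides. The following is a minimal sketch on a different (2x3) array, assuming Python 3 and a recent NumPy; the names a_c and a_f are illustrative only.

```python
import numpy as np

# A 2x3 array of single-byte items, stored in both memory orders.
a_c = np.array([[1, 2, 3], [4, 5, 6]], order='C', dtype=np.uint8)
a_f = np.array([[1, 2, 3], [4, 5, 6]], order='F', dtype=np.uint8)

# The flags record which layout is in use.
print(a_c.flags['C_CONTIGUOUS'], a_c.flags['F_CONTIGUOUS'])
print(a_f.flags['C_CONTIGUOUS'], a_f.flags['F_CONTIGUOUS'])

# Strides give the number of bytes to step along each axis.
print(a_c.strides)  # (3, 1)
print(a_f.strides)  # (1, 2)

# Element-wise, the two arrays still compare equal.
print(np.array_equal(a_c, a_f))
```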
Examine the bytes stored by the following array (using the "ord" trick shown above).
```
x = np.array([[1, 2], [3, 4]], dtype=np.uint32)
```
Note that, on most laptops, the byte order will be little-endian, i.e. least significant byte first.
Create a 3x3 ndarray called x. Slice out the first row and call that y. Convince yourself that y is a view whose base is x.
```
y.base is x
```
Modify y and see whether x changes.
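As a side note on views versus copies, here is a small sketch on a one-dimensional array (not the 3x3 array of the problem). It assumes a NumPy recent enough to provide np.shares_memory; the names a, v and c are illustrative.

```python
import numpy as np

a = np.arange(10)
v = a[::2]          # basic slicing returns a view
c = a[[0, 2, 4]]    # indexing with a list (fancy indexing) returns a copy

# A view records the array it was taken from in its .base attribute.
print(v.base is a)
print(c.base is a)

# np.shares_memory reports whether two arrays overlap in memory.
print(np.shares_memory(a, v))
print(np.shares_memory(a, c))
```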
Advanced: Attempt the Fortran-ordering quiz.

P2: Broadcasting

Reproduce z from the following snippet, using broadcasting instead of mgrid. Hint: Use ogrid.
```
x, y = np.mgrid[:10, :5]
z = x + y
```
In our solution, broadcasting is used "behind the scenes". To see what happens more clearly, apply np.broadcast_arrays on the x and y from ogrid. This should correspond to the x and y produced by mgrid.
Benchmark the two approaches (mgrid vs ogrid), using IPython's %timeit magic. Can you explain the difference in execution time?
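The mechanism behind these questions can be sketched with a hand-built column and row (the names col and row are illustrative, and the shapes are chosen arbitrarily). The zero strides printed at the end suggest that the broadcast operands are views that repeat existing data rather than full copies.

```python
import numpy as np

col = np.arange(3).reshape(3, 1)   # shape (3, 1)
row = np.arange(4).reshape(1, 4)   # shape (1, 4)

# Broadcasting stretches both operands to a common shape of (3, 4).
total = col + row
print(total.shape)

# np.broadcast_arrays makes the stretched operands explicit.
col_b, row_b = np.broadcast_arrays(col, row)
print(col_b.shape, row_b.shape)

# A stride of 0 along an axis means the same memory is re-read along it.
print(col_b.strides, row_b.strides)
```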
Given a list of 3-dimensional coordinates,
```
[[1, 2, 10],
 [3, 4, 20],
 [5, 6, 30],
 [7, 8, 40]]
```
Normalise each coordinate by dividing by its Z (third) element. For example, the first row becomes:
```
[1/10, 2/10, 10/10]
```
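A related per-row operation, shown here only as a sketch of the general pattern, is to subtract each row's mean. The per-row values have shape (2,), so a trailing axis is added to make them broadcast against the rows; the array pts is made-up data.

```python
import numpy as np

pts = np.array([[1.0, 2.0, 3.0],
                [4.0, 6.0, 8.0]])

row_mean = pts.mean(axis=1)                 # shape (2,)

# (2, 3) - (2, 1): the added axis lines each mean up with its own row.
centred = pts - row_mean[:, np.newaxis]
print(centred)
```

Without the extra axis, shapes (2, 3) and (2,) are not compatible, because broadcasting aligns shapes from the last dimension.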

P3: Indexing

Create a 3x3 ndarray. Use fancy indexing to slice out the diagonal elements.

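Predict and verify the shape of the following indexing operation. Remember: the index arrays are broadcast against each other first; when they are separated by slices, as here, the broadcast dimensions come first in the result, followed by the sliced dimensions.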

x = np.empty((10, 8, 6, 5, 4))

idx0 = np.zeros((3, 8)).astype(int)
idx1 = np.zeros((3, 1)).astype(int)

x[1:2, idx0, 1:3, 3:4, idx1]
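The placement of the broadcast dimensions can be tried out on a smaller array first. This is a sketch with illustrative names (a, i, j) and arbitrarily chosen shapes; it contrasts adjacent index arrays with index arrays separated by a slice.

```python
import numpy as np

a = np.empty((5, 6, 7))
i = np.zeros((2, 3), dtype=int)
j = np.zeros((2, 1), dtype=int)   # broadcasts against i to shape (2, 3)

# Adjacent index arrays: the broadcast shape replaces the indexed axes in place.
print(a[:, i, j].shape)    # (5, 2, 3)

# Index arrays separated by a slice: the broadcast shape moves to the front.
print(a[i, :, j].shape)    # (2, 3, 6)
```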

Advanced: This is not strictly speaking a question on indexing, but it's a fun exercise either way.

Construct an array

x = np.arange(12, dtype=np.int32).reshape((3, 4))

so that x is

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

Now, provide to np.lib.stride_tricks.as_strided the strides necessary to view a sliding 2x2 window over this array. The output should be

array([[[[ 0,  1],
         [ 4,  5]],

        [[ 1,  2],
         [ 5,  6]],

        [[ 2,  3],
         [ 6,  7]]],


       [[[ 4,  5],
         [ 8,  9]],

        [[ 5,  6],
         [ 9, 10]],

        [[ 6,  7],
         [10, 11]]]], dtype=int32)

The code is of the form

z = as_strided(x, shape=(2, 3, 2, 2),
                  strides=(..., ..., ..., ...))

This sort of stride manipulation is very handy when applying region-based statistics or operators.
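As a warm-up, the same idea in one dimension can be sketched as follows: a sliding window of length 3 over six elements, followed by a per-window mean. The array a and the window size are chosen for illustration. Note that as_strided does not check bounds, so the shape and strides must stay within the original buffer.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(6, dtype=np.int32)
step = a.strides[0]    # bytes between neighbouring elements

# Four overlapping windows of length 3; both axes advance one element at a time.
windows = as_strided(a, shape=(4, 3), strides=(step, step))
print(windows)

# A statistic per window, computed on the view without copying the input.
print(windows.mean(axis=1))
```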

P4: Structured Arrays

Design a data-type for storing the following record:
- Timestamp in nanoseconds (a 64-bit unsigned integer)
- Position (x- and y-coordinates, stored as floating point numbers)
Use it to represent the following data:
```
x = np.array([(100, (0, 0.5)),
              (200, (0, 10.3)),
              (300, (5.5, 15.1))], dtype=XXX)
```
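For comparison, a nested data-type for an unrelated, hypothetical record (an integer id plus an RGB colour) might look like the sketch below. The field names and the names colour_t and pixel_t are made up for illustration.

```python
import numpy as np

# Hypothetical record: an integer id plus a nested (r, g, b) colour.
colour_t = np.dtype([('r', np.uint8), ('g', np.uint8), ('b', np.uint8)])
pixel_t = np.dtype([('id', np.uint32), ('colour', colour_t)])

p = np.array([(1, (255, 0, 0)),
              (2, (0, 128, 255))], dtype=pixel_t)

# Fields are accessed by name; nested fields chain naturally.
print(p['id'])
print(p['colour']['g'])

# Fields are packed without padding by default: 4 + 3 bytes per record.
print(p.dtype.itemsize)
```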
Consider structured_arrays/data.txt. Modify load_txt_template.py to load the data in this file. This requires specifying a data-type that encapsulates a record such as
```
# name  x       y       block - 2x3 ints
aaaa    1.0     8.0     1 2 3 4 5 6
```
Create two structured arrays of your choosing. Use the np.savez command to save these to a single data-file. Load the data-file using np.load and confirm whether the data survived the round-trip. (Saving data using np.save or np.savez is highly recommended over pickling.)
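The mechanics of the round-trip can be sketched with two plain (non-structured) arrays. An in-memory io.BytesIO buffer stands in for a file on disk here, and the keyword names first and second are arbitrary.

```python
import io
import numpy as np

a = np.arange(5)
b = np.linspace(0.0, 1.0, 3)

buf = io.BytesIO()                  # in-memory stand-in for a file on disk
np.savez(buf, first=a, second=b)    # keyword names become the keys in the archive
buf.seek(0)

with np.load(buf) as data:
    print(sorted(data.files))
    a2 = data['first']
    b2 = data['second']

print(np.array_equal(a, a2), np.array_equal(b, b2))
```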

P5: Universal Functions

Modify the code provided in ufunc to create your own universal function.
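For quick experiments at the Python level, np.frompyfunc can wrap a scalar Python function so that it broadcasts like a ufunc. This is only a sketch and not a substitute for the provided ufunc code: the wrapped function runs in the interpreter and returns object arrays. The function clip_add is a made-up example.

```python
import numpy as np

def clip_add(a, b):
    """Hypothetical scalar function: add, but never exceed 10."""
    return min(a + b, 10)

# Wrap it as a ufunc taking two inputs and producing one output.
uclip_add = np.frompyfunc(clip_add, 2, 1)

x = np.arange(4).reshape(4, 1)
y = np.array([5, 8])

out = uclip_add(x, y)       # broadcasts (4, 1) against (2,) to (4, 2)
print(type(uclip_add))
print(out.dtype)            # object
print(out.astype(int))
```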

P6: Array Interface

Documentation for NumPy's __array_interface__ may be found in the online docs.

An author of a foreign package (array_interface/mutable_str.py) provides a string class that allocates its own memory:
```
In [1]: from mutable_str import MutableString

In [2]: s = MutableString('abcde')

In [3]: print(s)
abcde
```
You'd like to view these mutable strings as ndarrays, in order to manipulate the underlying memory.
1. Add an __array_interface__ dictionary attribute to s, then convert s to an ndarray. Use the given array_interface/template.py as a guide.
2. Add "1" to the array. Now print the original string to ensure that its value was modified.
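To show the general shape of the protocol, here is a sketch on a hypothetical class, ByteBox, which is not the MutableString from the exercise. It assumes the memory is held in a ctypes buffer so that its address is easy to obtain; the dictionary keys follow version 3 of the array interface.

```python
import ctypes
import numpy as np

class ByteBox(object):
    """Hypothetical container that owns a small, writable byte buffer."""

    def __init__(self, text):
        raw = text.encode('ascii')
        self._buf = ctypes.create_string_buffer(raw, len(raw))

    @property
    def __array_interface__(self):
        return {
            'shape': (len(self._buf),),
            'typestr': '|u1',                                # unsigned 1-byte items
            'data': (ctypes.addressof(self._buf), False),    # (address, read-only flag)
            'version': 3,
        }

    def __str__(self):
        return self._buf.raw.decode('ascii')

box = ByteBox('abc')
arr = np.asarray(box)    # no copy: arr points at the buffer owned by box
arr[0] = ord('x')        # write through the array
print(box)
```

The array keeps a reference to box, so the buffer stays alive for as long as arr does.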

P7: Optimisation: Demo Only