3.10 Arrays

The elements of a list can, in principle, be of different types, e.g. [1, 3.5, "boo!"]. This is sometimes useful but, as scientists, you will mostly deal with arrays. These are like lists, but every element is of the same type (either integers or floats). This can make mathematical operations on them much faster, often by orders of magnitude.

Arrays are not a ``core'' data type like integers, floats and strings. To gain access to the array type we must import the Numeric library. This is done by adding the following line to the start of every program in which arrays are used:

    from Numeric import *

When you create an array you must then explicitly tell Python you are doing so as follows:

    >>> from Numeric import *
    >>> xx = array([1, 5, 6.5, -11])
    >>> print xx
    [  1.    5.    6.5 -11. ]

The square brackets within the parentheses are required. You can call an array anything you could call any other variable.

The decimal point after 1, 5 and -11 when they are printed indicates that they are now stored as floating point values. All the elements of an array must be of the same type, and because we included 6.5 in the array, Python automatically used floats.
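If Numeric is not installed, the same "one type for every element" idea can be seen with the array module from Python's standard library. This is only a sketch under that assumption: the standard array type is a different and far more limited thing than a Numeric array, and it does not choose the type for you. The typecode 'd' (double-precision float) belongs to that module and is not part of Numeric. The prints use parentheses around a single value so that they run under both old and new versions of Python.

```python
# Stand-in for Numeric: the standard library's array module (an assumption,
# not the library used in the rest of this section).
from array import array

# Typecode 'd' asks for floats; every element is converted on the way in.
xx = array('d', [1, 5, 6.5, -11])
print(xx)

as_list = xx.tolist()            # copy the elements back into a plain list
print(as_list)                   # [1.0, 5.0, 6.5, -11.0]

# Check that every element really is a float, including the former integers.
all_floats = all(isinstance(v, float) for v in as_list)
print(all_floats)                # True
```

As with the Numeric example, the whole numbers 1, 5 and -11 come back as 1.0, 5.0 and -11.0.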

We can extend the box analogy used to describe variables in Section 3.3 to arrays. An array is a box too, but within it are smaller, numbered boxes. Those numbers start at zero, and go up in increments of one. See Figure 3.1.

Figure 3.1: Arrays can be thought of as boxes around boxes
\includegraphics[]{array.eps}

This simplifies the program: there need not be very many differently named variables. More importantly, it allows individual elements to be referenced by offset. By referencing we mean either getting the value of an element or changing it. The first element in the array has the offset [0] (n.b. not 1). An individual element can then be used in calculations like any other float or integer variable. The following example shows referencing by offset, using the array created above:

    >>> print xx
    [  1.    5.    6.5 -11. ]
    >>> print xx[0]
    1.0
    >>> print xx[3]
    -11.0
    >>> print range(int(xx[1]))  # Using the element like any other variable;
    [0, 1, 2, 3, 4]              # int() converts the float 5.0 to the integer 5
    >>> xx[0] = 66.7
    >>> print xx
    [ 66.7   5.    6.5 -11. ]

Let's consider an example. The user has five numbers representing the number of counts made by a Geiger-Muller tube during successive one-minute intervals. The following program will read those numbers in from the keyboard and store them in an array.

    from Numeric import *

    counts = zeros(5, Int)      # See below for an explanation of this
    
    for i in range(0, 5):
        print "Minute number", i
        response = input("Give the number of counts made in the minute: ")
        counts[i] = response

    print "Thank you"

The contents of the for loop are executed five times (see Section 3.6.2 ``Using the range function'' if you are unsure). Each time, the program asks the user for the one-minute count. Each response is put into the counts array, at the offset stored in i (which, remember, will run from $0$ to $4$).
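To follow how i selects a slot without typing anything in, here is a minimal sketch of the same loop with the keyboard input replaced by fixed values. The readings are made up purely for illustration, and a plain list stands in for the Numeric array so that the sketch runs without Numeric installed. The names readings and total are not part of the program above.

```python
# Made-up readings standing in for what the user would type (hypothetical data).
readings = [12, 15, 9, 14, 11]

counts = [0] * 5                  # five slots, all zero to begin with

for i in range(0, 5):
    counts[i] = readings[i]       # store reading number i at offset i

# Add up the stored counts, again visiting offsets 0 to 4 in turn.
total = 0
for i in range(0, 5):
    total = total + counts[i]

print(counts)                     # [12, 15, 9, 14, 11]
print(total)                      # 61
```

The second loop shows that the stored values can be read back by the same offsets that were used to store them.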

The new thing in that example is the zeros function. You cannot get or change the value of an element of an array if that element does not exist. For example, you cannot change the 5th element of a two element array:

    >>> xx = array([3, 4])
    >>> xx[4] = 99
    Traceback (most recent call last):
      File "<pyshell#4>", line 1, in ?
        xx[4] = 99
    IndexError: index out of bounds

Contrast this with numbers (floats and integers) and strings. With these, assigning to the variable creates it. With arrays, Python must first know how many elements the variable contains so that it knows where to put things, i.e. ``how many boxes are inside the box''.
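The same rule can be seen with an ordinary list, which also refuses assignment to a slot that does not exist. The sketch below is an illustration only: it uses a plain list rather than a Numeric array, and it uses try and except to catch the error so that the program carries on. The name outcome is invented for the example.

```python
counts = []                  # a list with no elements at all

try:
    counts[0] = 12           # there is no slot 0 yet, so this fails
    outcome = "stored"
except IndexError:
    outcome = "IndexError"
print(outcome)               # IndexError

counts = [0] * 5             # now five slots exist, each holding 0
counts[0] = 12               # assigning to an existing slot works
print(counts)                # [12, 0, 0, 0, 0]
```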

This means we must create a five-element array before we can start storing the Geiger-Muller counts in it. We could do this by writing counts = array([0, 0, 0, 0, 0]), but this would quickly get tedious if we wanted a bigger array.

Instead we do it with the zeros() function. This takes two parameters, separated by a comma. The first is the number of elements in the array. The second is the type of the elements in the array (remember that all the elements are of the same type). This can be Int or Float (note the upper case ``I'' and ``F''; this is to distinguish them from the float() and int() functions discussed in Section 3.12 ``File input and output'').

In the Geiger-Muller example we created an array of type Int because we knew in advance that the number of counts the apparatus would make would necessarily be a whole number. Here are some examples of zeros() at work:

    >>> xx = zeros(5, Int)
    >>> print xx
    [0 0 0 0 0]
    >>> yy = zeros(4, Float)
    >>> print yy
    [ 0.  0.  0.  0.]

If there is any uncertainty as to whether Int or Float arrays are appropriate then use Float.

EXERCISE 3.10
Using for loops, range(), and the zeros() function, construct two 100 element arrays, such that element i of one array contains $\sin(2\pi i / 100)$ and the corresponding element of the other contains $\cos(2\pi i / 100)$.

Compute the scalar (i.e. dot) product of the two arrays, to check that $\sin$ and $\cos$ are orthogonal, i.e. that their dot product is zero. The scalar, or dot, product is defined as:


\begin{displaymath}
\mathbf{x} \cdot \mathbf{y} = \sum_{i} x_{i} y_{i}
\end{displaymath}
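
As a small worked case of this definition, take two three-element vectors chosen purely for illustration. Corresponding elements are multiplied together and the results are added up:

\begin{displaymath}
(1, 2, 3) \cdot (4, 5, 6) = 1 \times 4 + 2 \times 5 + 3 \times 6 = 32
\end{displaymath}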

Note: We have only considered the features of arrays that are common to most other programming languages. However, Python's arrays are extremely powerful and can do some things that would have to be done ``manually'' (perhaps using for loops) in other languages. If you find you are using arrays in the problem then it is worth taking a look at Section 4.2, ``Arrays''. You will also note that there is a function in the Numeric library that will calculate the dot product of two arrays for you! We want you to do it the hard way though.