Wednesday, December 5, 2012

Code for the Newton's method problem

As promised, here is my code for Newton's method. Note that I am using Numpy but not pandas. A friend has lent me this book so I am currently working on learning what pandas can do for me.

The first thing you have to do is import numpy. I import as np so all of my numpy commands begin with np. For some reason, the linear algebra module does not load unless I specifically call it with an import command. I need this to invert the Hessian matrix.
import numpy as np
import numpy.linalg
from math import *


Here I read in the data. The size of x along axis=0 (rows) gives the number of training examples. In order to get the intercept term, I have to put a column of ones in front of the x data. In order to get the column of ones where I wanted it, I had to use the insert command. Then I checked the number of columns.
x=np.loadtxt('q1x.dat')
m=np.size(x,0) #number of training examples
x=np.insert(x,[0],[1],axis=1)
n=np.size(x,1) #number of columns


This piece of code sets up a vector of zeros for theta to start the iterations. I do the same thing in the code below for the Hessian matrix and gradient vector. Recall that everything in Python starts at zero so I have an n x 1 vector for the gradient instead of n+1 x 1 like in Matlab.
theta=np.zeros(n)

y=np.loadtxt('q1y.dat')

 Note the reshape command for theta when I am calculating h. For some reason, I couldn't get theta to transpose.
# set up for 10 iterations
for i in range(0,10):
    H=np.zeros((n,n))
    grad=np.zeros(n)
    print "Theta just before h is ", theta
    h=1/(1+e**(np.dot(x,theta.reshape(n,1))))
    ts=np.dot(x,theta.reshape(n,1))

This part of the code is pretty self explanatory and looks very much like the matlab code from the answer sheet. However, note the outer command in the calculation for H. Let's examine this a little more closely.

Here's an example. In this example, a is an example of the lprime vector with one variable (ignore the intercept) and 4 training examples. b is the same vector. Notice they are both 4x1, so you cannot use matrix multiplication.
 a=np.array([[1],[2],[3],[4]])
>>> print a
[[1]
 [2]
 [3]
 [4]]
>>> b=np.array([[1],[2],[3],[4]])
>>> print b
[[1]
 [2]
 [3]
 [4]]
>>> c=np.dot(a,b)

Traceback (most recent call last):
  File "<pyshell#23>", line 1, in <module>
    c=np.dot(a,b)
ValueError: objects are not aligned




However, look at what happens when you use the outer command:
>>> d=np.outer(a,b)
>>> print d
[[ 1  2  3  4]
 [ 2  4  6  8]
 [ 3  6  9 12]
 [ 4  8 12 16]]

You get a 4x4 matrix. Where did this come from? The outer product gives you
out[i, j] = a[i] * b[j]
This is what you need for Hessian matrix. It took me a really, really long time to find this. You're welcome.
So here is the rest of the code:
# This calculates the values for lprime vector which will be summed below
    for j in range(0,m):
     
        grad=grad+(x[j]*(y[j]-h[j]))

        H=H+h[j]*(1-h[j])*np.outer(x[j],x[j])
   
    Hinv=np.linalg.inv(H)
    theta=theta-np.dot(Hinv,grad)


The next post will show the code for the plot.


No comments:

Post a Comment