Those of you following along know that I have been using Python instead of Matlab or Octave for the Machine Learning course.
It turns out that Python is a base for interactive computing. It's like buying the base model of a car: it comes with the standard equipment and not much else. It will get you from Point A to Point B. But if you want to do something fancy, you have to add on.
For interactive data analysis, there are modules that you can import into Python to make scientific computing easier. You can add NumPy (Numerical Python), which gives you access to arrays. You can add pandas, which gives you access to dataframes. Dataframes allow you to treat data as if it were in a spreadsheet. This makes it much easier to summarize the data. I'll do a separate post on dataframes later.
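To give a rough idea of what these two add-ons look like in practice, here is a minimal sketch. The scores and names are made up purely for illustration, and it assumes NumPy and pandas are already installed.

```python
import numpy as np
import pandas as pd

# Made-up exam scores, purely for illustration: one row per student.
scores = np.array([[90, 85],
                   [70, 95],
                   [80, 75]])

# A NumPy array lets you do arithmetic on whole columns at once.
column_means = scores.mean(axis=0)
print(column_means)

# A pandas dataframe adds row and column labels, like a spreadsheet.
df = pd.DataFrame(scores,
                  columns=["midterm", "final"],
                  index=["Ann", "Bob", "Cy"])

# describe() summarizes every column (count, mean, min, max, ...).
print(df.describe())

midterm_mean = df["midterm"].mean()
```

The array is just a grid of numbers, while the dataframe lets me ask for a column by name, which is what makes summarizing feel like working in a spreadsheet.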
With each new module that you add in, there are new data structures and commands to learn. This makes it incredibly frustrating for a newbie like me.
So when a friend loaned me Wes McKinney's Python for Data Analysis book, I was thrilled. I figured I could just follow along and learn everything I need to know. Of course, life is never that easy, as I found out when I got to Chapter 3. In Chapter 3, Mr. McKinney starts using IPython. In order to keep using the book, I had to install it on my computer, which runs Windows Vista. It turns out this is a big problem because all of the instructions I found for installing IPython are written assuming you are using a Linux-based system.
I have finally gotten IPython working on my computer, but it took a lot of research and finagling to do it. In order to help you, I'll try to walk you through the steps.
The completely unhelpful documentation for installation can be found at ipython.org. Click on the link and read the documentation. The only thing I understood when I read it was that I needed Python version 2.6 or higher already installed on my computer. I had already installed Python 2.7 so that I could use NumPy, Matplotlib and pandas. But what are easy_install and pip? The documentation doesn't explain, and there is no further information when you click on PyPI.
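For what it's worth, easy_install and pip are both command-line tools for installing Python packages. Before installing anything, it can help to see what you already have. Here is a small sketch that checks the Python version against the 2.6 requirement and tests whether a few modules can be imported. The helper name is_installed is my own invention, not something from IPython.

```python
import importlib
import sys


def is_installed(module_name):
    """Return True if the named module can be imported (hypothetical helper)."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


# The version string looks like "2.7.3 (default, ...)"; keep only the number.
print("Python version: %s" % sys.version.split()[0])
print("Meets the 2.6+ requirement: %s" % (sys.version_info >= (2, 6)))

# Check the modules mentioned in this post.
for name in ["IPython", "numpy", "matplotlib", "pandas"]:
    print("%s installed: %s" % (name, is_installed(name)))
```

If IPython shows up as not installed, that is the piece the installation instructions are supposed to fix.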
I did find a blog (this is usually the best source for a newbie) that explains it all. Click on this link to get the instructions. Now that you have done all that, you are ready to use IPython and the interactive notebook. You can see a picture of it here.
Here's how I start up the notebook. It's not perfect, but it gets me where I want to go.
Click on the Windows icon circle.
Type cmd in the search box.
The window with the command prompt will open.
You must change the directory. Type cd c:\Python27\scripts
When the command line prompt comes back, type ipython notebook --pylab=inline
This opens up the notebook and allows you to get plots in the notebook and not a separate window. The only problem I have is that it opens the notebook in Internet Explorer, where it really doesn't work. I just copy the address from the address bar into Firefox and it works for me.
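If typing those commands every time gets old, the same steps can be sketched as a short Python script. This is only a sketch under my own assumptions: that the IPython launcher sits in c:\Python27\scripts as on my machine, and the function names build_notebook_command and start_notebook are made up for this example.

```python
import os
import subprocess

# Assumption: the same Scripts folder I cd into at the command prompt.
SCRIPTS_DIR = r"c:\Python27\scripts"


def build_notebook_command(scripts_dir=SCRIPTS_DIR):
    """Build the command I type at the prompt, as a list of arguments."""
    launcher = os.path.join(scripts_dir, "ipython")
    return [launcher, "notebook", "--pylab=inline"]


def start_notebook(scripts_dir=SCRIPTS_DIR):
    """Start the notebook server with the Scripts folder as the working directory."""
    return subprocess.Popen(build_notebook_command(scripts_dir), cwd=scripts_dir)


# Show the command without running it; call start_notebook() to actually launch.
print(" ".join(build_notebook_command()))
```

Calling start_notebook() should behave like the cd and ipython steps above, so the browser issue is the same: I would still copy the address into Firefox.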
I have just finished all the lectures for Support Vector Machines, so I will be working on the next problem set.