As part of a machine learning course I’ve been taking on Coursera, I had to get some packages installed.
Since I couldn’t find a single webpage covering all the instructions, I had to go back and forth between multiple webpages. And then, after I’d installed the whole thing, it took me a while to figure out how to run it.
And so, in this single post, I try to explain everything to you.
First up, I had to install the following packages:
- IPython Notebook
- GraphLab Create
GraphLab Create is not free software, but they provide a 1-year, renewable license for educational purposes. You first have to go to their webpage and register.
First up, go to the official instructions page and follow the instructions!
There are two options for installation:
- Installation into Anaconda Python Environment (recommended)
- Installation in Python environment using virtualenv
After following the official recommended path, you will have:
- Installed Anaconda, pip, GraphLab Create, and IPython Notebook.
- Created a new Conda environment called gl-env.
In case you’re wondering (like I did), rest assured that the Anaconda installation will not clash with your existing Python installation (that ships with most Linux distributions).
On their website there is an option to upgrade to a version that uses GPU acceleration. I haven’t tried that myself, but feel free to try it if you have a compatible GPU.
Starting IPython Notebook to use GraphLab
The proper procedure for firing up the whole thing (in Linux) is:
- Open the terminal.
- cd to the directory where your IPython Notebooks are. Strictly speaking, this step is optional, but it is what you want to do in most cases.
- Activate the gl-env Conda environment which you created earlier (see below for a brief intro to Conda).
  $ source activate gl-env
- Start your IPython Notebook.
  $ ipython notebook
And there you go! You’re all set!
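If you'd rather not retype the steps every time, they can be chained into a single shell line. The sketch below only builds that line as a string; it does not run anything. The function name `build_launch_command` and the example path are made up for illustration, and `shlex.quote` needs Python 3.

```python
import shlex

def build_launch_command(notebook_dir, env_name="gl-env"):
    """Build one shell line that performs the cd, activate and start steps."""
    parts = [
        # cd to the notebook directory (quoted in case it contains spaces)
        "cd {}".format(shlex.quote(notebook_dir)),
        # activate the Conda environment that has graphlab installed
        "source activate {}".format(shlex.quote(env_name)),
        # start IPython Notebook from inside that environment
        "ipython notebook",
    ]
    return " && ".join(parts)

# Hypothetical notebook directory, used only as an example.
print(build_launch_command("/home/me/notebooks"))
```

Paste the printed line into your terminal, or save it as a shell alias.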
Step 3 above (activating the environment) is where everybody gets it wrong; they simply skip it! IPython Notebook will start up fine without it, but Python will choke when you try to import the graphlab package:
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-1-4b66ad388e97> in <module>()
----> 1 import graphlab
ImportError: No module named graphlab
This is because, if you’ve followed the official instructions, only the gl-env environment has the graphlab package installed.
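If you're not sure which Python your notebook is actually running, a quick check like the one below can help. It is only a sketch: the helper name `module_available` is my own, and it simply tries the import and reports whether it worked, along with the path of the running interpreter.

```python
import importlib
import sys

def module_available(name):
    """Return True if `name` can be imported by the running interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True

# The interpreter path shows which Python installation is in use.
print("Interpreter: {}".format(sys.executable))
print("graphlab importable: {}".format(module_available("graphlab")))
```

Run it in a notebook cell. If the second line says False, the notebook was most likely started outside the environment where graphlab is installed.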
Brief Introduction to Conda
Conda, in simple terms, is a tool that allows you to have multiple installations of Python on your computer at the same time without them interfering with each other; i.e., you can create different “environments” of Python, each with different packages.
Depending on your needs, you can set up the different “sandboxed” environments with different packages installed in them; even different versions of Python itself! And you can easily switch between the environments. A prime advantage of working this way is that you don’t have to touch the native Python installation on your OS (if it has one).
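To check from inside Python which environment is currently active, you can look at the environment variables. The sketch below assumes that Conda records the active environment's name in `CONDA_DEFAULT_ENV` when you activate it; as far as I know that is the case, but older or unusual setups may differ. The helper name `active_conda_env` is my own.

```python
import os
import sys

def active_conda_env(environ=None):
    """Return the active Conda environment name, or None if none is recorded."""
    if environ is None:
        environ = os.environ
    # Assumption: Conda sets CONDA_DEFAULT_ENV on activation.
    return environ.get("CONDA_DEFAULT_ENV") or None

print("Active Conda environment: {}".format(active_conda_env()))
# sys.prefix is the root directory of the Python installation in use.
print("Python prefix: {}".format(sys.prefix))
```

After `source activate gl-env`, the first line should report gl-env if the assumption holds.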
To learn more about using Conda, check out the official documentation.
Trust me, Conda makes your life so much easier. (You might also want to check out virtualenv.)
Hope you’ve found this post helpful.
Useful links: