This directory holds the notes and code used in the presentation
of lab 2 for "Machine Learning and Computational Statistics" Spring
The code is in Python (2.x),
though it is certainly possible to use a different language to do
any of these tasks.
Some package installation will be required before running the code.
- NumPy (for scientific computing)
- Scikit-learn (for
machine-learning-specific functionality)
- Matplotlib (for plotting)
- cvxopt (for optimization --
only required for the QP solver)
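A quick way to confirm the packages above are installed is to try importing each one. The sketch below is only an illustration and is not part of the lab code; the helper name check_packages is hypothetical. Note that scikit-learn is imported under the name sklearn.

```python
from __future__ import print_function  # keeps print() working on Python 2.x too

import importlib

# Import names for the four packages listed above.
# Note: scikit-learn is imported as "sklearn".
REQUIRED = ["numpy", "sklearn", "matplotlib", "cvxopt"]


def check_packages(names=REQUIRED):
    """Map each import name to its version string, or None if it cannot be imported."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            found[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            found[name] = None
    return found


status = check_packages()
for name in REQUIRED:
    print("%-12s %s" % (name, status[name] or "NOT INSTALLED"))
```

Any package reported as NOT INSTALLED needs to be installed before the corresponding script will run; cvxopt is only needed for the QP solver.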
For Mac users, I find the following installation tutorial
helpful (if you have trouble, check the comments at the bottom of
that page to see if others have found a solution).
- Here are notes on kernels and the kernel trick that I meant to present in the lab session.
- Download the code as a zip file here.
- create_examples.py is the code used to create the synthetic
datasets used in the presentation. Run this first; it should create
4 example files in your current directory.
- svm_gui.py is a slightly modified version of a GUI available
from scikit-learn [original
- The Wisconsin breast cancer dataset is available from the UCI Machine Learning Repository.
- overfitting.py trains an SVM classifier on the breast cancer
dataset while varying the value of C.
- generic_qp.py implements an SVM solver using an off-the-shelf
QP solver. (Note: I was asked a lot of
questions about what happens when some property of the SVM is
changed. If you can formulate your change as a change to the
optimization problem, you can experiment with it here!)
- More info on QP solving in cvxopt can be found here. MATLAB has a very similar interface.
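To make the QP formulation concrete, here is a minimal sketch of how the soft-margin SVM dual can be written in the standard form that cvxopt's solvers.qp accepts (minimize 0.5 * a'Pa + q'a subject to Ga <= h and Aa = b). This is not the contents of generic_qp.py: the function names build_svm_dual_qp and solve_svm_dual, the toy data, and the choice of a linear kernel are all assumptions made for illustration.

```python
from __future__ import print_function

import numpy as np


def build_svm_dual_qp(X, y, C):
    """Build (P, q, G, h, A, b) for the soft-margin SVM dual with a linear kernel.

    Standard QP form: minimize 0.5 * a'Pa + q'a  subject to  Ga <= h,  Aa = b.
    """
    n = X.shape[0]
    y = y.astype(float)
    K = X.dot(X.T)                           # linear kernel (Gram) matrix
    P = np.outer(y, y) * K                   # P_ij = y_i * y_j * K_ij
    q = -np.ones(n)                          # maximizing sum(a) == minimizing -sum(a)
    G = np.vstack([-np.eye(n), np.eye(n)])   # encodes -a <= 0 and a <= C
    h = np.hstack([np.zeros(n), C * np.ones(n)])
    A = y.reshape(1, n)                      # equality constraint sum_i a_i * y_i = 0
    b = np.zeros(1)
    return P, q, G, h, A, b


def solve_svm_dual(X, y, C=1.0):
    """Solve the dual with cvxopt and return (alphas, w, bias)."""
    # Imported here so that the matrix builder above works without cvxopt installed.
    from cvxopt import matrix, solvers

    solvers.options["show_progress"] = False
    P, q, G, h, A, b = build_svm_dual_qp(X, y, C)
    sol = solvers.qp(matrix(P), matrix(q), matrix(G), matrix(h), matrix(A), matrix(b))
    alphas = np.array(sol["x"]).ravel()
    w = (alphas * y).dot(X)                  # w = sum_i a_i * y_i * x_i
    # Average the bias over margin support vectors (0 < a_i < C).
    # Assumption: if none are found within tolerance, fall back to 0.0.
    tol = 1e-6
    on_margin = (alphas > tol) & (alphas < C - tol)
    bias = np.mean(y[on_margin] - X[on_margin].dot(w)) if on_margin.any() else 0.0
    return alphas, w, bias


# Tiny hypothetical dataset: two linearly separable points per class.
X_toy = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y_toy = np.array([1, 1, -1, -1])
P, q, G, h, A, b = build_svm_dual_qp(X_toy, y_toy, C=1.0)
print("P shape:", P.shape, " G shape:", G.shape, " A shape:", A.shape)
```

With cvxopt installed, solve_svm_dual(X_toy, y_toy, C=1.0) runs the solver on these matrices. Keeping the matrix construction separate from the solver call is what makes experimenting easy: a different kernel only changes K (and how predictions are made), while something like a per-example penalty only changes h.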