Dear Colleagues,
First, I should say that I'm no expert on cointegration.
My own research in time series has to do with
filtering for cointegration (using correlations of returns)
rather than with cointegration itself.
Why? I didn't see how to improve the performance
of the cointegration algorithms.

That said, here is a nice explanation of cointegration:
http://www.northinfo.com/documents/174.pdf

The basic idea is that you take two or more time series that individually
can wander all over the place (like a drunken guy on a random walk)
but that together, under some weighting, are at least weakly stationary
(the mean and standard deviation are constant, i.e. do not change over
time, and the covariance between two points depends only on the gap
between them).
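
To make this concrete, here is a minimal sketch in Python (not q, just
for illustration). Everything in it is an assumption for the example: the
data is synthetic, the function names make_cointegrated_pair and
half_stats are made up, and the weights (2, -1) are known in advance
because I built the data that way. It compares the mean and standard
deviation of the first and second halves of each series.

```python
import random
import statistics

def make_cointegrated_pair(n, seed=0):
    """Two synthetic series that share one random-walk trend plus independent noise."""
    rng = random.Random(seed)
    trend = 0.0
    x, y = [], []
    for _ in range(n):
        trend += rng.gauss(0, 1)                    # the common "drunken" walk
        x.append(trend + rng.gauss(0, 0.5))         # follows the trend
        y.append(2.0 * trend + rng.gauss(0, 0.5))   # follows twice the trend
    return x, y

def half_stats(series):
    """(mean, stdev) of the first half and of the second half of a series."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return ((statistics.mean(first), statistics.stdev(first)),
            (statistics.mean(second), statistics.stdev(second)))

x, y = make_cointegrated_pair(2000)
# Weights (2, -1) cancel the shared trend, leaving only the noise terms.
spread = [2.0 * a - b for a, b in zip(x, y)]

print("x halves:     ", half_stats(x))
print("spread halves:", half_stats(spread))
```

The halves of x will typically look quite different from each other,
while the halves of the weighted combination should look about the same.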

To get this to happen, we "weight" each of the individual time series,
e.g. 0.2 * IBM stock + 0.7 * HP stock + 0.1 * Facebook stock, etc.
(weights may also be negative).

You can start from a weighting like this and then use some kind of
exploration method (e.g. gradient descent or just a combinatorial
exploration) to find reasonably good weightings to achieve (near)
stationarity.
http://etclab.mie.utoronto.ca/people/moman/Stationarity/stationarity.html
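
As a rough illustration of the "combinatorial exploration" option, here
is a small Python sketch for two series. It rests on assumptions of mine:
the weight on y is fixed at 1 (so the all-zero weighting is ruled out),
the candidate weights come from a fixed grid, and the standard deviation
of the combination stands in as a crude proxy for "how non-stationary".
The names combo and grid_search_weight are made up for the example. A
real stationarity test (for instance adfuller or coint in statsmodels)
should still be run on whatever weighting the search returns.

```python
import random
import statistics

def combo(x, y, w):
    """The weighted combination y - w * x, point by point."""
    return [b - w * a for a, b in zip(x, y)]

def grid_search_weight(x, y, candidates):
    """Return the candidate w whose combination y - w*x has the smallest spread."""
    return min(candidates, key=lambda w: statistics.pstdev(combo(x, y, w)))

# Synthetic data: x is a random walk and y is roughly 2 * x plus noise.
rng = random.Random(1)
x, y, level = [], [], 0.0
for _ in range(1000):
    level += rng.gauss(0, 1)
    x.append(level)
    y.append(2.0 * level + rng.gauss(0, 0.5))

candidates = [i / 10 for i in range(0, 41)]   # 0.0, 0.1, ..., 4.0
best_w = grid_search_weight(x, y, candidates)
print("best weight on x:", best_w)
```

With more than two series the grid grows quickly, which is where
gradient descent (or some other smarter search) starts to pay off.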

If two time series are cointegrated and then start to violate cointegration
(their weighted combination drifts away from its usual mean),
then there might be a reversion to the mean.
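
One simple way to notice such a violation, sketched in Python under my
own assumptions: take the weighted combination (called the spread here),
compare each new value with the mean and standard deviation of a
trailing window, and flag points that are more than some threshold of
standard deviations away. The window of 20, the threshold of 2.0 and the
function names are arbitrary choices for the example, not recommended
values.

```python
import statistics

def zscores(spread, window):
    """(index, z-score) of each point measured against the preceding `window` points."""
    out = []
    for i in range(window, len(spread)):
        past = spread[i - window:i]
        mu = statistics.mean(past)
        sd = statistics.stdev(past)
        z = (spread[i] - mu) / sd if sd > 0 else 0.0
        out.append((i, z))
    return out

def divergences(spread, window=20, threshold=2.0):
    """Indices where the spread is unusually far from its recent mean."""
    return [i for i, z in zscores(spread, window) if abs(z) > threshold]

# Toy spread: small oscillation around zero with one large excursion at index 30.
spread = [0.1, -0.1] * 15 + [1.5] + [0.1, -0.1] * 5
print(divergences(spread))
```

A flagged point is only a candidate: the spread may revert, or the
cointegration relationship may have broken for good.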

Having said all this, I would be happy (since this is primarily a q
programming class) if you just did correlation on returns using
sketches.
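
In case it helps, here is a minimal sketch of that idea in Python (the
steps should carry over to q). I am assuming that "sketches" means
random-projection sketches: normalise each return series to zero mean
and unit length, so that the correlation is just a dot product, then
project onto k random +1/-1 vectors and take the dot product of the
short sketches instead. The price data is synthetic, and the function
names, k = 256 and the seeds are all choices made for the example.

```python
import math
import random

def returns(prices):
    """Simple returns (p[t] - p[t-1]) / p[t-1]."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]

def normalise(v):
    """Shift to zero mean and scale to unit length."""
    mu = sum(v) / len(v)
    centred = [a - mu for a in v]
    norm = math.sqrt(sum(a * a for a in centred))
    return [a / norm for a in centred]

def exact_corr(u, v):
    """Pearson correlation as the dot product of the normalised vectors."""
    return sum(a * b for a, b in zip(normalise(u), normalise(v)))

def make_projection(n, k, seed=0):
    """k random vectors of length n with entries +1 or -1."""
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) for _ in range(n)] for _ in range(k)]

def sketch(v, projection):
    """Project the normalised vector onto each random vector, scaled by 1/sqrt(k)."""
    z = normalise(v)
    scale = 1.0 / math.sqrt(len(projection))
    return [scale * sum(r * a for r, a in zip(row, z)) for row in projection]

def sketch_corr(s1, s2):
    """Approximate correlation: dot product of two sketches."""
    return sum(a * b for a, b in zip(s1, s2))

# Synthetic prices for two stocks whose returns share a common shock.
rng = random.Random(42)
p1, p2 = [100.0], [50.0]
for _ in range(1000):
    common = rng.gauss(0, 0.01)
    p1.append(p1[-1] * (1 + common + rng.gauss(0, 0.005)))
    p2.append(p2[-1] * (1 + common + rng.gauss(0, 0.005)))

r1, r2 = returns(p1), returns(p2)
projection = make_projection(len(r1), k=256)   # same projection for every series
s1, s2 = sketch(r1, projection), sketch(r2, projection)
print("exact :", exact_corr(r1, r2))
print("sketch:", sketch_corr(s1, s2))
```

The sketch value is only an estimate, and a larger k tightens it. The
payoff comes with many series: each one is sketched once, and every pair
is then compared in k steps rather than in the full series length.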

Best,
Dennis

