StatStream: Frequently Asked Questions
- What is StatStream?
- Who should use StatStream?
- What functions does StatStream provide?
- What do I need in order to run StatStream?
- Why is StatStream better than the intuitive methods?
- What is the performance gain compared with intuitive methods?
- What can I do as a developer?
- My question isn't answered here. Whom should I ask?
- What is StatStream?
StatStream is a high-performance statistical tool that reports
correlation information over streaming time series in real time.
- Who should use StatStream?
People who require real-time correlation information over many time series.
- What functions does StatStream provide?
StatStream takes streaming data from multiple sources
and outputs the stream pairs whose correlation exceeds a user-set threshold.
Correlation information is reported periodically, at an interval that is also
a user parameter. The correlation is calculated over a sliding window whose size
is a third user parameter.
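To make the three parameters concrete, here is a minimal Python sketch of the intuitive approach: every reporting period, compute the Pearson correlation of every pair over the last window of points and report the pairs above the threshold. It is an illustration only, not StatStream's K code; the names `pearson`, `correlated_pairs`, `window`, `period`, and `threshold` are hypothetical, and the sample streams are made up.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def correlated_pairs(streams, window, period, threshold):
    """Every `period` ticks, yield (tick, name_a, name_b, corr) for each pair whose
    correlation over the most recent `window` points exceeds `threshold`.
    Hypothetical helper: it checks all pairs, which is the brute-force baseline."""
    length = min(len(values) for values in streams.values())
    for t in range(window, length + 1, period):
        for a, b in combinations(sorted(streams), 2):
            c = pearson(streams[a][t - window:t], streams[b][t - window:t])
            if c > threshold:
                yield t, a, b, c

# Made-up sample data: s2 is a scaled copy of s1, s3 follows an unrelated pattern.
streams = {
    "s1": [1, 2, 3, 4, 5, 6, 7, 8],
    "s2": [2, 4, 6, 8, 10, 12, 14, 16],
    "s3": [5, 1, 4, 2, 5, 1, 4, 2],
}
reports = list(correlated_pairs(streams, window=4, period=2, threshold=0.9))
for t, a, b, c in reports:
    print(f"tick {t}: {a} ~ {b} (corr {c:.3f})")
```

The cost of this baseline grows with the square of the number of streams at every reporting tick, which is the work that data reduction and filtering aim to avoid.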
- What do I need in order to run StatStream?
StatStream is written in K, so it runs on Windows, Unix, and Linux.
Running StatStream requires both K and KDB. Users can download it from here.
- Why is StatStream better than the intuitive methods?
We use advanced data reduction techniques (e.g. DFT, SVD, Random Projection)
to reduce the dimensionality of the data vectors, and sophisticated data
structures to purge the irrelevant data.
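As an illustration of how dimensionality reduction can act as a filter, the Python sketch below uses a standard property of the DFT. If each window is normalized to zero mean and unit length, the correlation of two windows equals 1 minus half their squared Euclidean distance, and by Parseval's theorem a distance computed from only a few orthonormal DFT coefficients never exceeds the true distance. The few coefficients therefore give an upper bound on the correlation, so pairs whose bound falls below the threshold can be discarded safely. This is a sketch under those assumptions, not a transcription of the K implementation; all function names are hypothetical.

```python
import cmath
import math

def normalize(x):
    """Shift to zero mean and scale to unit length (assumes x is not constant)."""
    n = len(x)
    mean = sum(x) / n
    norm = math.sqrt(sum((v - mean) ** 2 for v in x))
    return [(v - mean) / norm for v in x]

def dft_coeffs(x, k):
    """Orthonormal DFT coefficients 1..k of x. Coefficient 0 is skipped
    because it is zero once the mean has been removed."""
    n = len(x)
    return [
        sum(v * cmath.exp(-2j * math.pi * f * t / n) for t, v in enumerate(x)) / math.sqrt(n)
        for f in range(1, k + 1)
    ]

def corr_upper_bound(cx, cy):
    """Upper bound on the correlation from k < n/2 coefficients of each series.
    For real input, coefficient n-f mirrors coefficient f, so each term counts twice."""
    partial_d2 = 2 * sum(abs(a - b) ** 2 for a, b in zip(cx, cy))
    return 1 - partial_d2 / 2

def exact_corr(xn, yn):
    """Correlation of two already-normalized windows is their dot product."""
    return sum(a * b for a, b in zip(xn, yn))

# Made-up windows of length 16: y is a phase-shifted copy of x, z is orthogonal to x.
n, k, threshold = 16, 2, 0.9
x = normalize([math.sin(2 * math.pi * t / n) for t in range(n)])
y = normalize([math.sin(2 * math.pi * t / n + 0.3) for t in range(n)])
z = normalize([math.cos(2 * math.pi * t / n) for t in range(n)])

bound_xy = corr_upper_bound(dft_coeffs(x, k), dft_coeffs(y, k))
bound_xz = corr_upper_bound(dft_coeffs(x, k), dft_coeffs(z, k))
keep_xy = bound_xy >= threshold   # candidate pair, still needs an exact check
keep_xz = bound_xz >= threshold   # pruned if False, with no risk of a false dismissal
```

Only the pairs that survive the cheap bound need the full correlation computed, which is where a filter of this kind saves work.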
- What is the performance gain compared with intuitive methods?
Due to its efficient filtering power, StatStream can save substantial
computational effort compared to the intuitive pairwise correlation.
The following empirical results show StatStream's performance over
a variety of datasets. The minimum recall for the approximation method is 99%.
- What can I do as a developer?
The software package includes two files, base.k and sketch.k, which cover tools such as DFT, Wavelet, and Sketch.
A developer can easily integrate this data reduction code into his or her own
system. The StatStream algorithms are implemented in statstream2.k; both the DFT-based version
and the sketch-based version can be found by tracing the code in that file. All the code is fully documented and thus easy
to follow. Questions and feedback are welcome.
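For developers unfamiliar with sketches, the following Python sketch shows the general random-projection idea: each window is multiplied by a fixed set of random +1/-1 vectors, and distances between the resulting short vectors approximate distances between the original windows. It is a language-neutral illustration under that assumption and does not reproduce the contents of sketch.k; `make_projection`, `sketch`, and `sq_dist` are hypothetical names, and the data is randomly generated.

```python
import math
import random

def make_projection(n, k, seed=0):
    """k random vectors of length n with entries +1 or -1 (fixed seed for repeatability)."""
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) for _ in range(n)] for _ in range(k)]

def sketch(x, proj):
    """Project x onto every random vector; scaling by 1/sqrt(k) makes squared
    distances between sketches estimate squared distances between originals."""
    scale = 1.0 / math.sqrt(len(proj))
    return [scale * sum(r * v for r, v in zip(row, x)) for row in proj]

def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

# Made-up data: y is a noisy copy of x, both of length n, reduced to k numbers each.
n, k = 1024, 256
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(n)]
y = [v + rng.gauss(0, 0.5) for v in x]

proj = make_projection(n, k)
true_d2 = sq_dist(x, y)
est_d2 = sq_dist(sketch(x, proj), sketch(y, proj))
print(f"true squared distance {true_d2:.1f}, sketch estimate {est_d2:.1f}")
```

The estimate is approximate rather than a strict bound, which is consistent with a sketch-based method being evaluated by recall.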
- My question isn't answered here. Whom should I ask?
You can send email to shasha@cs.nyu.edu or xiaojian@cs.nyu.edu to contact the authors
of the software.
Maintained by shasha@cs.nyu.edu
Last Updated Nov. 29, 2005