StatStream: Frequently Asked Questions

  1. What is StatStream?
  2. Who should use StatStream?
  3. What functions does StatStream provide?
  4. What do I need in order to run StatStream?
  5. Why is StatStream better than the intuitive methods?
  6. What is the performance gain compared with intuitive methods?
  7. What can I do as a developer?
  8. My question isn't answered here. Whom should I ask?

  1. What is StatStream?

    StatStream is a high-performance statistical tool that reports correlation information in real time over streaming time series.

  2. Who should use StatStream?

    People who need real-time correlation information over many time series.

  3. What functions does StatStream provide?

    StatStream takes streaming data from multiple sources and outputs the pairs of streams whose correlation exceeds a threshold set by the user. Correlation information is reported periodically, at an interval that is also a user parameter. The correlation is calculated over a window whose size is a third user parameter.
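
    A direct, brute-force way to produce this kind of output can be sketched in Python as follows. This is only an illustration of the three parameters (threshold, reporting period, window size), not StatStream's K implementation; the function names, the tick format (one dict of stream name to value per time step), and the choice of a sliding window over the most recent values are assumptions made for the example.

```python
from collections import deque
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlated_pairs(ticks, window, period, threshold):
    """Yield (time, name_a, name_b, corr) for every pair above the threshold.

    ticks:     iterable of dicts mapping stream name -> newest value (hypothetical format)
    window:    number of most recent values used for each correlation
    period:    report once every `period` ticks, once a full window is available
    threshold: a pair is reported only if its correlation exceeds this value
    """
    buffers = {}
    for t, tick in enumerate(ticks, start=1):
        for name, value in tick.items():
            # deque(maxlen=window) silently drops the oldest value when full
            buffers.setdefault(name, deque(maxlen=window)).append(value)
        if t < window or t % period != 0:
            continue
        ready = sorted(name for name, buf in buffers.items() if len(buf) == window)
        # Brute force: every pair of streams is compared at every report
        for a, b in combinations(ready, 2):
            corr = pearson(buffers[a], buffers[b])
            if corr > threshold:
                yield t, a, b, corr
```

    With n streams this computes n(n-1)/2 correlations per report, which is the cost that a filtering approach aims to avoid.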

  4. What do I need in order to run StatStream?

    StatStream is written in K, so it runs on Windows, Unix, and Linux.
    To run StatStream, you need K and KDB. Users can download them from here.

  5. Why is StatStream better than the intuitive methods?

    We use advanced data reduction techniques (e.g., DFT, SVD, random projection) to reduce the dimensionality of the data vectors, and sophisticated data structures to purge irrelevant data.
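
    To illustrate why a reduction such as the DFT can filter pairs safely, consider two windows x and y that have been normalized to zero mean and unit length. Their correlation is then 1 - d(x, y)^2 / 2, where d is the Euclidean distance. An orthonormal DFT preserves d, so a distance computed from only the first few coefficients can never exceed the true distance, and the correlation estimated from it can never fall below the true correlation. A pair whose estimate is already under the threshold can therefore be discarded without computing the exact value. The Python sketch below shows this bound under those assumptions; the function names are hypothetical and the code is not taken from the StatStream package.

```python
import cmath
from math import sqrt


def normalize(xs):
    """Shift a window to zero mean and scale it to unit Euclidean norm."""
    n = len(xs)
    mean = sum(xs) / n
    centered = [x - mean for x in xs]
    norm = sqrt(sum(c * c for c in centered))
    if norm == 0:
        raise ValueError("a constant window has no defined correlation")
    return [c / norm for c in centered]


def dft_coefficients(xs, k):
    """Coefficients 1..k of the orthonormal DFT (coefficient 0 is zero after normalizing)."""
    n = len(xs)
    if not 0 < k < n / 2:
        raise ValueError("k must satisfy 0 < k < n/2")
    return [
        sum(x * cmath.exp(-2j * cmath.pi * f * t / n) for t, x in enumerate(xs)) / sqrt(n)
        for f in range(1, k + 1)
    ]


def correlation_upper_bound(coeffs_a, coeffs_b):
    """Upper bound on the correlation of two normalized windows.

    For real-valued data, coefficient n-f is the complex conjugate of
    coefficient f, so each kept coefficient contributes twice to the
    squared distance.
    """
    partial_sq_dist = 2 * sum(abs(a - b) ** 2 for a, b in zip(coeffs_a, coeffs_b))
    return 1 - partial_sq_dist / 2
```

    In use, each window would be reduced once to its k coefficients, and only the pairs whose bound reaches the threshold would go on to an exact correlation check.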

  6. What is the performance gain compared with intuitive methods?

    Thanks to its efficient filtering, StatStream saves substantial computational effort compared with the intuitive approach of computing every pairwise correlation.
    The following empirical results show StatStream's performance over a variety of datasets. The minimum recall for the approximation method is 99%.




  7. What can I do as a developer?

    The software package includes two files, base.k and sketch.k, which cover tools such as DFT, wavelets, and sketches. Developers can easily integrate this data reduction code into their own systems. The StatStream algorithms are implemented in statstream2.k; both the DFT-based and the sketch-based versions can be found by tracing the code in that file. All the code is fully documented and thus easy to follow. Questions and feedback are welcome.
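
    For developers unfamiliar with the sketch idea, it is, in general terms, a random projection: each window is replaced by its inner products with a fixed set of random +1/-1 vectors, and distances between these short sketches approximate distances between the original windows (the squared sketch distance equals the squared true distance in expectation). The Python fragment below is a minimal sketch of that general technique only. It is not a transcription of sketch.k, and all names and parameters in it are hypothetical.

```python
import random
from math import sqrt


def random_sign_vectors(length, sketch_size, seed=0):
    """Build `sketch_size` random vectors of +1/-1 entries, shared by all streams."""
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) for _ in range(length)] for _ in range(sketch_size)]


def sketch(xs, sign_vectors):
    """Project a window onto each random vector, scaled by 1/sqrt(sketch size)."""
    scale = sqrt(len(sign_vectors))
    return [sum(x * r for x, r in zip(xs, row)) / scale for row in sign_vectors]


def distance(us, vs):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((u - v) ** 2 for u, v in zip(us, vs)))
```

    Unlike the DFT bound, a sketch distance is an estimate that can err in either direction, so a system built on it has to choose the sketch size with its accuracy requirement in mind.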

  8. My question isn't answered here. Whom should I ask?

    You can email the authors of the software at shasha@cs.nyu.edu or xiaojian@cs.nyu.edu.

Maintained by shasha@cs.nyu.edu

Last Updated Nov. 29, 2005