Professor Dennis Shasha,
Courant Institute, New York University.
Course Description
This five-day, 30-hour, two-part course is designed for programmers
who (i) are strong in some language (e.g. C, C++, Java,
Scheme, R, Matlab, SQL) but (ii)
are new to Q.
The goal is to make them proficient Q programmers.
The course is hands-on: students
are introduced to elements of Q and then work on exercises
that have the character of large-data puzzles.
Q for Smarties (3 days) -- Fluency
-
Module 1: Computational Primitives.
-
Fundamental primitives of q including list, string, linear algebra,
aggregation, grouping and data generation operators.
-
User-written functions.
-
Integration of adverbs with built-in and user-written functions.
-
Functions that read from the command line and read from and write to files.
-
Hands-on programming exercises from non-parametric statistics,
Monte Carlo interest simulations, and date arithmetic on in-memory data.
This module will be necessary for anyone doing serious scripting in Q.
-
Module 2: Input/Output and Inter-process Communication
-
File input and output for small and large files.
-
Inter-process communication.
-
Single-client, multiple-server communication with fault tolerance.
-
Multiple-client, multiple-server communication.
-
Module 3: Databases and Basic Database Operators
-
Creating and querying tables using SQL semantics, for example computing the
volume-weighted average price.
-
Ordered table semantics enabling moving averages with grouping.
-
Integration of user-defined functions within select clauses.
-
Generating large random data sets for simulation; performance of bulk insert,
adding columns, update, and delete.
-
Foreign key joins, other equijoin types, asof, window joins.
-
User-defined functions.
-
Performance tuning: indexing (grouping/parting), performance tests,
and query tuning.
Q for Smarties (2 days) -- Advanced
-
Module 4: Scaling Up and Sophisticated Queries
-
Handling large tables through splaying, partitioning, compression, and
segmentation.
-
Mixed Q/KDB programming for parts explosions, trade strategy implementation,
and running NAV.
-
Machine learning, domain-specific data
access languages, and finance-specific exercises.
-
Module 5: Advanced Tools
-
RDB, HDB, and tickerplants.
-
Accessing kdb from other languages.
-
Lots of hands-on in-class exercises.
This course is offered primarily to groups at enterprises that use
or plan to use q/kdb. For such groups, I can offer
flexible hours (e.g. I can work around critical trading
times during the day)
and can tailor the content to your enterprise's needs.
Please contact me at shasha@cs.nyu.edu for further details.
Some Comments About the Class (when taught in the 3-day version)
-
Julian Dobson:
Overall I thought the class was excellent - your teaching style, and candy, make the material very digestible.
You might consider creating two separate courses; "q for beginners" which would cover just smarties til 3, and "q for developers" which would compress smarties 1 thru 3 into a single day (as a refresher, may be having a pre-requisite of having done the q for beginners, or at minimum having read thru the q for beginners material) followed by 2 days of the more complex smarties 4 & 5.
-
Asif Gilani:
The class offered by Dr. Shasha gives you a completely different experience of learning q. It is not an easy language to learn, very different from other high level language like C+/Java. His whole approach is so unique, interesting and concepts are laced with audience appropriate examples, makes it fun to learn q. Three full days class and you won't feel it, credit goes to Dr. Shasha. No matter what is your experience with q prior to joining this class, you will be much more comfortable after this experience. One notable difference I found is that his approach is very academic not commercial. His interest is to teach you not just to fill his time sheet and go home.
About the instructor
I am a professor of computer science at the
Courant Institute of New York University where I work with biologists on
pattern discovery for microarrays,
combinatorial design, network inference, and protein docking;
with physicists, musicians, and financial people on algorithms
for time series; and on
database applications in untrusted environments.
Other areas of interest include database
tuning as well as tree and graph matching.
Because I like to type, I have written
six books of puzzles about a mathematical detective, a biography about
great computer scientists,
and technical books about database tuning, biological
pattern recognition, time series, and statistics.
For fun, I have written the puzzle column for various
publications including
Dr. Dobb's Journal and
Scientific American.
If you're curious, here is my
CV
and a
review paper
on some of our time series work.
shasha@cs.nyu.edu