Some useful large data utilities

.Q -- all the functions in the Q directory
http://code.kx.com/wiki/DotQ
.Q.fs -- feed a file to a function in chunks, e.g. for loading big csv files (we saw that in smarties4)
.Q.opt .z.x -- for parsing command line arguments
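
A reminder of the .Q.fs pattern (a sketch; the file name bigfile.csv is made up):

/ .Q.fs[f] passes the file to f in chunks, each chunk a list of lines;
/ here we just print the line count of each chunk
.Q.fs[{0N!count x}] `:bigfile.csv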

From the command line:
q -f1 foo -f2 bar

Within the interpreter:

.z.x             / ("-f1";"foo";"-f2";"bar")
args:.Q.opt .z.x / `f1`f2!(enlist "foo";enlist "bar")
args[`f1]        / ,"foo"
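
Missing flags can be filled in with .Q.def (a sketch reusing the flags above; the default values are made up):

/ .Q.def supplies defaults for options absent from the command line
/ and casts supplied values to the type of the corresponding default
args:.Q.def[`f1`f2!("foo0";"bar0")] .Q.opt .z.x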


.Q.dd[`:dir]`file -- build file paths (joins to `:dir/file)
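
Since .Q.dd returns a file handle, calls chain naturally (the directory and table names here are made up):

.Q.dd[`:db;2010.09.02]               / `:db/2010.09.02
.Q.dd[.Q.dd[`:db;2010.09.02];`quote] / `:db/2010.09.02/quote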

.Q.dpft[d;p;f;t] -- save table t splayed into partition p under directory d,
applying the parted attribute to field f; builds a partitioned
(or segmented) database one partition at a time.
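
For example (a sketch with a made-up in-memory table; the table must be a global, passed by name):

q2:([] sym:`a`b`a; px:1.0 2.0 3.0) / small quote-like table, made up
.Q.dpft[`:db;2010.09.02;`sym;`q2]  / writes db/2010.09.02/q2 splayed,
	/ sorted by sym with the parted attribute applied;
	/ symbols are enumerated via .Q.en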


Starting from the directory utils.d

q buildhdb.q
\\

Now read it in (starting from utils.d)

q
\l start/db
myquotes:select from quote where ask > 50
save `:../../myquotes.csv / saves the global myquotes as csv
save `:../../daily.csv    / saves the global daily as csv
\\

Now read these csv files in as an example (starting from utils.d).

q
/ (types; enlist ",") means comma-separated with a header row
myquotes2:("DTSFFIICC";enlist ",") 0: `:myquotes.csv
mydaily:("DSFFFFFFI";enlist ",") 0: `:daily.csv


For very big tables in fixed-width format (like the daily TAQ
trade/quote/NBBO files), we use .Q.dsftg.

Small example.
Here is an input file:

21:04:03ABBBBBB
21:05:03BCCCCCC
21:06:03CDDDDDD

Here is a program to load it.

d:(`:tq;2010.09.02;`quoteout) / destination: directory tq, partition 2010.09.02,
	/ table quoteout
f:`time`ex`sym / the column names
t:("TCS ";8 1 6 1) / (types;widths) of the input fields
	/ the trailing 1 is for the \n at the end of each line;
	/ it must have a corresponding space in the types string
s:(`:taqquotemini;0;0) / source of data: (file;offset;length)
g:{x} / don't convert
g:{@[x;`time;"i"$]} / or: convert time to integer; x is the table chunk being produced


.Q.dsftg[d;s;f;t;g]

\
http://www.nyxdata.com/Data-Products/Daily-TAQ


1. g is used to do further processing after the load (1:) and before the save
(.Q.en).
For example, to change the first field from time to int:

g:{@[x;`time;"i"$]}

2. .Q.dsftg loads and saves in chunks, governed by two globals:
M is the total number of rows to process;
in production it should be set to 0W (no limit),
but it could be set to, say, 17 for testing.
n is the chunk size: the number of rows read, processed, and appended at a time.

