k tutorial

Stevan Apter, Fintan Quill, and Dennis Shasha

date: July 2020 (with frequent updates)

Note that this document plagiarizes freely from the earlier tutorial for k7 by Alexander Belopolsky and Dennis Shasha.

Since 1992, Arthur Whitney's k and its derivatives have served a small number of highly skilled programmers, enabling them to create high performance (and high value) applications for finance and other data-intensive domains. While implementation efforts in other languages such as C++ and Java (and, to a lesser extent, Python, R, and Matlab) often involve thousands of lines of code, much of it built on top of libraries, a typical k application is on the order of scores of lines of code without the need for libraries. The expressive power is in the language itself.

This tutorial aims to familiarize users of conventional programming languages with the latest iteration of k, Shakti k. The language is capable of managing streaming, in-memory, historical, relational, and time-series data. The distributed model extends out to multiple machines whether on-premise or in the cloud.

Shakti k provides connectivity via Python, HTTP, SSL/TLS, and JSON. Shakti k supports compression and encryption for data, whether in-memory, in-flight or on disk. Shakti k also has primitives for blockchain operations.

The tutorial introduces language concepts, then presents examples. A good way to learn the language is to try to program the examples on your own.

Readers are invited to suggest corrections or new examples. You can contact the authors at tutorial@shakti.com.

First encounter

Installing k

This can change, but as of August 2020, Shakti Software distributes a free evaluation version of k through its website. Just unzip and go.

Shakti k does not have any dependencies and once you install it, you are ready to go. Simply type k at the command prompt and you will see the k banner and the prompt will change to a single space:

$ k
2020.05.03 (c) shakti
 █

The banner starts with the timestamp corresponding to the modification time of the k program. In some versions, the timestamp is followed by a letter that is M on macOS and L on Linux, and by Ncore and Mgb fields showing how many CPU cores (N) your k session will use and how much memory (M gigabytes) is earmarked for k use. The banner shown above omits these fields.

Power tip: for a better interactive experience, we recommend installing the rlwrap utility so that you can (among other things) bring back previous commands using the up-arrow key.

alias k="rlwrap k"

This lets you edit the expression that you enter at the k prompt and recall any previous input from history.

Using k as a calculator

You can start using k as a powerful calculator: enter an expression at the prompt, press Enter, and k will evaluate the expression and print the result. Many available operations will look familiar, but you will soon discover some features that are unique to k.

Arithmetic

In k, +, -, and * work as the usual addition, subtraction and multiplication operations, e.g.,

 3*4
12

but the division operator is % while / has several uses including serving as a prefix for comments that k will ignore:

 10%3      / 10 divided by 3
3.333333

The next feature that may come as a surprise is that k does not use the traditional order of operations:

 3*2+4     / addition is performed first
18

Instead of the (P)EMDAS order, k consistently evaluates its expressions from right to left, with only parentheses taking precedence:

 (3*2)+4   / multiplication is performed first
10
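
For readers coming from conventional languages, the right-to-left rule can be modeled in a few lines of Python. This is only an illustrative sketch, not how k is implemented: the name eval_rtl is hypothetical, and the toy evaluator assumes a flat expression of non-negative integers joined by +, - and * with no parentheses.

```python
import operator
import re

# Binary operators supported by this toy evaluator (an assumption of the sketch).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def eval_rtl(expr):
    """Evaluate a flat expression such as '3*2+4' strictly right to left."""
    tokens = re.findall(r"\d+|[+\-*]", expr)
    # Start from the rightmost number and fold leftward.
    result = int(tokens[-1])
    # Walk the remaining (number, operator) pairs from right to left.
    for i in range(len(tokens) - 2, 0, -2):
        op, left = tokens[i], int(tokens[i - 1])
        result = OPS[op](left, result)
    return result

print(eval_rtl("3*2+4"))   # 18, because 2+4 is computed first
print(eval_rtl("20*4-3"))  # 20, because 4-3 is computed first
```

Parenthesized sub-expressions, as in (3*2)+4, are outside what this sketch handles.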

Elementary functions

Like any good scientific calculator, k comes with a number of built-in functions. You can apply these functions by simply typing their names before the argument, separated by a space:

 sqrt 2
1.414214

Trigonometric functions operate on arguments in radians, and you will often need the π constant to convert from degrees. The π constant is built into k, and if you are using k on a Mac, you can type it using the alt-p key combination:

 sin 3.1
0.04158066

Special values

The log function computes the natural logarithm, that is, the logarithm to base e, which is about 2.718:

 log 3
1.098612

The exp function raises e to the given power:

 
 exp 2
7.389056

When you divide a nonzero number by zero, k returns 0w, which stands for infinity:

 1 % 0
0w

Unary operators

In the previous sections, we have seen operators that take two numbers as operands and functions that take one number as an argument. From elementary math, you are familiar with a unary - operator that takes a single operand and returns its negation. Not surprisingly, - does the same in k:

 - 42
-42

Reference card

Within a k session, you can type \ to get a summary of the basic operations.

 \

Verb                      Adverb                 Noun               System  
:                         '  each                bool  011b         \       
+   plus      flip        /  eachright over      int   0 2 3        \l a.k  
-   minus     minus       \  eachleft  scan      float 0 2 3.4      \v [d]  
*   times     first       ': eachprior           char  " ab"       *\f [d]  
%   divide                /: initover converge   name  ``ab         \w [x]  
&   and       where       \: initscan converge   date   2024.01.01  \t:n x  
|   or        reverse                            time 12:34:56.789  \u:n x  
<   less      asc         I/O                                               
>   more      dsc         0: read/write line     List (2;3.4;`a)    \fl line
=   equal     group       1: read/write char     Dict {x:2;y:`a}    \fc char
~   match     not         2: *read/write data    Func {[x;y]x+y}    \cd dir 
!   key       enum        3: *kipc/set           Expr :x+y                  
,   cat       enlist      4: *http/get                                      
^   cut       sort                                                          
#   take      count       #[t;c;b[;a]]  select   Table  [[]a:..;b:..]       
_   drop      floor      *_[t;c;b[;a]]  update   NTable `a`b![[]c:..]       
?   find      unique     *?[x;i;f[;y]]  splice   TTable  [[a:..]b:..]       
@   at        type        @[x;i;f[;y]]  amend                               
.   dot       eval        .[x;i;f[;y]]  dmend    atom/list/dict/table       
$   value     string      $[c;t;f]      cond     scalar/vector/matrix       

select A by B from T where C; count first last min max sum avg              
sqrt exp log sin cos div mod bar ema ; in bin has                           

date/cuando: 2024.01.01T12:34:56.123456789 .z.[DTV] MDHRSTUV                
time/cuanto: 2m 2d ..   12:34:56.123456789 .z.[dtv] mdhrstuv                
x~`json?`json x:(23;3.4;"abc");x~`csv?`csv x                                

ffi:"./a.so"2:`f!"j"  /long f(long j){return 1+j;} //cblas ..               
c-k:"./b.so"2:`f!1    /K f(K x){return kj(1+xj);}  //feeds ..               
python: from k import k;k("+",(2,3)))  (nodejs: require("k"))               

error: value class rank length type domain (parse stack limit)              
limit: {[param8]local8 global32 const128 jump256} name8     


Lists

Much of the expressiveness of k derives from the fact that most operations that operate on single values (atoms) generalize nicely to lists.

The simplest list is simply a sequence of numbers separated by spaces. When you apply one of the arithmetic operations between an atom and a list, k computes the result of the operation between each element of the list and the atom. For example,

 1 2 3 4 * 10
10 20 30 40

You can also perform operations on lists of the same length: k will pair up the corresponding elements and apply the operation to each pair.

 1 2 3 + 3 2 1
4 4 4

However, if the lengths don't match, k will signal an error

 1 2 3 + 3 2
1 2 3 + 3 2
:length
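
The pairing rules above can be spelled out in Python for comparison. This is a rough analogue rather than k itself; apply_op is a hypothetical helper name, and a Python ValueError stands in for k's length error.

```python
import operator

def apply_op(op, x, y):
    """Apply a binary operation to atoms or lists, pairing items up as k does."""
    x_is_list, y_is_list = isinstance(x, list), isinstance(y, list)
    if x_is_list and y_is_list:
        if len(x) != len(y):
            raise ValueError("length")  # k signals a length error here
        return [op(a, b) for a, b in zip(x, y)]
    if x_is_list:
        return [op(a, y) for a in x]   # list with atom
    if y_is_list:
        return [op(x, b) for b in y]   # atom with list
    return op(x, y)                    # atom with atom

print(apply_op(operator.mul, [1, 2, 3, 4], 10))      # [10, 20, 30, 40]
print(apply_op(operator.add, [1, 2, 3], [3, 2, 1]))  # [4, 4, 4]
```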

Generating lists

Entering long lists into k can soon become tedious, so k provides convenient ways to generate lists either deterministically or randomly. You can generate a list of any length by placing &, ! or ? in front of a number:

  • &N - N zeros
  • !N - 0 through N-1
  • ? N - N random numbers drawn uniformly from [0, 1]
  • N ? M - N random numbers drawn with replacement from !M
  • -N ? M - N random numbers drawn without replacement from !M (here M ≥ N)

Zeros

Instead of typing 0 twelve times for a list of twelve zeros, we can tell k to generate a list for us

 &12
0 0 0 0 0 0 0 0 0 0 0 0

Index

 x: !10
 x 
0 1 2 3 4 5 6 7 8 9
 5 + x
5 6 7 8 9 10 11 12 13 14

Random lists

 10 ? 50 / generate randomly and uniformly with replacement (so there can be duplicates)
31 1 10 31 29 4 29 23 11 9

(because of randomness, your results might not match the above or below)

 
 ? 10 / generate 10 floating point numbers between 0 and 1
0.8380079 0.9375869 0.7381769 0.6844356 0.6763259 0.698496 0.7214945 0.8788969 0.6747493 0.570967


 

 -10 ? 10 / generate randomly without replacement (no duplicates)
3 7 0 5 1 2 9 6 4 8
 -10 ? 20
18 13 19 3 14 10 15 9 8 0
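
The difference between drawing with and without replacement may be easier to see next to Python's standard random module. The function names roll and deal are hypothetical labels chosen for this sketch, and the values printed will differ from run to run.

```python
import random

def roll(n, m):
    """n draws from 0..m-1 with replacement, so duplicates are possible."""
    return [random.randrange(m) for _ in range(n)]

def deal(n, m):
    """n draws from 0..m-1 without replacement, so m must be at least n."""
    return random.sample(range(m), n)

print(roll(10, 50))  # comparable to  10 ? 50
print(deal(10, 10))  # comparable to -10 ? 10, a permutation of 0..9
print(deal(10, 20))  # comparable to -10 ? 20, ten distinct values below 20
```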

Assignment

Assignment in k uses a colon.

 a: 5+!12

For the next little while, a will refer to the sequence 5 through 16, until we reuse this name by assigning it to something else.

 a
5 6 7 8 9 10 11 12 13 14 15 16

Drop and Take

From simple lists, k can create others by cutting the list, or "dropping", using the _ operator. For example, we can take the list assigned to the variable a above and cut off the first 2 elements:

 2  _ a
7 8 9 10 11 12 13 14 15 16

We can also delete particular values from a list by giving them as the left argument of _. For example, we can remove the values 8 and 11 from a:

  8 11 _ a
5 6 7 9 10 12 13 14 15 16

If we want to create a matrix filled with a single initial value, we can use the # (reshape) operator:

 3 4 # 0
(0 0 0 0;0 0 0 0;0 0 0 0)

If we want to extract ("take") from a list all the elements that appear in another list, we use the same # operator with a list on both sides:

 3 4 # 2 3 4 5 3 15 222 4 55
3 4 3 4

If the reshape operator is given a list on the right that contains fewer items than is necessary to fill the shape, items from the front of the list will be reused

 30 # a
5 6 7 8 9 10 11 12 13 14 15 16 5 6 7 8 9 10 11 12 13 14 15 16 5 6 7 8 9 10
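
The wrap-around behavior of take and the constant-filled matrix can be imitated in Python as follows. This is a sketch for comparison only; take and reshape are hypothetical helper names, and reshape here covers just the case of filling a rows-by-columns matrix with one value.

```python
from itertools import cycle, islice

def take(n, xs):
    """First n items of xs, reusing items from the front when xs is too short."""
    return list(islice(cycle(xs), n))

def reshape(rows, cols, atom):
    """A rows-by-cols matrix in which every cell holds atom."""
    return [[atom] * cols for _ in range(rows)]

a = list(range(5, 17))    # 5 through 16, like  a: 5+!12
print(take(30, a))        # a twice, followed by its first six items
print(reshape(3, 4, 0))   # [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
```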

Lists, enlist, and join

A list in k is simply a sequence of elements, e.g.,

 x: 3 1 6 4
 x
3 1 6 4

Lists have length:

 #x
4

A list with one element can be created using an "enlist" operator which is just a comma:

 y: ,42
 y
,42

The result has a comma to indicate that this is a list with a single element as opposed to just a single element.

Lists can be concatenated also using a comma, e.g.,

 y,y
42 42
 x,y
3 1 6 4 42
 x,x
3 1 6 4 3 1 6 4

One can generate multi-dimensional lists using semicolons, e.g.,

 z: (5 3; 15 30; 50 60)
 z
 5  3
15 30
50 60

This matrix can then be accessed using bracket notation in row-column format

 z[2;0]
50
 z[0;1]
3

Further, we can take entire rows:

 z[0]
5 3

or columns

 z[;1]
3 30 60
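
For comparison, the same lookups on a Python list of lists look like this. Python has no direct counterpart to z[;1], so the column is collected with a list comprehension; this is an analogue, not k.

```python
z = [[5, 3], [15, 30], [50, 60]]

element = z[2][0]              # like z[2;0]
row = z[0]                     # like z[0]
column = [r[1] for r in z]     # like z[;1]

print(element, row, column)    # 50 [5, 3] [3, 30, 60]
```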

Arithmetic works on both single dimensional arrays

 x + x
6 2 12 8

and multi-dimensional arrays

 z + z
 10   6
 30  60
100 120

Concatenation works on multi-dimensional arrays

 z,x
5 3  
15 30
50 60
3    
1    
6    
4  

Verbs, Adverbs, and User-defined Functions

The operators in k are called 'verbs' and often have two meanings depending on whether they are 'unary' (apply to a single argument) or 'binary' (apply to a pair of arguments). Usually, the binary verb will be the more familiar one. Much of the power comes from applying verbs to arrays.

Verbs

unary + (flip or transpose)

 x: (1 2 3 4; 5 6 7 8) / assign to x  a two row array whose first row is 1 2 3 4
 x
1 2 3 4
5 6 7 8

 +x                    / a four row array (transpose of x) whose first row is 1 5
1 5
2 6
3 7
4 8

binary + (plus)

 2 + 3 / scalar (single element) addition
5

 x + x / array addition (element by element)
2 4 6 8
10 12 14 16

 2 + x / element to array addition
3 4 5 6
7 8 9 10

Just as most human languages have verb modifiers called adverbs, k does too. Adverbs apply to most unary and binary operators. Thus, the / adverb (called 'over'), instead of indicating a comment, can cause the binary version of the verb to apply to the elements in the array in sequence and yields a single result. The \ adverb (called 'scan') does the same but keeps all the intermediate results.

over and scan

 +/ 1 2 3 4  / Apply the + operator between every pair of elements; produce sum
10

  +\ 1 2 3 4 / Same as above but produce all partial sums
1 3 6 10
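
Readers who know Python may recognize over and scan as reduce and accumulate from the standard library. The wrappers below are a sketch with hypothetical names (over, scan); they cover only the case of a binary function applied to a non-empty list with no explicit starting value.

```python
from functools import reduce
from itertools import accumulate
import operator

def over(f, xs):
    """Fold f between successive items and return only the final result."""
    return reduce(f, xs)

def scan(f, xs):
    """Same fold, but keep every intermediate result."""
    return list(accumulate(xs, f))

print(over(operator.add, [1, 2, 3, 4]))  # 10
print(scan(operator.add, [1, 2, 3, 4]))  # [1, 3, 6, 10]
```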

Adverbs can modify verbs directly but can also modify verb-adverb combinations. The ' (each) adverb takes both roles.

 x
1 2 3 4
5 6 7 8

 +/'x / Apply +/ to each row of x
10 26

 +\'x / Apply +\ to each row of x
1 3 6 10
5 11 18 26

Adverbs can modify user-defined functions as well.

 f:{[a] (a*a)+3 } / function that squares its arg and adds three
 f[4]
19

Now we can apply f to each element of an array using the each adverb.

 f'1 2 3 4
4 7 12 19

There is \ (each left), which pairs each element of the left array with the whole right argument:

 1 2 3 4 +\ 10
11 12 13 14

There is / (each right), which pairs the whole left argument with each element of the right array:

 20 +/ 1 2 3 4
21 22 23 24

There is each left each right (which should be interpreted as performing an each left on successive elements of the right array):

 1 2 3 4 +\/ 10 20 30 40 50
(11 12 13 14;21 22 23 24;31 32 33 34;41 42 43 44;51 52 53 54)

Note that the value of the above expression is the same as this:

 x: 1 2 3 4 +\/ 10 20 30 40 50
 x
11 12 13 14
21 22 23 24
31 32 33 34
41 42 43 44
51 52 53 54
So the only difference is layout.

Eachright eachleft considers each element of the left array one at a time and applies +/ (each right) to that element and the right array:

 1 2 3 4 +/\ 10 20 30 40 50
11 21 31 41 51
12 22 32 42 52
13 23 33 43 53
14 24 34 44 54

Just as each applies to user-defined functions, so do eachleft and eachright. This is one of the coolest features of the language.

Function Arguments and Adverbs

 g:{[a;b] a + (7 * b)}
 g[2;3]
23

Eachleft considers each element of the left array one at a time and applies g to that element and to the entire right array.

 1 2 3 4 g\ 10 20
(71 141;72 142;73 143;74 144)

Eachright considers each element of the right array one at a time and applies g to the left array and that element.

 1 2 3 4 g/ 10 20
(71 72 73 74;141 142 143 144)

Eachleft eachright considers each element of the right array one at a time and applies g\ (each left) to the left array and that element.

 1 2 3 4 g\/ 10 20
(71 72 73 74;141 142 143 144)
Note that the operators inside g are + and * which apply equally well to scalars as to vectors, so g\/ will have the same result as g/. That is not always the case as we'll see when we look at matrix multiplication.
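
The nesting of each left and each right can be written out with Python list comprehensions. This is a sketch for comparison, not k: the helper names each_left and each_right are hypothetical, and g is defined on scalars only, so the combinations are built by nesting the helpers.

```python
def g(a, b):
    """Scalar version of the tutorial's g: a + 7*b."""
    return a + 7 * b

def each_left(f, xs, y):
    """Apply f to each element of xs paired with the whole of y."""
    return [f(x, y) for x in xs]

def each_right(f, x, ys):
    """Apply f to the whole of x paired with each element of ys."""
    return [f(x, y) for y in ys]

left, right = [1, 2, 3, 4], [10, 20]

# Each left inside each right: one row per element of the right list.
left_then_right = each_right(lambda xs, y: each_left(g, xs, y), left, right)
# Each right inside each left: one row per element of the left list.
right_then_left = each_left(lambda x, ys: each_right(g, x, ys), left, right)

print(left_then_right)  # [[71, 72, 73, 74], [141, 142, 143, 144]]
print(right_then_left)  # [[71, 141], [72, 142], [73, 143], [74, 144]]
```

The two results are transposes of each other, which is the layout difference between the two adverb orders.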

Finally, each can apply to just the first argument if we fix the second argument:

 x1: 1 2 3 4
 x2: 50

 g[;x2]'x1
351 352 353 354

Extended Example: matrix multiplication

Recall that matrix multiplication involves the dot products between rows of the left matrix and the columns of the right matrix.

 leftmat: (1 2 3; 4 5 6; 7 8 9; 10 11 12)
 leftmat
1 2 3
4 5 6
7 8 9
10 11 12

 rightmat: (100 200 300 400 500; 1000 2000 3000 4000 5000; 10000 20000 30000 40000 50000)
 rightmat
100 200 300 400 500
1000 2000 3000 4000 5000
10000 20000 30000 40000 50000

 dot:{[v1;v2] +/ v1 * v2} / dot product function

 dot[4 5 6; 300 3000 30000]
196200

 matmult:{[m1;m2] m1 dot/\ +m2}
 matmult[leftmat; rightmat]
(32100 64200 96300 128400 160500;65400 130800 196200 261600 327000;98700 197400 296100 394800 493500;132000 264000 396000 528000 660000)
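
The same computation in Python makes the row-by-column structure explicit. This is a sketch for comparison with the k one-liner; zip(*m2) plays the role of the transpose +m2, and the nested comprehension plays the role of the combined each adverbs.

```python
def dot(v1, v2):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(v1, v2))

def matmult(m1, m2):
    """Dot every row of m1 with every column of m2."""
    columns = list(zip(*m2))  # transpose of m2
    return [[dot(row, col) for col in columns] for row in m1]

leftmat = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
rightmat = [
    [100, 200, 300, 400, 500],
    [1000, 2000, 3000, 4000, 5000],
    [10000, 20000, 30000, 40000, 50000],
]

for row in matmult(leftmat, rightmat):
    print(row)
```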

/ To see that we need the /\, consider the following slightly changed
/ multiplication functions.
/ If you try them, you'll get a length error.
 matmultleft:{[m1;m2] m1 dot\ +m2}
 matmultright:{[m1;m2] m1 dot/ +m2}

Adverbs Replace Loops

K programmers tend not to need loops. In fact, some of them (like the first author of this document) militantly disdain loops. The reason is simply that the language uses adverbs instead of loops.

For example, the loop in a python-like language

result = 0
for i in range(len(myarray)):
    result += f(myarray[i])

becomes

result: +/f'myarray

Each invocation of f could be done in parallel, depending on the implementation.

By contrast, we might want sequentiality, at least semantically. For example, cumulating calculations within a loop such as:

result = myarray[0]
for i in range(1, len(myarray)):
    result = f(result, myarray[i])

become / or \, at least for some f's:

 myarray: 2 2 2 2
 f:{[x;y] x + 2*y}

 f\ myarray
2 6 10 14

 f/ myarray
14

Beyond numbers

"The easiest machine applications are the technical/scientific computations." Edsger W. Dijkstra

Names, characters and strings

A character string is simply an array of characters:

 x: "fast, cool, and really concise"
 #x
30
 x[2 4]
"s,"
 x[<x] / The < is a sorting operator (to be discussed in more detail below)
"    ,,aaacccdeefilllnnooorssty"

Whereas character strings occupy one byte per character, symbols are hashed and therefore take less space, a useful feature in a large data application in which a symbol is repeated many times.

 x: (`abc; `defg)
 x
`abc`defg
 #x
2

Here is some guidance in choosing between symbols and characters. If there are a few distinct character sequences and they are repeated many times (e.g., a history of all trades where there are only a few thousand stock symbols but millions of trades), then symbols are best for operations like sorting and matching. Otherwise char vectors are probably better especially if you need to do substring matching.

Dates, times and durations

The date format is year.month.day, and you can get today's date with .z.D

 x: .z.D
 x
2020.07.24


 x + 44
2020.09.06

The time format is hours:minutes:seconds.milliseconds

 x: .z.t / greenwich mean time
 x
22:52:01.424



 x + 609 / add to milliseconds
22:52:02.033
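
The date and time arithmetic above can be cross-checked with Python's datetime module. This is only a comparison sketch: in k the integer added to a date counts days and the integer added to a time counts milliseconds, whereas Python requires an explicit timedelta.

```python
from datetime import date, datetime, timedelta

day = date(2020, 7, 24)
print(day + timedelta(days=44))  # 2020-09-06

moment = datetime(2020, 7, 24, 22, 52, 1, 424000)
later = moment + timedelta(milliseconds=609)
print(later.time().isoformat(timespec="milliseconds"))  # 22:52:02.033
```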

Dictionaries

/ Dictionaries are key to value structures. There are many ways to create a dictionary.

/ From partitions

 mypart: ="many sentences have the letter e very often"
 mypart
 |4 14 19 23 30 32 37        
a|1 16                       
c|11                         
e|6 9 12 18 22 25 28 31 34 41
f|39                         
h|15 21                      
l|24                         
m|0                          
n|2 7 10 42                  
o|38                         
r|29 35                      
s|5 13                       
t|8 20 26 27 40              
v|17 33                      
y|3 36       

 / Note that the first entry corresponds to the blank between words   


 mypart["e"]
6 9 12 18 22 25 28 31 34 41
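
What the unary = (group) verb computes can be sketched in Python with an ordinary dictionary. The function name group is chosen for this illustration; note that a Python dict keeps keys in order of first appearance, whereas the k display above lists them sorted.

```python
def group(xs):
    """Map each distinct value to the list of positions where it occurs."""
    positions = {}
    for i, value in enumerate(xs):
        positions.setdefault(value, []).append(i)
    return positions

mypart = group("many sentences have the letter e very often")
print(mypart["e"])  # [6, 9, 12, 18, 22, 25, 28, 31, 34, 41]
print(mypart[" "])  # [4, 14, 19, 23, 30, 32, 37]
```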

/ One can also create dictionaries directly

 mydict: `bob`carol!(2;3)
 mydict
bob  |2
carol|3

 mydict[`bob]
2

/ dictionaries can be heterogeneous in their values

 mydict2: `alice`bill`tom`judy`carol!(1 2 3 4; 8 7; 5; "we the people"; `abc)

 mydict2[`carol]
`abc
 mydict2[`judy]
"we the people"

Tables

In the row-wise vs. column-wise table debate, k comes out as columnwise. We'll work up to this slowly.

Consider a list:

 x: 1 2 3 4 5 10 15
 x
1 2 3 4 5 10 15

Create a one column table from this:

 xtab: +`numcol!x
 xtab
numcol
------
     1
     2
     3
     4
     5
    10
    15

Here the table has one column and its header is numcol.

  select numcol from xtab
[[]numcol:1 2 3 4 5 10 15]


  select sum numcol from xtab
[numcol:40]

   select sum numcol from xtab where numcol > 4
[numcol:30]

Ok, now let's create a multiple column table. Your results may differ because of the use of randomness.

 n: 10
 newtab: +(`stock`date`price`vol)!(n ? `ibm`goog`hp;.z.D+/n ? 10;100 + n ? 200; n ? 5000)
 newtab
stock date       price vol 
----- ---------- ----- ----
goog  2020.07.29   182 2797
ibm   2020.08.02   171   18
goog  2020.08.02   165  676
goog  2020.07.27   148 1087
ibm   2020.07.26   188 4855
goog  2020.08.02   279 3445
ibm   2020.07.26   256 3497
ibm   2020.07.24   218 2213
goog  2020.07.28   181 3019
hp    2020.07.28   100 2255


  select sum price*vol by stock from newtab
[[stock:`goog`hp`ibm]vol:2289064 225500 2293484]




  select sum price*vol by stock from newtab where vol > 1500
[[stock:`goog`hp`ibm]vol:2016648 225500 2290406]
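
To make the column-wise view concrete, here is a rough Python analogue of the two queries above, using the sample rows shown in the table (your own random data will differ). The table is modeled as a dictionary of equal-length columns; the function name sum_price_vol_by_stock and its min_vol parameter are hypothetical, and the date column is left out because the queries do not use it.

```python
from collections import defaultdict

newtab = {
    "stock": ["goog", "ibm", "goog", "goog", "ibm", "goog", "ibm", "ibm", "goog", "hp"],
    "price": [182, 171, 165, 148, 188, 279, 256, 218, 181, 100],
    "vol":   [2797, 18, 676, 1087, 4855, 3445, 3497, 2213, 3019, 2255],
}

def sum_price_vol_by_stock(table, min_vol=None):
    """Sum price*vol per stock, optionally keeping only rows with vol > min_vol."""
    totals = defaultdict(int)
    for stock, price, vol in zip(table["stock"], table["price"], table["vol"]):
        if min_vol is None or vol > min_vol:
            totals[stock] += price * vol
    return dict(sorted(totals.items()))  # sorted by key, as in the k output

print(sum_price_vol_by_stock(newtab))
print(sum_price_vol_by_stock(newtab, min_vol=1500))
```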

User-defined functions:

 f:{[x] 1.5*x}
 select sum f[price*vol] by stock from newtab where vol > 1500
[[stock:`goog`hp`ibm]vol:3024972 338250 3435609f]

Extracting Data from Tables into Other Structures

/ select always gives a table

 select date from newtab
[[]date:2020.07.29 2020.08.02 2020.08.02 2020.07.27 2020.07.26 2020.08.02 2020.07.26 2020.07.24 2020.07.28 2020.07.28]

/ Indexing by column name gives an array.

 newtab[`date]
2020.07.29 2020.08.02 2020.08.02 2020.07.27 2020.07.26 2020.08.02 2020.07.26 2020.07.24 2020.07.28 2020.07.28

Modifying Tables

Arthur/Fintan: update is not working for me. Still to be implemented.
 select vol from newtab
[[]vol:2797 18 676 1087 4855 3445 3497 2213 3019 2255]


 update vol:1+vol from newtab
vol
---
101
101
201
201

/ but the table itself is not changed because there is no assignment:

  newtab
tradeid stock timeindi price vol
------- ----- -------- ----- ---
      1 goog        50  1237 100
      2 msft        51   109 100
      3 goog        52  1240 200
      4 msft        53   112 200

/ Similarly delete without assignment gives a result but the underlying table is not changed.


 delete from newtab where vol > 100
[[]stock:,`ibm;date:,2020.08.02;price:,171;vol:,18]


 newtab
stock date       price vol 
----- ---------- ----- ----
goog  2020.07.29   182 2797
ibm   2020.08.02   171   18
goog  2020.08.02   165  676
goog  2020.07.27   148 1087
ibm   2020.07.26   188 4855
goog  2020.08.02   279 3445
ibm   2020.07.26   256 3497
ibm   2020.07.24   218 2213
goog  2020.07.28   181 3019
hp    2020.07.28   100 2255

/ On the other hand

Arthur/Fintan: update is not working for me. Still to be implemented.
 newtabupdated: update vol:1+vol from newtab
 newtabupdated
vol
---
101
101
201
201


/ Would need to assign to newtab to see this effect.
/ e.g. newtab: delete from newtab where vol > 100

/ Here is a row to insert.

 x: {stock:`amzn;date:2020.06.28;price:281;vol:4009}

/ An insert:

 newtab: newtab,x
 newtab
stock date       price vol 
----- ---------- ----- ----
goog  2020.07.29   182 2797
ibm   2020.08.02   171   18
goog  2020.08.02   165  676
goog  2020.07.27   148 1087
ibm   2020.07.26   188 4855
goog  2020.08.02   279 3445
ibm   2020.07.26   256 3497
ibm   2020.07.24   218 2213
goog  2020.07.28   181 3019
hp    2020.07.28   100 2255
amzn  2020.06.28   281 4009

/ There is a notion of a keyed table where each key value is supposed to occur only once. tradeid is an example.

Arthur/Fintan: This keyed table facility is not working for me. Still to be implemented.
 u:`tradeid key newtab

/ Note that the , operator will insert if the new row has a new key (tradeid of 45)

 y: {tradeid:45;stock:`goog;time:16:26:50.123;price:3337f;vol:75}
 u,y
tradeid|stock time         price vol
-------|----- ------------ ----- ---
 1     |goog  15:16:50.000 1237  100
 2     |msft  15:16:51.000 109   100
 3     |goog  15:18:50.000 1240  200
 4     |msft  15:18:52.000 112   200
15     |goog  15:26:50.123 2337  200
45     |goog  16:26:50.123 3337   75

/ but update if the new row has an existing key (tradeid of 4). So the , operator is called an upsert.

 y: {tradeid:4;stock:`goog;time:16:26:50.123;price:3337f;vol:75}

 u,y
tradeid|stock time         price vol
-------|----- ------------ ----- ---
 1     |goog  15:16:50.000 1237  100
 2     |msft  15:16:51.000 109   100
 3     |goog  15:18:50.000 1240  200
 4     |goog  16:26:50.123 3337   75
15     |goog  15:26:50.123 2337  200

/ This is an upsert because this is an operation that specifies the key and all fields.

 u
tradeid|stock time         price vol
-------|----- ------------ ----- ---
 1     |goog  15:16:50.000 1237  100
 2     |msft  15:16:51.000 109   100
 3     |goog  15:18:50.000 1240  200
 4     |msft  15:18:52.000 112   200
15     |goog  15:26:50.123 2337  200

Importing from a csv file

Create a small csv file mytrade.csv whose schema is: tradeid,stock,timeindicator,price,vol

Arthur/Fintan: I/O is not working for me.
mytrade.csv:
1,goog,50,1237,100
2,msft,51,109,100
3,goog,52,1240,200
4,msft,53,112,200
 ("iCiii";",")0:"mytrade.csv"
1    2    3    4   
goog msft goog msft
50   51   52   53  
1237 109  1240 112 
100  100  200  200 



 mytrade1: +(`tradeid`stock`timeindicator`price`vol)!("iCiii";",")0:"mytrade.csv"

 mytrade1
tradeid stock timeindi price vol
------- ----- -------- ----- ---
      1 goog        50  1237 100
      2 msft        51   109 100
      3 goog        52  1240 200
      4 msft        53   112 200


 select tradeid, price, vol from mytrade1
[[]tradeid:1 2 3 4;price:1237 109 1240 112;vol:100 100 200 200]




 select tradeid, price, vol from mytrade1  where price > 500
[[]tradeid:1 3;price:1237 1240;vol:100 200]
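
For comparison, a column-oriented import of the same data can be sketched with Python's csv module. This is not how k reads files; the helper name read_columns is hypothetical, and the file contents are supplied as a string so that the example is self-contained.

```python
import csv
import io

CSV_TEXT = """1,goog,50,1237,100
2,msft,51,109,100
3,goog,52,1240,200
4,msft,53,112,200
"""

def read_columns(text, names, converters):
    """Parse CSV text into a dict of columns, converting each column's type."""
    rows = list(csv.reader(io.StringIO(text)))
    columns = zip(*rows)  # rows to columns, like the flip in the k version
    return {name: [convert(v) for v in col]
            for name, convert, col in zip(names, converters, columns)}

mytrade1 = read_columns(
    CSV_TEXT,
    ["tradeid", "stock", "timeindicator", "price", "vol"],
    [int, str, int, int, int],
)

# Rows with price > 500, keeping only the tradeid column.
expensive = [t for t, p in zip(mytrade1["tradeid"], mytrade1["price"]) if p > 500]
print(mytrade1["price"])  # [1237, 109, 1240, 112]
print(expensive)          # [1, 3]
```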

Many operations are faster on key fields of tables. Recall that a key in a table is a field with the property that no two rows of the table have the same value in that field. To specify a table with keys, we can use:

 

mykeytable: [[a:1 2 3]b:4 5 6]
In the above, a is the key field. Now if we want to append to this table, we can do that as follows.
 

x: 10 + !15
y: |100+ !15

mykeytable,:[[a:x]b:y] / append

 mykeytable
a |b  
--|---
 1|  4
 2|  5
 3|  6
10|114
11|113
12|112
13|111
14|110
15|109
16|108
17|107
18|106
19|105
20|104
21|103
22|102
23|101
24|100

An "upsert" is an append if the key being introduced is not in the table and is an update if it is.

mykeytable,: [[a:3 6 15]b:3000 3001 3002]

 mykeytable
a |b   
--|----
 1|   4
 2|   5
 3|3000
10| 114
11| 113
12| 112
13| 111
14| 110
15|3002
16| 108
17| 107
18| 106
19| 105
20| 104
21| 103
22| 102
23| 101
24| 100
 6|3001
Notice in the above that 6 is a new key value but 3 and 15 were already present, so their b fields were updated.
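
The upsert rule maps naturally onto a Python dictionary, which may help readers picture it. This is a sketch under the assumption of a single key column a and a single value column b; the function name upsert is hypothetical.

```python
def upsert(table, new_rows):
    """Return a copy of table with new_rows merged in by key."""
    merged = dict(table)
    merged.update(new_rows)  # existing keys are overwritten, new keys are appended
    return merged

mykeytable = {1: 4, 2: 5, 3: 6}
mykeytable = upsert(mykeytable, dict(zip(range(10, 25), range(114, 99, -1))))
mykeytable = upsert(mykeytable, {3: 3000, 6: 3001, 15: 3002})

print(mykeytable[3], mykeytable[15])  # 3000 3002
print(list(mykeytable)[-1])           # 6, the only new key, lands at the end
```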

Control Flow

Before we get to control flow in the classical sense, it's important to remember how to read a k expression. As mentioned earlier, k has no operator precedence of the kind found in languages where, for example, * binds more tightly than + or -. Instead, evaluation proceeds from right to left, which, in case you're wondering, conforms with mathematical usage (e.g., for the sum of x * y we first multiply x and y and then take the sum).

Order of operations

 20*4-3
20

 (20*4) - 3 / to make * bind closer than -, you need parentheses
77

 20*(4-3) / otherwise, precedence is right to left.
20

Let's say we want the sum of the elements having values greater than 35

 x: 90 30 60 40 20 19
 x > 35 / put a boolean 1 where values are greater than 35 and a 0 otherwise
101100b


 & x > 35 / indexes in x where values are greater than 35
0 2 3

 x[&x > 35] / elements in x that are greater than 35
90 60 40

 +/ x[&x > 35]  / sum of elements of x whose values are greater than 35
190
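
The same four steps (compare, find positions, index, sum) can be written out in Python for readers who want to check their understanding. This is an analogue of the k expressions above, not k itself.

```python
x = [90, 30, 60, 40, 20, 19]

mask = [v > 35 for v in x]                      # like  x > 35
where = [i for i, m in enumerate(mask) if m]    # like  & x > 35
selected = [x[i] for i in where]                # like  x[&x > 35]
total = sum(selected)                           # like  +/ x[&x > 35]

print(where, selected, total)  # [0, 2, 3] [90, 60, 40] 190
```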

$[c;t;f] (Conditional)

If c is true, then execute the t branch, else the f branch.

 x: 3 4 5
 y: 10 20 30

 $[5 > 3; +/x; +/y]
12
 $[5 < 3; +/x; +/y]
60

?[x;I;[f;]y] (replace the index positions by what comes afterwards)

Arthur/Fintan ? for replacement is not working for me. Still to be implemented.
 x: 3 4 5 6 7 8 9 10
 y: 100 200 300 400

 ?[x;3;y] / insert y at position 3 (nothing is removed)
3 4 5 100 200 300 400 6 7 8 9 10

 ?[x;3 4; y] / starting in position 3 and counting 4, replace by y
3 4 5 100 200 300 400 10
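
The intended behavior shown in the two outputs above can be expressed with Python slicing. This sketch only mirrors those outputs; the function name splice and its separate start and count parameters are assumptions made for illustration.

```python
def splice(xs, start, count, ys):
    """Replace count items of xs, beginning at start, with the items of ys."""
    return xs[:start] + ys + xs[start + count:]

x = [3, 4, 5, 6, 7, 8, 9, 10]
y = [100, 200, 300, 400]

print(splice(x, 3, 0, y))  # insert at position 3, removing nothing
print(splice(x, 3, 4, y))  # replace the four items starting at position 3
```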

Input/Output and Interprocess Communication

Create a file having these three lines and call it tmp:

We the people
of the
United States

Read in the file

Fintan: Perhaps you can help me with the I/O part. This is not working for me. Dennis is stopping here on July 24.
 x: 0: "tmp"
 x
We the people
of the
United States
 x[1] / x is just an array so x[1] is the second element, viz.
"of the"

 y: (x[0]; x[1]; x[2]; x[1])
 y
We the people
of the
United States
of the

 "tmp2" 0: y

Now look at tmp2 and see that you have:

We the people
of the
United States
of the

1: (write binary image)

Fintan/Arthur: I know you're working on IPC
 x: 1 2 3 4
 "tmp3" 1: x

 y: 1: "tmp3"
 y
1 2 3 4

Text input/output is 0:

Fintan: If you can show me different input/output, that would be good. These examples below do not seem to work.
 "foo"2:("This is line 1\n This is line 2")

 2:"foo"
"This is line 1\n This is line 2"

Interactive prompt is 1:

 name:1:""1:"What is your name? "
What is your name? Carol
 name
"Carol"

Scripts and Meta-functions

\l a.k load

Create a file foo.k with the two lines

 x: 1 2 3 4
 g: {[x] x*x}

Then start a k session and then load foo.k in that session using the \l command:

 \l foo.k
 x
1 2 3 4
 g
{[x] x*x}
 g'x
1 4 9 16

\v variables \f functions

 \l foo.k

 \v / creates a dictionary where the key f is for functions and v for variables
f|g
v|x

\w workspace

This shows how much memory you are using:

 \w
332

Shortcuts

We would be remiss if we failed to mention some shortcuts that k aficionados love to use, even though some of us feel that they reduce clarity. For example, unary functions implicitly perform "each" when applied to arrays:

 f:{[a] (a*a)+3 }

 f'1 2 3 4
4 7 12 19

 f 1 2 3 4
4 7 12 19

 x:(1 2 3 4;5 6 7 8)
 f'x
 4  7 12 19
28 39 52 67

 f x
 4  7 12 19
28 39 52 67

Default function parameters are x y z, e.g. {z+x*y}[3;2;1] is 7

 f:{z + x*y}
 f[10; 20; 30]
230

Eval

It is possible to make strings be evaluated as k expressions using the dot operator:

 . "2+3"
5

System Calls (in progress)

To run Unix commands from within a k session, just precede them with a backslash,

e.g. \ls \date

Debugging

On error, k drops into a debugging prompt where you can inspect variables and assign to them:

 f:{[x;y] x + y}
 f[5;6]
11
 f[5;`abc]
{[x;y] x + y}
         ^
type error
> x
5
> y
`abc
> y:7
> x+y
12

\ goes up a level in the call stack (e.g., if in function f, go to the caller of f).

Another Example:

 fib:{[n] fib[n-1] + fib[n-2]}

 fib[5]
{[n] fib[n-1] + fib[n-2]}
^
error: stack
> 

To debug this, we can do several things. First, just query the variables, e.g.

> n
-193

We might realize that n should never be negative when computing fib.

k reference

Until now, we have touched on only a few of the verbs and types. Here is a more systematic list. From what you understand already, these won't be hard to learn.

+  plus        flip     
-  minus       negate   
*  times       first    
%  divide      inverse  
&  min|and     where    
|  max|or      reverse
<  less        up       
>  more        down     
=  equal       group    
~  match       not      
!  dict|mod    key|enum 
,  concat      enlist   
^  except      null
$  pad|cast    string   
#  take|select count    
_  drop|delete floor    
?  find|rand   uniq|rand
@  index       type     
.  apply       value    

First, let's look at how to read this table. In each row, the binary meaning precedes the unary meaning.

Let's go through each in turn.

/ this is a comment

\\ if alone on a line exits the k session or a debugging environment

: gets

 x: 1 2 3 4 / gets indicates assignment

+ plus flip

 2 + 3
5

/ Unary + (transpose)

 +(1 2 3 4; 5 6 7 8)
1 5
2 6
3 7
4 8

- minus negate

 2 - 3
-1

/ Unary - (negation)

 - 3 4 5
-3 -4 -5

* times first

 4*5
20

/ Unary * (first in list)

 * 15 24 19 10
15

% divide inverse

 5 % 3
1.666667

! mod|div enum

/ The following is 14 mod 3 (it turns out to be convenient to put the modulus first)

 3 ! 14
2

/ Unary ! enumerates the integers below its argument (in this case the numbers 0, 1, 2, ... 19)

 !20
!20

& min|and (and, if 1 is interpreted as true and 0 as false) where

 5 & 3
3

 1 & 1
1

 1 & 0
0

/ Unary & (indexes where there is a non-zero)

 x: 4 8 9 2 9 8 4

 x = 9 / 1 will indicate a match
0010100b


 & x = 9 / locations in the list above that are 1
2 4

 x > 4
0110110b

 & x > 4
1 2 4 5

| max|or reverse

 5 | 3
5

/ Unary | reverses lists

 | 2 3 4 5 6
6 5 4 3 2

< less asc

 5 < 3 / returns 0 because false
0b
 3 < 5 / returns 1 because true
1b

/ Unary < says which order of indices gives data in ascending order

 x: 6 2 4 1 10 4
 xind: < x / index locations from smallest value to highest
 xind      / Notice that x[3] is 1, the lowest value in the list
3 1 2 5 0 4

 x[xind] / sort the values in ascending order
1 2 4 4 6 10

> more dsc

 5 > 3
1b
 3 > 5
0b

/ Unary > says which order of indices gives data in descending order

 x: 6 2 4 1 10 4
 xind: > x / index locations from highest value to smallest
 xind
4 0 2 5 1 3
 x[xind] / descending sorted order
10 6 4 4 2 1
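
The idea of sorting by index (often called grading) carries over to Python directly. In this sketch the names grade_up and grade_down are hypothetical; Python's sort is stable, so equal values keep their original relative order, which matches the index lists shown above.

```python
def grade_up(xs):
    """Indices that would arrange xs in ascending order."""
    return sorted(range(len(xs)), key=lambda i: xs[i])

def grade_down(xs):
    """Indices that would arrange xs in descending order."""
    return sorted(range(len(xs)), key=lambda i: xs[i], reverse=True)

x = [6, 2, 4, 1, 10, 4]
print(grade_up(x))                    # [3, 1, 2, 5, 0, 4]
print([x[i] for i in grade_up(x)])    # [1, 2, 4, 4, 6, 10]
print(grade_down(x))                  # [4, 0, 2, 5, 1, 3]
```

Keeping the index list around is useful when several parallel lists (for example, table columns) must be reordered the same way.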

= equal group

 5 = 5
1b

 1 2 13 10 = 1 2 13 10 / element by element
1111b


 1 2 14 10 = 1 2 13 10 / 0 at position 2 indicates inequality
1101b

/ Unary = gives a dictionary that maps each value v to the indexes where v is present

 = 30 20 50 60 30 50 20 20
20|1 6 7
30|0 4  
50|2 5  
60|3    

/ In the above example value 20 is at positions 1 6 and 7
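
In Python terms, group builds a dictionary from each distinct value to the list of positions where it occurs. The sketch below reproduces the example above; the function name `group` and the choice to sort the keys (to match the displayed output) are assumptions of this illustration.

```python
# Rough Python model of k's unary = ("group").
def group(v):
    """Map each distinct value to the list of indexes where it occurs."""
    positions = {}
    for i, value in enumerate(v):
        positions.setdefault(value, []).append(i)
    # Sort by key so the result reads like the k output shown above.
    return dict(sorted(positions.items()))

g = group([30, 20, 50, 60, 30, 50, 20, 20])
print(g)  # {20: [1, 6, 7], 30: [0, 4], 50: [2, 5], 60: [3]}
```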

~ match not

 1 2 13 10 ~ 1 2 13 10 / should match
1b

 1 2 10 13 ~ 1 2 13 10 / does not match (order matters)
0b

/ Unary ~ (like the Boolean not operator except that any non-zero becomes zero)

 ~ 1
0b
 ~ 0
1b
 ~ 5
0b
 ~ -5
0b

, concat enlist

 x: 10 11 12 13
 y: 4 3

 x,y / concatenate one way
10 11 12 13 4 3

 y,x / concatenate the other
4 3 10 11 12 13

 z: y,x
 z[4] / can index these as if these were an array
12

 z[2+!3] / can fetch many indexes
10 11 12

/ Unary , converts a scalar (atom) into a list or a list into a deeper list

 x: 5
 x / x is an atom
5
 *x / first on an atom is just the atom itself
5

 x: ,5 / x is now a list

 x / Note the comma in front indicating that x is a list
,5

 y: x,15
 y / the comma goes away since two or more elements already form a list
5 15

 *y / The * operator takes the first element of a list
5

 *|y
15

^ cut

Takes a list and splits it up (in the example below, gets the values at indexes 0 1; 2 3 4 5; 6 7 8 9 10; the rest)

 x: 0 2 6 11
 y: 5 + !30

 x ^ y 
5 6                                                     
7 8 9 10                                                
11 12 13 14 15                                          
16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
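
The cut above can be read as slicing at a list of start positions, with each piece running up to the next start and the last piece running to the end. A minimal Python sketch of that reading follows; the helper name `cut` is hypothetical.

```python
# Rough Python model of cut: split v at the given start indexes.
def cut(starts, v):
    """Each piece runs from one start index up to (not including) the next."""
    ends = starts[1:] + [len(v)]
    return [v[s:e] for s, e in zip(starts, ends)]

y = [5 + i for i in range(30)]   # models  5 + !30
pieces = cut([0, 2, 6, 11], y)
for piece in pieces:
    print(piece)
```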

# take|shape count

 x: 10 20 30 40 50 60
/ scalar take vector to take portions of a vector with wrapping
 3 # x
10 20 30

 10 # x / Notice x is of length 6; result wraps
10 20 30 40 50 60 10 20 30 40

/ vector take scalar to create a matrix

 3 7 # 100
100 100 100 100 100 100 100
100 100 100 100 100 100 100
100 100 100 100 100 100 100
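
The two forms of take shown so far (a count on the left with wrap-around, and a shape on the left with an atom on the right) can be sketched in Python as below. The function names `take` and `reshape_atom` are invented for this illustration and cover only the cases in the examples above.

```python
# Rough Python model of two uses of k's binary #.
def take(n, v):
    """Take n items from v, wrapping around when n exceeds the length of v."""
    return [v[i % len(v)] for i in range(n)]

def reshape_atom(rows, cols, atom):
    """Build a rows-by-cols matrix filled with a single atom."""
    return [[atom] * cols for _ in range(rows)]

x = [10, 20, 30, 40, 50, 60]
print(take(3, x))    # [10, 20, 30]
print(take(10, x))   # [10, 20, 30, 40, 50, 60, 10, 20, 30, 40]
m = reshape_atom(3, 7, 100)
print(len(m), len(m[0]))  # 3 7
```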

/ With a vector on the left, # keeps the elements of the right vector that appear in the left vector, in the order they occur on the right
 "abcdef" # "I never thought I'd see you again"
"eedeeaa"

/ Unary # counts the elements of a list

 x
10 20 30 40 50 60

 # x / number of elements in x
6

 y: 9 8 5
 z: (x;y)
 z
10 20 30 40 50 60
9 8 5
 #'z / counts each list
6 3

_ drop|cut floor

 x: 10 20 30 40 50 60
 3 _ x / cut away 3 elements from the beginning
40 50 60

 10 _ x / x has only 6 elements, so dropping 10 leaves nothing
!0

/ !0 means an empty list

 x: 10 20 30 40 50 60 60 60 30 24
 30 35 60 _ x / delete from the right every element that appears on the left
10 20 40 50 24

/ Now unary _ is the floor operator

 15 % 4
3.75
 _ 15 % 4
3

$ cast|+/* string

 ` $ "abc" / cast string to symbol (name)
`abc

 . "18" / cast string to int
18
 . "18.2" / cast string to float
18.2

/ Unary form

 $ `abc / cast symbol to string
"abc"

? rand|find unique

/ Random as discussed in section 1

 10 ? 12 / with replacement (can be duplicates) from 0 to 11, but your results may differ because of the randomness
10 1 9 2 4 4 7 10 10 10


 15 ? 12 / there can be more elements (15) than the domain (0 to 11)
9 6 3 2 10 1 0 6 10 5 8 1 10 1 3

 -10 ? 12 / random and uniform without replacement (no duplicates)
9 1 2 4 3 0 7 11 10 5
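
The difference between a positive and a negative left argument is the difference between sampling with and without replacement. In Python's standard library the analogous calls are `random.choices` and `random.sample`; the sketch below is only an analogy, and the seed value is arbitrary.

```python
# Python analogy for  n ? m  (with replacement) and  -n ? m  (without replacement).
import random

rng = random.Random(0)                       # arbitrary seed, for repeatability only
with_repl = rng.choices(range(12), k=15)     # like  15 ? 12 : duplicates allowed
without_repl = rng.sample(range(12), 10)     # like  -10 ? 12 : all values distinct

print(with_repl)
print(without_repl)
```

Because 15 exceeds the size of the domain, the first draw must contain duplicates, while the second never does.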

/ With list as left argument we can find the index of the first match

 40 20 30 10 20 30 ? 30
2

/ Unary rand for atoms (scalars)

 rand 7 / 7 random numbers, uniform between 0 and 1
0.6717299 0.7831465 0.5664253 0.7210294 0.928477 0.6923788 0.7408487

/ Unary ? for arrays removes duplicates and sorts them

 ? 40 20 30 10 20 30
10 20 30 40

@ at type

 x: 40 20 30 10 20 30
 @[x;2 4 5]
30 20 30

 x / unchanged
40 20 30 10 20 30
 @[x; 2 4 5;: ;  -17 -12 -8]
40 20 -17 10 -12 -8
 x / still unchanged
40 20 30 10 20 30
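
The point of the examples above is that the four-argument form returns a modified copy and leaves x alone. A rough Python model of that behavior is sketched here; the name `amend` is an assumption of this illustration.

```python
# Rough Python model of  @[x; indexes; :; values]  (returns a copy, x is untouched).
def amend(v, indexes, new_values):
    """Return a copy of v with v[indexes[j]] replaced by new_values[j]."""
    out = list(v)                  # copy, so the original list is not modified
    for i, value in zip(indexes, new_values):
        out[i] = value
    return out

x = [40, 20, 30, 10, 20, 30]
y = amend(x, [2, 4, 5], [-17, -12, -8])
print(y)  # [40, 20, -17, 10, -12, -8]
print(x)  # [40, 20, 30, 10, 20, 30]
```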

 f: {[x] x * x}

/ NOT YET IMPLEMENTED

 @[x; 2 4 5; f] / squares locations 2 4 and 5
40 20 900 10 400 900

 x / still unmodified
40 20 30 10 20 30

/ Unary @ finds the type of an object

 @ 18
`i

 @ "18"
`C

 @ `abc
`s

. dot value

/ Unary . Can evaluate a string

 . "18 + 5"
23
 . "f: {[x] x * x * x}"

 f[5]
125

in (membership test)

 5 in 10 20 30 5 6 9 10
1b

 5 in 10 20 30 5 6 5 10 / even if there are two instances, still return 1
1b

 5 in 10 20 30 50 6 15 10
0b

bin (binary search; assumes ascending sorted order)

Not yet implemented. Arthur probably knows.
 x: 5 * (3 + !10)
 x
15 20 25 30 35 40 45 50 55 60

 x bin 33 / index of the last element that is less than or equal to 33
3

 x bin 35 / index of the last element that is less than or equal to 35 (here, equal)
4
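
The results above match what Python's `bisect` module gives when asked for the last position whose element is less than or equal to the target. The sketch below shows that correspondence; the name `k_bin` is made up, and the value it returns for a target smaller than every element (-1) is a property of this model, not a claim about k.

```python
# Rough Python model of bin using the standard bisect module.
from bisect import bisect_right

def k_bin(sorted_list, target):
    """Index of the last element <= target in an ascending list (-1 if none)."""
    return bisect_right(sorted_list, target) - 1

x = [5 * (3 + i) for i in range(10)]   # models  5 * (3 + !10)
print(k_bin(x, 33))  # 3
print(k_bin(x, 35))  # 4
```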

within

(a one-element right-hand list gives only an upper bound; a two-element right-hand list gives a closed lower bound and an open upper bound)

 x: 40 10 20 23 15 16 18
 x within 12 20 / between 12 (inclusive) and 20 (exclusive)
0000111b


 x within 16 23  / between 16 (inclusive) and 23 (exclusive)
0010011b
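
The closed-lower, open-upper convention is the same half-open interval that Python's chained comparison `lo <= v < hi` expresses. A small sketch reproducing both results above (the helper name `within` mirrors the k name but is otherwise hypothetical, and only the two-element form is modeled):

```python
# Rough Python model of within with a two-element right argument.
def within(v, bounds):
    """True where lo <= item < hi (lower bound inclusive, upper bound exclusive)."""
    lo, hi = bounds
    return [lo <= item < hi for item in v]

x = [40, 10, 20, 23, 15, 16, 18]
a = within(x, (12, 20))
b = within(x, (16, 23))
print("".join(str(int(t)) for t in a))  # 0000111
print("".join(str(int(t)) for t in b))  # 0010011
```

Note that 20 is excluded from the first result and 23 from the second, because the upper bound is open.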

/ No unary version

Not yet implemented. Arthur doesn't plan on it?

find (substring search for an exact match)

 x: "abcdef"
 x find "cde" / look for beginning and length of match
,2 3

 x find "bd" / if there is no match, the result is an empty list
0#,Ø Ø

 y: x, x
 y
"abcdefabcdef"

 y find "cde" / get all matches as a list
2 3
8 3

like (string match with wildcards)

 x: "abcdef"
 x like "ab"
0

 x like "ab*" / allows wildcards
1

 x like "*def"
1

 x like "*bcd*" / arbitrary length wildcards with *
1

 x like "?bcd*" / ? is a single character substitution
1

abs (absolute value)

 abs -3.2
3.2
 abs -3.2 4 5.3
3.2 4 5.3

log (natural log, also known as ln or log base e)

 log 8
2.079442

exp (exponential on e)

 exp 1
2.718282

 exp 3
20.08554

 log 20.08554
3f

sin (takes its argument in radians)

 mypi: 3.14 / a crude approximation of pi
 sin[mypi] / should be approximately 0
0.001592653

 sin[mypi % 2] / sin pi/2 is 1, so this is close
0.9999997

cos (also takes its argument in radians)

 cos[mypi] / close to -1
-0.9999987

 cos[mypi %2]
0.0007963267

These examples are going to start from a database of trades.

 n: 100
 secid: n ? (`goog;`facebook;`ibm;`msft)
 price: 100 + n ? 200
 vol: 10 + n ? 1000
 time: !n

Find all trades such that the price is over 175.

 ii: & price > 175
 z: secid[ii] ,' price[ii] ,' vol[ii] ,' time[ii] / Your results may differ because of randomness
 z[5]
facebook
199     
417     
9   

Find the high and low of each security. Your results may differ because of randomness.

 mydict: =secid
 names: ? secid
 maxmin: {[name] name , (|/price[mydict[name]]), (&/price[mydict[name]])}
 maxmin'names
facebook 295 104
goog     295 104
ibm      265 104
msft     294 105
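
The same group-then-reduce pattern, written out in Python for comparison. The tiny data set here is invented so that the result is deterministic; it is not the random data generated above, and the function name `high_low` is hypothetical.

```python
# Rough Python model of the high/low query: group indexes by security, then reduce.
def high_low(secid, price):
    """Return {name: (high, low)} over the trades of each security."""
    by_name = {}
    for i, name in enumerate(secid):           # models  =secid
        by_name.setdefault(name, []).append(i)
    return {
        name: (max(price[i] for i in idx), min(price[i] for i in idx))
        for name, idx in sorted(by_name.items())
    }

# Made-up sample data, for illustration only.
secid = ["goog", "ibm", "goog", "msft", "ibm"]
price = [120, 250, 180, 101, 199]
result = high_low(secid, price)
print(result)  # {'goog': (180, 120), 'ibm': (250, 199), 'msft': (101, 101)}
```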

Determine the finishing time of each task if you do them in earliest-deadline-first order.

 n: 10
 taskid: !n
 tasktime: 2 + n ? 20
 deadlines: 40 + n ? 50

 tasktime / Your results may differ because of randomness
17 13 18 7 15 5 3 19 12 9


 deadlines
49 55 78 61 75 66 78 66 51 44

/ Put them all together

 taskid,'tasktime,'deadlines
0 17 49
1 13 55
2 18 78
3  7 61
4 15 75
5  5 66
6  3 78
7 19 66
8 12 51
9  9 44

/ Now determine the order of indexes that puts the deadlines in ascending order.

 inddead: < deadlines
 deadlines[inddead]
44 49 51 55 61 66 66 75 78 78

/ Put the tasks in the same order

 taskid[inddead]
9 0 8 1 3 5 7 4 2 6

/ Put the task times in the same order

 tasktime[inddead]
9 17 12 13 7 5 19 15 18 3

/ Put all of them in the same order

 taskid[inddead],'tasktime[inddead],'deadlines[inddead]
9  9 44
0 17 49
8 12 51
1 13 55
3  7 61
5  5 66
7 19 66
4 15 75
2 18 78
6  3 78

/ Find end point if tasks are executed in this order

 +\tasktime[inddead]
9 26 38 51 58 63 82 97 115 118

/ Determine which deadlines are met (1) and which aren't (0)

 deadlines[inddead] > +\tasktime[inddead]
1111110000b

/ Determine which taskids have their deadlines met

 taskid[inddead][& deadlines[inddead] > +\tasktime[inddead]]
9 0 8 1 3 5
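
For comparison, the whole earliest-deadline-first calculation can be written in Python using the same task times and deadlines printed above. This is a sketch of the same steps (grade the deadlines, reorder, running sum, compare), with variable names chosen for this illustration.

```python
# Python version of the earliest-deadline-first example, using the data shown above.
from itertools import accumulate

tasktime = [17, 13, 18, 7, 15, 5, 3, 19, 12, 9]
deadlines = [49, 55, 78, 61, 75, 66, 78, 66, 51, 44]

# Task ids in order of increasing deadline (ties keep their original order).
order = sorted(range(len(deadlines)), key=lambda i: deadlines[i])

# Finishing time of each task when run back to back in that order.
finish = list(accumulate(tasktime[i] for i in order))

# A deadline is met when it is strictly later than the finishing time.
met = [i for i, f in zip(order, finish) if deadlines[i] > f]

print(order)   # [9, 0, 8, 1, 3, 5, 7, 4, 2, 6]
print(finish)  # [9, 26, 38, 51, 58, 63, 82, 97, 115, 118]
print(met)     # [9, 0, 8, 1, 3, 5]
```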
