q/kdb+ general notes

############################################ 
####     q/kdb+  (serious business)     #### 
############################################ 
 
## 
##  installation 
## 
 
(1) install q/kdb+      # URL https://kx.com/download 
(2) install rlwrap      # readline wrapper - it's a tool to let you call back and edit prev lines 
                        # e.g. http://macappstore.org/rlwrap/ 
 
then you wanna put something like this in your shell profile/config. 
alias q='/usr/local/bin/rlwrap -r ~/q/m32/q' 
 
$q 
q)         # here is the q console 
q)\\       # two backslashes "\\" means exit 
 
 
# syntax highlighter for emacs 
 
https://github.com/eepgwde/kdbp-mode/blob/master/kdbp-mode.el 
 
# then put something like this in your .emacs config (note your host needs to have q installed first) 
(setq load-path (cons "/path/to/your/emacs/config/dir" load-path))  # you put kdbp-mode.el in dir 
(load "kdbp-mode") 
 
 
## 
##  env var 
## 
 
QHOME : points to the dir where you put your bootstrap q.k file. 
        if not specified, it goes to $HOME/q 
QLIC : your licence file location. if not defined, fall back on $QHOME, then $HOME/q 
QINIT : name of a file that gets executed immediately after q.k 
        if not specified it executes q.q  if found in $QHOME or $HOME/q 
 
 
### 
###  resources for studying q/kdb+ 
### 
 
http://code.kx.com/q4m3/           # the textbook : Q for Mortals 
http://code.kx.com/q/cookbook/     # cookbook for common tasks 
http://code.kx.com/q/interfaces/   # interface to KDB in other languages 
 
note: most of the examples below are taken from q4m3 above. 
 
### 
###  intro 
### 
 
q : a language 
- interpreted, not compiled 
- dynamically typed 
- table oriented (as opposed to traditional OOP) 
-- a table can be seen as a list of dictionaries. 
-- Q table is column-oriented (as opposed to row-oriented). Q tables are column lists in contiguous storage and operations apply to entire columns. (wink, fast, wink) 
-- (recall) SQL tables, in comparison, are row-oriented. operations apply to fields within a row. rows of data kept across distributed storage. (wink, slow, wink) 
- "ordered" list: in SQL, both rows and each column content are unordered (more like sets). in q, lists/rows/column_content are ordered (and usually in contiguous mem space, thus much faster operations) 
kdb+ : in-memory database (consists of serialized q column lists) with persistent backing, in which data manipulation is done with q. 
     - can be persisted onto disk. 
     - only load columns / partitions as needed. this gives performance advantage. 
goal : 1. expressiveness, 2. speed, 3. efficiency 
 
why q/kdb+ ? 
- traditionally, you used to build a DB in Sybase/DB2/MySQL/Oracle, then you build retrieval/uploader interface in Perl/Python, then do analysis in R/Matlab. 
- now you can do all of that in q/kdb+. may require R/Matlab for analytics part, but being able to combine DB and script parts alone gives you enormous edge. 
- kdb+ update is performed in single thread. this gives you speed advantage because you don't have to worry about resource locking. (but we can still do massively parallel processing using map reduce functionality) 
 
in summary, 3 defining characteristics of q 
- vector programming 
- functional programming 
- table is a native (first class) data type 
 
history 
- functional programming language 
-- APL, A, J, K, Q                   // of course, there are others like Haskell, Prolog, F#, LISP, Scala 
-- Q is a wrapper on top of K. 
 
## 
##  variables / assignment / order of evaluation 
## 
 
q)a:17      # you assigned the value 17 to a variable named "a" 
q)a         # so ":" is the assignment operator. FYI "=" is used as equality test 
17          # the usual guidelines apply. e.g. make names meaningful (not too short/long), use noun for data, use verb for function 
q)          # NOTE: underscore "_" has a syntactic meaning, it's a built-in operator. so avoid using it as part of var/func names. 
q) a : 49   # whitespace is ok, use it for readability. 
q)a 
49 
q)a:3.14    # now we changed the var data type from int to float. be cognizant when you change data types of variables. 
 
q)foo:3+bar:4     # evaluation order is from Right to Left. 
q)bar             # this assigned 4 to bar, then 3+bar to foo 
4                 # this can lead to concise neat code in q, but also potentially hard-to-read 
q)foo 
7 
 
q)5*2+4     # notice, order of eval is from right-to-left, so we get 30, instead of 14 here 
30 
 
q)(5*2)+4   # you can do this, but generally, you are encouraged to reorder your equation instead. 
14 
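
The reordering advice above can be sketched as follows. This is a minimal illustration, written as a q script with "/" comments; the variable names are made up for the example.

```q
/ q has no operator precedence: every expression is read right to left
withParens:(5*2)+4     / parentheses force 5*2 first
reordered:4+5*2        / same value, no parentheses needed
withParens~reordered   / 1b, both are 14
```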
 
 
### 
###  error msg in q 
### 
 
q)foo     # undefined variable 
'foo      # error msg is a single quote followed by some message. usually not helpful at all. 
 
here is the list of known error in q   http://code.kx.com/q/ref/error-list/ 
 
e.g. 
 
q)123 * `ibm 
'type 
 
other common ones i see include 
 
'nyi     // not yet implemented. (maybe not defined/implemented yet) 
'branch  // if[], do[], while[], $[] cannot contain more than 255-byte code. (i.e. too long) 
         // it is not easy to guess how many bytes your code becomes after compilation 
         // so make it shorter or a real fix is replace your branching with lambda 
'local   // too many local variables (23 max) 
'global  // too many global variables 
'params  // too many function parameters 
'length  // incompatible length. e.g.   q)1 1 1 + 2 3 
'constants  // too many constants (96 max) 
'assign     // when you try assign to a reserved word.   e.g. csv:"duh" 
 
 
if[0 > count[foo]; 'fooNegativeValueError];       // this works 
if[0 > count[foo]; '`fooNegativeValueError];      // you can give symbol like this instead 
if[0 > count[foo]; '`$"foo cannot be negative"];  // also works 
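
A signalled error can also be caught by the caller. Below is a minimal sketch using protected evaluation .[f;args;handler]; safeDiv is a hypothetical helper invented for this example, not something from q4m3.

```q
/ hypothetical helper: signals a custom error when the divisor is zero
safeDiv:{[x;y] if[y=0; '"divide by zero"]; x%y}

/ .[f;args;handler] is protected evaluation: the handler receives the error text
ok:.[safeDiv;(10;4);{"caught: ",x}]       / 2.5
bad:.[safeDiv;(10;0);{"caught: ",x}]      / "caught: divide by zero"
```

Without the handler, the second call would drop you into the usual 'divide by zero error at the console.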
 
 
## 
##  comment : a single fwd-slash "/" 
## 
e.g. 
q)a:17   / here is a comment 
 
q)a:17/  error, you need whitespace before the "/" 

 
# to comment out multiple lines, use slash and backslash 
 
a:123 
b:`ibm 
/
here 
all 
commented out 
\
c:12 34 56 
d:`foo`bar 
 
 
## 
##  functional/declarative programming lang VS procedural/imperative programming lang 
## 
 
intuitively, the former = what to do,  the latter = how to do. 
q aspires to be the former - "a paradigm that treats computation as the evaluation of mathematical functions and avoids changing state and mutable data" 
the latter makes explicit references to the state of the execution environment. 
 
here, let me quote wikipedia: (ref) https://en.wikipedia.org/wiki/Functional_programming 
 
"In functional code, the output value of a function depends only on the arguments that are passed to the function, so calling a function f twice with the same value for an argument x produces the same result f(x) each time; this is in contrast to procedures depending on a local or global state, which may produce different results at different times when called with the same arguments but a different program state. Eliminating side effects, i.e., changes in state that do not depend on the function inputs, can make it much easier to understand and predict the behavior of a program, which is one of the key motivations for the development of functional programming." 
 
q is not a purely functional language, because q allows side effects (i.e. functions can change the state of variables that are defined outside the function scope). 
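
To make the side-effect point concrete, here is a small sketch. The names addOne, counter and bump are made up for illustration.

```q
/ pure: the result depends only on the argument
addOne:{x+1}

/ impure: :: inside a function amends the global named counter (a side effect)
counter:0
bump:{counter::counter+1; counter}

addOne 41      / 42 every time
bump[]         / 1
bump[]         / 2, same call, different result
```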
 
 
## 
##  what separates pretenders and contenders 
## 
 
1. adverbs 
2. general application 
3. functional forms 
 
 
######################################### 
####     data types / structures     #### 
######################################### 
 
### 
###  atoms / data types       # atom (aka scalar) - you can compare 
### 
 
type    size  charType  example                        numType  null_value 
-------------------------------------------------------------------------- 
boolean    1     b      1b                             1          0b     # 1b or 0b 
byte       1     x      0x26                           4          0x00 
 
char       1     c      "a"                            10         " "    # a single ascii char, with double quotes. e.g. "a", "b", "_" 
symbol     *     s      `ibm                           11         `      # kind of like string. starts with a backtick e.g. `ibm 
 
short      2     h      34h                            5          0Nh    # 16-bit signed integer 
int        4     i      34i                            6          0Ni    # 32-bit signed integer 
long       8     j      34j                            7          0Nj    # 64-bit signed integer (sometimes you see 0N instead) 
 
float      8     f      3.4                            9          0n     # 8-byte IEEE float (known as "double" in other language) 
real       4     e      3.4e                           8          0Ne    # 4-byte IEEE float (known as "float" in other language) 
 
date       4     d      2000.01.01                     14         0Nd 
timespan   8     n      12:00:00.000000000             16         0Nn    # so it covers milli-micro-nano seconds 
time       4     t      23:59:59.042                   19         0Nt 
timestamp  8     p      2015.01.01D00:00:00.000000000  12         0Np    # timestamp = date + timespan 
month      4     m      2009.11m                       13         0Nm 
minute     4     u      23:59                          17         0Nu 
second     4     v      23:59:59                       18         0Nv 
enumeration                                            20~76 
dictionary                                             99 
table                                                  98 
function                                               100 
nil ::                                                 101               # often you have to parenthesize (::) to avoid it being interpreted as assignment 
 
(note) numType is the output you see when you use operator "type" 
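
A quick sketch of what "type" returns for a few of the rows above. Note the sign convention: an atom reports the negative of its numType, a simple list reports the positive value, and a general list reports 0h.

```q
type 42            / -7h   long atom
type 1 2 3         / 7h    simple long list
type `ibm          / -11h  symbol atom
type "abc"         / 10h   char list (string)
type 2000.01.01    / -14h  date atom
type (1;`a;"x")    / 0h    general list
```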
 
 
## 
##  structures 
## 
 
list           # can be homogenous (aka "simple" list aka vector, containing elems of the same type) or of different data types. 
dictionary     # a pair of lists (keys and values) 
table          # aka flipped column dictionary. aka list of records/dictionaries. 
 
NOTE: take a moment to appreciate that, fundamentally, there are only two data types: atoms and lists. because a dict is just paired lists, and a table is just a list of dictionaries. 
 
### 
###  integer 
### 
 
- all signed. 
- long is default. 
- if you use short or int, and go over its max,min then automatically the type gets promoted to its wider type. (unless it's part of a homogenous list of atoms of wider type - this is to enforce a type in columns in tables. you don't want the column type to suddenly change when updating/inserting an elem) 
 
q)42j      // long 
42 
 
q)42i      // int 
42i 
 
q)42h      // short 
42h 
 
 
### 
###  floating point 
### 
 
- float: 8 byte IEEE floating point. (called double in traditional languages) 
- real:  4 byte IEEE floating point. (called float in other languages, yes it's confusing) 
 
e.g. "float" 
 
q)-1.30     # default type is float, not real 
-1.3 
q)3.14 
3.14 
q)34f 
34f 
q)100.      # see a dot "." at the end signifies float 
100f 
 
note: you can use exponent 'e' as well. (not to be confused with data type suffix for 'real') 
 
q)1.23e7     # equivalent to 1.23e+7 
1.23e+07     # equivalent to 1.23e07 
             # equivalent to 1.23e+07 
             # e7 means 10^7 = 10000000, thus 1.23 * 10000000 
q)1.23e-7    # e-7 means 1 / 10^7 = 1 / 10000000 
1.23e-07 
 
 
e.g. "real" 
 
q)1.23e 
1.23e 
q)-1.23e 
-1.23e 
q)1.23e7e    # notice the use of e 
1.23e+07e 
 
 
NOTE: equality test of two float numbers is a tricky business in IEEE floating point spec. 
      in q, "=" is tolerant: x=y gives 1b when the relative difference is within about 2 xexp -43 (i.e. the two floats agree to roughly 12 decimal places) 
      for an exact comparison, use 0=x-y 
e.g. 
 
q)x:1.0000000000001 
q)y:1.0000000000002 
q)x=y 
1b 
q)0=x-y        // 0b here, because the tolerance is relative: only an exactly zero difference equals 0 
0b 
 
q)x:1.000000000001 
q)y:1.000000000002 
q)x=y 
0b 
 
 
### 
###  boolean 
### 
 
1b or 0b     # there is no string representation like True, False in python 
 
note: boolean is promoted to unsigned integer in arithmetic operation. 
 
e.g. 
 
q)42+1b 
43 
 
q)42.5+1b 
43.5 
 
q)3.14+-0b 
3.14 
 
==> this can be useful, as conditional. 
e.g 
 
q)flagPaid:1b 
q)sendMoney:100*flagPaid 
q)sendMoney 
100 
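
The same trick works on whole lists, since a comparison returns a boolean list. A minimal sketch with made-up data:

```q
prices:10 200 30 400
isBig:prices>100        / 0101b
prices*isBig            / 0 200 0 400 : small values are zeroed out
sum isBig               / counts the 1b flags, here 2
```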
 
 
### 
###   byte 
### 
 
unsigned 8-bit value, represented in hex. 
 
e.g. 
 
q)0x1a 
0x1a 
 
q)0x1A    # upper case works also but lower case is the norm. 
0x1a 
 
q)0+0x1a  # just like boolean, it gets promoted to unsigned int when applied to arithmetic operation. 
26 
 
q)1+0x1a 
27 
 
 
### 
###  GUID : globally unique identifier 
### 
 
 
### 
###  char 
### 
- an ascii byte char. 
 
q)"f"     # denoted by a single char in double quotes. 
"f" 
 
note: escape char is backslash "\" 
e.g. 
 
q)"\""      # double quote 
"\"" 
 
q)"\\"      # backslash itself 
"\\" 
 
q)"\n"      # newline 
"\n" 
 
q)"\r"      # return 
"\r" 
 
q)"\t"      # tab 
"\t" 
 
 
### 
###   symbol 
### 
 
- a symbol is an atom holding text. 
- denoted with a backtick + text 
 
q)`ibm 
`ibm 
 
note: symbol != string. in q, string refers to a list of chars. 
e.g. 
 
q)`a~"a" 
0b 
 
note: what if we want to create a symbol that contains whitespace and backtick ? 
     - you cast a list of chars as below 
e.g. 
q)`$"An example of symbol with whitespace and backtick `" 
`An example of symbol with whitespace and backtick ` 
 
 
############  Temporal data types ############### 
 
### 
###  "date" type 
### 
 
- denoted as yyyy.mm.dd 
- under the hood, it is just a four-byte signed integer, where 2000.01.01 == 0 
e.g. 
 
q)2000.01.01 
2000.01.01 
 
q)2000.01.01=0 
1b 
 
q)2000.01.01=1 
0b 
 
q)2000.01.02=1 
1b 
 
q)1999.12.31=-1 
1b 
 
q)2000.01.01+3       # you can increment like an integer 
2000.01.04 
 
q)`int$2000.01.25    # you can obtain the underlying int by casting 
24i 
 
q)1985.1.3           # error, leading zero is required for mm.dd 
'1985.1.3 
 
NOTE: valid date range is 1709.01.01 to 2290.12.31   (ref) https://code.kx.com/q/ref/datatypes/#temporal 
      (may expand in future version of q) 
  e.g. 
q)2290.12.31 
2290.12.31        # valid 
 
q)9999.12.31 
'9999.12.31       # error 
 
q)`date$9999.12.31    # you cannot even cast like this 
'9999.12.31           # still error (kx may improve this implementation in the future) 
 
q)`date$2921939      # but casting on the underlying integer still works 
9999.12.31           # valid 
 
 
### 
###  "time" type 
### 
 
- denoted as hh:mm:ss.mmm    # i.e. covers millisecond precision 
- a four byte signed integer, where incrementing by 1 means adding 1-millisecond 
 
q)12:34:56.789 
12:34:56.789 
 
q)12:00:00.000=12*60*60*1000    # it's true because 12:00:00 is 12*60*60*1000 milliseconds 
1b 
 
q)`int$00:00:00.000    # you can obtain the underlying int by casting. i.e. how many milliseconds since 00:00:00.000 
0i 
 
q)`int$12:00:00.000 
43200000i              # this is simply 12*60*60*1000 (milliseconds) 
 
 
q)12:00:1.000     # leading zero is required for mm:ss 
'12:00:1.000 
 
q)12:00:01.1      # trailing zero can be omitted 
12:00:01.100 
 
 
### 
###  "timespan" type 
### 
 
- denoted as 0Dhh:mm:ss.mmmuuunnn    # i.e. covers nano second precision 
- a long (a signed 8-byte integer), where incrementing +1 means +1 nano second, and 00:00:00.0 being 0 
 
q)12:34:56.123456789     # "0D" is optional 
0D12:34:56.123456789 
 
q)12:34:56.123456        # trailing zero can be omitted 
0D12:34:56.123456000 
 
q)`long$00:00:00.0       # cast to long to know underlying integer representation 

 
q)`long$12:34:56.123456789 
45296123456789 
 
q)`time$12:34:56.123456789  # you can extract "time" portion by casting 
12:34:56.123 
 
### 
###  "timestamp" type 
### 
 
- it's a lexical combination of "date" and "timespan", concatenated with "D" 
- a long (a signed 8-byte integer), where millennium = 0, then you increment or decrement by nanosecond 
 
q)2014.11.22D17:43:40.123456789    # notice "D" 
2014.11.22D17:43:40.123456789 
 
q)`long$2014.11.22D17:43:40.123456789      # as usual, you can obtain the underlying integer by casting to long 
469993420123456789 
 
q)`date$2014.11.22D17:43:40.123456789      # you can extract "date" portion by casting to "date" 
2014.11.22 
 
q)`timespan$2014.11.22D17:43:40.123456789  # similarly, you can extract "timespan" portion by casting 
0D17:43:40.123456789 
 
q)`time$2014.11.22D17:43:40.123456789      # similarly, you can extract "time" portion by casting 
17:43:40.123 
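
Going the other way also works: a timestamp can be put together from its parts, and subtracting two timestamps gives a timespan. A small sketch (variable names are illustrative):

```q
d:2014.11.22
t:0D17:43:40.123456789
ts:d+t                                            / date + timespan gives a timestamp
ts~2014.11.22D17:43:40.123456789                  / 1b
elapsed:ts-2014.11.22D17:00:00.000000000          / timestamp - timestamp gives a timespan
elapsed~0D00:43:40.123456789                      / 1b
```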
 
### 
###  "datetime" type 
### 
 
- deprecated. use timestamp instead. 
- underlying value is float (not integer), and gives unpredictable results when using 'datetime' type in a join, or as a key. 
 
q)`datetime$2014.11.22D17:43:40.123456789 
2014.11.22T17:43:40.123 
 
q)"z"$2014.11.22D17:43:40.123456789 
2014.11.22T17:43:40.123 
 
$ date +%s 
1532306455 
 
q)"Z" $ "1532306455" 
2018.07.23T00:40:55.000 
 
 
### 
###  "month" type 
### 
 
- denoted as e.g. 2014.09m    # NOTICE trailing "m"   (because otherwise it cannot be distinguished from float) 
- a signed 4-byte integer under the cover, 2000.01m being 0 
 
q)2012.04m     # don't forget the trailing "m" 
2012.04m       # without it, q silently gives you a float 
 
q)1999.12m=-1 
1b 
 
q)2000.02m=1 
1b 
 
q)2001.01m=12 
1b 
 
q)2000.03m+1 
2000.04m 
 
q)`int$2012.04m    # cast to int to obtain the underlying int representation 
147i 
 
note: feels illogical in terms of underlying int representation, but the first day of month == the month 
e.g. 
 
q)2012.04m=2012.04.01 
1b 
 
### 
###  "minute" type 
### 
 
- denoted as hh:mm 
- a signed 4-byte integer under the hood. 00:00 being 0 
 
q)12:45 
12:45 
 
q)00:13=13 
1b 
 
q)00:13+4 
00:17 
 
q)00:13+120 
02:13 
 
q)00:13-14 
-00:01 
 
q)`int$12:00 
720i 
 
note: again, feels illogical in terms of underlying integer representation, but you can do this. 
 
q)12:00=12:00:00.000 
1b 
q)12:00=12:00:00.000000000 
1b 
 
q)`minute$12:01:02.345     # extracting "minute" 
12:01 
 
### 
###  "second" type 
### 
 
- denoted as hh:mm:ss 
- a signed 4-byte integer under the hood, 00:00:00 being 0 
 
q)00:00:04=4 
1b 
q)00:00:04=5 
0b 
q)00:00:04+3 
00:00:07 
q)00:00:04-9 
-00:00:05 
 
q)`second$12:34:56.789    # cast to "second" to extract 
12:34:56 
 
note: again, feels illogical in terms of underlying integer representation, but the below holds. 
 
q)12:34:56=12:34:56.000 
1b 
 
q)12:34:56.000=12:34:56.000000000 
1b 
 
q)`int$12:34:56 
45296i 
 
q)`int$12:34:56.000 
45296000i 
 
q)`long$12:34:56.000000000 
45296000000000 
 
 
### 
###  extracting constituents 
### 
 
- cast! 
- in addition to the casting we already saw above, here are some more. 
 
q)dt:2014.12.31 
 
q)`year$dt     # "year" 
2014i 
 
q)`mm$dt       # "mm" 
12i 
 
q)`dd$dt       # "dd" 
31i 
 
 
q)ti:12:34:56.789 
 
q)`hh$ti     # "hh" 
12i 
 
q)`mm$ti     # "mm" 
34i 
 
q)`ss$ti     # "ss" 
56i 
 
- How do we extract milli & nano second portion ? 
- simply use "mod" 
 
q)(`int$12:34:56.789) mod 1000 
789 
 
q)(`long$12:34:56.123456789) mod 1000*1000*1000     # notice how parenthesis is necessary 
123456789 
 
## 
##  dot notation to extract parts 
## 
 
alternative to casting, you can use dot notation like below. 
 
q)a 
2018.12.16D17:25:48.469042000 
 
q)a.year 
2018i 
 
q)a.month 
2018.12m 
 
q)a.week 
2018.12.10 
 
q)a.date 
2018.12.16 
 
q)a.minute 
17:25 
 
q)a.time 
17:25:48.469 
 
q)a.mm       // month 
12i 
q)a.dd       // day 
16i 
q)a.hh       // hour 
17i 
q)a.uu       // minute 
25i 
q)a.ss       // second 
48i 
 
NOTE: dot notation doesn't work on function params/local variables. so stick with explicit casting. 
 
e.g. 
 
q){x.minute} .z.P 
'x.minute 
 
q){`minute$x} .z.P 
17:34 
 
### 
###   infinity and null 
### 
 
- in q, both infinity and null are actual (specially reserved) values (not the absence of mem alloc or null ptr) 
- this is good in a way, because you can treat inf & null like any other values. but be careful about types. 
  (to avoid having to remember null literal for each data type, there is a builtin function "null" - examples below) 
- just memorize.   (char "w" was chosen as it looks kind of infinity. duh) 
 
0w     # positive float inf      // +inf 
-0w    # negative float inf      // -inf    IEEE defines these 
0n     # null float (aka NaN)    // NaN 
 
// unfortunately, unlike IEEE float, there is no standard for int, so we just use real values as below 
 
0W     # positive long inf    // aka +9223372036854775807  (just underlying numeric representation) 
-0W    # negative long inf    // aka -9223372036854775807  (because of this, we have the following property) 
0N     # null long (aka 0Nj)  // aka -9223372036854775808       adding 1 steps through: pos int -> 0W -> 0N -> -0W -> neg int 
 
e.g. 
 
q)0W + -1 0 1 2 3 
9223372036854775806 0W 0N -0W -9223372036854775806        // very illustrative example 
 
q)0N + 1    # NOTE: operating on null gives null 
0N 
 
q)avg 12 0N 34    # null is ignored 
23f 
 
q)34 % 0    # recall division result is always float in q 
0w 
 
q)-34 % 0   # just following actual math definition of division by zero 
-0w 
 
q)0 % 0     # undefined, thus NaN 
0n 
 
q)42<0W 
1b 
 
q)-0W<42 
1b 
 
// NOTE: you will often deal with null comparison in kdb. beware of the result. 
e.g. 
students:select from people where age < 18;   // what if the age is null for some students ? 
 
q)0N < 123      // see how you get 1b in this case. because recall null is a big negative number 
1b 
q)0N <= 123 
1b 
q)0N > 123 
0b 
q)0N >= 123 
0b 
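
One way to keep nulls out of such a filter is to test for them explicitly. A minimal sketch on a plain list (ages is made-up data; the same condition can be used in a where clause):

```q
ages:12 0N 25 17
ages<18                         / 1101b : the null age passes the filter
(ages<18)&not null ages         / 1001b : nulls excluded explicitly
where (ages<18)&not null ages   / 0 3, the indices of the real minors
```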
 
 
Let's review null representation for each data type. 
- notice some types (e.g. boolean, char) don't have space for null, so it uses 0b, " " etc 
- we denote such types with * below 
 
 type         null 
-------------------- 
boolean*       0b 
guid*          0Ng (00000000-0000-0000-0000-000000000000) 
byte*          0x00 
short          0Nh 
int            0Ni
long           0N      # same as 0Nj
real           0Ne 
float          0n 
char*          " "     # a whitespace char, also an empty list of char 
sym            ` 
timestamp      0Np 
month          0Nm 
date           0Nd 
datetime       0Nz     // deprecated, use timestamp instead 
timespan       0Nn 
minute         0Nu 
second         0Nv 
time           0Nt 
 
 
note: testing for null can be done with "=" operator but it is painful, because you have to be careful about the data types and have to remember the exact right null literal. 
      so here is a nice type-independent null-check function "null" 
 
q)null 34 
0b 
 
q)null ` 
1b 
 
q)null " " 
1b 
 
q)null "" 
`boolean$() 
 
q)null 0b 
0b 
 
q)null 0n 
1b 
 
q)null 0N 
1b 
 
q)null 0Ni 
1b 
 
q)null 0Nj 
1b 
 
 
### 
###  "type" operator 
### 
 
see the operator section 
 
 
###################################### 
####    data structures - list    #### 
###################################### 
 
list           # can be homogenous (containing elems of the same type) or of different data types 
dictionary     # essentially a pair of lists 
table          # a list of dictionaries aka flipped column dictionary. 
 
 
a list in q 
- an ordered collection of data 
- each item directly accessible by index 
- dynamically allocated array 
 
a "simple" list : a list of atoms of homogenous/uniform type. (aka vector in math) 
                 - underlying implementation is contiguous mem array data storage. (fast) 
 
a "general" list: a list of diff data types. 
                 - underlying implementation is contiguous mem array of pointers. (still ok fast) 
 
 
q)(1; 2; 3)     # syntax is parenthesis, and semi-colon 
1 2 3           # q interprets a simple list, and simplifies the representation 
                # simple list can be written without parentheses and semi-colons 
 
q)1 2 3~(1; 2; 3) 
1b 
 
q)("a"; 1234; `ibm)    # a general list 
"a" 
1234 
`ibm 
 
q)("a"; "b"; "c")      # a simple list 
"abc" 
 
q)((1; 2; 3); (`1; "2"; 3); 4.4)    # a general list whose first elem is a simple list 
1 2 3 
(`1;"2";3) 
4.4 
 
q)list_foo:1 2 3     # list can be assigned to a variable just like atoms 
q)10 * list_foo 
10 20 30 
 
 
q)1 2 ~ 1 2 
1b 
 
q)1 2 ~ 2 1      # order matters 
0b 
 
q)count ((1; 2; 3); (`1; "2"; 3); 4.4)     # counts the given list (outer most list) 
3 
 
q)count 12 34 56 
3 
 
q)ticker_list:`ibm`goog`aapl      # simple list 
q)count ticker_list 
3 
 
 
q)count 34     # note: counting atom gives 1 
1 
 
q)count ()     # empty list 
0 
 
q)first 12 34 56    # takes the first elem. basically the same as  1#12 34 56  (but take operator returns a list) 
12 
 
q)last 12 34 56    # returns the last item.  i.e. first reverse 12 34 56 
56 
 
 
### 
###  simple list 
### 
 
note: q converts a list to a simple list whenever possible. this can bite you: if you remove items from a general list so that only atoms of one type remain, q turns it into a simple list, and from then on indexed assignment of any other data type fails with 'type. 
 
 
q)1 2 3h    # in this notation, you indicated "every" item is short. not just the last item. 
1 2 3h 
 
q)(100i; 200i; 300i) 
100 200 300i           # notice the notation 
 
 
q)(1; 2; 3i)    # here, this is a general list of long, long, int 
1 
2 
3i 
 
 
q)1.2 4.0 7.9f    # notice .0 gets suppressed  (it's still float) 
1.2 4 7.9 
 
q)1.0 2.0 3.0     # notice how q displays it with "f" suffix ONLY for the last item for a simple list when it's all float 
1 2 3f 
 
q)10 20 30f 
10 20 30f         # all items are float 
 
q)10 20 30.       # equivalent 
10 20 30f 
 
q)1 2 3f~1.0 2.0 3.0 
1b 
 
q)1.1 2 3.3~1.1 2.0 3.3 
1b 
 
 
q)(0b;1b;0b;1b;1b)   # boolean simple list is denoted like this 
01011b               # notice NO whitespace between items 
                     # then one trailing 'b' 
q)01011b 
01011b 
 
q)count 01011b 
5 
 
q)(0x20;0xa1;0xff)   # similarly, 'byte' simple list consolidates items together 
0x20a1ff             # notice one preceding '0x' 
 
q)0x20a1ff~(0x20;0xa1;0xff) 
1b 
 
 
q)(`ibm; `msft; `aapl; `sbux)   # similarly, 'symbol' simple list juxtaposes items together 
`ibm`msft`aapl`sbux             # notice NO whitespace 
 
q)`ibm`msft`aapl`sbux 
`ibm`msft`aapl`sbux 
 
q)count `ibm`msft`aapl`sbux 
4 
 
 
q)("k"; "e"; "n"; "i"; "c"; "s")   # a char simple list is called a "string" 
"kenics"                           # a string is a list, not an atom, so "=" compares item by item (and needs equal lengths) 
                                   # use match "~" to test whether two strings are identical 
 
q)"kenics"=("k"; "e"; "n"; "i"; "c"; "s") 
111111b 
 
q)"kenics"~("k"; "e"; "n"; "i"; "c"; "s") 
1b 
 
q)"kenics"="foobar" 
000000b 
 
q)"kenics"="foo" 
'length 
 
q)"kenics"~"foo" 
0b 
 
 
q)12:34 01:02:03      # temporal data types, it adjusts to the widest type and creates a simple list 
12:34:00 01:02:03 
                                       # another example 
q)01:02:03 12:34 11:59:59.999 09:25 
01:02:03.000 12:34:00.000 11:59:59.999 09:25:00.000 
 
q)01:02:03 12:34 11:52:54.234 09:25u    # you can force type with trailing type char 
01:02 12:34 11:52 09:25                 # notice "u" 
 
q)01:02:03 12:34 11:52:54.234 09:25v    # notice "v" 
01:02:03 12:34:00 11:52:54 09:25:00 
 
 
### 
###  empty list 
### 
 
q)emptyList:()         # general empty list 
q)count emptyList 
0 
q)emptyList 
q) 
q)type emptyList 
0h 
 
q)emptyListLong:0#123    # here is a neat way to create an empty simple list 
q)emptyListLong          # you forced the type, so only long can be added 
`long$() 
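
A short sketch of how to check which kind of empty list you have. "type" tells the general empty list apart from typed ones; the values used with take (0#) are arbitrary.

```q
type ()            / 0h   general empty list
type `long$()      / 7h   empty long list
type 0#`ibm        / 11h  empty symbol list, built with take
type 0#1.5         / 9h   empty float list
count 0#1.5        / 0
```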
 
 
### 
###  a list of a single item    (aka a singleton) 
### 
 
q)(34)        # this resolves to an atom 
34 
 
q)type (34)   # indeed 
-7h 
 
q)(34+7)      # because if you do this, then you have (41) which resolves to 41 
41 
 
q)enlist 34   # here is how to create a singleton 
,34 
 
q)1#34        # another way 
,34 
 
q)34,()       # another way 
,34 
 
q),34         # you cannot do this 
',            # error 
 
q)type enlist 34 
7h 
 
q)alist:enlist 34 
q)alist 
,34 
q)alist,77 
34 77 
 
q)"a"      # this is atom 
"a" 
 
q)enlist "a"    # this is singleton 
,"a" 
 
q)"b",enlist "a" 
"ba" 
q)"b","a" 
"ba" 
 
q)enlist (10 20 30; `a`b`c; "a"; "b")    # btw, enlist works on multiple input items 
10 20 30 `a`b`c "a" "b" 
 
q)count enlist (10 20 30; `a`b`c; "a"; "b")     # notice how it created a singleton where its element is a list 
1 
 
## 
##  how to ensure enlist'ed-ness 
## 
 
(1)  enlist[x]    // notice this returns a nested list if x is already a list 
(2)  1#x 
(3)  (),x         // or x,()       // this is very powerful, and used a lot 
                                   // effectively enlist[x] if x is atom, and otherwise does nothing 
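
A minimal sketch of idiom (3) wrapped in a function, so that code downstream can always assume a list. asList is a hypothetical helper name.

```q
asList:{(),x}            / hypothetical helper name
asList 34                / ,34       atom becomes a singleton
asList 10 20 30          / 10 20 30  a list is returned unchanged
count each (asList 34; asList 10 20 30; enlist 10 20 30)   / 1 3 1
```

The last line shows the difference from enlist, which wraps an existing list one level deeper.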
 
### 
###  indexing - list access 
### 
 
q)(100; 200; 300)[0] 
100 
 
q)100 200 300[0] 
100 
 
q)L:100 200 300 
q)L[0] 
100 
 
q)L[1] 
200 
 
q)L[5]    # NOTE: out of bound indexing gives you a null of the same type as the "first" item of the list 
0N        # hence long null here 
 
q)L:100 200 300i 
q)L[5] 
0Ni       #  another example. this is int null 
 
q)L3:(`ibm; 2; 3.3) 
q)L3[0W] 
`         #  another example. this is symbol null 
 
q)L2:1.1 2.2 3.3 
q)L2[-1]  #  another example. this is float null. NOTICE you cannot use -1 indexing unlike Perl/Python. 
0n 
 
q)L:100 200 300 
q)L[0]:34        # indexed assignment 
q)L              # NOTE: for index assignment on simple list, q is inflexible in widening/narrowing the data type. 
34 200 300       # you must provide the exact correct data type. 
                 # e.g. 
q)L[0]:34h       # trying to put short into a simple list of long type 
'type            # error 
 
q)L:100 200 300 
q)L[]            # this lets you get the entire list 
100 200 300 
q)L 
100 200 300 
q)L[::]            # :: denotes the nil item, which is the same as above L[] 
100 200 300        # note: type of :: doesn't match any other data type 
                   #      i.e. if you include :: in a list, the list becomes general list, which people may take advantage of. 
q)L:(1; 2; 3; `a)  # here is a perfect example from q4m3 
q)L[3]:4 
q)L                # q automatically converted L to a simple list 
1 2 3 4 
q)L[3]:`a 
'type 
 
q)L:(:: ; 1 ; 2 ; 3; `a)   # here, by always keeping :: in the list, the list never becomes simple list 
q)L[4]:4                   # possibly you may wanna use this trick in the future 
q)L[4]:`a 
q) 
 
 
## 
##  list with expressions 
## 
 
q)a:123                 # as below, you can construct list out of variables and other lists. 
q)b:456 
q)L:(a; b) 
q)L 
123 456 
 
q)L1:1 2 3 
q)L2:34 56 
q)L3:(L1; L2) 
q)L3 
1 2 3 
34 56 
q)L4:(count L1; sum L2)    # even run some functions like this.  nothing surprising here. 
q)L4 
3 90 
 
q)a:123            # NOTE: you CANNOT denote a simple list with variables. 
q)b:456            # have to write in a general list form. i.e. (a; b) 
q)L:a b 
'Cannot write to handle 123. OS reports: Bad file descriptor 
 
 
### 
###  list concatenation 
### 
 
use 
- join "," 
- merge "^" 
 
 
### 
###  join "," operator 
### 
 
q)1 2 3,4 5 
1 2 3 4 5 
 
q)1,2 3 4      # ok to specify atom 
1 2 3 4 
 
q)1 2 3,4 
1 2 3 4 
 
q)1,2 
1 2 
 
q)1 2 3, `ibm`msft     # if you join diff type lists, then you get general list 
1 
2 
3 
`ibm 
`msft 
 
q)(),34           # combining with an empty list. another way to produce a singleton 
,34               # just like  enlist 34 
 
 
### 
###  merge "^" operator      // aka fill 
### 
 
q)L1:10 0N 30 
q)L2:100 200 0N 
q)L1^L2             # overlays right-list onto left-list, unless null in right-list 
100 200 30 
 
q)L1:10 0N 30 456 
q)L2:100 200 0N 
q)L1^L2             # two lists must be of the same length 
'length 
 
note: if x is an atom, then you may interpret it as "null replacer" for y 
 
e.g. 
 
q)-99999 ^ 1 2 3 0N 4 5 6 
1 2 3 -99999 4 5 6 
 
q)`foo ^ `ibm``msft`aapl 
`ibm`foo`msft`aapl 
 
### 
###  fills[x]       // uniform 
### 
 
x is a list. 
 
q)fills 0N 12.34 12.35 0N 12.33 0N     // replaces each null with the preceding non-null value 
0n 12.34 12.35 12.35 12.33 12.33       // useful when you want to know last trade price after binning, etc 
 
note: to fills[x] backward, you can just reverse fills reverse x 
e.g. 
q)reverse fills reverse 0N 2 3 0N 0N 9 0N 
2 2 3 9 9 9 0N 
 
// normally you wanna fill fwd first and then fill backward any remaining null when cleaning data 
// i.e. reverse fills reverse fills x 
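
e.g.   (a small sketch, reusing the list from the backward-fill example above)
q)reverse fills reverse fills 0N 2 3 0N 0N 9 0N     // fwd fill first, then the leading null is filled backward
2 2 3 3 3 9 9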
 
 
### 
###  nested list  &  depth indexing 
### 
 
q)L4:enlist 1 2 3 4    # notice what enlist does. here it creates a singleton in which the only item is a simple list 1 2 3 4 
q)count L4
1
q)L4 
1 2 3 4 
q)L4[0] 
1 2 3 4 
 
q)L3:1 2 3 4    # notice the diff 
q)count L3
4
q)L3[0]
1
 
 
q)m:(1 2 3 4; 10 20 30 40; 100 200 300 400)    # 3-by-4 matrix 
q)m 
1   2   3   4 
10  20  30  40 
100 200 300 400 
 
q)m[0]       # this is important to know. if you only specify the first depth 
1 2 3 4      # and not specify the rest, you get the whole list like this in this case. 
 
q)m[1] 
10 20 30 40 
 
q)m[1][2]    # called iterated indexing 
30 
 
q)m[1;2]     # this works also (called depth indexing), same as m[1][2] 
30           # notice separator is a semicolon ";" not a comma as in numpy's m[1,2]
 
q)m:(1 2 3 4; ((10 20 30 40) ; (11 22 33 44) ) ; 5 6 7 8 ) 
 
q)m[1][1][3] 
44 
 
q)m[1;1;3]     # another example 
44 
 
NOTE: the obvious question is why have both m[1][2] and m[1;2]  ? 
     --> q lets you assign only with depth indexing. 
e.g. 
 
q)m[1][1][3]:777     # error 
'assign 
 
q)m[1;1;3]:777       # this works, in contrast 
q)m 
1 2 3 4 
(10 20 30 40;11 22 33 777) 
5 6 7 8 
 
NOTE: in general, in q world, depth-indexing is a preferred style over iterated indexing. 
      because it can be more informative. 
e.g.  compare the below two. depth-indexing tells you more explicitly what you are specifying. 
m[3]     # iterated indexing 
m[3;;]   # depth indexing 
 
 
### 
###  list indexing 
### 
 
instead of accessing by each item individually, we can access a list of items. 
 
q)L:100 200 300 400 
 
q)L[1 3]      # notice the syntax. you specify a list of indices 
200 400 
 
q)L[(1; 3)]   # this is obviously the same thing 
200 400 
 
q)L[3 0 2 1 0]          # see how powerful this is. (1) you can change index order, (2) you can even repeat the same index 
400 100 300 200 100 
 
q)01101011b[0 2 4] 
011b 
 
q)"hello world"[0 6 6 10] 
"hwwd" 
 
 
q)L:100 200 300 400 
 
q)idx:0 2 0 
 
q)L[idx]          # you can use variables to access items like this 
100 300 100 
 
NOTE: to further generalize indexing, you can retrieve an arbitrary shape of indexing. 
      see the below example. 
e.g. 
 
q)L:100 200 300 400    # L is a simple list 
q)L[(0 1; 2 3)]        # here you specified a list index 
100 200                # first item is (0; 1) so it returns 100 200 
300 400                # 2nd item is (2; 3) so it returns 300 400 
 
===> in other words, you get the output which is the same shape as the index. 
     let's look at more elaborate example. 
e.g. 
 
q)L2:(1 2 3 4; 11 22 33 44)    # 2-by-4 matrix
 
q)L2[1 0 0]      # here the first "1" refers to 11 22 33 44 
11 22 33 44      # because you don't specify any index within 11 22 33 44, you got the whole list 
1  2  3  4       # similarly you get these two lists by "0 0" 
1  2  3  4 
                 # let's actually specify the index for the 2nd depth 
q)L2[1 0 0; 3]   # here you only specify "3", so you get atom 
44 4 4           # then q converts the output into a simple list 
 
q)L2[1 0 0; 3 2]  # take time to appreciate this 
44 33 
4  3 
4  3 
 
q)L2[1 0 0; 0 1 3 2 1]     # you can do this 
11 22 44 33 22             # this indexing, output should be intuitive if you followed the above. 
1  2  4  3  2 
1  2  4  3  2 
 
q)L:100 200 300 400       # you can re-assign values by list-indexing like this 
q)L[1 2 3]:2*L[1 2 3]     # see, it's powerful. 
q)L 
100 400 600 800 
 
q)L:100 200 300 400 
q)L[3 0 1]:777 666 555    # you can define the non-sequential index number order like this. 
q)L 
666 555 300 777 
 
q)L[0 0 0]:12 59 34      # if you do this, then the latest assignment dominates. 
q)L 
34 555 300 777 
 
q)L[0 3 2]:81           # NOTE this behavior. this is a common q behavior. 
q)L                     #                     atom is extended to match a list. 
81 555 81 81 
 
 
### 
###  simplified notation for list-indexing (aka juxtaposition) 
### 
 
q)L:100 200 300 400 
 
q)L[3] 
400 
 
q)L 3           # see this simplified notation, aka juxtaposition 
400 
 
q)L[3 0 1] 
400 100 200 
 
q)L 3 0 1       # juxtaposition (you don't have to use this, but this does help reduce density) 
400 100 200     #               (and some experts use this all the time, so you have to know this notation) 
 
q)idx:3 0 1 
 
q)L idx         # another juxtaposition example 
400 100 200 
 
q)L ::               # recall :: is nil/identity,  L[] == L[::] == L 
100 200 300 400 
 
q)null[::] 
1b 
q)null (::)      # you need the parenthesis 
1b 
 
### 
###  rand[x]       // 2 overloads 
### 
 

# (1)  x is positive numeric atom 

 
returns a numeric atom randomly chosen from [0,x) 
 
e.g. 
 
q)rand 5.0 
4.8925 
 
q)rand 5 

 

# (2)  x is a list 

 
returns an item randomly chosen from x 
e.g. 
 
q)rand 12 34 56 78 
34 
 
NOTE: don't confuse case 1 & 2. 
 
e.g. 
 
q)rand 1 2 3 4f        // if you want to apply to each elem of x, then you need "each" like below 
3f 
 
q)rand each 1 2 3 4f 
0.07347808 0.6319052 1.023146 3.447189 
 
 
#### 
####  x ? y   operator 
#### 
 
?[x;y] has 4 overloads.     // also, there are vector conditional eval ?[v;expr_true;expr_false] 
                            //             and functional select/exec ?[t;c;b;a] 
                            // but we look at those later 
## 
##  (1)  x=a_list ? y=any_data_object     # called "find" 
## 
 
returns the first index where y appears in x 
 
e.g. 
 
  [0][1][2][3][4][5][6] 
q)18 99 12 34 56 34 78 ? 34    # returns the index of the FIRST match, which is [3], not [5]
3
 
q)11 22 33 44 ? 95       # if no match, then it returns the length of the list 
4                        # as if it's the index of the item if you append it to the list 
 
q)(11 22 33 44, 95) ? 95    # like this
4
 
q)11 22 33 ? 33 11       # this runs find "?" on each item of the right operand to check against the left operand list 
2 0                      # you just have to know this behavior. 
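
NOTE: because a miss returns count[x], find doubles as a membership test.
      a small sketch (L is just an illustrative name):

q)L:11 22 33 44
q)L ? 33 95
2 4
q)(L ? 33 95) < count L      // 1b where the item was found
10b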
 
 
## 
##  (2.1)  x=positive_int_atom ? y=numeric_atom     # called "roll" 
## 
 
return a LIST of x items chosen from [0,y)       (with replacement) 
 
e.g. 
 
q)5 ? 3         # note: repetition is allowed. 
0 1 2 1 2 
 
q)5 ? 3.14 
1.221636 1.229445 0.2550793 2.941396 0.8735862 
 
q) 1 ? 10 
,6              # a list 
 
q) 1 ? 5.0      # still a one-item list.  first 1 ? 5.0 is equivalent to rand 5.0
,2.248655
 
q)1+a - a:.z.t      # suppose you want to add a random delay, upto 1 sec 
00:00:00.001 
q)1000+a - a:.z.t 
00:00:01.000 
q)1 ? 1000+a - a:.z.t 
,00:00:00.587 
q)first 1 ? 1000+a - a:.z.t       # this is the way to go 
00:00:00.869 
 
 
## 
##  (2.2) x=pos_int_atom ? y=list_of_any_data_type     #  "roll" from a list 
## 
 
returns x items randomly chosen from y     (with replacement) 
 
e.g. 
 
q)3 ? `ibm`msft 
`ibm`msft`ibm 
 
q)3 ? `ibm`msft`aapl`dell`foo`bar 
`aapl`dell`foo 
 
 
## 
##  (3.1)  x=negative_int_atom ? y=int_atom_where_y>=abs(x)      # called "deal"  (it's essentially "roll without repetition")
## 

returns a LIST of abs(x) integers chosen randomly from [0,y), WITHOUT repetition, hence the condition y>=abs(x)
 
e.g. 
 
q) -15 ? 100    // 15 distinct integers from [0,100) 
42 74 76 21 41 82 34 33 92 59 47 4 87 1 70 
 
q) -15 ? 15     // first 15 integers in random order 
7 10 11 13 2 12 0 3 9 1 8 14 5 6 4 
 
q) (asc -15?15) ~  asc -15?15 
1b 
 
## 
##  (3.2)  x=negative_int_atom ? y=list_of_any_data_type      #  "deal" from a list 
## 
 
returns a list of abs[x] items randomly chosen from y, WITHOUT repetition. hence you need to satisfy count[y] >= abs[x]
 
q)-3 ? `ibm`msft`aapl`dell`foo`bar 
`foo`dell`ibm 
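
NOTE: dealing count[y] items from y is a shuffle (a random permutation).
      a small sketch, with L and s as illustrative names:

q)L:`ibm`msft`aapl`dell
q)s:(neg count L) ? L        // all 4 items, in random order
q)(asc s) ~ asc L            // same items, whatever the order
1b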
 
 
// NOTE:  here is a powerful trick of random deal operator. 
 
q)-10?`3                                   // you can create random symbol universe like this 
`bli`iob`iep`anp`cdf`klo`oka`jmi`gff`pmo 
 
q)-5 -5 ?' `3`4              // if you want length 3 and 4, then like this 
nkj  phc  lbd  coj  mgp 
kdlh mfcg ijga eikl padg 
 
q)raze -10 -5 ?' `3`4        // better raze like this 
`ggb`bnn`clh`jmb`alf`mke`bel`din`jpj`okk`mekp`hkib`pock`fhle`afpl 
 
NOTE: you can use this trick only for lengths `1 to `8. AND it only samples the letters "a" to "p"
q).Q.a 
"abcdefghijklmnopqrstuvwxyz" 
 ^^^^^^^^^^^^^^^^ 
  only these 
 
q)do[10; tickers,:`$3?.Q.a]     // if you really want to cover all 26 chars, then you can do this 
q)tickers 
`wvt`ojl`xgj`piq`nvx`eac`mbx`jiy`bbe`qdi    // but this potentially contains dupes 
                                            // maybe we need something more elaborate like below 
q).Q.a cross .Q.a cross .Q.a   // 
"aaa" 
"aab" 
"aac" 
"aad" 
"aae" 
"aaf" 
"aag" 
"aah" 
"aai" 
"aaj" 
"aak" 
"aal" 
"aam" 
"aan" 
"aao" 
"aap" 
"aaq" 
"aar" 
.. 
 
q)count .Q.a cross .Q.a cross .Q.a 
17576 
 
q)-5 ? .Q.a cross .Q.a cross .Q.a     // deal from here 
"ygr" 
"lwc" 
"hvz" 
"rue" 
"crk" 
 
 
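NOTE: to get symbols rather than strings, cast the dealt strings with `$
      a small sketch (tickers is an illustrative name). deal guarantees no dupes.

q)tickers:`$(-5 ? .Q.a cross .Q.a cross .Q.a)
q)count distinct tickers
5
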
// 
// random exercise of ?[x;y]    roll & deal 
// 
 
q)2012.04.01 + -5 ? 30                                   // pick 5 unique days in Apr 2012 (30 days, so offsets 0..29)
2012.04.17 2012.04.08 2012.04.13 2012.04.26 2012.04.21

q)2012.01.01 + -5 ? 366                                  // pick 5 unique days in 2012 (a leap year, so 366 days)
2012.12.07 2012.07.24 2012.08.20 2012.01.24 2012.06.15
 
q)5 ? 24:00:00.000                                       // pick 5 random times of a day 
04:53:54.445 16:18:54.635 00:26:34.645 06:14:07.214 18:02:41.343 
 
q)-5 ? 24:00:00.000       // unfortunately, this doesn't work 
'type 
 
q)00:00:00.000 + -5 ? `long$24:00:00.000                         // but you can do this 
13:22:42.238 18:27:21.125 12:38:43.928 11:27:23.970 12:34:36.576 
 
q)a + 5 ? 24:00:00.000 - a:13:00:00.000                          // pick 5 random times after 13:00:00 
22:44:21.924 19:01:15.805 13:00:44.438 22:40:59.709 22:36:57.301 
 
 
## 
##  (4)  x ? y        // enum-extend 
## 
 
- x is a symbol list by name 
- y is a symbol list 
- returns y enumerated by x 
 
e.g. 
 
q)() ~ key `mysym 
1b 
 
q)`mysym ? `ibm`msft`aapl`ibm`nflx`ibm 
`mysym$`ibm`msft`aapl`ibm`nflx`ibm 
 
q)mysym 
`ibm`msft`aapl`nflx 
 
NOTE: x?y enum is smart, in that 
      - if x is not defined, creates x:distinct y 
      - if x is already defined, then x?y appends any missing symbols to x (existing items keep their index)
 
recall x$y and x!y require x already defined to cover all elems in y.  so x?y is more flexible 
BUT you still need to load any existing `mysym from disk before extending it. be careful. 
 
https://code.kx.com/q/ref/enums/#enum-extend 
http://code.kx.com/q4m3/7_Transforming_Data/#75-enumerations 
 
 
NOTE: lets take this moment to review enumeration operators 
 
1.  x$y    // x is a sym list by name (that must be defined to fully cover y), y is a symbol list 
2.  x?y    // x is a sym list by name (that may not exist yet, or partially exists), y is a symbol list 
3.  x!y    // x is a sym list by name (that must be defined to fully cover y), y is an int list (NOT a symbol list) where each int is an index 
 
==> they all return an enumerated list 
 
NOTE: mysym == by value,  `mysym == by name 
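
e.g.  the three operators side by side (a small sketch; u is an illustrative name)

q)u:`ibm`msft
q)`u$`msft`ibm`msft      // fine, u covers every item
`u$`msft`ibm`msft
q)`u$`aapl`ibm           // `aapl is not in u
'cast
q)`u?`aapl`ibm           // enum-extend appends the missing `aapl to u
`u$`aapl`ibm
q)u
`ibm`msft`aapl
q)`u!2 0 1               // here the right side is a list of indices into u
`u$`aapl`ibm`msft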
 
### 
###  elided(omitted) indices 
### 
 
q)m:(1 2 3 4; 100 200 300 400; 1000 2000 3000 4000) 
 
q)m                     # a typical 3-by-4  depth=2 matrix 
1    2    3    4 
100  200  300  400 
1000 2000 3000 4000 
                       # let's remember depth-indexing 
q)m[1;]                # not specifying means specifying everything 
100 200 300 400 
 
q)m[;3]                # see how powerful this can get 
4 400 4000             # this is "column" extraction 
 
q)m[::;3]              # alternative syntax 
4 400 4000 
 
q)m[1;]~m[1] 
1b 
 
#  more examples 
 
q)L:((1 2 3;4 5 6 7);(`a`b`c`d;`z`y`x`;`0`1`2);("now";"is";"the")) 
 
q)L 
(1 2 3;4 5 6 7) 
(`a`b`c`d;`z`y`x`;`0`1`2) 
("now";"is";"the") 
 
q)L[;1;] 
4 5 6 7 
`z`y`x` 
"is" 
 
q)L[;;2] 
3 6 
`c`x`2 
"w e" 
 
q)L[0 2;;0 1]     # see how powerful elided index can get, together with list indexing 
(1 2;4 5) 
("no";"is";"th") 
 
 
### 
###   rectangular list 
### 
 
a rectangular list = a list of lists of the same length. 
 
e.g. 
q) L:(1 2 3;4 5 6;7 8 9)          # a list of 3 lists of length 3 
 
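q)L:(1 2 3; (10 20; 100 200; 1000 2000))   # notice this is also a rectangular list
                                           # a list of 2 lists, each of length 3 (the items of the 2nd list are themselves lists)
q)L:(1 2 3; 10 20 30; 100 200 300)         # a plain 3-by-3 one, to show flip
q)L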
1 2 3 
10 20 30 
100 200 300 
 
q)flip L       # flip = transpose 
1 10 100 
2 20 200 
3 30 300 
 
### 
###  matrix 
### 
 
matrix is a special case of rectangular list, defined recursively. 
 
matrix of dimension 0  =  scalar 
matrix of dimension 1  =  vector (a simple list) 
matrix of dimension 2  =  a list of matrices of dimension 1 (i.e. a list of vectors)
matrix of dimension n  =  a list of matrices of dimension n-1
 
usually, you deal in 2 or 3 dim matrix. 
 
e.g.     #  a 2-3-2 matrix 
 
q)mm:((1 2;3 4;5 6);(10 20;30 40;50 60)) 
q)mm 
1 2   3 4   5 6 
10 20 30 40 50 60 
 
q)mm[0;;]             # same as mm[0] 
1 2 
3 4 
5 6 
 
q)mm[;1;] 
3  4 
30 40 
 
q)mm[;;1] 
2  4  6 
20 40 60 
 
 
e.g.    # you can do this simplified indexing. get used to it. 
 
q)show m:(1 2; 10 20; 100 200; 1000 2000) 
1    2 
10   20 
100  200 
1000 2000 
 
q)m 0 2 
1   2 
100 200 
 
q)m 0 2 0 1 
1   2 
100 200 
1   2 
10  20 
 
 
 
 
######################################################## 
##   operators (aka built-in functions, aka verbs)    ##     all built-in q operators are functions
######################################################## 
 
https://code.kx.com/q/ref/card/    # universal list of operators 
 
## 
##  vocab 
## 
 
a function is - 
 
niladic: takes zero arg 
monadic: takes one arg 
 dyadic: takes two args 
triadic: takes three args 
 
e.g.   til is a monadic function that takes a non-negative int 
 
q)til 10 
0 1 2 3 4 5 6 7 8 9 
 
note: if a function takes a list as an input arg, it's still a monadic function. we count the list as one arg, instead of counting individual elems in the list. 
 
"atomic" function - another perspective to look at a function is how it processes its input. if it works on atoms, then it's an atomic function. 
e.g. 
q)neg 3.14       # a typical atomic function. (it's also monadic in this case) 
-3.14 
 
q) 3 + 4         # atomic (also dyadic)
7

q)10 20 30 40 ? 20      # atomic only in its 2nd arg (it's also dyadic in this case)
1
 
 
## 
## what is the diff btwn q-built-in functions and your custom functions? 
## 
 
q built-in function: can be infix or prefix. can take non-alphanumeric chars. e.g. + # | 
your function: only alphanumeric, only prefix. 
 
q)3+5       #  infix way
8

q)+[3;5]    #  prefix way
8
 
## 
##  atomic function (automatically) extends to "list" input      (aka "foreach" in other languages) 
## 
 
q)neg -7
7
 
q)neg 1 2 3 
-1 -2 -3 
 
q)1 2 3+10 20 30     # when dealing with two lists, they must be of the same length 
11 22 33 
 
q)1 2 3+10 20 30 40 
'length 
 
q)neg (1 2 3; 4 5)    # it extends to nested lists also 
-1 -2 -3 
-4 -5 
 
q)(1 2 3; 4 5)+(100 200 300; 400 500) 
101 202 303 
404 505 
 
q)100 200 300+5      # it extends atom to match lists 
105 205 305 
 
q)5+100 200 300 
105 205 305 
 
 
### 
###  show 
### 
 
q)foo:17         # usually, assignment doesn't output anything 
q) 
q)show foo:17    # here is a way to show output 
17 
q) 
 
 
############   arithmetic ops   ############ 
 
 
q)2+3    # addition
5
 
q)2-9    # subtraction 
-7 
 
q)1.3*2  # multiplication 
2.6 
 
q)6%3    # NOTE: division operator is "%" not "/" which is for comment 
2f       # NOTE: division result is ALWAYS float 
 
 
note: we can use operator on list. (because atomic function extends to adapt to list input in "foreach" manner) 
e.g. 
 
q)100 200 300+42    # you can do  atom+list 
142 242 342 
 
q)42+100 200 300 
142 242 342 
 
q)100 200 300+1 2 3    # you can do  list+list  IFF they are of the same length 
101 202 303            # use count to verify before you add them 
 
q)1 2 3+1 2 3 4        # error, two lists not of the same length 
'length 
  [0]  1 2 3+1 2 3 4 
            ^ 
 
note: remember, the order of eval is right-to-left 
e.g. 
 
q)5*2+4 
30 
 
q)(5*2)+4   # you can do this, but generally, you are encouraged to reorder your equation instead. 
14 
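
e.g.  reordering instead of adding parentheses (read it right-to-left)
q)4+5*2     # 5*2 is evaluated first, then 4 is added
14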
 
 
### 
###  x mod y       //  atomic dyadic
### 
 
q)23 mod 10
3
 
q)23.45 mod 10 
3.45 
 
q)(`int$12:34:56.789) mod 1000    # a common technique to extract millisecond portion 
789 
 
q)11 22 33 mod 10          # again, you can do list mod atom 
1 2 3 
 
q)11 22 33 mod 10 21 32    # or list mod list 
1 1 1 
 
q)-7 mod 2     # -7 = (2 * -4) + 1
1
 
 
### 
###  x div y         // atomic dyadic 
### 
 
- integer division. same as  floor[x % y] 
 
q)7 div 2
3

q)7 div 2.5
2
 
q)-7 div 2    # floor -3.5 is -4 
-4 
 
q) -7 div 2.5 
-3 
 
q) 7 div -2    # same as -7 div 2 
-4 
 
q)3 4 5 div 2     # the usual atom list expansion 
1 2 2 
 
q)7 div 2 3 4 
3 2 1 
 
q)3 4 5 div 2 3 4 
1 1 1 
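
NOTE: div and mod fit together:  x = (y * x div y) + x mod y
      a quick check (x and y are just illustrative values):

q)x:-7
q)y:2
q)(x div y; x mod y)
-4 1
q)x = (y * x div y) + x mod y
1b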
 
 
### 
###  signum[x]        // atomic monadic 
### 
 
- returns the sign. an int  1i, -1i, 0i 
 
q)signum 3.14 1234 -3.14 0
1 1 -1 0i

q)signum (0n;0N;0Nt;0Nd;0Nz;0Nu;0Nv;0Nm;0Nh;0Nj;0Ne)     // returns -1 for null
-1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1i
 
q)signum      // implementation is simple 
{(x>0)-x<0} 
 
 
### 
###  abs[x]      // atomic monadic 
### 
 
q)abs 123 -123 3.14 -3.14
123 123 3.14 3.14
 
q)abs 0b 
0i 
 
q)abs 10 -43 0N       // returns null for null 
10 43 0N 
 
 
### 
###  neg[x]      // atomic monadic 
### 
 
q)neg 3.14 -1234 -3.14 0 
-3.14 1234 3.14 -0 
 
 
 
### 
###  floor[x],  ceiling[x]      // atomic monadic 
### 
 
q)floor 3.14
3
q)floor 5
5
q)floor -3.14      # returns the largest int less than or equal to the input 
-4 
 
q) x:3.145 
q) deci:100 
q)(floor x * deci) % deci     #  you can implement a decimal place trimmer like this
3.14 
 
q)ceiling 3.14
4
q)ceiling 3
3
q)ceiling -3.14     # returns the smallest int bigger than or equal to the input
-3 
 
q)a:123.456
q)a div 1          # floor
123f
q)(a + 1) div 1    # ceiling, but only when a is not a whole number (a:123 would give 124 here)
124f
 
 
### 
###  larger "|"  and smaller "&"     //  "and" == "&"   "or" == "|" 
### 
 
q)13|99   # returns larger 
99 
 
q)13 & 85   # returns smaller 
13 
 
q)42 & 98.12    # if either is float, then returns float 
42f 
 
notice: for booleans, they become logical OR and AND. (in fact, they can be written as "or" and "and")
e.g. 
 
q)1b|0b 
1b 
 
q)1b & 0b 
0b 
 
q)123 or 456   # duh 
456 
 
e.g.  list expansion as usual 
 
q)23 | 10 20 30 40 
23 23 30 40 
 
q)0101b & 0011b 
0001b 
 
q)"hello" | "fooba" 
"hoolo" 
 
q)asc `ibm`msft`aapl     // you can sort symbols 
`s#`aapl`ibm`msft 
q)`ibm | `aapl           // but cannot do this. duh 
'type 
 
 
 
### 
###  max[x], min[x], prd[x], sum[x]         // aggregate 
### 
 
q)sum 1+til 10       # same as +/ 
55 
q)prd 1+til 10       # same as */ 
3628800 
q)max 20 10 40 30    # same as |/ 
40 
q)min 20 10 40 30    # same as &/ 
10 
 
NOTE: beware of null values in these operations. 
 
e.g. 
 
q)min 123 0n 456    // did you expect 0n or 123f ?
123f 
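
e.g.  the other aggregates have their own null rules (a small sketch)
q)max 1 0N 3       // null is the smallest value, so it never wins
3
q)sum 1 0N 3       // null counts as 0
4
q)avg 1 0N 3       // null is skipped, so this is 4 % 2
2f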
 
 
### 
###  maxs[x], mins[x], sums[x], prds[x]       // uniform 
### 
 
q)sums 1 3 -4 21     # same as +\        # this is "cumulative sum" 
1 4 0 21                                 # e.g. useful to track cumu sum of each trade volume thruout the day. 
q)prds 1 3 -4 21     # same as *\        # update cvol:sums[tradeSize] from `trades
1 3 -12 -252 
q)maxs 1 3 -4 21     # same as |\        # can be used to compute high price 
1 3 3 21 
q)mins 1 3 -4 21     # same as &\        # low price 
1 1 -4 -4 
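
e.g.  distance from the running high (p is a made-up price list)
q)p:10 12 11 15 9
q)maxs p
10 12 12 15 15
q)p - maxs p
0 0 -1 0 -6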
 
 
### 
###  reciprocal[x]     // atomic monadic 
### 
 
- returns a float = 1.0 % x 
 
q)reciprocal 5 
0.2 
 
q)1 % 5         # same 
0.2 
 
 
### 
###  x xbar y         // atomic dyadic 
### 
 
- y is numeric 
- x is an integer 
- returns y rounded down to the nearest multiple of x 
 
q)3 xbar 14 
12 
 
q)3 xbar 14.68 
12f 
 
q)3 xbar til 20 
0 0 0 3 3 3 6 6 6 9 9 9 12 12 12 15 15 15 18 18 
 
--> this becomes powerful when you want to bin data into a particular interval. 
e.g. 
 
q)5 xbar 11:00 + 0 2 3 5 7 11 13 
11:00 11:00 11:00 11:05 11:05 11:10 11:10 
 
q)select last price, sum size by 10 xbar timeStamp.minute from trade where sym=`aapl
minute| price size 
------| -----------     // last price, and total size of each 10-min bin 
09:30 | 55.32 90094 
09:40 | 54.99 48726 
09:50 | 54.93 36511 
10:00 | 55.23 35768 
... 
 
q)select sym by 5 xbar close from daily where date=last date 
close| sym 
-----| ----------------------        // binning by close price 
25   | `sym$`AIG`DOW`GOOG`PEP,... 
30   | `sym$,`AAPL,... 
45   | `sym$`HPQ`ORCL,... 
... 
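
e.g.  counting items per bucket (a small sketch with made-up numbers)
q)10 xbar 3 12 15 27 21
0 10 10 20 20
q)count each group 10 xbar 3 12 15 27 21
0 | 1
10| 2
20| 2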
 
 
### 
###   deltas[x;y]  deltas[y]      // uniform, monadic|dyadic 
### 
 
same as   -': 
 
q)deltas 1 7 9 14       // monadic form assumes x is 0
1 6 2 5 
q)deltas[1; 1 7 9 14]   // you can explicitly set what the first elem gets diff'ed to. 
0 6 2 5 
 
q)y: 1 7 9 14 
q)deltas[first y; y]    // a very common use case 
0 6 2 5 
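
NOTE: in the monadic form, sums undoes deltas (and vice versa). a quick check:
q)y:1 7 9 14
q)y ~ sums deltas y
1b
q)y ~ deltas sums y
1b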
 
 
### 
###   ratios[y]  ratios[x;y]     // dyadic form means you can specify x as what the first elem gets divided by
###                              // monadic form means we assume that x is 1
 
q)(%':) 100 99 101 102 101               // each previous ': 
100 0.99 1.020202 1.009901 0.9901961 
 
q)ratios 100 99 101 102 101              // ratios == (%':) 
100 0.99 1.020202 1.009901 0.9901961 
 
q)ratios[first a; a:100 99 101 102 101]    // this first[x] trick is very common 
1 0.99 1.020202 1.009901 0.9901961 
 
 
### 
###   differ[y]  differ[x;y]          // monadic form assumes x is ::
### 
 
q)(~':) 1 1 1 2 2 3 4 5 5 5 6 6        # a popular derived function is  ~':  (match each-prior)
011010001101b 
 
q)not (~':) 1 1 1 2 2 3 4 5 5 5 6 6    # especially its negated version has its own name "differ" 
100101110010b 
 
q)differ 1 1 1 2 2 3 4 5 5 5 6 6       # differ == not (~':) 
100101110010b                          # notice the first elem is always 1 in this case 
 
q)L:1 1 1 2 2 3 4 5 5 5 6 6 
q)differ L 
100101110010b 
q)where differ L    # gives you the indices of where each uniq number starts 
0 3 5 6 7 10 
 
q)(where differ L) cut L     # splits into lists by numbers 
1 1 1 
2 2 
,3 
,4 
5 5 5 
6 6 
 
===>  suppose you pick the longest length lists from the above. 
e.g. 
q)runs:(where differ L) cut L 
q)ct:count each runs             # count each list length 
q)runs where ct=max ct 
1 1 1 
5 5 5 
 
 
 
#############   comparison operators   ############ 
 
 
### 
###  (in)equality test = <>          // notice this is XOR for boolean 
### 
 
compares two atoms. 
 
q)37=30+7  # "=" is for equality test 
1b         # the result is boolean 
q)3=4 
0b 
q)3.14=3.14
1b 
q)`abc=`abc   # symbol is atomic, so it can be compared like this 
1b 
 
q)123 <> 456     # same as  q) not 123=456 
1b 
q)123 <> 123 
0b 
q)not 123 = 123    # notice you can simply apply "not" to flip the result 
0b 
 
note: again, we can use operator on list 
e.g. 
 
q)100 100 100=100 101 102 
100b 
 
q)100<99 100 101 
001b 
 
note: for temporal data types, comparison can work in 2 ways (1) on underlying int rep, or (2) converts one to the same granularity as the other var, then compare. 
e.g. 
 
q)2000.01.01=2000.01.01D00:00:00.000000000     # here the operator converted 2000.01.01 to timestamp 
1b 
 
q)2015.01.01<2015.02m                          # here the operator converted 2015.02m to 2015.02.01
1b                                             #   q)`date$2015.02m 
                                               #   2015.02.01 
 
q)12:00:00=12:00:00.000 
1b 
 
 
### 
### comparison operators --- equal to or greater/less than   > < <= >= 
### 
 
q)123 <= 456 
1b 
 
q)123 >= 456 
0b 
 
q)"A" < "Z"    # works on underlying int in ascii table 
1b 
# symbols are compared in lexicographical order
q)`a > `b
0b 
q)`a < `b 
1b 
 
q)10 20 30<=30 20 10     # you can extend to list 
110b 
 
q)2 < 1 2 3 
001b 
 
 
NOTE: beware of null values. 
 
e.g. 
 
update foo:aVal from t where aVal <  bVal;   // suppose you are trying to assign smaller of two values 
update foo:bVal from t where aVal >= bVal;   // how do we treat null ? 
 
// recall the underlying representation of null 
 
q)0N < 123 
1b 
q)0N <= 123 
1b 
q)0N > 123 
0b 
q)0N >= 123 
0b 
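
e.g.  one way to take the smaller value while letting a non-null win over a null
      (a sketch; a and b stand in for the aVal and bVal columns)
q)a:1 0N 5 0N
q)b:3 2 0N 0N
q)a & b                // null is the smallest value, so it wins
1 0N 0N 0N
q)(a ^ b) & b ^ a      // fill each side from the other first, then take the smaller
1 2 5 0N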
 
 
### 
###   x ~ y    // "match" operator 
### 
 
compares two objects (can be non-atom). returns 1b if the same shape/type/value. 
 
note: this differs from the notion of match in other language where a match means two objects share the same underlying pointer addr. 
 
e.g. 
 
q)42~42
1b 
 
q)42~40+2 
1b 
 
q)42~42h       // notice a subtlety here. data type must match. 
0b 
 
q)42f~42.0 
1b 
 
q)42~`42 
0b 
 
q)2 4~2 4 
1b 
 
q)42~(4 2;(1 0)) 
0b 
 
q)(4 2)~(4;2*1) 
1b 
 
q)4 2~(4;2*1) 
1b 
 
q)(())~enlist () 
0b 
 
q)(1; 2 3 4)~(1; (2; 3; 4)) 
1b 
 
q)(1 2;3 4)~(1;2 3 4) 
0b 
 
q)42~(42)            # notice (42) becomes atom, not singleton list 
1b 
 
q)97="a"        # this is 1b because this char's underlying int is 97 
1b 
q)`long$"a" 
97 
 
q)t 
c1 c2 c3 
-------- 
a  12 10 
b  34 20 
c  56 30 
d  78 40 
 
q)t ~ t 
1b 
 
### 
###  not[x] 
### 
 
q)not 0 
1b 
 
q)not 1 
0b 
 
q)not 34 
0b 
 
q)not 12.345 
0b 
 
q)not 123 456 789 
000b 
 
q)not "abc" 
000b 
 
q)not `ibm     # not is not defined for symbols 
'nyi 
 
q)not 0.0 
1b 
 
q)not 0N       # null is not 0, thus not 0N returns 0b 
0b 
 
note: for temporal data types, 0 is midnight of millennium. 
 
q)not 2000.01.01 
1b 
 
q)not 2000.01.02 
0b 
 
NOTE: be cognizant of where you put 'not' in where phrase. (no pun intended) 
      basically you prepend to any binary predicates. 
e.g. 
 
select from t where not c1 in `ibm`aapl`msft 
select from t where not ((c2 >= 2015.03.14) | (c1 in `ibm`aapl`msft)) 
 
 
 
### 
###  null[x] 
### 
 
because remembering the exact null literal for each data type is painful, here is a nice type-independent null-check function. 
 
q)null 34 
0b 
 
q)null ` 
1b 
 
q)null " " 
1b 
 
q)null "" 
`boolean$() 
 
q)null 0b 
0b 
 
q)null 0n 
1b 
 
q)null 0N 
1b 
 
q)null 0Ni 
1b 
 
q)null 0Nj 
1b 
 
 
 
### 
###  flip[x] 
### 
 
transpose a matrix (or a rectangular list). i.e. flip rows and columns. 
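
e.g.  (a small sketch; note it reassigns m)
q)m:(1 2 3; 10 20 30)     // 2-by-3
q)flip m                  // 3-by-2
1 10
2 20
3 30
q)m ~ flip flip m         // flipping twice gets you back
1b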
 
 
### 
###  type[x] operator 
### 
 
remember "numType" values from the data types table. 
 
q)type 42 
-7h          # long's numType = 7 
             # negative "-" indicates atom 
             # "h" because the output of "type" operator is short 
 
q)type "a" 
-10h         # recall char's numType = 10 
 
q)type `ibm 
-11h         # recall symbol's numType = 11 
 
q)type 1 2 3 
7h           #  positive because the input was not atom 
 
 
q)type `a`b`v!10 20 30 
99h                      # dict numType = 99 
 
q)type (1; `ibm; 98) 
0h                       # general list numType = 0 
 
q)type {x} 
100h                     # function numType = 100 
 
q)type ([] c1:`a`b`c; c2:10 20 30) 
98h                                  # table numType = 98 
 
 
### 
###  til[x] 
### 
 
q)til 10 
0 1 2 3 4 5 6 7 8 9 
 
q)2 * 1 + til 5          # recall right-to-left eval order 
2 4 6 8 10 
 
q)3 # til 10 
0 1 2 
 
 
### 
###  distinct[x] 
### 
 
a monadic function that takes a list, and returns a distinct item list, in the order of occurrence. not sorted. 
 
q)distinct 2 3 2 2 1 5 4 5 
2 3 1 5 4                    # in the order of occurrence, not sorted
 
NOTE: it can be powerful in q-sql 
 
t : select distinct from t           #  this effectively removes dupe rows 
t : select distinct c1,c3 from t     #  get distinct combo of c1,c3  (you can add more columns) 
L : exec distinct c2 from t          #  another common use case 
 
 
### 
###  x union y 
### 
 
join x & y, then distinct it. 
i.e.  union[x;y] == distinct x,y 
 
q)1 2 3 4 union 3 4 5 6 
1 2 3 4 5 6 
 
q)distinct 1 2 3 4, 3 4 5 6 
1 2 3 4 5 6 
 
note: works on tables. assuming the same meta. 
 
q)t 
c1 c2 c3 
-------- 
a  12 10 
b  34 20 
c  56 30 
d  78 40 
 
q)t2 
c1 c2 c3 
-------- 
a  10 0 
b  20 1 
c  30 2 
a  40 3 
 
q)t union t2 
c1 c2 c3 
-------- 
a  12 10 
b  34 20 
c  56 30 
d  78 40 
a  10 0 
b  20 1 
c  30 2 
a  40 3 
 
q)distinct t,t2 
c1 c2 c3 
-------- 
a  12 10 
b  34 20 
c  56 30 
d  78 40 
a  10 0 
b  20 1 
c  30 2 
a  40 3 
 
 
### 
###  first[x]     // aggregate 
###  last[x] 
### 
 
returns first/last elem of x 
 
q)first `aapl`ibm`msft 
`aapl 
 
q)last 12 34 56 78       // i.e. first reverse 
78 
 
q)d 
a| 12  34  56 
b| 1.1 2.2 3.3 
c| 10  20  30 
 
q)first d      // if you want the first of each entry, then you need "each" like below 
12 34 56 
 
q)first each d 
a| 12 
b| 1.1 
c| 10 
 
q)t 
c1 c2 c3 
-------- 
a  12 10 
b  34 20 
c  56 30 
d  78 40 
 
q)first t 
c1| `a 
c2| 12 
c3| 10 
 
q)first each t 
`a`b`c`d 
 
 
### 
###  next[x] 
###  prev[x] 
###  x xprev y 
### 
 
q)next 12 34 56 78 
34 56 78 0N 
 
q)prev 12 34 56 78 
0N 12 34 56 
 
q)update gap:updateDate - prev updateDate by sym from refDB   // see how long after its previous update each symbol got updated in refDB
 
q)2 xprev 12 34 56 78 90    // just like prev, but you can specify x items previous 
0N 0N 12 34 56 
 
q)-2 xprev 12 34 56 78 90   // there is no xnext[x;y] 
56 78 90 0N 0N 
 
quiz: suppose you have a table with columns ticker,category,startDate,endDate
      same ticker can change category over time 
      collapse span if the gap is less than a week. 
 
 
### 
###  x sublist y 
### 
 
2 overloads 
 
//  use case (1)   x is an integer atom   -   then similar to  x # y 
 
q)3 # 12 34 56 78 90 
12 34 56 
 
q)3 sublist 12 34 56 78 90 
12 34 56 
 
q)30 # 12 34 56 78 90 
12 34 56 78 90 12 34 56 78 90 12 34 56 78 90 12 34 56 78 90 12 34 56 78 90 12.. 
 
q)30 sublist 12 34 56 78 90     // sublist doesn't wrap around. 
12 34 56 78 90 
 
 
//  use case (2)  x is a pair of integers 
 
q)3 2 sublist 12 34 56 78 90 99     // take 2 elems from y[3] spot 
78 90                               // i.e. take x[1] elems from y[x[0]] spot 
 
 
### 
###  x in y 
### 
 
q)`ibm in `aapl`ibm`msft`ms 
1b 
 
q)`gs in `aapl`ibm`msft`ms 
0b 
 
q)`gs`ibm`jpm in `aapl`ibm`msft`ms 
010b 
 
 
### 
###  x within y 
### 
 
q)1 3 10 6 4 within 2 6 
01011b 
 
q)select from t where tradeDate within 2018.04.01 2018.06.30 
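
NOTE: both ends of the range are included.
q)2 6 7 within 2 6
110b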
 
 
### 
###  x bin y      // returns the index i of the LAST item in x such that x[i] <= y
###  x binr y     // returns the index i of the FIRST item in x such that x[i] >= y
### 
 
the idea is x is a sorted list, and bin[] binr[] conduct binary search 
 
  [0][1][2][3][4] 
q)12 34 56 78 90 bin 60
2

q)12 34 56 78 90 binr 60
3

q)12 34 56 78 90 binr 7
0

q)12 34 56 78 90 binr 99   // returns count[x] if y is bigger than any of x
5
 
q)12 34 56 78 90 bin 7    //  returns -1 if y is smaller than x[0] 
-1 
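
e.g.  y can be a list, which makes bin handy for bucketing (edges is an illustrative name)
q)edges:0 10 20 30
q)edges bin 5 10 29 30 -1
0 1 2 3 -1
q)edges binr 5 10 29
1 1 3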
 
 
### 
###  x rotate y      // uniform 
### 
 
x is an integer 
y is a list (or a table) 
 
q)2 rotate 12 34 56 78 90     // rotate 2 elems to left 
56 78 90 12 34 
 
q)-2 rotate 12 34 56 78 90    // to right 
78 90 12 34 56 
 
q)show t : ([] c1:`a`b`c`d; c2:12 34 56 78; c3:10 20 30 40) 
c1 c2 c3 
-------- 
a  12 10 
b  34 20 
c  56 30 
d  78 40 
 
q)1 rotate t       // rotate 1 row up 
c1 c2 c3 
-------- 
b  34 20 
c  56 30 
d  78 40 
a  12 10 
 
 
### 
###    x # y       // take operator 
### 
 
#[x;y]  is very overloaded. 
 
 
//  use case 1  -  x is an integer atom, y is an atom or a list (or a dict or a table) 
 
q)3 # til 10      # like head/tail 
0 1 2             # returns the first 3 elems 
 
q)-3 # til 10     # returns the last 3 elems 
7 8 9             # NOTE: take operator ALWAYS returns list 
 
q)0 # 1 2 3     # 0# returns an empty list of type of the FIRST elem
`long$()        # so it's a nice way to create an empty list of an atom of your choice
 
q)5#1 2 3       # if you take more than the number of the items in the list 
1 2 3 1 2       # it comes back to the beginning until you reach the specified number 
q)5#91          # here is a neat succinct way to create a simple list of a given atom 
91 91 91 91 91 
 
note: works on dict/table records. 
 
q)show d : `a`b`c ! (12 34 56; 1.1 2.2 3.3; 10 20 30) 
a| 12  34  56 
b| 1.1 2.2 3.3 
c| 10  20  30 
 
q)2 # d 
a| 12  34  56 
b| 1.1 2.2 3.3 
 
q)5 # d           //  notice it wraps around 
a| 12  34  56 
b| 1.1 2.2 3.3 
c| 10  20  30 
a| 12  34  56 
b| 1.1 2.2 3.3 
 
q)show t:([] c1:`a`b`c; c2:12 34 56; c3:5.5 4.4 7.7) 
c1 c2 c3 
--------- 
a  12 5.5 
b  34 4.4 
c  56 7.7 
 
q)2 # t 
c1 c2 c3 
--------- 
a  12 5.5 
b  34 4.4 
 
 
//  use case 2  -  x is an integer list, y is an atom or a list 
                -  returns a matrix 
 
q)3 4 # `ibm 
ibm ibm ibm ibm 
ibm ibm ibm ibm 
ibm ibm ibm ibm 
 
q)2 3 # `ibm`aapl`msft`gs 
ibm aapl msft 
gs  ibm  aapl 
 
q)2 3 4#"a"           // 3D 
"aaaa" "aaaa" "aaaa" 
"aaaa" "aaaa" "aaaa" 
 
 
//  use case 3  -  x is a symbol atom/list, y is a dict (or a table) 
                -  returns x entries/columns of y 
 
q)d 
a| 12  34  56 
b| 1.1 2.2 3.3 
c| 10  20  30 
 
q)`a`c # d        // returns a sub-dictionary. (d`a`c returns only the values, without the keys)
a| 12 34 56 
c| 10 20 30 
 
q)t 
c1 c2 c3 
-------- 
a  12 10 
b  34 20 
c  56 30 
d  78 40 
 
q)`c1`c3 # t       //  same as  select c1,c3 from t 
c1 c3 
----- 
a  10 
b  20 
c  30 
d  40 
 
 
//  use case 4  -  x is a table, y is a keyed table 
 
q)show kt : `c1 xkey t 
c1| c2 c3 
--| ----- 
a | 12 10 
b | 34 20 
c | 56 30 
d | 78 40 
 
q)([] c1:`a`b`c`d`x`y`z`w) # kt 
c1| c2 c3 
--| ----- 
a | 12 10 
b | 34 20 
c | 56 30 
d | 78 40 
x | 
y | 
z | 
w | 
 
 
### 
###  x ! y 
### 
 
--- use case 1     // dictionary, which is a pair of lists (keys ! values) 
 
- x is a list 
- y is a list 
 
q)`ticker`px`vol!(`ibm`msft;1.1 2.2; 30 40) 
ticker| ibm msft 
px    | 1.1 2.2 
vol   | 30  40 
 
--- use case 2      // same as xkey[] 
 
q)t 
ticker px  vol 
-------------- 
ibm    1.1 30 
msft   2.2 40 
 
q)1!t 
ticker| px  vol 
------| ------- 
ibm   | 1.1 30 
msft  | 2.2 40 
 
q)2!t               // it is less flexible than xkey[] because it always keys the first x columns 
ticker px | vol     // in fact, xkey[] is actually a combination of xcols[] (reorder columns) and ![] 
----------| --- 
ibm    1.1| 30 
msft   2.2| 40 
 
q)0!2!t          //  0!t is a neat way to  () xkey t 
ticker px  vol 
-------------- 
ibm    1.1 30 
msft   2.2 40 
 
q)1!`t           // you can update input in-place 
`t 
q)t 
ticker| px  vol 
------| ------- 
ibm   | 1.1 30 
msft  | 2.2 40 
 
 
--- use case 3    // functional form  of update/delete 
 
![t;c;b;a] 
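
a minimal sketch of the functional form, using a throwaway table ft (a hypothetical name, not one of the tables above).
c is a list of where-constraints in parse-tree form, b is the by-clause (0b for none),
a is a dict of column!expression for update, or a symbol list for delete.

q)ft:([] c1:`a`b`c; c2:10 20 30)
q)![ft; enlist (=;`c1;enlist `a); 0b; (enlist `c2)!enlist 99]    // update c2:99 from ft where c1=`a
c1 c2
-----
a  99
b  20
c  30

q)![ft; enlist (=;`c1;enlist `a); 0b; `symbol$()]                // delete from ft where c1=`a
c1 c2
-----
b  20
c  30

q)![ft; (); 0b; enlist `c2]                                      // delete c2 from ft
c1
--
a
b
c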
 
--- use case 4    // enumeration 
 
- x is a symbol list by name 
- y is an int list   // this is important 
 
q)mysym:`ibm`msft`aapl`nflx 
q)show elist:`mysym!3 0 1 2 0          // enumerated list 
`mysym$`nflx`ibm`msft`aapl`ibm 
 
q)show elist:`mysym$`nflx`ibm`msft`aapl`ibm     // recall if y is not int list, but actual symbol list 
`mysym$`nflx`ibm`msft`aapl`ibm                  // then you need to use $ 
 
 
### 
###  where x 
### 
 
monadic 
 
q)where 101010b      # returns the index of 1b 
0 2 4 
 
q)L: 10 20 30 40 50 
q)where L>20           # returns the index where an elem > 20 
2 3 4 
 
q)L[where L>20]        # see how useful 
30 40 50 
 
q)L[where L>20]:77 
q)L 
10 20 77 77 77 
 
q)L:`b`b`a`c`b 
q)where L=`b 
0 1 4 
 
NOTE: the most common place of use is in q-sql. 
 
 
### 
###  group[x] 
### 
 
a monadic function that returns a dict that maps each elem to an index list 
e.g. 
 
q)group "i miss mississippi" 
i| 0 3 8 11 14 17 
 | 1 6 
m| 2 7 
s| 4 5 9 10 12 13 
p| 15 16 
 
q)L:`b`b`a`c`b 
q)group L 
b| 0 1 4 
a| ,2 
c| ,3 
 
 
 
### 
###  amend (aka "assign in place") 
### 
 
recall in C++, you can write this, to amend a variable. 
x = 123; 
x += 5;    # i.e. x = x + 5 
 
q)x:123 
q)x+:5     #  notice the syntax 
q)x 
128 
 
q)x:123 
q)x-:5 
q)x 
118 
 
q)x:123 
q)x|:456 
q)x 
456 
 
q)L: 1 2 3 
q)L,:4         # same as  q) L:L,4 
q)L,:5 6       # a common useful way to append item(s) to a list 
q)L 
1 2 3 4 5 6 
 
q)L,:3.14      # but it expects type match 
'type 
 
 
### 
###   alias :: 
### 
 
- double colon 
- an alias is a variable that is an expression 
 
q)a:42 
q)b::a     # alias 
q)c:a      # assignment 
q)a:123 
q)b 
123 
q)c 
42 
 
note: alias defines "dependent" variables. e.g. alias w depends on x and y below. 
     - if none of dependent vars changed, then it retains the memoized result 
     - if any of dependent vars changed, then alias re-evaluates the expression 
     (note the diff from functions which evaluates every time it's called. no memoization. also function requires you explicitly supply input arg every time, not binding any particular vars permanently.) 
 
q)w::(x*x)+y*y 
q)x:3 
q)y:4 
q)w 
25 
q)y:5 
q)w 
34 
 
q)w::(x*x)+y*y 
q)v:2*w 
q).z.b             # system var .z.b shows dependency dictionary 
x| w               # (be careful not to create messy loopy dependency) 
y| w 
w| v 
 
note: a common use of alias is to define a view on a DB table 
e.g. 
 
q)t:([]c1:`a`b`c`a;c2:20 15 10 20;c3:99.5 99.45 99.42 99.4) 
q)v::select sym:c1,px:c3 from t where c1=`a 
q)v 
sym px 
-------- 
a   99.5 
a   99.4 
 
q)update c3:777.0 from `t where c1=`a 
`t 
 
q)v 
sym px 
------- 
a   777 
a   777 
 
 
### 
###  find "?" operator 
### 
 
  [0][1][2][3][4][5][6] 
q)18 99 12 34 56 34 78 ? 34    # returns the index of the FIRST match, which is [3], not [5] 
3
 
q)11 22 33 44 ? 95       # if no match, then it returns the length of the list 
4                        # as if it's the index of the item if you append it to the list 
 
q)(11 22 33 44, 95) ? 95    # like this 
4
 
q)11 22 33 ? 33 11       # this runs find "?" on each item of the right operand to check against the left operand list 
2 0                      # you just have to know this behavior. 
 
 
### 
###  roll / deal "?" 
### 
 
q) 4 ? 15.0                            # roll: pick 4 times from [0,15.0) 
10.3793 7.061824 9.520074 14.5086 
 
q) 10 ? 10                             # roll: a positive x picks WITH replacement, so duplicates can occur 
8 1 9 5 4 6 6 1 8 5 
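
note: a negative left operand deals WITHOUT replacement. a sketch to contrast with the rolls above
      (results are random, so only the parts that never change are shown).

q)count distinct -10 ? 10     // all 10 picks are different, i.e. a random permutation of til 10
10
q)asc -5 ? 5
`s#0 1 2 3 4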
 
### 
###  count[x] 
### 
 
- count the number of elems in a list 
e.g. 
q)foo:123 456 789 
q)count foo 
3

q)count 34      # counting a single atom gives you 1  (naturally) 
1

q)count ((1 2 3); (4 5 6))    # counts the outermost list 
2
 
 
### 
###  join "," operator 
### 
 
q)1 2 3,4 5 
1 2 3 4 5 
 
q)1,2 3 4    # ok to specify atom 
1 2 3 4 
 
q)1 2 3,4 
1 2 3 4 
 
q)1,2 
1 2 
 
q)1 2 3, `ibm`msft     # if you join lists of different types, then you get a general list 
1
2
3
`ibm 
`msft 
 
q)(),34           # combining with an empty list. another way to produce a singleton 
,34               # just like  enlist 34 
 
 
### 
###  x except y 
### 
 
q)12 34 56 78 90 except 34 78 
12 56 90 
 
q)12 34 56  except  12 56 
,34                            # notice the return type is always list 
 
q)12 34 56  except  12 56 34 
`long$()                       # always list 
 
 
NOTE: you can subtract a subset of a table like this. very useful. 
 
q)show t1:([] c1:`a`b`c; c2:12 34 56; c3:5.5 4.4 7.7) 
c1 c2 c3 
--------- 
a  12 5.5 
b  34 4.4 
c  56 7.7 
 
q)show t2 : t1[0 2] 
c1 c2 c3 
--------- 
a  12 5.5 
c  56 7.7 
 
q)t1 except t2 
c1 c2 c3 
--------- 
b  34 4.4 
 
 
### 
###  x inter y 
### 
 
returns the intersection of x & y 
 
q)1 2 3 4 inter 3 4 5 6 
3 4 
 
NOTE: if applied to a dict (or a table), inter[x;y] returns the common entries (or rows) 
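
e.g. a small sketch with two throwaway tables (ta, tb are illustrative names)

q)ta:([] c1:`a`b`c; c2:12 34 56)
q)tb:([] c1:`b`c`d; c2:34 56 78)
q)ta inter tb                      // rows that appear in both tables
c1 c2
-----
b  34
c  56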
 
 
### 
###  all[x]      // unary, aggregate 
### 
 
returns 1b if all elements of x are non-zero 
        0b otherwise 
 
q)all 1 2 3 4 
1b 
 
q)all 1 2 3 4 0 5 
0b 
 
q)1 2 3 = 1 2 5 
110b 
 
q)all 1 2 3=1 2 4 
0b 
 
q) 1 2 3 in 1 2 4 
110b 
 
q)all 1 2 3 in 1 2 4 
0b 
 
q)if[all x in y; ...]    // intuitive, checks if all elems in x are in y 
 
 
NOTE: all[] is implemented as essentially (&/) 
 
q)all 
min$["b"] 
 
### 
###  any[x]      // unary, aggregate 
### 
 
returns 1b if any elem of x is non-zero 
        0b if all elems of x are zero 
 
q)any 1 2 3 4 
1b 
 
q)any 1 2 0 4 
1b 
 
q)any 0 0 0 
0b 
 
q)1 2 3=10 20 4 
000b 
 
q)any 1 2 3=10 20 4 
0b 
 
q)any 1 2 3=1 20 30 
1b 
 
q)1 2 3 in 1 20 30 
100b 
 
q)if[any x in y; ...]     // intuitive, checks if any of x is in y 
 
 
NOTE: see the underlying implementation below. 
 
q)any 
max$["b"] 
 
 
### 
###   gtime[x]  ltime[x]          // notice the assumptions about input x's timezone 
### 
 
q)gtime .z.P                      // assumes x is "timestamp" (or datetime) type in machine local timezone 
2018.12.14D00:57:20.610104000     // and converts it to GMT 
 
q)ltime .z.p                      // assumes x is timestamp/datetime type in GMT 
2018.12.13D20:00:04.633453000     // and converts it to local timezone 
 
 
##############    string    ############## 
 
 
### 
###   string[x]        // atomic, monadic 
### 
 
converts each atom of x to string which is a LIST of char. 
 
q)string `ibm`msft 
"ibm" 
"msft" 
 
q)"/tmp/", string[`foo] , ".txt" 
"/tmp/foo.txt" 
 
q)hsym `$ "/tmp/", string[`foo] , ".txt" 
`:/tmp/foo.txt 
 
q)string "q"        // applying string to char/string can be weird 
,"q"                // the result is a singleton char 
q)"q" ~ string "q" 
0b 
q) string "qqq" 
,"q" 
,"q" 
,"q" 
 
q)`q = `$ string `q       // this is intuitive 
1b 
q)`q = `$ string "q"      // this also holds 
1b 
 
q)string `c1`c2`c3!123 456 789     // applies to the value of dict 
c1| "123" 
c2| "456" 
c3| "789" 
 
q)string ([] name:`ibm`goog`nflx; price:12 34 56)    // applies to the columns of table 
name   price 
------------ 
"ibm"  "12" 
"goog" "34" 
"nflx" "56" 
 
 
### 
###  x like y      // uniform, dyadic 
### 
 
- x is symbol (or string) atom/list 
- y is pattern string 
- pattern matching with wildcards (a small subset of regex) 
-- ? matches any single character 
-- * matches any sequence of characters (zero or more) 
-- [] is a list of chars to be matched 
  e.g.  [abc] 
        [^abc] 
        [a-z] 
        [^a-z] 
        [0-9] 
        [^0-9] 
 
e.g. 
 
q)`ibm_p.n like "*_p.*" 
1b 
 
q)10 # trd 
time         ticker price 
---------------------------- 
00:00:00.000 ibm    34.98847 
00:00:00.001 ibm    7.410121 
00:00:00.002 aapl   47.75388 
00:00:00.003 ibm    61.93018 
00:00:00.004 ibm    92.01628 
00:00:00.005 aapl   40.26659 
00:00:00.006 ibm    61.13012 
00:00:00.007 aapl   28.83927 
00:00:00.008 aapl   87.59073 
00:00:00.009 ibm    31.97122 
 
q)select[5] from trd where ticker like "?ap*" 
time         ticker price 
---------------------------- 
00:00:00.002 aapl   47.75388 
00:00:00.005 aapl   40.26659 
00:00:00.007 aapl   28.83927 
00:00:00.008 aapl   87.59073 
00:00:00.010 aapl   56.28223 
 
q)select[5] from trd where ticker like "?[ab]*"     // second char is 'a' or 'b' 
time         ticker price 
---------------------------- 
00:00:00.000 ibm    34.98847 
00:00:00.001 ibm    7.410121 
00:00:00.002 aapl   47.75388 
00:00:00.003 ibm    61.93018 
00:00:00.004 ibm    92.01628 
 
q)select[5] from trd where ticker like "?[^ab]*" 
time ticker price 
----------------- 
q) 
 
 
NOTE: if you want to match for special characters, then you use [] also 
e.g. 
q)(`$("ab*c";"abcc")) like "ab[*]c" 
10b 
q)(`$("ab?c";"abcc")) like "ab[?]c" 
10b 
q)(`$("ab^c";"abcc")) like "ab[*^]c" 
10b 
 
 
ref: http://code.kx.com/q/cookbook/regex/ 
 
 
NOTE: if your column is of string type (a list of strings), like already applies to each string. like\: (each-left) gives the same result. 
 
e.g. 
 
q)show t:([] ticker:("ibm";"msft";"aapl"); price:12 34 56) 
ticker price 
------------ 
"ibm"  12 
"msft" 34 
"aapl" 56 
 
q)select from t where ticker like "msf*"      // plain like works on a string column 
ticker price 
------------ 
"msft" 34 
 
q)select from t where ticker like\: "msf*"     // each left 
ticker price 
------------ 
"msft" 34 
 
 
// more regex exercise 
 
q)strList:("foo123"; "bar456"; "bza789"; "kenics")    // suppose you have a list of string 
q)filter:("foo";"enic")                               // and filter 
 
q)filter ,\: "*"               // you can append wildcard easily like this 
"foo*" 
"enic*" 
 
q)"*" ,/: filter ,\: "*"       // maybe you wanna match like this 
"*foo*" 
"*enic*" 
 
q)filter:"*" ,/: filter ,\: "*" 
 
q)strList like/: filter               // notice the use of like each-right 
1000b 
0001b 
 
q)any strList like/: filter 
1001b 
 
q)strList where any strList like/: filter      // beautiful 
"foo123" 
"kenics" 
 
 
### 
###  x ss y       // uniform, dyadic 
### 
 
- ss = string search 
- x and y are string.  y is a substring(aka pattern) of x. 
 
q)"we are the spartans, are we?"  ss  "are"     // returns the index position list of substr match 
3 21 
 
q)"we are the spartans, are we?"  ss  "a?e"     // you can use the ? and [] patterns as in like[x;y]  (the * wildcard is not supported by ss) 
3 21 
 
q)"we are the spartans, are we?"  ss  "a[abc]e" 
`long$() 
 
q)"we are the spartans, are we?"  ss  "a[rxyz]e" 
3 21 
 
 
### 
###  ssr[x;y;z]    // uniform, triadic 
### 
 
- ss and replace 
- x is string 
- y is pattern (substr of x) to match 
- z is string or function 
-- if z is string, then replace y with z 
-- if z is function, then replace y with z[y] 
 
q)ssr["tokyo london tokyo newyork" ; "tokyo" ; "hk"] 
"hk london hk newyork" 
 
q)ssr["tokyo london tokyo newyork" ; "tokyo" ; upper] 
"TOKYO london TOKYO newyork" 
 
// more realistic examples 
 
q)csvPath:"/some/path/NAME_YMD.csv" 
q)ssr/[csvPath; ("YMD";"NAME"); (string[.z.D]; "dividend")] 
"/some/path/dividend_2018.08.19.csv" 
 
q)ssr[;"  ";" "]/["winter  is    my  favorite season  of   the year"]      // this recursively converts two whitespace into a single whitespace 
"winter is my favorite season of the year"                                 // until converge. a very neat trick. 
 
 
 
### 
###  lower[x]  upper[x]      // atomic, monadic 
### 
 
- x is string or symbol 
 
q)lower `PoWeR 
`power 
q)lower "PoWeR" 
"power" 
q)upper `IbM`MsfT 
`IBM`MSFT 
q)upper ("ibM"; "msfT") 
"IBM" 
"MSFT" 
 
 
### 
###  trim[x]  ltrim[x]  rtrim[x]    // uniform, monadic 
### 
 
- x is string 
 
q)trim "    foo bar   "     // trims leading/trailing whitespace 
"foo bar" 
q)ltrim "    foo bar   "    // left trim 
"foo bar   " 
q)rtrim "    foo bar   "    // right trim 
"    foo bar" 
 
 
### 
###  md5[x]       // uniform, monadic 
### 
 
- x is string 
 
q)md5 "foo bar baz cmu"               // returns md5 128bit hash 
0xaa55837b9598199c8b081db434f6845b 
 
 
### 
###  x $ y  (pad)      // uniform, dyadic 
### 
 
- x is long 
- y is string 
 
q)10 $ "kenics"     // positive x pads on the right to 10 char width (left-justified) 
"kenics    " 
q)-10 $ "kenics"    // negative x pads on the left (right-justified) 
"    kenics" 
 
q)s:string (`kenics;`foo;`sugimoto;`msft)    // a more realistic usage 
q)neg[(max count each s)]$s 
"  kenics" 
"     foo" 
"sugimoto" 
"    msft" 
 
### 
###  x sv y        // uniform, dyadic 
### 
 
- scalar from vector   (joins stuff)                //  really should be called "join" 
- behaves differently based on the type of y 
 
q)"," sv  ("foo";"bar";"msft")    // join strings 
"foo,bar,msft" 
 
q)"." sv string 192 168 2 59 
"192.168.2.59" 
 
q)` sv ("foo"; "bar"; "msft")    // if y is string, x is `  then it uses default newline (\n in unix, \r\n in windows) 
"foo\nbar\nmsft\n" 
 
 
q)`a sv `foo`bar`txt    // you cannot join symbols like this 
'type 
 
q)` sv `foo`bar`txt     // but if you specify `  then it becomes a dot 
`foo.bar.txt            // this is useful when you need to join symbols with a dot 
 
q)` sv `:foo`bar`txt    // another tricky behavior of sv[x;y] is if the first elem of y is a file handle 
`:foo/bar/txt           // then it joins by slash '/' 
 
NOTE: (` sv) is so useful thus they made .Q.dd:{` sv x,`$string y} 
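
e.g. a quick sketch of .Q.dd[] (the paths and names are made up)

q).Q.dd[`foo;`bar]                  // plain symbols are joined with a dot
`foo.bar
q).Q.dd[`:/tmp/db;`trade]           // a file handle on the left is joined with a slash
`:/tmp/db/trade
q).Q.dd[`:/tmp/db;2018.08.19]       // y goes through string[], so a date works too (handy for partition paths)
`:/tmp/db/2018.08.19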
 
 
### 
###  x vs y      // uniform, dyadic 
### 
 
- vector from scalar (separates y by x)       // really should be called "split" or "cut" 
 
q)"," vs "foo,bar,msft"       // intuitive when y is string, and x is string or char 
"foo" 
"bar" 
"msft" 
 
q)",a," vs "foo,a,bar,a,msft" 
"foo" 
"bar" 
"msft" 
 
q)"|" vs "foo|bar||msft" 
"foo" 
"bar" 
"" 
"msft" 
 
q)` vs "abc\ndef\nghi"         // if y is string, x is `  then it separates by newline 
"abc" 
"def" 
"ghi" 
q)` vs "abc\r\ndef\r\nghi"     // handles windows newline too 
"abc" 
"def" 
"ghi" 
 
q)` vs `foo.txt       // if y is a symbol, x is `  then it splits y by `. 
`foo`txt              // this is useful when you want to count file suffix type 
 
q)` vs `:/tmp/foo/bar/test.txt    // if y is a file handle, x is `  then it separates into dir and file (in symbol format) 
`:/tmp/foo/bar`test.txt 
 
 
NOTE: if x,y are integer/boolean, sv/vs behave as decode/encode (like converting int to bit, base10 to base8, so on) 
 
 
q)0b vs 123h           // converts an integer y (short here, so 16 bits) to its bits if x = 0b 
0000000001111011b 
 
q)0x0 vs 123456789i    // converts an integer y (int here, so 4 bytes) to its bytes if x = 0x00 
0x075bcd15 
 
q)0x00 
0x00 
 
q)0x0 
0x00 
 
q)"." sv string `short$ 0x0 vs .z.a      // ip addr 
"10.29.49.194" 
 
 
##############    sort     ############## 
 
 
### 
###  asc[x], desc[x]     // unary, uniform 
### 
 
- stable sort 
 
q)asc 2 1 3 4 2 1 2     // ascending order 
`s#1 1 2 2 2 3 4 
 
NOTE: if x is 
    - a mixed list, the list is sorted within datatype 
    - a dict, the dict is sorted by the values 
    - a table, the table is sorted by the first non-key column 
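
e.g. a minimal sketch of the dict case (made-up data)

q)asc `a`b`c!3 1 2        // entries reordered by value
b| 1
c| 2
a| 3
q)desc `a`b`c!3 1 2
a| 3
c| 2
b| 1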
 
### 
###  iasc[x], idesc[x]     // unary, uniform 
### 
 
- returns the list of indices which you can follow sequentially to get the sorted x 
- this is argsort[x] 
 
e.g. 
 
q)L:2 1 3 4 2 1 2 
q)iasc L 
1 5 0 4 6 2 3 
 
q)L iasc L 
1 1 2 2 2 3 4 
 
q)(asc L)~L iasc L 
1b 
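
a typical use of iasc[] is to reorder one list by the sort order of another. a sketch with made-up lists:

q)px:30 10 20
q)sym:`c`a`b
q)sym iasc px             // symbols ordered by ascending price
`a`b`c
q)sym idesc px            // by descending price
`c`b`a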
 
 
### 
###   rank[x]     // unary, uniform 
### 
 
- returns the indices of x elems where they occur in the sorted list. 
  NOTE: yes, it is easy to confuse rank[x] with iasc[x] 
 
 
e.g. 
 
q)L:2 7 3 2 5 
 
q)rank L        // think about what this index list tells you. it is the "rank" of each elem of x 
0 4 2 1 3 
 
q)asc[L] rank L    // what it means is this gets you the original x 
2 7 3 2 5 
 
q)rank[L] ~ iasc iasc L     // notice  rank[x] == iasc iasc x 
1b 
 
NOTE:  "iasc idesc x"  is another common use case. if you think about it, this is really the rank function. 
e.g. 
what does rank 0 mean ? is it the biggest or the smallest ? 
if you think rank 0 is the biggest, rank 1 is the 2nd biggest, and so on, then your "rank" function is iasc idesc x 
 
e.g. 
 
q)show L : 10?100 
70 36 12 97 92 99 45 83 94 8 
 
q)iasc idesc L 
5 7 8 1 3 0 6 4 2 9 
 
 
### 
###  x xrank y       // binary 
### 
 
x : an integer 
y : a list of sortable type 
 
groups y into x buckets. it is effectively rank[y] with grouping (y into x buckets) 
 
q)4 xrank til 8          / equal size buckets 
0 0 1 1 2 2 3 3 
q)4 xrank til 9          / first bucket has extra if y is not divisible by x 
0 0 0 1 1 2 2 3 3 
 
q)3 xrank 1 999 5 4 0 3   / outlier 999 does not get its own bucket, as intended 
0 2 2 1 0 1 
q)3 xrank 1 7 5 4 0 3     / same as above 
0 2 2 1 0 1 
 
NOTE: a slightly more practical example is to bucket a price range and measure statistics for each bucket 
 
e.g. 
 
q)show t:flip `price`name!((20?20);(20?`aapl`msft`ibm)) 
price name 
---------- 
14    aapl 
9     aapl 
14    aapl 
13    aapl 
9     ibm 
13    ibm 
10    msft 
14    msft 
17    ibm 
14    ibm 
17    msft 
16    msft 
11    aapl 
8     ibm 
8     aapl 
10    ibm 
19    msft 
12    ibm 
11    msft 
1     aapl 
 
q)select Min:min price,Max:max price,Count:count i by bucket:4 xrank price from t 
bucket| Min Max Count 
------| ------------- 
0     | 1   9   5 
1     | 10  12  5 
2     | 13  14  5 
3     | 14  19  5 
 
note: an even more realistic example is to sort by the timestamp column, group the rows into x buckets, and extract the min and max prices within each bucket. 
 
 
 
##############   aggregating statistics   ############## 
 
 
### 
###  avg[x]     // unary, aggregate 
### 
 
- returns arithmetic mean in float 
e.g. 
 
q)avg 1 2 3         // return type is float 
2f 
q)avg 1 0n 2 3      // null is ignored 
2f 
q)avg 1.0 0w        // works with infinity 
0w 
q)avg -0w 0w        // if x contains BOTH pos/neg infinity, then the result is NULL 
0n 
 
 
note: commonly used as part of q-sql 
 
q)select avg price  by name from trade 
name| price 
----| -------- 
aapl| 10 
ibm | 11.85714 
msft| 14.5 
 
 
### 
###  avgs[x]      // unary, uniform 
### 
 
- running averages: item i of the result is the avg of the first i+1 items of x  (nulls are ignored) 
 
e.g. 
 
q)avgs 1 2 3 0n 4 -0w 0w 
1 1.5 2 2 2.5 -0w 0n 
 
 
### 
###   dev[x]      // unary, aggregate 
### 
 
- returns the (population) standard deviation of x as a float 
 
q)dev 1 1 1 
0f 
q)dev 11 22 33 44 
12.29837 
 
 
recall, with n = count[x] and mean m = (Σ xi)/n = E[X] : 
 
                     Σ (xi - m)^2 
 variance σ(X)^2 =  --------------  =  E[(X-E[X])^2]  =  E[X^2] - E[X]^2 
                          n 
 
 σ(X) = stddev(X) = sqrt of the variance 
 
 
### 
###   var[x]       // unary, aggregate 
### 
 
- computes variance of numeric list x 

 
e.g. 
 
q)var 232 75 3 129 
6979.688 
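
a quick numeric check of the identities recalled in the dev[] section, using the same list as the dev example.
(a sketch only; x is just a scratch variable)

q)x:11 22 33 44
q)var x
151.25
q)avg (x - avg x) xexp 2          // E[(X-E[X])^2]
151.25
q)(avg x*x) - (avg x) xexp 2      // E[X^2] - E[X]^2
151.25
q)sqrt var x                      // same as dev x
12.29837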
 
 
### 
###    x cov y        // binary, aggregate 
### 
 
- computes covariance of x & y  (i.e. how variability of x and y correlates) 
- returns a float (can be any number) 
- x and y must be of the same length 
 
           n     _     _ 
cov[x;y] = Σ (xi-x)(yi-y) / n          // population covariance: divides by n, not n-1 
          i=1 
_              _ 
x = avg[x],    y = avg[y] 
 
e.g. 
 
q)2 3 5 7 cov 3 3 5 9 
4.5 
q)2 3 5 7 cov 4 3 0 2 
-1.8125 
q)select price cov size by sym from trade      // a more practical example 
 
 
(ref) http://www.statisticshowto.com/covariance/ 
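
a quick check of what cov[] computes, written out by hand with the first example above.
note the divisor: avg divides by n. scov is the sample version, which divides by n-1.

q)x:2 3 5 7
q)y:3 3 5 9
q)avg (x - avg x) * y - avg y
4.5
q)x scov y
6f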
 
 
### 
###   x cor y      // binary, aggregate 
### 
 
- computes correlation btwn x & y 
- returns a float [-1f,1f] 
- x and y must be of the same length 
 
                cov[x;y] 
cor[x;y] =  ---------------- 
              dev[x]*dev[y] 
 
e.g. 
 
q)1 2 3 cor 10 20 30       // completely positive corr 
1f 
q)1 2 3 cor 3 2 1          // completely negative corr 
-1f 
q)1 2 3 cor 1 22 333       // strong positive corr 
0.8928819 
q)1 2 3 cor 0.00000000301  499999999  0.000000000003      // zero correlation 
0f                                                        // i.e. completely uncorrelated 
 
 
### 
###   med[x]       // unary, aggregate 
### 
 
- computes median (in float) of numeric list x 
 
e.g. 
 
q)med 33 22 11 
22f 
 
q)med 12 34 56 78 34 55 
44.5 
 
 
### 
###   mode[x]      // unary, aggregate 
### 
 
not natively available in q, but it's trivial to implement. 
 
q)first idesc count each group 12 34 56 78 12 12 56 34 78 12 78 78 78 
78 
 
 
### 
###   x wsum y       // binary, aggregate 
### 
 
- computes weighted sum of the products of x and y. 
 i.e.   wsum[x;y] == sum[x * y] 
 
e.g. 
 
q)2 5 7 wsum 2 1.5 10 
81.5 
 
 
### 
###   x wavg y       // binary, aggregate 
### 
 
- computes the average of y weighted by x    (DON'T mistakenly swap x and y) 
 i.e. wavg[x;y]  ==  sum[x*y] % sum[x] 
 
e.g. 
 
select vwap:qty wavg px by ticker from tradedata    // vwap == wavg[qty;px] 
ticker | vwap 
-------| ------ 
ibm    | 26.94 
aapl   | 126.73 
msft   | 75.28 
 
 
q)2 0N 4 5 wavg 1 2 0N 8      // NOTE nulls in either argument are ignored 
6f 
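
e.g. a sketch checking the definitions of wsum[] and wavg[] by hand, reusing the wsum inputs (w, v are scratch names)

q)w:2 5 7
q)v:2 1.5 10
q)w wavg v
5.821429
q)(sum w*v) % sum w
5.821429
q)(w wsum v) % sum w
5.821429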
 
 
### 
###  sqrt[x]     // atomic, monadic 
### 
 
- returns a float 
 
q)sqrt 9 
3f 
 
q)sqrt 2 
1.414214 
 
q)sqrt 1.5 
1.224745 
 
q)sqrt 0 
0f 
 
q)sqrt -1     # returns null if not defined 
0n 
 
q)sqrt 4 9 25 36 
2 3 5 6f 
 
 
### 
###  exp[x]        // atomic, monadic 
### 
 
- returns a float e^x  (e raised to the power of its input x) 
- recall btw x = exp[log[x]] 
 
q)exp 1 
2.718282 
 
q)exp 4.2 
66.68633 
 
q)exp -2 
0.1353353 
 
q)exp -1 
0.3678794 
 
q)exp 0.5 
1.648721 
 
q)exp 0 
1f 
 
 
### 
###   x xexp y       // binary, atomic 
### 
 
- atomic dyadic 
- returns a float x^y 
- for a non-negative integer y,  xexp[x;y] == prd y#x 
- for x > 0,  xexp[x;y] == exp[y * log[x]]    (a negative x with an integer y also works, see -2 xexp 2 below) 
 
q) 2 xexp 5       // returns 2^5 
32f 
 
q)-2 xexp 2 
4f 
 
q)1 xexp 0.5 
1f 
 
q)9 xexp 0.5 
3f 
 
q)1.23 xexp 3.45 
2.04255 
 
q)-2 xexp 0.5      // returns null if undefined 
0n 
 
q)a:123 
q)sqrt a xexp 2 
123f 
 
 
note: for x > 0,  xexp[x;y] == exp y * log x 
 
 
### 
###  log[x]        //  atomic, monadic 
### 
 
- returns a float  (natural log, base e) 
 
q)log 1 
0f 
 
q)log exp 1 
1f 
 
q)log 12.34 
2.512846 
 
q)log 0.05 
-2.995732 
 
q)log -1        # returns null if undefined 
0n 
 
 
### 
###  x xlog y 
### 
 
- atomic dyadic 
- returns a float 
 
q)2 xlog 32       # returns log(32) of base 2 
5f 
 
q)10 xlog 1000    # returns log(1000) of base 10 
3f 
 
q)2 xlog -1       # returns null if undefined 
0n 
 
 
###########   moving statistics   ########### 
 
### 
###  x ema y      // binary, uniform 
### 
 
- exponential (weighted) moving average 
- x is a numeric atom.  0 < x < 1.0     (or a list of length of y) aka "smoothing factor" or "decay factor" 
- y is a numeric list 
 
suppose you have Y of length n where we refer to each elem Y1,Y2,,, Yt,,, Yn 
 
EMAt = Y1                    if t = 1 
     = α*Yt + (1-α)*EMAt-1   if t > 1        // α is the smoothing factor x 
 
q)ema[1%3; 12 24 36 48 60 72 84 96] 
12 16 22.66667 31.11111 40.74074 51.16049 62.107 73.40466 
 
- it's trivial to implement ema. 
ewma:{{(x*z) + (1f-x)*y}[x]\[y]}     // take a moment to understand this 
                                     // ewma takes two input args: x & y 
                                     // internal func defines the equation, and projection with [x] 
                                     // within the internal func, y & z correspond to two elems of [y] in "scan" way 
 
ewma:{first[y](1f-x)\x*y}            // optimized 
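
usage sketch: either ewma definition above should reproduce the first items of the ema[] output.

q)ewma[1%3; 12 24 36 48]
12 16 22.66667 31.11111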
 
 
### 
###  x mavg y      // binary, uniform 
### 
 
- x is a positive integer 
- y is a numeric list 
- returns x-item simple moving average of y 
 
e.g. x = 30, y = a list of daily volume 
   --> gives you a 30-day moving average daily volume, aka adv30 
 
q)2 mavg 1 2 3 5 7 10 
1 1.5 2.5 4 6 8.5 
 
q)5 mavg 0N 2 0N 5 7 0N       // nulls after the first are replaced by 0 
0n 2 2 3.5 4.666667 4.666667 
 
 
### 
###  x mcount y      // binary, uniform 
### 
 
- x is a positive integer 
- y is a list 
- counts non-null items 
 
q)3 mcount 0 1 2 3 4 5 6 
1 2 3 3 3 3 3 
 
q)3 mcount 0N 1 2 3 0N 5 
0 1 2 3 2 2 
 
(i dont really see real world use of mcount[] ) 
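
one possible use, as a sketch: mcount[] is the denominator that turns msum[] into mavg[],
because the first windows are shorter than x and nulls are not counted. (y is made-up data)

q)y:1 2 3 5 7 10
q)(3 msum y) % 3 mcount y
1 1.5 2 3.333333 5 7.333333
q)3 mavg y
1 1.5 2 3.333333 5 7.333333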
 
 
### 
###  x mdev y        // binary, uniform 
### 
 
- x is a positive integer 
- y is a numeric list 
- returns x-item moving stddev of y 
 
q)5 mdev 0N 2 0N 5 7 0N      // nulls after the first are replaced by 0 
0n 0 0 1.5 2.054805 2.054805 
 
 
### 
###  x mmax y       // binary, uniform 
###  x mmin y 
### 
 
- x is a positive integer 
- y is a numeric list 
- returns x-item moving max[] or min[] of y 
 
q)3 mmax 2 7 1 3 5 2 8 
2 7 7 7 5 5 8 
 
q)3 mmax 0N -3 -2 0N 1 0   // initial null returns negative infinity 
-0W -3 -2 -2 1 1           // remaining nulls replaced by preceding max 
 
q)3 mmin 0N -3 -2 1 -0W 0 
0N 0N 0N -3 -0W -0W 
 
q)3 mmin 0N -3 -2 1 0N -0W    // null is considered the min value 
0N 0N 0N -3 0N 0N 
 
 
### 
###  x msum y       // binary, uniform 
### 
 
- x is a positive integer 
- y is a numeric list 
- returns x-item moving sum[] of y 
 
q)3 msum 1 2 3 5 7 11 
1 3 6 10 15 23 
 
q)3 msum 0N 2 3 5 0N 11     // nulls treated as 0 
0 2 5 10 8 16 
 
 
 
############   trigonometric functions   ############# 
 
### 
###  cos[x]      // unary, atomic 
### 
 
mathematical cosine of x 
 
q)cos 0.2 
0.9800666 
q)min cos 10000?3.14159265 
-1f 
q)max cos 10000?3.14159265 
1f 
 
### 
###  acos[x]     // unary, atomic 
### 
 
inverse of cos[x], aka arc-cosine, i.e. the value whose cosine is x 
 
q)acos -0.4 
1.982313 
 
### 
###  sin[x]     // unary, atomic 
### 
 
mathematical sine of x 
 
q)sin 0.5 
0.4794255 
q)sin 1%0 
0n 
 
### 
###  asin[x]     // unary, atomic 
### 
 
inverse of sin[x], a.k.a. arc-sine, i.e. the value whose sine is x 
 
q)asin 0 
0f 
q)asin 1.414213562373095%2 
0.7853982 
q)asin 1 
1.570796 
q)asin -1 
-1.570796 
 
### 
###  tan[x]     // unary, atomic 
### 
 
mathematical tangent of x 
 
q)tan 0 0.5 1 1.5707963 2 0w 
0 0.5463025 1.557408 3.732054e+07 -2.18504 0n 
 
### 
###  atan[x]     // unary, atomic 
### 
 
arc-tangent of x, i.e. the value whose tangent is x 
 
q)atan 0 
0f 
q)atan 1.414213562373095 
0.9553166 
q)atan 1 
0.7853982 
 
 
########    metadata    ######### 
 
### 
###   attr[x] 
### 
 
- x can be of any data types 
- returns the attribute, a symbol either `s`u`p`g`  where ` means no attribute 
 
q)attr 5 3 1 
`
q)attr 1 3 5               // no attribute either, even though the list happens to be sorted 
`
q)attr asc 3 1 5 
`s 
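
attributes can also be applied by hand with #. a sketch:

q)attr `u#1 2 3           // declare the list unique
`u
q)attr `g#`a`b`a          // grouped
`g
q)`s#3 1 2                // q checks the claim: this list is not sorted
's-fail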
 
 
### 
###   cols[x] 
### 
 
- x is a table 
- returns a symbol vector of column names 
 
q)show t : ([] name:`ken`foo`bar ; age:12 34 56) 
name age 
-------- 
ken  12 
foo  34 
bar  56 
 
q)cols t         // passing by value 
`name`age 
 
q)cols `t        // you can pass by reference 
`name`age 
 
 
### 
###  key[x] 
### 
 
arguably the most overloaded function. let's cover the cases one by one.       //  see http://code.kx.com/q/ref/metadata/#key 
 
----- case 1 
 
- x is a dict 
- returns its keys (a symbol list in the examples below) 
 
q)d 
a| 12  34  56 
b| 1.1 2.2 3.3 
c| 10  20  30 
 
q)key d 
`a`b`c 
 
q)key `             // recall context/namespace is all dictionary 
`q`Q`h`j`o 
q)key `. 
`f`t`getPriceByTicker`a 
q)key `.q 
``neg`not`null`string`reciprocal`floor`ceiling`signum`mod`xbar`xlog`and`or`ea.. 
 
 
----- case 2 
 
- x is a symbol atom 
- returns x if it exists in the current context as variable, otherwise returns () 
 
q)a:12.345 
q)key `a             // this is THE way to check if a variable is defined 
`a 
 
q)key `b             // b is undefined. then you get an empty list () 
q) 
q)-3!key `b          // visible this way 
"()" 
 
 
----- case 3 
 
- x is a file handle 
- returns x if it exists, otherwise returns () 
 
q)key `:/tmp/foo.q       // this is THE way to check if a file exists before attempting to open 
`:/tmp/foo.q 
 
q)() ~ key `:/tmp/bar.q 
1b 
 
 
----- case 4 
 
- x is a directory handle 
- returns its contents in symbol, otherwise returns ()       // similar to system["ls /tmp/some/dir"] which returns a string list 
                                 (i.e. returns () if dir non-existent) 
q)key `:/tmp/some/dir 
`foo.q`afile`bfile`cfile`bar.q 
 
q)fileList: key `:/tmp/some/dir 
q)fileList where fileList like "*.q"       // very easy to grep for particular files in a dir like this 
`foo.q`bar.q 
 
q)key `:/tmp/emptyDir        // if an empty dir, then `symbol$() 
`symbol$()                   // this is NOT the same as () 
 
q)() ~ key `:/tmp/nonExistentDir      // non existent dir 
1b 
 
 
----- case 5 
 
- x is a keyed table 
- returns the key columns 
 
q)show kt:`ticker`price xkey ([] ticker:`ibm`msft`aapl; price:123 456 789; vol:11 22 33) 
ticker price| vol 
------------| --- 
ibm    123  | 11 
msft   456  | 22 
aapl   789  | 33 
 
q)key kt 
ticker price 
------------ 
ibm    123 
msft   456 
aapl   789 
 
 
---- case 6 
 
- x is a simple list (aka vector) 
- returns the type in symbol 
 
q)key 12 34 56 
`long 
q)key `ibm`msft`aapl 
`symbol 
q)0#12.34 
`float$() 
q)key 0#12.34 
`float 
 
---- case 7 
 
- x is an enumerated list 
- returns the enum list name 
 
q)mysym:`ibm`msft`aapl 
q)enumList: `mysym$`ibm`msft`ibm`aapl`ibm`aapl`msft 
q)enumList 
`mysym$`ibm`msft`ibm`aapl`ibm`aapl`msft 
q)key enumList 
`mysym 
q)value enumList 
`ibm`msft`ibm`aapl`ibm`aapl`msft 
 
NOTE: case 6 & 7 give you an insight about how to inspect a table column type. 
e.g. 
q)show t:([] name:`mysym?`ibm`msft`aapl; price:1.2 3.4 5.6; vol:10 20 30 )    // actually what's the diff btwn "?" and "$" 
name price vol                                                                // ? actually creates a symbol list called mysym on the fly 
--------------                                                                // whereas $ assumes you already have a master list "mysym" defined 
ibm  1.2   10 
msft 3.4   20 
aapl 5.6   30 
 
q)last flip t    // notice this actually returns the value instead of a whole singleton dictionary enlist[`vol]!enlist[10 20 30] 
10 20 30         // so it's a simple list. so it's case 6 
 
q)first flip t 
`mysym$`ibm`msft`aapl     // again a simple list (albeit enumerated) so it's case 7 
 
q)key each flip t       // so you can do this 
name | mysym            // how is this different from meta[t] ? 
price| float            // notice the symbol column shows the enumeration variable name, instead of symbol 
vol  | long             // that's useful info that meta[t] doesn't provide
 
q)meta t 
c    | t f a 
-----| ----- 
name | s        // sometimes you want to inspect the enumeration info, 
price| f        // e.g. to verify if it's enumerated against the right sym variable name 
vol  | j        // e.g. to decide if we need to de-enumerate the columns
                // hence "key each flip t" is useful 
 
 
---- case 8 
 
- x is a positive integer 
- same as til[x]                 // seriously.. 
 
q)key 8 
0 1 2 3 4 5 6 7 
 
 
 
### 
###  keys[x] 
### 
 
- x is a table 
- returns the key column names as a symbol list (empty if the table is not keyed)
 
q)t 
c1 c2 c3 
-------- 
a  12 10 
b  34 20 
c  56 30 
 
q)kt 
c1| c2 c3 
--| ----- 
a | 12 10 
b | 34 20 
c | 56 30 
 
q)keys t 
`symbol$() 
 
q)keys kt 
,`c1 
 
 
### 
###  type[x] 
### 
 
- returns a short integer 
-- negative for an atom 
-- positive for a list 
-- 0h for a general list 
 
e.g. 
 
q)type `ibm`fb`msft 
11h 
q)type `ibm 
-11h 
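
// a few more data points (a minimal sketch; assumes q 3.x or later, where integer literals default to long)

q)type 42              // long atom
-7h
q)type 1 2 3           // long list
7h
q)type (1; `a; "x")    // general (mixed) list
0h
q)type `a`b!1 2        // dictionary
99h
q)type ([] a:1 2)      // table
98h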
 
 
### 
###  meta[x] 
### 
 
- x is a table 
- returns a keyed table 
 
c : column names 
t : type char (upper case if the column is a list of lists of that type, blank if it is a general list)
f : foreign key 
a : attribute 
 
q)show meta trade 
c    | t f a 
-----| ----- 
time | t 
sym  | s   s 
price| f 
size | i 
 
 
### 
###  tables[x] 
### 
 
- x is a reference to namespace 
- returns a symbol list of tables in x 
 
e.g. 
 
q)tables `. 
`kt`t`t2 
 
 
### 
###   getenv[x]       // x is env var in symbol 
###   x setenv y      // y is its value in string 
### 
 
q)getenv `HOME 
"/Users/kenics" 
 
q)getenv `USER 
"kenics" 
 
q)`TMPDIR setenv "/tmp" 
q)getenv `TMPDIR 
"/tmp" 
 
q)\echo $TMPDIR 
"/tmp" 
 
$ echo $TMPDIR 
/tmp 
 
 
### 
###  system[string x] 
###  \x 
### 
 
a way to invoke system commands.
the q interpreter first checks whether it is a q system command; if not, it hands it to the OS shell.
e.g. 
 
q)\a               // equivalent to system "a" 
                   // lists a symbol list of tables in the current context. 
 
q)\ls -la /tmp     // equivalent to system "ls -la /tmp" 
                   // invokes unix "ls" command 
 
q)system "pwd" 
q)\pwd 
 
note: inside a function (or any larger expression) you can only use system[]. the backslash form only works as the first thing on a line.
 
 
## 
##  exit[x] 
## 
 
q)exit 34 
$ echo $? 
34 
 
 
 
########   parse trees   ######## 
 
http://code.kx.com/q/ref/parsetrees/ 
http://code.kx.com/q/wp/parse_trees_and_functional_forms.pdf      / a very good whitepaper 
 
what is a parse tree ?
it is an expression in prefix form represented as a (nested) list, which is essentially its functional form.
one benefit is you can prepare a statement now and evaluate it later.
another benefit is parse[x] lets you (almost) construct the functional form of a given qsql statement.
 
### 
###  parse[x],  eval[x] 
### 
 
q)3+4       // infix form
7
q)+[3;4]    // prefix form
7

q)parse["3+4"]    // let's create a parse tree (tree == list, in this case)
+
3
4

q)-3!parse["3+4"]    // better visualization. yes this is functional form.
"(+;3;4)"

q)eval (+;3;4)       // you can eval parse tree, i.e. functional form with eval[x]
7


q)parse "3*4+5"      // an elem within a parse tree can be another parse tree
*
3
(+;4;5)              // this
 
 
// given this select statement, parse[x] lets you quickly construct a functional form ?[t;c;b;a] 
 
q)parse "select foo:c3,bar:c4 by c1 from t where c1 > 123, c2 in `ibm`msft`aapl"
?
`t 
,((>;`c1;123);(in;`c2;,`ibm`msft`aapl))          // how cool is this ! 
(,`c1)!,`c1 
`foo`bar!`c3`c4 
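
// to close the loop, here is a minimal sketch of actually running such a functional form.
// (the table tt and its columns are made up for illustration)

q)tt:([] c1:1 2 3; c2:`a`b`c)
q)?[tt; enlist (>;`c1;1); 0b; (enlist `c2)!enlist `c2]     // same as: select c2 from tt where c1>1
c2
--
b
c
q)(select c2 from tt where c1>1) ~ ?[tt; enlist (>;`c1;1); 0b; (enlist `c2)!enlist `c2]
1b
q)(select c2 from tt where c1>1) ~ eval parse "select c2 from tt where c1>1"
1b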
 
 
// NOTE: don't go crazy with parse[x], because it can hand you k rather than q (especially when the statement uses an adverb)
 
 
### 
###  value[x]   ==   get[x] 
### 
 
- executes x which is a q statement given as string 
 
q)value "3+4"
7

// a more realistic use is to construct a script name from the input data name and load that script
e.g.

 value "\\l process",string[inputName],".q";     // the "\\l" matters: a bare file name would just be parsed as a q expression
 
 
q)value[str] ~ eval parse[str] 
1b 
 
 
// if you give a list to value[], then it works as below 
 
q) f[arg 1;..;arg n] ~ value (f;arg 1;..;arg n) 
1b 
 
q)value ("2*"; 123) 
246 
 
NOTE: given a parse tree, value[] can actually evaluate it. (yes, confusingly its functionality overlaps with eval[]) 
      HOWEVER, it cannot handle nested parse tree, so don't use value[] as eval[] in general. 
e.g. 
 
q)value parse "3+4"
7
q)value parse "3+4*5"
'type

q)eval parse "3+4"
7
q)eval parse "3+4*5" 
23 
 
 
### 
###  reval[x] 
### 
 
it is eval[x], but evaluated as if the "-b" option had been given at q startup.
 
"-b" option = client cannot write/update on server. 
 
$  q -p 5432 -b      // on server side 
q)a:12 34 56 
 
q)h:hopen 5432       // on client side 
q)h "count a"
3
q)h "a:98 76 54"     // -b prevents client from write-access 
'noupdate 
 
 
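.z.pg:{reval(value;enlist x)}     // common usage: evaluate every incoming sync query read-only
                                  // (reval takes a parse tree; given a plain string it would just return the string)


note: reval[x] also blocks system commands, e.g. calls to "system".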
 
 
############################# 
####    dictionaries     #### 
############################# 
 
- a pair of lists, keys and values 
 
q) `a`b`c ! 123 456 789 
a| 123 
b| 456 
c| 789 
 
q)d: `a`b`c ! 123 456 789 
q)d 
a| 123 
b| 456 
c| 789 
 
q)type d                      # type is 99h 
99h 
 
q)d[`a]           # accessing val by key 
123 
 
q)d[`a`a`c`b] 
123 123 789 456 
 
q)d `a 
123 
 
q)d `a`a`c`b 
123 123 789 456 
 
q)foo: `a`c 
q)d foo        # you can use var to specify keys 
123 789 
 
q)d[`a]:999       # updating/amending val by key 
q)d 
a| 999 
b| 456 
c| 789 
 
q)d[`foo] : 123    # inserting a new key-val to a dict 
q)d 
a  | 999 
b  | 456 
c  | 789 
foo| 123 
 
 
- you can make a key/value a list too. 
e.g. 
 
q)123 456 789 ! (`abc`foo ; enlist `bar; `ken`pun) 
123| `abc`foo 
456| `bar 
789| `ken`pun 
 
q)(`abc`foo ; enlist `bar; `ken`pun) ! 123 456 789 
`abc`foo| 123 
`bar    | 456 
`ken`pun| 789 
 
- you can take this to further complexity, but be careful. 
e.g. 
 
q)d:(`a`b; `c`d`e; enlist `f)!10 20 30 
q)d `f 
30 
q)d ? 20 
`c`d`e 
q)d:`a`b`c!(10 20; 30 40 50; enlist 60) 
q)d `b 
30 40 50 
q)d ? 30 40 50 
`b 
q)d ? enlist 60 
`c 
 
 
note: although we usually don't care, positional order within key/val lists is significant. 
e.g. 
 
q)(`a`b`c!10 20 30)~`a`b`c!10 20 30 
1b 
 
q)(`a`b`c!10 20 30)~`a`c`b!10 30 20 
0b 
 
 
note: a common way to construct a dictionary for functional form aggr "a" in ?[t;c;b;a] is below 
 
q)a!a:`ticker`tradePrice`mdv21 
ticker    | ticker 
tradePrice| tradePrice 
mdv21     | mdv21 
 
q){x!x}`ticker`tradePrice`mdv21 
ticker    | ticker 
tradePrice| tradePrice 
mdv21     | mdv21 
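
// a minimal sketch of where that dictionary goes (the table tt is hypothetical, made up for illustration)

q)tt:([] ticker:`ibm`msft; tradePrice:1.5 2.5; mdv21:10 20; other:0 0)
q)?[tt; (); 0b; {x!x}`ticker`tradePrice]      // same as: select ticker, tradePrice from tt
ticker tradePrice
-----------------
ibm    1.5
msft   2.5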
 
 
### 
###  key[], value[], count[] 
### 
 
q)d: `a`b`c ! 123 456 789 
 
q)key d 
`a`b`c 
 
q)value d 
123 456 789 
 
q)count d
3
 
 
### 
###  empty, singleton dict 
### 
 
q) d:()!()                       # an empty dict 
q) d:(`symbol$())!`float$()      # a typed empty dict 
 
NOTE: to define a singleton dict, you must enlist atoms. because, recall, a dict is a pair of lists, NOT a pair of atoms. 
e.g. 
 
q)d: (enlist `a) ! enlist 123        # q) `a!123   creates an enumerated value for a link column. (not a dict) 
q)d 
a| 123 
 
 
### 
###  reverse lookup   (lookup key from value) 
### 
 
recall find operator "?" 
 
q)10 20 30 10 40 ? 10      # returns the index of the FIRST occurrence
0

q)d: `a`b`c`a ! 123 456 123 999
q)d ? 123                          # using "?" on dict gives you the FIRST key for the given value
`a

q)d ? 777         # for a non-existent val, you get a null of the key type (here the null symbol)
`
 
q)where d=123     # to get all keys for a given val 
`a`c 
 
 
### 
###  list VS dict 
### 
 
q)L: 123 456 789 
q)d: 0 1 2 ! 123 456 789    # if you define your keys 0,1,2,, then it's similar to a list 
q)L 2 
789 
q)d 2 
789 
q)L 2 0 0 1 
789 123 123 456 
q)d 2 0 0 1 
789 123 123 456 
 
q)d[3] : 999       # except you cannot add a key-val like this in a list 
q)d 
0| 123 
1| 456 
2| 789 
3| 999 
 
 
### 
###  non-uniq keys & values 
### 
 
q)d: `a`b`c`a ! 123 456 123 999       # non-uniq keys are allowed. 
q)d 
a| 123 
b| 456 
c| 123 
a| 999 
 
q)d[`a]        # but the usual lookup gives you only the FIRST value 
123 
 
q)d ? 123 
`a 
 
q)where d=123   # to get all keys for a given val 
`a`c 
 
NOTE: it is worth stopping for a moment to appreciate that, conceptually, list, dict and function in q are all essentially the same "map" operation.
 
L[i] -> v    // once you get this, you can do some powerful stuff with "where" operation, coupled with adverb. 
d[k] -> v 
f[x] -> y 
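
// a small sketch of the point above (L2, d2, f2 are throwaway names): same indexing syntax, same result

q)L2:10 20 30
q)d2:`a`b`c!10 20 30
q)f2:{10*1+x}
q)L2 0 2
10 30
q)d2 `a`c
10 30
q)f2 0 2
10 30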
 
 
### 
###  extracting & removing a sub-dictionary 
### 
 
q)d : `a`b`c`a ! 123 456 123 789 
 
q)`b`c # d        # to extract a sub-dict, syntax is (a list) # dict 
b| 456            # this is an overload of the take operator "#" 
c| 123 
 
q)`c # d          # the left operand MUST be a LIST 
'type 
 
q)(enlist `c) # d    # so you have to do this 
c| 123 
 
q)`b`a # d        # if duplicate keys, then only the FIRST match is returned 
b| 456 
a| 123 
 
q)`c _ d        # to remove a sub-dict, syntax is key _ dict 
a| 123          # looks like the left operand can be atom 
b| 456 
a| 789 
 
q)(enlist `c) _ d 
a| 123 
b| 456 
a| 789 
 
q)`c`a _ d 
b| 456 
 
q)`foo`bar # d     # extracting non-existent keys 
foo| 
bar| 
 
q)`foo`bar _ d     # removing non-existent keys, no change 
a| 123 
b| 456 
c| 123 
a| 789 
 
q)`a`b`c _ d      # removing all elems, gives you an empty dict 
q) 
 
q)`c`a cut d      # "cut" == "_"  in this case 
b| 456 
 
 
### 
###  cut VS "_" (aka drop or cut) 
### 
 
--> both are overloaded, so depending on the arguments they behave the same or differently.
    confusingly neither is a subset of the other. so they are really two diff functions with overlapping functionality.
 

#  x cut y    (use case 1)       //  "_" cannot do this 

 
if x is an integer atom 
if y is a list (or a table) 
 
then x cut y  returns y split into a list of x-elem-list(or a table) 
 
e.g. 
 
q)4 cut til 10 
0 1 2 3 
4 5 6 7 
8 9 
 
q)show t : ([] c1:`a`b`c`d; c2:12 34 56 78; c3:10 20 30 40) 
c1 c2 c3 
-------- 
a  12 10 
b  34 20 
c  56 30 
d  78 40 
 
q)show tblList : 3 cut t 
+`c1`c2`c3!(`a`b`c;12 34 56;10 20 30) 
+`c1`c2`c3!(,`d;,78;,40) 
 
q)tblList[0] 
c1 c2 c3 
-------- 
a  12 10 
b  34 20 
c  56 30 
 
q)tblList[1] 
c1 c2 c3 
-------- 
d  78 40 
 
 

#   x cut y     (use case 2)     // same as "_" 

 
if x is a non-decreasing list of integers in the domain til count y 
if y is a list (or a table) 
 
then x _ y  returns y cut at each index given in x 
 
e.g. 
 
q)0 2 4 9 _ `a`b`c`d`e`f`g`h`i`j`k`l`m 
`a`b 
`c`d 
`e`f`g`h`i 
`j`k`l`m 
 
q)4 9 _ `a`b`c`d`e`f`g`h`i`j`k`l`m 
`e`f`g`h`i 
`j`k`l`m 
 
q)4 4 _ `a`b`c`d`e`f`g`h`i`j`k`l`m 
`symbol$() 
`e`f`g`h`i`j`k`l`m 
 
 

#  x _ y      (use case 3 - aka "drop")     //  cut[x;y] cannot do this. 
#                                           //  easy way to remember is cut cannot drop 
 
if x is an integer atom 
if y is a list (or dictionary) 
 
then x _ y returns y without its first x elems (or without its last x elems, if x is negative)
 
e.g. 
 
q)4  _ `a`b`c`d`e`f`g`h`i`j`k`l`m     // remove first 4 elems 
`e`f`g`h`i`j`k`l`m 
 
q)-4 _ `a`b`c`d`e`f`g`h`i`j`k`l`m     // remove last 4 elems 
`a`b`c`d`e`f`g`h`i 
 
q)myfile            // a common use of drop "_" is to remove the leading colon from a file name
`:/tmp/foo.q 
q)1_string myfile   // now this is ready to be loaded 
"/tmp/foo.q" 
q)system "l ",1_string myfile       // like this 
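
// quick side-by-side of the three use cases (just a sketch on a toy list)

q)3 cut til 7                        // use case 1: atom on the left, cut into chunks of 3
0 1 2
3 4 5
,6
q)(0 3 cut til 7) ~ 0 3 _ til 7      // use case 2: list on the left, both do the same thing
1b
q)3 _ til 7                          // use case 3: atom on the left, "_" drops the first 3
3 4 5 6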
 
 
### 
###  operations on dictionaries 
### 
 
applying a (value modifying) function to a dictionary effectively applies it to the value list. 
e.g. 
 
q)d:`a`b`c!10 20 30 
q)neg d              # applying a monadic func to its values 
a| -10 
b| -20 
c| -30 
 
q)0.5 * d 
a| 5 
b| 10 
c| 15 
 
q)d=20 
a| 0 
b| 1 
c| 0 
 
q)sum d       # you can even do this 
60 
 
q)f:{x*x} 
q)f d          # same as  f[d] 
a| 100 
b| 400 
c| 900 
 
q)d1:`a`b`c!1 2 3 
q)d2:`a`b`c!10 20 30 
q)d1+d2                # here is an example of applying a dyadic func on two dicts 
a| 11                  # straight-forward 
b| 22 
c| 33 
 
q)d1:`a`b`c!1 2 3      # notice if keys are not identical, 
q)d2:`b`c`d!20 30 40   # then you get the "union" 
q)d1+d2 
a| 1 
b| 22 
c| 33 
d| 40 
 
similarly, you can do [in]equality tests.
notice, if a key exists on only one side, then null is substituted for the missing side.
 
q)(`a`b`c!10 20 30)=`b`c`d!20 300 400 
a| 0 
b| 1 
c| 0 
d| 0 
 
q)(`a`b`c!0N 20 30)=`b`c`d!20 300 0N    # notice 0N=0N 
a| 1                                    # here 
b| 1 
c| 0 
d| 1                                    # here 
 
q)(`a`b`c!10 20 30)<`b`c`d!20 300 400 
a| 0 
b| 0 
c| 1 
d| 1 
 
 
### 
###  join "," on dicts 
### 
 
joins two dicts. for overlapping keys, the value from the right dict overwrites the one from the left.
 
q)d1:`a`b`c!10 20 30   # notice key `c overlaps 
q)d2:`c`d!300 400 
q)d1,d2 
a| 10 
b| 20 
c| 300       # d2 prevailed 
d| 400 
 
note: as such, join "," is not commutative. d1,d2 and d2,d1 are different. 
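
// to see it, flip the operands of the example above (same d1 and d2)

q)d1:`a`b`c!10 20 30
q)d2:`c`d!300 400
q)d2,d1
c| 30        // this time d1 prevailed
d| 400       // and the key order follows the left operand
a| 10
b| 20
q)(d1,d2) ~ d2,d1
0b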
 
### 
###  coalesce "^" 
### 
 
similar to the join "," operator, "^" merges two dicts. for overlapping keys, the right dict prevails over the left.
the only diff is that a null value on the right does not overwrite the left value.
 
q)d1:`a`b`c!10 0N 30 
q)d2:`b`c`d!200 0N 400 
 
q)d1,d2 
a| 10 
b| 200 
c|         # see here 
d| 400 
 
q)d1^d2 
a| 10 
b| 200 
c| 30      # see here 
d| 400 
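
// note "^" is the same fill operator you use on plain lists (a small sketch): the left side fills the nulls of the right side

q)10 ^ 1 0N 3
1 10 3
q)0N 2 0N ^ 1 0N 3
1 2 3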
 
 
### 
###  column dictionary 
### 
 
a column dictionary = a dictionary where a key is a "symbol" whose value is a list (of the "same" length across keys) 
 
NOTE: the above definition must be strictly honored. repeat: key must be of symbol type, and the value must be equal length lists 
      yes, each value list can be general list (though general list is slow in general) 
 
e.g. 
 
q)`c1`c2!(`a`b`c; 10 20 30) 
c1| a b c 
c2| 10 20 30 
 
q)class:`name`iq!(`ken`tom`bob;72 34 96) 
q)class 
name| ken tom bob 
iq  | 72  34  96 
 
q)class[`name] 
`ken`tom`bob 
 
q)class[`name;]     # notice you can do this depth-indexing too 
`ken`tom`bob 
 
q)class[`name][1] 
`tom 
 
q)class[`name;1]    # equivalent depth-indexing 
`tom 
 
q)class[`iq][2] 
96 
 
q)class[`iq;2]      # depth-indexing 
96 
 
q)class[;1]         # see how powerful depth-indexing can be 
name| `tom 
iq  | 34 
 
 
NOTE: column dictionary with a single column needs the "enlist" spell 
e.g. 
 
q)d : (enlist `name) ! (enlist `ken`tom`bob) 
q)d 
name| ken tom bob 
 
 
NOTE: similarly, even if you have multiple columns, per column dictionary definition, each column MUST have a list as its value. 
 
e.g. 
 
q)`a`b`c!12 34 56      // is this a dictionary ?  - yes 
a| 12                  // is this a column dictionary  - no 
b| 34 
c| 56 
 
q)flip `a`b`c!12 34 56     // thus you cannot flip 
'rank 
 
q)flip `a`b`c!(enlist 12; enlist 34; enlist 56)     // yes, it is a column dict, so you can flip 
a  b  c 
-------- 
12 34 56 
 
 
NOTE: this comes up even if you are defining a table in a simpler syntax. 
 
q)([] a:12; b:34; c:56)         // invalid syntax 
'rank 
 
q)([] a:enlist 12; b:enlist 34; c:enlist 56)        // valid syntax 
a  b  c 
-------- 
12 34 56 
 
q)([] a:enlist 12; b:34; c:56)         // you can get away with this lazy syntax, but technically the above is more proper. 
a  b  c 
-------- 
12 34 56 
 
 
### 
###  flipping a column dictionary     (== a table) 
### 
 
flip == transpose 
 
q)class 
name| ken tom bob 
iq  | 72  34  96 
 
q)flip class     # flipped column dictionary == table 
name iq 
------- 
ken  72 
tom  34 
bob  96 
 
q)t:flip class
q)t[1]          # effectively, the indexing depth is reversed
name| `tom      # i.e. that's what transposing a matrix means
iq  | 34
 
q)t[1;`name] 
`tom 
 
 
q)class ~ flip flip class    # NOTE: flipping a column dict twice gives you the original dict 
1b 
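
// a small sanity check of the above (tbl is a throwaway name)

q)class:`name`iq!(`ken`tom`bob;72 34 96)
q)tbl:flip class
q)type class        // dictionary
99h
q)type tbl          // table
98h
q)tbl ~ ([] name:`ken`tom`bob; iq:72 34 96)
1b
q)first tbl         // each row of a table is a dictionary
name| `ken
iq  | 72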
 
 
 
########################## 
####     function     #### 
########################## 
 
side effects: when you access outside resource from within function definition. obviously side effects should be very carefully controlled.

pure functional programming paradigm doesn't allow side effects. but in q, you can do side effects, i.e. a q func can modify global variables. (not that you should)
 
## 
##  syntax - function definition 
## 
 
q) f:{[a;b] a:a*2 ; b:b*b ; a+b}    #  {[input_arg_1; input_arg_2,,,]  expression_1 ; expression_2 ;,,, expression_n} 
                                    #  the output is the final expression 
                                    #  NOTE there is no semicolon ";" after the final expression. if you put a ";" then nothing gets returned. 
                                    #       (a very common newbie mistake) 
 
q) f[3;4]      # invoking. func_name[input_1; input_2;,, input_8] 
22 
 
NOTE: as of q3.5, a q func can take at most 8 input args.
      a common work-around is to supply input args in a list or a dictionary.
NOTE: the number of input args is aka the "valence", "degree" or "rank" of a function. (more on this later)
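
// a minimal sketch of that work-around (the func and the param names are made up for illustration)

q)calc:{[p] p[`a]+p[`b]*p`c}       // a single dict param p carries as many "args" as you like
q)calc `a`b`c!1 2 3                // 1+2*3
7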
 
q)f:{x*x}      # you can omit input param, then q assumes [x;y;z]
q)f[3]         # they are implicit parameters
9
 
q)f:{x+y+z} 
q)f[3;4;5] 
12 
 
q){x*y}        # a function is just data, so you can define without its name 
{x*y} 
 
q){x*x}[4]     # and invoke it like this. anonymous functions == lambda func/expression 
16 
 
q){a:x*x; b:y*y; a+b}[3;4]     # another example 
25 
 
q)g:{x*z}      # note you should be careful when using implicit params like this.
q)g[2;3;4]     # this func implicitly requires 3 input args, with the 2nd param being ignored.
8
 
function "juxtaposition" - if one argument func (aka monadic function), then you can do without brackets. but you need a whitespace. 
e.g. 
 
q){x*x} 5 
25 
q)f:{x*x} 
q)f 5 
25 
 
q){x*y} 3 4     # you still need brackets if you pass more than one input arg
{x*y}[3 4]      # not what you wanted: the list 3 4 became x, giving a projection that still waits for y
 
q){x*y}[3;4]    # correct syntax 
12 
 
 
NOTE: if you write a q statement that spans across multiple lines (e.g. define your function in multiple lines), you need indentation (of a whitespace) 
 
e.g. 
 
f:{[x;y] 
 z:x+y; 
 z*z 
 }       // NOTE you MUST indent this closing bracket also. it's a common newbie mistake. 
         // especially if you have been coding in other languages like C, Perl, etc. 
 
foo: 
 bar:123;    //  same as  foo:bar:123;    notice a whitespace before bar 
 
 
## 
##  a niladic function 
## 
 
a function of 0 degree. i.e. no input arg. 
(1) a func that returns a constant 
(2) a func that modifies/references a global variable 
 
q)f:{[] 42} 
q)f[] 
42 
 
q)a:123 
q)f:{a*2} 
q)f[] 
246 
 
## 
##  functions with no return value 
## 
 
just add a semi-colon at the end. 
e.g. 
 
q)fvoid:{[x] `a set x;} 
q)fvoid 42 
q) 
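
// strictly speaking, such a function still returns something: the generic null (::), which the console does not display.
// a quick check:

q)fvoid:{[x] `a set x;}
q)(::) ~ fvoid 42
1b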
 
 
### 
###  application-by-name 
### 
 
if a function is defined as a global variable, you can apply it by its symbol name. 
e.g. 
 
q)f:{x*x} 
 
q)f 5         # called lookup-by-name 
25 
 
q)`f 5        # called application-by-name 
25 
 
 
### 
###  call-by-name 
### 
 
q)f:{x*x} 
q)f 5       # this is called "call-by-value"
25          # the input arg (even if it's a variable) is evaluated to a value when the func is invoked.
            # so any manipulation you do inside the func is on a local copy.
            # this can be problematic if the data is huge, or if you want to change a global var directly: you don't wanna create a local copy.
 
"call-by-name" == "call-by-reference" 
 
you reference a variable by its symbol format 
e.g. 
 
q)a:123 
q)get `a       # get returns the value of a global var 
123 
 
q)`a set 456    # set is a dyadic func that lets you change the value of a global var 
`a              # notice the output is the symbol name. it's not error msg (which starts with a single quote) 
q)a 
456 
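
// putting get and set together: a sketch of a func that takes the NAME of a global and amends it, instead of copying the value in
// (double and L are made-up names)

q)L:1 2 3
q)double:{[nm] nm set 2*get nm}
q)double `L
`L
q)L
2 4 6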
 
 
### 
###   local VS global variables 
### 
 
q)a:123             # "a" is global 
q)f:{b:x*2 ; b}     # "b" is local 
q)f a 
246 
 
note: call-by-name only works on global var. 
note: there is no lexical scoping in q.  i.e. local var is only available within its immediate scope of definition. 
 
q)foo:123 
q)f:{foo * 2}     # global var is visible inside functions 
q)f[] 
246 
 
q)a:34              # to change global var value 
q)f:{`a set x*x}    # use "set" 
q)f 5 
`a 
q)a 
25 
 
 
### 
###  identity function ::
### 
 
it returns the input as output. 
e.g. 
 
q)::[123] 
123 
 
q)::[12 34 56] 
12 34 56 
 
q)::[`a`b`c] 
`a`b`c 
 
q):: 123       # you cannot do juxtaposition with identity func

 
### 
###  function as data 
### 
 
recall all q functions are anonymous (aka "lambda" aka nameless function), and you can only assign to variables. 
another important point is all q functions are just data. 
e.g. you can treat them as item in a list. 
 
q)(123 ; `ibm ; {x*x}) 
123 
`ibm 
{x*x} 
 
q)(123 ; `ibm ; {x*x})[2] 
{x*x} 
 
q)(123 ; `ibm ; {x*x})[2][5]      # [2] refers to {x*x} 
25                                # [5] is an input arg 
 
q)(123 ; `ibm ; {x*x})[2;5]      # depth indexing works also 
25 
 
q)f:{x*x} 
q)(f ; neg ; abs)[0; 5]      # invoking f[5] 
25 
 
q)(f ; neg ; abs)[1; 3.14]   # invoking neg 3.14 
-3.14 
 
q)(f ; neg ; abs)[2; -3.14]  # invoking abs -3.14
3.14 
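
// a natural next step (just a sketch, ops is a made-up name): keep functions as dictionary values and pick one by key

q)ops:`sq`neg`abs!({x*x}; neg; abs)
q)ops[`sq] 5
25
q)ops[`neg] 3
-3
q)ops[`abs][-3.14]
3.14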
 
### 
###  "higher order function" (aka "adverb" in q) 
### 
 
another implication of functions being data is you can use a func as input to other func. 
"higher order function" - to take a function and produce a related function. 
 
q)apply:{x y}     # here "x" is assumed to be a func, and "y" is its input 
q)sq:{x*x} 
q)apply[sq; 5]    # invoking sq[5] 
25 
 
 
### 
###   projection        // partially applied function 
### 
 
projection == partially specifying params of a func, resulting in a func of remaining params. 
 
e.g. 
 
q)f:{x+y} 
q)f[3]       # here you only specified the first param x 
{x+y}[3]     # the result is a function that needs y specified 
 
q)f[3][4]    # so you can do this. same as f[3;4]
7

q)f[3;]       # as a good style, you write semi-colon, so we know which param is being skipped
{x+y}[3;]
q)f[3;][4]    # like this
7
q)f[3;] 4     # juxtaposition
7
 
q)g:f[;4]    # here you used projection to only specify the 2nd param 
q)g[3]       # then you later can specify the 1st param 
7             # NOTE: after you define g like this, and change f definition, g remains unchanged. 
 
q)g:5 +       # you can do projection using built-in operators also
q)g[3]        # but only the left operand
8
 
q)f:{x+y+z}    # another example 
q)g:f[;3;]     # specifying the 2nd param 
q)g[1;5]       # specifying the 1st and 3rd params 
9              # you can project in any order 
 
q){x+y+z}[;;3][1;] 2      # you can do it like this also
6
 
NOTE: it is common to see q code  f[x] y  instead of  f[x;y] 
      because sometimes it helps readability 
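
// about the NOTE above that g does not follow later changes to f: a quick check (a sketch with throwaway definitions)

q)f:{x+y}
q)g:f[;4]      // g captures the current definition of f
q)f:{x*y}      // redefine f
q)g 3          // g still adds
7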
 
NOTE: often we see projection done using parenthesis 
 
q)f:(3*)     // yes, annoyingly, this is valid. some people write projection this way. so be familiar. 
q)f 
*[3] 
 
q)f:{3*x}    // essentially the same thing as this 
q)f 
{3*x} 
 
q)f:{*[3;x]} // these are all equivalent
q)f:*[3]     // 
 
NOTE: the above parenthesis-style projection may be used as adverb. (can be veeery cryptic) 
 
q)f:{3*x} 
q)f\[5;4]             // recall a typical use of scan 
4 12 36 108 324 972 
 
q)5 f\ 4              // infix form 
4 12 36 108 324 972 
 
q)5 {3*x}\ 4          // lambda infix 
4 12 36 108 324 972 
 
q)5 (3*)\ 4           // yes... this is valid also... some people write this way.. so be familiar.. 
4 12 36 108 324 972 
 
q)*[3]\[5;4]          // projection prefix 
4 12 36 108 324 972 
 
q)5 *[3]\ 4           // projection infix 
4 12 36 108 324 972 
 
 
### 
###   everything is a map 
### 
 
list, dictionary, functions, it's all just mapping between domain and range. 
that's why the indexing is all the same. 
also from this math conceptual standpoint, out-of-bound index should simply return null. instead of error. 
e.g. 
 
q)10 20 30 40[100] 
0N 
 
q)`a`b`c[-1]
`
 
q)(1.1; 1; `1)[3] 
0n 
 
q)d:`a`b`c!10 20 30 
q)d[`x] 
0N 
 
 
### 
###  atomic function 
### 
 
recall how atomic functions automatically extend to lists. it's essentially "foreach" in imperative languages.
 
e.g.       (it's easy to understand with atomic & monadic functions like this) 
 
q)neg 10 
-10 
q)neg 10 20 30 
-10 -20 -30 
q)neg (10 20 30; 40 50) 
-10 -20 -30 
-40 -50 
q)neg `a`b`c!10 20 30 
a| -10 
b| -20 
c| -30 
q)neg `a`b`c!(10 20; 30 40 50; 60) 
a| -10 -20 
b| -30 -40 -50 
c| -60 
 
 
===> what happens with dyadic functions ? 
 
1st arg | 2nd arg 
----------------- 
  atom  | atom      e.g.  3 + 4 
  atom  | list 
  list  | atom      e.g.  10 20 30 40 ? 20 
  list  | list 
 
if you can fix the non-atomic param, then you can project to an atomic monadic func. 
 
1st arg | 2nd arg 
----------------- 
  atom  | atom      e.g.  3 + 4 
  atom  | list      e.g.  1 + 10 20 30 
  list  | atom      e.g.  10 20 30 + 1 
  list  | list      e.g.  1 2 3 + 10 20 30     # here, two lists must be of the same length 
 
q)3 + 4
7
q)1 + 10 20 30 
11 21 31 
q)10 20 30 + 1 
11 21 31 
q)1 2 3 + 10 20 30 
11 22 33 
 
 
Revelation: a function that consists of atomic functions is atomic. 
e.g. 
 
q)f:{(x*x)+(2*x)-1}        # atomic monadic 
q)f 0 
-1 
q)f til 10 
-1 2 7 14 23 34 47 62 79 98 
 
q)pyth:{sqrt (x*x)+y*y}     # atomic dyadic 
 
q)pyth[1; 1] 
1.414214 
 
q)pyth[1; 1 2 3] 
1.414214 2.236068 3.162278 
 
q)pyth[1 2 3; 1 2 3] 
1.414214 2.828427 4.242641 
 
 
### 
###  adverb  (applying non-atomic function in a foreach way) 
### 
 
adverbs == higher order functions (=functions that modify other functions) on lists. 
 
note: adverbs are extremely powerful. 
 
case 1:  monadic each (aka foreach)     # "each" 
case 2:  each-both                      #   ' 
case 3:  each-left                      #   \: 
case 4:  each-right                     #   /: 
case 5:  over                           #   / 
case 6:  scan                           #   \ 
 
## 
##  adverb - case 1  "monadic each"        (aka foreach, aka map) 
## 
 
here is a non-atomic monadic function "count". 
you wanna apply it to each item in a list (instead of a whole list) 
 
q)count 10 20 30
3

q)count (10 20 30 40 ; `a`b`c)
2
 
q)count each (10 20 30 40 ; `a`b`c)      # each is a dyadic func.  infix syntax  a_func each a_list 
4 3                                      # or, you can use prefix syntax  each[a_func ; a_list] 
 
q)each[count ; (10 20 30 40 ; `a`b`c)]   # like this 
4 3 
 
q)each[count] (10 20 30 40 ; `a`b`c)     # juxtaposition 
4 3 
 
q)each[each[count]] (10 20 30 40 ; `a`b`c)    # two levels 
1 1 1 1 
1 1 1 
 
q)(count each) each (10 20 30 40 ; `a`b`c) 
1 1 1 1 
1 1 1 
 
note   count        is a verb
       each         is the adverb
       each[count]  is the derived function (a new verb)
 
## more examples 
 
q)neg  (1 2 3 ; 10 20) 
-1 -2 -3 
-10 -20 
 
q)neg each (1 2 3 ; 10 20)     # applying each on atomic function is redundant 
-1 -2 -3 
-10 -20 
 
## more example 
 
q)reverse "hello" 
"olleh" 
 
q)reverse ("hello" ; "world" ; "foo" ; "bar") 
"bar" 
"foo" 
"world" 
"hello" 
 
q)reverse each ("hello" ; "world" ; "foo" ; "bar") 
"olleh" 
"dlrow" 
"oof" 
"rab" 
 
## more example 
 
q)enlist 1001 1002 1004 1003     # here is a common way to convert a list of N items 
1001 1002 1004 1003              # into a 1-by-N matrix 
 
q)flip enlist 1001 1002 1004 1003     # flip to get a N-by-1 matrix 
1001 
1002 
1004 
1003 
 
q)enlist each 1001 1002 1004 1003     # you can do this with "each" also 
1001 
1002 
1004 
1003 
 
## more example 
 
q)show t:([] id:123 456; name:("foo";"kenics")) 
id  name 
------------ 
123 "foo" 
456 "kenics" 
 
q)update len:count each name from t 
id  name     len 
---------------- 
123 "foo"    3 
456 "kenics" 6 
 
 
## 
##  adverb - case 2  "each-both"  ' 
## 
 
q)1 2 3+10 20 30      # this is each both 
11 22 33              # happens automatically with atomic functions 
 
q)`a`b`c , `d`e`f     # join operator "," is non-atomic dyadic 
`a`b`c`d`e`f 
 
q)`a`b`c ,' `d`e`f    # turning non-atomic to atomic
a d                   # notice the single quote
b e                   #  ,  is a verb
c f                   #  '  is the adverb, and ,' is the derived verb
 
q)"abc","de" 
"abcde" 
 
q)("abc"; "uv"),'("de"; "xyz") 
"abcde" 
"uvxyz" 
 
q)"abc" ,' "de"     # notice if you do a_list atomic_func a_list, the length must match. 
'length 
 
q)1 ,' 10 20 30     # this is a common technique 
1 10                # like you get a dir name and a list of file names, to construct full path for each file 
1 20                # BUT think about it, is this each-both ?
1 30                # in a way, this is each-right 
 
q)1 ,/: 10 20 30    # in fact, each-right works 
1 10 
1 20 
1 30 
 
q)1 2 3 ,' 10       # again, this is each-both 
1 10                # but each-left works also 
2 10 
3 10 
 
q)1 2 3 ,\: 10      # each-left 
1 10 
2 10 
3 10 
 
q)(` sv) each `:/tmp ,' `foo`bar`baz 
`:/tmp/foo`:/tmp/bar`:/tmp/baz 
 
q)f:{x+y}    # here is just a simplistic custom func
q)f[2;3]
5
q)2 f 3     # recall our custom func cannot be used infix way 
'type 
 
q)2 f' 3 4 5     # BUT adverb lets you use it infix way ! (know this, otherwise you get confused reading others code) 
5 6 7 
 
q)2 # ("abcde"; "fgh"; "ijklm") 
"abcde" 
"fgh" 
 
q)2 #' ("abcde"; "fgh"; "ijklm") 
"ab" 
"fg" 
"ij" 
 
q)#' [2 ; ("abcde"; "fgh"; "ijklm")]     # as always, you can use prefix syntax 
"ab" 
"fg" 
"ij" 
 
q)t1:([] c1:1 2 3)      #  ,' adverb is a common way to concatenate two tables 
q)t2:([] c2:`a`b`c) 
q)t1,'t2 
c1 c2 
----- 
1  a 
2  b 
3  c 
 
q)show a:5?10 
1 2 8 5 6 
q)show b:5?10 
8 4 0 1 5 
q)f:{x+y+z} 
q)f[;;3]'[a;b]        // get familiar with this syntax, and appreciate the power of each-both 
12 9 11 9 14 
q)a f[;;3]'b          // infix form is also acceptable 
12 9 11 9 14 
 
## 
##  adverb - case 3  "each-left"  \: 
## 
 
applying the 2nd arg to each of the first arg. 
syntax is \: 
 
q)1 2 3 + 10    # this is each-left  (also each-both in this case) 
11 12 13 
 
q)1 2 3 +\: 10     # notice the syntax \: 
11 12 13           # obviously, it's redundant for atomic function 
 
q)1 2 3 + 10 20 
'length 
 
q)1 2 3 +\: 10 20    # compare this to each-right 
11 21                # notice how 1 2 3 are treated as arguments in this case 
12 22 
13 23 
 
q)1 2 3 +/: 10 20    # each-right 
11 12 13             # notice how 10 20 are treated as arguments in this case 
21 22 23 
 
 
q)("abc"; "de"; enlist "f") ,\: ">"    #  ,    is a verb
"abc>"                                 #  \:   is the adverb, and ,\: is the derived verb
"de>" 
"f>"                                   # as you can imagine, each-left each-right combo is used to construct xml 
 
q)"<" ,/: ("abc"; "de"; enlist "f") ,\: ">" 
"<abc>" 
"<de>" 
"<f>" 
 
## 
##  adverb - case 4  "each-right"  /: 
## 
 
applying the 1st arg to each of the 2nd arg. 
syntax is  /: 
 
q)10 + 1 2 3 
11 12 13 
 
q)10 +/: 1 2 3 
11 12 13 
 
q)10 1000 + 1 2 3 
'length 
 
q)10 1000 +/: 1 2 3 
11 1001 
12 1002 
13 1003 
 
q)"</" ,/: ("abc"; "de"; enlist "f") 
"</abc" 
"</de" 
"</f" 
 
q)"</" ,/: ("abc"; "de"; enlist "f") ,\: ">"    # a neat way to construct html/xml tags 
"</abc>" 
"</de>" 
"</f>" 
 
## 
##  x cross y        // cross product 
## 
 
q)1 2 3,/:\:10 20 
1 10 1 20 
2 10 2 20 
3 10 3 20 
 
q)raze 1 2 3,/:\:10 20 
1 10 
1 20 
2 10 
2 20 
3 10 
3 20 
 
q)1 2 3 cross 10 20    # join with each-right & each-left 
1 10                   # returns all possible combinations of elements of both x and y 
1 20 
2 10 
2 20 
3 10 
3 20 
 
q)(cross/)(2 3;10;"abc") 
2 10 "a" 
2 10 "b" 
2 10 "c" 
3 10 "a" 
3 10 "b" 
3 10 "c" 
 
q)s:`ibm`msft`aapl 
q)v:1 2 
q)([]sym:s)cross([]val:v) 
sym  val 
-------- 
ibm  1 
ibm  2 
msft 1 
msft 2 
aapl 1 
aapl 2 
 
q)show deck:"A234567890JQK" cross "dhcs"      // prepare a deck of cards 
"Ad" 
"Ah" 
"Ac" 
"As" 
"2d" 
"2h" 
"2c" 
"2s" 
"3d" 
"3h" 
"3c" 
"3s" 
"4d" 
"4h" 
.. 
 
q)10 # sdeck: -52?deck        // shuffled deck 
"3h" 
"0d" 
"3c" 
"8d" 
"8c" 
"9d" 
"Kc" 
"Qd" 
"4d" 
"6h" 
 
q)deal:{[deck;n] {[sd;n;x] sd where (til[52] mod n) = x}[-52?deck;n] each til n}      // deal to 8 players 
q)deal[deck;8] 
("4s";"9h";"2c";"8d";"5h";"Ac";"Qd") 
("6s";"As";"7d";"Js";"Ad";"Kh";"4h") 
("8h";"3s";"0c";"2h";"Qs";"4c";"9s") 
("3c";"6h";"9c";"Kc";"Ks";"5c";"4d") 
("5d";"6c";"Kd";"2d";"3d";"2s") 
("0s";"7s";"Qc";"8s";"8c";"0h") 
("Jh";"5s";"9d";"Jc";"Ah";"3h") 
("6d";"0d";"Jd";"7h";"Qh";"7c") 
 
 
## 
##  adverb - over "/"      # for accumulation 
## 
let's think about adding up all elements in a list.
we should be able to apply the "+" operator repeatedly across all elems, and accumulate the result. this is called "fold" or "reduce" or "accumulate" in functional programming, aka "over" in q.
(need to begin with an initial value) 
 
q)f:{x+y}   // this is exactly what +[x;y] does

q)f[3;4]    // f[x;y]
7
 
q)f/[3;4 5 6]     // f/[x;Y] =  f[f[f[x;Y[0]]; Y[1]]; Y[2]] 
18 
 
NOTE:  "f" itself may be a dyadic func (like "+" is) or a monadic function (like "neg" is).
       let's not confuse the syntax for the two.
 
if f is monadic: 
 
    f/[x]      =  f[ f[ f[x] ] ]    // apply f to x until (1) the result matches the prev result, OR (2) the result is the same as the original input value
                                    // sometimes this is called "converge" instead of "over"  - obviously be careful not to create an infinite loop
    f/[n;x]    =  f[ f[ f[x] ] ]    // apply f to x n times, where n = non-negative int
    f/[cond;x] =  f[ f[ f[x] ] ]    // apply f to x while cond returns 1b, i.e. until cond returns 0b    // sometimes this is also called "converge" instead of "over"
                                    // i.e. cond is a monadic predicate function that returns 1b or 0b
 
if f is dyadic: 
 
     f/[x;Y] =  f[f[f[x;Y[0]]; Y[1]]; Y[2]]         // x is init value 
     f/[Y]   =  f[ f[f[Y[0];Y[1]]; Y[2]]; Y[3]]     // Y[0] is init value 
 
if f multivariate: 
 
     f/[x;Y;Z] =  f[f[ f[x;Y[0];Z[0]]; Y[1]; Z[1]]; Y[2]; Z[2]]    // x is init value 
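
note: one tiny example per form listed above. these are my own illustrations and the lambdas are arbitrary.

q){x*2}/[3;1]                 // monadic f, 3 times: 1 -> 2 -> 4 -> 8
8
q){x*2}/[{x<100};1]           // monadic f, repeated while the condition holds
128
q){x+y}/[100;1 2 3]           // dyadic f with initial value 100
106
q){x+y}/[1 2 3]               // dyadic f, the first item is the initial value
6
q){x+y*z}/[0;1 2 3;4 5 6]     // 3-arg f: a running dot product, 0+(1*4)+(2*5)+(3*6)
32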
 
 
 
NOTE:  f itself may be dyadic or monadic but that's a separate thing from whether (f/) is used in a dyadic or monadic way.
 
      e.g.  (+/)[1 2 3 4 5]  = 15     // here "+" is dyadic, but (+/) is used in a monadic way 
 
e.g. 
 
q)0 +/ 1 2 3 4 5     # here, 0 is the initial value in accumulator, function is "+" and adverb "/" 
15                   # "+" is dyadic, and adverb (+/) is used in a dyadic way 
 
q)(+/)[0; 1 2 3 4 5 6 7 8 9 10]     # prefix form works too 
55 
 
q)0 +/ 1+til 10     # sum of 1 ~ 10 
55 
 
q)0 {x+y}/ 1 2 3 4 5    # recall, "+" is just {x+y}, and we can use our own function too. 
15 
 
q)f:{2*x+y}            # here you can combine over "/" with your own function 
q)100 f/ 1 2 3 4       # notice how you can use adverb infix, but you cannot use f itself infix. 
1652 
 
q)(+/) 1 2 3 4 5    # here, if you don't need to specify an init value, then it takes the first elem in the list as the init value 
15 
 
q)+/[1 2 3 4 5 6 7 8 9 10]     # again, prefix form works 
55 
 
note: as above, "/" is "over" operator that lets you convert a given function into a new func that accumulates across the original list, into a single atom result. 
 
q)(*/) 1+til 10   # products 
3628800 
 
q)(|/) 20 10 40 30   # max 
40 
 
q)(&/) 20 10 40 30   # min 
10 
 
NOTE: because the above adverbs are so common, they have their own names, as below. 
 
q)sum 1+til 10      # same as +/ 
55 
q)prd 1+til 10      # same as */ 
3628800 
q)max 20 10 40 30   # same as |/ 
40 
q)min 20 10 40 30   # same as &/ 
10 
 
NOTE: you can implement a power function like below. (multiplicative way of implementing) 
 
q)(*/) 2#1.47     # i.e. 1.47^2  (1.47 multiplied by itself 2 times)
2.1609
 
q)n:5 
q)(*/) n#10      # i.e. 10^5  (10 multiplied by itself n=5 times)
100000 
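
note: for comparison, the built-ins prd and xexp give the same number (a sketch; xexp always returns a float).

q)prd 5#10
100000
q)10 xexp 5
100000f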
 
note: in functional programming, "over" aka "fold" or "reduce" 
 
 
q)(,/)((1 2 3; 4 5); (100 200; 300 400 500))     # here you combine concatenate "," with over "/" 
1 2 3                                            # and got rid of the top level of nested lists 
4 5                                              # this is such a common operation 
100 200                                          # thus has a name "raze" 
300 400 500 
 
q)raze ((1 2 3; 4 5); (100 200; 300 400 500))    # raze == ,/ 
1 2 3 
4 5 
100 200 
300 400 500 
 
note: raze f each x   is a very common use pattern in q code 
 
note: you can use literally "over" instead of "/" but syntax is slightly diff. 
e.g 
q){x & y} over 12 -34 56 
-34 
q)(&) over 12 -34 56 
-34 
q)over[&;12 -34 56] 
-34 
 
 
NOTE: so far, what we saw above is applying f[x;y] recursively when given x and a list of y values, e.g. f/[x;Ylist] 
      but what if you want to apply f[x] recursively to its result N times (without Y list) ? 
      syntax is as follows: 
e.g. 
 
f/[N;x] 
 
q)f:{x,x} 
q)f[1 2] 
1 2 1 2 
q)f/[1;1 2]         // applying it 1 time 
1 2 1 2 
q)f/[2;1 2]         // twice. i.e.  f[f[x]] 
1 2 1 2 1 2 1 2 
q)f/[3;1 2]         // 3 times. i.e.  f[f[f[x]]] 
1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 
 
 
====> now we are ready to implement fibonacci function (popular interview question) 
 
q)f:{x,sum -2#x} 
 
q)f[1 1 2 3 5]        // without recursion 
1 1 2 3 5 8 
 
q)f/[10;1 1]                     // apply f 10 times, starting from 1 1
1 1 2 3 5 8 13 21 34 55 89 144

q)10 f/ 1 1                      // infix form works too
1 1 2 3 5 8 13 21 34 55 89 144 
 
 
 
q)f:{reverse sums x}       // here is a fancy alternative way 
 
q)f/[10;1 1] 
144 89 
 
q)f\[10;1 1] 
1   1 
2   1 
3   2 
5   3 
8   5 
13  8 
21  13 
34  21 
55  34 
89  55 
144 89 
 
q)last each f\[10; 1 1] 
1 1 2 3 5 8 13 21 34 55 89 
 
q)f:{raze (1;0 1) x}         // here is another fancy alternative 
q)g:{sum f/[x;0]}            // you can write in one shot like  g:{sum {raze (1;0 1) x}/[x;0]} 
q)g each 1+til 12 
1 1 2 3 5 8 13 21 34 55 89 144 
 
 
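NOTE: another variant of over "/" lets you specify a "while" condition (a monadic predicate). f is applied as long as cond returns 1b.
      syntax is:    f/[cond;x]       (the first examples use scan "\" with the same syntax, so that the intermediate values are visible)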
e.g. 
 
q){x+x}\[10000>;2] 
2 4 8 16 32 64 128 256 512 1024 2048 4096 8192 16384 
 
q){x+x}\[@[>;10000];2]                                 // let's practice general apply @[] 
2 4 8 16 32 64 128 256 512 1024 2048 4096 8192 16384 
 
q)@[>;10000] 3 
1b 
q)@[>;10000] 12345 
0b 
 
q){x,sum -2#x}/[140>last@; 1 1] 
1 1 2 3 5 8 13 21 34 55 89 144 
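
note: another while-style sketch (my own example, the name collatz is made up): the Collatz sequence. the predicate 1< keeps iterating while the value is above 1.

q)collatz:{$[x mod 2; 1+3*x; x div 2]}
q)collatz\[1<;6]        // scan "\" shows every step
6 3 10 5 16 8 4 2 1
q)collatz/[1<;6]        // over "/" gives only the final value
1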
 
 
### 
###  scan "\"    # backslash   (another kind of higher-order func) 
### 
 
"\" can augment both dyadic and monadic functions. we distinguish as follows; 
   scan "f\"  if f is dyadic 
iterate "f\"  if f is monadic 
 
similar to "over" with "/", "scan" with "\" converts a func into a high-order func. 
the only diff is the result is a list, containing all intermediate values and the final value. 
 
i.e.  (f\)[a list x1, x2,,, xn]  returns  x1, f(x1,x2), f(f(x1,x2),x3),,, f(..,xn) 
 
e.g. 
 
q)(+\) 1+til 10 
1 3 6 10 15 21 28 36 45 55 
 
q)(*\) 1+til 10 
1 2 6 24 120 720 5040 40320 362880 3628800 
 
q)(|\) 20 10 40 30 
20 20 40 40 
 
q)(&\) 20 10 40 30 
20 10 10 10 
 
NOTE: just like "over", some scan operations are so common, they have their own names 
 
q)sums 1+til 10       # same as +\ 
1 3 6 10 15 21 28 36 45 55 
 
q)prds 1+til 10       # same as *\ 
1 2 6 24 120 720 5040 40320 362880 3628800 
 
q)maxs 20 10 40 30    # same as |\ 
20 20 40 40 
 
q)mins 20 10 40 30    # same as &\ 
20 10 10 10 
 
 
note: you can literally use "scan" instead of "\"  but syntax is slightly diff. 
e.g. 
q){x & y} scan 12 -34 56 
12 -34 -34 
 
 
NOTE: assume f is a dyadic func. and if (f\) is called on [x;y] i.e. dyadic input, then x is the initial value, and y is an input list. 
      dont confuse f being a dyad func, and (f\) being used as dyad. two separate things. 
 
e.g. 
 
q)(+\)  3 5 7    //  f\ being used as monadic, where the first item 3 serves as the init value
3 8 15

q)(+\)[0 ; 3 5 7]   //  dyadic way of using f\
3 8 15 
 
q)(+\)[1 ; 3 5 7] 
4 9 16 
 
q)1 +\ 3 5 7      // you can write this way too. again it's just simply invoking f\ in a dyadic way 
4 9 16 
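
note: a typical use of an initial value with scan (my own example): a running balance from an opening balance and a list of signed cash flows.

q)100 +\ 20 -50 30       // opening balance 100
120 70 100
q)100 +/ 20 -50 30       // over "/" keeps only the closing balance
100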
 
 
## 
##  iterate "\"          # yet another backslash overload 
## 
 
recall: 
    scan "f\" is when f itself is a dyadic func. 
 iterate "f\" is when f itself is a monadic func.
 
iterate  (f\)[x]  repeatedly calls on its previous result until the result is the same as prev result OR the result matches the original input 
 
e.g. 
 
q)(neg\) -5       // terminated because it matched the orig input 
-5 5 
 
q)(rotate[1]\) "kenics"       // you get the idea 
"kenics" 
"enicsk" 
"nicske" 
"icsken" 
"cskeni" 
"skenic" 
 
note: if iterate f\ is used in a dyadic way, e.g. (f\)[x;y]
      if x is a non-negative integer, then it is the number of iterations
      OR x can be a "while" condition
 
q)(neg\)[5;-1] 
-1 1 -1 1 -1 1 
 
q)f:2*        // notice f is a monadic func that doubles the input 
q)f 7 
14 
q)(f\)[999>; 1]                      // think about "while" condition here 
1 2 4 8 16 32 64 128 256 512 1024    // it is essentially just a monadic function that evaluates to 1b or 0b
 
q)(f\)[>[999;] ; 1]                  // so it can be re-written as this 
1 2 4 8 16 32 64 128 256 512 1024 
 
q)(f\)[>[999] ; 1]                   // or this 
1 2 4 8 16 32 64 128 256 512 1024 
 
q)(f\)[@[>;999]; 1]                  // or this 
1 2 4 8 16 32 64 128 256 512 1024 
 
 
q)(f\)[1000>sum; 12 34 56]       //  think about why this doesnt work
'type                            //  because 1000>sum is evaluated right away as >[1000;sum], comparing 1000 with the function sum itself (not with its result)
 
q)(f\)[{1000 > sum x}; 12 34 56]   // what you really meant to do is this 
12  34  56 
24  68  112                        // alternatively, use the "sum@" form further down
48  136 224 
96  272 448 
192 544 896 
 
q)(f\)[1000>sum@; 12 34 56]       // a common use case of general apply "@" 
12  34  56                        // you conveniently apply the left chunk to the right 
24  68  112                       //   1000>sum@  is a composition that behaves like  {1000>sum x}
48  136 224 
96  272 448 
192 544 896 
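
note: a small converge sketch (my own example). repeated integer halving stops on its own once the result repeats, i.e. at 0.

q)({x div 2}\) 100
100 50 25 12 6 3 1 0
q)({x div 2}/) 100       // over "/" returns only the fixed point
0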
 
 
 
 
### 
###   adverb - case 5,  each-previous ': 
### 
 
adverb ':  lets you define a declarative func to perform a dyadic operation on each item of a list with its predecessor. 
 
q)80 -': 100 99 101 102 101     # take 99 for example, its predecessor is 100, thus 99 - 100 = -1 
20 -1 2 1 -1                    # 100 doesn't have predecessor, so you define 80, as such 
 
q)(-':) 100 99 101 102 101      # without the init predecessor, it returns the first elem. (semantically makes sense) 
100 -1 2 1 -1 
 
q) deltas 100 99 101 102 101     # (-':) == deltas 
100 -1 2 1 -1 
 
q) deltas sums 100 99 101 102 101 
100 99 101 102 101 
 
q) sums deltas 100 99 101 102 101 
100 99 101 102 101 
 
q)(%':) 100 99 101 102 101 
100 0.99 1.020202 1.009901 0.9901961 
 
q)ratios 100 99 101 102 101              # ratios == (%':) 
100 0.99 1.020202 1.009901 0.9901961 
 
q)deltas0:{first[x] -': x}        # if you want the first predecessor to be the first item, define a custom func like this 
q)deltas0 100 99 101 102 101 
0 -1 2 1 -1 
 
q)(~':) 1 1 1 2 2 3 4 5 5 5 6 6        # a popular adverb is  ~': 
011010001101b 
 
q)not (~':) 1 1 1 2 2 3 4 5 5 5 6 6    # especially its negated version has its own name "differ" 
100101110010b 
 
q)differ 1 1 1 2 2 3 4 5 5 5 6 6       # differ == not (~':) 
100101110010b                          # notice the first elem is always 1 in this case 
 
q)L:1 1 1 2 2 3 4 5 5 5 6 6 
q)differ L 
100101110010b 
q)where differ L    # gives you the indices of where each uniq number starts 
0 3 5 6 7 10 
 
q)(where differ L) cut L     # splits into lists by numbers 
1 1 1 
2 2 
,3 
,4 
5 5 5 
6 6 
 
===>  suppose you pick the longest length lists from the above. 
e.g. 
q)runs:(where differ L) cut L 
q)ct:count each runs             # count each list length 
q)runs where ct=max ct 
1 1 1 
5 5 5 
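
note: the same runs can be turned into a run-length encoding of (value, length) pairs. a sketch that redefines L and runs so it stands on its own.

q)L:1 1 1 2 2 3 4 5 5 5 6 6
q)runs:(where differ L) cut L
q)(first each runs),'count each runs
1 3
2 2
3 1
4 1
5 3
6 2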
 
 
### 
###   general application   "@" and "." 
### 
 
## 
##  verb "@" 
## 
 
"applying" 
- an index to a list (to retrieve an item) 
- a key to a dict 
- a param to a function (invoking a function) 
 
a higher order function @ is the true form of basic application in q.
@ is dyadic. 1st arg = a list, a dict, a table or a monadic func. 2nd arg = the index/key/input applied to the 1st arg.
 
e.g. 
 
q)L:12 34 56 78 
 
q)L[1] 
34 
q)L @ 1 
34 
q)@[ L ; 1 ] 
34 
 
q)L[1 1 0 3] 
34 34 12 78 
q)L @ 1 1 0 3 
34 34 12 78 
q)@[ L ; 1 1 0 3] 
34 34 12 78 
 
q)count L
4
q)count @ L
4
q)@[count ; L]
4
 
q){x*x}[L] 
144 1156 3136 6084 
q) {x*x} @ L 
144 1156 3136 6084 
q)@[ {x*x} ; L] 
144 1156 3136 6084 
 
q)d:`a`b`c!10 20 30 
 
q)d[`a] 
10 
q)d @ `a 
10 
q)@[d ; `a] 
10 
 
q)d[`a`c] 
10 30 
q)d @ `a`c 
10 30 
q)@[d ; `a`c] 
10 30 
 
q)sum 12 34 56 
102 
q)sum @ 12 34 56 
102 
q)@[sum] 12 34 56 
102 
q)@[sum;] 12 34 56 
102 
 
q)f:{3*7}      # think about this niladic function f 
q)f[] 
21 
q)f[::]        # nice to explicitly specify nill "::" input 
21 
q)f 34         # any input is ignored with niladic functions 
21 
 
q)f @ (::)     # this parenthesis is necessary. it's just the syntax. 
21 
 
q)@[f; ]       # here you need explicit nil :: 
@[{3*7};] 
 
q)@[f; ::]     # like this 
21 
 
q)t:([]c1:1 2 3; c2:`a`b`c) 
 
q)t@1       # same as below 
c1| 2 
c2| `b 
 
q)t 1 
c1| 2 
c2| `b 
 
q)t [1; ] 
c1| 2 
c2| `b 
 
q)t [0 2; ]  # same as below 
c1 c2 
----- 
1  a 
3  c 
 
q)t @ 0 2 
c1 c2 
----- 
1  a 
3  c 
 
 
 
 
## 
##   verb "."     # a dot 
## 
 
a high order function "." is the true form of multi-variable (non-monadic) general "apply" in q. 
non-monadic = multi variables (including index-by-depth, or x,y inputs into a function) 
 
NOTE: as such, "." takes a LIST as its input. see below 
 
q)L : (12 34 56; 100 200) 
 
q)L[1;0] 
100 
q)L . 1 0        # unlike "@", verb dot "." lets you specify index-by-depth 
100              # see the syntax here. 1st arg = list/dict/func/table, 2nd arg = a list
q).[L ; 1 0]     # both infix and prefix work 
100 
q).[L ; (1;0)] 
100 
 
q)@[L ; 1 0]     # see the diff between "@" and "." 
100 200 
12 34 56 
 
q).[L ; 0]           # notice, you cannot give an atom 
'type 
q).[L ; enlist 0]    # it has to be a list  (NOTE: this means you dont really need "@" any more) 
12 34 56 
 
q)L . (0 ; )       # to do elided-indexing, you need explicit nil :: 
'type 
q)L . (0 ; ::)     # like this 
12 34 56 
 
q).[L;();+;1]      # you can use an empty list like this to select all 
13 35 57 
101 201 
 
q)d:`a`b`c!(10 20 30; 40 50; enlist 60) 
q)d[`b;0] 
40 
q)d . (`b; 0)       # see verb "." syntax 
40 
q).[d ; (`b;0)] 
40 
 
q)g:{x-y} 
q)g[9;4]
5
q)g . 9 4          # see verb "." syntax
5
q).[g ; 9 4]
5
 
## some more examples of verb dot "." 
 
q)L:10 20 30 
q)L . enlist 1 
20 
 
q)m:(10 20 30; 100 200 300) 
q)m . 0 1 
20 
 
q)ds:(`a`b`c!10 20 30; `x`y!100 200) 
q)ds . (0 ; `b) 
20 
 
q)mix:(10 20 30; `a`b`c!(1; 2; (300 400))) 
q)mix . (1; `c; 1) 
400 
 
q)dc:`c1`c2!(1 2 3; `a`b`c) 
q)dc . (`c2; 1) 
`b 
 
q)t:([]c1:1 2 3;c2:`a`b`c) 
q)t . (1 ; `c2) 
`b 
 
q)kt:([k:`a`b`c] v:1.1 2.2 3.3) 
q)kt . `b`v 
2.2 
 
 
q).[>;100 999]           //  100>999 
0b 
q).[>;100 99]            //  100>99 
1b 
 
 
## 
##  general apply "@" with monadic function 
## 
 
  syntax is  @[L;I;f]       // applies f to the items of L at indices I and returns the WHOLE list (f[L@I] alone returns only those items)
 
q)L:10 20 30 40 50 
q) @[L ; 0 2] 
10 30 
 
q) @[L ; 0 2 ; neg]      # see how you applied neg to L[0 2] and got the whole list
-10 20 -30 40 50         # this is very powerful 
                         # NOTE: syntax is  @[L;I;f]   where I = index list, f = function 
q)neg @[L ; 0 2] 
-10 -30 
 
q)m:(10 20 30; 100 200 300; 1000 2000 3000) 
q)@[m; 0 2; neg] 
-10 -20 -30 
100 200 300 
-1000 -2000 -3000 
 
q)L:10 20 30 40       # NOTE: if you want to modify data in-place, use symbol name of the variable 
q)@[L; 0; neg] 
-10 20 30 40 
q)L 
10 20 30 40 
q)@[`L; 0 ; neg] 
`L 
q)L 
-10 20 30 40 
 
 
## 
##  general apply "@" with dyadic functions 
## 
 
  syntax is  @[L;I;f;v]       // replaces the items L@I with f[L@I;v] and returns the whole list
 
q)L:10 20 30 40 
 
q)100 200 + @[L; 0 1]       # see this only gets you a sub-domain 
110 220                     # see below for how to get the whole list 
 
q)@[L; 0 1; +; 100 200]      # NOTE: syntax is  @[L;I;f;v] 
110 220 30 40 
 
q)@[L; 0 1; +; 100] 
110 120 30 40 
 
q)d:`a`b`c!10 20 30 
 
q)@[d; `a`b; +; 100 200] 
a| 110 
b| 220 
c| 30 
 
q)@[d; `a`b; +; 100] 
a| 110 
b| 120 
c| 30 
 
## more examples 
 
q)m:(10 20 30; 100 200 300; 1000 2000 3000) 
q)@[m; 0 2; +; 1 2] 
11 21 31 
100 200 300 
1002 2002 3002 
 
q)L:10 20 30 40 
q)@[L; 0 2; :; 42 43]      # same as  L[0 2]:42 43 
42 20 43 40 
 
q)@[`L; 0 2; :; 42 43]     # use the symbol name of the var to modify in-place
`L 
q)L 
42 20 43 40 
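
note: repeated indices are applied one after another, which gives a handy counting idiom. a sketch with made-up data:

q)@[5#0; 0 2 2 4 2; +; 1]     // count how often each index 0..4 occurs
1 0 3 0 1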
 
 
## 
##  general apply "." with monadic function 
## 
 
  syntax is  .[L;I;f]         // replaces the item(s) at L . I with f[L . I] and returns the whole of L
 
q)m:(10 20 30; 100 200 300)        # see examples 
q).[m; 0 1] 
20 
q)neg .[m; 0 1] 
-20 
q).[m; 0 1; neg]        # NOTE the syntax   .[L;I;f] 
10 -20 30 
100 200 300 
 
q)d:`a`b`c!(10 20 30; 40 50; enlist 60) 
q).[d; (`a; 1)] 
20 
q).[d; (`a; 1); neg] 
a| 10 -20 30 
b| 40 50 
c| ,60 
 
NOTE: like always, you can use symbol name of the var to modify in-place 
 
##  more examples  (elided indexing) 
 
q).[m; (0; ::); neg]      # notice you need explicit nil :: (for elided indexing) 
-10 -20 -30 
100 200 300 
q)d:`a`b`c!(100 200 300; 400 500; enlist 600) 
q).[d; (`a; ::); neg] 
a| -100 -200 -300 
b| 400 500 
c| ,600 
q).[d; (::; 0); neg] 
a| -100 200 300 
b| -400 500 
c| ,-600 
 
 
## 
##  general apply "." with dyadic functions 
## 
 
  syntax  .[L;I;f;v]          // replaces the item(s) at L . I with f[(L . I);v] and returns the whole of L
 
q)m:(10 20 30; 100 200 300) 
q)m 
10  20  30 
100 200 300 
q).[m; 0 1] 
20 
 
q).[m; 0 1; +; 1] 
10  21  30            # 21 
100 200 300 
 
q).[m; (::; 1); +; 1 2] 
10  21  30                  # 21 
100 202 300                 # 202 
 
q)d:`a`b`c!(100 200 300; 400 500; enlist 600) 
 
q)d 
a| 100 200 300 
b| 400 500 
c| ,600 
 
q).[d; (`a; ::); +; 1] 
a| 101 201 301 
b| 400 500 
c| ,600 
 
q).[d; (::; 0); +; 1] 
a| 101 200 300 
b| 401 500 
c| ,601 
 
 
note: as always, you can specify the symbol name of variables to modify in-place. 
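
a sketch of that in-place form for "." (my own example): passing the symbol `m returns the symbol and changes m itself.

q)m:(10 20 30; 100 200 300)
q).[`m; 1 2; :; 999]
`m
q)m
10  20  30
100 200 999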
 
 
NOTE: beware of extremely similar syntax of other use of @[] and .[] called "trap" aka protected eval below. 
 
@[f_monadic ; arg ; expr_fail] 
.[f_multivalent ; L_arg ; expr_fail] 
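
a minimal trap sketch (my own example). the 3rd arg is only used when the call signals an error; if it is a function, it receives the error message as a string.

q)@[{1+x}; 2; {"failed: ",x}]          // no error: normal result
3
q)@[{1+x}; `a; {"failed: ",x}]         // 1+`a signals 'type
"failed: type"
q).[{x+y}; (1;`a); `oops]              // multi-argument version with a constant fallback
`oops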
 
 
 
################################# 
####    Transforming Data    #### 
################################# 
 
## 
##  global variable dictionary `. 
## 
 
q)L : 12 34 56 
q)d:`a`b`c!10 20 30 
q)a: `ibm 
q)f:{x-y} 
q)get `.          # shows you global variable assignment 
a    | `ibm 
f    | {x-y} 
L    | 12 34 56 
d    | `a`b`c!10 20 30 
 
 
note: anything defined at root `. layer is global. and you can expand on it. 
 
e.g.      // so these are defined at global level. i.e. you can reference/use it from anywhere 
 
.kenics.util.greet:{(`morning`afternoon`evening!`ohayo`konichiwa`konbanwa)[x]} 
.kenics.cfg.name:`ken`sugimoto 
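
a usage sketch for the two definitions above:

q).kenics.util.greet `evening
`konbanwa
q).kenics.util.greet `morning`afternoon
`ohayo`konichiwa
q).kenics.cfg.name
`ken`sugimoto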
 
 
### 
###  data cast operator "$"        # atomic dyadic 
### 
 
<type> $ <data> 
 
3 ways to specify <type> 
 
(1) (positive) numeric short type value. 
 
e.g. 
q)7h $ 42i       # int to long 
42 
q)6h $ 42        # long to int 
42i 
q)9h $ 42        # long to float 
42f 
q)6h $ 3.14      # float to int 
3i 
 
(2) char type value 
 
e.g. 
q)"j"$42i 
42 
q)"i"$42 
42i 
q)"f"$42 
42f 
 
 
(3) type name symbol       # probably the most readable 
 
e.g. 
q)`int$42 
42i 
q)`long$42i 
42 
q)`float$42 
42f 
q)100. 
100f 
 
q)`long$"\n" 
10 
 
q)`char$42 
"*" 
 
q)`date$0            # review date/time type casting 
2000.01.01 
 
q)`int$2001.01.01 
366i 
 
q)`long$12.345       # obviously some casting narrows/widens 
12 
q)`short$123456789 
32767h 
 
q)`boolean$0         # anything 0 is 0b, and else 1b 
0b 
q)`boolean$0.0 
0b 
q)`boolean$123 
1b 
q)`boolean$-12.345 
1b 
 
q)`date$2015.01.02D10:20:30.123456789      # some date/time casting examples 
2015.01.02                                 # see how you can extract parts 
q)`year$2015.01.02 
2015i 
q)`month$2015.01.02 
2015.01m 
q)`mm$2015.01.02 
1i 
q)`dd$2015.01.02 
2i 
q)`hh$10:20:30.123456789 
10i 
q)`minute$10:20:30.123456789 
10:20 
q)`uu$10:20:30.123456789 
20i 
q)`second$10:20:30.123456789 
10:20:30 
q)`ss$10:20:30.123456789 
30i 
 
 
### 
###   data cast operator overload:   x $ y  where y is given as string 
### 
 
when y is given as "string", you can use upper case letter type of the target data type to cast. 
 
q)flip{(neg x;upper .Q.t x;key'[x$\:()])}5h$where" "<>20#.Q.t 
-1h  "B" `boolean 
-2h  "G" `guid 
-4h  "X" `byte 
-5h  "H" `short 
-6h  "I" `int 
-7h  "J" `long 
-8h  "E" `real 
-9h  "F" `float 
-10h "C" `char 
-11h "S" `symbol 
-12h "P" `timestamp 
-13h "M" `month 
-14h "D" `date 
-15h "Z" `datetime     // deprecated, use timestamp instead 
-16h "N" `timespan 
-17h "U" `minute 
-18h "V" `second 
-19h "T" `time 
 
$ date +%s 
1532305705 
 
$ date -d @1532305705 
Mon Jul 23 09:28:25 JST 2018 
 
q)"P" $ "1532305705"             // notice y is string, not integer 
2018.07.23D00:28:25.000000000 
 
q)"Z" $ "1532305705" 
2018.07.23T00:28:25.000 
 
or you can also do this. 
 
q)"P"$"2014.11.22"               // again, notice y is string 
2014.11.22D00:00:00.000000000 
 
--> note if x is a lower case, then it's a different way of casting. 
 
q)"p"$2014.11.22                // notice y is NOT string 
2014.11.22D00:00:00.000000000 
 
q)"i"$123.456 
123i 
 
q)"i"$"123"           // it just converts ascii into int decimal representation 
49 50 51i 
 
q)"I"$"123"           // this is atoi("123") = 123 
123i 
 
q)"d"$"2018.10.19" 
2000.02.20 2000.02.18 2000.02.19 2000.02.26 2000.02.16 2000.02.19 2000.02.18 2000.02.16 2000.02.19 2000.02.27 
 
q)"D"$"2018.10.19" 
2018.10.19 
 
 
 
### 
###  forcing types 
### 
 
q)L:10 20 30 40      # a simple list of long 
q)L[1]:56h           # cannot assign a short 
'type 
 
q)L[1]: (type L) $ 56h    # cast to long 
q)L 
10 56 30 40 
 
q)L,: (type L) $ 78h     # appending 
q)L 
10 56 30 40 78 
 
### 
###  typed empty list 
### 
 
suppose you want a float list, but if you create an empty list and add a long, then it automatically becomes a long list, and you get error when attempting to add a float. 
 
q)L:() 
q)type L 
0h 
 
q)L ,: 123 
q)type L 
7h 
 
q)L ,: 3.14          # so you wanted to define an empty list of float typed 
'type 
 
q)L: `float $ ()     # here is one way 
q)type L 
9h 
 
q)L: 0 # 3.14        # another popular way to create a typed empty list, using an existing list 
q)L 
`float$() 
q)type L 
9h 
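
note: a sketch of the typed empty list in use. once the list is float-typed, a long has to be cast before it can be appended.

q)L:`float$()
q)L,:3.14
q)L,:42               // a long does not fit a float list
'type
q)L,:`float$42
q)L
3.14 42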
 
 
## 
##  more examples   (cast $ as atomic function) 
## 
 
as atomic function, cast $ extends to a list 
 
e.g.  atom $ list 
 
q)"i"$10 20 30 
10 20 30i 
 
q)`float$(42j; 42i; 42j) 
42 42 42f 
 
e.g.  list $ atom 
 
q)`short`int`long$42 
42h 
42i 
42 
 
q)"ijf"$98.6 
99i 
99 
98.6 
 
e.g.  list $ list 
 
q)"ijf"$10 20 30 
10i 
20 
30f 
 
 
### 
###  casting to string   ("string" function) 
### 
 
q)string 42     # converts input to a string (which by definition is a LIST of chars) 
"42" 
q)string 4 
,"4"            # notice how the output is always a list 
q)string 42i 
"42" 
q)a:2.0 
q)string a 
,"2" 
q)f:{x*x} 
q)string f 
"{x*x}" 
 
q)string 1 2 3 
,"1" 
,"2" 
,"3" 
q)string "string" 
,"s" 
,"t" 
,"r" 
,"i" 
,"n" 
,"g" 
q)string (1 2 3; 10 20 30) 
,"1" ,"2" ,"3" 
"10" "20" "30" 
 
q)string `ibm`goog`msft`aapl 
"ibm" 
"goog" 
"msft" 
"aapl" 
 
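note: some everyday idioms built on string (my own sketch).

q)"price: ",string 42
"price: 42"
q)raze string 1 2 3
"123"
q)", " sv string `ibm`goog`msft
"ibm, goog, msft"
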
 
### 
###  casting to symbol   `$      (atomic) 
### 
 
NOTE: recall it's  `$    not `symbol$ 
 
q)`$ "foobar" 
`foobar 
 
q)`symbol$"foobar"      # error 
'type 
 
q)`$"Hello World"    # here is a common way to create a symbol that contains whitespace 
`Hello World 
 
q)`$"\"foo\""        # if you want to include special chars, escape with backslash 
`"foo" 
 
q)`$ ("hello"; "world"; "foo"; "bar") 
`hello`world`foo`bar 
 
 
NOTE: it is confusing because you cannot cast a string to a symbol with `symbol$"foobar", but you must use `$"foobar" 
      BUT you can declare an empty list of symbol type with both `symbol$() and `$() 
 
q)0 # `ibm`ibm 
`symbol$() 
 
q)`symbol$() 
`symbol$() 
 
q)`$() 
`symbol$() 
 
 
 
### 
###  cast string to other types 
### 
 
use "uppercase" type char.
 
e.g. 
q)"J"$"42" 
42 
q)"F"$"42" 
42f 
q)"F"$"42.0" 
42f 
q)"I"$"42.0" 
0Ni 
q)"I"$" " 
0Ni 
q)"D"$"12.31.2014" 
2014.12.31 
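
note: a string that does not parse gives a null of the target type rather than an error. a sketch with made-up input:

q)"J"$("12";"abc";"34")
12 0N 34
q)null "J"$("12";"abc";"34")      // handy for spotting bad input
010b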
 
NOTE: to cast string to function, use "parse" 
 
q)parse "{x*x}" 
{x*x} 
 
q)value "{x*x}"       # here, "value" is equivalent 
{x*x} 
 
 
### 
###   enumerator 
### 
 
http://code.kx.com/q4m3/7_Transforming_Data/#75-enumerations 
 
q)u:`g`aapl`msft`ibm     # assume it's a distinct list 
q)v:1000000?u 
q)v 
`g`g`msft`aapl`msft`aapl`msft`ibm`msft`aapl`g`ibm`aapl`msft.... 
 
q)k:u ? v 
q)k 
0 0 2 1 2 1 2 3 2 1 0 3 1 2.... 
 
q)u[k] 
`g`g`msft`aapl`msft`aapl`msft`ibm`msft`aapl`g`ibm`aapl`msft.... 
 
q) u[k] ~ v 
1b 
 
note:  see how  u[k] == v 
       keeping u & k is more efficient (in terms of speed/compactness) than keeping v 
       u & k are what a traditional enumerator is made of. 
 

#  "enumerator" variable 

 
q)u:distinct v 
q)u 
`g`msft`aapl`ibm 
q)e:`u$v                # known as enumerator variable. notice the syntax `u$v 
q)e 
`u$`g`g`msft`aapl`msft`aapl`msft`ibm`msft`aapl`g`ibm`aapl`msft 
 
q)`int $ e                      # gives you the indices into the current u, like  u?v
0 0 1 2 1 2 1 3 1 2 0 3 2 1i
 
now, the idea is you use the enumeration `u$v  instead of v itself.
e.g. 
 
q)v[3] 
`aapl 
q)e[3] 
`u$`aapl 
 
q)v = `msft 
00101010100001b 
q)e = `msft 
00101010100001b 
 
q)v ? `aapl
3
q)e ? `aapl
3
 
q)v in `aapl`msft 
00111110110011b 
q)e in `aapl`msft 
00111110110011b 
 
## 
##   efficient update with enumerator 
## 
 
suppose you want to replace all `g with `goog 
 
q)v 
`g`g`msft`aapl`msft`aapl`msft`ibm`msft`aapl`g`ibm`aapl`msft 
 
q)u[where u=`g] : `goog        # see it's one shot, using the enumeration. you literally modified only u[0]
q)e
`u$`goog`goog`msft`aapl`msft`aapl`msft`ibm`msft`aapl`goog`ibm`aapl`msft

q)v[where v = `g] : `goog      # sure you can do this but computationally, it's more expensive, as you modify ALL occurrences of `g in v
q)v
`goog`goog`msft`aapl`msft`aapl`msft`ibm`msft`aapl`goog`ibm`aapl`msft
 
## 
##  dynamically appending to enumerator domain 
## 
e.g. 
 
q)u ,: `twtr       # make sure you add it to the master list first 
q)e ,: `twtr       # then you add it to e, just like you would to v 
q)e 
`u$`goog`goog`msft`aapl`msft`aapl`msft`ibm`msft`aapl`goog`ibm`aapl`msft`twtr     # `twtr is added 
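
note: a sketch of what happens when the domain is NOT extended first, and of the ? form (enum extend) which appends to the domain for you. dom is a made-up name, so this does not touch u or e above.

q)dom:`g`msft`aapl`ibm
q)`dom$`aapl`g
`dom$`aapl`g
q)`dom$`twtr           // not in the domain yet
'cast
q)`dom?`twtr           // enum extend: appends `twtr to dom, then enumerates
`dom$`twtr
q)dom
`g`msft`aapl`ibm`twtr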
 
 
(ref) http://code.kx.com/q4m3/7_Transforming_Data/#75-enumerations 
      https://code.kx.com/q/ref/enums/#enum-extend 
 
 
######################### 
####     Tables      #### 
######################### 
 
a table == a flipped column dictionary         // it's fundamentally a list of dictionaries 
(underlying data is unchanged) 
 
q)d : `name`iq ! (`tom`bob`simon ; 65 34 128) 
 
q)type d 
99h                 # 99h = dictionary 
 
q)d 
name| tom bob simon 
iq  | 65  34  128 
 
q)t : flip d 
q)t 
name  iq 
--------- 
tom   65 
bob   34 
simon 128 
 
q) type t 
98h              # 98h = table 
 
q) d[`name;2] 
`simon 
 
q) t[2;`name]       # notice indexing order is flipped 
`simon 
q) type t[2;`name] 
-11h                # returned a symbol "atom" 
 
q) t [0 2 0; ]      # the first level index == row(record) index 
name  iq            # NOTE: unlike normal SQL, q-table rows are ordered (also column is contiguous list, thus super fast) 
---------           # NOTE: just like dictionary, we allow dupe records 
tom   65 
simon 128 
tom   65 
q) type t [0 2 0; ]      # returned a "table" 
98h 
 
q)t [0 2 0; `name] 
`tom`simon`tom 
q)type t [0 2 0; `name]     # returned a symbol "list" 
11h 
 
q)t [2; ] 
name| `simon 
iq  | 128 
q) type t [2; ] 
99h                      # returned a "dict" 
 
 
NOTE: as above, depending on how granularly/specifically you index, you get diff type of data returned. 
 
NOTE: as below, it's important to recognize that a table is just a list of dictionary records of same length 
 
q)show d:`name`price`vol!(`ibm`msft`aapl;1.2 3.4 5.6;10 20 30) 
name | ibm msft aapl 
price| 1.2 3.4  5.6 
vol  | 10  20   30 
 
q)show t:flip d 
name price vol 
-------------- 
ibm  1.2   10 
msft 3.4   20 
aapl 5.6   30 
 
q)type t 
98h 
 
q)1#t 
name price vol 
--------------  // is this a dict ?  no 
ibm  1.2   10 
 
q)type 1#t      // this is still a table 
98h 
 
q)t 0           // wow, but this is a dict 
name | `ibm 
price| 1.2 
vol  | 10 
 
q)type t 0      // indeed 
99h 
 
q)t 0 1         // but this is a table 
name price vol 
-------------- 
ibm  1.2   10 
msft 3.4   20 
 
q)type t 0 1 
98h 
 
q)(2#t) ~ t 0 1     // it is kinda confusing because this is 1b 
1b 
 
q)(1#t) ~ t 0       // but this is 0b 
0b 
 
q)type each t 
99 99 99h 
 
q)type each flip t    // this is almost like meta[t] 
name | 11 
price| 9 
vol  | 7 
 
q)meta t 
c    | t f a 
-----| ----- 
name | s 
price| f 
vol  | j 
 
 
## 
##  table definition syntax 
## 
 
q) d: flip `name`iq ! (`tom`bob`simon ; 65 34 128)        # not bad but there is a simpler way 
q) d 
name  iq 
--------- 
tom   65 
bob   34 
simon 128 
 
q) t: ([] name:`tom`bob`simon ; iq:65 34 128)        # see column name doesn't have backtick 
q) t                                                 # note: here a colon is NOT variable assignment. it's just a syntactic sugar. 
name  iq                                             #    `name `iq are merely symbols (that are keys mapped to columns in our flipped column dictionary) 
--------- 
tom   65 
bob   34 
simon 128 
 
q) d ~ t 
1b 
 
### 
###  more table init examples 
### 
 
q)([] c1:`a`b`c; c2:42; c3:98.6)    # atoms extend, but at least one column must be a LIST 
c1 c2 c3 
---------- 
a  42 98.6 
b  42 98.6 
c  42 98.6 
 
q)([] c1:`a; c2:100)             # at least one column must be a LIST 
'rank 
 
q)([] c1:enlist `a; c2:100)      # like this 
c1 c2 
------ 
a  100 
 
q)show d : `c1`c2`c3!(`a`b`c;42;1.1)     # NOTE: atoms extend when the dict is flipped into a table
c1| `a`b`c                               # this is a neat feature q has for you
c2| 42 
c3| 1.1 
 
q)flip d                     # like this 
c1 c2 c3 
--------- 
a  42 1.1 
b  42 1.1 
c  42 1.1 
 
q)([] name:`$(); iq:`int$())       # empty table init (i.e. defining a schema) 
name iq 
------- 
 
q)([] name:0#`; iq:0#0)            # equivalent schema, except iq is long here (0#0 is an empty long list)
name iq 
------- 
 
q)([] name:0#`; iq:0#0) ~ ([] name:`$(); iq:`long$()) 
1b 
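
note: a sketch of why the typed schema matters (emp is a made-up table name). insert returns the new row index, and a value of the wrong type is rejected.

q)emp:([] name:`$(); iq:`long$())
q)`emp insert (`tom;65)
,0
q)`emp insert (`bob;3.5)      // a float does not fit the long column
'type
q)emp
name iq
-------
tom  65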
 
NOTE: in q, there is no nested empty list (sadly), so you get the below 
 
q)t 
ticker price 
------------ 
"ibm"  12 
"msft" 34 
"aapl" 56 
 
q)meta t 
c     | t f a 
------| ----- 
ticker| C        // string type column 
price | j 
 
q)meta 0#t 
c     | t f a 
------| ----- 
ticker|          // no nested empty list (it's just how q is) 
price | j 
 
q)-3!0#t 
"+`ticker`price!(();`long$())"     // see it's just an empty list for ticker column 
 
 
### 
###   useful table operations 
### 
 
q)t: ([] name:`tom`bob`simon; iq:65 34 128) 
q)t 
name  iq 
--------- 
tom   65 
bob   34 
simon 128 
 
 
q)meta t          #  c = column names 
c   | t f a       #  t = type 
----| -----       #  f = foreign 
name| s           #  a = attributes 
iq  | j 
 
 
NOTE: meta returns a keyed table (where c is the key column)
note: if a column is a nested list of one same type, then you get "upper" case. 
 
q)meta ([] c1:1 2 3; c2:(1 2; enlist 3; 4 5 6))          # c2 is a nested list of "long" 
c | t f a 
--| ----- 
c1| j 
c2| J        # hence upper case "J" 
 
 
q)cols t       #  returns a list of column names. equivalent to  q)key t[0] 
`name`iq 
 
q)count t      # counting the number of records
3
 
q)key t[2]      #  when you specify a single record, your returned type is a dict 
`name`iq        #  so you can use "key" and "value" like this 
q)value t[2] 
`simon 
128 
 
q) t , t        # you can join two tables if their schema match 
name  iq 
--------- 
tom   65 
bob   34 
simon 128 
tom   65 
bob   34 
simon 128 
 
q)t:([] c1:`a`b`c; c2:42; c3:98.6) 
 
q)t `c1`c2       // returns a list of values of selected columns 
a  b  c          // not sure if we need this operation often 
42 42 42 
 
q)`c1`c2 # t     // same as select c1,c2 from t 
c1 c2 
----- 
a  42 
b  42 
c  42 
 
q)tables 
k){."\\a ",$$[^x;`;x]} 
q)tables `.                # gives you a symbol list of global table vars 
`d`kt`t`t1`t2 
 
NOTE: any (value modifying) function applies to all values of the tables. just like it did with dictionaries. 
e.g. 
 
q)show t:([] c1:`a`b`c; c2:42; c3:98.6) 
c1 c2 c3 
---------- 
a  42 98.6 
b  42 98.6 
c  42 98.6 
 
q)string t 
c1   c2   c3 
---------------- 
,"a" "42" "98.6" 
,"b" "42" "98.6" 
,"c" "42" "98.6" 
 
 
NOTE: when it comes to modifying a table (e.g. removing, slicing, updating) you can probably do with select/update/delete templates easily, instead of using built-in q functions. 
 
 
### 
###  basic templates select & update 
### 
 
select/update/delete/exec are called "templates", i.e. the general forms of a table query. 
under the hood, the q interpreter converts them to functional form. 
 
note: we revisit this in detail later, here just covering very basics. 
 
NOTE: select always returns a table   (exec does not) 
 
q)select from t       # no wildcard * 
name  iq 
--------- 
tom   65 
bob   34 
simon 128 
 
q)t ~ select from t 
1b 
 
q)select name from t 
name 
----- 
tom 
bob 
simon 
 
q)select foo:name, bar:iq from t      # rename columns 
foo   bar 
--------- 
tom   65 
bob   34 
simon 128 
 
q)select iq: iq % 100 from t 
iq 
---- 
0.65 
0.34 
1.28 
 
q)update iq: iq % 100 from t     # notice how "update" returns an updated whole table, not select rows 
name  iq 
---------- 
tom   0.65 
bob   0.34 
simon 1.28 
 
q)t               # note update didn't update in-place. for that you need to specify the symbol name `t 
name  iq 
--------- 
tom   65 
bob   34 
simon 128 
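
note: a minimal sketch of the in-place variant. tcopy is a made-up scratch name (not from the notes above), used so that t itself stays intact. 

q)tcopy:t 
q)update iq: iq % 100 from `tcopy     // pass the table name as a symbol => tcopy is modified in place 
`tcopy 
q)tcopy 
name  iq 
---------- 
tom   0.65 
bob   0.34 
simon 1.28 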
 
 
// note: see how you can omit column names when using update like below 
 
q)show t:([] c1:12 0N 34 56; c2:`ibm`msft`aapl`dell) 
c1 c2 
------- 
12 ibm 
   msft 
34 aapl 
56 dell 
 
q)update c1:99999^c1 from t      // explicitly updated c1 
c1    c2 
---------- 
12    ibm 
99999 msft 
34    aapl 
56    dell 
 
q)update 99999^c1 from t         // implicitly updated c1 
c1    c2 
---------- 
12    ibm 
99999 msft 
34    aapl 
56    dell 
 
 
### 
###  keyed tables (primary key)          // a dictionary of tables 
### 
 
inspiration is the primary key from SQL. 
 
NOTE: a keyed table is a "dictionary" where key = a key table, value = a value table 
 
q)v : ([] name:`tom`bob`simon ; age:65 19 34 ) 
q)v 
name  age 
--------- 
tom   65 
bob   19 
simon 34 
 
q)k : ([] eid:1001 1002 1003) 
q)k 
eid 
---- 
1001 
1002 
1003 
 
q)kt : k ! v         // at this point, notice keyed table is a dictionary, not a table 
q)kt 
eid | name  age 
----| --------- 
1001| tom   65 
1002| bob   19 
1003| simon 34 
 
q)kt : ([eid:1001 1002 1003] name:`tom`bob`simon ; age:65 19 34)      # here is a simpler way to define. 
q)kt 
eid | name  age 
----| --------- 
1001| tom   65 
1002| bob   19 
1003| simon 34 
 
q)type kt           // notice how it's nothing but a dictionary 
99h 
q)key kt 
eid 
---- 
1001 
1002 
1003 
q)value kt 
name  age 
--------- 
tom   65 
bob   19 
simon 34 
 
NOTE: the beauty of a keyed table is that you can index by the key value (instead of the row number) 
 
q)kt[1003;`name] 
`simon 
 
q)kt[1003;`age] 
34 
 
q)kt[1001]          # you get a dictionary 
name| `tom 
age | 65 
 
q)type kt[1001]     # which makes sense because keyed table is a dictionary 
99h 
 
- Question: how to select multiple records of keyed table ? 
- Answer: see below. select statement is the easiest. 
 
q)kt[1001 1003]       # cannot do this 
'length 
 
q)kt[flip enlist 1001 1003]     # has to be like this 
name  age                       # or use "select" as below 
--------- 
tom   65 
simon 34 
 
q)select name,age from kt where eid in 1001 1003 
name  age 
--------- 
tom   65 
simon 34 
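
note: as a sketch, two more ways that should work here: index with a table of key values, or use take "#" with a table of key values (the latter keeps the key column). 

q)kt ([] eid:1001 1003)        // look up with a (non-keyed) table whose column matches the key column 
name  age 
--------- 
tom   65 
simon 34 

q)([] eid:1001 1003) # kt      // take: returns a keyed sub-table 
eid | name  age 
----| --------- 
1001| tom   65 
1003| simon 34 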
 
 
note: because a keyed table is a dictionary, you can simply run below. 
 
q)key kt 
eid 
---- 
1001 
1002 
1003 
 
q)value kt 
name  age 
--------- 
tom   65 
bob   19 
simon 34 
 
q)keys kt        # "keys" returns key column name(s) 
,`eid 
 
q)cols kt        # "cols" returns all column names 
`eid`name`age 
 
 

#  empty keyed table 

 
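q)ktempty:([eid:`int$()] name:`symbol$(); iq:`int$())      # nice to declare types explicitly 
q)ktempty:([eid:0#0] name:0#`; iq:0#0)                      # terser alternative (0#0 is an empty long list in q 3.0+, so eid and iq are long here) 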
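
note: a quick way to double-check the column types of an empty table is meta. kte below is a made-up name for illustration. 

q)kte:([eid:`int$()] name:`symbol$(); iq:`int$()) 
q)meta kte 
c   | t f a 
----| ----- 
eid | i 
name| s 
iq  | i 
q)count kte 
0 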
 
 

#  keyed table VS (non-keyed) table 

 
## 
##  x xkey y         // y is a table.  x is a symbol atom (or list) of column names of y. 
## 
 
"xkey" function lets you promote/demote columns to key columns 
 
q)t                # a normal (non-keyed) table 
eid  name  age 
-------------- 
1001 tom   65 
1002 bob   19 
1003 simon 34 
 
q)`eid xkey t        # make "eid" a key column 
eid | name  age 
----| --------- 
1001| tom   65 
1002| bob   19 
1003| simon 34 
 
q)`eid`age xkey t       # multiple key columns 
eid  age| name 
--------| ----- 
1001 65 | tom 
1002 19 | bob 
1003 34 | simon 
 
q)kt                   # a keyed table 
eid | name  age 
----| --------- 
1001| tom   65 
1002| bob   19 
1003| simon 34 
 
q)() xkey kt          # unkey all columns 
eid  name  age        # NOTE: you cannot do  ` xkey kt, but you must use an empty list  () xkey kt 
--------------        # also, recall xkey[();kt] ~ 0!kt 
1001 tom   65 
1002 bob   19 
1003 simon 34 
 
NOTE: the above is call-by-value.  obviously, if you call-by-name, you can modify in-place. 
e.g. 
 
q)`eid xkey `t 
`t 
 
q)t 
eid | name  age 
----| --------- 
1001| tom   65 
1002| bob   19 
1003| simon 34 
 
note: confusingly enough, ![2;t] (i.e. 2!t) is the same as xkey[`c1`c2;t]   (assuming `c1`c2 are the first two columns). in general n!t keys the first n columns. 
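
note: a small sketch to check the n!t shorthand. tk is a made-up table for illustration. 

q)tk:([] eid:1001 1002 1003; name:`tom`bob`simon; age:65 19 34) 
q)(1!tk) ~ `eid xkey tk           // key the first column 
1b 
q)(2!tk) ~ `eid`name xkey tk      // key the first two columns 
1b 
q)(0!1!tk) ~ tk                   // 0! removes the keys again 
1b 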
 
NOTE: just like regular (non-keyed) tables, dupe key/record is allowed in keyed table (after all, q dictionary allows dupe keys) 
 
q)dt:([] eid:1001 1002 1003 1001; name:`sam`tom`simon`ken) 
 
q)`eid xkey dt 
eid | name 
----| ----- 
1001| sam        #  q)(`eid xkey dt)[1001]   only retrieves `sam (the first match) 
1002| tom 
1003| simon 
1001| ken        # but q)select from (`eid xkey dt) where eid=1001   gives you both `sam & `ken 
 
 
q)first kt      # retrieving the first record (notice the key column is skipped; 
name| `tom      # this is because a keyed table is a dict, and first/last work on the value) 
age | 65 
 
q)last kt       # very last record 
name| `simon 
age | 34 
 
q)2 # kt          # more generally, you can use take "#" operator 
eid | name age    # to retrieve the first/last N records 
----| -------- 
1001| tom  65 
1002| bob  19 
 
q)-4 # kt         # last 4 records. kt only has 3, so take wraps around (hence the dupe 1003) 
eid | name  age 
----| --------- 
1003| simon 34 
1001| tom   65 
1002| bob   19 
1003| simon 34 
 
 

#  multiple key columns 

 
q)ktc : `eid`name xkey kt 
q)ktc 
eid  name | age 
----------| --- 
1001 tom  | 65 
1002 bob  | 19 
1003 simon| 34 
 
q) ktc[(1001;`tom)]     # here is how 
age| 65 
 
q)select age from ktc where eid = 1001, name=`tom       # select is intuitive 
age 
--- 
65 
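
note: to look up several compound keys at once, a table of key values should also work. a sketch, assuming ktc as defined above: 

q)ktc ([] eid:1001 1003; name:`tom`simon) 
age 
--- 
65 
34 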
 
 
### 
###   foreign keys  (and virtual columns) 
### 
 
a foreign key (column) in a table means a (non-keyed) column in a table whose values are the values of a primary key column in another table. 
(i.e. you cannot add a value to a foreign key column, if the value is not in the primary key column of the other table) 
 
foreign keys define relations, and enforce (at least directional) integrity between tables. (hence relational DBs) 
 
to quote q4m3, "A foreign key is one or more table columns whose values are defined as an enumeration over the key column(s) of a keyed table." 
 
how do we implement a foreign key ? - we can leverage enumeration, which restricts the foreign key values to those present in the primary key column. 
 
q)show kt:([eid:1001 1002 1003] name:`tom`bob`simon; age:65 19 34) 
eid | name  age 
----| --------- 
1001| tom   65 
1002| bob   19 
1003| simon 34 
 
q)show t:([] eid:1003 1001 1002 1001 1002 1001; score:126 36 92 39 98 42) 
eid  score 
---------- 
1003 126 
1001 36 
1002 92 
1001 39 
1002 98 
1001 42 
 
===> let's make eid (from t) a foreign key, over eid from kt. 
 
there are a couple of ways to do so. (both essentially the same) 
 
## (1)  use update template 
 
q) update eid: `kt$eid from `t 
`t 
 
## (2) apply the enumeration in the table definition 
 
q)t:([] eid: `kt$1003 1001 1002 1001 1002 1001; score:126 36 92 39 98 42) 
 
 
===> now you can confirm as below. 
 
q)meta t 
c    | t f  a 
-----| ------ 
eid  | j kt      # yay 
score| j 
 
q)fkeys t     # tells you `eid from t is a foreign key mapped to the primary key `eid from kt 
eid| kt 
 
 
q)select eid,score,eid.name,eid.age from t       // see how you can reference the columns of kt via key "linked" column id 
eid  score name  age                             // so you linked two tables via foreign key 
-------------------- 
1003 126   simon 34 
1001 36    tom   65 
1002 92    bob   19 
1001 39    tom   65 
1002 98    bob   19 
1001 42    tom   65 
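
note: the enumeration is also what enforces the integrity mentioned above. a sketch, assuming kt is defined as above: 

q)`kt$1002 1001         // both values exist in the key column of kt 
`kt$1002 1001 
q)`kt$1004              // 1004 is not a key in kt, so the cast is rejected 
'cast 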
 
 
note: if you must un-link your foreign key, back to a normal column, then you use "value" func. 
e.g. 
 
q)meta update value eid from t 
c    | t f a 
-----| ----- 
eid  | j 
score| j 
 
 
NOTE: 1. you cannot splay/partition foreign keys 
      2. target table must be "keyed" table 
 
===> "link column" solves both 
 
### 
###   link column 
### 
 
q)show t1:([] eid:1001 1002 1003; name:`tom`bob`simon; age:65 19 34) 
eid  name  age 
-------------- 
1001 tom   65 
1002 bob   19 
1003 simon 34 
 
q)show t2:([] eid:`t1!(exec eid from t1)?1003 1001 1002 1001 1002 1001; score:126 36 92 39 98 42) 
eid score 
---------       // notice it's another overload of "!" in this case 
2   126         // syntax is  `t1!(indices corresponding to the rows of t1) 
0   36 
1   92          // NOTE: the column name "eid" doesn't have to match. see below for example. 
0   39 
1   98 
0   42 
 
q)exec eid from t1 
1001 1002 1003 
 
q)1001 1002 1003 ? 1003 1001 1002 1001 1002 1001      // notice the "link column" is just index list for the target table column 
2 0 1 0 1 0 
 
q)meta t2 
c    | t f  a 
-----| ------ 
eid  | i t1 
score| j 
 
q)select eid.name, score from t2       // yes, so this effectively achieves the same thing as foreign key 
name  score                            // but you had to do the index lookup explicitly like above 
----------- 
simon 126 
tom   36 
bob   92 
tom   39 
bob   98 
tom   42 
 
// NOTE: as mentioned above, the link column is based on row indices. so you can totally change column names. 
e.g. 
 
q)show t3:([] foo:`t1!(exec eid from t1)?1003 1001 1002 1001 1002 1001; score:126 36 92 39 98 42) 
foo score 
--------- 
2   126 
0   36 
1   92 
0   39 
1   98 
0   42 
 
q)meta t3 
c    | t f  a 
-----| ------ 
foo  | i t1 
score| j 
 
q)select foo.name, score from t3 
name  score 
----------- 
simon 126 
tom   36 
bob   92 
tom   39 
bob   98 
tom   42 
 
 
### 
###   appending/joining tables - examples 
### 
 
there are a few ways. 
 
## 
##  "join" effect via foreign key 
## 
e.g.   see how powerful it is. 
 
q)select eid.name, eid.age, score from t 
name  age score 
--------------- 
simon 34  126 
tom   65  36 
bob   19  92 
tom   65  39 
bob   19  98 
tom   65  42 
 
 
## 
##  appending new records 
## 
 
intuitive. there are a couple of ways. 
 
q)t ,: `eid`score ! 1002 57 
q)t ,: `score`eid ! 74 1001      # if you specify column names, you can change orders 
q)t ,: 1003 61                   # if you don't specify column names, it assumes the existing order. 
 
q)t 
eid  score 
---------- 
1003 126 
1001 36 
1002 92 
1001 39 
1002 98 
1001 42 
1002 57     # these 3 got added, per above 
1001 74     # 
1003 61     # 
 
 
q)`t upsert (1002; 49)       # here is another way 
`t 
q)t 
eid  score 
---------- 
1003 126 
1001 36 
1002 92 
1001 39 
1002 98 
1001 42 
1002 57 
1001 74 
1003 61 
1002 49     # this got added 
 
 
## 
##  join tables with "," 
## 
 
ONLY if two tables have the exact same meta structure, you can join them with "," 
e.g. 
 
q)t : ([] name:`tom`simon`adam ; age:51 34 79) 
q)t2 : ([] name:`bob`simon`sam ; age:53 34 41) 
 
q)t 
name  age 
--------- 
tom   51 
simon 34 
adam  79 
 
q)t2 
name  age 
--------- 
bob   53 
simon 34 
sam   41 
 
q)t,t2            # example 
name  age 
--------- 
tom   51 
simon 34 
adam  79 
bob   53 
simon 34         # notice it's a dupe. so it wasn't really a union, but just combining two tables. 
sam   41 
 
q)t[0],t2[0 1]        # example 
name  age 
--------- 
tom   51 
bob   53 
simon 34 
 
==> works on keyed tables too. 
 e.g. 
 
q)kt 
eid | name  age 
----| --------- 
1001| tom   65 
1002| bob   19 
1003| simon 34 
 
q)kt2              # q)kt2 : ([eid:1002 1005] name:`bob`adam ; age:29 82) 
eid | name age 
----| -------- 
1002| bob  29      # see eid=1002 is dupe with kt 
1005| adam 82 
 
q)kt, kt2 
eid | name  age 
----| --------- 
1001| tom   65 
1002| bob   29     # see how kt2 prevailed for eid=1002 
1003| simon 34 
1005| adam  82 
 
 
## 
##  coalesce tables with "^" 
## 
 
similar to ",", but coalesce "^" doesn't overwrite the 1st table item if the 2nd table item is null. (so it's a softer overwrite) 
 
q)t 
name  age 
--------- 
tom   51 
simon 34 
adam  79 
 
q)t ^ t       # notice overwriting a table with itself gives you the same table. (duh) 
name  age 
--------- 
tom   51 
simon 34 
adam  79 
 
q)t2 
name  age 
--------- 
bob   53 
simon 34 
sam   41 
 
q)t ^ t2     # interesting. so you just got the 2nd table. (duh) 
name  age 
--------- 
bob   53 
simon 34 
sam   41 
 
==> coalesce "^" on keyed tables works as advertised. 
 e.g. 
 
q) ([k:`a`b`c] v:10 0N 30) ^ ([k:`a`b`c] v:100 200 0N) 
k| v 
-| --- 
a| 100 
b| 200 
c| 30      # 2nd table didn't overwrite with 0N 
 
q)([k:`a`b`c`x] v:10 0N 30 40) ^ ([k:`a`b`c`y] v:100 200 0N 0N)     # a nice merge 
k| v 
-| --- 
a| 100 
b| 200 
c| 30 
x| 40 
y| 
 
 
## 
##   column join with "join each-both" adverb ,' 
## 
 
join , 
each-both ' 
 
adverb ,'     # you can use this to do column-level join. 
 
e.g. 
 
q) ([] c1:`a`b`c ; c2:10 20 30) ,' ([] c2:100 200 300 ; c3:12 34 56 ; c4:`ibm`aapl`msft) 
c1 c2  c3 c4 
--------------     # both tables must have the same number of rows 
a  100 12 ibm 
b  200 34 aapl 
c  300 56 msft 
 
 
q) ([k:1 2 3] v1:10 20 30) ,' ([k:3 4 5] v2:1000 2000 3000)      # their key-columns meta must match 
k| v1 v2 
-| ------- 
1| 10 
2| 20 
3| 30 1000 
4|    2000 
5|    3000 
 
 
### 
###  complex column data 
### 
 
in general, each column should be a simple list. (faster, storage-efficient, easier to process) 
 
q)show tf:([] d:2015.01.01 2015.01.02; l:67.9 72.8; h:82.1 88.4) 
d          l    h 
--------------------     # a very simple daily high/low temperature data table 
2015.01.01 67.9 82.1 
2015.01.02 72.8 88.4 
 
===> alternatively, as below, you can have l and h in one column. 
 
q)show tp:([] d:2015.01.01 2015.01.02; lh:(67.9 82.10; 72.8 88.4)) 
d          lh 
--------------------    # lh column item is a list of two items (low ; high) 
2015.01.01 67.9 82.1 
2015.01.02 72.8 88.4 
 
===>  now, unless you have a good reason, go with simple-list column. 
      (nested column requires lots of adverbs to access/process) 
 
q)tf [`h][0]        # simple-list column 
82.1 
 
q)tp [`lh][0][1]    # nested column 
82.1 
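
note: as a sketch, this is what unpacking the nested column back into simple columns looks like. lh[;0] picks item 0 from every row (indexing at depth), so no "each" is needed here. 

q)select d, l:lh[;0], h:lh[;1] from tp 
d          l    h 
-------------------- 
2015.01.01 67.9 82.1 
2015.01.02 72.8 88.4 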
 
 
### 
###  Attributes (for lists) 
### 
 
suppose you have a huge dictionary (or a huge table). 
its lookup is a linear search on the key list (or column values), which can be slow. 
 
attributes are special metadata that you attach to a list. 
 
q-interpreter optimizes certain operations based on attributes. 
e.g.  if a list has "sorted" attribute, then we can binary-search, instead of linear search. // `s# 
      if we build hash, then query speed can be constant. // `u# `p# `g# 
     (not worth it, unless the list has over a million items or so) 
     (in fact, an attribute carries overhead, so it can even hurt performance if you apply it to a small dataset) 
 
  syntax : symbol_name_of_attribute # a_list 
e.g. 
 
q)L: 1 2 3 4 
q)L 
1 2 3 4 
q)`s# L         // see the syntax 
`s#1 2 3 4 
q)L             // NOTE:  `s# is in-place  (unlike `u# and `p#, see below) 
`s#1 2 3 4 
 
attribute  |  description 
------------------------- 
  `s#      |  sorted (ascending order) 
  `u#      |  unique (i.e. distinct) 
  `p#      |  parted 
  `g#      |  grouped 
  `#       |  remove attributes 
 
 
## 
##   sorted attribute  `s# 
## 
 
q)asc 2 1 8 6      # notice asc automatically applies `s# 
`s#1 2 6 8 
 
q)til 5            # til doesn't 
0 1 2 3 4 
 
q)L ,: 6           # appending 6 
q)L 
`s#1 2 3 4 6       # `s# attribute is preserved 
 
q)L ,: 1 
q)L 
1 2 3 4 6 1        # `s# attribute is lost 
 
q)`s#2 3 1 5       # you get an error if you try to apply `s# to unsorted data 
's-fail            # usually you just sort using asc[] or xasc[] 
 
q)L:asc L 
 
q)meta t:([] dt:2018.04.25 2018.10.17; ticker:`ibm`msft; tp:123 456) 
c     | t f a 
------| ----- 
dt    | d 
ticker| s 
tp    | j 
q)meta t:`dt`ticker xasc t          // xasc is a common way to apply `s# 
c     | t f a                       // notice only the first column is done 
------| ----- 
dt    | d   s 
ticker| s 
tp    | j 
 
note: it's common to specify `s# for date/time column in a table/dict 
  e.g. 
 
q) t : ([] ti:`s#00:00:00 00:00:01 00:00:03; v:98 98 100.) 
q)meta t 
c | t f a 
--| ----- 
ti| v   s 
v | f 
 
q)meta ([k:`s#1 2 3 4] v:`d`c`b`a)      # you can apply `s# to key column too 
c| t f a 
-| ----- 
k| j   s 
v| s 
 
 e.g. 
 
q)d:`s#10 20 30 40 50!`a`b`c`d`e 
q)key d 
`s#10 20 30 40 50      # so lookup is faster. linear search VS binary search. i.e. O(n) VS O(log n) 
q)d 10 
`a 
q)d 12 
`a 
q)d 15      # one side effect of applying `s# to dict keys is in-between values get mapped too 
`a          # interesting. aka "step function" that lets you do "as of" query 
q)d 20 
`b 
 
q)grade:`s#0 70 80 90 100!`f`d`c`a`a     // a realistic example is deciding student grade in a class 
q)grade 50 69 70 84 90 99 100 
`f`f`d`c`a`a`a 
 
q)rates:`s#2017.04.01 2017.05.01 2017.06.01  ! 23.45 20.98 21.11     // or checking "as of" price/rates 
q)rates 2017.04.28 
23.45 
 
q)-3!d                                  // how does it work ? 
"`s#`s#10 20 30 40 50!`a`b`c`d`e"       // the trick is actually `s# applied to the dictionary itself, as well as its keys 
 
===> if you think about it, its implementation is trivial, using bin[x;y] 
 
q)stepFunc:{value[x] bin[key x;y]} 
q)d:10 20 30 40 50!`a`b`c`d`e 
q)stepFunc[d;10 12 15 20 25] 
`a`a`a`b`b 
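
note: a quick sanity check (sketch) that the hand-rolled version agrees with the built-in step behaviour. ds is a made-up name for the `s# copy of d. 

q)ds:`s#d 
q)ds 10 12 15 20 25 
`a`a`a`b`b 
q)ds[10 12 15 20 25] ~ stepFunc[d;10 12 15 20 25] 
1b 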
 
 
## 
##  unique attribute  `u# 
## 
 
q)L : 3 1 7 4 
q)`u# L           // note `u# is not in-place 
`u#3 1 7 4 
q)L : `u# L 
q)L 
`u#3 1 7 4 
 
q)L ,: 2 
q)L 
`u#3 1 7 4 2      // `u# attribute preserved 
 
q)L ,: 7 
q)L 
3 1 7 4 2 7       // `u# attribute lost 
 
q)`u#12 34 12 56   // must be unique. if not unique, then use `p# or `g# 
'u-fail 
 
NOTE: `u# basically builds a plain good old hash table, so lookup is O(1), but remember it comes at the cost of memory overhead. 
 
now think about how to apply this to a column in a table. 
 
t[`dt]:`u#t[`dt]          // this works 
t:@[t;`dt;`u#]            // this works too (hard-core people write this way, so be able to read this when you see it) 
t:update `u#dt from t     // this works too (probably most readable) 
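
note: a sketch to confirm the attribute landed, using meta. tu is a made-up table for illustration. 

q)tu:([] dt:2018.04.25 2018.10.17; tp:123 456) 
q)tu:update `u#dt from tu 
q)meta tu 
c | t f a 
--| ----- 
dt| d   u 
tp| j 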
 
 
## 
##  parted attribute  `p# 
## 
 
parted == same value being adjacent. think of this as a softer `u# or a hybrid of `u# and `s# 
 
underneath, it builds a hash that returns an index and the count of how many elems have the same key 
 
e.g. 
 
q)L : 2 2 2 1 1 4 4 4 3 3 
 
q)`p# L 
`p#2 2 2 1 1 4 4 4 3 3     // so when you query "2", q knows index 0 and 3 elems 
q)L 
2 2 2 1 1 4 4 4 3 3        // not in-place 
 
q)L : `p# L 
q)L 
`p#2 2 2 1 1 4 4 4 3 3 
 
q)`p#1 2 3 2        // error: the two 2s are not adjacent (note the error is reported as 'u-fail) 
'u-fail 
 
so what this means is, you usually sort then apply `p# 
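
e.g. a minimal sketch of sort-then-part (P and tpx are made-up names) 

q)P:2 1 2 3 1 3 
q)`p#asc P                 // sorting makes equal values adjacent, so `p# can be applied 
`p#1 1 2 2 3 3 

q)tpx:([] sym:`ibm`msft`ibm`msft; px:1 2 3 4) 
q)tpx:update `p#sym from `sym xasc tpx      // same idea for a table column 
q)attr tpx`sym 
`p 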
 
## 
##  `g#  (group attribute) 
## 
 
this is basically a relaxed `p# attr. 
underneath, q builds a hash that maps each value to the list of its indices. this is a lot of memory overhead. 
 
q)L:`g# 12 34 12 56 
q)L 
`g#12 34 12 56 
q)L,: 12 
q)L 
`g#12 34 12 56 12    // yes, `g# is preserved, and q updates internal hash (i.e. update takes longer but query is quicker) 
 
as such, `p# is better than `g#, but sometimes you are not at liberty to sort the list/column, so `g# is useful. 
 
 
## 
##  `#  (remove attribute) 
## 
 
q)L : asc 5 3 1 8 
q)L 
`s#1 3 5 8 
 
q)`# L 
1 3 5 8 
q)L 
`s#1 3 5 8       #  not in-place 
 
q)L : `# L 
q)L 
1 3 5 8 
 
 
## 
##   attr[x] 
## 
 
- x can be of any data type 
- returns the attribute as a symbol: one of `s `u `p `g, or ` which means no attribute 
 
q)attr 5 3 1 
` 
q)attr 1 3 5       # the data is sorted, but no attribute has been applied 
` 
q)attr asc 3 1 5 
`s 
 
## 
##   should we always apply attributes to all columns ? 
## 
 
it depends on the data size, and on how you usually query the data. think about the trade-off between speed gain and memory overhead. 
is your data/list huge ? by which column do you query ? (often date and ticker) 
 
as a good practice, always reapply attribute when you modify the data. 
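
e.g. a sketch with `u# (U is a made-up name) 

q)U:`u#3 1 7 4 
q)U,:7                // appending a dupe silently drops `u# 
q)attr U 
` 
q)U:`u#distinct U     // clean up the data, then reapply 
q)attr U 
`u 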
 
## 
##   how do we apply `s# to multiple columns as key ? 
## 
 
use keyed tables. 
 
q)kt:`dt`ticker xkey t       // now all you have to do is sort the keys and apply `s# 
q)(`s#a)!kt a:asc key kt     // probably make this a util function 
 
dkasc:{[kt] `s#a!kt a:asc key kt}   // asc sort a dict by its key 
 
 
########################## 
####      q-sql       #### 
########################## 
 
q-sql is a collection of functions that resemble SQL, but there are some syntactic/behavioral differences. 

functionally, q-sql doesn't introduce anything new, but it makes code more readable. 
 
e.g. 
update price%100 from t       // dividing price by 100 
@[t;`price;%;100]             // see how cryptic without q-sql 
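
note: a sketch to convince yourself that the two forms agree. tq is a made-up table for illustration. 

q)tq:([] sym:`ibm`msft; price:12300 45600) 
q)(update price%100 from tq) ~ @[tq;`price;%;100] 
1b 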
 
 

#  sample data 

 
q) \l /home/kenics/q/sp.q   # this will load the below 
 
s:([s:`s1`s2`s3`s4`s5] 
 name:`smith`jones`blake`clark`adams; 
 status:20 10 30 20 30; 
 city:`london`paris`paris`london`athens) 
 
p:([p:`p1`p2`p3`p4`p5`p6] 
 name:`nut`bolt`screw`screw`cam`cog; 
 color:`red`green`blue`red`blue`red; 
 weight:12 17 17 14 12 19; 
 city:`london`paris`rome`london`paris`london) 
 
sp:([] 
 s:`s$`s1`s1`s1`s1`s4`s1`s2`s2`s3`s4`s4`s1;     / fkey 
 p:`p$`p1`p2`p3`p4`p5`p6`p1`p2`p2`p2`p4`p5;     / fkey 
 qty:300 200 400 200 100 100 300 400 200 200 300 400) 
 
 
### 
###  insert    (upsert is superior, thus ignore insert) 
### 
 
### 
###  upsert 
### 
 
[for regular table] 
upsert = insert  (yes, it even inserts dupe records) 
 
[for keyed table] 
upsert = insert  if the record doesn't exist 
         update  if the record exists 
 
NOTE: upsert on a table removes all attributes. (you need to re-apply them) 
 
## 
##  inserting new records with "upsert" 
## 
 
q)t : ([] name:`tom`simon`adam ; age:31 14 59 ; height: 6.1 5.7 5.9) 
q)t 
name  age height 
---------------- 
tom   31  6.1 
simon 14  5.7 
adam  59  5.9 
 
q)t upsert (`jack; 47; 5.5)      # to modify in-place you need to specify by symbol name of the var `t 
name  age height                 # i.e. pass-by-name 
---------------- 
tom   31  6.1 
simon 14  5.7 
adam  59  5.9 
jack  47  5.5 
 
q)upsert[ t ; (`jack; 47; 5.5)]     # equivalent cmd, in the prefix form. 
name  age height 
---------------- 
tom   31  6.1 
simon 14  5.7 
adam  59  5.9 
jack  47  5.5 
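
note: a sketch of the in-place (pass-by-name) form mentioned above. tu2 is a made-up scratch table. 

q)tu2:([] name:`tom`simon ; age:31 14 ; height:6.1 5.7) 
q)`tu2 upsert (`jack; 47; 5.5)      // the symbol `tu2 => modifies tu2 itself and returns its name 
`tu2 
q)count tu2 
3 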
 
 
note: below are some other ways. 
 
 
q)t upsert ([] name:enlist `peter ; age:33 ; height:5.3)      # adding a record 
name  age height                                              # you need to make at least one column a list (hence enlist) 
---------------- 
tom   31  6.1 
simon 14  5.7 
adam  59  5.9 
peter 33  5.3 
 
q)t upsert ([] name:`peter`fred ; age:33 28 ; height:5.3 5.4)     # adding two records 
name  age height 
---------------- 
tom   31  6.1 
simon 14  5.7 
adam  59  5.9 
peter 33  5.3 
fred  28  5.4 
 
q)t upsert (`name`age`height) ! (`peter ; 28 ; 5.4)      # adding a record 
name  age height 
---------------- 
tom   31  6.1 
simon 14  5.7 
adam  59  5.9 
peter 28  5.4 
 
 
note: on regular table, upsert simply inserts (even dupe records). 
e.g. 
 
q) t 
c1 c2 
----- 
a  10 
b  20 
c  30 
 
q) t upsert  (`c ; 30) 
c1 c2 
----- 
a  10 
b  20 
c  30 
c  30 
 
 
## 
##  upsert on keyed tables 
## 
 
q) kt 
eid | name  age 
----| --------- 
1001| tom   65 
1002| bob   19 
1003| simon 34 
 
q) kt upsert (1001; `tom; 999) 
eid | name  age 
----| --------- 
1001| tom   999           # updated 
1002| bob   19 
1003| simon 34 
 
q)kt upsert (1005; `adam; 47) 
eid | name  age 
----| --------- 
1001| tom   65 
1002| bob   19 
1003| simon 34 
1005| adam  47           # inserted 
 
 
## 
##  upsert on serialized/splayed tables 
## 
 
you can upsert on a handle. 
e.g. 
 
q)`:/q4m/tser set ([] c1:`a`b; c2:1.1 2.2)      # appending/inserting a record to a serialized table 
q)`:/q4m/tser upsert (`c; 3.3) 
`:/q4m/tser 
q)get `:/q4m/tser 
c1 c2 
------ 
a  1.1 
b  2.2 
c  3.3 
 
 
q)`:/q4m/tsplay/ set ([] c1:`sym?`a`b; c2:1.1 2.2)      # appending/inserting a record to a splayed table 
`:/q4m/tsplay/ 
q)`:/q4m/tsplay upsert (`sym?`c; 3.3) 
`:/q4m/tsplay 
q)select from `:/q4m/tsplay 
c1 c2 
------ 
a  1.1 
b  2.2 
c  3.3 
 
 
### 
###   select 
### 
 
NOTE: "select" ALWAYS returns a table.   (in contrast, "exec" returns a list or a dict) 
 
basic syntax:  select [comma_separated_column_expressions] [by comma_separated_group_columns] from a_table [where comma_separated_predicates] 
 
note: a_table can be regular or keyed. 
      comma-separated predicates are evaluated from left to right (e.g. pred1, then pred2, then pred3), but each individual predicate is evaluated from right to left 
 
e.g. 
 
q)show t : ([] c1:`a`b`c; c2:10 20 30; c3:1.1 2.2 3.3) 
c1 c2 c3 
--------- 
a  10 1.1 
b  20 2.2 
c  30 3.3 
 
q)select from t       # notice how you don't need wildcard * 
c1 c2 c3 
--------- 
a  10 1.1 
b  20 2.2 
c  30 3.3 
 
q)t ~ select from t 
1b 
 
q)select c1,c3,c1 from t 
c1 c3  c11 
---------- 
a  1.1 a 
b  2.2 b 
c  3.3 c 
 
q)select c1, 2*c3 from t 
c1 x                           # notice unspecified column name is "x" 
------ 
a  2.2 
b  4.4 
c  6.6 
 
q)select c1, 2*c3, 3+c2, c2, c3+1, 5%c2  from t 
c1 x   x1 c2 c3  x2                                 # notice, after "x" it gets incremented 
--------------------------                          # x, x1, x2, and so on 
a  2.2 13 10 2.1 0.5 
b  4.4 23 20 3.2 0.25 
c  6.6 33 30 4.3 0.1666667 
 
q)select c1, foo:2*c3 from t         # you can name a column 
c1 foo                               # colon is not a variable assignment. just a syntactic sugar. 
------ 
a  2.2 
b  4.4 
c  6.6 
 

#  virtual (row index) column "i" 

 
q)select i,c1 from t       # there is a hidden virtual column "i" which gives row index 
x c1                       # note: select doesn't display i as i in the output column name. (better name it like i:i) 
----                       # note: this means you cannot create a column called "i" (q will ignore whatever you define as "i" column) 
0 a 
1 b 
2 c 
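
note: a sketch of naming and filtering on the virtual column, using t from above. idx is a made-up column name. 

q)select idx:i, c1 from t 
idx c1 
------ 
0   a 
1   b 
2   c 

q)select from t where i in 0 2      // pick rows by index 
c1 c2 c3 
--------- 
a  10 1.1 
c  30 3.3 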
 

#  distinct 

 
q)select distinct from ([] c1:`a`b`a; c2:10 20 10)    # this means distinct * (all columns) 
c1 c2                                                 # you can also  q) select distinct specific_column(s) 
----- 
a  10 
b  20 
 
## 
##  select[n]      # TOP n, ORDER BY 
## 
 
q)s 
s | name  status city 
--| ------------------- 
s1| smith 20     london 
s2| jones 10     paris 
s3| blake 30     paris 
s4| clark 20     london 
s5| adams 30     athens 
 
q) select[2] from s where name <> `smith      # first 2 records. equivalent to SELECT TOP 2 * FROM 
s | name  status city                         # same as q) select[0 2] 
--| ------------------ 
s2| jones 10     paris 
s3| blake 30     paris 
 
q) select[-2] from s where city <> `london    # last 2 records 
s | name  status city 
--| ------------------- 
s3| blake 30     paris 
s5| adams 30     athens 
 
q)select[1 3] from s        # select[n m] 
s | name  status city       # n is the starting index. e.g. n=0 means the first row. 
--| -------------------     # m is how many items 
s2| jones 10     paris 
s3| blake 30     paris 
s4| clark 20     london 
 
note: you can also use select[] to sort by columns, like xasc/xdesc  (equivalent to SQL "ORDER BY c1,c2,...,cn [ASC|DESC]") 
 
q)`city xdesc s               # ORDER BY city DESC 
s | name  status city 
--| ------------------- 
s2| jones 10     paris 
s3| blake 30     paris 
s1| smith 20     london 
s4| clark 20     london 
s5| adams 30     athens 
 
q)select[>city] from s        # ORDER BY city DESC 
s | name  status city 
--| ------------------- 
s2| jones 10     paris 
s3| blake 30     paris 
s1| smith 20     london 
s4| clark 20     london 
s5| adams 30     athens 
 
similarly, we can do ASC 
 
q)`city xasc s                # ORDER BY city [ASC] 
s | name  status city         # you can do `city`name`status to do ORDER BY city,name,status 
--| ------------------- 
s5| adams 30     athens 
s1| smith 20     london 
s4| clark 20     london 
s2| jones 10     paris 
s3| blake 30     paris 
 
q)select[<city] from s        # ORDER BY city [ASC] 
s | name  status city 
--| ------------------- 
s5| adams 30     athens 
s1| smith 20     london 
s4| clark 20     london 
s2| jones 10     paris 
s3| blake 30     paris 
 
q)select[2; <city] from s     # you can combine TOP and ORDER BY 
s | name  status city 
--| ------------------- 
s5| adams 30     athens 
s1| smith 20     london 
 
 
## 
##  select on nested columns 
## 
 
- requires lots of adverbs. 
 
q)show tnest:([] c1:`a`b`c; c2:(10 20 30; enlist 40; 50 60)) 
c1 c2 
----------- 
a  10 20 30 
b  ,40 
c  50 60 
 
q)select avg c2 from tnest       # error 
'length 
 
q)select c1,avg each c2 from tnest      # you need "each" 
c1 c2 
----- 
a  20 
b  40 
c  55 
 
q)update c3:(1.1 2.2 3.3; enlist 4.4; 5.5 6.6) from `tnest 
`tnest 
q)tnest 
c1 c2       c3 
----------------------- 
a  10 20 30 1.1 2.2 3.3 
b  ,40      ,4.4 
c  50 60    5.5 6.6 
 
q)select wtavg:c2 wavg' c3 from tnest     # each-both ' 
wtavg                                     # X wavg Y = average of Y weighted by X 
--------                                  #          = (sum X*Y) % sum X 
2.566667 
4.4 
6.1 
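
note: checking the formula by hand for the first row (w and v are made-up names for the weights and values) 

q)w:10 20 30 
q)v:1.1 2.2 3.3 
q)w wavg v 
2.566667 
q)(sum w*v) % sum w 
2.566667 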
 
 
### 
###   filtering with "where" "within" 
### 
 
q)s 
s | name  status city 
--| ------------------- 
s1| smith 20     london 
s2| jones 10     paris 
s3| blake 30     paris 
s4| clark 20     london 
s5| adams 30     athens 
 
q)select from s where status>10, city <> `paris      # notice , == "AND" 
s | name  status city 
--| ------------------- 
s1| smith 20     london 
s4| clark 20     london 
s5| adams 30     athens 
 
q)select from s where 00101b      # you can specify boolean for each row like this 
s | name  status city 
--| ------------------- 
s3| blake 30     paris 
s5| adams 30     athens 
 
q)select from s where i within 0 2      # same as  q) select[3] from s 
s | name  status city 
--| -------------------                 # note within is useful for specifying time range 
s1| smith 20     london                 # e.g. 
s2| jones 10     paris                  # select from trades where ticker=`ibm, time within 15:59:59.000 16:00:00.000     (assuming a "time" column) 
s3| blake 30     paris 
 
q)start_i : 0 
q)end_i : 2 
q)select from s where i within start_i end_i        # if you specify "within N M" with variables, 
'type                                               # you must give a general list 
 
q)select from s where i within (start_i ; end_i)    # like this 
s | name  status city 
--| ------------------- 
s1| smith 20     london 
s2| jones 10     paris 
s3| blake 30     paris 
 
q)select from s where city in `paris`athens       # "in"  just like SQL 
s | name  status city 
--| ------------------- 
s2| jones 10     paris 
s3| blake 30     paris 
s5| adams 30     athens 
 
 
NOTE: where clause in q-sql evaluates from left to right. so always put most restrictive conditions first. 
  e.g. 
       // suppose t is a giant market data history table. 
       q) select from t where ymd=2018.05.13, name=`ibm, not null bidSize;   // filter by date first, then name, then bidSize 
 
 
NOTE: for nested columns, you have to use where with adverb 
  e.g. 
 
q)ts:([] f:1.1 2.2 3.3; s:("abc";enlist "d";"efg")) 
q)ts 
f   s 
--------- 
1.1 "abc" 
2.2 ,"d" 
3.3 "efg" 
 
q)select from ts where s ~ "efg"       # "~" here compares the entire column, not each item 
f s 
--- 
 
q)select from ts where s ~\: "efg"     # each-left \: 
f   s 
--------- 
3.3 "efg" 
 
q)select from ts where s like "efg"     # no wildcard in the pattern, so this only captures the exact match 
f   s                                   # for partial match (like grep), use a wildcard pattern such as "*e*", or ss 
--------- 
3.3 "efg" 
 
q)ts[where `boolean$ count each ts[`s] ss\: "e"] 
f   s 
--------- 
3.3 "efg" 
 
q)select from ts where `boolean$ count each s ss\: "e" 
f   s 
--------- 
3.3 "efg" 
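
note: a sketch of the partial match with a wildcard pattern, which avoids the ss gymnastics. 

q)select from ts where s like "*e*" 
f   s 
--------- 
3.3 "efg" 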
 
 
NOTE: you can use "&" (i.e. "and") or "|" operator in "where" phrase, but it requires parentheses for grouping. 
   e.g. 
 
q)select from t where c1 > 34, c2 = `ibm 
q)select from t where (c1 > 34) & c2 = `ibm      # equivalent. notice the parenthesis on the left operand is a must. 
q)select from t where (c1 > 34) and c2 = `ibm    # same thing. generally, you should use "," instead, for logical AND 
q)select from t where (c1 > 34) | c2 = `ibm      # logical OR 
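
NOTE: "," and "&" are not always interchangeable. with ",", each condition only sees the rows that passed the previous ones,
      which matters as soon as a later condition uses an aggregate. a small sketch, using a made-up table tw:

q)tw:([] c1:1 2 3 4; c2:10 20 30 40)
q)select from tw where c1>2, c2=min c2        // min c2 is taken over the rows where c1>2, i.e. 30
c1 c2
-----
3  30

q)select from tw where (c1>2) & c2=min c2     // min c2 is taken over the whole column, i.e. 10
c1 c2
-----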
 
 
### 
###   by, fby        (GROUP BY, HAVING in SQL) 
### 
 
NOTE: "by" always returns a keyed table 
 
q)p 
p | name  color weight city 
--| ------------------------- 
p1| nut   red   12     london 
p2| bolt  green 17     paris 
p3| screw blue  17     rome 
p4| screw red   14     london 
p5| cam   blue  12     paris 
p6| cog   red   19     london 
 
q)select by city from p             # see the syntax 
city  | p  name  color weight       # notice "by" groups by column, but only takes the last row 
------| ---------------------       # which is often useless so you usually combine groupby with some aggr functions 
london| p6 cog   red   19           # here you got "19"  i.e. the last row 
paris | p5 cam   blue  12 
rome  | p3 screw blue  17 
 
q)select avg weight by city from p     # more realistic to both aggregate and group 
city  | weight                         # aggr means avg, max, min, sum, etc 
------| ------                         # average weight by city 
london| 15 
paris | 14.5 
rome  | 17 
 
q)select max weight by city from p     # max weight by city 
city  | weight 
------| ------ 
london| 19 
paris | 17 
rome  | 17 
 
q)select max weight by city,name from p     # max weight by city AND name 
city   name | weight 
------------| ------ 
london cog  | 19 
london nut  | 12 
london screw| 14 
paris  bolt | 17 
paris  cam  | 12 
rome   screw| 17 
 
NOTE: what if you want the name who has max weight ? 
      a simple "by" gives you below. no good. 
 e.g. 
 
q)select name, max weight by city from p 
city  | name           weight 
------| --------------------- 
london| `nut`screw`cog 19 
paris | `bolt`cam      17 
rome  | ,`screw        17 
 
===> solution is "fby"  equivalent to HAVING in SQL 
     its syntax is   (aggr_func; column) fby column 
 
q)select from p where weight=(max;weight) fby city      # perfect 
p | name  color weight city                             # shows records having max weight for each city 
--| ------------------------- 
p2| bolt  green 17     paris 
p3| screw blue  17     rome 
p6| cog   red   19     london 
 
q)select from p where weight=(max;weight) fby city, color=`blue     # you can add more filter like this 
p | name  color weight city 
--| ----------------------- 
p3| screw blue  17     rome 
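
NOTE: to see why this works, it helps to look at what fby returns on its own:
      a list as long as the table, where each row holds its own group's aggregate.
      a quick sketch with exec on the same p table:

q)exec (max;weight) fby city from p           // max weight of each row's city
19 17 17 19 17 19

q)exec weight=(max;weight) fby city from p    // the boolean mask that "where" then applies
011001b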
 
 
note: what if you want to fby multiple columns ?   e.g. city,color
 
q)select from p where weight=(max;weight) fby ([] city;color)     # sadly, you have to do this anonymous table thing 
p | name  color weight city 
--| ------------------------- 
p2| bolt  green 17     paris 
p3| screw blue  17     rome 
p5| cam   blue  12     paris 
p6| cog   red   19     london 
 
 
// a quiz: find records where weight is one of min,avg,max  per city & color 
 
select name,weight,color,city from p where ({x in (min;avg;max)@\:x};weight) fby ([] city;color) 
 
// a more realistic example 
 
q)\l /Users/kenics/q/mktrades.q 
q)count trades 
10000000 
 
q)5 # trades 
dt         tm           sym  qty  px 
---------------------------------------- 
2015.01.01 00:00:01.129 aapl 6410 99.18 
2015.01.01 00:00:02.099 aapl 5200 93.82 
2015.01.01 00:00:02.670 ibm  6840 196.92 
2015.01.01 00:00:02.885 goog 7400 610.98 
2015.01.01 00:00:03.396 aapl 8460 97.14 
 
q)-5 # trades 
dt         tm           sym  qty  px 
---------------------------------------- 
2015.01.31 23:59:56.633 ibm  6360 195.34 
2015.01.31 23:59:56.936 goog 1920 591.96 
2015.01.31 23:59:57.204 ibm  290  183.96 
2015.01.31 23:59:57.226 aapl 6240 95.86 
2015.01.31 23:59:57.754 aapl 1730 95.3 
 
===> here are typical operations 
 
q)select from trades where ({x in (min x;max x)};px) fby ([] dt;sym)    // find trades with min or max px per date & ticker 
q)select from trades where ({x in (max;min)@\:x};px) fby ([] dt;sym)    // equivalent 
 
q)select from trades where ({x within (avg[x]-10; avg[x]+10)};px) fby ([] dt;sym)   // trades within +-10 of avg px per date & ticker 
 
 
 
### 
###  grouping and aggregation examples 
### 
 

#  aggr without grouping 

 
q)sp 
s  p  qty 
--------- 
s1 p1 300 
s1 p2 200 
s1 p3 400 
s1 p4 200 
s4 p5 100 
s1 p6 100 
s2 p1 300 
s2 p2 400 
s3 p2 200 
s4 p2 200 
s4 p4 300 
s1 p5 400 
 
q)select total:sum qty, mean:avg qty from sp 
total mean 
-------------- 
3100  258.3333 
 
 

#  grouping without aggregation   (and ungrouping) 

 
q)t:([] c1:`a`b`a`b`c; c2:10 20 30 40 50) 
q)select c2 by c1 from t 
c1| c2 
--| ----- 
a | 10 30        # grouping without aggr gives you nested column 
b | 20 40        # ask yourself if you really want it 
c | ,50 
 
q)ungroup select c2 by c1 from t      # "ungroup" example to untangle nested column 
c1 c2 
----- 
a  10 
a  30 
b  20 
b  40 
c  50 
 
NOTE: sometimes, you intentionally want to group by a column and ALL remaining columns nested. 
      use "xgroup" like below. 
 
q)sp 
s  p  qty 
--------- 
s1 p1 300 
s1 p2 200 
s1 p3 400 
s1 p4 200 
s4 p5 100 
s1 p6 100 
s2 p1 300 
s2 p2 400 
s3 p2 200 
s4 p2 200 
s4 p4 300 
s1 p5 400 
 
q)select s,qty by p from sp            # plain select-by. "xgroup" (next) gives the same result
p | s               qty 
--| ------------------------------- 
p1| `s$`s1`s2       300 300 
p2| `s$`s1`s2`s3`s4 200 400 200 200 
p3| `s$,`s1         ,400 
p4| `s$`s1`s4       200 300 
p5| `s$`s4`s1       100 400 
p6| `s$,`s1         ,100 
 
q)`p xgroup sp 
p | s               qty 
--| ------------------------------- 
p1| `s$`s1`s2       300 300 
p2| `s$`s1`s2`s3`s4 200 400 200 200 
p3| `s$,`s1         ,400 
p4| `s$`s1`s4       200 300 
p5| `s$`s4`s1       100 400 
p6| `s$,`s1         ,100 
 
 

#  aggregating and grouping  example 

 
q)show trd:([] desk:`a`b`a`b`a`b; acct:`1`2`3`4`1`4; pnl:1.1 -2.2 3.3 4.4 5.5 -.5) 
desk acct pnl 
-------------- 
a    1    1.1 
b    2    -2.2 
a    3    3.3 
b    4    4.4 
a    1    5.5 
b    4    -0.5 
 
q)select ct:count desk, sum pnl by desk,acct from trd
desk acct| ct pnl 
---------| ------- 
a    1   | 2  6.6 
a    3   | 1  3.3 
b    2   | 1  -2.2 
b    4   | 2  3.9 
 
NOTE: "xbar" is a powerful func, that lets you group by range
     (in SQL you would have to spell this out with floor/division arithmetic)
 
q)trd:([] time:00:00:00.000+til 1000000;ticker:1000000?`aapl`ibm; price:1000000?100.0) 
q)select[7] from trd 
time         ticker price 
---------------------------- 
00:00:00.000 ibm    51.28939 
00:00:00.001 aapl   71.54531 
00:00:00.002 ibm    34.79039 
00:00:00.003 aapl   93.31341 
00:00:00.004 ibm    73.64311 
00:00:00.005 aapl   22.54372 
00:00:00.006 ibm    46.09312 
 
q)10 # select avg price by 100 xbar time,ticker from trd 
time         ticker| price 
-------------------| -------- 
00:00:00.000 aapl  | 46.28815 
00:00:00.000 ibm   | 49.48895 
00:00:00.100 aapl  | 40.09082 
00:00:00.100 ibm   | 53.79035 
00:00:00.200 aapl  | 43.3644 
00:00:00.200 ibm   | 45.26295 
00:00:00.300 aapl  | 47.91035 
00:00:00.300 ibm   | 41.21501 
00:00:00.400 aapl  | 50.0714 
00:00:00.400 ibm   | 55.6332 
 
 
(ref) https://code.kx.com/wiki/Reference/xbar 
e.g. 
 
q)3 xbar til 16                         # rounds its right arg down to the nearest multiple of the left arg (integer) 
0 0 0 3 3 3 6 6 6 9 9 9 12 12 12 15 
 
q)5 xbar 11:00 + 0 2 3 5 7 11 13 
11:00 11:00 11:00 11:05 11:05 11:10 11:10 
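
NOTE: xbar is not limited to time columns. a small sketch, bucketing the weight column of the p table (from sp.q) into bands of 5:

q)select count i by 5 xbar weight from p      // 12 14 12 fall in the 10 band, 17 17 19 in the 15 band
weight| x
------| -
10    | 3
15    | 3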
 
 
### 
###  table operation example: how to collapse a span with a gap within a threshold
### 
 
q)show  t : ([] name:`a`b`c`a`a`b`a`a`c`b`a`c`b; iq:12 34 56 12 12 34 88 12 99 77 12 99 77; sd:2010.12.03 2009.11.23 2010.12.03 2010.12.11 2011.03.05 2018.01.01 2003.07.13 2003.04.25 2020.10.01 2015.03.15 2003.07.15 2010.12.10 2015.04.03; ed:2010.12.05 2009.12.31 2010.12.09 2011.03.05 2011.04.23 2018.09.14 2003.09.05 2003.07.09 2020.12.31 2015.04.01 2003.09.08 2010.12.20 2015.04.15) 
name iq sd         ed 
----------------------------- 
a    12 2010.12.03 2010.12.05 
b    34 2009.11.23 2009.12.31 
c    56 2010.12.03 2010.12.09 
a    12 2010.12.11 2011.03.05 
a    12 2011.03.05 2011.04.23 
b    34 2018.01.01 2018.09.14 
a    88 2003.07.13 2003.09.05 
a    12 2003.04.25 2003.07.09 
c    99 2020.10.01 2020.12.31 
b    77 2015.03.15 2015.04.01 
a    12 2003.07.15 2003.09.08 
c    99 2010.12.10 2010.12.20 
b    77 2015.04.03 2015.04.15 
 
collapseSpan:{[tbl,klist] 
 
 

 
 
 
### 
###   exec 
### 
 
identical syntax to "select" 
only difference is the return type 
 
select : always returns a table 
  exec : returns a list  if one column selected 
                 a dict  if multiple columns selected 
 
e.g. 
 
q)show t:([] name:`a`b`c`d`e; state:`NY`FL`OH`NY`HI) 
name state 
---------- 
a    NY 
b    FL 
c    OH 
d    NY 
e    HI 
 
q)select name from t 
name 
----                    # select always returns a table
a
b
c
d
e
 
q)exec name from t      # returns a list if a single column selected 
`a`b`c`d`e 
 
q) exec name,state from t     # returns a dict if multiple columns selected 
name | a  b  c  d  e 
state| NY FL OH NY HI 
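
NOTE: a quick way to check the return types described above is "type"
      (98h = table, 11h = symbol list, 99h = dictionary). a sketch on the same t:

q)type select name from t
98h
q)type exec name from t
11h
q)type exec name,state from t
99h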
 
note: exec returning a dict is really useful when the output columns differ in length.
  e.g. 
 
q)select name, distinct state from t      # error 
'length 
 
q)exec name, distinct state from t      # here you get dict 
name | `a`b`c`d`e 
state| `NY`FL`OH`HI 
 
NOTE: distinct works on a single column only in exec, unlike how it works on multiple columns in select 
e.g. 
q)select distinct state,name from t      // this selects distinct combo of name & state 
state name 
---------- 
NY    a 
FL    b 
OH    c 
NY    d 
HI    e 
 
q)exec distinct state,name from t       //  notice how distinct works only on a single column 
state| `NY`FL`OH`HI 
name | `a`b`c`d`e 
 
q)exec distinct state,distinct name from t     // if you wanna distinct multi columns, you must do distinct on each column like this 
state| `NY`FL`OH`HI 
name | `a`b`c`d`e 
 
 
### 
###  update 
### 
 
upsert lets you update table record-wise. (insert for regular tables, upsert for keyed tables) 
here you may want to update table column-wise. 
 
e.g. 
 
q)t:([] c1:`a`b`c; c2:10 20 30) 
 
q)update c1:`x`y`z from t     # updating c1 
c1 c2                         # to update in-place, you need `t 
----- 
x  10 
y  20 
z  30 
 
q)t 
c1 c2 
----- 
a  10 
b  20 
c  30 
 
q)update c3:`x`y`z from t     # inserting c3 
c1 c2 c3                      # again, to modify in-place, you need `t 
-------- 
a  10 x 
b  20 y 
c  30 z 
 
q)t 
c1 c2 
----- 
a  10 
b  20 
c  30 
 
q)update c3:999 from t      # notice how an atom is extended to every row
c1 c2 c3 
--------- 
a  10 999 
b  20 999 
c  30 999 
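
NOTE: a minimal sketch of the in-place form mentioned above, on a throwaway table tu (so t stays as it is).
      passing the table by name (`tu) modifies it and returns the name. the other way is to assign the result back.

q)tu:([] c1:`a`b`c; c2:10 20 30)
q)update c2:2*c2 from `tu          // by name: modifies tu in place
`tu
q)tu
c1 c2
-----
a  20
b  40
c  60

q)tu:update c3:999 from tu         // by value: reassign the result
q)cols tu
`c1`c2`c3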
 
##  a more practical example 
 
q)show accnt : ([] name:`tom`simon`adam ; balance:100 90 120) 
name  balance 
------------- 
tom   100 
simon 90 
adam  120 
 
q)update balance:balance + 10 from accnt where name = `adam         # adding 10 to adam's account 
name  balance 
------------- 
tom   100 
simon 90 
adam  130 
 

#  update-by 

 
another powerful example, you can use grouping and aggregating 
 
q)p 
p | name  color weight city 
--| ------------------------- 
p1| nut   red   12     london 
p2| bolt  green 17     paris 
p3| screw blue  17     rome 
p4| screw red   14     london 
p5| cam   blue  12     paris 
p6| cog   red   19     london 
 
q)update avg weight by city from p     # see how you applied avg weight for every entry, grouped by city 
p | name  color weight city 
--| ------------------------- 
p1| nut   red   15     london 
p2| bolt  green 14.5   paris 
p3| screw blue  17     rome 
p4| screw red   15     london 
p5| cam   blue  14.5   paris 
p6| cog   red   15     london 
 
===> a more realistic example 
 
q)update city_avg_weight:avg weight, city_max_weight:max weight by city from p 
p | name  color weight city   city_avg_weight city_max_weight 
--| --------------------------------------------------------- 
p1| nut   red   12     london 15              19 
p2| bolt  green 17     paris  14.5            17 
p3| screw blue  17     rome   17              17 
p4| screw red   14     london 15              19 
p5| cam   blue  12     paris  14.5            17 
p6| cog   red   19     london 15              19 
 
 
### 
###   delete (columns or rows) of tables 
### 
 
q)show t:([] c1:`a`b`c; c2:10 20 30) 
c1 c2 
----- 
a  10 
b  20 
c  30 
 
## 
##  delete column 
## 
 
q)delete c1 from t 
c2 
-- 
10 
20 
30 
 
 
## 
##  delete row 
## 
 
q)delete from t where c2>10, c2<30 
c1 c2 
----- 
a  10 
c  30 
 
 
NOTE: you can make good use of delete if you want to select everything except for a column (or a row) 
  e.g. 
 
q)p 
p | name  color weight city 
--| ------------------------- 
p1| nut   red   12     london 
p2| bolt  green 17     paris 
p3| screw blue  17     rome 
p4| screw red   14     london 
p5| cam   blue  12     paris 
p6| cog   red   19     london 
 
q)delete from p where p=`p4       # only removing `p4 row 
p | name  color weight city 
--| ------------------------- 
p1| nut   red   12     london 
p2| bolt  green 17     paris 
p3| screw blue  17     rome 
p5| cam   blue  12     paris 
p6| cog   red   19     london 
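
NOTE: and the column case, as a sketch on the same p table: everything except weight, without listing the other columns.

q)delete weight from p
p | name  color city
--| ------------------
p1| nut   red   london
p2| bolt  green paris
p3| screw blue  rome
p4| screw red   london
p5| cam   blue  paris
p6| cog   red   london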
 
 
### 
###  sort (by column)         # ORDER BY 
### 
 
q)t:([] c1:`a`b`c`a; c2:20 10 40 30) 
 
## 
##   x xasc y 
## 
 
y is a table 
x is a symbol list of column names of y 
 
recall "asc" sorts a list. "xasc" works on columns of a table 
 e.g. 
 
q)asc  32 12 45 8 
`s#8 12 32 45 
 
q)`c2 xasc t          # sort by c2, ascending order 
c1 c2                 # to modify in-place, use `t 
----- 
b  10 
a  20 
a  30 
c  40 
 
q)`c1`c2 xasc t        # sort by c1 first, then by c2 
c1 c2 
----- 
a  20 
a  30 
b  10 
c  40 
 
q)`c1 xasc `t        #  if you give y (table name) by reference, it updates in place 
 
## 
##  x xdesc y 
## 
 
q)desc  32 12 45 8 
45 32 12 8 
 
q)`c1`c2 xdesc t       # to modify in-place, use `t 
c1 c2 
----- 
c  40 
b  10 
a  30 
a  20 
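
NOTE: for a mixed sort (one column ascending, another descending) you can chain the two, relying on the sorts being stable.
      apply the minor sort first (innermost), then the major one. a sketch on the same t:

q)`c1 xasc `c2 xdesc t       // c1 ascending, and within each c1, c2 descending
c1 c2
-----
a  30
a  20
b  10
c  40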
 
 
### 
###  rename & rearrange columns
### 
 
let's use this table. 
 
q)show t:([] c1:`a`b`c; c2:10 20 30; c3:1.1 2.2 3.3) 
c1 c2 c3 
--------- 
a  10 1.1 
b  20 2.2 
c  30 3.3 
 
## 
##   x xcol y   (rename columns) 
## 
 
x is new column names (given in a symbol list) 
y is a table 
 
well you can rename like this. 
 
q)select c4:c1, c5:c2, c6:c3 from t     // note if you use "update" it gives you c1,c2,c3,c4,c5,c6 instead 
c4 c5 c6 
--------- 
a  10 1.1 
b  20 2.2 
c  30 3.3 
 
q)`c4`c5`c6 xcol t      #  "xcol" lets you rename columns also 
c4 c5 c6 
--------- 
a  10 1.1 
b  20 2.2 
c  30 3.3 
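
NOTE: xcol also accepts fewer names than there are columns, in which case only the leading columns get renamed.
      a sketch on the same t:

q)`x`y xcol t          // c1 -> x, c2 -> y, c3 untouched
x y  c3
--------
a 10 1.1
b 20 2.2
c 30 3.3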
 
 
## 
##   x xcols y    (rearrange columns) 
## 
 
q)select c3,c1,c2 from t     # a simple way to rearrange columns 
c3  c1 c2 
--------- 
1.1 a  10 
2.2 b  20 
3.3 c  30 
 
q)`c3`c1 xcols t      # or you can use "xcols" to specify first columns 
c3  c1 c2             # note; you cannot use xcols on keyed tables 
--------- 
1.1 a  10 
2.2 b  20 
3.3 c  30 
 
 
## 
##   x xgroup y 
## 
 
y is a table 
x is a symbol atom (or list) of column names of y 
 
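recall group by in q-sql
(the outputs below come from a table with a repeated c1 value, not the t defined above. a definition consistent with them would be)

q)t:([] c1:`a`b`c`a; c2:10 20 30 40; c3:0 1 2 3)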
 
q)select c2,c3 by c1 from t 
c1| c2    c3 
--| --------- 
a | 10 40 0 3 
b | ,20   ,1 
c | ,30   ,2 
 
q)`c1 xgroup t 
c1| c2    c3 
--| --------- 
a | 10 40 0 3 
b | ,20   ,1 
c | ,30   ,2 
 
 
## 
##  ungroup[x]       // x is a table 
## 
 
continuing from the above example.
 
q)ungroup `c1 xgroup t      // it basically looks at all columns/rows and creates a table where every cell is an atom 
c1 c2 c3                    // i.e. list decomposition 
-------- 
a  10 0 
a  40 3 
b  20 1 
c  30 2 
 
 
 
#### 
####  table join 
#### 
 
recall from SQL 
 
[1] (inner) join       : intersection (i.e. records matched in both tables) 
[2] left (outer) join  : entire left table | matched vals from right table 
[3] right (outer) join : entire right table | matched vals from left table 
[4] full (outer) join  : return all records when there is a match in either left or right table 
 
 
(ref)  http://kenics.net/misc/sql_note/       # read the section on "why venn diagram is bad for illustrating left/right/full joins"
 
 
for demo, let's use these. 
 
$ cat sp.q 
 
s:([s:`s1`s2`s3`s4`s5] 
 name:`smith`jones`blake`clark`adams; 
 status:20 10 30 20 30; 
 city:`london`paris`paris`london`athens) 
 
p:([p:`p1`p2`p3`p4`p5`p6] 
 name:`nut`bolt`screw`screw`cam`cog; 
 color:`red`green`blue`red`blue`red; 
 weight:12 17 17 14 12 19; 
 city:`london`paris`rome`london`paris`london) 
 
sp:([] 
 s:`s$`s1`s1`s1`s1`s4`s1`s2`s2`s3`s4`s4`s1;     / fkey 
 p:`p$`p1`p2`p3`p4`p5`p6`p1`p2`p2`p2`p4`p5;     / fkey 
 qty:300 200 400 200 100 100 300 400 200 200 300 400) 
------------------------------------------- 
 
q) p 
p | name  color weight city 
--| ------------------------- 
p1| nut   red   12     london 
p2| bolt  green 17     paris 
p3| screw blue  17     rome 
p4| screw red   14     london 
p5| cam   blue  12     paris 
p6| cog   red   19     london 
 
q) s 
s | name  status city 
--| ------------------- 
s1| smith 20     london 
s2| jones 10     paris 
s3| blake 30     paris 
s4| clark 20     london 
s5| adams 30     athens 
 
q)sp 
s  p  qty 
--------- 
s1 p1 300 
s1 p2 200 
s1 p3 400 
s1 p4 200 
s4 p5 100 
s1 p6 100 
s2 p1 300 
s2 p2 400 
s3 p2 200 
s4 p2 200 
s4 p4 300 
s1 p5 400 
 
q)meta sp 
c  | t f a 
---| ----- 
s  | s s 
p  | s p 
qty| j 
 
 
## 
##  (implicit) join using foreign keys 
## 
 
q)select s, s.name: s.name, p, p.name: p.name , qty from sp 
s  s.name p  p.name qty 
----------------------- 
s1 smith  p1 nut    300 
s1 smith  p2 bolt   200 
s1 smith  p3 screw  400 
s1 smith  p4 screw  200 
s4 clark  p5 cam    100 
s1 smith  p6 cog    100 
s2 jones  p1 nut    300 
s2 jones  p2 bolt   400 
s3 blake  p2 bolt   200 
s4 clark  p2 bolt   200 
s4 clark  p4 screw  300 
s1 smith  p5 cam    400 
 
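NOTE: foreign keys can be chained with dot notation across tables.
      if sp.s is a foreign key into s, and s has a column eid that is a foreign key into emaster, then s.eid.currency (queried from sp) resolves through both.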
  e.g. 
 
q)emaster:([eid:1001 1002 1003 1004 1005] currency:`gbp`eur`eur`gbp`eur)      # define a new table "emaster" 
q)update eid:`emaster$1001 1002 1005 1004 1003 from `s                        # set a new column "eid" in table s, as foreign key to key column in emaster 
`s 
q)select s.name, qty, s.eid.currency from sp       # see s.eid.currency syntax 
name  qty currency 
------------------ 
smith 300 gbp 
smith 200 gbp 
smith 400 gbp 
smith 200 gbp 
clark 100 gbp 
smith 100 gbp 
jones 300 eur 
jones 400 eur 
blake 200 eur 
clark 200 gbp 
clark 300 gbp 
smith 400 gbp 
 
 
## 
##  left-join "lj" 
## 
 
syntax is    source_table lj target_keyed_table 
 
note: source_table can be either regular or keyed. 
      but source_table must have a column whose name is the same as the key column of target_table 
 
q) t:([] k:1 2 3 4; c:10 20 30 40)        # here source_table is regular 
q) kt:([k:2 3 4 5] v:200 300 400 500) 
q) t lj kt 
k c  v 
-------- 
1 10 
2 20 200 
3 30 300 
4 40 400 
 
q) tt:([k:1 2 3 4] c:10 20 30 40)        #  here source_table is keyed 
q) tt lj kt 
k| c  v 
-| ------ 
1| 10 
2| 20 200 
3| 30 300 
4| 40 400 
 
note:  if a non-key column name exists in both source_table and target_table, the target_table value wins (for matched rows).
  e.g. 
 
q)t:([] k:1 2 3 4; v:10 20 30 40) 
q)kt:([k:2 3 4 5]; v:200 300 400 500) 
q)t lj kt 
k v 
----- 
1 10 
2 200 
3 300 
4 400 
 
 
NOTE: there is also  ljf[t1;t2]  which differs slightly from  lj[t1;t2]  in how nulls in t2 are treated
 
q)show t1:([] name:`ibm`msft; px:12 34) 
name px 
------- 
ibm  12 
msft 34 
 
q)show t2:([] name:`ibm`msft; px:0N 56) 
name px 
------- 
ibm 
msft 56 
 
q)t1 lj 1!t2     // even null overlays 
name px 
------- 
ibm 
msft 56 
 
q)t1 ljf 1!t2    // null doesnt overlay. i.e. it's coalesce 
name px 
-------          // use ljf[] when you really want it 
ibm  12          // often overlay null is actually what you want 
msft 56 
 
==> similarly, there are ijf[], ujf[] 
 
 
### 
###  right-join "rj" 
### 
 
no native implementation, but it's trivial. 
 
rj:{[t1;t2]      // assume t2 is a keyed table, and t1 has t2's key column(s)
 k:keys t2;
 lj[0!t2; k xkey t1]      // unkey t2, key t1 on the same column(s), then left-join
 }
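
usage sketch, with the same shapes as the lj examples above (tl and ktr are made-up names):

q)tl:([] k:1 2 3 4; c:10 20 30 40)
q)ktr:([k:2 3 4 5] v:200 300 400 500)
q)rj[tl;ktr]            // every row of ktr, plus the matching c from tl
k v   c
--------
2 200 20
3 300 30
4 400 40
5 500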
 
### 
###  inner-join "ij" 
### 
 
in SQL, inner join looks like below. 
(given two tables m & d, and a common key k) 
 
select m.foo, d.bar from m,d WHERE m.k = d.k 
select m.foo, d.bar from m INNER JOIN d ON m.k = d.k 
 
q-sql "ij" syntax is 
 
source_table ij target_keyed_table     # similar to left-join "lj" 
 
note: source_table can be regular or keyed 
      joins based on foreign key (because they are linked), or based on common key (key column on target_table). 
 
e.g. 
 
q)t:([] k:1 2 3 4; c:10 20 30 40) 
q)kt:([k:2 3 4 5]; v:2.2 3.3 4.4 5.5) 
q)t ij kt 
k c  v 
-------- 
2 20 2.2 
3 30 3.3 
4 40 4.4 
 
q)t:([] k:1 2 3 4; v:10 20 30 40) 
q)kt:([k:2 3 4 5]; v:200 300 400 500) 
q)t ij kt 
k v 
-----       # notice how target_table prevailed if duplicate columns 
2 200 
3 300 
4 400 
 
q)kt1:([k:1 2 3 4]; v:10 0N 30 40)       # source_table can be keyed like this also 
q)kt:([k:2 3 4 5]; v:200 300 400 500) 
q)kt1 ij kt 
k| v 
-| --- 
2| 200 
3| 300 
4| 400 
 
 
### 
###  equi-join "ej" 
### 
 
syntax is ej[column_symbol_name ; t1; t2] 
 
note:  "ej" joins two tables on the given column_symbol_name
       both tables can be regular, as we explicitly specify column name 
 
e.g. 
 
q)t1:([] k:1 2 3 4; c:10 20 30 40) 
 
q)show t2:([] k:2 2 3 4 5; c:200 222 300 400 500; v:2.2 22.22 3.3 4.4 5.5) 
k c   v 
----------- 
2 200 2.2 
2 222 22.22 
3 300 3.3 
4 400 4.4 
5 500 5.5 
 
q)`k xkey t2 
k| c   v 
-| --------- 
2| 200 2.2 
2| 222 22.22 
3| 300 3.3 
4| 400 4.4 
5| 500 5.5 
 
q)t1 ij `k xkey t2      # notice how "ij" ignores the 222 22.22 record 
k c   v 
--------- 
2 200 2.2 
3 300 3.3 
4 400 4.4 
 
q)ej[`k;t1;t2]         # "ej" returns ALL matching record from the right table 
k c   v 
----------- 
2 200 2.2 
2 222 22.22 
3 300 3.3 
4 400 4.4 
 
 
### 
###  plus-join "pj" 
### 
 
"pj" is "lj" that adds data across dupe columns, instead of target_table overwriting. 
 
note: as such, the columns being added (the non-key columns of target_table) MUST be numeric.
 e.g. 
 
q) t :([] k:`a`b`c; a:100 200 300; b:10. 20. 30.; c:1 2 3) 
q) kt:([k:`a`b] a:10 20; b:1.1 2.2) 
q) t pj kt 
k a   b     c 
------------- 
a 110 11.1  1 
b 220 22.2  2 
c 300 30    3
 
 
### 
###  union join "uj" 
### 
 
recall you can append (union) two tables with ","  IF the two tables' columns match.
 
q)t 
c1 c2 c3 
--------- 
a  10 1.1 
b  20 2.2 
c  30 3.3 
 
q)t,t 
c1 c2 c3 
--------- 
a  10 1.1 
b  20 2.2 
c  30 3.3 
a  10 1.1 
b  20 2.2 
c  30 3.3 
 
===> "uj" is more powerful, joins two tables of any shape. 
 
q)t1:([] c1:`a`b`c; c2: 10 20 30) 
q)t2:([] c1:`x`y; c3:8.8 9.9) 
q)t1 uj t2 
c1 c2 c3 
--------- 
a  10 
b  20 
c  30 
x     8.8 
y     9.9 
 
q)kt1:([k:1 2 3] v1:`a`b`c; v2:10 20 30)     # works on keyed tables also 
q)kt2:([k:3 4] v2:300 400; v3:3.3 4.4)       # see how right_table prevails (aka upsert semantics) 
q)kt1 uj kt2 
k| v1 v2  v3 
-| ----------    // note: when uj[t1;t2] with keyed column(s), you must key the exact same column(s) on BOTH tables 
1| a  10         //       it's just syntactic sugar 
2| b  20 
3| c  300 3.3 
4|    400 4.4 
 
 
### 
###  full outer join 
### 
 
sadly, there is no native "oj" function. but it's trivial to implement. 
 
conceptually all you need is 
- get inner join table : ijt 
- get left join table  : ljt 
- get right join table : rjt 
 
then (ljt - ijt) + ijt + (rjt - ijt) = oj[t1;t2;keyCols] 
 
q)show t1:([] name:`ibm`msft`aapl; val:12 34 56) 
name val 
-------- 
ibm  12 
msft 34 
aapl 56 
 
q)show t2:([] name:`ibm`ibm`dell; qty:9.8 7.6 5.4) 
name qty 
-------- 
ibm  9.8 
ibm  7.6 
dell 5.4 
 
 
q)show ijt: t1 ij `name xkey t2            // inner join 
name val qty 
------------ 
ibm  12  9.8 
 
q)show ljt: t1 lj `name xkey t2            // left join 
name val qty 
------------ 
ibm  12  9.8 
msft 34 
aapl 56 
 
q)show rjt: t2 lj `name xkey t1            // right join (but col order gets diff, let's enforce consistency with ij & lj results) 
name qty val 
------------ 
ibm  9.8 12 
ibm  7.6 12 
dell 5.4 
 
q)show rjt: (cols ljt) xcols t2 lj `name xkey t1      // right join (ver2) 
name val qty 
------------ 
ibm  12  9.8 
ibm  12  7.6 
dell     5.4 
 
q)ijt,(ljt except ijt),(rjt except ijt)         // outer join 
name val qty 
------------ 
ibm  12  9.8 
msft 34 
aapl 56 
ibm  12  7.6 
dell     5.4 
 
 
NOTE: in practice, maybe you can get away with just distinct ljt,rjt 
 
q)distinct ljt,rjt 
name val qty 
------------ 
ibm  12  9.8 
msft 34 
aapl 56 
ibm  12  7.6 
dell     5.4 
 
// putting it all together 
q)oj:{[t1;t2;keyCols] ljt:t1 lj keyCols xkey t2; rjt:(cols ljt) xcols t2 lj keyCols xkey t1; distinct ljt,rjt} 
q)oj[t1;t2;`name] 
name val qty 
------------ 
ibm  12  9.8 
msft 34 
aapl 56 
ibm  12  7.6 
dell     5.4 
 
q)t1 uj t2           // notice how oj and uj differ 
name val qty 
------------ 
ibm  12 
msft 34 
aapl 56 
ibm      9.8 
ibm      7.6 
dell     5.4 
 
q)(1!t1) uj 1!t2       // even when you key t1 & t2, and uj 
name| val qty          // clearly not what we intended 
----| ------- 
ibm | 12  9.8 
msft| 34 
aapl| 56 
dell|     5.4 
 
 
NOTE: the above oj fails in certain situations involving dupe keys 
 
q)show t1:([] name:`ibm`msft`aapl; val:12 34 56; tc:`odd`normal`odd) 
name val tc 
--------------- 
ibm  12  odd 
msft 34  normal 
aapl 56  odd 
 
q)show t2:([] name:`ibm`ibm`dell; qty:9.8 7.6 5.4; tc:`odd`normal`odd) 
name qty tc 
--------------- 
ibm  9.8 odd 
ibm  7.6 normal 
dell 5.4 odd 
 
name val tc     qty 
-------------------     // what we expect from oj[] output 
ibm  12  odd    9.8 
msft 34  normal 
aapl 56  odd 
ibm  12  normal 7.6 
ibm  12  odd    7.6 
dell     odd    5.4 
 
q)oj[t1;t2;`name] 
name val tc     qty 
-------------------     // but in reality, our oj[] doesn't capture all the permutation 
ibm  12  odd    9.8     // this is the natural result of lj logic, which assumes the key is unique 
msft 34  normal         // solution is ej which allows dupe key records 
aapl 56  odd            // but because ej is essentially ij, you need to account for outer records via uj 
ibm  12  odd    7.6 
dell     odd    5.4 
 
q)foj:{[t1;t2;kCols] ljt:ej[kCols;t1;t2]; rjt:cols[ljt] xcols ej[kCols;t2;t1]; ujt:cols[ljt] xcols 0!uj[kCols xkey t1; kCols xkey t2]; distinct ljt,rjt,ujt} 
 
q)foj[t1;t2;`name] 
name val tc     qty 
-------------------    // beautiful. but beware the result can be a huge table 
ibm  12  odd    9.8 
ibm  12  normal 7.6 
ibm  12  odd    7.6 
msft 34  normal 
aapl 56  odd 
dell     odd    5.4 
 
 
 
### 
###  as-of join "aj" 
### 
 
syntax is  aj[`c1`c2...`cN ; t1 ; t2]

===> matches exactly on columns `c1`c2...`cN-1, then for every cN value in t1, picks the biggest cN in t2 less than or equal to the cN value from t1.
     every row of t1 is kept. rows with no match get nulls in the t2 columns (see the `ge row below).
     (assuming cN is like timestamp, then you collect as-of data from the right-table)
     (NOTE: cN column in t2 must be sorted in ascending order within each c1...cN-1 group)
 e.g. 
 
q)show t:([] ti:10:01:01 10:01:03 10:01:04;sym:`msft`ibm`ge;qty:100 200 150) 
ti       sym  qty 
----------------- 
10:01:01 msft 100 
10:01:03 ibm  200 
10:01:04 ge   150 
 
q)show q:([] ti:10:01:00 10:01:01 10:01:01 10:01:02;sym:`ibm`msft`msft`ibm;px:100 99 101 98) 
ti       sym  px 
----------------- 
10:01:00 ibm  100 
10:01:01 msft 99 
10:01:01 msft 101 
10:01:02 ibm  98 
 
q) aj[`sym`ti; t ; q]      # now see the result, to appreciate the power of "aj"
ti       sym  qty px       # i.e. every row of t1, plus the t2 data prevailing as of that row's cN
--------------------- 
10:01:01 msft 100 101 
10:01:03 ibm  200 98 
10:01:04 ge   150 
 
q) aj0[`sym`ti; t ; q]     # "aj0" if you want cN value from t2 instead 
ti       sym  qty px       # see ti value is from t2, instead of t1 
---------------------      # NOTE: this is not the same as swapping t1,t2 in aj like below 
10:01:01 msft 100 101 
10:01:02 ibm  200 98 
10:01:04 ge   150 
 
q)aj[`sym`ti ; q ; t] 
ti       sym  px  qty 
--------------------- 
10:01:00 ibm  100 
10:01:01 msft 99  100 
10:01:01 msft 101 100 
10:01:02 ibm  98 
 
 
NOTE: recall aj[], for every cN value in t1, picks the biggest cN in t2 less than or equal to the cN value from t1. 
it will be an interesting exercise to create its derivatives. 
e.g. 
ajb[] that only picks the biggest cN in t2 less than (not equal to ) cN value from t1 
aja[] that picks the smallest cN in t2 bigger than cN value from t1 
 
 
### 
###  window-join "wj" 
### 
 
http://code.kx.com/q4m3/9_Queries_q-sql/#999-window-join 
 
"wj" is further generalization of "aj" 
e.g. 
given two tables (trade and quote), you want trade augmented with quotes as of each trade. ("aj" gives you that) 
but you may want adjacent quotes too. then use "wj" 
 
q) show t:([]sym:3#`aapl;time:09:30:01 09:30:04 09:30:08;price:100 103 101) 
sym  time     price 
------------------- 
aapl 09:30:01 100 
aapl 09:30:04 103 
aapl 09:30:08 101 
 
q)show q:([] sym:8#`aapl;  time:09:30:01+(til 5),7 8 9;  ask:101 103 103 104 104 103 102 100;  bid:98 99 102 103 103 100 100 99) 
sym  time     ask bid 
--------------------- 
aapl 09:30:01 101 98 
aapl 09:30:02 103 99 
aapl 09:30:03 103 102 
aapl 09:30:04 104 103 
aapl 09:30:05 104 103 
aapl 09:30:08 103 100 
aapl 09:30:09 102 100 
aapl 09:30:10 100 99 
 
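syntax is   wj[w;c;t;(q;(f0;c0);(f1;c1))]        #  w = a pair of lists: (window start times ; window end times), one window per row of t
                                                 #  c = a list of column names to match on, e.g. `sym`time (the last one is the time column)
                                                 #  t = table 1
                                                 #  q = table 2
                                                 #  f0, f1 = aggr funcs (can be null ::)
                                                 #  c0, c1 = individual column names of q to aggregate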
q)w:-2 1+\:t`time     // a very common way to set up w in wj[] 
q)w                   // basically for every `time, you specify startTime and endTime by subtracting/adding some int 
09:29:59 09:30:02 09:30:06 
09:30:02 09:30:05 09:30:09 
 
q) wj[w; `sym`time ;t ; (q;(::;`ask) ; (::;`bid))] 
sym  time     price ask             bid 
-------------------------------------------------- 
aapl 09:30:01 100   101 103         98 99 
aapl 09:30:04 103   103 103 104 104 99 102 103 103 
aapl 09:30:08 101   104 103 102     103 100 100 
 
q) wj[w; `sym`time ;t ; (q;(max;`ask) ; (min;`bid))] 
sym  time     price ask bid 
--------------------------- 
aapl 09:30:01 100   103 98 
aapl 09:30:04 103   104 99 
aapl 09:30:08 101   104 100 
 
 
NOTE: wj[] considers whatever value is "prevailing" as of startTime, and any other values that happen during [startTime,endTime] 
      wj1[] does not consider the prevailing value, and only strictly considers [startTime,endTime] 
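
a sketch of the difference, with the same w, t and q as above. only the last window (09:30:06 to 09:30:09) should change,
because the quote prevailing at its start (ask 104, bid 103 from 09:30:05) is no longer included:

q) wj1[w; `sym`time ;t ; (q;(max;`ask) ; (min;`bid))]
sym  time     price ask bid
---------------------------
aapl 09:30:01 100   103 98
aapl 09:30:04 103   104 99
aapl 09:30:08 101   103 100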
 
### 
###  parameterized queries   (i.e. stored procedure in SQL) 
### 
 
q)t:([] c1:`a`b`c; c2:10 20 30; c3:1.1 2.2 3.3) 
 
q)select from t where c2 > 15        # suppose you wanna generalize this as a parameterized stored proc
c1 c2 c3 
--------- 
b  20 2.2 
c  30 3.3 
 
q)proc : {[x] select from t where c2 > x}       # just define a function like this 
q)proc 15 
c1 c2 c3 
--------- 
b  20 2.2 
c  30 3.3 
 
q)proc2 : {[nms;sc] select from t where c1 in nms, c2 > sc} 
q)proc2[`a`c; 15] 
c1 c2 c3 
--------- 
c  30 3.3 
 
q)proc3 : {[tbl;nms;sc] select from tbl where c1 in nms, c2 > sc}      # you can pass in a table itself 
q)proc3[t ; `a`c; 15] 
c1 c2 c3 
--------- 
c  30 3.3 
 
### 
###  view   (aka alias for q-sql) 
### 
 
q) t : ([] c1:`a`b`c; c2:10 20 30) 
q) u : select from t where c2>15 
q) v::select from t where c2>15      #  don't put whitespace besides :: 
q) u ~ v 
1b 
 
q)update c2:15 from `t where c1=`b 
`t 
 
q)t 
c1 c2 
----- 
a  10 
b  15 
c  30 
 
q) u 
c1 c2 
----- 
b  20 
c  30 
 
q) v      # view (alias) picks up the latest value 
c1 c2 
----- 
c  30 
 
q)view `v                       # to see alias definition 
"select from t where c2>15" 
 
q)views `.                      # to see alias list 
,`v 
 
 
### 
###  functional forms 
### 
 
http://code.kx.com/q4m3/9_Queries_q-sql/#912-functional-forms 
 
- very hard, very cryptic 
- useful when you wanna parameterize column names of table queries. 
 
# syntax  - there are two 
 
?[t;c;b;a]     # select and exec 
![t;c;b;a]     # update and delete 
 
 
# note parse[] lets you generate functional form. 
 
## 
##   functional select 
## 
 
t : a table (or table name) 
c : a list of constraints, i.e. "where" clause 
b : a dict of groupbys or a flag controlling other aspects of the query 
a : a dict of aggregates 
 
NOTE:  c,b,a parameters can involve the columns of t and also any variables that are in scope. 
       in such case, 
       - columns are always represented by their symbolic names 
       - (as a consequence) any literal symbols (or lists of symbols) used in c,b,a must be distinguished. how? - by enlisting them. (seriously, AND this is very important to remember, as we will see below) 
 
 
e.g.     // let's start easy 
 
q)t:([] c1:`a`b`a`c`a`b`c; c2:10*1+til 7; c3:1.1*1+til 7)       // borrowing sample tables from q4m3 
q)ft:{([] c1:`a`b`a`c`a`b`c; c2:10*1+til 7; c3:1.1*1+til 7)}    // a func that returns t 
 
q)t 
c1 c2 c3 
--------- 
a  10 1.1 
b  20 2.2 
a  30 3.3 
c  40 4.4 
a  50 5.5 
b  60 6.6 
c  70 7.7 
                                      // here is the simplest functional form 
q)?[t; (); 0b; ()]~select from t      // no constraints, no group by, no aggregates 
1b 
 
q)?[`t; (); 0b; ()] ~ ?[t; (); 0b; ()]     // notice the input table can be by-value or by-name 
1b 
 
q)?[ft[]; (); 0b; ()] ~ ?[t; (); 0b; ()]    // notice the input table can be supplied by other func like this 
1b 
 
q)select c1,c2 from t           // lets write this in functional form below 
c1 c2 
----- 
a  10 
b  20 
a  30 
c  40 
a  50 
b  60 
c  70 
 
q)?[t; (); 0b; `foo`bar!`c1`c2]       //  equivalent to the above select (except column name renamed for illustration) 
foo bar                               //  no constraints, no groupby, just specifying columns in aggregate param 
-------                               //  notice column names are specified by-symbol 
a   10                                //  notice, in functional form, you must specify the output column names (again by-symbol) 
b   20                                //  recall aggregate param must be in dict. (more example below) 
a   30 
c   40 
a   50 
b   60 
c   70 
 
q)?[t; (); 0b; `c1`c2!`c1`c2]        //  see how it's equivalent to the above select query 
c1 c2                                //  why do you need functional form in this case ? 
-----                                //  well you don't in this case, but in reality, column name list is given as a parameter 
a  10                                //  then you have no choice but to use functional form
b  20 
a  30 
c  40 
a  50 
b  60 
c  70 
 
q)?[t; (); 0b; a!a:`c1`c2]       //  again, this is equivalent to the above 
c1 c2                            //  notice a clever intra-line assignment 
-----                            //  when your parameterized column list is long, this is a common neat technique 
a  10 
b  20 
a  30 
c  40 
a  50 
b  60 
c  70 
 
 
 
q)select from t where c2>35,c1 in `b`c      // let's look at how constraints are specified in functional form 
c1 c2 c3                                    // it is just a condition you specify in "where" phrase 
--------- 
c  40 4.4 
b  60 6.6 
c  70 7.7 
 
q)?[t; ((>;`c2;35); (in;`c1;enlist `b`c)); 0b; ()]     // this is equivalent to the above select query 
c1 c2 c3                                               // notice the syntax for constraint param. it is a list of "parse trees" 
---------                                              // a parse tree is simply prefix form of a function with arguments in a list 
c  40 4.4                                              // also notice column names are by-symbol, and literal symbols are enlisted! 
b  60 6.6 
c  70 7.7 
 
===> recall parse[] pretty much gives you the functional form. 
 
q)parse "select from t where c2>35,c1 in `b`c"
?
`t 
,((>;`c2;35);(in;`c1;,`b`c)) 
0b 
() 
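
note: a parse tree can be run with eval[], which is a handy way to check what parse[] gave you. a sketch, re-using the sample table:

q)t:([] c1:`a`b`a`c`a`b`c; c2:10*1+til 7; c3:1.1*1+til 7)
q)(eval parse "select from t where c2>35,c1 in `b`c") ~ select from t where c2>35,c1 in `b`c
1b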
 
 
 
q)select max c2, c2 wavg c3 from t      // let's look at how aggregates are specified in functional form 
c2 c3                                   // NOTE: general form select template automatically chose output column names `c2`c3 for you. 
------                                  //       but in functional form, output column names MUST be explicitly specified. 
70 5.5 
 
q)?[t; (); 0b; `maxc2`wtavg!((max;`c2); (wavg;`c2;`c3))]    // equivalent to the above select query 
maxc2 wtavg                                                 // note the syntax, a dict of aggregates 
-----------                                                 // notice how you must explicitly specify output column names
70    5.5 
 
 
q)select by c1 from t         // let's look at how group by params are specified in functional form 
c1| c2 c3 
--| ------ 
a | 50 5.5                   // notice/recall, groupby simply gives you the last row, if you don't apply any aggr func. 
b | 60 6.6 
c | 70 7.7 
 
q)?[t; (); (enlist `c1)!enlist `c1; ()]     // equivalent to the above select query 
c1| c2 c3                                   // here you just specify the column name by symbol `c1 
--| ------                                  // NOTE: it's confusing but here enlist is because both key and value must be enlisted in a singleton dict. 
a | 50 5.5                                  // this is separate from enlisting literal symbols in functional form. 
b | 60 6.6 
c | 70 7.7 
 
===> note: so far we looked at c,b,a params independently, but they are usually used together (especially aggr and groupby) 
 
e.g. 
 
q)select max c2, c2 wavg c3 by c1 from t where c2>35,c1 in `b`c 
c1| c2 c3 
--| ------ 
b | 60 6.6 
c | 70 6.5        // yes, wavg worked nicely 
 
q)c:((>;`c2;35); (in;`c1;enlist `b`c))            // now piecing everything together 
q)b:(enlist `c1)!enlist `c1                       // here is an equivalent functional form to the above select query 
q)a:`maxc2`wtavg!((max;`c2); (wavg;`c2;`c3)) 
q)?[t;c;b;a] 
c1| maxc2 wtavg 
--| ----------- 
b | 60    6.6 
c | 70    6.5 
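
e.g.  a sketch of why this matters: column names can now be passed in as params (selgt is a made-up helper name)

q)t:([] c1:`a`b`a`c`a`b`c; c2:10*1+til 7; c3:1.1*1+til 7)
q)selgt:{[tbl;col;lim] ?[tbl; enlist (>;col;lim); 0b; ()]}    // select from tbl where col > lim
q)selgt[t;`c2;35] ~ select from t where c2>35
1b
q)selgt[t;`c3;3.0] ~ select from t where c3>3.0
1b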
 
 
NOTE: to enable "distinct", set "by" param 1b     // yes, you just have to memorize this syntax 
e.g. 
q)t:([] c1:`a`b`a`c`b`c; c2:1 1 1 2 2 2; c3:10 20 30 40 50 60) 
q)select distinct c1,c2 from t 
c1 c2 
----- 
a  1 
b  1 
c  2 
b  2 
 
q)?[t; (); 1b; `c1`c2!`c1`c2]         //  recall the syntax  ?[t;c;b;a]  and how b:1b 
c1 c2 
----- 
a  1 
b  1 
c  2 
b  2 
 
 
NOTE:  to enable select[n;>c1]  you specify them in 5th, 6th params. 
e.g. 
q)select[2] from t      // these two are equivalent 
q)?[t;();0b;();2] 
 
q)select[>c1] c1,c2 from t                   // these two are equivalent 
q)?[t;();0b;`c1`c2!`c1`c2; 0W; (>:;`c1)]     // infinity, and the 6th param is in k form. duh. 
 
 
NOTE: beware when selecting (for some reason) the same column multiple times in functional select 
e.g. 
 
q)show t:([] name:`ibm`msft`aapl; vol:123 456 789; price:1.1 2.2 3.3) 
name vol price 
-------------- 
ibm  123 1.1 
msft 456 2.2 
aapl 789 3.3 
 
q)select name,name,name from t 
name name1 name2 
----------------   // recall how qsql conveniently adds index number for you 
ibm  ibm   ibm     // when selecting the same col multiple times 
msft msft  msft 
aapl aapl  aapl 
 
q)?[t;();0b;a!a:`name`name`name] 
name name name 
--------------     // but NOT in functional select 
ibm  ibm  ibm      // be very careful because if this was some numeric column 
msft msft msft     // and you do some computation with other columns, then it can mess up 
aapl aapl aapl 
 
q)parse "select name,name,name from t"    // interestingly, parse tree func is smart
?
`t 
() 
0b 
`name`name1`name2!`name`name`name 
 
 
##
##  functional exec
##
 
?[t;c;b;a] 
 
one tricky thing about exec is, as you recall, if you select one column, you get a list, otherwise you get a dict. 
this is distinguished in the way we specify aggr param. 
 
c: same syntax as functional select.
b: almost same syntax as select, but if empty, use () instead of 0b 
   also you can simplify this to a symbol list if you are only specifying one column to group by. 
   e.g. parse["exec by foo from t"] ---> ?[t; (); enlist `foo; ()] 
   in any case, you can just do parse[] to find out anyway. 
a: parse["exec foo from t"]      -->  ?[t; (); (); `foo]                 // notice you don't even do enlist`foo here. just `foo
   parse["exec foo,bar from t"]  -->  ?[t; (); (); `foo`bar!`foo`bar] 
 
 
q)t:([] c1:`a`b`c`a; c2:10 20 30 40; c3:1.1 2.2 3.3 4.4) 
 
q)exec distinct c1 from t             // these two are equivalent 
`a`b`c 
q)?[t; (); (); (distinct; `c1)]       // notice how "by" param is an empty list, and also how aggr param is specified 
`a`b`c 
 
q)?[t; (); (); `c1]        // notice how aggr param is just `c1, not even enlisted 
`a`b`c`a                   // this is how you get a list returned 
 
===> if no grouping, like above, then notice we use an empty list () for "by" param. recall in functional select, we used 0b or 1b 
 
 
q)exec c1:distinct c1 from t                             //  these two are equivalent 
c1| a b c 
q)?[t; (); (); (enlist `c1)!enlist (distinct; `c1)]      //  recall a singleton dict requires elem enlisted 
c1| a b c 
 
q)exec distinct c1, c2 from t                       // these two are equivalent 
q)?[t; (); (); `c1`c2!((distinct; `c1); `c2)] 
 
 
NOTE: to group on a column, specify a column by its symbol name in "by" param 
e.g. 
q)exec c2 by c1 from t        // these two are equivalent 
a| 10 40 
b| ,20 
c| ,30 
q)?[t; (); `c1; `c2]          // see how simple it is 
a| 10 40 
b| ,20 
c| ,30 
 
q)exec max c2 by c1 from t     // these two are equivalent 
a| 40                          // a more realistic example 
b| 20 
c| 30 
q)?[t; (); `c1; (max;`c2)] 
a| 40 
b| 20 
c| 30 
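
e.g.  a sketch with a parameterized column name (colvals is a made-up helper name)

q)t:([] c1:`a`b`c`a; c2:10 20 30 40; c3:1.1 2.2 3.3 4.4)
q)colvals:{[tbl;col] ?[tbl; (); (); col]}           // exec col from tbl
q)colvals[t;`c2]
10 20 30 40
q)(exec c1,c2 from t) ~ ?[t; (); (); a!a:`c1`c2]    // two or more columns ==> a dict
1b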
 
 
## 
##  functional update 
## 
 
essentially the same as select, just swap "?" with "!"
 
?[t;c;b;a]       // select, exec 
![t;c;b;a]       // update, delete 
 
==> the way we distinguish btwn 'update' and 'delete' is 
    - update:  'a' is a dict 
    - delete:  'a' is a symbol list 
 
q)t:([] c1:`a`b`c`a`b; c2:10 20 30 40 50) 
 
// an example 
 
q)update c2:100 from t where c1=`a 
c1 c2 
------ 
a  100 
b  20 
c  30 
a  100 
b  50 
 
q)c:enlist (=;`c1;enlist `a)     // recall constraint param is a list of "parse trees" 
q)b:0b                           // pay attention to the use of enlist 
q)a:(enlist `c2)!enlist 100      // recall aggr param is a dict 
q)![t;c;b;a]                     // equivalent to the above update cmd 
c1 c2 
------ 
a  100 
b  20 
c  30 
a  100 
b  50 
 
 
// more example 
 
q)update c2:sum c2 by c1 from t     // let's study how to configure "by" param in functional update 
c1 c2 
----- 
a  50 
b  70 
c  30 
a  50 
b  70 
 
q)c:() 
q)b:(enlist `c1)!enlist `c1          // recall "by" param is a dict of groupbys. NOTE you must match the column name here 
q)a:(enlist `c2)!enlist(sum; `c2)    // notice for aggr param, you can change the column name. 
q)![`t; c; b; a]                     // just like the regular form you can specify the table by name, and modify in-place 
`t 
q)t 
c1 c2 
----- 
a  50 
b  70 
c  30 
a  50 
b  70 
 
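q)t:([] c1:`a`b`c`a`b; c2:10 20 30 40 50)       // reset t, because the previous call modified it in-place
q)a:(enlist `c2sum)!enlist(sum; `c2)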
q)![t; c; b; a] 
c1 c2 c2sum 
----------- 
a  10 50 
b  20 70 
c  30 30 
a  40 50 
b  50 70 
 
 
NOTE: recall how column names are referenced by their symbol names, AND as a consequence, we had to refer to symbol literals differently - by enlisting them ? 
 
q)parse "update weather:`rain from t where rain = `true"
!
`t 
,,(=;`rain;,`true)          // notice `true is enlisted ! 
0b 
(,`weather)!,,`rain         // notice `rain is enlisted doubly ! 
 
q)![t; enlist(=;`rain;enlist `true);0b;enlist[`weather]!enlist enlist[`rain]]     // see how symbol literals are enlisted 
                                                                                  // recall the functional form syntax 
                                                                                  // column names are in symbols 
                                                                                  // literal symbols are enlisted (duh) 
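
e.g.  a sketch of a parameterized update (setwhere is a made-up helper name; val is assumed to be a non-symbol atom, otherwise it needs enlisting too)

q)t:([] c1:`a`b`c`a`b; c2:10 20 30 40 50)
q)setwhere:{[tbl;kcol;kval;col;val] ![tbl; enlist (=;kcol;enlist kval); 0b; (enlist col)!enlist val]}
q)setwhere[t;`c1;`a;`c2;100] ~ update c2:100 from t where c1=`a
1b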
 
## 
##  functional delete 
## 
 
it is a simplified form of functional update. 
 
![t;c;0b;a]      // "by" param is always 0b  for functional delete  (same for non-functional delete. i.e. you cannot group by for delete) 
                 // ONLY one of "c" or "a" can/must be present. 
c : the usual list of parse trees 
a : symbol list of column names. NOTE: to make it empty, you must specify `$(), instead of just () 
 
q)t:([] c1:`a`b`c`a`b; c2:10 20 30 40 50) 
 
q)delete from t where c1=`b                        // these two are equivalent 
q)![t;enlist (=;`c1;enlist `b); 0b; `symbol$()]    // notice only constraint param is present, you delete rows 
 
q)delete c2 from t              // these two are equivalent 
q)![t;();0b;enlist `c2]         // notice only aggr param is present, you delete columns 
 
// note you can modify t in-place by specifying the table by name 
 
q)![`t;();0b;enlist `c2] 
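
e.g.  a sketch of a parameterized column delete (dropcols is a made-up helper name)

q)t:([] c1:`a`b`c`a`b; c2:10 20 30 40 50)
q)dropcols:{[tbl;cs] ![tbl; (); 0b; (),cs]}      // (),cs turns a single symbol into a list
q)dropcols[t;`c2] ~ delete c2 from t
1b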
 
 
 
 
### 
###  q-sql Examples : trades table 
### 
 
$ cat mktrades.q 
 
mktrades:{[tickers; sz]    # indentation (with at least one whitespace) is crucial if you are spanning a single statement across multiple lines 
  dt:2015.01.01+sz?31;     # date range January of 2015 
  tm:sz?24:00:00.000;      # time range 00:00:00.000 and 23:59:59.999 
  sym:sz?tickers; 
  qty:10*1+sz?1000;        # notice how semicolon is necessary (obvious if you write this definition in a single line) 
  px:90.0+(sz?2001)%100; 
  t:([] dt; tm; sym; qty; px); 
  t:`dt`tm xasc t; 
  t:update px:6*px from t where sym=`goog; 
  t:update px:2*px from t where sym=`ibm; 
  t} 
 
trades:mktrades[`aapl`goog`ibm; 10000000] 
 
q)\l /Users/kenics/q/mktrades.q      # must be strictly a SINGLE whitespace after \l 
 
q)select[10] from trades 
dt         tm           sym  qty  px 
---------------------------------------- 
2015.01.01 00:00:00.623 goog 3970 566.52 
2015.01.01 00:00:00.692 goog 8720 650.76 
2015.01.01 00:00:00.715 ibm  2510 186.98 
2015.01.01 00:00:01.916 goog 6710 575.1 
2015.01.01 00:00:02.142 goog 4930 557.22 
2015.01.01 00:00:02.265 ibm  2910 204.66 
2015.01.01 00:00:02.574 aapl 7620 94.89 
2015.01.01 00:00:02.690 goog 3830 573.3 
2015.01.01 00:00:02.786 aapl 5640 91.17 
2015.01.01 00:00:02.992 goog 7690 600 
 
 
## 
## instrument table 
## 
 
q)instr:([sym:`symbol$()] name:`symbol$(); industry:`symbol$()) 
q)`instr upsert (`ibm; `$"International Business Machines"; `$"Computer Services") 
`instr 
q)`instr upsert (`msft; `$"Microsoft"; `$"Software") 
`instr 
q)`instr upsert (`goog; `$"Google"; `$"Search") 
`instr 
q)`instr upsert (`aapl; `$"Apple"; `$"Electronics") 
`instr 
 
q)instr 
sym | name                            industry 
----| ------------------------------------------------- 
ibm | International Business Machines Computer Services 
msft| Microsoft                       Software 
goog| Google                          Search 
aapl| Apple                           Electronics 
 
 
===> let's make `sym a foreign key from trades 
 
q)update `instr$sym from `trades 
`trades 
q)meta trades 
c  | t f     a 
---| --------- 
dt | d       s 
tm | t 
sym| s instr 
qty| j 
px | f 
 
 
## 
##  basic queries 
## 
 
q) count trades 
10000000 
 
q)exec count i from trades 
10000000 
 
q)select count i from trades
x
-------- 
10000000 
 
q)select count i by sym from trades 
sym | x 
----| ------- 
aapl| 3333498 
goog| 3330409 
ibm | 3336093 
 
q)() xkey select count i by sym from trades      # you can unkey if you like
sym  x
------------
aapl 3333498
goog 3330409
ibm  3336093
 
q)select[10] from trades where dt=2015.01.15,sym=`aapl 
dt         tm           sym  qty  px 
---------------------------------------- 
2015.01.15 00:00:00.387 aapl 9890 103.29 
2015.01.15 00:00:01.362 aapl 4350 97.37 
2015.01.15 00:00:01.379 aapl 30   91.12 
2015.01.15 00:00:02.428 aapl 5280 99.44 
2015.01.15 00:00:03.249 aapl 5280 104.19 
2015.01.15 00:00:04.608 aapl 2700 100.29 
2015.01.15 00:00:05.367 aapl 6750 105.07 
2015.01.15 00:00:06.174 aapl 9550 105.6 
2015.01.15 00:00:06.894 aapl 3750 97.4 
2015.01.15 00:00:07.479 aapl 5430 108.41 
 
q)select[10] from trades where sym=`goog, tm within 12:00:00 13:00:00 
dt         tm           sym  qty  px 
---------------------------------------- 
2015.01.01 12:00:00.316 goog 4320 597.66 
2015.01.01 12:00:00.374 goog 7370 552.66 
2015.01.01 12:00:03.617 goog 1150 640.02 
2015.01.01 12:00:04.193 goog 4190 622.8 
2015.01.01 12:00:06.546 goog 290  655.2 
2015.01.01 12:00:07.248 goog 6650 553.62 
2015.01.01 12:00:08.245 goog 9780 557.34 
2015.01.01 12:00:09.378 goog 6900 571.74 
2015.01.01 12:00:09.515 goog 3920 554.82 
2015.01.01 12:00:09.600 goog 5740 608.22 
 
q)noon: 12:00:00                        # recall if you use variables for within args, you must specify a general list 
q)thirteen: 13:00:00 
q)select[10] from trades where sym=`goog, tm within (noon;thirteen)     # equivalent to the above query

 
 
q)select maxpx:max px by dt from trades where sym=`aapl 
dt        | maxpx 
----------| -----     # max px for `aapl daily 
2015.01.01| 110 
2015.01.02| 110 
2015.01.03| 110 
2015.01.04| 110 
.. 
 
 
q)select lo:min px, hi:max px by sym.name from trades 
name                           | lo  hi 
-------------------------------| ------- 
Apple                          | 90  110 
Google                         | 540 660 
International Business Machines| 180 220 
 
q)select totq:sum qty, avgq:avg qty by sym from trades where sym in `ibm`goog 
sym | totq        avgq 
----| -------------------- 
goog| 16670706340 5005.603 
ibm | 16702382420 5006.57 
 
q)select hi:max px,lo:min px,open:first px, close:last px by dt,tm.minute from trades where sym=`goog 
dt         minute| hi     lo     open   close 
-----------------| --------------------------      # finding minute interval hi,lo,op,close for `goog 
2015.01.01 00:00 | 658.98 540.72 610.98 561.66 
2015.01.01 00:01 | 659.4  542.82 594.24 595.86 
2015.01.01 00:02 | 652.5  540    625.26 583.5 
.. 
 
 
## 
##  more query examples 
## 
 
q)select vwap:qty wavg px by dt from trades where sym=`ibm 
dt        | vwap 
----------| --------     # daily vwap for `ibm 
2015.01.01| 200.064 
2015.01.02| 200.0207 
2015.01.03| 200.0109 
.. 
 
q)select vwap:qty wavg px by dt,100 xbar tm from trades where sym=`ibm 
dt         tm          | vwap 
-----------------------| ------     # 100-millisecond interval vwap 
2015.01.01 00:00:00.700| 194.8 
2015.01.01 00:00:01.300| 200.96 
2015.01.01 00:00:03.900| 215.34 
.. 
 
 
q)select from trades where px=(max;px) fby sym 
dt         tm           sym  qty  px 
-------------------------------------    # getting all columns with max px per sym 
2015.01.01 00:20:05.835 goog 9750 660    # you see `goog multiple times because there are many records with the same max px value 
2015.01.01 00:33:19.125 goog 3150 660 
2015.01.01 00:42:13.379 goog 8790 660 
2015.01.01 00:42:58.623 aapl 6090 110 
.. 
 
q)show atrades:select avgqty:avg qty, avgpx:avg px by sym, dt from trades 
sym  dt        | avgqty   avgpx 
---------------| ----------------- 
aapl 2015.01.01| 4997.978 99.99409 
aapl 2015.01.02| 5006.318 100.0012 
aapl 2015.01.03| 5002.49  100.0019 
aapl 2015.01.04| 5012.752 99.97018 
.. 
 
q)deltas0:{first[x] -': x}       # finding the dates when the avgpx went up 
q)select dt, avgpx by sym from atrades where 0<deltas0 avgpx 
sym | dt 
----| ------------------------------------------------------------------ 
aapl| 2015.01.02 2015.01.03 2015.01.05 2015.01.07 2015.01.08 2015.01.10.. 
goog| 2015.01.01 2015.01.02 2015.01.03 2015.01.05 2015.01.07 2015.01.00.. 
ibm | 2015.01.05 2015.01.07 2015.01.08 2015.01.10 2015.01.11 2015.01.13.. 
 
q)select 2#dt, 2#avgpx by sym from atrades where 0<deltas0 avgpx     # only displaying 2 per column
sym | dt                    avgpx 
----| --------------------------------------- 
aapl| 2015.01.02 2015.01.03 100.0012 100.0019 
goog| 2015.01.01 2015.01.02 599.8873 600.0021 
ibm | 2015.01.05 2015.01.07 200.0634 200.0022 
 
 
q)select 2#dt,2#tm,2#qty,2#px by sym from trades 
sym | dt                    tm                        qty       px 
----| ----------------------------------------------------------------------- 
aapl| 2015.01.01 2015.01.01 00:00:00.602 00:00:00.840 540 1260  94.63 92.87 
goog| 2015.01.01 2015.01.01 00:00:00.448 00:00:01.039 6940 7260 540.18 560.04 
ibm | 2015.01.01 2015.01.01 00:00:00.754 00:00:01.377 3100 5150 194.8 200.96 
 
q) dntrades : select dt,tm,qty,px by sym from trades 
q) select sym,cnt:count each dt,avgpx:avg each px from dntrades 
sym  cnt     avgpx 
--------------------- 
aapl 3333498 99.99694 
goog 3330409 600.02 
ibm  3336093 200.0041 
 
q)select sym,vwap:qty wavg' px from dntrades 
sym  vwap 
------------- 
aapl 99.99915 
goog 600.0493 
ibm  200.0061 
 
 
note: here is a cool example. we calculate the max potential profit we could've made for each symbol. 
      we use "mins" function. 
 
https://code.kx.com/wiki/Reference/mins 
e.g. 
 
q)mins 2 5 7 1 3 
2 2 2 1 1 
 
q)mins "genie" 
"geeee" 
 
q)select max px-mins px by sym from trades 
sym | px 
----| --- 
aapl| 20 
goog| 120 
ibm | 40 
 
==> similarly, you can get the max possible loss you could've made per symbol. 
 
q)select min px-maxs px by sym from trades 
sym | px 
----| ---- 
aapl| -20 
goog| -120 
ibm | -40 
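
e.g.  a tiny worked example of the same arithmetic (made-up prices)

q)px:5 3 8 2 9
q)mins px              // lowest price seen so far
5 3 3 2 2
q)px-mins px           // gain at each point, relative to the lowest price so far
0 0 5 0 7
q)max px-mins px
7
q)min px-maxs px       // worst drop from the highest price so far
-6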
 
 
### 
###  pivot table 
### 
 
http://code.kx.com/q4m3/9_Queries_q-sql/#9135-excursion-pivot-table 
 
q)show u:`$exec string asc distinct sym from trades 
`aapl`goog`ibm 
 
q)3 # trades 
dt         tm           sym  qty  px 
---------------------------------------- 
2015.01.01 00:00:01.129 aapl 6410 99.18 
2015.01.01 00:00:02.099 aapl 5200 93.82 
2015.01.01 00:00:02.670 ibm  6840 196.92 
... 
 
q)exec u#(`$string sym)!qty by dt.week from trades 
          | aapl goog ibm 
----------| --------------     // essentially, used the step function dict, which gives the first match (first qty) 
2014.12.29| 6410 7400 6840     // then exec promotes the resulting dict (of each row) to a table 
2015.01.05| 3420 4410 2350     // but what if we wanted to, for example, count how many records of qty, instead of first[] value ? 
2015.01.12| 5250 3380 3980     // or any other aggregation function for that matter ? 
2015.01.19| 9050 5400 2390     // it's trivial to set it up like below 
2015.01.26| 4420 70   7160 
 
q)show t:select N:count qty by dt.week,sym from trades
week       sym | N 
---------------| ------ 
2014.12.29 aapl| 430480 
2014.12.29 goog| 430869 
2014.12.29 ibm | 429009 
2015.01.05 aapl| 752402 
2015.01.05 goog| 752805 
2015.01.05 ibm | 750531 
2015.01.12 aapl| 753484 
2015.01.12 goog| 752564 
2015.01.12 ibm | 752854 
2015.01.19 aapl| 750792 
2015.01.19 goog| 754248 
2015.01.19 ibm | 752062 
2015.01.26 aapl| 646350 
2015.01.26 goog| 646094 
2015.01.26 ibm | 645456 
 
q)exec u#(`$string sym)!N by week from t 
          | aapl   goog   ibm 
----------| -------------------- 
2014.12.29| 430480 430869 429009 
2015.01.05| 752402 752805 750531 
2015.01.12| 753484 752564 752854 
2015.01.19| 750792 754248 752062 
2015.01.26| 646350 646094 645456 
 
===> lets put it all together into a generalized function 
 
//  t = input table data 
// p1 = pivot col 1 
// p2 = pivot col 2 
//  c = col to aggr 
// fn = aggr func 
 
pivot:{[t;p1;p2;c;fn] 
 c:(p1;p2;c); 
 u:`$string asc ?[t;();();(distinct;c 1)];                   // exec distinct sym from t 
 t:?[t;();b!b:c 0 1;enlist[`N]!enlist(fn;c 2)];              // select N:count qty by dt.week,sym from t 
 ?[t;();c 0;(#;enlist u;(!;($;enlist[`];(string;c 1));`N))]  // exec u#(`$string sym)!N by week from t 
 } 
 
e.g. 
 
q)pivot[trades;`dt;`sym;`qty;avg] 
          | aapl     goog     ibm 
----------| -------------------------- 
2015.01.01| 4989.021 4997.958 5001.316 
2015.01.02| 5004.019 5009.737 5008.435 
2015.01.03| 5011.448 5006.199 5010.306 
2015.01.04| 5002.787 5005.804 5011.205 
2015.01.05| 5000.981 5000.529 5001.409 
2015.01.06| 5004.591 5001.383 5016.646 
2015.01.07| 4995.854 5022.027 4998.582 
2015.01.08| 5015.4   5011.719 5016.685 
2015.01.09| 5001.951 5000.2   4993.349 
2015.01.10| 5006.982 5009.731 5008.115 
... 
 
q)pivot[trades;`dt;`sym;`qty;count] 
          | aapl   goog   ibm 
----------| -------------------- 
2015.01.01| 107566 107195 107608 
2015.01.02| 107735 107822 106510 
2015.01.03| 107237 107839 107494 
2015.01.04| 107942 108013 107397 
2015.01.05| 107470 107646 107271 
2015.01.06| 107288 107575 107317 
2015.01.07| 107706 107436 107546 
2015.01.08| 107356 107629 107172 
2015.01.09| 107116 107494 107267 
2015.01.10| 107637 107530 106665 
... 
 
 
################################### 
####     Execution Control     #### 
################################### 
 
### 
###  conditional eval  $[] function 
### 
 
$[ pred_cond; expr_true; expr_false]    # eval pred_cond, 
                                        # execute expr_true if true 
 e.g.                                   # execute expr_false if false 
 
q) $[1b; 42; 9*6] 
42 
 
q) $[0b; 42; 9*6]        # NOTE: obviously nice to make return data type consistent between expr_true and expr_false 
54 
 
note: $[] conditional doesn't create any local lexical scope. 
 e.g. 
 
q)a 
'a 
q)$[1b; a:42; a:999] 
42 
q)a 
42 
 
q)v:0N 
q)$[null v; `isnull; `notnull] 
`isnull 
 
NOTE: if you want multiple operations executed for expr_{true|false}, then you enclose it with [] 
      syntax is  $[pred_cond; [expr_t1; expr_t2; ,,,; expr_tN] ; [expr_f1; expr_f2; ,,,; expr_fN]] 
 e.g. 
 
q)v:42 
q) $[v=42; [a:6; b:7; `everything]; [a:`Life; b:`the; c:`Universe; a,b,c]]    # notice only the very last clause's output is returned 
`everything 
q) $[v=43; [a:6; b:7; `everything]; [a:`Life; b:`the; c:`Universe; a,b,c]] 
`Life`the`Universe 
 
 
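NOTE:  $[cond1 & cond2; expr_t1; expr_f1]  is not  lazy evaluation (aka call-by-need)
       & is an ordinary operator, so both cond1 and cond2 are always evaluated (right to left, i.e. cond2 first)
       so don't write your code assuming cond2 doesn't get evaluated when cond1 is 0b,
       otherwise an error in cond2 can kill the entire script. nest $[] if you need short-circuit, i.e. $[cond1; $[cond2; expr_t1; expr_f1]; expr_f1]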
 
(ref) https://en.wikipedia.org/wiki/Lazy_evaluation 
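
e.g.  a sketch showing that both sides of & are evaluated, and how nesting $[] avoids it (n and bump are made-up names)

q)n:0
q)bump:{n::n+1; x}                    // counts how many times it gets called
q)$[0b & bump 1b; `yes; `no]
`no
q)n                                   // bump was called even though the other operand is 0b
1
q)n:0
q)$[0b; $[bump 1b; `yes; `no]; `no]   // nested $[] : the inner condition is evaluated only if the outer one is true
`no
q)n
0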
 
 
## 
##  if else-if else conditional 
## 
 
syntax is   $[pred_cond_1 ; expr_t1; pred_cond_2; expr_t2; expr_f]     # you can infinitely go down. 
 
 e.g. 
 
q)a:0 
q)$[a=0;`zero; a>0;`pos; `neg] 
`zero 
 
q)a:42 
q)$[a=0;`zero; a>0;`pos; `neg] 
`pos 
 
q)a:-42 
q)$[a=0;`zero; a>0;`pos; `neg] 
`neg 
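
e.g.  the same thing wrapped in a function (classify is a made-up name). $[] wants an atom condition, so use each on a list

q)classify:{$[x=0;`zero; x>0;`pos; `neg]}
q)classify each -42 0 42
`neg`zero`pos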
 
 
### 
###  vector conditional eval  ?[x;y;z] function       // distinguish this from rand/deal/roll operator ?[x;y] 
###                                                   // also distinguish this from functional select/exec ?[t;c;b;a] 
 
syntax is   ?[v; expr_true; expr_false] 
 
e.g. 
 
q)?[1100b;"abcd";"ABCD"] 
"abCD" 
q)?[1100b;"a";"ABCD"] 
"aaCD" 
q)?[1100b;"abcd";"X"] 
"abXX" 
q)?[1b;"abcd";"X"] 
"abcd" 
q)?[0b;"abcd";"X"] 
"X" 
q)?[0b;"abcd";"ABCD"] 
"ABCD" 
 
// more examples 
 
q)L: til 10 
q)L mod 3 
0 1 2 0 1 2 0 1 2 0 
 
q)`boolean$ L mod 3 
0110110110b 
 
q) ?[`boolean$L mod 3; L; -999] 
-999 1 2 -999 4 5 -999 7 8 -999 
 
===> this can be very powerful. 
 
q)t:([] c1:1.1 2.2 3.3; c2:10 20 30; c3:100 200 300) 
q)update mix:?[c1>2.0; c3; c2] from t 
c1  c2 c3  mix 
-------------- 
1.1 10 100 10 
2.2 20 200 200 
3.3 30 300 300 
 
note: you can easily get complex like below. 
 
q)update band:?[c2 within 5 15; 1; ?[c2 within 16 25; 2; 3]] from t 
c1  c2 c3  band 
--------------- 
1.1 10 100 1 
2.2 20 200 2 
3.3 30 300 3 
 
 
### 
###  if[] statement    (not a function, thus returns nothing) 
### 
 
syntax is  if[pred_cond; expr_t1; expr_t2; ,,, expr_tN] 
 
q)a:42 
q)b:98.6 
q)if[a=42;x:6;y:7;b:a*b]     # NOTE: it's not a function, thus doesn't return anything 
q)x
6
q)y
7
q)b 
4141.2 
 
note: a legitimate use of if[] is to check error. 
 
e.g. 
 
.. 
.. 
if[a >= 0; '`$"error value of a must be negative"];      // yes usually a symbol message after a single quote 
if[a >= 0; 'aHasPositiveValue];                          // or you also can do a bare err msg like this 
.. 
.. 
 
 
// another use case may be when processing data (daily data for example) 
// and if today's data is empty, then you wanna return an empty table with the proper schema, instead of just () 
// because downstream application may expect the usual schema and attempt to conduct some join etc 
// e.g. 
 
et:([] name:`symbol$(); price:`float$(); volume:`int$());   // empty table schema
if[count[todayData]=0; :et];                                // here you return et, and exit the function (very common technique) 
... 
... 
 
if your function doesn't return anything by design, then you will need to return (::)  e.g.  if[count[t]=0; :(::)];
 
NOTE:  just like $[],  if[] is also not lazy evaluation. so beware. 
       e.g. 
       if[cond1 & cond2; foo:123]      // cond2 still gets evaluated even if cond1 = 0b 
 
### 
###  do[] statement   (not a func, thus returns nothing) 
### 
 
syntax is do[a_non_negative_int_x;  expr_1; expr_2; ,,, expr_N] 
 
it repeats expr_1~N for x times. 
 
the only legitimate use is to measure the elapsed time of a quick operation.
 
e.g. 
 
q)\t do[100; v*v:til 1000000] 
677 
 
q)\t:100 v*v:til 1000000       # but \t:100 can facilitate that anyway. so do[] is probably never necessary. 
681 
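
e.g.  for completeness, do[] used as a plain loop, and the iterator form that usually replaces it (r is a made-up name)

q)r:1
q)do[5; r*:2]        // double r five times
q)r
32
q)5 {x*2}/ 1         // same thing without do[] : apply the function 5 times, starting from 1
32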
 
 
### 
###  while[] statement 
### 
 
syntax is  while[pred_cond; expr_1; expr_2; ,,, ; expr_N] 
 
NOTE: probably never necessary 
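
e.g.  a minimal sketch, plus the iterator form that usually replaces it (i is a made-up name)

q)i:0
q)while[i<3; i+:1]            // keeps going as long as the condition is true
q)i
3
q){x<1000} {x*2}/ 1           // iterator form: keep doubling while the value is below 1000
1024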
 
### 
###  return signals 
### 
 

#  successful return with ":" 

 
if you start a statement with a colon ":" then it returns immediately. 
 e.g. 
 
q)f:{ a:x; b:y; :a*b; "End"}    # notice :a*b returns immediately 
q)f[6;7]                        # never reaches "End" 
42 
 

#  return error with ' 

 
q)g:{ a:x; b:y; '"End"; a:b} 
q)g[6;7] 
'End 
 
..    # recall a good use of if[] is to check for error 
.. 
if[a >= 0; '"error. value of a must be negative"] 
.. 
.. 
 
 
### 
###  protected eval (aka "trap")     @[]  and .[] 
### 
 
try-and-catch semantics. so your code doesn't crash on error. 
 
syntax is 
 
@[f_monadic ; arg ; expr_fail] 
.[f_multivalent ; L_arg ; expr_fail] 
 
 
q) s:"6*7" 
q) @[value; s; `failed]       # monadic func 
42 
 
q) s:"6*`7" 
q) @[value; s; `failed]      # here it failed, but it doesn't die 
`failed 
q) 
 
q) prod:{x*y} 
q) .[prod; (6;7); `failed]     # multivalent func 
42 
 
q) .[prod; (6;`7); `failed] 
`failed 
q) 
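
note: expr_fail can also be a function. in that case it gets the error message as a string. e.g.

q)@[value; "6*`7"; {"caught: ",x}]
"caught: type"
q).[{x*y}; (6;`7); {`$"caught_",x}]
`caught_type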
 
 
//   a more realistic example 
 
dirPath:"/some/dir/"; 
fileName:`foo; 
fileList: @[system; "/bin/ls ",dirPath,string[fileName],"_",ssr[string .z.D;".";""],"_*.csv"; ()]; 
 
==> here suppose you are trying to catch all files /some/dir/foo_YYYYMMDD_*.csv  where * is a shell wildcard, it can be some version number or whatever. 
 
fhList: hsym `$ fileList;     // then you can proceed like this. 
 
 
##################### 
##    debugging    ## 
##################### 
 
## 
##  0N!  is a useful print function 
## 
e.g. 
 
q)7 * 0N!3
3
21 
 
q){a:x*y; 0N!b:a+x; c:b-a}[3;4] 
15
3
 
## 
##  break    (breakpoint) 
## 
 
q)f:{a:x*x; b:y*y; a+b} 
q)f:{a:x*x; b:y*y; break; a+b} 
q)f[3;4] 
{a:x*x; b:y*y; break; a+b} 
'break 
q))                           # here you entered a suspended session 
q))a                          # you can inspect variables like this
9
q))b 
16 
q))\                          # backslash to exit 
q) 
 
## 
##   \e 
## 
 
in certain cases, error doesnt invoke the debugging session (e.g. error caused by other q instance, or compiled lib, etc) 
you can enable error trapping via \e 
 
q)\e 1      // to enable error trap 
q)\e 0      // to disable error trap 
 
in general, you wanna handle unexpected error with protected eval. 
 
 
################# 
##   script    ## 
################# 
 
files with .q suffix, which you can load with \l command, or load as an input arg at the start up. 
 
e.g. 
 
$ cat /tmp/mktrades.q 
 
mktrades:{[tickers; sz] 
  dt:2015.01.01+sz?31;     # NOTE indentation is required if you span a single statement across multiple lines
  tm:sz?24:00:00.000;      #      (the first line must not be indented)
  sym:sz?tickers; 
  qty:10*1+sz?1000; 
  px:90.0+(sz?2001)%100; 
  t:([] dt; tm; sym; qty; px); 
  t:`dt`tm xasc t; 
  t:update px:6*px from t where sym=`goog; 
  t:update px:2*px from t where sym=`ibm; 
  t} 
 
trades:mktrades[`aapl`goog`ibm; 1000000] 
 
 
q)\l /tmp/mktrades.q      # recall you can only put a single whitespace after \l and before the script name 
 
or 
 
$ q /tmp/mktrades.q 
 
 
### 
###  command line params 
### 
 
$ cat test.q 
 
size:"I"$.z.x 0        # .z.x is a list of cmd line args. .z.x[0] = the first arg 
 
$ q test.q 1234 
q)size 
1234i 
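 
for named args, .Q.opt turns .z.x into a dictionary. a rough sketch (the file name test2.q and the option name -size are made up here): 
 
$ cat test2.q 
 
opts:.Q.opt .z.x                 // keys are symbols, values are lists of strings 
size:"I"$first opts`size         // take the first value given for -size, then cast to int 
 
$ q test2.q -size 1234 
q)size 
1234i 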
 
 
########################### 
####        IO         #### 
########################### 
 
a handle == a symbol name of resources (such as files, machines, networks), which you open, read/write, then close. 
 
serialize == translate into a format that can be transferred across multiple storage media, e.g. file, network, variables. 
            e.g. serialize a table, a column, a variable, parameters, etc etc 
 
in q IO, there are two kinds of data. 
 
1. Binary : a binary file == a list of bytes. its function has "1" 
2. Text   : a text file == a list of strings (i.e. a list of char lists). its function has "0" 
 
e.g. 
 
read1       # binary 
read0       # text 
 
#### 
####   Binary data 
#### 
 
### 
###   file handle 
### 
 
a file handle is   `:/path/to/afile     # see the syntax, backtick colon then file path 
 
NOTE: if a file name contains a char like "-" then q attempts to evaluate as arithmetic operator, so it's generally a good idea to make it a string then convert to symbol file handle. 
e.g 
 
q)`$":/tmp/foo" 
`:/tmp/foo            # if you see a file handle returned, it means successful. 
 
q)hsym `$"/tmp/foo"     # suppose you get a file name as a string variable, then all you wanna do is add a colon prefix then cast it to symbol 
`:/tmp/foo              # "hsym" is a nice function that adds a colon ":" 
 
NOTE: repetitively using hsym is harmless, so you may see it used (when it's not really necessary) to be safe. 
    e.g. 
q)hsym hsym `$"/tmp/foo" 
`:/tmp/foo 
 
q)hcount `$":/tmp/foo"    # "hcount" gives you byte size 
11 
 
q)hdel `$":/tmp/foo"      # "hdel" deletes a file. (be careful) 
`:/tmp/foo 
 
 
### 
###  read/write binary data to files 
### 
 
"set" and "get" 
 
e.g. 
 
q) (`$":/tmp/a") set 34     # plain and simple 
`:/tmp/a                    # don't forget the parentheses, because q evaluates right-to-left 
 
q)get `$":/tmp/a"           # realistically, you don't need a single var as a file like this. 
34                          # but writing a table to a file, as persisted storage, can be super useful. 
 
q)hdl : `$":/tmp/a"         # of course, a file handle can be assigned to a variable 
 
q)hdl set 999               # NOTE: "set" overwrites if any existing file, otherwise creates a new file. 
`:/tmp/a 
 
q)get hdl 
999 
 
q) (`$":/tmp/L") set 34 12 56 
`:/tmp/L 
 
q)get `$":/tmp/L"           # writing a list to a file 
34 12 56 
 
q)(`$":/tmp/t") set ([] c1:`a`b`c; c2:10 20 30)        # writing a table to a file 
`:/tmp/t 
 
q)get `$":/tmp/t" 
c1 c2 
----- 
a  10 
b  20 
c  30 
 
q)t:([] c1:`a`b`c; c2:10 20 30; c3:1.1 2.2 3.3) 
 
q)save `$":/tmp/t"                                 # same as  q)(`$":/tmp/t") set t 
`:/tmp/t                                           # "save" here looks for a global variable "t" then saves to /tmp/t 
 
q)get `:/tmp/t 
c1 c2 c3 
--------- 
a  10 1.1 
b  20 2.2 
c  30 3.3 
 
 
q)(`$":/tmp/a-b/t") set ([] c1:`a`b`c; c2:10 20 30) 
`:/tmp/a-b/t 
 
q)\l /tmp/a-b/t           # notice if you use \l then you can use a bare file name (it can even include a dash) 
`t                        # but of course, remember you can only put a single whitespace after \l 
 
q)t 
c1 c2 
----- 
a  10 
b  20 
c  30 
 
q)load `$":/tmp/a-b/t"     # another way to load a file. (difference from "get" is you actually loaded into a variable) 
`t                         # here you have to use `$"" cast 
 
q)t 
c1 c2 
----- 
a  10 
b  20 
c  30 
 
 
## 
##  open handle 
## 
 
"hopen" a handle, then assign it to some var. 
 
q)`:/tmp/L set 10 20 30 
`:/tmp/L 
q)h:hopen `:/tmp/L      # if you hopen an existing file, it is "append", not "overwrite" 
q)h 
3i                   # 3i here is an integer for the open handle (file descriptor number) it can be any integer. 
q)h[34]              # appending 34 as an item 
3i 
q)h 100 200          # appending 100 200 simple list 
3i 
 
q)hclose h            # don't forget to hclose   (otherwise buffer may not get flushed) 
 
q)get `:/tmp/L 
10 20 30 34 100 200 
 
 
## 
##  reading/writing binary  with read1 and 1: 
## 
 
read1    to read binary 
  1:     to write binary 
 
e.g. 
 
q)read1 `:/data/L set 10 20 30                # "read1" is the binary read operator 
0xfe2007000000000003000000000000000a0....     # you get binary representation, if you read1 
 
q)`:/data/a 1: 0x06072a       #  "1:" is the binary write operator, yeah it's terse 
`:/data/a 
 
q)read1 `:/data/a 
0x06072a 
 
## 
##   a cryptic alternative to "set" 
## 
 
instead of "set", you can use general application "." for dyadic func ":" 
e.g. 
.[file_handle; (); :; data]       # why can't people use "set" 
 
e.g. 
 
q).[`:/data/raw; (); :; 1001 1002 1003]     # be able to recognize when you see it. 
`:/data/raw 
q)get `:/data/raw 
1001 1002 1003 
 
to append, use "," 
 
e.g. 
 
q).[`:/data/raw; (); ,; 42]      # just be able to recognize when you see it. 
`:/data/raw 
q)get `:/data/raw            # "42" got appended 
1001 1002 1003 42 
 
 
############################## 
###     Splayed Tables     ### 
############################## 
 
eventually your data becomes too big to be saved as a single file. 
instead of saving an entire table into a file, you can save by columns. (called splayed table) 
what happens is columns get loaded on demand, so as long as you select specific columns, only those get loaded onto mem. 
 
e.g. 
 
q)t:([] c1:10 20 30; c2:1.1 2.2 3.3) 
q)`:/tmp/tsplay/ set t                 # syntax is subtle. you put a trailing "/" thus making a handle point to a directory, not a file. 
`:/tmp/tsplay/ 
 
q)system "ls -la /tmp/tsplay" 
"total 24" 
"drwxr-xr-x   5 kenics  wheel  170 Mar  3 02:24 ." 
"drwxrwxrwt  18 root    wheel  612 Mar  3 02:25 .." 
"-rw-r--r--   1 kenics  wheel   14 Mar  3 02:24 .d"     # ".d" contains the column order. see below 
"-rw-r--r--   1 kenics  wheel   40 Mar  3 02:24 c1"     # column name 
"-rw-r--r--   1 kenics  wheel   40 Mar  3 02:24 c2"     # column name 
 
q)get `:/tmp/tsplay/.d     # keeps the column order info  (column names in "symbol") 
`c1`c2 
 
NOTE: this means you can easily change the "table schema" by editing .d file 
NOTE: you cannot splay a keyed table. (but you can use link column in regular table) 
NOTE: all columns must be simple list, or "compound" list (list of simple lists of uniform type) 
      e.g. a column of string type is a perfect example of compound list. 
      compound column's type appear as upper case in meta[tbl] output. 
      but don't trust meta[tbl] output too much. it only checks the first item of each column. 
      e.g. 
      q)meta ([] c1:(1 2 3f ;12,34,56;`ibm)) 
      c | t f a 
      --| ----- 
      c1| F         //  this is NOT a compound list ! 
 
NOTE: symbol MUST be enumerated in splayed table. (duh, very tedious, but .Q.en[] takes care of it) 
 
 e.g. 
 
q)`:/tmp/tok/ set ([] c1:2000.01.01+til 3; c2:1 2 3)      # success 
`:/tmp/tok/ 
 
q)`:/tmp/tok/ set ([] c1:1 2 3; c2:(1.1 2.2; enlist 3.3; 4.4 5.5))     # success 
`:/tmp/tok/ 
 
q)`:/tmp/toops/ set ([] c1:1 2 3; c2:(1;`1;"a"))     # cannot splay general list table 
'type 
 
q)`:/tmp/toops/ set ([] c1:`a`b`c; c2:10 20 30)      # symbol MUST be enumerated. 
'type 
 
 
NOTE: how to enumerate symbol columns before splaying a table? 
 
q)`:/tmp/tsplay/ set ([] `sym?c1:`a`b`c; c2:10 20 30)     # here is a tedious example. notice the syntax "?" 
`:/tmp/tsplay/                                            # this cmd `sym? actually creates a new "sym" variable on the spot 
q)sym                        # you keep the distinct symbol list in the "sym" variable 
`a`b`c                       # (you can use a custom name other than the default "sym" if you must) 
q)`:/tmp/sym set sym         # and keep it one level above the tsplay dir. 
`:/tmp/sym                   # this is tedious work, so .Q.en[] does it for you, as below. 
 
q)t:([] c1:`a`b`c; c2:10 20 30) 
q)`:/tmp/tsplay/ set .Q.en[ `:/tmp ; t ]     # NOTE: 1st arg is a file handle to the dir. (don't put / at the end) 
`:/tmp/tsplay/                               # 2nd arg is the table.  i.e.  .Q.en[file_hdl_to_dir, table_name] 
 
---> it automatically creates /tmp/sym 
 
NOTE: think about the question "how do i enumerate and splay a table with a custom sym file name ?" 
      the answer is precisely the above, instead of .Q.en[], you just do `mysym?colname 
 
e.g. 
 
q)-3! t:([] c1:`a`b`c; c2:10 20 30; c3:`ibm`ms`gs)        // no enumeration yet 
"+`c1`c2`c3!(`a`b`c;10 20 30;`ibm`ms`gs)"                 // column c1 & c3 are symbol type 
 
q)update `mysym?c1, `mysym?c3 from t       // this is the syntax to enumerate 
c1 c2 c3 
---------                                  // did it work ? 
a  10 ibm 
b  20 ms 
c  30 gs 
 
q)-3! update `mysym?c1, `mysym?c3 from t 
"+`c1`c2`c3!(`mysym$`a`b`c;10 20 30;`mysym$`ibm`ms`gs)"      // yes it did 
 
q)mysym 
`a`b`c`ibm`ms`gs 
 
// if you want to do a more systematic way, then here it is 
 
q)show clist: exec c from meta[t] where t = "s" 
`c1`c3 
q)symName : `mysym 
 
q) -3!  ![t;();0b;clist!(?;enlist[symName]) ,/: clist]       // recall symbol in parse tree must be enlisted, hence enlist[symName] 
"+`c1`c2`c3!(`mysym$`a`b`c;10 20 30;`mysym$`ibm`ms`gs)"      // coming up with ,/: should be a cake walk 
 
q)(hsym `$"/some/path/mysym") set mysym      // persisting onto disk 
 
NOTE: if you are creating this splayed table as a new partition, then just before you enumerate, make sure you lock & load the existing mysym file from disk, then enumerate, then save back to disk (then unlock) 
      recall .Q.en[] automatically does this for you, but it only lets you use "sym" as sym file name. so if you want your custom sym file name, then you must do these steps manually. ALSO, make sure you don't save sym file compressed. just system"x .z.zd" and .z.zd:17 2 6 around the sym file save step. 
e.g. 
q).Q.en    // notice how `sym is hard-coded 
k){[s;d;x]if[#f@:&{$[11h=@*x;&/11h=@:'x;11h=@x]}'x f:!+x;(`/:d,`sym)??,/?:'{$[0h=@x;,/x;x]}'x f];@[x;f;{$[0h=@y;(-1_+\0,#:'y)_x[`sym;,/y];x[`sym;y]]}s]}[?] 
 
 
NOTE:  it turns out kx implemented .Q.ens[] that lets you specify your own sym file name ! 
 
e.g. 
(hsym `$"/tmp/tsplay/") set .Q.en[`:/tmp ; t]               // creates "sym" file 
(hsym `$"/tmp/tsplay/") set .Q.ens[`:/tmp ; t; `mysym]      // creates "mysym" file 
 
 

# 
#  splayed table with compound columns 
# 
 
when you splay a table with compound columns, those compound columns produce two files per column. 
notice one extra file with sharp "#" below 
(reason is purely for performance. they get saved/loaded automatically, so you don't have to worry about them too much) 
 
e.g. 
 
q)`:/tmp/testTbl/ set ([] c1:("foo";enlist "b";"ar")) 
q)system "ls -a /tmp/testTbl" 
,"." 
".." 
".d" 
"c1"         // this is a list of indices to reconstruct the compound column from c1# file 
"c1#"        // binary data of the column data in a raze'ed (i.e. flattened) way 
 
 
NOTE: a big question every kdb+ designer must consider is to keep the column type "symbol" or "string" 
      - if the possible value is from a small universe/domain, and often repeated, then use symbol (symbol, being atom, is much faster than string, also the benefit of enumeration kicks in when the same symbol appears many times) 
      - if the possible value can be any free text with not much repetition, like news headline, then use string (because if you enumerate such column, your sym file can quickly grow huge, i.e. inefficient) 
 
 
## 
##  mapping / loading splayed tables 
## 
 
q)\l /tmp             // option 1   (load both the sym file and map tsplay table) 
 
$q /tmp/tsplay        // option 2   (only maps tsplay table, no sym file loaded) 
 
q)\l /tmp/tsplay      // option 3   (only maps tsplay table, no sym file loaded) 
 
NOTE: at this point, you have only "mapped" the table, and haven't actually loaded any data into memory. 
NOTE: option 1 will try to load everything in /tmp, so if that directory has other garbage, you may get an error. beware. 
NOTE: options 2 & 3 don't load the "sym" file automatically - make sure you load it, like below 
 
q)sym:get hsym `$"/tmp/sym" 
q)sym 
`a`b`v`ibm`msft`aapl 
 
q)select c1,c3 from tsplay   // recall this only actually "loads" c1 & c3 from disk, not c2 
                             // hence the performance of splayed table 
 
q)t: get hsym `$"/tmp/tsplay/"     // option 4   (only maps, no load to mem yet) 
q)t: get hsym `$"/tmp/tsplay"      // option 5   (maps & loads to mem.) 
                                   // you have to load sym file like above separately for both option 4 & 5 
 
 
#### 
####  text file 
#### 
 
read0   to read text       # recall "1" is for binary 
  0:    to write text      # recall "0" is for text 
 
q)system "cat /tmp/solong.txt" 
"So long suckers" 
"and thanks" 
"for nothing" 
 
q)read0 `:/tmp/solong.txt    # a text file is treated as a list of string. i.e. a list of char lists. 
"So long suckers" 
"and thanks" 
"for nothing" 
 
q) `:/tmp/greeting.txt  0: ("hello and"; "goodbye so long"; "see you later bye")        # so basically "set" is binary write, and 0: is text write 
`:/tmp/greeting.txt 
 
q)read0 `:/tmp/greeting.txt 
"hello and" 
"goodbye so long" 
"see you later bye" 
 
q)get `:/tmp/greeting.txt      # error. cannot read text with "get" 
'/tmp/greeting.txt 
 
q)read1 `$":/tmp/greeting.txt"                      # you can read text file as binary like this 
0x68656c6c6f20616e640a676f6f6462796520736f206c6.. 
 
q)"c"$read1 `$":/tmp/greeting.txt"                  # obviously, if you cast to char, it works 
"hello and\ngoodbye so long\nsee you later bye\n" 
 
 
NOTE:  0:  doesn't work if compression is enabled. 
e.g. 
q).z.zd:17 2 6 
q)`:/tmp/foo 0: ("hoge";"bar") 
q)\cat /tmp/foo 
kxzipped...          // doesn't work, but if you add ".txt" suffix, it works. 
                     // one way is to save .txt and remove .txt 
                     // alternatively, you can disable compression by  q)\x .z.zd   and re-enable it. 
 
## 
##  writing to text file, with "save" 
## 
 
q)t:([] c1:`a`b`c; c2:10 20 30; c3:1.1 2.2 3.3) 
 
q)save `:/tmp/t       # recall "save"   here it's just binary write, equivalent to  q)`:/tmp/t set t 
`:/tmp/t 
 
q)save `:/tmp/t.txt    # "save" is smart, based on file extension, it changes to "text" write. 
`:/tmp/t.txt           # in which case, save == 0:      (obviously 0: is more flexible because you can specify a filename) 
 
q)system "cat /tmp/t.txt"     # .txt adds a tab delimiter 
"c1\tc2\tc3" 
"a\t10\t1.1" 
"b\t20\t2.2" 
"c\t30\t3.3" 
 
q)save `:/tmp/t.csv 
`:/tmp/t.csv 
 
q)system "cat /tmp/t.csv"      # .csv is comma separated 
"c1,c2,c3" 
"a,10,1.1" 
"b,20,2.2" 
"c,30,3.3" 
 
q)save `:/tmp/t.xml 
`:/tmp/t.xml 
 
q)system "cat /tmp/t.xml"       # .xml 
"<R>" 
"<r><c1>a</c1><c2>10</c2><c3>1.1</c3></r>" 
"<r><c1>b</c1><c2>20</c2><c3>2.2</c3></r>" 
"<r><c1>c</c1><c2>30</c2><c3>3.3</c3></r>" 
"</R>" 
 
q)save `:/tmp/t.xls       # you can even do excel spreadsheet format also 
`:/tmp/t.xls 
 
 
## 
##  using "hopen" and "hclose"  on text files 
## 
 
just like binary files, you can use hopen / hclose on text file too. 
 
q) h:hopen `$":/tmp/new.txt" 
q) neg[h] enlist "this"         # the catch is you use neg[h]  which is a very cryptic syntax. 
-3i                             # also don't forget to enlist a single string in this case, because a text file is a list of strings 
q) neg[h] ("is";"pizza") 
-3i 
q) hclose h                     # note you hclose on h, not on neg[h] 
q) read0 `$":/tmp/new.txt" 
"this" 
"is" 
"pizza" 
 
 
q) h:hopen `$":/tmp/new.txt"     # recall, hopen on an existing file means "append", not "overwrite" 
q) neg[h] ("or";"pasta") 
-3i 
 
q) hclose h 
 
q) read0 `$":/tmp/new.txt" 
"this" 
"is" 
"pizza" 
"or" 
"pasta" 
 
 
## 
##  preparing/formatting text 
## 
 
another overload of  0: lets you format text 
 
 
q)show t:([] c1:`a`b`c; c2:1 2 3) 
c1 c2 
----- 
a  1 
b  2 
c  3 
 
q)"|"  0:  t 
"c1|c2" 
"a|1" 
"b|2" 
"c|3" 
 
q)csv       #  here is a pre-defined constant in q 
"," 
 
q) "," 0: t 
"c1,c2" 
"a,1" 
"b,2" 
"c,3" 
 
q) csv  0:  t 
"c1,c2" 
"a,1" 
"b,2" 
"c,3" 
 
q) `:/tmp/foo.csv 0: csv 0: t        # so you will see something like this 
`:/tmp/foo.csv 
 
q)read0 `:/tmp/foo.csv 
"c1,c2" 
"a,1" 
"b,2" 
"c,3" 
 
 
### 
###   parsing text 
### 
 
## 
##  parsing fixed-length text 
## 
 
http://code.kx.com/q4m3/11_IO/#1151-fixed-width-records 
 
 
## 
##  parsing variable-length text 
## 
 
syntax is 
(L_upper_case_char ; delim)   0:  file_handle      # for text (delimited fields) 
(L_lower_case_char ; widths)  1:  file_handle      # for binary (fixed-width fields, widths in bytes) 
 
e.g. 
 
q)system "cat /tmp/hoge.csv"      # suppose you have this CSV file 
"1001,DBT12345678,98.6" 
"1002,EQT98765432,24.75" 
"1004,CCR00000001,121.23" 
 
q)("JSF"; ",") 0:  `:/tmp/hoge.csv     # see the syntax, upper case char list, then delimiter 
1001        1002        1004           # the output is just a list of lists 
DBT12345678 EQT98765432 CCR00000001 
98.6        24.75       121.23 
 
q)`c1`c2`c3 ! ("JSF"; ",") 0:  `:/tmp/hoge.csv        # so you can easily construct a column dict 
c1| 1001        1002        1004 
c2| DBT12345678 EQT98765432 CCR00000001 
c3| 98.6        24.75       121.23 
 
q)flip `c1`c2`c3 ! ("JSF"; ",") 0:  `:/tmp/hoge.csv      # and even construct a table 
c1   c2          c3 
----------------------- 
1001 DBT12345678 98.6 
1002 EQT98765432 24.75 
1004 CCR00000001 121.23 
 
NOTE: if you wanted to extract the above 2nd column as string, not symbol, then just change "S" to "*" 
 
q)flip `c1`c2`c3 ! ("J*F"; ",") 0:  `:/tmp/hoge.csv      # yes, it's "*", because "C" means a single char 
c1   c2            c3 
------------------------- 
1001 "DBT12345678" 98.6 
1002 "EQT98765432" 24.75 
1004 "CCR00000001" 121.23 
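 
NOTE: the right operand of 0: can also be a list of strings instead of a file handle, which is handy for trying out a type spec without a file. a small sketch, reusing two rows from above: 
 
q)("JSF"; ",") 0: ("1001,DBT12345678,98.6"; "1002,EQT98765432,24.75") 
1001        1002 
DBT12345678 EQT98765432 
98.6        24.75 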
 
 
NOTE: what if your input file has a header ? 
 
$ cat /tmp/t.csv          # here is a file with a header 
id,ticker,price 
1001,DBT12345678,98.6 
1002,EQT98765432,24.7 
1004,CCR00000001,121.23 
 
q)("JSF"; enlist ",")  0: `:/tmp/t.csv     # then use "enlist" on delimiter, 
id   ticker      price                     # then the result is magically a table !! 
----------------------- 
1001 DBT12345678 98.6 
1002 EQT98765432 24.7 
1004 CCR00000001 121.23 
 
NOTE: to skip a column, just use a whitespace. 
      e.g.  "J FD" means you skip the 2nd column, and any columns after the 4th column. 
 
NOTE: what if your input is key-value pair ? 
 
====> yet another special case of  0: 
 
q) "S=;" 0: "one=1;two=2;three=3"       # see the syntax, left operand to 0:  is 3 letter string 
one  two  three                         # 1st is key variable type "S" for symbol, "I" for integer 
,"1" ,"2" ,"3"                          # 2nd is key-value separator, like "=" in this case 
                                        # 3rd is pair delimiter 
                                        # value is always string 
                                        # output is a paired list (which is easy to transform to a column dict, then into a table) 
 
q) "S:/" 0: "one:1/two:2/three:3"      # another example 
one  two  three 
,"1" ,"2" ,"3" 
 
q) "I=;" 0: "1=one;2=two;3=three"      # another example 
1     2     3 
"one" "two" "three" 
 
q)flip `k`v!"I=;" 0: "1=one;2=two;3=three"      # see how easy it is to transform to a table 
k v 
--------- 
1 "one" 
2 "two" 
3 "three" 
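 
if you want a lookup dictionary instead of a table, you can apply "!" to the pair directly. a small sketch using the same input as above (the variable name d is arbitrary): 
 
q)d:(!). "S=;" 0: "one=1;two=2;three=3"     // (!). applies ! to the 2-item list (keys; values) 
q)d`two 
,"2" 
q)"J"$d`two                                 // values are strings, so cast when you need a number 
2 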
 
 
### 
###  stdout, stderr 
### 
 
recall the file handle you hopen[] and hclose[] 
 
e.g. 
q)h:hopen `:hostName:portNum 
q)h                            // recall h is just an integer 
3i 
q)hclose h 
 
===>  "1" is STDOUT file handle.   "-1" or neg[1] is the same but appends "\n" at the end 
      "2" is STDERR file handle.   "-2" or neg[2] appends "\n" 
 
 
-1 "foobar";     // you can print log msg, or the content of variable 
show `foobar;    // alternative way 
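 
a tiny logging sketch built on these handles (logInfo / logErr are made-up names, not built-ins): 
 
q)logInfo:{-1 string[.z.P]," INFO  ",x;}      // timestamp + msg to STDOUT. the trailing ";" suppresses the return value 
q)logErr:{-2 string[.z.P]," ERROR ",x;}       // same, but to STDERR 
q)logInfo "loaded config" 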
 
 
### 
###  x 2: y          // dynamic load 
### 
 
- x is a file handle of C dynamic library (.so) 
- y is a list of argument 
 
// assume you have /foo/bar/libname.so 
 
libpath: "/foo/bar/libname";                        // without .so suffix 
cFuncHandle: (hsym `$libpath) 2: (`funcname;5);     // function name in symbol, and the number of args 
 
cFuncHandle[12; 34; 56; 78 ;90] 
 
 
ref:   https://code.kx.com/wiki/Reference/TwoColon 
 
 
 
################################################### 
###    interprocess communication  (aka IPC)    ### 
################################################### 
 
start two q processes - if one opens a tcp port (it becomes a server), then the other can access it (as client). 
 
in general, if you run q code, it just exits once it finishes every line. 
interactive mode is infinite while loop to wait for STDIN. 
to run a q server (that waits for input from clients on a certain port), you start q with -p N (or call system "p ",string[portNum] inside your q code) 
to exit a q server, it must call exit[]  (or you kill it externally) 
 
$ cat pw.txt 
tom:password1       // pw can be md5[]  like below 
alice:password2 

 
q)h:hopen `:/tmp/pw.txt 
q)neg[h] 0N!"tom:",raze string md5 "password1" 
"tom:7c6a180b36896a0a8c02787eeafb0e4c"           // user can still supply password1 when connecting 
-3i 
q)hclose h 
 
 
$ q -p 5432 [-u pw_filename]   // server (let's call it ebisu) opened port 5432 
                               // there are two use cases of -u option. 
                               // 1) -u pw_filename 
                               // 2) -u 1  (implied by -u pw_filename) prevents system commands from remote users. causes 'access 
                               //          also restricts file access to within the start dir. 
                               // -U pw_filename means -u 0   so remote user can even call \\ to exit server proc 
 
q)\p                           // shows the current port. default value is 0i 
5432i 
 
q)\p 7890                      // you can change if needed 
 
// another q process (could be on the same or diff host/network) as client 
$ q 
q)h: hopen `:ebisu:5432           // notice the syntax  `:host:port[:user:pw]    e.g. `::5432:alice:password2 
q)h: hopen `:192.168.29.12:5432   // ip addr is ok also                                you get 'access error if not authenticated 
q)h: hopen `:www.foo.com:5432     // url is ok also 
q)h: hopen `::5432                // you can omit if the server is on the same localhost 
q)h: hopen 5432                   // equivalent to `::5432 
q)h: hopen (5432; 1000)           // if you supply the 2nd arg, e.g. 1000, it means timeout in milli-sec 
 
q)h       // recall handle is just an int (not the same int as the handle on the server side) 
5i 
 
q)hclose h           // close the connection 
 
NOTE: to avoid colliding with ports already in use, you can use infinity (0W) to let q pick an unused port. 
 
$ q -p 0W           // to know which port got allocated, run system "p" 
 
NOTE: positive port number means single-threaded. a negative port number means multi-threaded. 
 
### 
###  remote execution format - string[q statement] 
### 
 
q)h "3+4"          // "synchronous" - waits for the server response (aka "get" semantics) 
7 
q)neg[h] "3+4"     // "asynchronous" - doesnt wait for the server response (aka "set" semantics) 
q)                 // more realistic when you just send something for which you don't need a reply 
                   // e.g. you just update some variable, like  neg[h] "mkt:`closed" 
                   // or when you send some subscription request then you expect responses later 
 
NOTE: when doing asynch call, your msg may be buffered and not actually sent yet. so it is safe to do a chaser synch call to flush buffer. 
e.g. 
q)h ""        // this will suffice as "chaser" so client waits until the server finishes all your async call. 
              // similarly, though less thorough, neg[h][] or neg[h](::) will block the proc until all prev async msgs are flushed to the network 
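 
a sketch of the typical sequence, assuming h is an open handle to the server: 
 
q)neg[h] "mkt:`closed"     // async, returns immediately 
q)neg[h][]                 // flush: blocks until the pending msg has been written to the socket 
q)h ""                     // chaser: blocks until the server has processed everything sent before it 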
 
### 
###  another format of remote execution - parse tree 
### 
 
q)h ({x+y}; 3; 4)       // notice the form (f; arg1; arg2; ...; argN) 
7                       // the first arg can be any of function, list, dictionary (we call them a "map") 
 
q)h ({x,x}; 3 4)        // another example 
3 4 3 4 
 
q)f:{x*y}               // NOTE: if you use the value (as opposed to the symbolic reference), 
q)h (f; 3; 4)           //       then it calls "client" side map. 
12 
 
// on server 
q)f:{x+y}               // lets define this on "server" side 
 
// back on client side 
q)h (f; 3; 4)           // calls f[x;y] defined on "client" side 
12 
q)h (`f; 3; 4)          // calls f[x;y] defined on "server" side 
7 
 
 
q)h (system; "who")      // to see who is logged on the server 
 
 
 
### 
###  qSQL stored proc 
### 
 
q)t:([] name:`ibm`msft`aapl; price:12 34 56)           // on server 
q)getPriceByTicker:{select price from t where name = x} 
 
q)h (`getPriceByTicker; `ibm)             // on client 
price 
----- 
12 
 
 
### 
###  callback using .z.w 
### 
 
.z.w  (w for who) has the handle of the remote proc (client in this case) 
 
q)f:{neg[.z.w] (`callback; x+y)}      // on server, calling client's callback function (named `callback in this case) 
                                      // make this async as in neg[.z.w] otherwise you get deadlock 
 
q)callback:{show x*x}         // on client 
q)h (`f; 3; 4) 
49 
q) 
 
 
### 
###   protecting servers with  .z.pg  &  .z.ps 
### 
 
.z.pg     //  invoked at every  synchronous call   (pg = proc get) 
.z.ps     //  invoked at every asynchronous call   (ps = proc set) 
 
these are not defined by default. the default behavior is just to apply value[x] for any incoming call. 
lets make it safer. 
 
// first step is to only accept a symbol literal of a function name defined on server side, so client cannot execute their arbitrary code on server 
 
.z.pg:{if[-11<>type first x; '`$"first arg must be a symbol literal"]} 
 
// lets make it even more restricted, by allowing only a list of functions. 
 
.z.pg:{funcName: first x; if[-11<>type funcName; '`$"first arg must be a symbol"]; if[not funcName in `f1`f2`f3; 'funcName]} 
 
// finally, you can add reval[x] to safely execute 
 
.z.pg:{funcName: first x; if[-11<>type funcName; '`$"first arg must be a symbol"]; if[not funcName in `f1`f2`f3; 'funcName]; reval x} 
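 
// the same guard can be reused for async calls. a sketch, assuming f1 f2 f3 are functions defined on the server ("guard" and "allowed" are made-up names) 
 
allowed:`f1`f2`f3 
guard:{[x] 
  funcName: first x; 
  if[-11h<>type funcName; '"first arg must be a symbol"]; 
  if[not funcName in allowed; '"not allowed: ",string funcName]; 
  reval x} 
.z.pg:guard      // sync 
.z.ps:guard      // async 
 
// then on client side 
q)h (`f1; 3; 4)          // ok, runs f1[3;4] on server 
q)h "f1[3;4]"            // rejected, because the first item of a string is a char, not a symbol 
'first arg must be a symbol 
q)h (`system; "who")     // rejected, not in the allowed list 
'not allowed: system 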
 
 
NOTE: other "callbacks" 
 
.z.po      // invoked when a connection is opened 
.z.pc      // invoked when a connection is closed 
 
.z.p{o,c,g,s} are aka "callbacks" because the following will yield the "client side" (aka remote session) data. 
 
.z.u      // in other words, you can collect data about incoming connections 
.z.a      // note, they are only these 3. others like .z.h, .z.z, .z.p are all still server side (i.e. current session) 
.z.w 
 
### 
###  keeping track of user access 
### 
 
$ q -p 7890         // server 
 
q)meta accessLog:1!([] handle:"i"$(); user:"s"$(); src:"s"$(); time:"p"$(); alive:"b"$()) 
c     | t f a 
------| ----- 
handle| i 
user  | s 
src   | s 
time  | p 
alive | b 
 
q)accessLog 
handle| user src time alive 
------| ------------------- 
 
q).z.po:{[h] `accessLog upsert (h; .z.u; .Q.host[.z.a]; .z.P; 1b)}    // h == .z.w 
q).z.pc:{[h] update time:.z.P, alive:0b from `accessLog where handle = h} 
 
// (client connects) 
$ q                 // client 
q)h:hopen 7890 
q)h 
7i 
 
// back to server side 
q)get `accessLog 
handle| user   src       time                          alive 
------| ---------------------------------------------------- 
10    | kenics localhost 2018.12.16D02:46:58.837910000 1 
 
 
// on client side 
q)hclose h 
 
// back to server side 
q)get `accessLog 
handle| user   src       time                          alive 
------| ---------------------------------------------------- 
10    | kenics localhost 2018.12.16D07:50:22.629110000 0 
 
 
### 
###  .z.vs[x;y]         # vs = variable set 
### 
 
when a global variable is changed, .z.vs[x;y] is invoked. 
x = variable name 
y = index           // empty list () if the whole variable 
 
note: useful for debugging/logging 
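 
e.g.   (a minimal sketch) 
 
q).z.vs:{0N!(x;y)}      // print the variable name and the index on every global set 
q)m:(1 2;3 4) 
(`m;()) 
q)m[1;1]:0 
(`m;1 1) 
 
q)\x .z.vs              // remove the handler when done 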
 
 
### 
###   qcon binary 
### 
 
qcon host:port[:usr:pw]         // syntax 
 
q)       // this essentially is ssh login into the server 
 
what happens in reality is you open/close TCP remote access to the server at every command 
typing \\ will only quit your local session, not the server 
but if you type "exit 0" it will shut down the server 
 
### 
###  http web sockets 
### 
 
http://code.kx.com/q4m3/11_IO/#117-http-and-web-sockets 
 
 
 
 
##################################### 
####     namespace / context     #### 
##################################### 
 
if a q script runs and calls another q script, and both scripts define global variables, then the 2nd script overwrites. 
(a typical name clash) 
 
hierarchical context (aka namespace) is separated by a dot "." 
 
notice namespace/context is a dictionary 
 
q)get `.             # your global variables 
hdl| `:/tmp/a 
t  | +`c1`c2!(`a`b`c;1 2 3) 
h  | 3i 
a  | 56 
 
q).conf.a : 34       # it created a context "conf" then defined a variable "a" 
q).conf.b : 76 
q).conf.f : {x*x}    # you can define a function too, of course 
q).conf.f 4 
16 
 
q)get `.conf 
 | ::              # note, this "::" is just there to prevent q from collapsing a dictionary value list into a single list. not a big deal. 
a| 34 
b| 76 
f| {x*x} 
 
q)get `.conf.a    # called fully qualified name of variable "a" 
34 
 
q)`.conf[`b]     # because it's a dict, you can access like this 
76 
 
q).conf[`b]      # or like this 
76 
 
q)key `              # to view the root contexts. (usually single letter context and .kx namespace are reserved in q) 
`q`Q`h`j`o`conf 
 
q)key .conf       # to view .conf context dict key 
``a`b`f 
 
NOTE: if you must delete a variable from a context, use "delete" template 
  e.g. 
 
q)delete b from `.conf 
`.conf 
 
q)get `.conf 
 | :: 
a| 34 
f| {x*x} 
 
 
### 
###   saving & loading contexts 
### 
 
a context is just a dictionary, so we can save to a file, and later reload. 
 
q).conf        # let's save this context 
 | :: 
a| 34 
b| 76 
f| {x*x} 
 
q)`:/tmp/conf set get `.conf     # get then set   (get used to this step) 
`:/tmp/conf 
 
q)\\             # exit a q session once 
$ q              # start a new session 
q)get `.conf     # notice .conf namespace doesn't exist 
'.conf 
 
q) get `:/tmp/conf 
 | :: 
a| 34 
b| 76 
f| {x*x} 
 
q)`.conf set get `:/tmp/conf     # get set again 
`.conf 
 
q)get `.conf          # .conf namespace loaded again 
 | :: 
a| 34 
b| 76 
f| {x*x} 
 
 
NOTE: \d lets you move (kind of like "cd") into a context, but not recommended. unless you are debugging. 
      a common situation where you need \d (dashd) is when a .Q.* call fails with an error. 
      then your q) console ends up in q.Q) namespace aka context  // lets call it qdotq 
      you can simply do \d .  which will take you back to the root context q) 
      e.g. 
      q).Q.en[`foo;`bar] 
      q.Q)             // oh no, we fell into qdotq context 
      q.Q)             // how do i get out ? 
      q.Q)\d . 
      q)               // yay 
 
by now, you are familiar with both expressions below 
 
q)kt 
eid | name  age 
----| --------- 
1001| tom   65 
1002| bob   19 
1003| simon 34 
 
q)get `kt 
eid | name  age 
----| --------- 
1001| tom   65 
1002| bob   19 
1003| simon 34 
 
 
 
#################################################### 
####    system commands  &  system variables    #### 
#################################################### 
 
see here for the definitive list - http://code.kx.com/q4m3/13_Commands_and_System_Variables/ 
 
NOTE: it's very case sensitive. e.g. \p is for port number, \P is for float decimal place precision 
 
NOTE: it's very sensitive to whitespace 
      q) \p 12345   // only one whitespace between the command and its arg 
        ^ 
       this whitespace not allowed 
 
NOTE: the backslash form only works at the start of a line (console, or top level of a .q script). inside a function, use system[]
      e.g.
      q)\l /tmp/fooTbl       //  same as  system "l /tmp/fooTbl"
 
## 
##  q commands 
## 
 
command  |  description 
----------------------------- 
  \v     |  list of variables 
  \a     |  list of tables 
  \f     |  list of functions 
  \b     |  list of views/alias 
  \c     |  size of console screen 
  \P     |  decimal place precision for float 
  \p     |  port number 
  \l     |  load a file (like a q script) or a dir (like a splayed table) 
  \t     |  timer  (in millisec) 
  \ts    |  timer and space (in millisec and bytes) 
  \s     |  shows how many slaves (threads or processes) 
  \w     |  workspace mem info 
  \x     |  expunge event handler 
  \      |  terminate 
  \\     |  exit    (in q code, it is better to exit with the "exit" func. e.g.  exit[0] )

recall system[] lets you call the above commands from q code, e.g. inside a function.
 
e.g. 
 
q)system "P"      // these two are identical 
7i 
q)\P 
7i 
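
NOTE: a small sketch of why system[] matters inside functions. the helper name withPrec is hypothetical. it reads the current precision with system "P", changes it, runs a function, and puts the old value back.

```q
// hypothetical helper: run f[] with float precision n, then restore the previous setting
withPrec:{[n;f]
  old:system "P";             // current precision, e.g. 7i
  system "P ",string n;       // same as typing  \P n  at the console
  r:f[];
  system "P ",string old;     // put it back
  r}

withPrec[4;{string 4%3}]
```

if f signals an error the precision is not restored in this sketch. wrapping f in protected evaluation would handle that.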
 
q)\v 
`L`a`age`d`dc`ds`dt`e`foo`height`iq`k`kt`kt2`ktc`m`mix`t`t1`t2`tf`tp`tt`ttt`u`v 
 
q)\a 
`dt`k`kt`kt2`ktc`t`t1`t2`tf`tp`tt`ttt 
 
q)\f 
`apply`f`g`sq 
 
q)\c 1000 1000 
q) \c 1000 1000         // note you cannot put any whitespace 
'\ 
 
q)\P 
7i 
q)4 % 3f 
1.333333 
 
q)\P 4 
q)4 % 3f 
1.333 
 
 

#  work space mem 

 
q)\w 
289936 67108864 67108864 0 0 8589934592     // 6 longs, each defined as below 
 
1. # of bytes allocated 
2. # of bytes available in heap 
3. max heap size used so far in the current session 
4. max bytes specified at startup with -w 
5. # of bytes for mapped entities 
6. # of bytes of physical host memory 
 
also, \w 0 returns two longs; 
1. # of symbols 
2. # of bytes used for the symbols 
 
e.g. 
 
q)\w 0 
637 21009 
 
===>  .Q.w[]  displays the above info in human readable way 
 
q).Q.w[] 
used| 290096 
heap| 67108864 
peak| 67108864 
wmax| 0 
mmap| 0 
mphy| 8589934592 
syms| 637 
symw| 21009 
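
NOTE: a tiny sketch that reports the main numbers in MB instead of bytes. memMB is a made-up name. it only relies on .Q.w[] returning the dictionary shown above.

```q
// hypothetical helper: used / heap / peak in MB (as floats)
memMB:{`used`heap`peak!(.Q.w[]`used`heap`peak)%1024*1024}
memMB[]
```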
 
 
(at startup) 
$ q -w 12345      // this sets max mem 12345 MB,  i.e. approx 12GB 
q)                // otherwise it sets -w 0  by default, which means no limit 
q) do something mem intensive 
-w abort          // the process dies here
$                 // if you specified -w then it dies with "-w abort", otherwise it fails with 'wsfull
 
// mem intensive operation example
q)str:"kenics"
q)f:{x,x}          // doubles its input.  f/[n;x] applies f to x n times
q)f/[0; str]
"kenics" 
q)f/[1; str]       // keep increasing this then you soon explode 
"kenicskenics" 
q)f/[2; str] 
"kenicskenicskenicskenics" 
q)f/[3; str] 
"kenicskenicskenicskenicskenicskenicskenicskenics" 
q)count a : f/[30; str] 
'wsfull 
 
 

#   slaves 

 
q)\s            # number of slaves 
0i              # you MUST set this at the start up, e.g.  q -s 10 
 
q -s N      # if N is a positive int, then it just starts N "threads"  (which q will use whenever it can to optimize) 
            # if N is a negative int, then "peach" and each-parallel "':" will use .z.pd, which gives an integer list of handles to processes (aka workers/slaves, but you must start those worker/slave procs separately)
 
https://code.kx.com/q/ref/dotz/#zpd-peach-handles 
https://code.kx.com/q/cookbook/load-balancing/ 
 
 
 
q)\l /path/to/afile      # recall with get hsym `$"/path/to/afile" you needed parenthesis around the path and conversion to symbol 
                         # but with \l, you can supply a raw path. 
                         # in this case, assume you have a table named "afile" 
                         # if you specify a directory, it can get complicated. it might potentially load all files under a dir recursively, and crash file system, so be careful. 
                         (ref) http://code.kx.com/q4m3/13_Commands_and_System_Variables/#13111-load-l 
 
 
NOTE: to run q commands in a script, use a builtin func "system" so you don't have to escape the backslash 
e.g. 
 
q)system "p 5042"            // notice the syntax. when invoking q cmd within system[], no need for backslash 
q)system "c 1000 1000"       // also, system checks if the supplied cmd exists at all. 
 
// note some param you can specify at startup 
e.g. 
 
> q -p 5042       // same as system "p 5042" later 
 
 
q).z.zd:17 2 6      // enabled file compression.  (17;2;6) is the standard zip 
q)\x .z.zd          // reset 
q).z.zd 
'.z.zd 
 
 
NOTE: when you splay a table, symbol columns get enumerated and you get the sym file, which shouldn't be compressed. 
      it just causes issues apparently. (similarly, .d file shouldn't be compressed apparently) 
 
one way to solve it is simply load and save every such sym file like below. 
e.g. 
 
q) {x!{x set get x} each x} hsymList          // how to uncompress 
 

#  timer  \t    // overloaded for two diff use cases 

 
// 
//  case 1 
// 
 
q)\t 1234     // a positive integer N, then this calls whatever is defined in .z.ts every N milliseconds 
              // you can specify this at the startup, e.g.  q -t 1234 
              // set to 0 to disable 
// 
//  case 2 
// 
 
q)\t sum til 10000       # measures how many milliseconds to sum 0 to 9999 
0                        # zero millisec. you wanna repeat it to get some number 
 
q)\t:1000 sum til 100000    # \t:x  lets you run the subsequent command x times 
471                         # 471 milliseconds 
 
note: you can use \ts exactly the same way. 
 
e.g. 
 
q)\ts:1000 sum til 100000 
435 1048816                    // 435 msec, and 1MB 
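
NOTE: a minimal sketch for case 1 (the .z.ts timer). the callback body here is just an illustration. q passes a timestamp to .z.ts on every tick.

```q
.z.ts:{[ts] -1 "tick at ",string ts;}    // callback, invoked on every timer tick
system "t 1000"                          // fire every 1000 milliseconds  (same as \t 1000)
// ... let it run for a while, then
system "t 0"                             // 0 disables the timer again
```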
 
## 
##  running OS commands within q session 
## 
 
like you can do `ls -la /tmp` in Perl. 
 
there are two ways in q. 
 
q)\ls -la /tmp 
q)system "ls -la /tmp" 
 
system[x] is smart enough to know if you are running Q commands (tried first) or OS commands in x 
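
NOTE: system returns the command's output as a list of strings, so you can capture it. a failing OS command signals an error, which you can trap. safeSys below is a hypothetical wrapper name.

```q
out:system "echo hello"       // one string per line of output
out                           // ,"hello"

// hypothetical wrapper: returns () instead of an error when the OS command fails
safeSys:{[cmd] @[system;cmd;{[e] -2 "os command failed: ",e; ()}]}
safeSys "ls /no/such/dir"
```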
 
 
## 
##  debug 
## 
 
q)f:{[x] foo:x ; break ; foo*foo }     # suppose you execute something, and it gets an error, and goes into a debug mode 
q)f[12]                                # where you can inspect intermediate variables 
'break 
q)) 
q))x            # like this 
12 
q))\            # a single backslash terminates the debug session
q)x 
'x 
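
NOTE: when you don't want to drop into the q)) debugger at all (e.g. in a server process), protected evaluation traps the error and hands the message to a handler. a minimal sketch using the same f as above:

```q
f:{[x] foo:x; break; foo*foo}             // "break" is undefined, so f signals 'break
@[f;12;{[e] "caught: ",e}]                // "caught: break"      @ is for one argument
.[{x%y};(1;`a);{[e] "caught: ",e}]        // "caught: type"       . is for an argument list
```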
 
 
### 
###   command line options 
### 
 
https://code.kx.com/wiki/Reference/Cmdline 
 
many of them we already studied here. 
 
-g 1 or 0    // 1 means run .Q.gc[] immediately whenever appropriate. 
             // 0 means only runs .Q.gc[] when user invokes it or malloc fails 
-s 5         // start 5 slaves 
-P 6         // float precision 
-w 12345     // workspace mem limit.  12345 MB = 12GB 
-b           // block client write access 
-c 200 300   // display 200 rows & 300 columns 
-t 123       // invokes .z.ts  every 123 milliseconds (default is 0, i.e. not invoking .z.ts[] at all) 
-p 5042      // listens on port 5042   (i.e. runs as server. and other kdb procs can access this server proc at port 5042. note a port number cannot exceed 65535)
 
 
### 
###  system variables  (.z.* namespace) 
### 
 
http://code.kx.com/q/ref/dotz/ 
 
- variables for "info" like .z.u is your user id 
- variables for "callbacks" like .z.exit is invoked whenever q session exits
 
q).z.d          # GMT date 
2018.02.24 
q).z.D          # local date      (obviously, used a lot) 
2018.02.24 
 
q).z.t          # GMT time     // 'time' data type 
05:30:44.427 
q).z.T          # local time 
00:30:45.387 
 
q).z.z                    # GMT datetime         (recall datetime is deprecated) 
2018.07.30T00:51:47.514 
q).z.Z                    # local datetime 
2018.07.29T20:51:52.635 
 
q).z.p                            # GMT timestamp    (use timestamp, instead of datetime) 
2018.12.12D00:19:11.181399000 
q).z.P                            # local timestamp 
2018.12.11D19:19:12.829446000 
 
 
 
$ q /some/path/startup.q        # suppose you start up a q session with startup.q 
q).z.f                          # this shows the symbol name of the startup file/script 
`/some/path/startup.q 
 
$ q startup.q -s 10 -abc 34 foobar 3.14 -g 1 
q).z.f 
`startup.q 
q).z.x                            # command line args in string list 
("-abc";"34";"foobar";"3.14")     # so you need to parse them as needed. use .Q.opt[.z.x] 
                                  # notice options recognized by q (like -s 10 -g 1 in this case) are not included 
q).z.i      # PID 
7697i 
 
q).z.h                       # hostname 
`kenics-macbook-pro.local 
 
q).z.K      # q version 
3.5 
 
q).z.u      # user 
`kenics 
 
q){.z.s}[]     # self-reference function, used to implement recursive calls. 
{.z.s} 
 
e.g. 
 
q)fact:{$[x<=0;1;x*.z.s x-1]}      // factorial. note .z.s[] can blow up stack quickly 
q)fact[5]                          // usually you can implement better recursive call with over "/" 
120 
 
q).z.a        // ip addr in int 
169685442i 
q)"." sv string `short$ 0x0 vs .z.a     // ip addr in the usual format 
"10.29.49.194" 
 
q).z.ts:{show x}     # x is .z.P 
 
q).z.zd:17 2 6       # compression param. "zd" = zip default.  (17;2;6) is the standard zip 
                     # 17 = logical block size, 2 = compression algo, 6 = compression level 
 
NOTE: compressed files can be read the same way as uncompressed files. 
NOTE: to uncompress a file, simply do this 
     q)fh:hsym `$"/path/to/afile" 
     q)system "x .z.zd" 
     q)fh set get fh 
 
NOTE: should we always compress because it saves disk space ? 
      ==> no, because you don't wanna save text file compressed. 
      ==> also, sym file and .d file shall not be compressed. (causes error. hopefully this restriction goes away in a future version of q) 
 
NOTE: in terms of computing speed, (de)compression itself can add extra processing time when saving to (and reading from) disk. 
      but if your file system is some remote NFS, then because compressed file size is smaller, your I/O time is reduced, and you may end up saving more time even considering the extra time caused by (de)compression itself. 
 
## 
##  how to tell if a file is compressed. 
## 
 
q)fh : hsym `$"/tmp/compressedFile" 
q)-21! fh                            //    -21!  is the internal function that checks the status of compression 
compressedLength  | 137349 
uncompressedLength| 80000016 
algorithm         | 2i 
logicalBlockSize  | 17i 
zipLevel          | 6i 
 
q)-21! hsym `$"/tmp/unCompressedFile"        //  -21!  returns nothing if a file is uncompressed 
q) 
q)count -21!`:test       // so this is a neat way to check 

 
(ref) http://code.kx.com/q/cookbook/file-compression/ 
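
NOTE: a small sketch of the "count -21!" check as a reusable function. isCompressed and the /tmp/gr_* file names are made up. it assumes .z.zd is NOT set in the session, and uses the per-file (fh;lbs;algo;level) form of set for the compressed one.

```q
// hypothetical helper: 1b if -21! reports compression stats for the file
isCompressed:{[fh] 0<count -21!fh}

`:/tmp/gr_plain set til 1000                    // plain save
(`:/tmp/gr_zipped;17;2;6) set til 100000        // compressed save, same params as .z.zd:17 2 6

isCompressed `:/tmp/gr_plain
isCompressed `:/tmp/gr_zipped
```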
 
## 
##  saving a file whose name contains a dot 
## 
 
as of q3.5 
 
q).z.zd:17 2 6 
q)(hsym `$"/tmp/foo.db") set t     // this doesn't compress, because the file name contains a dot
`:/tmp/foo.db                      // regardless of the suffix. it's the dot. 
 
q)((hsym `$"/tmp/foo.db"),.z.zd) set t     // now it is compressed 
`:/tmp/foo.db                              // BUT be careful, this errors if .z.zd is unset 
 
==> there are many ways to go about this. 
    - save without a dot, then rename the file. 
    - maybe have a .my.save[path;t] that checks .z.zd and handles it accordingly 
e.g. 
 
.my.save:{[filepath;t]
 fh:hsym `$filepath;
 $[() ~ key `.z.zd;        // a common way to test whether .z.zd is set
   fh set t;
   (fh,.z.zd) set t]       // no trailing ";" before "]" or $[] ends up with an even number of args
 }
 
NOTE: why no compression if a file name contains a dot "." ?
      it's likely so that the .d file in a splayed table does not get compressed
 
 

#   exit callback/status 

 
q).z.exit:{show "exiting with code ",string x}       # this executes when q exits 
                                                     # usually you wanna set some "free" routines as needed 
                                                     # maybe you launched slave procs or occupied some port, etc 
q) exit 34
"exiting with code 34"

or

q)\\          # another way to exit
"exiting with code 0"
 
 
 
### 
###  web markup / html / http functions (.h.* namespace) 
### 
 
http://code.kx.com/q/ref/doth/ 
 
 
################################################### 
###     utility functions (.Q.* namespace)      ### 
################################################### 
 
http://code.kx.com/q/ref/dotq/ 
 
### 
###  .Q.an  .Q.a  .Q.A  .Q.n
### 
 
q).Q.an 
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_0123456789" 
q).Q.a 
"abcdefghijklmnopqrstuvwxyz" 
q).Q.A 
"ABCDEFGHIJKLMNOPQRSTUVWXYZ" 
q).Q.n 
"0123456789" 
 
### 
###  .Q.addmonths[x;y] 
### 
 
q).Q.addmonths[2018.01.16; 7]     // literally adds months 
2018.08.16 
q).Q.addmonths[2018.01.29; 2] 
2018.03.29 
q).Q.addmonths[2018.01.29; 1]     // notice it can overflow to next month like this 
2018.03.01 
q).Q.addmonths[2018.01.31; 1]     // another overflow example 
2018.03.03 
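
NOTE: if you'd rather clamp to the last day of the target month than overflow, here is a minimal sketch. addMonthsClamped is a made-up name, not part of .Q.

```q
// hypothetical alternative to .Q.addmonths: never spills into the following month
addMonthsClamped:{[d;n]
  m:n+`month$d;                  // target month
  firstDay:`date$m;              // first day of the target month
  lastDay:-1+`date$m+1;          // last day of the target month
  lastDay&firstDay+d-`date$`month$d}

addMonthsClamped[2018.01.16;7]   // 2018.08.16
addMonthsClamped[2018.01.31;1]   // 2018.02.28
```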
 
### 
###  .Q.s[]     // format to string so you can print to, for example, a log file. 
### 
 
e.g. 
 
q)t 
name val 
-------- 
a    12 
b    34 
c    56 
 
q).Q.s[t] 
"name val\n--------\na    12 \nb    34 \nc    56 \n" 
 
 
### 
###  .Q.en[]   .Q.ens[] 
### 
 
see splay table section 
 
### 
###  .Q.gc[]       // garbage collection 
### 
 
runs the garbage collector, and returns how many bytes were freed (returned to the OS)
 
q).Q.gc[] 

 
q)a:til 12345678 
 
q)delete a from `. 
`. 
 
q).Q.gc[] 
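134217728    // 2^27 bytes = 128 MiB (approx 134 MB). q allocates in power-of-2 blocks, so the approx 99 MB vector sat in a 128 MiB block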
 
 
### 
###  .Q.dpft[d;p;f;t] 
### 
                                                          #  e.g.
d = hsym of the db root dir (the sym file goes here)      #  `:/tmp
p = partition value.                                      #  2018.12.15
f = a column to sort by (it also gets the `p# attribute)  #  `ticker
t = table name as a symbol (must be a global table)       #  `tradeTbl
 
NOTE: the problem is it runs .Q.en[] which assumes your enumeration sym file name is "sym" 
      so if you want other name for your sym file, then you cannot use .Q.dpft[] 
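
NOTE: a minimal sketch of a .Q.dpft call. the table contents and the root dir /tmp/gr_db are made up. notice the table is passed by name (a symbol), so it has to be a global.

```q
tradeTbl:([] ticker:`ibm`aapl`ibm; price:12.3 45.6 12.4)
.Q.dpft[`:/tmp/gr_db; 2018.12.15; `ticker; `tradeTbl]    // writes /tmp/gr_db/2018.12.15/tradeTbl/ and /tmp/gr_db/sym

system "l /tmp/gr_db"                                    // map the db
select from tradeTbl where date=2018.12.15               // rows come back sorted by ticker
```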
 
### 
###  .Q.id[x] 
### 
 
- x is an atom (usually a symbol) 
- returns alphanumeric (and underscore _) of x 
 
the idea is x is some data you want to use as "id" like a column name. 
if it starts with a number or non-alphanumeric characters, or contains q operator words, then it causes problems when you try to query it later, using q-sql. 
so .Q.id[] transforms it by removing q operator words, and prepending alphabets or appending numbers 
 
e.g. 
q).Q.id `$"Ibm/Msft_Aapl.Nflx"   // removed non-alphanumeric 
`IbmMsft_AaplNflx 
 
q).Q.id each `ibm.n`aapl.q      // removed a dot 
`ibmn`aaplq 
 
q).Q.id 2009.11.23 
`a20091123                  // removed the dots, and prepended `a
 
q)`$1_string .Q.id 2009.11.23     // maybe we can do this 
`20091123 
 
q)show t:flip `div`sum!(12 34; 56 78) 
div sum                                  // these col names are legal, but q-sql cannot work on these 
------- 
12  56 
34  78 
 
q).Q.id t 
div1 sum1              // neat 
--------- 
12   56 
34   78 
 
 
### 
###  .Q.ind[] 
### 
 
### 
###  .Q.fs[] 
### 
 
### 
###  .Q.view[] 
### 
 
### 
###  .Q.ty[x] 
### 
 
type[] with string output 
 
q)type 1.2 
-9h 
q)type 1.2 3.4 
9h 
 
q)type "a" 
-10h 
q)type "abc" 
10h 
 
q).Q.ty 1.2 
"F" 
q).Q.ty 1.2 3.4 
"f" 
 
q).Q.ty "a" 
"C" 
q).Q.ty "abc" 
"c" 
 
### 
###  .Q.opt[.z.x]  and  .Q.def[x;y]     // "def" for default, not definition 
### 
 
$ q -p 56437 -s 10 -ymd 2018.04.25 -exch nyse nsdq -bound 100 -offset 123 -g 1 
q).z.x 
"-ymd" 
"2018.04.25"     // so .z.x is a list of strings (not including native options like -s 10 -g 1) 
"-exch"          // .Q.opt[] takes care of parsing as below 
"nyse" 
"nsdq" 
"-bound" 
"100" 
"-offset" 
"123" 
 
q).Q.opt .z.x 
ymd   | ,"2018.04.25"      // notice it is now dictionary 
exch  | ("nyse";"nsdq")    // but value is still all string list 
bound | ,"100" 
offset| ,"123" 
 
q).Q.def[`ymd`exch`bound`offset`compress!(2000.01.01;`nyse`nsdq`tse;75f;456j;0b)] .Q.opt .z.x 
ymd     | 2018.04.25 
exch    | `nyse`nsdq 
bound   | 100f 
offset  | 123 
compress| 0b 
 
===> notice  .Q.def[] - lets you define default data type & value for each input option. 
                      - sets default values for options not supplied. 
                      - handles single vs multiple inputs cleanly 
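
NOTE: a self-contained sketch you can paste without restarting q. argv is a stand-in for .z.x, and the defaults dictionary is made up along the lines of the example above.

```q
argv:("-ymd";"2018.04.25";"-exch";"nyse";"nsdq";"-bound";"100")    // pretend this is .z.x
defaults:`ymd`exch`bound`offset!(2000.01.01;`nyse`nsdq`tse;75f;456)
args:.Q.def[defaults] .Q.opt argv

args`ymd        // 2018.04.25   parsed as a date, because the default is a date
args`bound      // 100f
args`offset     // 456          not supplied, so the default is used
```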
 
### 
###  .Q.fu[f;x]     // x is a list of data 
### 
 
suppose you need to do   f each x   when f is expensive and x has repeats. 
then .Q.fu[f;x] basically does memoization to speed things up. 
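
NOTE: the idea can be sketched in one line: apply f to the distinct values only, then map the results back to the original positions. this is my own illustration (the name fu is made up), not necessarily how .Q.fu is implemented.

```q
// hypothetical re-implementation of the idea behind .Q.fu
fu:{[f;x] u:distinct x; (f each u) u?x}

fu[{x*x};1 2 1 3]                               // 1 4 1 9
v:100000?10
fu[{log x};v] ~ {log x} each v                  // 1b
```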
 
q)show a:100000 ? 10 
8 1 9 5 4 6 6 1 8 5 4 9 2 7 0 1 9 2 1 8 8 1 7 2 4 5 4 2 7 8 5 6 4 1 3 3 7 8 2 1 4 2 8 0 5 8 5 2... 
 
q)f:{log x} 
q)\t f each a       // 66 millisec 
66 
q)\t .Q.fu[f] a     // 1 millisec     // yes commonly written as .Q.fu[f] x
1
 
 
### 
###  .Q.fc[f;x]      // x is a list of data 
### 
 
just runs  f each x   using threads 
 
q)\s        // started q with -s 10 option 
10i 
q)f:{2 xexp x} 
 
q)\t f each til 123456 
111 
 
q)\t .Q.fc[f] til 123456 
13 
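
NOTE: a rough sketch of the idea: cut x into one chunk per thread, run f on each chunk with peach, and glue the results back. fc here is a made-up name and only an approximation of what .Q.fc does. it assumes f works on a whole vector at once.

```q
// hypothetical re-implementation of the idea behind .Q.fc
fc:{[f;x]
  n:1|system "s";                 // number of threads, at least 1
  sz:1|ceiling count[x]%n;        // chunk length
  raze f peach sz cut x}

fc[{2 xexp x};til 123456] ~ {2 xexp x} each til 123456     // 1b
```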
 
 
### 
###  .Q.dd[x;y] 
### 
 
.Q.dd:{` sv x,`$string y} 
 
recall the power of (` sv) x 
 
q)` sv `ibm`msft`aapl      // if x = a symbol list, then concatenate with a dot "." 
`ibm.msft.aapl 
 
q)` sv `:/tmp`ibm`msft`aapl    // if first x is a file or dir handle, then concatenates with a slash "/" 
`:/tmp/ibm/msft/aapl 
 
q).Q.dd[`:/tmp]`foo.txt 
`:/tmp/foo.txt 
 
q){x .Q.dd'key x} `:/tmp        // recall each-both ' lets you use a function infix way 
`:/tmp/foo.txt`:/tmp/bar.csv`:/tmp/baz.xlsx 
 
q).Q.dd[`ibm]`n 
`ibm.n 
q).Q.dd[`ibm]"n"      // why does this work ? 
`ibm.n 
q)`q = `$ string "q"      // recall this actually holds true 
1b 
 
q)`ibm`ms`gs`bac .Q.dd' `n 
`ibm.n`ms.n`gs.n`bac.n 
 
 
### 
###  .Q.w[] 
### 
 
a human friendly version of \w 
 
q).Q.w[] 
used| 290096 
heap| 67108864 
peak| 67108864 
wmax| 0 
mmap| 0 
mphy| 8589934592 
syms| 637 
symw| 21009
 
 
###########  partition table related .Q.*   ############ 
 
### 
###  .Q.chk[] 
### 
 
see partition table section 
 
### 
###  .Q.qp[t] 
### 
 
.Q.qp[t] returns 1b   if t = partitioned table 
                 0b   if t = splayed table
                 0    otherwise 
 
### 
###  .Q.pt[] 
### 
 
returns a list of partitioned tables. 
e.g. 
if[`foo in .Q.pt[]; t:select from foo where date = ymd]; 
 
### 
###  .Q.pv[] 
### 
 
 
 
############################# 
####    Intro to kdb+    #### 
############################# 
 
kdb+ : q tables being persisted aka "serialized" (being saved to disk and then mapped back into memory) 
 
RDB: realtime DB (in memory. no symbol column enumeration) 
HDB: historical DB (on disk, data being read on demand. normally partitioned. symbol columns are enumerated) 
 
q)([] s:`a`b`c; v:100 200 300)    // same as   flip `s`v!(`a`b`c;100 200 300) 
s v 
----- 
a 100 
b 200 
c 300 
 
q)show t:([] s:`symbol$(); v:`int$())     // defining a schema. NOTE this means each column must be a simple list 
s v 
--- 
 
q)meta t 
c| t f a 
-| ----- 
s| s 
v| i 
 
q)([id:1001 1002 1003] s:`a`b`c; v:100 200 300)           // same as  ([] id:1001 1002 1003)!([] s:`a`b`c; v:100 200 300) 
 id | s v 
----| ----- 
1001| a 100 
1002| b 200 
1003| c 300 
 
 
#### 
####   splayed table   ---   revisited 
#### 
 
a splayed table == a table saved to disk by individual columns, instead of a single big file. 
we splay a table when there are many columns or a table is huge and cannot fit into a single file. 
the performance benefit is that when you map a splayed table and load certain columns, you only load the selected columns, which is efficient. 
 
[restrictions] 
- cannot splay keyed table (not a big deal, you can still effectively key on a foreign column via "link columns") 
- each column must be a simple list or a compound list (i.e a list of simple lists of uniform type) 
- symbol columns MUST be enumerated. (fret not as .Q.en[] takes care of it easily) 
 
e.g. 
q)t:([] c1:`a`b`c; c2:10 20 30; c3:`ibm`msft`aapl) 
q)`:/tmp/tsplay/ set .Q.en[ `:/tmp ; t ]     # NOTE: .Q.en[arg1;arg2] 
`:/tmp/tsplay/                               #       1st arg is a file handle to the dir. (don't put / at the end) 
                                             #       2nd arg is the table.  i.e.  .Q.en[file_hdl_to_dir; table]
                                             # NOTE: it's very common to write  .Q.en[h;] t  or  .Q.en[h] t  instead of  .Q.en[h;t] 
q)system "ls /tmp" 
"sym"       # this is a binary file 
"tsplay"    # this is a dir that contains .d c1 c2 c3 
 
NOTE: .Q.en[] will lock any existing sym file, load/update it, unlock it, and save back to disk. 
      (also it is smart enough to not compress sym file and .d file even if .z.zd is set) 
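
NOTE: a small round trip to see what ends up on disk. the root dir /tmp/gr_root is made up so it doesn't touch your real /tmp/sym.

```q
t:([] c1:`a`b`c; c2:10 20 30; c3:`ibm`msft`aapl)
`:/tmp/gr_root/tsplay/ set .Q.en[`:/tmp/gr_root] t

get `:/tmp/gr_root/sym            // the enumeration domain, contains `a`b`c`ibm`msft`aapl
get `:/tmp/gr_root/tsplay/.d      // `c1`c2`c3   (column order)
get `:/tmp/gr_root/tsplay/c2      // 10 20 30    (a plain column file)
```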
 
## 
##  mapping / loading splayed tables 
## 
 
q)\l /tmp             // option 1   (load both the sym file and map tsplay table) 
 
$q /tmp/tsplay        // option 2   (only maps tsplay table, no sym file loaded) 
 
q)\l /tmp/tsplay      // option 3   (only maps tsplay table, no sym file loaded) 
 
NOTE: at this point, you only "mapped", and haven't actually loaded any data into memory.
NOTE: option 1 will try to load everything in /tmp so if that directory has other garbage, then you may get an error. beware.
NOTE: option 2 & 3 don't load the "sym" file automatically - make sure you load it, like below
 
q)sym:get hsym `$"/tmp/sym" 
q)sym 
`a`b`c`ibm`msft`aapl
 
q)select c1,c3 from tsplay   // recall this only actually "loads" c1 & c3 from disk, not c2
                             // hence the performance of splayed table 
 
q)t: get hsym `$"/tmp/tsplay/"     // option 4   (only maps, no load to mem yet) 
q)t: get hsym `$"/tmp/tsplay"      // option 5   (maps & loads to mem.) 
                                   // you have to load the sym file separately, like above, for both option 4 & 5
 
### 
###  manipulating splayed tables 
### 
 
- you can only use {select,exec,upsert,xasc,`p#,`g#,`s#} directly on a splayed table file handle like `:/tmp/tsplay
  i.e. you cannot use "update" 
  suppose you want to modify or add records. ideally you can just regenerate the whole thing and save to disk, overwriting the old files. 
  but if you need to add modification to the existing files, see below. 
 
 (1) upsert 
 
  `:/tmp/tsplay upsert .Q.en[`:/tmp; ([] c1:`q`w; c2:40 50; c3:`e`r)]     // adding two more rows (meta must match) 
 
 (2) modifying existing splayed table 
 
 you could just open up a column file and modify its values, get the index based on other corresponding columns so on. but be very careful. 
 
 (3) updating schema  -- this is a realistic example. you may just want to add a new column (or delete a column) 
 
   // lets add a new column c4 
   q)`:/tmp/tsplay/c4 set (count get `:/tmp/tsplay/c1)#0N      // use the null of the data type you wish, e.g. 0b, 0N, 0n  (a symbol column would have to be enumerated against sym first)
   q)`:/tmp/tsplay/.d set get[`:/tmp/tsplay/.d] , `c4          // add the column name in .d file 
 
   // lets delete an existing column c1 
   q)system "rm /tmp/tsplay/c1"        // delete the sharp file too if present 
   q)`:/tmp/tsplay/.d set get[`:/tmp/tsplay/.d] except `c1 
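
NOTE: the "add a column" steps can be wrapped up like this. addCol and the dir /tmp/gr_root2 are made-up names. tdir is the splayed table dir handle WITHOUT the trailing slash, and dflt should be a non-symbol null.

```q
// hypothetical helper: add column cname, filled with dflt, to the splayed table at tdir
addCol:{[tdir;cname;dflt]
  d:` sv tdir,`.d;                        // handle of the .d file
  n:count get ` sv tdir,first get d;      // row count, taken from the first column
  (` sv tdir,cname) set n#dflt;           // write the new column file
  d set get[d],cname}                     // register the column name in .d

`:/tmp/gr_root2/t/ set ([] c1:1 2 3; c2:10 20 30)
addCol[`:/tmp/gr_root2/t;`c4;0N]
cols get `:/tmp/gr_root2/t/               // `c1`c2`c4
```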
 
### 
###  manipulating sym file    (serious business - if you corrupt sym file, then data is useless. take a complete root dir backup)
###                                               or save a sym file backup before any change, and make changes in a reversible way if possible
 
you may want to edit "sym" file when for example 
- (1) you want to merge two sym files as you merge two splayed tables from diff root dirs. 
- (2) you accidentally made a column symbol type when it should've been string, and it got enumerated. 
 
## 
##  case (1) you want to merge two sym files as you merge two splayed tables from diff root dirs. 
## 
 
q)`:/tmp/foo/ts1/ set .Q.en[`:/tmp/foo;] ([] ticker:`ibm`msft`aapl; price:12.34 34.56 78.90) 
`:/tmp/foo/ts1/ 
 
q)`:/tmp/bar/ts2/ set .Q.en[`:/tmp/bar;] ([] name:`amzn`msft`aapl; vol:111 222 333) 
`:/tmp/bar/ts2/ 
 
==> let's merge /tmp/bar/sym into /tmp/foo/sym  (i.e. move ts2 into /tmp/foo root dir) 
    all we have to do is "de-enumerate" ts2, and splay at /tmp/foo
 
q)get `:/tmp/foo/sym      // before merge 
`ibm`msft`aapl 
 
q)-3! tbl: get hsym `$"/tmp/bar/ts2"
"+`name`vol!(`sym$`amzn`msft`aapl;111 222 333)"      // notice how the "name" column is enumerated

q)-3! update name:value name from tbl                // just de-enumerate with value[]
"+`name`vol!(`amzn`msft`aapl;111 222 333)"
 
 
==> we can generalize the above step. 
 
q)meta tbl 
c   | t f a 
----| ----- 
name| s 
vol | j 
 
q)show cols2value : exec c from meta[tbl] where t = "s" 
,`name 
 
q)f:{[t;colName] update colName:value colName from t}    // we must write this in functional form, because colName is parameterized 
 
q)parse "update colName:value colName from t"
!
`t 
() 
0b 
(,`colName)!,(.:;`colName) 
 
q)f:{[t;colName] ![t; (); 0b; enlist[colName]!enlist(value;colName)]} 
 
q)show ts2NEW : f/[tbl;cols2value] 
name vol 
--------                // did this really de-enumerate ? 
amzn 111 
msft 222 
aapl 333 
 
q)-3! ts2NEW 
"+`name`vol!(`amzn`msft`aapl;111 222 333)"         //  yes it did 
 
q)`:/tmp/foo/ts2/ set .Q.en[`:/tmp/foo; ts2NEW]    // don't mess this up, double check the table name and root dir 
`:/tmp/foo/ts2/ 
 
q)get `:/tmp/foo/sym          // yes, sym file merged 
`ibm`msft`aapl`amzn 
 
====>  with general apply @[] you can write this even simpler. 
 
e.g. 
q)tbl: get hsym `$"/tmp/bar/ts2" 
q)cols2value : exec c from meta[tbl] where t = "s" 
q)`:/tmp/foo/ts2/ set .Q.en[`:/tmp/foo] @[tbl; cols2value; value]       // appreciate the power of @[] 
 
 
 
## 
##  (2) you accidentally made a column symbol type when it should've been string, and it got enumerated. 
## 
 
- this is more of a surgical work. 
 
q)`:/tmp/foo/ts1/ set .Q.en[`:/tmp/foo] ([] ticker:`msft`aapl`goog; vol:12 34 56) 
`:/tmp/foo/ts1/
q)`:/tmp/foo/ts2/ set .Q.en[`:/tmp/foo] ([] name:`amzn`aapl`ibm; news:`$("abcd";"xyz";"qwerty"))
`:/tmp/foo/ts2/
 
q)\\ 
$ q 
 
q)show sym: get hsym `$"/tmp/foo/sym"            // suppose news column should've been string 
`msft`aapl`goog`amzn`ibm`abcd`xyz`qwerty 
 
q)system "l /tmp/foo"     // maps ts1 & ts2, and sym 
 
q)show ts2 : select from `:/tmp/foo/ts2 
name news 
----------- 
amzn abcd 
aapl xyz 
ibm  qwerty 
 
q)-3! ts2
"+`name`news!(`sym$`amzn`aapl`ibm;`sym$`abcd`xyz`qwerty)"
 
q)show syms2remove : value exec news from ts2 
`abcd`xyz`qwerty 
 
NOTE: in reality, the news column may contain a string whose symbol representation is also a valid value in other columns/tables. if so, you must get the distinct symbol list from the rest of your columns & other tables. so you probably want a whitelist approach rather than a blacklist. if you mess this up, your entire data can get corrupted. (to be precise, the columns that reference the removed symbols in the sym file can no longer resolve them)
 
q)show ts2: update news:string news from ts2 
name news 
------------- 
amzn "abcd" 
aapl "xyz" 
ibm  "qwerty" 
 
q)show sym : sym except syms2remove 
`msft`aapl`goog`amzn`ibm 
 
--> now de-enumerate and re-enumerate the columns of ALL tables in /tmp/foo 
 
q)show tbls : system "a"     // you may have loaded tables from other root dir, then just remove them manually 
`ts1`ts2 
 
q)deenum:{[tbl] @[select from tbl; exec c from meta[tbl] where t="s";value]}    // tbl is passed by name, not by value, thus "select from tbl" is needed 
q)reenum:{[tbl] @[tbl; exec c from meta[tbl] where t="s";`sym?]}                // see this explicit enum using  `sym? 
q)f:{[tbl] (hsym `$"/tmp/foo/",string[tbl],"/") set reenum deenum tbl} 
q)f each tbls 
`:/tmp/foo/ts1/`:/tmp/foo/ts2/
 
`:/tmp/foo/sym set sym         // one last step of the surgery 
 
deenum:{[tbl] @[tbl; where (type each flip tbl) within 20 76; value]}     // deenum alternative implementation 
                                                                          // tbl is passed by value 
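
NOTE: since a broken sym file makes the data useless, it is cheap insurance to copy it before any surgery. a minimal sketch. backupSym, the /tmp/gr_root3 dir and the backup file name pattern are all made up.

```q
// hypothetical helper: copy <root>/sym to <root>/sym.bak.<today>
backupSym:{[root]
  src:` sv root,`sym;
  dst:` sv root,`$"sym.bak.",string .z.D;     // e.g. sym.bak.2018.04.01
  dst set get src}

`:/tmp/gr_root3/sym set `msft`aapl`goog      // stand-in for a real sym file
bak:backupSym `:/tmp/gr_root3
get[bak] ~ get `:/tmp/gr_root3/sym           // 1b
```

to roll back, set the backup contents onto the sym file again (only safe if nothing was re-enumerated in between).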
 
### 
###   splayed tables with linked columns 
### 
 
- while this is neat and useful, it is easy to simply save two tables and join them later as needed. 
 
 
 
############################### 
###    Partitioned Table    ### 
############################### 
 
splayed table was essentially a "vertical" (or "column") split, but if the data is huge, you may also need a further "horizontal" (or "row") split aka "partition" by a certain column.
 
e.g. 
 
a very typical use is market data history, where you partition by date, and splay each column (timestamp, ticker, ask, bid, etc) 
 
- every partitioned table is splayed  (not vice versa)
- you can partition by date, month, year or int, i.e. the partition column is one of `date`month`year`int
- each partition dir is named by the value of the column you partition by. 
   e.g. if you partition by date, then each partition dir name will be yyyy.mm.dd which will have a splayed table in it for that day. 
   e.g. 
 
 /mktdata               # root dir, this is where the sym file resides 
    /sym 
    /2018.04.01         # partition value 1 
         /tblName 
              /.d 
              /colName1          # NOTE: schema must match across partitions 
              /colName2          #       i.e. identical .d and columns 
              /colName3 
              ... 
    /2018.04.02         # partition value 2 
         /tblName 
              /.d 
              /colName1         # NOTE: your .d and colNameN CANNOT contain the column you partitioned by 
              /colName2         #       i.e. dont include your date column in the splayed table 
              /colName3         #            it's just redundant, and syntactically not allowed 
              ... 
    ... 
 
 
NOTE:  q/kdb will magically decide the name of the column you partitioned by, and append when you query. 
       so you just have to memorize - luckily it is intuitive. 
       2018.04.01   --> "date" 
       2018.04m     --> "month" 
       2018         --> "year" 
       0,1,2,3      --> "int" 
 
q)system "l /mktdata" 
q)t : select from tblName where date = 2018.04.02       // recall q decided the name "date" for yyyy.mm.dd
 
 
NOTE: as above, you cannot mix diff granularity of partition. 
      but it is ok to skip some values. e.g. you can have yyyy.mm.dd only for weekdays, skipping holidays, etc 
 
NOTE: it is a perfectly legit usage pattern to have multiple tables under each partition
 
 /nyse 
   /sym 
   /2018.04.01 
      /trade 
      /quote 
   /2018.04.02           # NOTE: every table must be in every partition
      /trade             <- this partition is missing "quote" so it'll die on error when you query "quote" tbl on this date 
   /2018.04.03              solution is just create an empty table, manually or by using .Q.chk[] 
      /trade 
      /quote 
   ... 
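
NOTE: a minimal sketch of the .Q.chk[] fix for the layout above. the root dir /tmp/gr_nyse and the table contents are made up. as far as I understand, .Q.chk uses the latest partition as the template, so the latest one must have every table.

```q
`:/tmp/gr_nyse/2018.04.01/trade/ set ([] price:1.1 2.2)
`:/tmp/gr_nyse/2018.04.01/quote/ set ([] bid:1.0 2.1)
`:/tmp/gr_nyse/2018.04.02/trade/ set ([] price:3.3 4.4)      // no quote on this date
`:/tmp/gr_nyse/2018.04.03/trade/ set ([] price:5.5 6.6)
`:/tmp/gr_nyse/2018.04.03/quote/ set ([] bid:5.4 6.5)

.Q.chk `:/tmp/gr_nyse                 // creates empty tables where they are missing
key `:/tmp/gr_nyse/2018.04.02         // should now list quote as well as trade
```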
 
 
 
### 
###  how to create a partitioned table 
### 
 
a simple example below. 
 
 
q)`:/tmp/nyse/2018.04.01/pTbl/ set ([] timestamp:09:30:00 09:30:01; price:12.34 12.33) 
`:/tmp/nyse/2018.04.01/pTbl/ 
 
q)`:/tmp/nyse/2018.04.02/pTbl/ set ([] timestamp:09:30:00 09:30:01; price:12.35 12.31) 
`:/tmp/nyse/2018.04.02/pTbl/ 
 
NOTE: notice what you did above is nothing but saving splayed tables. 
      as we see below, it is when you load data using the root dir (in this case /tmp/nyse) that has partition-eligible values then q somehow magically interprets it as partitioned tables rather than a standalone splayed table. 
      in other words, it is wise to never create dirs/splayedTblName that can be interpreted as partition value unless you deliberately want it so. 
 
NOTE: recall each table you are saving cannot have a column whose value you partition this table by. 
      also, in reality, depending on your use case, the root dir and your table name may be the same (in this case "nyse" and "pTbl") 
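
NOTE: the two "set" calls above follow one pattern, so a helper saves typing. savePart and the root dir /tmp/gr_px are made-up names. .Q.en is included so that it still works once the table has symbol columns.

```q
// hypothetical helper: save table t as <root>/<dt>/<tname>/
savePart:{[root;dt;tname;t] (` sv root,(`$string dt),tname,`) set .Q.en[root] t}

savePart[`:/tmp/gr_px;2018.04.01;`pTbl] ([] timestamp:09:30:00 09:30:01; price:12.34 12.33)
savePart[`:/tmp/gr_px;2018.04.02;`pTbl] ([] timestamp:09:30:00 09:30:01; price:12.35 12.31)
key `:/tmp/gr_px          // the partition dirs (plus a sym file if any symbols were enumerated)
```

the trailing ` in the path list is what produces the trailing slash, i.e. what makes "set" splay the table.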
 
 
q)system "l /tmp/nyse"        //  or q)\l /tmp/nyse        // recall this just "maps" the tables in this root dir 
 
q)pTbl                         // actually "load" data 
date       timestamp price     // notice q automatically added "date" column 
--------------------------     // NOTE: in reality, you likely have years of yyyy.mm.dd 
2018.04.01 09:30:00  12.34     //       so calling the entire table name like this probably blows up your server mem. 
2018.04.01 09:30:01  12.33     //       make sure you specify only the date range and the columns you need
2018.04.02 09:30:00  12.35 
2018.04.02 09:30:01  12.31 
 
 
NOTE: because partitioned tables are just "mapped" until you actually load them (the subset you need), there are operations you can & cannot run on partitioned tables.
 
## 
##  legal operations on partitioned tables 
## 
 
q)meta pTbl 
c        | t f a 
---------| ----- 
date     | d 
timestamp| v 
price    | f 
 
q)cols pTbl 
`date`timestamp`price 
 
q)type pTbl 
98h 
 
q)count pTbl          // this may not work if the tbl is huge
4
 
q)select hp:max price, lp:min price by date from pTbl where date within 2018.04.01 2018.04.02 
date      | hp    lp 
----------| ----------- 
2018.04.01| 12.34 12.33 
2018.04.02| 12.35 12.31 
 
 
## 
##  illegal operations on partitioned tables 
## 
 
q)1 # pTbl 
'par 
 
q)select[1] from pTbl 
'nyi 
 
q)pTbl[0] 
'par 
 
q)pTbl`price 
'par 
 
q)exec price from pTbl where date = 2018.04.01      // ONLY "select" works on partitioned tables 
'nyi 
 
q)update price:123.45 from pTbl                // kx might implement others (exec,update,delete) in the future 
'par 
 
q)delete price from pTbl 
'par 
 
 
### 
###  querying partitioned tables  -  map/reduce paradigm 
### 
 
let's look at a realistic query.
 
q)select hp:max price, lp:min price by date from pTbl where date within 2018.04.01 2018.04.02 
date      | hp    lp 
----------| ----------- 
2018.04.01| 12.34 12.33 
2018.04.02| 12.35 12.31 
 
==> recall you must narrow down to the scope of data you need by specifying where conditions.
- put the most eliminating condition first.
   e.g.  where cnd1,cnd2,cnd3     // cnd1 should specify which partition slices you need

- what happens is q loads data from each partition slice (only the columns you specified) and combines the output.
-- if you start q with -s "slaves" (threads or processes) then the work on each slice is done concurrently. (this speeds up the query until IO becomes the bottleneck, which can be solved by "segmentation", which we study later)
 
https://code.kx.com/q/ref/cmdline/#-s-slaves 
 
// let's look at the query below.
// what happens under the covers is a map-reduce operation.
// the map part gets the sum of prices (and the count of data points) from each partition.
// the reduce part then adds up the sums from all partitions and divides by the sum of all the counts.
 
q)select avg price from pTbl where date within 2018.04.01 2018.04.02 
price 
------- 
12.3325 
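
==> a minimal sketch of that map/reduce on plain lists (no partitioned db needed). p1 and p2 are hypothetical names standing in for the price column of each partition slice. this only illustrates the idea, it is not a claim about what q literally runs internally.

```q
// per-partition price vectors (same numbers as pTbl above)
p1:12.34 12.33
p2:12.35 12.31

// map: each partition reports a (sum;count) pair
m:{(sum x;count x)} each (p1;p2)

// reduce: total of the sums divided by total of the counts
(sum m[;0]) % sum m[;1]        // 12.3325, same as the avg query
```

note that averaging the per-partition averages would only agree with this when every slice has the same row count, which is why the counts are carried along.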
 
 
NOTE: virtual index column "i" is indexed within each partition slice. 
   e.g. 
 
q)select i,date,timestamp,price from pTbl 
x date       timestamp price 
----------------------------       // .Q.ind[] gives you a way to index across partitions 
0 2018.04.01 09:30:00  12.34 
1 2018.04.01 09:30:01  12.33 
0 2018.04.02 09:30:00  12.35 
1 2018.04.02 09:30:01  12.31 
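
==> a small sketch of .Q.ind[], assuming a throwaway db root /tmp/qind_demo (hypothetical path) built the same way as pTbl above. the second argument is a list of long indices counted across all partitions.

```q
// build a tiny 2-day partitioned db under a hypothetical root
`:/tmp/qind_demo/2018.04.01/pTbl/ set ([] timestamp:09:30:00 09:30:01; price:12.34 12.33);
`:/tmp/qind_demo/2018.04.02/pTbl/ set ([] timestamp:09:30:00 09:30:01; price:12.35 12.31);
system "l /tmp/qind_demo";

// rows 1 and 2 counted across partitions,
// i.e. the last row of 2018.04.01 and the first row of 2018.04.02
.Q.ind[pTbl;1 2]
```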
 
 
### 
###   .Q.chk[] 
### 
 
recall you can have multiple tables per partition. 
 
 
 /nyse              # note this is where the sym file resides 
   /sym 
   /2018.04.01 
      /trade 
      /quote 
   /2018.04.02         # NOTE: every table must be in every partition
      /trade           <- this partition is missing "quote", so you get an error when you query the 'quote' tbl on this date
   /2018.04.03            solution: just create an empty table, manually or by using .Q.chk[]
      /trade 
      /quote 
   ... 
 
 
q)\l /nyse 
q)\a 
`trade`quote 
q)select from quote where date = 2018.04.02
'error 
 
 
===> q actually loads table names and meta from the "most recent" partition. 
     so a good practice is to have a dummy "most recent" partition where you keep empty tables as "schema" 
 
 /nyse 
   /sym 
   /2018.04.01 
      /trade 
      /quote 
   ... 
   /2018.09.15        # suppose this is current date 
      /trade 
      /quote 
   /2099.12.31        # a dummy "most recent" date that has empty tables as schema for .Q.chk[] to observe 
      /trade 
      /quote 
 
==>  now just run  .Q.chk[] on the root dir, which creates an empty tbl in every partition where the tbl is missing.
 
e.g. 
 
q).Q.chk `:/nyse 
 
 
## 
##  partition tables with symbol columns 
## 
 
nothing special. just make sure to use .Q.en[], since every slice of a partitioned table is a splayed table, i.e. you must enumerate symbol columns.
 
q)`:/nyse/2018.04.01/trade/ set .Q.en[`:/nyse] tradeTbl       // note where you must (and must not) use the trailing "/" 
                          ^                  ^ 
                       required            no slash 
 
q)`:/nyse/2018.04.01/quote/ set .Q.en[`:/nyse] quoteTbl 
 
 
===> "sym" file will be kept at the root dir  `:/nyse  in this case 
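
==> a minimal sketch of what .Q.en[] does, assuming a fresh throwaway root /tmp/en_demo (hypothetical path) and a made-up tradeTbl. if the root already has a sym file, the last line will show more symbols than listed here.

```q
// made-up table with one symbol column
tradeTbl:([] sym:`ibm`aapl`ibm; price:120.1 180.5 120.3)

// enumerate symbol columns against the sym file in the root dir (created/extended as needed)
e:.Q.en[`:/tmp/en_demo] tradeTbl
type e`sym                                  // an enumeration type, no longer a plain symbol vector (11h)

// save the enumerated table as a splayed table inside a partition dir
`:/tmp/en_demo/2018.04.01/trade/ set e

// the shared enumeration domain now lives at the root
get `:/tmp/en_demo/sym                      // `ibm`aapl on a fresh root
```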
 
 
 
## 
##  partitioned table with link columns 
## 
 
http://code.kx.com/q4m3/14_Introduction_to_Kdb+/#14310-partitioned-tables-with-links 
 
 
#### 
####  Segmented Tables 
#### 
 
partitioned tables facilitate scalability. you can symlink a subset of partition slices to a different underlying disk, etc. to scale further.
slaves facilitate concurrency. but then IO becomes your bottleneck, which we solve by "segmentation"
 
 
[BEFORE] 
 
 /nyse 
   /sym 
   /2018.04.01 
      /trade 
      /quote 
   /2018.04.02 
      /trade 
      /quote 
   /2018.04.03 
      /trade 
      /quote 
   ... 
 
 
[AFTER] 
 
$ cat /nyse/par.txt          // "par.txt" is an ascii file, placed under the root dir 
/whatever/path/foo/seg1      // it lists dir path of all segments 
/whatever/path/bar/seg2      // NOTE: a seg dir CANNOT be under the root dir
/whatever/path/baz/seg3      // you can have as many segments as you need. (i've seen 3 ~ 20 for mkt data history) 
... 
 
 
 /nyse 
   /sym 
   /par.txt 
 /whatever/path/foo/seg1   // here we just alternate segments to save data daily
   /2018.04.01             // this segmentation scheme may be effective if user queries tend to spread uniformly across days
     /trade 
     /quote 
   /2018.04.04 
     /trade 
     /quote 
   ... 
 /whatever/path/bar/seg2
   /2018.04.02 
     /trade 
     /quote 
   /2018.04.05 
     /trade 
     /quote 
   ... 
 /whatever/path/baz/seg3
   /2018.04.03 
     /trade 
     /quote 
   /2018.04.06 
     /trade 
     /quote 
   ... 
 
 
===> another way is alpha-split. e.g. [abcdefghij]-[klmnopqrs]-[tuvwxyz] 
 
 /nyse 
   /sym 
   /par.txt 
 /whatever/path/foo/seg1      // seg1 contains tickers starting with a-j 
   /2018.04.01                // this segmentation scheme may be effective if user query tends to uniformly spread across 
     /trade                   // the symbols you spread between segments 
     /quote 
   /2018.04.02 
     /trade 
     /quote 
   ... 
 /whatever/path/bar/seg2      // seg2 contains tickers starting with k-s 
   /2018.04.01 
     /trade 
     /quote 
   /2018.04.02 
     /trade 
     /quote 
   ... 
 /whatever/path/baz/seg3      // seg3 contains tickers starting with t-z
   /2018.04.01 
     /trade 
     /quote 
   /2018.04.02 
     /trade 
     /quote 
   ... 
 
 
==> you can use other split schemes, e.g. seg1 is always sp500 names, seg2 is the rest, seg3 is all ETFs, seg4 is futures, etc
    in that case you may wanna rename the seg1,2,3,4 dirs to "sp500", "nonsp500", "etf", "futures" etc
 
==> in any case, you have to prepare the properly split data on your own before saving to respective dirs. 
 
## 
##  saving(creating) segmented tables 
## 
 
nothing new or special. just save individual splayed tables. but note the root dir (that you specify in .Q.en[]) is where your par.txt (and sym file) is. 
 
e.g.  date-split 
 
q)`:/whatever/path/foo/seg1/2018.04.01/trade/ set .Q.en[`:/nyse;]  tradeTbl 
q)`:/whatever/path/bar/seg2/2018.04.02/trade/ set .Q.en[`:/nyse;]  tradeTbl 
q)`:/whatever/path/baz/seg3/2018.04.03/trade/ set .Q.en[`:/nyse;]  tradeTbl 
 
 
e.g.  alpha-split 
 
q)`:/whatever/path/foo/seg1/2018.04.01/trade/ set .Q.en[`:/nyse;]  tradeAtoJ 
q)`:/whatever/path/bar/seg2/2018.04.01/trade/ set .Q.en[`:/nyse;]  tradeKtoS 
q)`:/whatever/path/baz/seg3/2018.04.01/trade/ set .Q.en[`:/nyse;]  tradeTtoZ 
 
 
again, recall your par.txt is in the root dir (which is /nyse in this case) 
 
$ cat /nyse/par.txt 
/whatever/path/foo/seg1 
/whatever/path/bar/seg2 
/whatever/path/baz/seg3 
 
 
## 
##  loading segmented table 
## 
 
again, nothing new or special. just load the root dir, same way as you load partitioned table. 
 
q)\l /nyse 
q)select from trade where date within 2018.04.01 2018.04.30 
 
 
## 
##  design a balanced IO-compute 
## 
 
recall the bottleneck can be IO or compute. 
 
it ultimately depends on your query. but let's look at two scenarios.

-- scenario 1 : compute is easy, but IO is the bottleneck (most kdb+ queries. e.g. vwap calc)

how many different IO streams/channels (e.g. diff/independent disk shares) can you facilitate ?
let's say N channels. so you create N segments. you set up N slaves, which (if you use multi-proc instead of multi-thread) is restricted by the number of cores.

so in general, you have  N channels = N segments = N slaves [<= number of cores]
 
-- scenario 2 : both compute and IO are equally busy. (rare, but e.g. regression analysis) 
 
let's say one slave is busy computing while another slave is loading data from disk.
so you can have 2N slaves for an N core machine.
also 2N segments for N channel IO.

N channels, 2N segments, 2N slaves [on N cores]
 
 
 
 
 
 
 
########################### 
###       Q Tips        ### 
########################### 
 
### 
###  k language 
### 
 
- q is written in k and C. (k is written in C). so it helps to know some k. 
 
q)last             // this means last[] is implemented in C. (binary) 
last 
 
q)first            // this is k 
*: 
 
q)*: 23 45 67       // you cannot use k in q console 

 
q)k) *: 23 45 67    // but you can invoke k session like this 
23 
 
q)\                 // alternatively you can invoke k with a backslash like this 
  *: 23 45 67 
23 
  \                 // then exit with another backslash 
q) 
q) 
 
 
###################################################### 
###    k commands  -  it is useful to know some    ### 
###################################################### 
 
 
### 
###   -3!         // display 
### 
 
q)show dic:`a`b`c!(12 34; 11 22 33; 5 6 7 8) 
a| 12 34 
b| 11 22 33 
c| 5 6 7 8 
 
q)value dic        // this can be hard to read 
12 34 
11 22 33 
5 6 7 8 
 
q)-3!value dic                 // -3! is a display k function 
"(12 34;11 22 33;5 6 7 8)" 
 
q)show dic:`a`b`c!(();();())
a| 
b| 
c| 
 
q)value dic 
q) 
q)-3!value dic          // see how useful it can be 
"(();();())" 
 
q)![-3;value dic]       // you can use prefix form too 
"(();();())" 
 
 
### 
###   0N! 
### 
 
q)7 * 0N!3        // useful for displaying intermediate result
3
21 
 
q){a:x*y; 0N!b:a+x; c:b-a}[3;4]     // lets you inspect the value of b inside a func
15
3
 
### 
###   -19!      // compress 
### 
 
q)file:`:/tmp/foo 
q)-19!(file; `:/tmp/bar; 17;2;6)     // compress foo into bar
`:/tmp/bar
q)system "mv /tmp/bar ",1_string file   // always nice to have an intermediate file just in case

q)(file; 17;2;6) set get file      // or if you wanna do it quicker (compress in place)
 
### 
###   -21!          // display compression info 
### 
 
q)show t : ([] name:`a`b`c; val:12 34 56) 
name val 
-------- 
a    12 
b    34 
c    56 
q)h1 : `:/tmp/foo 
q)h2 : `:/tmp/bar 
q)h1 set t 
`:/tmp/foo 
 
q).z.zd:17 2 6 
q)h2 set t 
`:/tmp/bar 
 
q)-21! h1                  // not compressed. you can also use prefix form  ![-21;h1] 
q)-21! h2                  // compressed 
compressedLength  | 107    // it returns a dictionary of compression info 
uncompressedLength| 68 
algorithm         | 2i 
logicalBlockSize  | 17i 
zipLevel          | 6i 
 
 
q){x!-21!/:x} (h1;h2)          // {x!-21!/:x} hsymList 
:/tmp/foo| (`symbol$())!() 
:/tmp/bar| `compressedLength`uncompressedLength`algorithm`logicalBlockSize`zipLevel!(107;68;2i;17i;6i) 
 
 
 
 
 
############################################################### 
###    manipulating date/time data types as integer/list    ### 
############################################################### 
 
// first date of this month 
`date$`month$.z.D 
 
// last date of this month 
-1+`date$1+`month$.z.D 
 
// get all days in a given month 
{[m] d+til (`date$m+1)-d:`date$m} 2018.04m            // intra-line assignment can be hard to track, so use it carefully 
                                                      // as a good practice rule, it should be ok if it's used ONLY within the same line 
// get all months of a given year 
{[y] til[12] + 2000.01m + 12*y-2000} 2018 
 
// get the first day of a given year 
{[y] `date$2000.01m + 12*y-2000} 2018 
 
// list all month start dates in a given year 
{[y] `date$til[12] + 2000.01m + 12*y-2000} 2018 
 
// list all month end dates in a given year 
{[y] -1 + `date$ 1 + til[12] + 2000.01m + 12*y-2000} 2018 
 
// list quarter start dates of a given year 
qs: {[y] `date$ (3*til[4]) + 2000.01m + 12*y-2000} 2018 
 
// list quarter end dates of a given year 
qe: {[y] -1 + `date$ 3 + (3*til[4]) + 2000.01m + 12*y-2000} 2018 
 
// how many days in each quarter (both end dates inclusive, hence the +1)
1 + qe - qs
 
// a list of lists of qs,qe 
q)qs,'qe                          // each both 
2018.01.01 2018.03.31 
2018.04.01 2018.06.30 
2018.07.01 2018.09.30 
2018.10.01 2018.12.31 
 
// first date of this quarter 
`date$3 xbar `month$.z.D            // a neat use of xbar 
 
// last date of this quarter 
-1+`date$3+3 xbar `month$.z.D 
 
// first day of next quarter 
`date$3+3 xbar `month$.z.D 
 
// first day of this year 
`date$12 xbar `month$.z.D 
 
// last day of this year 
-1+`date$12+12 xbar `month$.z.D 
 
 
// exercise - given integer (yyyy; mm; dd) return yyyy.mm.dd 
q) {[y;m;d] ymd:(d-1)+`date$(m-1)+2000.01m+12*y-2000;  $[all (y;m;d)=`year`mm`dd$\:ymd; ymd; 0Nd] }[2018; 05; 13] 
2018.05.13 
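
==> the same logic given a name (ymd is a hypothetical name, r a local variable) so you can see both outcomes. the round-trip check is what rejects impossible dates.

```q
// build the date arithmetically, then verify year/month/day survive the round trip
ymd:{[y;m;d] r:(d-1)+`date$(m-1)+2000.01m+12*y-2000; $[all (y;m;d)=`year`mm`dd$\:r; r; 0Nd]}

ymd[2018;5;13]      // 2018.05.13
ymd[2018;2;30]      // 0Nd, because Feb 30 rolls over to 2018.03.02 and fails the check
```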
 
 
### 
###  list and boolean 
### 
 
update senior:`yes from people where age > 65 
update senior:`no from people where age <= 65    // these two lines can be condensed to one line below 
 
update senior:`no`yes age > 65 from people       // neat. it's now a list lookup 
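
==> a tiny runnable version, assuming a made-up people table. the boolean vector age>65 is used as an index into the 2-item list `no`yes (0b picks `no, 1b picks `yes).

```q
// made-up sample data
people:([] name:`ann`bob`cat; age:70 65 30)

update senior:`no`yes age>65 from people     // senior column comes out as `yes`no`no
```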
 
 
### 
###  finding a subset string match 
### 
 
q)select from t where all each nameStr in "AIBMple" 
name nameStr 
------------ 
ibm  "IBM" 
aapl "Apple" 
 
 
### 
###  how to compute pi in q/kdb+ 
### 
 
probably the easiest way is to copy paste a precomputed value from somewhere into a local variable. 
if you really want to compute pi in q, then try 
q)\P 15 
q)x:0.25 
q)2*x + acos sin x 
3.14159265358979 
 
 
### 
###  a type converter function 
### 
 
convertType[tbl;srcType;destType]      // e.g. convertType[tbl;"cC";"s"] 
 
convertType:{[tbl;srcType;destType] 
  srcType:(),srcType; 
  if["s"~destType; destType:enlist[`]]; 
  clmn:exec c from meta[tbl] where t in srcType; 
  if[0=count clmn; :tbl]; 
  f:{($;x;y)}[destType;]; 
  a:clmn!f each clmn; 
  ![tbl;();0b;a] 
 } 
 
 
the above implementation assumes srcType follows the "char" representation.   // https://code.kx.com/q4m3/2_Basic_Data_Types_Atoms/ 
 
type      char     short 
-------------------------   // but it's not hard to accept input in all three possible representations 
boolean    "b"      1       // it's just more input type checking, so on 
byte       "x"      4 
short      "h"      5 
int        "i"      6 
long       "j"      7 
real       "e"      8 
float      "f"      9 
char       "c"      10 
symbol     "s"      11 
timestamp  "p"      12 
month      "m"      13 
date       "d"      14 
(datetime) "z"      15 
timespan   "n"      16 
minute     "u"      17 
second     "v"      18 
time       "t"      19 
 
 
the only special handling case is that symbol casting is a bit irregular. 
 
e.g. 
 
q)"f"$12i 
12f 
q)`float$123j 
123f 
 
q)"s"$"foooo"  // recall this doesn't work (as of q35) 
'type 
q)`$"foooo"    // we have to do this. 
`foooo 
 
also recall symbol must be enlisted in parse tree. so your aggr parameter looks like  clmn!{($;enlist[`];x)} each clmn 
 
 
### 
###  weekdays, weekends 
### 
 
recall 2000.01.01 == 0    // it was Saturday, i.e. 1 = ymd mod 7 means Sunday 
 
q)wday:{x where 1<x mod 7} 
q)wday .z.D + til 7 
2018.12.11 2018.12.12 2018.12.13 2018.12.14 2018.12.17 
 
q)wend:{x where 2>x mod 7} 
q)wend .z.D + til 7 
2018.12.15 2018.12.16 
 
q)days:{[sd;ed] sd + til 1+ed-sd} 
q)days[d-3; d:.z.D] 
2018.12.08 2018.12.09 2018.12.10 2018.12.11 
 
q)wdays:{[sd;ed] wday sd + til 1+ed-sd} 
q)wdays[d-3; d:.z.D] 
2018.12.10 2018.12.11     // weekends omitted 
 
nextwd:{[d] a:d mod 7; $[6=a; d+3; $[0=a; d+2; d+1]]} 
prevwd:{[d] a:d mod 7; $[2=a; d-3; $[1=a; d-2; d-1]]} 
 
// it will be trivial to implement derivatives like lastFri[] 
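
==> one possible sketch of lastFri[] (hypothetical name and definition, here meaning "the most recent Friday on or before d"), using the same mod 7 scheme where Saturday is 0 and Friday is 6.

```q
// (1 + d mod 7) mod 7 is how many days we are past the latest Friday
lastFri:{[d] d-(1+d mod 7) mod 7}

lastFri 2018.12.11     // Tuesday -> 2018.12.07
lastFri 2018.12.14     // already a Friday -> 2018.12.14
lastFri 2018.12.15     // Saturday -> 2018.12.14
```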
 
### 
###   round[x;y]     // round y into the precision (or denomination) of x 
### 
 
q)show r:5 ? 1f 
0.5752096 0.7876361 0.150423 0.363534 0.2550319 
 
round:{x*"j"$y%x} 
 
q)round[0.01; r] 
0.58 0.79 0.15 0.36 0.26 
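
==> since x is a denomination rather than a number of decimal places, the same function also rounds to other step sizes. a few made-up inputs:

```q
round:{x*"j"$y%x}          // same definition as above

round[0.25; 1.1 1.3]       // 1 1.25   (nearest quarter)
round[5; 12 13]            // 10 15    (nearest multiple of 5)
round[0.01; 12.3456]       // 12.35
```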
 
 
### 
###  z2p[x]       // datetime to timestamp converter
### 
 
q).z.Z                          // recall how datetime is deprecated 
2018.12.11T19:20:51.812 
q).z.P                          // and timestamp is preferred 
2018.12.11D19:20:54.652262000 
 
q)z2p:{"p"$x}                   // here is a quick converter 
                                   (make it more robust if you like) 
q)z2p .z.Z 
2018.12.11D19:21:33.033513000 
 
 
### 
###  .log.msg[x] 
### 
 
q).log.msg:{KB:exec "j"$used%1024 from .Q.w[];  -1 "[",string[.z.D]," ",string[.z.T]," ",string[KB],"KB] ",$[type[x]=10h;x;string x]}     // used is in bytes, so used%1024 is KB
q).log.msg "[LOG] test message foo bar"
[2018.12.13 20:26:45.984 296KB] [LOG] test message foo bar
 
### 
###  "exec by" and ohlc      // open, high, low, close 
### 
 
"exec by" is actually powerful. 
 
ohlc:{`op`hp`lp`cp!(first;max;min;last)@\:x} 
 
q)3 #  trades 
dt         tm           sym  qty  px 
----------------------------------------    // sample data 
2015.01.01 00:00:01.129 aapl 6410 99.18 
2015.01.01 00:00:02.099 aapl 5200 93.82 
2015.01.01 00:00:02.670 ibm  6840 196.92 
 
q)exec ohlc px by sym,dt.week from trades 
sym  week      | op     hp  lp  cp 
---------------| --------------------- 
aapl 2014.12.29| 99.18  110 90  106.3 
aapl 2015.01.05| 99.62  110 90  109.87 
aapl 2015.01.12| 93.79  110 90  106.55 
aapl 2015.01.19| 93.9   110 90  97.43 
aapl 2015.01.26| 93.97  110 90  103.37 
goog 2014.12.29| 610.98 660 540 650.58 
goog 2015.01.05| 627.54 660 540 587.22 
goog 2015.01.12| 649.2  660 540 615.9 
goog 2015.01.19| 632.1  660 540 616.26 
goog 2015.01.26| 563.46 660 540 619.38 
ibm  2014.12.29| 196.92 220 180 217.38 
ibm  2015.01.05| 192    220 180 217.1 
ibm  2015.01.12| 180.22 220 180 185.7 
ibm  2015.01.19| 182.58 220 180 190.3 
ibm  2015.01.26| 189.06 220 180 199.3 
 
q)select ohlc px by sym,dt.week from trades     // notice how it works differently for "select" 
sym  week      | px 
---------------| ---------------------------------- 
aapl 2014.12.29| `op`hp`lp`cp!99.18 110 90 106.3 
aapl 2015.01.05| `op`hp`lp`cp!99.62 110 90 109.87 
aapl 2015.01.12| `op`hp`lp`cp!93.79 110 90 106.55 
aapl 2015.01.19| `op`hp`lp`cp!93.9 110 90 97.43 
aapl 2015.01.26| `op`hp`lp`cp!93.97 110 90 103.37 
goog 2014.12.29| `op`hp`lp`cp!610.98 660 540 650.58 
goog 2015.01.05| `op`hp`lp`cp!627.54 660 540 587.22 
goog 2015.01.12| `op`hp`lp`cp!649.2 660 540 615.9 
goog 2015.01.19| `op`hp`lp`cp!632.1 660 540 616.26 
goog 2015.01.26| `op`hp`lp`cp!563.46 660 540 619.38 
ibm  2014.12.29| `op`hp`lp`cp!196.92 220 180 217.38 
ibm  2015.01.05| `op`hp`lp`cp!192 220 180 217.1 
ibm  2015.01.12| `op`hp`lp`cp!180.22 220 180 185.7 
ibm  2015.01.19| `op`hp`lp`cp!182.58 220 180 190.3 
ibm  2015.01.26| `op`hp`lp`cp!189.06 220 180 199.3 
 
==> why did "exec by" work ?  because exec knows to promote dict to a table. (just like it knows when to return a list or a dict) 
    in fact, select is a particular kind of exec. 
 
==> we can still do this using select like below. 
 
q)select op:first px, hp:max px, lp:min px, cp:last px by sym,dt.week from trades 
sym  week      | op     hp  lp  cp 
---------------| --------------------- 
aapl 2014.12.29| 99.18  110 90  106.3 
aapl 2015.01.05| 99.62  110 90  109.87 
aapl 2015.01.12| 93.79  110 90  106.55 
aapl 2015.01.19| 93.9   110 90  97.43 
aapl 2015.01.26| 93.97  110 90  103.37 
goog 2014.12.29| 610.98 660 540 650.58 
goog 2015.01.05| 627.54 660 540 587.22 
goog 2015.01.12| 649.2  660 540 615.9 
goog 2015.01.19| 632.1  660 540 616.26 
goog 2015.01.26| 563.46 660 540 619.38 
ibm  2014.12.29| 196.92 220 180 217.38 
ibm  2015.01.05| 192    220 180 217.1 
ibm  2015.01.12| 180.22 220 180 185.7 
ibm  2015.01.19| 182.58 220 180 190.3 
ibm  2015.01.26| 189.06 220 180 199.3 
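
==> to see what ohlc returns on its own, apply it to a plain list. the 4-row trades table below is made up (not the sample data above) just so the exec line runs standalone.

```q
ohlc:{`op`hp`lp`cp!(first;max;min;last)@\:x}     // same definition as above

ohlc 99.18 93.82 101.5 97.43      // `op`hp`lp`cp!99.18 101.5 93.82 97.43

// made-up data: one dict per group, which exec collects into table rows
trades:([] sym:`aapl`aapl`ibm`ibm; px:99.18 93.82 196.92 192f)
exec ohlc px by sym from trades
```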
 
 
### 
###   tree[rootDir] 
### 
 
tree:{$[x~k:key x; x; 11h=type k; raze (.z.s ` sv x,) each k; ()]} 
 
### 
###   grep[symList;cond] 
### 
 
grep:{x where x like y} 
grep1:{x where not x like y} 
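
==> usage sketch with made-up file names. y is a "like" pattern, so * and ? are wildcards.

```q
grep:{x where x like y}
grep1:{x where not x like y}

fs:`trade.csv`quote.csv`readme.txt     // hypothetical file names
grep[fs;"*.csv"]       // `trade.csv`quote.csv
grep1[fs;"*.csv"]      // ,`readme.txt
```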
 
### 
###  decomp[fileHandle] 
### 
 
decomp:{x set get x} 
 
### 
###  "merge span" function 
### 
 
mergeSpan[t1;fromCol;toCol;maxgap] 
 
 

  1. 2018-01-08 00:26:33 |
  2. Category : qkdb