Most of the latency is low and falls in the large peak on the left, but there is a second, smaller peak at far greater latency. Chances are the second peak has something to do with the type of order the trader is submitting to the matching engine: for whatever reason it carries a higher latency cost. If we can find scenarios like this, we can look for what the orders in this peak have in common, then build some tests and figure out what's going on.
Analysing hundreds of traders over many months means we want to do this in an automated way, so let's do it in code.
In this case I had many days' worth of logs that had been created with BinaryToFIX, as discussed in my previous post. I wanted to analyse each one and log whether there was more than one peak where orders were being amended with OrderCancelReplace messages.
First read the CSV and pick out the execution reports (MsgType of 8) telling us the order has been replaced, i.e. those with an ExecType of 5.
> messages = read.csv("messages.csv")
> execs = messages[messages$MsgType=="8",]
> replaced = execs[execs$ExecType=="5",]
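The peak-finding function further down takes a histogram object, so the replaced orders need to be bucketed by latency first. A minimal sketch of that step, assuming the latency sits in a column called Latency (a hypothetical name, as is the bucket width of 10) and using made-up data in place of the real logs:

```r
# Synthetic stand-in for the filtered data; the real logs come from messages.csv.
# The column name "Latency" and the values below are assumptions for illustration.
set.seed(1)
replaced = data.frame(Latency = c(rnorm(1000, mean = 100, sd = 10),
                                  rnorm(200, mean = 400, sd = 10)))

# Bucket the latencies; plot = FALSE returns the histogram object without drawing it
buckets = hist(replaced$Latency, breaks = seq(0, 500, by = 10), plot = FALSE)

# buckets$breaks holds the bucket edges and buckets$counts the orders per bucket
length(buckets$breaks)
length(buckets$counts)
sum(buckets$counts)
```

Note that a histogram object always has one more break than it has counts, because the breaks are the edges of the buckets.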
Given a dataset like the above, how do I find each peak?
Let's define a local maximum as a point in the dataset that is greater than its two adjacent points and also a good distance above the last local minimum. In my case I chose 10% of the highest point as a good distance.
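getLocalMax = function(bucks) {
  # Height of the tallest bucket; the 10% threshold is derived from it
  maxCount = max(bucks$counts)
  # Smooth the counts so bucket-to-bucket noise produces fewer spurious peaks
  smoothed = ksmooth(bucks$breaks, bucks$counts, kernel="normal", bandwidth=2)
  # Difference between adjacent smoothed points
  dsmooth = diff(smoothed$y)
  # Local maximum: difference goes from positive to negative; local minimum: the reverse
  locmax = sign(c(0, dsmooth)) > 0 & sign(c(dsmooth, 0)) < 0
  locmin = sign(c(0, dsmooth)) < 0 & sign(c(dsmooth, 0)) > 0
  # For each bucket, carry forward the count at the most recent local minimum
  lastMin = 0
  lastLocMin = mapply(function(x, y) {
      if (x) {
        lastMin <<- y
      }
      lastMin
    },
    locmin %in% TRUE,   # %in% turns any NA into FALSE
    bucks$counts)
  # Keep a local maximum only if it is more than 10% of the tallest bucket
  # above the last local minimum; the tallest bucket itself is always kept
  mapply(function(x, y, z) { (x & ((y - z) > (maxCount / 10))) | y == maxCount },
    locmax,
    bucks$counts,
    lastLocMin)
}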
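The function smooths the counts with ksmooth before looking at differences. As a rough illustration of why that helps, the sketch below counts local maxima in a noisy made-up bump before and after smoothing. The data, the helper countMax and the bandwidth of 10 are all assumptions chosen for the example, not values from the real logs.

```r
set.seed(42)
x = 1:200
# A single broad bump with random noise on top (made-up data)
y = 1000 * exp(-((x - 100) / 30)^2) + rnorm(200, sd = 20)

# Kernel smoothing from the stats package; x.points = x asks for one
# smoothed value per input point
smoothed = ksmooth(x, y, kernel = "normal", bandwidth = 10, x.points = x)

# Hypothetical helper: count points where the difference flips from + to -
countMax = function(v) {
  d = diff(v)
  sum(sign(c(0, d)) > 0 & sign(c(d, 0)) < 0)
}

countMax(y)            # many spurious maxima caused by the noise
countMax(smoothed$y)   # far fewer once the noise is averaged out
```

Smoothing alone rarely removes every spurious maximum, which is why a height threshold is still needed afterwards.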
We are using diff to find out the difference between two adjacent points, so in the above plot we get the following:
> dsmooth
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0
[14] 2 16 58 159 238 536 816 1002 1302 1258 993 684 276
[27] -110 -72 -503 -451 -365 -735 -631 -502 -511 -476 -469 -461 -243
[40] -362 -261 -219 -219 -144 -59 -143 -100 -72 -58 -41 -27 -5
[53] -21 -15 -14 5 -9 -19 4 -12 4 1 -7 -1 -5
[66] 9 2 -10 3 3 -10 2 9 1 -12 7 3 -8
[79] 3 -3 4 0 -3 3 -4 -3 -3 3 7 -6 0
[92] 0 -2 5 -3 -3 -2 5 -3 1 -4 7 -3 0
[105] -2 1 1 -1 7 1 -2 -8 2 -2 5 -5 11
[118] -6 -3 6 -8 2 2 1 -1 2 -3 3 4 -4
[131] -4 0 1 7 -8 0 3 -3 4 -3 3 -4 1
[144] -2 -1 0 4 2 -4 0 7 -4 -3 -5 5 2
[157] -2 -2 -1 3 2 -5 7 -7 3 -3 6 -5 11
[170] -6 2 -2 0 -1 7 -3 -7 3 -1 -1 -4 5
[183] -4 1 -3 3 -1 1 -2 -2 2 3 1 -4 3
[196] -1 -3 0 5 -2 -1 -1 -1 1 4 -3 6 -6
[209] 6 -5 -2 0 6 1 1 -4 7 10 55 121 237
[222] 320 331 385 161 -93 -222 -244 -242 -182 -186 -133 -83 -78
[235] -2 -53 -30 -21 -4 -13 -13 -7 -8 1 -5 -4 0
[248] -5 4 3 -4 -2 -3 4 -1 -5 0 1 5 -2
[261] 1 -1 1 2 -2 -2 2 -2 1 4 -7 7 -2
[274] 2 -3 -2 4 -2 -3 3 -2 -2 0 4 -3 -3
[287] 1 4 -1 1 0 0 3 2 -5 -3 2 -3 -2
[300] 6 1 1 1 -5 0 1 4 -5 1 3 -6 3
[313] -3 2 -1 4 -3 2 -2 NA
> locmax
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[12] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[23] FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
[34] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[45] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[56] FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE FALSE
[67] FALSE TRUE FALSE FALSE TRUE FALSE FALSE FALSE TRUE FALSE FALSE
[78] TRUE FALSE TRUE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE
[89] FALSE TRUE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE TRUE
[100] FALSE TRUE FALSE TRUE FALSE FALSE FALSE FALSE TRUE FALSE FALSE
[111] TRUE FALSE FALSE TRUE FALSE TRUE FALSE TRUE FALSE FALSE TRUE
[122] FALSE FALSE FALSE TRUE FALSE TRUE FALSE FALSE TRUE FALSE FALSE
[133] FALSE FALSE TRUE FALSE FALSE TRUE FALSE TRUE FALSE TRUE FALSE
[144] TRUE FALSE FALSE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE
[155] FALSE FALSE TRUE FALSE FALSE FALSE FALSE TRUE FALSE TRUE FALSE
[166] TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE FALSE FALSE TRUE
[177] FALSE FALSE TRUE FALSE FALSE FALSE TRUE FALSE TRUE FALSE TRUE
[188] FALSE TRUE FALSE FALSE FALSE FALSE TRUE FALSE TRUE FALSE FALSE
[199] FALSE TRUE FALSE FALSE FALSE FALSE FALSE TRUE FALSE TRUE FALSE
[210] TRUE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE
[221] FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
[232] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[243] FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE
[254] FALSE TRUE FALSE FALSE FALSE FALSE TRUE FALSE TRUE FALSE FALSE
[265] TRUE FALSE FALSE TRUE FALSE FALSE TRUE FALSE TRUE FALSE TRUE
[276] FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE FALSE TRUE FALSE
[287] FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE
[298] TRUE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE TRUE
[309] FALSE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE
[320] FALSE FALSE
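To make the sign test concrete, here is the same comparison on a small made-up vector (the numbers are illustrative only, and no smoothing is applied):

```r
counts = c(1, 3, 7, 4, 2, 5, 9, 6)

# Difference between each point and the one before it
d = diff(counts)            # 2 4 -3 -2 3 4 -3

# A local maximum has a positive difference coming in and a negative one going out
locmax = sign(c(0, d)) > 0 & sign(c(d, 0)) < 0
# A local minimum is the reverse
locmin = sign(c(0, d)) < 0 & sign(c(d, 0)) > 0

which(locmax)               # 3 7
which(locmin)               # 5
```

Padding the differences with a 0 at the front and at the back lines up the incoming and outgoing difference for every point, and it also means the two end points can never be flagged.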
The rest of the function gets rid of any local maxima that rise less than 10% of the maximum point above the last local minimum:
> getLocalMax(hist)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[12] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[23] FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
[34] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[45] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[56] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[67] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[78] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[89] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[100] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[111] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[122] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[133] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[144] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[155] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[166] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[177] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[188] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[199] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[210] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[221] FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
[232] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[243] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[254] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[265] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[276] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[287] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[298] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[309] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[320] FALSE FALSE
Now we're left with just two local maxima, at positions 27 and 226.
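The original aim was to log any day that shows more than one peak. A sketch of that last step, using a short made-up logical vector in place of the real getLocalMax output and hypothetical bucket edges:

```r
# Made-up stand-ins: in practice peaks would be the result of getLocalMax
# and breaks would come from the histogram object
peaks  = c(FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE)
breaks = seq(0, 600, by = 100)

# Positions of the surviving local maxima, and the bucket edges they correspond to
peakIdx = which(peaks)
peakAt  = breaks[peakIdx]

# Flag the log for a closer look if there is more than one peak
if (length(peakIdx) > 1) {
  message("More than one peak, near: ", paste(peakAt, collapse = ", "))
}
```

Running this over each day's log would give a list of the days worth investigating by hand.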