Monday, July 14, 2014

Bayesian Naive Bayes for Classification with the Dirichlet Distribution

I have a classification task and was reading up on various approaches. In the specific case where all inputs are categorical, one can build a “Bayesian Naïve Bayes” classifier using the Dirichlet distribution.

Poking through the freely available text by Barber (Bayesian Reasoning and Machine Learning), I found a rather detailed discussion in chapters 9 and 10, as well as example MATLAB code for the book, so I took it upon myself to port it to R as a learning exercise.

I was not immediately familiar with the Dirichlet distribution, but in this case it appeals to the intuitive counting approach to discrete event probabilities.

In a nutshell, we use the training data to learn the posterior distribution, which turns out to be a matter of counting how often each feature state occurs, grouped by class and feature.

Prediction is a case of counting events in the test vector. The more this count differs from the per-class trained counts, the lower the probability the current candidate class is a match.
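To make the counting concrete, here is a minimal R sketch of the idea, assuming a uniform Dirichlet prior (all hyperparameters set to 1) and categorical features stored as factors. The function and variable names are mine for illustration, not Barber's:

```r
# Train: for each class and feature, the posterior Dirichlet parameters
# are the prior (1) plus the observed counts of each feature state.
train_bnb <- function(X, y) {
  classes <- levels(y)
  counts <- lapply(classes, function(cl) {
    lapply(X[y == cl, , drop = FALSE], function(f) 1 + table(f))
  })
  names(counts) <- classes
  list(counts = counts, prior = table(y) / length(y))
}

# Predict: score each class by summing log posterior-predictive
# probabilities of the observed feature states, plus the log class prior.
predict_bnb <- function(model, x) {
  scores <- sapply(names(model$counts), function(cl) {
    cc <- model$counts[[cl]]
    lp <- sum(mapply(function(f, v) log(f[[v]] / sum(f)), cc, as.character(x)))
    lp + log(model$prior[[cl]])
  })
  names(which.max(scores))
}
```

With a flat prior this reduces to add-one (Laplace) smoothing of the per-class counts, which is why unseen feature states do not zero out the whole product.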

Anyway, there are three files. The first is a straightforward port of Barber’s code, but this wasn’t very R-like, and in particular only seemed to handle input features with the same number of states.

I developed my own version that expects everything to be represented as factors. It is all a bit rough and ready, but appears to work, and there is a test/example script up here. As a bigger test I ran it on a sample car evaluation data set from here; the confusion matrix (rows are the actual test classes, columns are predictions) is as follows:

testY   acc good unacc vgood
  acc    83    3    29     0
  good   16    5     0     0
  unacc  17    0   346     0
  vgood  13    0     0     6

That’s it for now. Comments/feedback appreciated. You can find me on twitter here.

Links to files:

Everything in one directory (with data) here

Sunday, June 22, 2014

Trading in a low vol world

I wanted to take a look at what works in low vol environments, such as we are currently experiencing. I am open to the idea we have entered a period of structurally low volatility due to increased regulatory burden and flow-on effects from the decline of institutional FICC trading. Or it may just be a function of QE, and post-tapering we will see a return to higher levels.

The plan

The main idea is to compare mean reversion (MR) vs. follow through (FT). For simplicity I define mean reversion as an up day being followed by a down day, and a down day being followed by an up day. Conversely, follow through sees an up day followed by another up day, and a down day followed by a down day.

I took a look at the major US equity indices, SPX (GSPC), NDX and RUT.

For each series we calculate daily log returns for the current period and shift them forward to get the return for the next period. Then we calculate realized volatility (RV) and split the data set into "low volatility" and "high volatility" halves, by looking at the median realized vol for the whole series.
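A rough R sketch of that preparation step. The 20-day rolling window for realized vol and the column names are my assumptions, not necessarily what the original script uses:

```r
# Build a data frame of (today's return, next day's return, realized vol)
# from a vector of closing prices, then label each day by vol regime.
prep <- function(close, window = 20) {
  ret <- diff(log(close))               # daily log returns
  nxt <- c(ret[-1], NA)                 # shifted forward: next day's return
  # realized vol: rolling standard deviation of returns
  rv <- sapply(seq_along(ret), function(i)
    if (i < window) NA_real_ else sd(ret[(i - window + 1):i]))
  d <- data.frame(ret = ret, nxt = nxt, rv = rv)
  d <- d[complete.cases(d), ]
  # split at the median realized vol for the whole series
  d$vol_class <- ifelse(d$rv <= median(d$rv), "low", "high")
  d
}
```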

Then, for each series, we use bootstrapped samples to simulate a number of trajectories/equity curves for each strategy (MR/FT) under the two classes of RV. Finally we take an average of the total return of each trajectory to get a ballpark idea of how they went.
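In R the simulation might look something like the following sketch, assuming a data frame with columns `ret` (today's log return) and `nxt` (the next day's return), and ignoring transaction costs:

```r
# Bootstrap the average total return of a strategy:
# FT trades with yesterday's direction, MR against it.
sim_strategy <- function(d, strategy = c("MR", "FT"),
                         n_traj = 1000, n_obs = nrow(d)) {
  strategy <- match.arg(strategy)
  sgn <- if (strategy == "FT") sign(d$ret) else -sign(d$ret)
  pnl <- sgn * d$nxt                    # per-day strategy return
  # resample days with replacement; one total return per trajectory
  totals <- replicate(n_traj, sum(sample(pnl, n_obs, replace = TRUE)))
  mean(totals)                          # average total return across paths
}
```

Resampling individual days with replacement destroys any serial dependence within a trajectory, which is fine here since the signal is already baked into the (ret, nxt) pairs.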


The data is from the start of 1999 to the present, so roughly 15 years. Each run generates 1000 trajectories with a sample size of roughly 950.

For the low vol case, the results are unfortunately ambiguous. Follow through in a low vol environment seemed to do well for NDX and RUT, but the opposite was the case for SPX.

The TR column is the sum of the series over the whole period for the volatility class (i.e. a simple long only strategy), giving an idea of a directional bias that may be present in the sampling.

In the high vol environment, mean reversion was a clear winner, and consistent over the different underlyings.

The results seem relatively stable across trajectory size/sample size.


I'm not really sure what is going on with SPX. My intuition was that FT would do well in low vol environments, but that doesn't seem to be the case, at least not for SPX.

I was actually getting consistent votes for FT in the low vol case, then restarted R to run with a clean environment and started getting the above instead. You can't spell argh without R it seems.

Source is up here. As always you can find me on twitter here. Thanks for stopping by.

Saturday, May 31, 2014

Divergence on NDX

I generally take a dim view of old-timey technical indicators; perhaps they work for some people, but I have found there are much better tools available. One exception is divergence, which in this case is when price makes a new high but the MACD (or your favourite oscillator) does not.

There is a very nice-looking divergence on NDX, and it also shows up in a weaker form on SPX and INDU. I never take it as a trade signal by itself, but it does make me look a little closer. I have marked off some previous occurrences as well. It does not give any indication about when a sell off may occur, or how much of a sell off will eventuate. Pretty useful, isn't it?

Another form of divergence I take note of is the marked failure of RUT to make it back to its recent highs, which differs from NDX/SPX/INDU. 

You can also see in the charts above that volume has been declining, especially over the last 4-5 weeks. 

I do think we are in a long run bull market which still has a few more years to go. In the event of a shorter term sell off I would generally be looking to buy dips. 

A good sign of a bull market is shrugging off negative events. We've had some reasonably serious geopolitical happenings: the invasion of Ukraine, a coup in Thailand, and anti-Chinese riots in Vietnam that produced a number of fatalities.

Struggling to think what a catalyst might be; perhaps some unpleasant surprise regarding QE tapering, or an unconstrained collapse in the Chinese property market, both of which I think are pretty unlikely.

There's a bunch of macro data out next week, and Apple is having its WWDC. Apple used to make up a very large amount of NDX; something like 24% of the index value was determined by AAPL prices. I know they rebalanced it and am not up to date with where it currently stands.

A quick look at FX realized vol

Much has been said about the decline in volatility. At the moment I am very active in FX spot trading and as a generalization do better the more vol there is.

I wanted to see how things stood on the crosses I am most active in, namely EUR/USD, GBP/USD and USD/JPY.

I took hourly data from FxPro (not my broker, nor an endorsement), calculated volatility as the high minus the low of each hourly bar, and summed the total for each day. You can think of it as how many pips were on offer if one could correctly call the high and low of each hour of each day.
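That measure is simple to reproduce; here is a hedged R sketch, assuming hourly bars in a data frame with `time`, `high` and `low` columns (the names are mine, the actual file layout may differ) and a pip size of 0.0001:

```r
# Sum of hourly (high - low) ranges, in pips, grouped by calendar day.
daily_range_vol <- function(df, pip = 1e-4) {
  rng <- (df$high - df$low) / pip       # pips on offer in each hour
  day <- as.Date(df$time)
  tapply(rng, day, sum)                 # total pips per day
}
```

For USD/JPY a pip is 0.01 rather than 0.0001, so the `pip` argument would change per cross.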

All up there are about 90 days of data, so it covers roughly the last four months. I also took the average of the last five days, which are the red X’s on the box plots. We are here.

As you can see, vol is below average. It was a quiet week overall, with bank holidays in the US and parts of Europe, and only a moderate amount of data coming out. Next week should be a bit busier I think.

Since I had all the data I also took a look at the average hourly RV per day.

I don’t want to read too much into this chart, but things have been quiet. I read somewhere else that FX vol is approaching the levels of 2007, which was a very quiet time indeed.

Some R code is up here, data is here.

Saturday, May 17, 2014

RcppArmadillo cheatsheet

I have been using RcppArmadillo more and more frequently, so thought I would make a cheatsheet/cookbook type reference that translates common R operations into equivalent arma code.

I have put them up on a github wiki page here.

The functions are all pretty basic and not particularly robust. In particular they do not do any bounds or sanity checking.

You might also enjoy the arma documentation, in particular the matlab/octave syntax conversion example.

There is also an excellent book, Seamless R and C++ Integration with Rcpp.

Any corrections or additions are most welcome.

Sunday, May 11, 2014

Hedge Fund Managers on YouTube

Bill Ackman reads from the book of Buffett

Ray Dalio gives a run down of macroeconomics

Friday, April 25, 2014

Fractional sums of Perlin Noise

I wanted to mess around with fractional sums of Perlin noise, so made a little openFrameworks app to better understand what is going on.


Frequency comes up quite a bit in the following discussion, and more or less means the rate at which something goes up and down. All you really need to keep in mind is:

High frequency noise goes up and down quickly. It looks like this (where black is 0 and white is 1):

Low frequency noise goes up and down slowly.  It looks like this:

You can see low frequency noise takes a while to get from zero to one and vice versa, while high frequency noise does it much more often.

ofNoise Inputs and Outputs

In general it’s a good starting point to pass normalized coordinates to the noise function.

You can use ofNoise() to get values in [0, 1], or ofSignedNoise() to get values in [-1, 1].

There are a bunch of functions for various dimensions as well.

Fractional Sums

Low frequency noise will give a nice undulating look, but often it is boring. High frequency noise is more interesting, but can be a bit too chaotic. What we want is a nice combination of both.

Using an example from Paul Bourke, we take several instances of noise at increasing frequency and combine them to get the effect we are after.

To do this there are three parameters, an octave count, alpha, and beta.

The octave count is how many layers of noise we will be adding together. This will typically range from 1 to 8.

Each layer of noise is generated at a higher frequency than the one before, which is where the name octave comes from. 

Beta controls the frequency of the noise: the larger it is, the higher the frequency, i.e. the faster it goes up and down.

Adding these together works, but you might find the higher frequency noise overpowers the lower frequency noise.

This is where the third parameter, alpha, comes in, controlling how much of the noise from the current octave ends up in the final sum.

In rough code it looks like this (using pow(), since ^ is XOR in C++, and following Paul Bourke's version where each octave multiplies the frequency by beta):

double sum = 0.0;
for (int n = 0; n < octaveCount; n++)
    sum += noise(pow(beta, n) * x, pow(beta, n) * y) / pow(alpha, n);

Let’s say alpha and beta are both 2. In the call to noise(), the pow(beta, n) factor doubles with each successive layer of noise. We are increasing the frequency of the generated noise.

However, with alpha as 2, we are dividing each octave by pow(alpha, n), so we add successively less of each one: 1, 1/2, 1/4, 1/8 … reducing the magnitude of the higher frequency terms.

Taken together, we see the higher frequency noise contributes less and less to the final sum. We end up with a nice smooth but varying noise map to use for displacement or whatever we want.


Summing several octaves means you will typically get values greater than 1 (and less than -1 if you are using signed noise).

There are a few ways you can normalize these; take a look at the commented out code. You might not want to map the minimum to zero, as it can cause jumps when the lower bound of the summed noise changes.

The app 

An app to play with is up here

The small images show the individual octaves, and the big image is the final result.

Also I used the noise function from that app to do vertex displacement of a mesh and put a clip up here:

That’s it for now. Later on I will give an example of making seamless noise loops. I have a bunch of other projects going on which I will write about when they are finished off, but I am really looking forward to sharing.

As always you can find me being rude and unprofessional on twitter. I love hearing from people, so tweet me pics of your rad noise stuff.