I gave a small talk at the recent Sydney Creative Coding meet on Curl Noise.

It was an introductory talk with the goal of explaining the basics, so those interested can go off and look at other people's work and hopefully understand what is going on.

I have put the slides up here. Throughout the talk I took breaks for questions and showed pictures of curl and miscellaneous mathematical facts, which is why those appear.

I also wrote the following apps as demos:

curlgrid - This is a basic app to explore noise/vectors/curl and particles in Perlin noise

gpu curl 2D - This is a 2D version doing particle advection and curl generation on the GPU

gpu curl 3D - As above but a 3D version

All of them require openFrameworks and the ofxGui addon which should come with oF.

Some links to others' work:

Robert Bridson - Wrote the original paper on the topic, has some code available

Philip Rideout - More comprehensive examples and explanations

miaumiau - More discussion and cool examples in WebGL

I only really discussed the basics, so if you are interested I would strongly encourage you to go check them out.

It did take me some time to get my head around things, so if you didn't manage to pick all of it up in the ~25 minutes I was speaking, don't worry about it. Even now I am not 100% sure the 2D case is right, so I'd appreciate any comments.
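For reference, the 2D case I covered boils down to taking the "curl" of a scalar potential, which gives a divergence-free velocity field. A minimal R sketch of the idea, where `psi` is a placeholder potential standing in for Perlin noise and all names are mine:

```r
# In 2D, curl of a scalar potential psi gives a divergence-free
# velocity field: v = (d psi / dy, -d psi / dx).
psi <- function(x, y) sin(x) * cos(y)  # placeholder for Perlin noise

# Central finite differences for the two partial derivatives
curl2d <- function(x, y, eps = 1e-4) {
  vx <-  (psi(x, y + eps) - psi(x, y - eps)) / (2 * eps)
  vy <- -(psi(x + eps, y) - psi(x - eps, y)) / (2 * eps)
  c(vx, vy)
}

curl2d(0.3, 0.7)  # one velocity sample; particles are advected along these
```

Because the field is a curl, its divergence is (numerically) zero, which is what keeps the particle flow looking incompressible.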

The GPU versions use a transform feedback vertex shader; you can find more info about this here.

If you are interested in the math, you can find more info here.

Some other links:

Cosine approximation to the normal distribution: here

A derivation of the volume on an n-sphere: here

Questions/comments: email me or post here.

You can also find me on tumblr, vimeo and twitter

Thanks!

# Shifting sands

A man with a hammer

## Thursday, February 19, 2015

## Friday, December 26, 2014

### Fitting a mixture of independent Poisson distributions

This is an example from Zucchini &
MacDonald’s book on Hidden Markov Models for Time Series (exercise 1.3).

The data is annual counts of earthquakes of
magnitude 7 or greater, which exhibits both overdispersion for a Poisson (where
the mean should equal the variance) as well as serial dependence.

The aim is to fit a mixture of m independent
Poisson distributions to this data, using the non-linear minimizer nlm to minimize the
negative log-likelihood of the Poisson mixture.

Sounds easy, right? I have not done much optimisation stuff, so it took me longer than it probably should have. There's little better for the ego than getting stuck on things by page 10.

### Constrained to unconstrained

There are two sets of parameters, the
lambda parameters to the Poisson distributions and the delta mixing
distribution, the latter giving the proportions of each distribution to mix.

The Poisson parameters must be greater than
zero and there are m of them. The mixing distribution values must all sum to
one, and the first mixing value is implied by the m-1 subsequent values.

These are the constraints (lambdas greater than zero, deltas sum to one), and there are 2m - 1 values (m lambdas, m - 1 deltas). We need to transform these to unconstrained values for use with nlm (becoming eta and tau respectively).

The formulas for transformation are
reproduced from the book here

These transformed values are combined into one parameter vector and passed to nlm, along with a function that calculates the negative log-likelihood.

In the function, we first convert the
parameters from the unconstrained “working” form to the constrained “natural”
form using the inverse transform, and then use these values to do the
likelihood calculation.

You can see the code here.

### Outro

Now that I look over the code and write this out, it all seems fairly straightforward, but it was new to me and took a while, so I figured other people might find the example useful as well.

The code has three examples for the cases m = 2, 3, and
4. Obviously feel free to mess around with the initial parameter estimates to see what comes up.

It does seem to match the fitted models from the book, which I have copied
here:

For all the details, check out the book,
chapter 1!

## Tuesday, December 23, 2014

### Does Trend Following Work?

I’m not sure how I came across it, but I
have had Jez Liberty’s Au.Tra.Sy blog in my reader since around 2009.

Since then, he has tracked well-known trend
following systems and reported monthly performance figures. These are things
like moving average crossovers, Bollinger band breakouts and stuff like that.

The systems had a very good month in
November, and a very good year, up ~45% YTD.

I thought I would take a look at how things have gone over the time it has been tracked. Helpfully, Jez posts annual summaries for each system and a composite average, which is what I used. You can find the links to specific posts in the script linked under the table below. N.B. the first two years are the average of actual trend following funds, not the generic strategies.

There are results for 6 years, from 2009 to
2014, using the YTD November figure for 2014. We end up with a series like
this, starting from 100.

Source [R]

We can see that 4 of the 6 years ended underwater. Over the same time there has been a huge surge in US equities; a simple buy and hold started in 2009 would have more than doubled equity.

### The pain of losing

I have been following Jez posting results
every month for 5 – 6 years now. When I think back to where I was in 2009
versus where I am today, five years is a really, really long time.

You can get historical data from the 1920s and even further back in time. If you run a backtest over 80 - 90 years of data, a 3 - 5 year period of underperformance is barely noticeable on an equity curve.

Sometimes I see people get excited about
their backtests even though they include these periods of poor performance.
There is plenty of research and evidence that humans in general find it very
hard to stick with a losing system.

How seriously would you take someone touting a system that is underwater after five years, even if they assured you a 45% year was just around the corner?

### Professional CTA

I believe the Au.Tra.Sy data is based on commodities and futures trading, so I thought I would take a look at the Barclay CTA Index as a comparison. The results look similar: the index has also struggled since 2010, though both the upside and downside returns are smaller.

In the period we are examining here, the 6 years from 2009 – 2014, four of them have been down years, including three down years in a row (2011 - 2013).

### Proselytizing

In some ways, you can think of mechanical
trend following as being long tail risk. Effectively it wants a big sustained
move in either direction. If you think about the distribution of returns, the
big moves that are “pay days” are out in the tails.

I don’t follow commodities vol at all, but
apparently it has been low for several years, according to this article from September, 2014. It probably shouldn’t be a surprise that strategies dependent on big moves have not performed
well in an environment of low volatility.

I personally dislike these indicator-based
trend following systems, even though I think they give entries as good as any.
The problem is they are too slow to close positions and give back too much
profit.

They only really make serious money when a megatrend eventuates, and these are relatively rare and IMO getting rarer. In the meantime they can get chopped around and experience significant drawdowns.

I don’t want to draw any final conclusions about classical trend following, but I know I would have a lot of trouble sticking to systems like these in practice, even though a long-term backtest might look really nice.

However, if you understand the conditions
under which your system does well or is likely to underperform, it is a lot
easier to stick with it during periods of underperformance, which are
inevitable over the long run.

It is worth spending the time thinking about the conditions in which your system does well and does poorly.

## Tuesday, December 16, 2014

### Book Review: R Object-oriented Programming

Packt sent me a copy of R Object-oriented Programming by Kelly Black to review, so here we go:

The aim of this book is to provide a basic understanding of R and programming in R. It is suggested that readers are already familiar with programming, but the book does seem to start from square one, and I feel it is probably accessible to most people using R.

The contents are roughly split 50/50 between general R material (working with data, the various data types R has) and OO. The first half provides good coverage of the basics, and I thought it was a good introduction to R. It includes a chapter on working with time variables, which I think would be very useful to people new to R and working with time series data.

The second half is about object-oriented programming. S3 and S4 classes are covered, and some decent examples are provided that go beyond the usual trivial hello world or Shape/Circle/Square type examples, which is nice to see.

Strangely though, no mention is made of reference classes. For a book titled R Object-oriented Programming, omitting one of the main ways to do OO in R seems like a rather large oversight.

Another nitpick I have is the indent style and formatting. It is mostly Whitesmiths style (which I personally dislike), but beyond that it is used inconsistently throughout the examples. This is generally considered a rather bad practice in programming circles. Pick one style and stick with it.

It is challenging to present source code and text together, but some of it is laid out very poorly, with lines wrapping in strange places. It doesn’t really affect the content because most people will likely download the code to experiment with it, but it does detract from what is otherwise a good book.

Arguing about whitespace may seem petty, but the reality is most programmers spend most of their day looking at source code, so it is important. There are plenty of projects that won't even look at your code unless it is properly formatted. Again, it's not the choice of style, it's the inconsistent use and poor formatting that bothers me. There's a lot of terrible R code out there, and a book about programming should do better.

Despite these issues I think it gives good coverage of relevant topics for those starting to get more into the programming side of R. If you are starting out in statistics and looking for a book on R to get going with programming (versus say how to do linear regression etc) it's a useful one.

You can see some other reviews here, here, and on Amazon.


## Sunday, September 21, 2014

### HMM example with depmixS4

On a scale of one to straight up voodoo, Hidden Markov Models (HMMs) are definitely up there for me.

They have all sorts of applications, and as the name suggests, they can be very useful when you wish to use a Markovian approach to represent some stochastic process.

In loose terms this just means we wish to represent our process as some set of states and probabilistic transitions between them.

For example, if I am in state 1, there may be an 85% chance of staying in state 1 and a 15% chance of moving to state 2.

To complete this simple two state model, we would also have to define the transitions for state 2, namely what is the probability we will stay in state 2 if we are already in state 2, and what is the probability we will transition from state 2 to state 1.
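As a sketch, that two-state transition matrix might look like this in R (the state 1 row uses the numbers above; the state 2 row is made up for illustration):

```r
# Rows are the current state, columns the next state; each row sums to 1.
P <- matrix(c(0.85, 0.15,   # from S1: stay with 85%, move to S2 with 15%
              0.10, 0.90),  # from S2: illustrative values only
            nrow = 2, byrow = TRUE,
            dimnames = list(from = c("S1", "S2"), to = c("S1", "S2")))
P
```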

But what if we don’t know this information? What if we just have a set of realizations from some process and a vague idea of the number of states that might be behind it?

Enter HMMs. Among other things, they can be used to determine the state transition probabilities that underlie some process, as well as the distributional parameters for each state.

That is, it can work out all the details that are "hidden" from an observer of a process. All we really need to do is specify how many states we think there are.

To test this out, and to get more familiar with the depmixS4 package, I made a small test program. It creates an observation series with:

a) a known number of states

b) known state transition matrices

c) known underlying distribution parameters

The aim is to use depmixS4 to estimate all this information, which should help get a grip on how to use the package, and also let us see if HMMs are actually any good.

We have two states; let's call them S1 and S2. They were set up with the following properties:

Both states provide realizations from the normal distribution: in this case S1 has mean -1 and standard deviation 1, and S2 has mean 1 and standard deviation 1. We can also see the transition matrix between S1 and S2 in the last two columns.

A trajectory was sampled and depmixS4 was used to fit an HMM. It provided the following estimates:
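A minimal sketch of this simulate-and-fit workflow, assuming illustrative transition probabilities standing in for the actual values used for the post:

```r
library(depmixS4)
set.seed(1)

# Simulate a two-state Markov chain. These transition probabilities
# are illustrative stand-ins, not the exact values from the post.
n     <- 500
trans <- matrix(c(0.9, 0.1,
                  0.1, 0.9), nrow = 2, byrow = TRUE)
states <- numeric(n)
states[1] <- 1
for (t in 2:n)
  states[t] <- sample(1:2, 1, prob = trans[states[t - 1], ])

# S1 ~ N(-1, 1), S2 ~ N(1, 1), as in the post
obs <- rnorm(n, mean = c(-1, 1)[states], sd = 1)

# Fit a two-state Gaussian HMM
mod <- depmix(obs ~ 1, data = data.frame(obs = obs),
              nstates = 2, family = gaussian())
fm  <- fit(mod)
summary(fm)                 # estimated means, sds and transition matrix
est <- posterior(fm)$state  # most likely state sequence
```

Comparing `est` against `states` (up to a possible relabelling of the states) shows how well the fit recovered the hidden sequence.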

You can see it did a pretty good job of recovering the true values.

In the following plot, you can see the sample trajectory, the estimated state from the HMM and the actual state used to generate the sample. This example worked very well; it's not always the case that things turn out so nicely.

Voodoo I tell you!

There is no guarantee that a fitted HMM will be of any use, and even with this simple example, the state estimates can be wildly inaccurate. You can try this for yourself by using different seed values.

This example is based on one from the book Hidden Markov Models and Dynamical Systems, which I found to be an excellent resource on the topic. It is clearly written, covers the basic theory and some actual applications, along with some very illustrative examples. Source code is provided in Python.

I've read (or at least tried to read) pretty much every book on HMMs I could find, and found this one to be the most useful if you are new to HMMs and are interested in applications.

You may also find these other posts about HMMs useful:

Fun With R and HMMs

Getting Started with Hidden Markov Models in R

Hidden Markov Models Examples in R, part 3/4

There is also the classic paper by Rabiner.

Hopefully this has shed a bit of light on HMMs and the depmixS4 package. Code is up here.


## Monday, July 14, 2014

### Bayesian Naive Bayes for Classification with the Dirichlet Distribution

I have a classification task and was reading up on various approaches. In the specific case where all inputs are categorical, one can use “Bayesian Naïve Bayes” with the Dirichlet distribution.

Poking through the freely available text by Barber, I found a rather detailed discussion in chapters 9 and 10, as well as example Matlab code for the book, so I took it upon myself to port it to R as a learning exercise.

I was not immediately familiar with the
Dirichlet distribution, but in this case it appeals to the intuitive counting
approach to discrete event probabilities.

In a nutshell we use the training data to
learn the posterior distribution, which turns out to be counts of how often a
given event occurs, grouped by class, feature and feature state.

Prediction is a case of counting events in the test vector. The more this count differs from the per-class trained counts, the lower the probability that the current candidate class is a match.
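To make the counting idea concrete, here is a rough sketch of the train and predict steps. The function names and structure are mine, not the port's; see the linked files for the actual code:

```r
# Train: per class, tabulate each feature's states and add the Dirichlet
# pseudo-count u (u = 1 corresponds to a uniform prior).
train_bnb <- function(X, y, u = 1) {
  # X: data.frame of factors, y: factor of class labels
  fits <- lapply(levels(y), function(cl) {
    Xc <- X[y == cl, , drop = FALSE]
    list(prior  = mean(y == cl),
         counts = lapply(Xc, function(f) table(f) + u))
  })
  names(fits) <- levels(y)
  fits
}

# Predict: score each class by the log prior plus the summed log
# smoothed frequencies of the observed feature states.
predict_bnb <- function(fits, xnew) {
  scores <- sapply(fits, function(f) {
    lp <- log(f$prior)
    for (j in seq_along(xnew)) {
      cnt <- f$counts[[j]]
      lp  <- lp + log(cnt[as.character(xnew[[j]])] / sum(cnt))
    }
    lp
  })
  names(which.max(scores))
}
```

A test vector whose feature states were rarely seen for a class gets low smoothed frequencies for that class, and so a low score, which is the intuition described above.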

Anyway, there are three files. The first is
a straightforward port of Barber’s code, but this wasn’t very R-like, and in
particular only seemed to handle input features with the same number of states.

I developed my own version that expects everything to be represented as factors. It is all a bit rough and ready, but it appears to work, and there is a test/example script up here. As a bigger test I ran it on a sample car evaluation data set from here; the confusion matrix is as follows:

           testY
            acc  good  unacc  vgood
      acc    83     3     29      0
      good   16     5      0      0
      unacc  17     0    346      0
      vgood  13     0      0      6

That’s it for now. Comments/feedback appreciated. You can find me on twitter here.

Links to files:

Everything in one directory (with data) here
