I have been looking at equities a bit of late.
I am particularly interested in ranking a universe of equities for “low
frequency” manual trading on a weekly or monthly basis.
Every period I would rank each name on a
bunch of different factors, then invest in the highest-ranked ones for that
period.
I was initially working in R, but the code
grew unwieldy and I wanted a second opinion on my approach, so I took the time to
re-implement it in Python using pandas.
Setup
For each symbol in our universe, we load
the raw data and generate the information used for ranking. If we have 5 names,
we end up with 5 dataframes.
Then we combine those dataframes into one
big dataframe, and iterate through month by month, selecting the symbols that
meet our ranking criteria. From those selected, we equally weight and sum the next period returns.
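The combine-and-iterate step can be sketched roughly as follows. This is a minimal illustration, not the demo code itself: the symbols AAA/BBB/CCC, the numbers, the helper make_frame and the choice of top_n = 2 are all made up, and the moving-average filter is left out to keep the loop short.

```python
import pandas as pd

# Hypothetical month-end dates and per-symbol frames; all values are invented.
dates = pd.DatetimeIndex(["2015-01-31", "2015-02-28", "2015-03-31"], name="Date")

def make_frame(sym, avg, npr):
    """Build one per-symbol frame with a ranking score and next period return."""
    return pd.DataFrame({"avg": avg, "npr": npr, "sym": sym}, index=dates)

frames = [
    make_frame("AAA", [0.03, 0.01, 0.02], [0.010, -0.020, 0.000]),
    make_frame("BBB", [0.02, 0.04, 0.01], [0.020, 0.030, 0.000]),
    make_frame("CCC", [0.01, 0.02, 0.03], [-0.010, 0.010, 0.000]),
]

# One big frame: each date now appears once per symbol in the index.
combined = pd.concat(frames).sort_index()

top_n = 2  # assumed number of names to hold each month
monthly = {}
for date, month in combined.groupby(level=0):
    picks = month.nlargest(top_n, "avg")   # highest-ranked names this month
    monthly[date] = picks["npr"].mean()    # equal weights: sum of npr / n

portfolio_returns = pd.Series(monthly)
print(portfolio_returns)
```

Taking the mean of npr over the picks is the same as giving each pick a weight of 1/n and summing the weighted returns.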
One thing that is really cool about the
pandas dataframe is that it allows multiple rows with the same index value.
This makes it easy to get the data for the
month under consideration. We just pass the month to the indexer and get the
subset of data for that month, e.g.
>>> df.ix['2015-02']
                 cpr       npr       avg   over  sym
Date
2015-02-28  0.043302 -0.062449 -0.038914  False  DBC
2015-02-28 -0.025028  0.008524  0.006130   True  IEF
2015-02-28  0.056838 -0.014239  0.005434   True  VEU
2015-02-28 -0.037434  0.017171  0.015900   True  VNQ
2015-02-28  0.055832 -0.011697  0.009236   True  VTI
[5 rows x 5 columns]
>>>
In this example there are 5 symbols, and we
see the ranking information for February 2015.
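Note that .ix was deprecated in pandas 0.20 and removed in pandas 1.0. On current versions the same month lookup can be written with .loc, which also accepts a partial date string on a DatetimeIndex. A small sketch with made-up symbols and values:

```python
import pandas as pd

# Two hypothetical symbols over two month-ends; the index values repeat.
dates = pd.DatetimeIndex(
    ["2015-01-31", "2015-01-31", "2015-02-28", "2015-02-28"], name="Date"
)
df = pd.DataFrame(
    {"avg": [0.01, 0.02, 0.03, 0.04], "sym": ["AAA", "BBB", "AAA", "BBB"]},
    index=dates,
)

# Partial string indexing: every row whose date falls in February 2015.
feb = df.loc["2015-02"]
print(feb)
```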
Another option would be to use hierarchical
indexing, with a sub-index for each month, but this way worked for my needs and
I think is quite clean and simple.
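For comparison, the hierarchical version might look something like this. It is only a sketch of one possible layout, with the date as the outer level and the symbol as the inner level; the symbols and values are invented.

```python
import pandas as pd

dates = pd.DatetimeIndex(
    ["2015-01-31", "2015-01-31", "2015-02-28", "2015-02-28"], name="Date"
)
flat = pd.DataFrame(
    {"avg": [0.01, 0.02, 0.03, 0.04], "sym": ["AAA", "BBB", "AAA", "BBB"]},
    index=dates,
)

# Move the symbol into the index to get a (Date, sym) MultiIndex.
hier = flat.set_index("sym", append=True)

# Cross-section for one month-end; the result is indexed by symbol.
feb = hier.xs(pd.Timestamp("2015-02-28"), level="Date")
print(feb)
```

The trade-off is that each (date, symbol) pair becomes a unique key, at the cost of an extra index level to carry around.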
If anyone knows an equivalent in R that is
as clean and easy to work with for multiple time series I would love to hear about it.
Code Notes
The demo code does a simple backtest of
the GTAA/Relative Strength trend following system using ETFs.
I have stripped it down to the basics so hopefully it is easy to understand: load
the data, generate the dataframe with the info we want, make a combined
dataframe, then go through it month by month.
The ranking is done by filtering out names
trading below their 10-month moving average, then selecting the top n based on average
3-month return.
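Applied to the February 2015 rows shown earlier, the selection step could look roughly like this. It assumes that "over" flags a name trading above its moving average and that "avg" is the ranking score, which is how the column names read; n = 2 is an arbitrary choice for the example.

```python
import pandas as pd

# The February 2015 rows from the output shown earlier in the post.
month = pd.DataFrame(
    {
        "cpr": [0.043302, -0.025028, 0.056838, -0.037434, 0.055832],
        "npr": [-0.062449, 0.008524, -0.014239, 0.017171, -0.011697],
        "avg": [-0.038914, 0.006130, 0.005434, 0.015900, 0.009236],
        "over": [False, True, True, True, True],
        "sym": ["DBC", "IEF", "VEU", "VNQ", "VTI"],
    },
    index=pd.DatetimeIndex(["2015-02-28"] * 5, name="Date"),
)

n = 2  # assumed portfolio size

eligible = month[month["over"]]      # drop names below their moving average
picks = eligible.nlargest(n, "avg")  # top n by average return
next_return = picks["npr"].mean()    # equal-weighted next period return

print(picks["sym"].tolist(), next_return)
```

With these inputs DBC is filtered out, and VNQ and VTI have the two highest avg values among the remaining names.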
The “cpr” column is the current period
return, and the “npr” column is the next period return, which is the return
realized if we select a given security for that month.
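A per-symbol signal frame with these columns could be built along the following lines. This is a sketch under assumptions: build_signals is a hypothetical helper, the prices are synthetic, and "avg" is taken here to be the mean of the last three monthly returns, which may not match the definition used in the demo code.

```python
import pandas as pd

def build_signals(close, sym, ma_window=10, ret_window=3):
    """Hypothetical helper: ranking columns from monthly closing prices."""
    cpr = close / close.shift(1) - 1  # current period return
    return pd.DataFrame(
        {
            "cpr": cpr,
            "npr": cpr.shift(-1),  # next month's return, aligned to this row
            "avg": cpr.rolling(ret_window).mean(),  # assumed definition of avg
            "over": close > close.rolling(ma_window).mean(),
            "sym": sym,
        }
    )

# Synthetic month-end closes for a made-up symbol: 100, 101, ..., 113.
dates = pd.date_range("2014-01-31", periods=14, freq=pd.offsets.MonthEnd())
close = pd.Series([100.0 + i for i in range(14)], index=dates)

signals = build_signals(close, "AAA")
print(signals.tail())
```

Shifting cpr back by one row means npr on the last row is NaN, since the following month has not happened yet. Rows before the moving average has enough history show over as False.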
I found Python for Data Analysis a very useful book when working with pandas.