Saturday, November 24, 2007


I've waited 2 weeks for your reactions and was happy to receive many, either by e-mail or as a comment here on this site.

This is a short summary of those reactions:
- systems are likely over-optimized and lack adaptability to different market conditions
- more criteria than just the Sharpe ratio are needed to select a system
- longer timeframes (e.g. 3-5 years) are needed to select a system
- subscriber should ask vendor if fundamental analysis is part of the system (not relying on only technical analysis)
- results cannot be generalized to all C2 systems, since analysis was limited to end-of-day stock systems
- recent market conditions were unusual

Here's my own opinion. First, I agree that out of the 100 systems I included, a substantial number could have been suffering from over-optimization. The problem is that it is hard to tell from the Sharpe ratio (and perhaps from any other statistic) which systems are likely over-optimized and which aren't. That is, if you have 100 over-optimized systems, some of them will still show reasonable Sharpe ratios for a substantial amount of time after going live on C2.
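To illustrate the point, here is a minimal simulation sketch. It is not the analysis from my earlier posts: it assumes 100 hypothetical systems with no edge at all (normally distributed daily returns with a true mean of 0), a 1% daily volatility, a risk-free rate of 0, and two consecutive periods of 63 trading days each. All of these numbers are illustrative assumptions.

```python
import random
import statistics

TRADING_DAYS = 252  # assumed number of trading days per year


def sharpe(returns):
    """Annualized Sharpe ratio of daily returns (risk-free rate assumed 0)."""
    mean = statistics.fmean(returns)
    sd = statistics.stdev(returns)
    return mean / sd * TRADING_DAYS ** 0.5


def simulate(n_systems=100, days_per_period=63, daily_vol=0.01, seed=1):
    """Return (first-period, second-period) Sharpe pairs for zero-edge systems."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_systems):
        # The true mean return is 0, so any Sharpe ratio that shows up is luck.
        first = [rng.gauss(0.0, daily_vol) for _ in range(days_per_period)]
        second = [rng.gauss(0.0, daily_vol) for _ in range(days_per_period)]
        pairs.append((sharpe(first), sharpe(second)))
    return pairs


pairs = simulate()
# Select the 10 systems that looked best in the first period.
top10 = sorted(pairs, key=lambda p: p[0], reverse=True)[:10]
mean_first = statistics.fmean([p[0] for p in top10])
mean_second = statistics.fmean([p[1] for p in top10])
print("top-10 mean Sharpe, period 1:", round(mean_first, 2))
print("top-10 mean Sharpe, period 2:", round(mean_second, 2))
```

Under these assumptions the ten "best" systems of the first period should show a clearly positive average Sharpe ratio, while the same ten systems should average much closer to 0 in the second period, even though nothing about the systems changed.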

Second, using more selection criteria than just the Sharpe ratio might be a good idea. However, these other criteria would need to have low correlation with the Sharpe ratio and still measure reward/risk in some way. Very difficult to find such criteria... An additional problem is that as we add more criteria we also increase the risk of over-optimizing the selection process.

Third, I agree that the longer the timeframe, the better. Ideally, it should include a period of severe depression... Practically, we have to work with the data available, which currently covers about 4 years, a period in which the broader market gained a little less than 50%.

Fourth, having a good understanding of the system's underlying method might be a great selection criterion. The problem is, as a subscriber we can never reliably observe it. While the C2 stats are quite honest, we'll never know for sure about the vendor's underlying method as it depends entirely on what he says instead of on an independent third party's assessment.

Fifth, I also agree that results for futures, options, forex, and intraday stock systems might look better or worse than what I showed here for end-of-day stock systems. I'll see if I can replicate the analysis for these other categories. A big problem here is that it is questionable whether a subscriber could reproduce the hypothetical trades of some of the scalping systems.

Finally, the recent credit crunch has done a lot of damage (not only to C2 systems, but to hedge funds as well). Remember, though, that the analysis includes many systems that were not affected by the credit crunch at all. For example, in my first analysis (6 months, split into 2 periods of 3 months), only systems started in 2007 would be affected. But I'll see if I can redo the analysis without including data after May 2007.

One person also mentioned that some systems might have shut down without closing their positions. This could mean that results for the 2nd period look worse (or perhaps better...) than they really are, assuming a subscriber would have closed his positions once the vendor terminated the system. I understand this limitation, but as far as I can see it doesn't apply to the majority of the systems in the tables I showed.

As I explained before, these results were an important reason for me to suspend trading end-of-day C2 systems. Perhaps I will rerun the analyses a year from now, when we'll have longer histories, and see if things look more attractive then. To keep myself busy, I started a new C2 system myself: Kauai.


Anonymous said...


This might SOUND absurd, but it has been on my heart to relate from one scientist to another the following.

Perhaps there is a way to rate/analyze systems that CAN be traded in reverse fashion to capitalize on a set of systems with resounding and methodological failure.

Ideally, the systems selected would not only produce consistent and significant losses, but would also be easy to trade in reverse.

In other words, we would not need much methodology or exhaustively laid-out implementation criteria. Simply bet against 3-5 systems so that, when they rack up HUGE losses, we generate HUGE gains!

Do you think this can be done? And if so, maybe this "Science Trader's Blog" could document such a study. I'm actually serious and in no way mean any ill-fortune.


Jules Ellis said...

Hi ST,
After reading your conclusions I still have some comments. Let me first say that these analyses are very interesting and I think that they confirm what I conjectured on the C2 forum, namely that Sharpe ratios are subject to a regression effect. That is, if we see a system with a high Sharpe ratio, then this might just be a matter of luck and this luck will not necessarily repeat itself. Moreover, your analysis indicates that it is *often* a matter of luck. Now I have two questions.

The first question is theoretical: You have demonstrated that the *direction* in which the Sharpe ratios change is consistent with the regression effect, but is the *rate* of change also consistent with a regression effect? Remember that I made a plot for forex systems of Sharpe ratio and age. Most points were located neatly within the 95% confidence interval that applies if the true Sharpe ratio is 0 (confidence interval is not the correct name, but I hope you understand what I mean). If the Sharpe ratios fade out more slowly than can be expected by chance alone, then it might still be a good strategy to choose systems with a high Sharpe ratio.

Of course you cannot make the same plot as I did, because you studied the Sharpe ratios "longitudinally" (within-system) while I considered them "cross-sectionally" (between-system). Nevertheless, I would think that your results cannot be explained by a random chance process alone (without a market edge, that is). Because if the results were random, then the Sharpe ratios of the second period would have an average of 0. Well, I didn't check: perhaps the average is not significantly different from 0? Even so, that might be caused by a small sample size and a large variance.
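As a rough sketch of the chance-only benchmark described here: if returns are assumed to be independent and identically distributed and the true Sharpe ratio is 0, the standard error of an annualized Sharpe ratio is approximately 1 / sqrt(age in years), which gives the 95% band below. The second function is an ordinary one-sample t-statistic for testing whether an average Sharpe ratio differs from 0. The function names and the sample values are hypothetical and for illustration only; they are not taken from the C2 data.

```python
import math
import statistics


def zero_edge_band(age_years, z=1.96):
    """Approximate 95% range of an annualized Sharpe ratio when the true value is 0.

    Assumes independent, identically distributed returns, so the standard
    error of the estimate is roughly 1 / sqrt(age in years).
    """
    half_width = z / math.sqrt(age_years)
    return -half_width, half_width


def t_statistic_vs_zero(sharpes):
    """One-sample t-statistic for the hypothesis that the mean Sharpe ratio is 0."""
    n = len(sharpes)
    standard_error = statistics.stdev(sharpes) / math.sqrt(n)
    return statistics.fmean(sharpes) / standard_error


# Hypothetical second-period Sharpe ratios, for illustration only.
second_period = [0.4, -1.2, 0.9, 1.5, -0.3, 0.7, 2.1, -0.8, 0.2, 1.1]

print(zero_edge_band(0.25))  # band for a 3-month-old system
print(zero_edge_band(4.0))   # band for a 4-year-old system
print(t_statistic_vs_zero(second_period))
```

The band narrows only with the square root of the system's age, which is why a young system can show a very high Sharpe ratio by luck alone.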

The second question is practical: Even if the Sharpe ratio decreases *on average*, there might be one system that beats the odds consistently (e.g. as Team Aphid Bird in forex seems to do). In that case it would still be worthwhile to trade only that system. This is also of theoretical importance. There is a huge difference between saying "most traders are gambling" and "all traders are gambling".

Finally, I have a comment. I think that the high Sharpe ratios that we see at C2 are usually unrealistic. As I stated on the C2 forum, my conjecture is that in the long run a Sharpe ratio can at best be between 0.5 and 1. (I cannot prove this though). For higher Sharpe ratios I would definitely assume that a lot of luck is involved. Suppose that you studied only systems between 0.5 and 1. Then you will still see a partial regression effect, because some of the gambling traders will end there too. But I can imagine that some of these systems show a relatively stable Sharpe ratio.


jobtimizer said...

Maybe it's not a good idea to compare the measurement period with an equally sized trading period in the future.
Consider this strategy: Select a system using a long measurement period (say 12 months). Then trade the system for a much shorter period (say 3 months).
This strategy assumes that the trading strategy's edge does not stop immediately but fades out slowly and continuously after the measurement period.
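
A minimal sketch of the fading-edge assumption, not a tested result: suppose the edge (for example an annualized Sharpe ratio) decays exponentially once trading starts. The starting edge of 1.0 and the 6-month half-life are hypothetical numbers chosen only for illustration, as is the function name.

```python
import math


def average_edge(initial_edge, half_life_months, window_months):
    """Average edge over a trading window, assuming exponential decay.

    The edge is assumed to equal `initial_edge` when trading begins and to
    halve every `half_life_months`; both values are hypothetical.
    """
    rate = math.log(2) / half_life_months
    # Integral of initial_edge * exp(-rate * t) from 0 to window_months,
    # divided by the window length.
    decayed_fraction = 1 - math.exp(-rate * window_months)
    return initial_edge * decayed_fraction / (rate * window_months)


for window in (3, 12):
    print(window, "months:", round(average_edge(1.0, 6.0, window), 2))
```

If the edge really fades like this, a shorter trading window captures a larger share of it, which is the reasoning behind trading for 3 months rather than 12. If the edge was luck to begin with, the window length makes no difference.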


Forex Signals said...

Is the fact that 80% of the top-10 systems for the first 9 months underperformed the S&P 500 during the next 9 months, and in most cases by a substantial amount

