Conclusion
I've waited 2 weeks for your reactions and was happy to receive many, either by e-mail or as a comment here on this site.
This is a short summary of those reactions:
- systems are likely over-optimized and lack adaptability to different market conditions
- more criteria than just the Sharpe ratio are needed to select a system
- longer timeframes (e.g. 3-5 years) are needed to select a system
- subscribers should ask the vendor whether fundamental analysis is part of the system (rather than relying only on technical analysis)
- results cannot be generalized to all C2 systems, since analysis was limited to end-of-day stock systems
- recent market conditions were unusual
Here's my own opinion. First, I agree that a substantial number of the 100 systems I included could have been suffering from over-optimization. The problem is that it is hard to tell from the Sharpe ratio (and perhaps from any other statistic) which systems are over-optimized and which aren't. That is, if you have 100 over-optimized systems, some of them will still show reasonable Sharpe ratios for a substantial amount of time after going live on C2.
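To illustrate this point, here is a minimal simulation sketch. It is not the analysis from this post: it assumes that each system's daily returns are independent Gaussian draws with zero expected return (i.e. no edge at all), a hypothetical daily volatility of 1%, and a live period of 126 trading days. All function names and parameters are made up for the illustration.

```python
import math
import random
import statistics


def annualized_sharpe(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio of daily returns (risk-free rate assumed to be zero)."""
    mean = statistics.fmean(daily_returns)
    stdev = statistics.stdev(daily_returns)  # sample standard deviation
    return mean / stdev * math.sqrt(periods_per_year)


def simulate_zero_edge_systems(n_systems=100, n_days=126, daily_vol=0.01, seed=0):
    """Sharpe ratios of simulated systems whose true expected return is zero.

    Assumption: daily returns are i.i.d. Gaussian, which real systems are not.
    """
    rng = random.Random(seed)
    sharpes = []
    for _ in range(n_systems):
        returns = [rng.gauss(0.0, daily_vol) for _ in range(n_days)]
        sharpes.append(annualized_sharpe(returns))
    return sharpes


sharpes = simulate_zero_edge_systems()
lucky = [s for s in sharpes if s > 1.0]
print(f"{len(lucky)} of {len(sharpes)} zero-edge systems show a Sharpe ratio above 1.0")
print(f"best Sharpe ratio among them: {max(sharpes):.2f}")
```

Under these assumptions, some of the simulated systems end up with a respectable Sharpe ratio purely by chance, even though none of them has any edge. Over a short live period, a good Sharpe ratio alone therefore cannot separate a sound system from a lucky one.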
Second, using more selection criteria than just the Sharpe ratio might be a good idea. However, these other criteria would need to have a low correlation with the Sharpe ratio and still measure reward/risk in some way. Such criteria are very difficult to find... An additional problem is that as we add more criteria, we also increase the risk of over-optimizing the selection process.
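One way to check whether a candidate criterion adds information is to measure its correlation with the Sharpe ratio across systems. The sketch below does this for one hypothetical alternative, total return divided by maximum drawdown, on simulated return series. The data, the parameters, and the choice of criterion are all assumptions made for illustration and are not taken from the C2 systems analyzed here.

```python
import math
import random
import statistics


def sharpe(returns):
    """Per-period Sharpe ratio (risk-free rate assumed to be zero)."""
    return statistics.fmean(returns) / statistics.stdev(returns)


def return_over_max_drawdown(returns):
    """Total compounded return divided by the maximum peak-to-trough drawdown."""
    equity, peak, max_dd = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        max_dd = max(max_dd, (peak - equity) / peak)
    if max_dd == 0.0:
        return float("nan")  # ratio is undefined without any drawdown
    return (equity - 1.0) / max_dd


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical data: 100 systems, 250 daily returns each, i.i.d. Gaussian.
rng = random.Random(1)
systems = [[rng.gauss(0.0005, 0.01) for _ in range(250)] for _ in range(100)]

pairs = [(sharpe(s), return_over_max_drawdown(s)) for s in systems]
pairs = [(a, b) for a, b in pairs if not math.isnan(b)]  # drop undefined ratios
corr = pearson([a for a, _ in pairs], [b for _, b in pairs])
print(f"correlation between Sharpe ratio and return/max drawdown: {corr:.2f}")
```

A correlation close to 1 would mean the second criterion ranks systems much as the Sharpe ratio already does, so adding it would contribute little beyond extra room for over-optimizing the selection.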
Third, I agree that the longer the timeframe, the better. Ideally, it should include a period of severe depression... In practice, we have to work with the data available, which currently covers about 4 years, a period in which the broader market gained a little less than 50%.
Fourth, having a good understanding of the system's underlying method might be a great selection criterion. The problem is that, as subscribers, we can never reliably observe it. While the C2 stats are quite honest, we'll never know for sure about the vendor's underlying method, because it depends entirely on what the vendor says rather than on an independent third party's assessment.
Fifth, I also agree that results for futures, options, forex, and intraday stock systems might look better or worse than what I showed here for end-of-day stock systems. I'll see if I can replicate the analysis for these other categories. A big problem here is that it is questionable whether a subscriber could reproduce the hypothetical trades of some of the scalping systems.
Finally, the recent credit crunch has done a lot of damage (not only to C2 systems, but to hedge funds as well). Remember, though, that the analysis includes many systems that were not affected by the credit crunch at all. For example, in my first analysis (6 months, split into 2 periods of 3 months), only systems started in 2007 would be affected. But I'll see if I can redo the analysis without including data after May 2007.
One person also mentioned that some systems might have shut down without closing their positions. This could mean that results for the 2nd period look worse (or perhaps better...) than they really are, assuming a subscriber would have closed his positions once the vendor terminated the system. I understand this limitation, but as far as I can see it doesn't apply to the majority of the systems in the tables I showed.
As I explained before, these results were an important reason for me to suspend trading end-of-day C2 systems. Perhaps I will rerun the analyses a year from now, when we'll have longer histories, and see if things look more attractive then. To keep myself busy, I started a new C2 system myself: Kauai.