Help an old man settle a bet

Scorpion Man · #1 10-21-2006, 12:34 AM

It's been 15 years since I have taken stats...

I am hoping someone can give me some direction and shed at least some "directional" light on a discussion around hedge fund returns.

Let's say there are 2000 hedge funds of a minimum size that exist. Let's say we have the results of 8 randomly chosen funds. What have we learned and with what confidence about the sample overall? Is 8 enough to get any sort of helpful information? Do you need to know the standard deviation first? Let's say the numbers look like this:

L/S up 6.15% [these are all Net]
L/S up 3.53% [40% net long currently]
L/S up 8.98%
L/S up 7.52%

Multi - up 6.75%
Multi - up 11.16%
Multi - up 8.07%
Multi - up 15.3%

Let's say we know that the average standard deviation over a full year (these are 9-month numbers) is around 10% for the entire sample. There are those who say that this sample is "misleading" and "meaningless". Is that true in a statistical sense?

This is probably poorly presented... I apologize for that. I am pretty good at understanding this stuff if it's presented to me...

Thanks.

AaronBrown · #2 10-21-2006, 11:39 AM

First of all, the people who say the sample is meaningless are silly. Information has meaning. You know more after you've looked at eight randomly chosen returns than you did before.

Second, you're asking about using the sample to make inferences about the population. In other words, we're talking about the past returns of the 2,000 funds, not the future returns of any or all of them. Predicting the future involves another range of assumptions.

One thing you know without any assumption other than that the sample is truly random is that a 9th fund picked at random is equally likely to land in any position relative to the eight you have, from worst of the bunch to best. That's what a random sample means. So if you pick a fund at random, there is a 1/9 chance it will have a return worse than 3.53% and a 1/9 chance it will have a return better than 15.30%. There's also 1 chance in 9 the fund will be between any two successive returns, such as between 3.53% and 6.15%, or 8.07% and 8.98%.
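If you want to see the 1-in-9 argument in action, here is a rough simulation sketch. The population is made up (2,000 returns drawn from a bell curve with a mean of 8 and a standard deviation of 4, both numbers purely assumed for illustration), and the function name `rank_frequencies` is hypothetical. It draws 9 funds at random many times, treats the first 8 as the sample, and records which of the 9 slots the last fund lands in.

```python
import random

random.seed(0)

# Hypothetical population of 2,000 fund returns (in percent).
# The Gaussian shape and its parameters are assumptions for illustration only.
population = [random.gauss(8.0, 4.0) for _ in range(2000)]


def rank_frequencies(population, sample_size=8, trials=20000):
    """Estimate how often a new random fund lands in each of the
    sample_size + 1 slots defined by a random sample of funds."""
    counts = [0] * (sample_size + 1)
    for _ in range(trials):
        # Draw sample_size + 1 distinct funds; the last one is the "new" fund.
        draw = random.sample(population, sample_size + 1)
        sample, new_fund = draw[:-1], draw[-1]
        # Slot 0 = worse than every sampled fund, slot 8 = better than all.
        slot = sum(1 for r in sample if r < new_fund)
        counts[slot] += 1
    return [c / trials for c in counts]


freqs = rank_frequencies(population)
for slot, f in enumerate(freqs):
    print(f"slot {slot}: {f:.3f}  (1/9 = {1 / 9:.3f})")
```

Each slot should come out close to 0.111 whatever shape you give the population, since the argument relies only on random sampling.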

If you're willing to assume the funds' returns fall along a Normal (bell-shaped) distribution, you can be more precise. This is often a good assumption as long as you don't try to predict too far away from the mean (I wouldn't use it to predict the best of the 2,000 funds, or even the 20th best, but from 100th best to 100th worst, it's probably not a bad guess). The standard deviation of the fund within the year is not relevant without a lot more assumptions. You just know that the mean return of your sample is 8.43% with a standard deviation estimate of 3.54%. You use a t-distribution to make the following prediction:

5% 18.50%
10% 16.81%
25% 14.61%
50% 8.43%
75% 2.26%
90% 0.05%
95% -1.64%

The first column is the percentage of funds with the return in the second column or better. So based on your sample, you'd guess 95% of the funds did better than -1.64% but only 5% did better than 18.50%.
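For anyone who wants to rerun this kind of calculation, here is a minimal sketch using only the Python standard library. It assumes the textbook t-based prediction formula for a single new fund, mean + t * s * sqrt(1 + 1/n) with n - 1 = 7 degrees of freedom, which is one common choice and not necessarily the exact method behind the table. The helpers `t_pdf`, `t_cdf` and `t_quantile` are hypothetical names, and the t quantile is found by numerical integration and bisection rather than a statistics library.

```python
import math
import statistics

# The eight sampled returns from the first post, in percent.
returns = [6.15, 3.53, 8.98, 7.52, 6.75, 11.16, 8.07, 15.30]


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using symmetry about zero."""
    if x < 0:
        return 1 - t_cdf(-x, df, steps)
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Invert the CDF by bisection."""
    if p < 0.5:
        return -t_quantile(1 - p, df)
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


n = len(returns)
mean = statistics.mean(returns)
sd = statistics.stdev(returns)      # sample standard deviation (n - 1 divisor)
scale = sd * math.sqrt(1 + 1 / n)   # assumed spread for one new fund

# Share of funds expected to do at least this well -> return cutoff.
cutoffs = {}
for share in (0.05, 0.10, 0.25, 0.50, 0.75, 0.90, 0.95):
    cutoffs[share] = mean + t_quantile(1 - share, n - 1) * scale
    print(f"{share:4.0%}  {cutoffs[share]:6.2f}%")
```

The mean and standard deviation reproduce the 8.43% and 3.54% quoted above. Under this particular formula the 5% and 95% cutoffs come out at roughly 15.6% and 1.3%, a tighter band than the table, so the posted figures may rest on a different formula or inputs.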

Scorpion Man · #3 10-21-2006, 01:21 PM

Thanks so much, Aaron. Just to clarify...the numbers you posted are confidence intervals, correct?

AaronBrown · #4 10-21-2006, 01:34 PM

Not exactly. You can construct confidence intervals from my numbers. A 90% interval, for example, runs from 5% to 95%, so it's -1.64% to 18.50%. You expect that 9 randomly picked funds out of 10 will fall within this interval.

Scorpion Man · #5 10-21-2006, 05:34 PM

Thanks, that was as I understood it. One last thing, because it was core to what I was trying to determine (but I did not ask it well enough).

Obviously, our sample has a mean. It also has a standard deviation. What can we say about the chances that the sample mean and standard deviation are close to the true population mean and standard deviation? I read here:

http://www.itl.nist.gov/div898/handb...on3/eda352.htm

My confusion is this... are we saying that 90% of the sample means would be between -1.64% and 18.5%? Or are we saying 90% of the 2000 funds would have returns in that range?

AaronBrown · #6 10-21-2006, 06:37 PM

We predict, on the basis of this sample and the assumption that the returns follow a Normal distribution, that 90% of the 2,000 funds (that is, 1,800 of them) will have returns in the range. We would have a smaller confidence interval for the mean of a new sample of funds.
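To give a feel for how much smaller, here is a rough sketch comparing 90% half-widths under the usual t-based formulas, which are an assumption here and not necessarily what produced the figures earlier in the thread. The constant `T_95_7DF` is the one-sided 95% t critical value for 7 degrees of freedom as listed in standard tables, and the variable names are hypothetical.

```python
import math
import statistics

# The eight sampled returns from the first post, in percent.
returns = [6.15, 3.53, 8.98, 7.52, 6.75, 11.16, 8.07, 15.30]

T_95_7DF = 1.895  # t critical value, 7 degrees of freedom, 5% in each tail

n = len(returns)
mean = statistics.mean(returns)
sd = statistics.stdev(returns)

# 90% half-width for the return of one new randomly picked fund.
half_single = T_95_7DF * sd * math.sqrt(1 + 1 / n)

# 90% half-width for the mean of a new random sample of m funds (m = 8 here).
m = 8
half_new_mean = T_95_7DF * sd * math.sqrt(1 / n + 1 / m)

# 90% confidence half-width for the population mean itself.
half_pop_mean = T_95_7DF * sd / math.sqrt(n)

print(f"single fund:         {mean:.2f}% +/- {half_single:.2f}")
print(f"mean of 8 new funds: {mean:.2f}% +/- {half_new_mean:.2f}")
print(f"population mean:     {mean:.2f}% +/- {half_pop_mean:.2f}")
```

The three half-widths shrink in that order: a single fund is the hardest to pin down, the mean of a fresh sample is easier, and the population mean is the tightest of the three. With only 2,000 funds a finite-population correction would shave the last figure slightly, but for a sample of 8 the effect is negligible.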

People sometimes use data like this to predict the return of a fund next year. That is much more problematic.