PDA

View Full Version : Simple variance question (Microsoft Excel)


VarlosZ
02-15-2007, 06:44 AM
When using Excel to measure variance, what's the effective difference between the VAR function, which estimates variance based on a sample, and the VARP function, which calculates variance based on the entire population? Why would someone prefer to use one rather than the other?

Thanks.

djames
02-15-2007, 12:16 PM
When the full population of values is known, the true variance of the population is the sum of squared deviations from the mean, divided by the size of the population, VARP.

When only a sample from the full population is known, then the unbiased estimator of the population variance is the sum of squared deviations from the sample mean, divided by the size of the sample minus one, VAR.

arahant
02-15-2007, 06:12 PM
Just to add....you usually want to use VAR (at least for everything I've ever done). If you have a series and you want to guess at the variance for future observations, the series isn't the entire 'population'. It's kind of academic though, because if the sample is small enough that the two give significantly different answers, the variance estimate won't be very significant either.

BruceZ
02-16-2007, 09:34 PM
[ QUOTE ]
Just to add....you usually want to use VAR (at least for everything I've ever done). If you have a series and you want to guess at the variance for future observations, the series isn't the entire 'population'. It's kind of academic though, because if the sample is small enough that the two give significantly different answers, the variance estimate won't be very significant either.

[/ QUOTE ]

On the other hand, VARP, which divides by n, is the maximum likelihood estimator of the variance of a normal distribution. That is, it is the value of the variance that maximizes the likelihood of the observed data.

VAR, which divides by n-1, is the unbiased estimator of the variance for a normal distribution, meaning that the expected value of this estimator is equal to the true variance.

There is a third estimator (not necessarily in Excel) which divides by n+1, and this estimator minimizes the mean squared error of the estimate of the variance for a normal distribution.

These issues are largely theoretical since the difference is usually insignificant when a reasonable number of samples is used, and they converge to the same value as the number of samples goes to infinity.