Two Plus Two Newer Archives - View Single Post

jason1990 · #2 03-10-2007, 10:24 AM

Another Example. Let the X_j be independent, each equal to 1 or -1 with probability 1/2, and write S_n = X_1 + ... + X_n and Z_n = S_n/sqrt{n}. Let N = 3 if S_2 = 2, and N = 2 otherwise. (This is the case where I play 2 rounds for sure, but play a 3rd only if I won both of the first two.) In this case,

P(N = 3, X_3 = 1) = P(Z_N = 3/sqrt{3}) = 1/8
P(N = 3, X_3 = -1) = P(Z_N = 1/sqrt{3}) = 1/8
P(N = 2, S_2 = 0) = P(Z_N = 0) = 1/2
P(N = 2, S_2 = -2) = P(Z_N = -2/sqrt{2}) = 1/4

Hence,

E[(Z_N)^2] = 3(1/8) + (1/3)(1/8) + 2(1/4) = 11/12.

Therefore, Var(Z_N) <= E[(Z_N)^2] < 1.
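
For anyone who wants to check the arithmetic, here is a minimal sketch that enumerates all 8 equally likely outcomes of (X_1, X_2, X_3) and computes E[(Z_N)^2] exactly. The function names are my own and purely illustrative.

```python
from fractions import Fraction
from itertools import product


def stopping_time(xs):
    """N = 3 if the first two rounds were both wins (S_2 = 2), otherwise N = 2."""
    return 3 if xs[0] + xs[1] == 2 else 2


def expected_z_squared():
    """Exact E[(Z_N)^2] with Z_N = S_N / sqrt(N), summed over all 8 sign patterns."""
    total = Fraction(0)
    for xs in product((1, -1), repeat=3):
        n = stopping_time(xs)
        s_n = sum(xs[:n])  # only the first N rounds are actually played
        total += Fraction(s_n * s_n, n) * Fraction(1, 8)  # (Z_N)^2 = (S_N)^2 / N
    return total


print(expected_z_squared())  # 11/12
```

Working with (S_N)^2 / N instead of Z_N itself avoids the square root, so the result is an exact fraction.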

A Stronger Conjecture. Is it always the case that E[(Z_N)^2] <= 1?

An Observation. If N is independent of the sequence X_1, X_2, ..., then E[(Z_N)^2] = 1.
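
A sketch of why, assuming the X_j are independent with mean 0 and variance 1 (so that E[(S_n)^2] = n) and that N is independent of the whole sequence: condition on the value of N.

```latex
\begin{align*}
E\left[Z_N^2\right]
  &= \sum_{n} P(N = n)\, E\!\left[\frac{S_n^2}{n} \,\middle|\, N = n\right] \\
  &= \sum_{n} P(N = n)\, \frac{E\left[S_n^2\right]}{n}
     && \text{(independence of $N$ and the $X_j$)} \\
  &= \sum_{n} P(N = n)\, \frac{n}{n} = 1 .
\end{align*}
```

The second step is exactly what breaks in the example above: given N = 3 we already know S_2 = 2, so E[(S_3)^2 | N = 3] = 5 rather than 3, and given N = 2 we get E[(S_2)^2 | N = 2] = 4/3 rather than 2.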

A Connection to Poker. Imagine that X_j is the result of the j-th hand played by a break-even player, and that S_n is the result of a session that is n hands long. If this player tries to estimate his standard deviation using his session results (rather than his per-hand results), then Z_n will be a term that appears in that estimate.

In order for his estimate to be unbiased, E[(Z_n)^2] must equal his true per-hand variance, which is 1. This holds if n is not random, or if n is random but independent of his results X_j. But if his session lengths depend on the results of his play, then it can fail (as in the example above, where E[(Z_N)^2] = 11/12), and his estimate will be biased. The conjecture is that the bias always goes in one direction. That is, on average he will compute a standard deviation that is no higher than his true standard deviation.
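
To make the bias concrete, here is a rough Monte Carlo sketch. It assumes the session-based variance estimate is simply the average of (S_N)^2 / N over sessions (with the mean known to be 0), and it reuses the quitting rule from the example: play 2 hands, and a 3rd only after winning both. That estimator and all of the names below are my own assumptions for illustration, not necessarily what any tracking software computes.

```python
import random


def play_session(rng):
    """One session: 2 hands of +1/-1, plus a 3rd hand only if both were wins."""
    xs = [rng.choice((1, -1)) for _ in range(2)]
    if sum(xs) == 2:
        xs.append(rng.choice((1, -1)))
    return xs


def estimate_variances(num_sessions=200_000, seed=0):
    """Return (session-based, per-hand) variance estimates for a break-even player."""
    rng = random.Random(seed)
    session_sum = 0.0  # accumulates (S_N)^2 / N, i.e. (Z_N)^2, per session
    hand_sum = 0.0     # accumulates (X_j)^2 over every hand played
    hand_count = 0
    for _ in range(num_sessions):
        xs = play_session(rng)
        s = sum(xs)
        session_sum += s * s / len(xs)
        hand_sum += sum(x * x for x in xs)
        hand_count += len(xs)
    return session_sum / num_sessions, hand_sum / hand_count


session_est, hand_est = estimate_variances()
print(session_est, hand_est)
```

The per-hand estimate is exactly 1 here, since every (X_j)^2 equals 1, while the session-based estimate should settle near 11/12, below the true variance.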

Another poster and I have both observed this phenomenon in practice. The SD computed by PT (which is computed using session results) is consistently lower than the SD one gets by estimating directly from the per-hand data (using Excel or MATLAB, for example).