Re: The envelope problem, and a possible solution
[ QUOTE ]
Thank you for your kind words, although you could have left out "stubbornly." At least you didn't say "against all rational evidence."
[/ QUOTE ]
:) Sorry, I think I'm a little frustrated over this apparent miscommunication.
Okay, here are the assumptions:
[1] if A and B are events, then P(A | B) = P(A and B)/P(B),
[2] if A and B are independent events, then P(A | B) = P(A),
[3] if A and B are mutually exclusive events, then P(A or B) = P(A) + P(B),
[4] if A is an event, then 0 <= P(A) <= 1.
Here's the model we agreed on:
T = total amount in the envelopes
H = 1, if you picked the larger; 0, otherwise
X = (1 + H)T/3 = amount in chosen envelope
Y = (2 - H)T/3 = amount in other envelope
We never said it, but we should both agree that in this setup, T and H are defined so that they are independent, and H is defined so that P(H=0)=0.5. (And if someone wants to call T a parameter, that's fine too; just give it a point-mass distribution.) Let us also assume, for simplicity of exposition, that the distribution of T is supported on a countable set, i.e. that T is a discrete random variable. We will take it for granted that we can construct the underlying probability space without any difficulty.
The Claim in question is
P(Y = 2X | X = k) = 0.5 for all k.
In this setup, Y=2X iff H=0. So the Claim is simply
(*) P(H = 0 | X = k) = 0.5 for all k.
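To make the model concrete, here is a minimal sketch that computes P(H = 0 | X = k) exactly for one particular choice of T. The distribution T_DIST (uniform on {3, 6, 12}) and the function names are my own assumptions for illustration; nothing in the model forces that choice.

```python
from fractions import Fraction

# Hypothetical distribution for the total T: uniform on {3, 6, 12}.
# This is an assumption made only for illustration.
T_DIST = {3: Fraction(1, 3), 6: Fraction(1, 3), 12: Fraction(1, 3)}

def joint_distribution(t_dist):
    """Return {(h, x): probability}, with H independent of T and P(H=0) = 1/2."""
    joint = {}
    for t, p_t in t_dist.items():
        for h in (0, 1):
            x = Fraction((1 + h) * t, 3)  # X = (1 + H)T/3
            joint[(h, x)] = joint.get((h, x), Fraction(0)) + Fraction(1, 2) * p_t
    return joint

def prob_smaller_given_x(t_dist, k):
    """P(H = 0 | X = k) computed via Assumption [1]; requires P(X = k) > 0."""
    joint = joint_distribution(t_dist)
    p_x = sum(p for (h, x), p in joint.items() if x == k)
    return joint.get((0, Fraction(k)), Fraction(0)) / p_x

for k in (1, 2, 4, 8):
    print(k, prob_smaller_given_x(T_DIST, k))
```

For this particular T, the conditional probability is 1 at k = 1, 1/2 at k = 2 and k = 4, and 0 at k = 8, so (*) already fails at the ends of the range.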
[ QUOTE ]
I think you keep repeating one side of the paradox, and wondering why I can't see something so simple. I do see it. But there's the other side as well, the one that says the probability that you have the smaller amount is 50% so the expected value of switching is positive. You can't refute a paradox by strengthening one side, that just makes the paradox more puzzling. You have to show why one side is wrong.
[/ QUOTE ]
Okay, one side says that Claim (*) is true. The argument which is typically given is some variant of the following: the event {X=k} gives no meaningful information regarding the event {H=0}. These events are therefore independent (for any k). Hence, by Assumption [2],
P(H = 0 | X = k) = P(H = 0) = 0.5.
The problem with this is that the above argument uses a heuristic, not a mathematical, definition of independence. There are many examples of events whose mathematical dependence or independence does not match up with our heuristic notion of this concept. Moreover, not everyone's heuristic understanding of independence would allow them to conclude this.
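One standard example of such a mismatch (my own illustration, not part of the envelope setup): roll a fair die once, let A be "the roll is even" and B be "the roll is at most 4". Both events describe the same roll, so heuristically they might look related, yet they satisfy the mathematical definition of independence. A quick check, with the helper name prob being hypothetical:

```python
from fractions import Fraction

# One roll of a fair six-sided die (illustrative only, not the envelope model).
OUTCOMES = range(1, 7)

def prob(event):
    """Probability of an event (a set of outcomes) under the uniform measure."""
    return Fraction(len([w for w in OUTCOMES if w in event]), 6)

A = {2, 4, 6}     # the roll is even
B = {1, 2, 3, 4}  # the roll is at most 4

# Mathematical definition: A and B are independent iff P(A and B) = P(A)P(B).
print(prob(A & B), prob(A) * prob(B))    # 1/3 1/3
print(prob(A & B) == prob(A) * prob(B))  # True
```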
In order for the argument on this side to carry any weight, we would need to mathematically verify that these events are independent (for all k). Thus, we need to verify that
(**) P(H = 0 and X = k) = P(H = 0)P(X = k) = 0.5P(X = k)
for all k. But according to Assumption [1], this is exactly equivalent to Claim (*) (for any k with P(X = k) > 0, just divide both sides by P(X = k)). So this side of the argument does nothing but try to prove Claim (*) by applying heuristics to an equivalent formulation of it.
Now, one might argue that these heuristics "ought" to be true. But our mathematical definition of independence stands firm: A and B are independent if P(A and B)=P(A)P(B). This definition is intimately tied to Assumptions [1] and [2]. If you want to "force" these heuristics to be valid, then you will at some point need to deny one of these Assumptions.
[ QUOTE ]
I accept all four of your assumptions, and don't know anyone who does not. But how do they lead to the conclusion?
[/ QUOTE ]
Okay, so now we have the other side of the argument in which we try to show Claim (*) is false. We first compute
P(H = 0 and X = k) = P(H = 0 and (1 + H)T/3 = k)
= P(H = 0 and T = 3k)
= P(H = 0)P(T = 3k)
= 0.5P(T = 3k).
Similarly,
P(H = 1 and X = k) = P(H = 1 and (1 + H)T/3 = k)
= P(H = 1 and T = 3k/2)
= P(H = 1)P(T = 3k/2)
= 0.5P(T = 3k/2).
Hence, by Assumption [3],
P(X = k) = P(H = 0 and X = k) + P(H = 1 and X = k)
= 0.5P(T = 3k) + 0.5P(T = 3k/2).
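As a sanity check on this formula, the following sketch compares it against a brute-force enumeration over (T, H). The distribution T_DIST and the function names are assumptions chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical distribution of T, chosen only for illustration.
T_DIST = {3: Fraction(1, 2), 6: Fraction(1, 4), 12: Fraction(1, 4)}

def p_t(value):
    """P(T = value); zero outside the support."""
    return T_DIST.get(value, Fraction(0))

def p_x_formula(k):
    """P(X = k) = 0.5 P(T = 3k) + 0.5 P(T = 3k/2)."""
    k = Fraction(k)
    return Fraction(1, 2) * p_t(3 * k) + Fraction(1, 2) * p_t(3 * k / 2)

def p_x_enumerated(k):
    """P(X = k) by summing over all (T, H) pairs with (1 + H)T/3 = k."""
    total = Fraction(0)
    for t, p in T_DIST.items():
        for h in (0, 1):
            if Fraction((1 + h) * t, 3) == k:
                total += Fraction(1, 2) * p  # P(H = h) = 1/2, independent of T
    return total

for k in (1, 2, 4, 8):
    print(k, p_x_formula(k), p_x_enumerated(k))
```

The two columns agree for every k, and the four values sum to 1, as they should for this T.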
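Now let us assume Claim (*) is true. This is equivalent to Claim (**), so substituting the above computations into (**) gives
0.5P(T = 3k) = 0.5[0.5P(T = 3k) + 0.5P(T = 3k/2)] for all k.
Multiplying by 4 and cancelling P(T = 3k) from both sides, this becomes
P(T = 3k) = P(T = 3k/2) for all k.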
Choose some a>0 such that P(T=a)>0. Let d=P(T=a). If we let k=2a/3, then the above implies that P(T=2a)=P(T=a)=d. By induction, P(T=(2^n)*a)=d for all n. By Assumption [3], for all positive integers M,
P(T = a or T = 2a or ... or T = (2^{M-1})*a) = Md.
If we choose M such that Md>1, then this contradicts Assumption [4], so Claim (*) cannot be true. Or, equivalently, it is not the case that {H=0} and {X=k} are independent for all k.
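The last step is just arithmetic, and a small sketch makes it explicit: given d = P(T = a) > 0, find the smallest M for which the M points a, 2a, ..., (2^{M-1})*a would carry total mass Md > 1. The function name and the sample value d = 3/10 are hypothetical.

```python
from fractions import Fraction
from math import floor

def smallest_violating_m(d):
    """Smallest positive integer M with M*d > 1, for 0 < d <= 1."""
    d = Fraction(d)
    if not 0 < d <= 1:
        raise ValueError("d must satisfy 0 < d <= 1")
    return floor(1 / d) + 1

d = Fraction(3, 10)  # hypothetical value of P(T = a)
M = smallest_violating_m(d)
print(M, M * d)  # 4 6/5
```

So with d = 3/10, already the four points a, 2a, 4a, 8a would have total probability 6/5, which Assumption [4] forbids. The same happens for any d > 0, only with a larger M.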
Again, if you want to deny Assumptions [1] and/or [2] and redefine independence to more closely match the heuristic ideas in the Paradox, then fine. Another possibility is to deny Assumption [4] and try to permit the use of "unnormalizable priors" (e.g. a uniform distribution on the naturals). I really don't know if any attempts have been made to develop such theories. I certainly have never heard of any. But I would consider such theories ad hoc, and I definitely think that any theory which does not assume [1]-[4] is well outside the realm of what is commonly used by practicing statisticians.