Breaking sticks, not throwing them - Page 2

jason1990 · #11 02-21-2007, 10:09 AM

[ QUOTE ]
Two equally likely things can happen with the first break, we can get a piece longer than 0.5 (the first 1 in the denominator) or we can continue (the rest of the denominator). If we continue, three equally likely things can happen. 2 of them end the game (the 2) and one of them continues (the rest of the fraction). And so on.

[/ QUOTE ]
As pzhon pointed out, these events are not equally likely. There are several ways to construct the break points X_1, X_2, .... I will describe 1 valid method and 1 invalid method. Under the invalid method, these events are equally likely. Under the valid method, they are not. For simplicity, I will only construct the first 2 break points. Note that in order to verify that a method is valid, we must check that (1) X_1 ~ Uniform(0,1), and (2) the distribution of X_2 - X_1, given X_1, is Uniform(0, 1 - X_1). To check condition (2), we need to verify that, for x < y, P(X_2 < y | X_1 = x) = (y - x)/(1 - x).

First, the invalid method. This I will call the conditioning method. In this, we choose two points independently and uniformly from [0,1]. If the second point is smaller than the first, we throw away both points and try again. This amounts to simply choosing the point (X_1, X_2) uniformly from the triangle {0 < X_1 < X_2 < 1}. With this method, it is easy to verify condition (2). However, condition (1) fails, showing that this method is invalid. With this method, it is also easy to compute

P(X_2 in [X_1, 0.5] | X_1 < 0.5) = 1/3
P(X_2 in [0.5, X_1 + 0.5] | X_1 < 0.5) = 1/3
P(X_2 in [X_1 + 0.5, 1] | X_1 < 0.5) = 1/3.

So these events are equally likely. Unfortunately, since this is an invalid construction, this fact is irrelevant for the original problem.

The valid method, I will call the resampling method. We again choose two points independently and uniformly from [0,1]. This time, though, if the second point is smaller than the first, we throw away only the second point. To write this down formally, construct an iid sequence Y_1, Y_2, ... of Uniform(0,1)'s. Let N be the first index such that Y_N > Y_1. Define X_1 = Y_1 and X_2 = Y_N. Clearly, X_1 = Y_1 is Uniform(0,1), so condition (1) is satisfied. To verify condition (2), take x < y and calculate

P(Y_N < y | Y_1 = x)
= \sum_{n=2}^\infty P(N = n, Y_n < y | Y_1 = x).

If we are given that Y_1 = x, then in order to have N = n and Y_n < y, we must have Y_i < x for all i strictly between 1 and n, and x < Y_n < y. The probability of this is x^{n-2}(y - x). Summing this from 2 to infinity gives (y - x)/(1 - x). Hence, condition (2) is satisfied and this is a valid construction.

Now, it is true that

P(Y_2 in [0, Y_1] | Y_1 < 0.5) = 1/4
P(Y_2 in [Y_1, 0.5] | Y_1 < 0.5) = 1/4
P(Y_2 in [0.5, Y_1 + 0.5] | Y_1 < 0.5) = 1/4
P(Y_2 in [Y_1 + 0.5, 1] | Y_1 < 0.5) = 1/4.

If the first case happens, we are going to throw away Y_2 and choose a new point Y_3. But when we do that, the analogous events for Y_3 are no longer equally likely. The fact that we have thrown away Y_2 will tell us that the interval [0, Y_1] is longer than we would expect it to be without that information. For example, one can calculate that

P(Y_3 in [0, Y_1] | Y_1 < 0.5 and Y_2 in [0, Y_1]) = 1/3.

All in all, this means that Y_N is not equally likely to be in the three intervals [Y_1, 0.5], [0.5, Y_1 + 0.5], [Y_1 + 0.5, 1]. To verify this rigorously, it suffices to check that P(Y_N > 0.5 | Y_1 < 0.5) is not equal to 2/3. Using our earlier computations, we have

P(Y_N > 0.5 | Y_1 < 0.5) = P(Y_N > 0.5, Y_1 < 0.5)/P(Y_1 < 0.5)
= 2P(Y_N > 0.5, Y_1 < 0.5)
= 2\int_0^{0.5} P(Y_N > 0.5 | Y_1 = x) dx
= 2\int_0^{0.5} 0.5/(1 - x) dx
= ln(2),

which is not 2/3.

So the heuristic argument that led to the above quote is false. For future reference, I will call this the false heuristic.

[ QUOTE ]
What I actually solved for was the expected number of breaks less than 0.5

[/ QUOTE ]
Let us pretend for now that the false heuristic is actually true. We will use it to compute the expected number of breaks less than 0.5. Under the false heuristic, there will be exactly 1 break less than 0.5 with probability (1/2)(2/3). There will be exactly 2 breaks less than 0.5 with probability (1/2)(1/3)(3/4). In general, there will be exactly n breaks less than 0.5 with probability (n + 1)/(n + 2)! So the expected number of breaks less than 0.5 is

S = \sum_{n=1}^\infty n(n + 1)/(n + 2)!

Define

f(x) = (e^x - 1)/x
= \sum_{n=1}^\infty x^{n - 1}/n!

By differentiating the series term-by-term, it is easy to verify that f''(1) = S. We then compute

f'(x) = (xe^x - e^x + 1)/x^2
f''(x) = (x^3 e^x - 2x^2 e^x + 2xe^x - 2x)/x^4.

Thus S = e - 2 = 0.7182818... Your first continued fraction evaluates to 0.69777... This is shown here and was originally linked by pzhon. Your second continued fraction evaluates to ln(2) = 0.69314... Neither of them are equal to e - 2. So if you used the false heuristic to calculate the expected number of breaks less than 0.5, you would not be led to either of the continued fractions which you mentioned.

Edit: So as not to confuse anyone, I want to emphasize that e - 2 is not the expected number of breaks less than 0.5. It is what the expected number would be if the false heuristic were correct.

djames · #12 02-21-2007, 11:08 AM

Jason, nice post!

AaronBrown · #13 02-21-2007, 02:27 PM

Thanks for the useful post, but I disagree. You have relied upon the fallacy of the excluded middle. It's true that if I throw the second point away I get the wrong answer, and also if I treat it as a break. However, I keep the point, but do not treat it as a break. That allows me to keep all three events equally likely, but also to conform to the calculation of the problem. Putting this another way, I draw more uniform random variates than I make breaks in the stick.

Unfortunately, this is what causes things to get complicated after the first three terms. Things are simple before the first break (1/1), after the first break (1/(1+1/2)), and after the second break (1/(1+ 1/(2+1/3)). But there may not be a second break, which is what requires a more complex numerator for the fourth term. It is, unfortunately, an infinite sum with no obvious (to me) closed form representation. I wish it were 4, in which case I would have a very nice result, but it's not.

What I mean by there may not be a second break is that after the first break, we may draw one or more uniform variates that do not create breaks. Therefore, instead of 4 equally likely events we may have 6, 8 or more equally likely events for the second break. In my solution, I index the breaks by the number of uniform random variates drawn before each break occurs. This keeps everything uniform, but it means there may be no break with index 2 (or any other number greater than 1), not because the game is over but because the variate corresponding to that index was less than an earlier variate.

This still qualifies as an analytic solution, because the expression can be evaluated exactly by a computer program. This is not simulation, there are no random numbers. It's deterministic code that sums and divides to any desired degree of accuracy (unfortunately, it has to run backward from the chosen last point, it's not like an infinite sum that can be computed with increasing precision at each step). I can't prove that it converges to ln(2), unless you accept the argument that since we know the answer is ln(2) from solutions 1 and 2, it must the the convergent value of the process.

This was my first approach to this problem. I thought it was a clever solution, until I discovered the series got complicated after the third term. I slogged through it asking the computer to evaluate the number, but by the time I did that, I already had solutions 1 and 2; and knew the answer would be complicated because the continued fraction for ln(2) is complicated.

I still think there's something interesting here, and someone might build on this solution for an important insight.

PairTheBoard · #14 02-21-2007, 03:06 PM

For the puposes of the Forum Archives there should be a link to the thread that spawned this one.

Link to Original Thread

I didn't think much of jason's original problem since I couldn't see a nice way to approach it. Suprisingly to me, it has turned into a very interesting topic on several levels. All we need now is for Brandi to jump into the fray. There's a rumor going around that she's got a crush on Aaron's avatar.

PairTheBoard

jason1990 · #15 02-21-2007, 03:11 PM

I do not know whether or not you are genuinely trying to communicate something meaningful to me with this post. Based on recent events, I must unfortunately conclude that if your post does not contain precise and unambiguous claims, together with formal proofs, then it is most likely in my best interest to ignore it. I can only repeat the main conclusion of my previous post:

Let X_1, X_2, X_3, ... be the break points in the original problem. Construct them any way you please, so long as the sequence has the distribution specified in the original problem. Then, conditioned on the event {X_1 < 0.5}, the three events

X_2 in [X_1, 0.5]
X_2 in [0.5, X_1 + 0.5]
X_2 in [X_1 + 0.5, 1]

are not equally likely.

This fact has been proven in complete and rigorous detail. It therefore does not matter whether you agree with it or not. If you post some formal claims and rigorous proofs, then I will be happy to continue this conversation. Otherwise, I will most likely have to ignore any subsequent posts of yours.

AaronBrown · #16 02-21-2007, 03:31 PM

[ QUOTE ]
Let X_1, X_2, X_3, ... be the break points in the original problem. Construct them any way you please, so long as the sequence has the distribution specified in the original problem. Then, conditioned on the event {X_1 < 0.5}, the three events

X_2 in [X_1, 0.5]
X_2 in [0.5, X_1 + 0.5]
X_2 in [X_1 + 0.5, 1]

are not equally likely.

[/ QUOTE ]
Agreed.

Draw U_1, U_2, . . . independently from a uniform 0,1 distribution. If U_i > Max(U_1, U_2, . . . , U_i-1), call it a break. In my solution, I call it the ith break, because i is the useful index for the rest of the calculation. But you insist on indexing by the number of breaks. So you are going to call it X_j where j <= i. I also need to use the ordered values of U, call them O_1 (minimum of the U_i's), O_2 (second smallest) and so on. If U_i = O_i then and only then, U_i is a break.

Let X_2 = U_k where k >=2. There are k+1 equally likely events:

X_2 < 0.5
0.5 < X_2 < 0.5 + O_1
0.5 + O_1 < X_2 < 0.5 + O_2
. . .
0.5 + O_k-1 < X_2 < 1

I would sincerely appreciate it if you would thaw your tone a bit. I might be wrong on my math, and you might be right. But we're discussing an approach to solving the problem. I hope you'll agree that there's no question of me lying or posting nonsense or trying to mislead anyone. You are welcome to be as outspoken as you like on your thread, but this thread is for getting the math right, not proving any person good or bad.

PairTheBoard · #17 02-21-2007, 04:12 PM

[ QUOTE ]
Thanks for the useful post, but I disagree. You have relied upon the fallacy of the excluded middle. It's true that if I throw the second point away I get the wrong answer, and also if I treat it as a break. However, I keep the point, but do not treat it as a break. That allows me to keep all three events equally likely, but also to conform to the calculation of the problem. Putting this another way, I draw more uniform random variates than I make breaks in the stick.

[/ QUOTE ]

Could you explain what you mean by the "fallacy of the excluded middle" and how it relates to jason's argument/proof? I understand the "exluded middle" to be the assumption that an admissable statement is either true or false. Bringing in the proposition that this is a fallacy felatiizing the argument seems pretty left field for the simple probabilty problem we're looking at.

You say, "It's true that if I throw the second point away I get the wrong answer, and also if I treat it as a break. However, I keep the point, but do not treat it as a break."

I assume this is in relation to your statement in the solution, "If there is a second break, the first break was between 0 and 0.5. Draw a number from a uniform distribution from 0 to 1. Four equally likely things can happen. It can be lower than the first number, greater than the first number but less than 0.5, greater than 0.5 but less than the first number plus 0.5 or greater than the first number plus 0.5. In the first case, it is not a break, so there are three remaining cases."

This looks very confused to me. If the second number is less than the first it's not a break but you don't throw it away. You keep it so you can use it as a break in the other three cases where it's not less than the first number? I don't think this makes sense. You don't throw it away, you just sort of ignore that case. The thing is, if it is less than the first number you have to draw a third number and as jason points out, that is more likely to happen if the first number is a large one. If it's more likely that the first number is large when the third number is required, then the four cases for the third number won't be equally likely.

For example, if the first number is on the large side, like say 0.4, making it more likely a third number is required, then clearly the intervals [0,.4], [.4,.5], [.5,.9], [.9,1] are not equally likely for the third number. I don't think the appeal to some kind of "fallacy of the exluded middle" is an adequate answer to this objection.

PairTheBoard

jason1990 · #18 02-21-2007, 04:43 PM

[ QUOTE ]
Draw U_1, U_2, . . . independently from a uniform 0,1 distribution. If U_i > Max(U_1, U_2, . . . , U_i-1), call it a break. In my solution, I call it the ith break, because i is the useful index for the rest of the calculation. But you insist on indexing by the number of breaks. So you are going to call it X_j where j <= i.

[/ QUOTE ]
We are discussing the problem which is posted here . In order to be more consistent with notation for random variables, I have begun using X_j instead of x_j. Are you claiming that there is a random increasing sequence N(j) such that the sequence

U_{N(1)}, U_{N(2)}, ...

has the same distribution as X_1, X_2, ... If so, would you please define N(j) and prove that the two sequences have the same distribution?

[ QUOTE ]
I also need to use the ordered values of U, call them O_1 (minimum of the U_i's), O_2 (second smallest) and so on. If U_i = O_i then and only then, U_i is a break.

[/ QUOTE ]
I think you mean to index O by two subscripts rather than one. After all, you cannot order the entire infinite sequence {U_i} since it will not have a smallest element.

[ QUOTE ]
Let X_2 = U_k where k >=2. There are k+1 equally likely events:

[/ QUOTE ]

Which k is this? Is this the first k such that U_k is a break? If so, then by your definition, it is the first k such that U_k > Max(U_1, U_2, . . . , U_{k-1}). In that case, k is a random variable and it is therefore extremely unclear what you mean when you say there are k+1 equally likely events.

[ QUOTE ]
X_2 < 0.5
0.5 < X_2 < 0.5 + O_1
0.5 + O_1 < X_2 < 0.5 + O_2
. . .
0.5 + O_k-1 < X_2 < 1

[/ QUOTE ]
Again, this appears to be a random number of events, so your statement is unclear. Could you please make your statement more precise and then include a proof of that statement?

[ QUOTE ]
I would sincerely appreciate it if you would thaw your tone a bit. I might be wrong on my math, and you might be right. But we're discussing an approach to solving the problem. I hope you'll agree that there's no question of me lying or posting nonsense or trying to mislead anyone. You are welcome to be as outspoken as you like on your thread, but this thread is for getting the math right, not proving any person good or bad.

[/ QUOTE ]
I do not understand why you would post this paragraph, since this thread is supposed to be about mathematics only. I will not respond to it here. My response is here.

AaronBrown · #19 02-21-2007, 04:49 PM

[ QUOTE ]
Could you explain what you mean by the "fallacy of the excluded middle" and how it relates to jason's argument/proof?

[/ QUOTE ]
Jason1990 considered two approaches when you draw a uniform variate that is smaller than a previously drawn one. He said you either have to throw it away or keep it. In fact, you can do something in between. You need to keep it to form the intervals for later draws, but you don't count it as a break for computing the probability of winning or losing.

It's true that if you are forced to one extreme or the other, my solution is not correct. But I chose the middle course.

PairTheBoard · #20 02-21-2007, 05:11 PM

[ QUOTE ]
[ QUOTE ]
Could you explain what you mean by the "fallacy of the excluded middle" and how it relates to jason's argument/proof?

[/ QUOTE ]
Jason1990 considered two approaches when you draw a uniform variate that is smaller than a previously drawn one. He said you either have to throw it away or keep it. In fact, you can do something in between. You need to keep it to form the intervals for later draws, but you don't count it as a break for computing the probability of winning or losing.

It's true that if you are forced to one extreme or the other, my solution is not correct. But I chose the middle course.

[/ QUOTE ]

If the second number is less than the first you "keep it to form the intervals for later draws"? ok, let's look at an example. Suppose your first number is 0.4 and your second number is 0.2. What number do you now consider your first break and how do you use the 0.2 to form the intervals for the third number drawn?

PairTheBoard

View Poll Results: Does this make me gay?
yes	36	38.30%
no	25	26.60%
no, but lots of other stuff you do does	33	35.11%
Voters: 94. You may not vote on this poll