Two Plus Two Newer Archives

Two Plus Two Newer Archives (http://archives1.twoplustwo.com/index.php)
-   Probability (http://archives1.twoplustwo.com/forumdisplay.php?f=27)
-   -   April 2007 IBM Ponder This Challenge (http://archives1.twoplustwo.com/showthread.php?t=374930)

pzhon 04-10-2007 05:27 AM

April 2007 IBM Ponder This Challenge
 
Many reading this forum may like this month's Ponder This challenge from IBM, which can be viewed as a probability problem:
<ul type="square">
This month's puzzle concerns a frog who is hopping on the integers from minus infinity to plus infinity. Each hop is chosen at random (with equal probability) to be either +2 or -1. So the frog will make steady but irregular progress in the positive direction. The frog will hit some integers more than once and miss others entirely. What fraction of the integers will the frog miss entirely? Please find an exact answer.[/list]

neverforgetlol 04-10-2007 03:09 PM

Re: April 2007 IBM Ponder This Challenge
 
Someone who's good at stochastic processes might figure it out

f97tosc 04-10-2007 07:09 PM

Re: April 2007 IBM Ponder This Challenge
 
We can reformulate the problem to the question "what is the probability that the frog will not eventually land on 0, assuming it starts from minus infinity?".

Given that the frog is in n, denote the probability that it will eventually reach 0 as P(n).

Then generally

P(n)=0.5P(n-1)+0.5P(n+2)

or equivalently

P(n+3)-2P(n+1)+P(n)=0

This is a difference equation with the characteristic equation

r^3-2r+1=0

This has the solutions (guess the first root, factor it out, and solve for the
others)

r1=1
r2=-1/2 + sqrt(5)/2
r3=-1/2 - sqrt(5)/2

Now P(n)=C1*r1^n + C2*r2^n + C3*r3^n

Where C1, C2, and C3 are constants.

We can determine the constants from edge conditions. However, the edge conditions are different for n&gt;0 and n&lt;0, and so we will consider them separately, starting with n&gt;0.

For n&gt;0, C3 has to be zero, since r3 is negative and has the largest absolute value; otherwise we would evetually get negative probabilities.

Similarly, since |r2|&lt;1, but r1=1 and P(inf)=0, we must have that C1=0. Finally, the edge condition P(0)=1 gives us C2=1.

Hence, for n&gt;0, P(n)= (-1/2+sqrt(5)/2)^n, and specifically, P(1)=-1/2+sqrt(5)/2


Now for n&lt;=1, we have the same characteristic equation, but different edge conditions, and different constants as follows:

P(n)=D1*r1^n + D2*r2^n + D3*r3^n

r1,r2, and r3 are the same as for the n&gt;0 case. Since |r2|&lt;1, the second term would make the probability &gt;1 for sufficiently low n,and so D2=0.

This time we have the edge conditions P(0)=1, and P(1)= -1/2+sqrt(5)/2, which we determined in the n&gt;1 case.

Putting these together we have

P(0)=D1 +D3 =1
P(10= D1 + (-1/2 -sqrt(5)/2)*D3 = -1/2 + sqrt(5)/2.

This is two linear equations for two unknowns, which can be solved for
D1 = 2 sqrt(5)/(3+sqrt(5))

As stated in the beginning, we seek

1-P(-infty)=

=1-D1

=1- 2 sqrt(5)/(3+sqrt(5))

=(7-3sqrt(5))/2 ~= 0.146

lastcardcharlie 04-10-2007 07:30 PM

Re: April 2007 IBM Ponder This Challenge
 
Just to be pedantic (it's late), "minus infinity" and "plus infinity" are not integers.

alThor 04-10-2007 11:25 PM

Re: April 2007 IBM Ponder This Challenge
 
This problem (and your solution) remind me of the risk-of-ruin problems with integer payoffs. Such problems have been solved, e.g. when you lose one unit or win k units (2 in this case).

So one can also view the problem this way: consider the first time you "pass" any integer (say zero). There is a 50% chance that you hit it, and a 50% chance you jumped over it. If you jumped over it (and landed on one), you can ask the ROR question where you start with a bankroll of one unit, and are playing a game where you lose one or win two. Then,

P(hit zero) = .5 + .5*(ROR)

Solving the ROR uses the same style of difference equations you did, so this isn't really far from your answer; just making an observation on the connection.

f97tosc 04-11-2007 09:11 AM

Re: April 2007 IBM Ponder This Challenge
 
This was actually how I started to think about the problem but after I solved for "ROR" I realized that I don't think it is true that you initially "pass" the integer with 50% chance. And indeed, when I solved for the approach side as well, the answer I got was different. Of course I could have made an error somewhere when solving those equations, but I am quite skeptical of the 50% pass claim now.

jason1990 04-11-2007 11:24 AM

Re: April 2007 IBM Ponder This Challenge
 
I get a different answer. Let S_n be the position of the frog, with S_0 = 0. Let A(k) be the event that the frog eventually visits site k, and let I(k) = 1 if A(k) occurs and 0 otherwise. Define

R_L = (2L)^{-1}\sum_{k=-L}^L I(k).

Technically, we must prove that this converges almost surely as L goes to infinity, and calculate the limit. Let us first take expectations:

ER_L = (2L)^{-1}\sum_{k=-L}^L P(A(k)).

For k &gt; 0, let t(k) be the first time S_n hits -k, and let s(k) be the first time S_n hits or passes k. Because of the upward drift, S_n goes to infinity almost surely. Hence, s(k) is finite with probability one, but there is a positive probability that t(k) is infinity. Define

q_k := P(A(-k)) = P(t(k) finite)
p_k := P(S_{s(k)} = k + 1).

Note that p_k is the probability that the frog passes k on the way up. Hence, by the Markov property,

P(A(k)) = (1 - p_k) + p_k q_1.

We therefore have

(*) ER_L = (2L)^{-1}[1 + \sum_{k=1}^L q_k
+ \sum_{k=1}^L (1 - p_k + q_1 p_k)].

Now to compute q_k and p_k. Let f = (-1 + \sqrt{5})/2 = 0.618, and define M_n = f^{S_n}. It is easy to check that M_n is a martingale. By standard methods (using optional sampling and dominated convergence), this shows that q_k = f^k.

For p_k, note that

p_1 = 1/2 + (1/2)p_2
p_2 = (1/2)p_3
p_n = (1/2)p_{n-2} + (1/2)p_{n+1}, for n &gt; 2.

If we define p_0 = 0 and p_{-1} = 1, then we have

p_{n+1} = 2p_n - p_{n-2}, for n &gt; 0.

The general solution to this recurrence is

p_n = c_1 + c_2 f^{-n} + c_3 (-f)^n.

Since each p_n is a probability (and thus between 0 and 1), we must have c_2 = 0. Using p_0 = 0 and p_{-1} = 1, we get c_1 = f^2 and c_3 = -f^2. Hence,

p_n = f^2 + (-1)^{n+1} f^{n+2}.

(Note that this implies the limiting passing probability is f^2 = 0.382.) Finally, using (*), we get

lim ER_L = (1/2)(1 - f^2 + q_1 f^2)
= (1/2)(1 - f^2 + f^3)
= (-5 + 3\sqrt{5})/4 = 0.427.

Some additional argument is now required to show that R_L converges almost surely to this limit. I suspect this can be done using straightforward estimates on the covariances of the I(k)'s.

f97tosc 04-11-2007 12:19 PM

Re: April 2007 IBM Ponder This Challenge
 
Hmmm... maybe your approach is right but I think the answer is wrong... we can bound the answer as follows.

Define
P1 = P(frog never passes 0)
P2 = P(frog never passes 1)
P3 = P(frog passes first 1, then 0)

Since these are mutually exclusive, we have P1+P2+P3&lt;=1.

Also by symmetry, P1=P2.

Furthermore, P3&gt;P1, because these two together form the event that the frog hasn't stopped at 0 when it first reaches 1, and if this happens, the frog is more likely (&gt;50%) to eventually go back to 0 as in P3, than to never go back to 0, as in P1.

Then 1 &gt;= P1+P2+P3 &gt; 3 P1

P1&lt;1/3

Hence, at the very most 1/3 of integers can be skipped.

fnord_too 04-11-2007 12:24 PM

Re: April 2007 IBM Ponder This Challenge *DELETED*
 
Post deleted by fnord_too - brain fart

fnord_too 04-11-2007 12:38 PM

Re: April 2007 IBM Ponder This Challenge
 
I think this may be solvable the following way:
At some point you will will be 1 or 2 away from every number on the low side (i.e. before you pass or hit any number you will be exactly one or two less than it). For each of these two equally likely occurences there are sequences that land you on the exact number. So you should be able to take both of these starting conditions for an arbitrary number and determine the chance of hitting that number.

For instance, consider 2. At some point you are on 0 or 1 (each equally likely, I can prove that if necessary).

From zero, there are several ways to hit 2, but they are grouped around the number of actions, so:
1 action F - 50%
4 Actions BBFF, BFFB, BFBF 18.75%
8 Actions I think is the next
and so on.

Intuitively, I think this forms a nice managable infinite series, and there is a similar one for starting 1 away. Solve both of those and average the P's and that is the answer. It would take me a long time to get into the right mindset to crunch through that, but I think it is a good approach.

jason1990 04-11-2007 12:39 PM

Re: April 2007 IBM Ponder This Challenge
 
I think I see a connection. Let Y be your answer (0.146) and let M be my answer (0.427). Then M = (1 - Y)/2.

I claim to have calculated M = the fraction of all the integers that the frog will eventually HIT. I see I misread the original problem on the IBM website. So my answer should be 1 - M = 0.573 = the fraction of all the integers that the frog will eventually MISS.

You have calculated Y = the probability, starting at 0, that the frog misses state k, for k near +infinity. Hence, Y is the fraction of states above his starting point that he will miss. However, since he will reach a bounded minimum, the fraction of states below his starting point that he will miss is 1. So the fraction of all the integers that he will miss is 0.5 + Y/2 = 1 - M.

f97tosc 04-11-2007 01:38 PM

Re: April 2007 IBM Ponder This Challenge
 
Yeah good point, I think the discrepancy boils down to exactly how the frog starts off its epic journey, and that wasn't exactly well defined in the problem statement.

The answer 1-M = 0.5+Y/2 (using your notation here) essentially assumes, as you write, that there (a.s.) exist some lowest integer below which all integers are missed.

But the problem statement says that the frog "is hopping on the integers from minus infinity". So I think that, loosely speaking, rather than thinking of the frog as starting off at some particular integer, we can say that there is no point below which the frog hasn't been when he reaches 0. That is, every point can be treated as k, again using your notation. And then Y is the answer.

jason1990 04-11-2007 02:03 PM

Re: April 2007 IBM Ponder This Challenge
 
I agree that the initial conditions in the problem are ambiguous. I interpreted the statement "hopping on the integers from minus infinity to plus infinity" as "hopping on the integers, with no a priori restriction as to how negative or how positive his location can become."

How would we rigorously define a stochastic process which models this frog's movements and which starts at -infinity? We could try to take this literally, and add a point at infinity to the integers. But then, what is the distribution of the frog's first jump? He must eventually make a first jump to a finite integer. After that initial jump, there will be a lowest integer below which all integers are missed.

Or we could try to take some limiting procedure, by considering a sequence of processes where he starts at x, and let x go to -infinity. But again, for each fixed process in the sequence, there will be a lowest integer below which all integers are missed. If his starting point is any fixed integer, or even any integer-valued random variable, then the answer will be 1 - M. So I cannot see how we could take a limit in any reasonable way and get Y.

In fact, this reminds me a bit of the martingale betting system. Any reasonable limiting procedure shows that it is a disaster, whereas naively "starting at infinity" seems to show that it is a goldmine.

f97tosc 04-11-2007 02:12 PM

Re: April 2007 IBM Ponder This Challenge
 
[ QUOTE ]
At some point you will will be 1 or 2 away from every number on the low side (i.e. before you pass or hit any number you will be exactly one or two less than it). For each of these two equally likely occurences

[/ QUOTE ]

I think you need to distinguish between on the one hand eventually being 1 or 2 below, and on the other hand reaching 1 below without having first reached 2 below (or vice versa).

For eventually reaching 1 or 2 below, then potentially both can happen, so I don't think that is what you mean.

On the other hand, if you mean reaching 1 below without having first reached 2 below, or vice versa, then it is not obvious, in fact I think it is false, that they are equally likely.

fnord_too 04-11-2007 03:08 PM

Re: April 2007 IBM Ponder This Challenge
 
[ QUOTE ]
[ QUOTE ]
At some point you will will be 1 or 2 away from every number on the low side (i.e. before you pass or hit any number you will be exactly one or two less than it). For each of these two equally likely occurences

[/ QUOTE ]

I think you need to distinguish between on the one hand eventually being 1 or 2 below, and on the other hand reaching 1 below without having first reached 2 below (or vice versa).

For eventually reaching 1 or 2 below, then potentially both can happen, so I don't think that is what you mean.

On the other hand, if you mean reaching 1 below without having first reached 2 below, or vice versa, then it is not obvious, in fact I think it is false, that they are equally likely.

[/ QUOTE ]

I do mean the one you arive at first. Let me see if I can prove it.

For any integer k, prove that ariving at k-1 before k-2 has the same probability as ariving at k-2 before k-1.

Assume one can arrive at k-1 first more often than you can arive at k-2 first. Then, for (k+1) one arrives at (k+1)-2 more often than you are ariving at (k+1)-1, which is a contradicion. Ergo P(K-1) &lt;= P(K-2).

Ok, getting rid of that '&lt;' is not trivial. I need to think about this some more.

alThor 04-11-2007 03:24 PM

Re: April 2007 IBM Ponder This Challenge
 
[ QUOTE ]
On the other hand, if you mean reaching 1 below without having first reached 2 below, or vice versa, then it is not obvious, in fact I think it is false, that they are equally likely.

[/ QUOTE ]

Upon reconsideration, I think you're right, which throws off my ROR solution above.

What confuses me, however, is the fact that we satisfy 1-M = 0.5+Y/2 in the two solutions posted.

The solution for "Y" makes sense to me, and I consider that to be the answer to the "right" question (i.e. the best well-defined question one can form based on the wording, starting at zero and only looking forward).

However the equation (1-M) = 0.5 + 0.5Y assumes that "half" the integers are below zero (or any arbitrary lower bound), while half are above. What confuses me is that the "percentage of integers that are below zero" isn't well defined. (E.g. One can map the odd positives into all the negatives, and thereby "prove" that two thirds of the integers are positive.)

So how did the solution for M happen to satisfy 1-M = 0.5+Y/2? The solution for M must have had a hidden assumption that half the integers are negative?

Siegmund 04-11-2007 04:00 PM

Re: April 2007 IBM Ponder This Challenge
 
I came at it yet another slightly different way... and ran into an infinite series I couldn't add up, but it happens to agree exactly with the difference equation solution posted here. As a result, I now have one of the more bizarre combinatorics results I have ever seen.

Let's begin by considering another question: what is the probability that the frog will ever return to the square on which he is currently standing?

He can get there after 3 jumps, RLL LRL or LLR, with probability 3/8.

He can get there after 6 jumps, Rx2 and Lx4, 15 ways out of 64, but 3x3 of these paths repeat the 3-jump patterns twice, and only 6 are new ways.

In general, (3k choose k) ways to return after 3k jumps, but only 2/(3k-1) of these are "new".

The sum of (3k choose k) * 2/(3k-1) *2^(-3k), from 1 to infinity, gives us the probability the frog will return to his current position. Call this P.

Then the number of times the frog will return to his current position is a geometric distribution with parameter P, and the expected number of times the frog will visit a position, given that he visits it at least once, is 1/(1-P).

We also can see that the frog's average speed is 1/2 to the right... that is, his expected number of visits to each position must be 2.

Therefore, he must visit 2*(1-P) of the squares an average of 1/(1-P) times each, and never visit the remaining 2P-1 squares at all.

As it happens, P~.572949017, which is jason's 1-M, and 2P-1 comes out to f97's Y.

What a roundabout way to show that sum[3k choose k)*2/(3k-2)*2^(-3k), {k,1,Infinity}] = (9-3sqrt(5))/4. The difference-equation route is certainly an elegant way to get the exact answer... but hopefully this shows the connection between the two correct methods posted.

jason1990 04-11-2007 04:57 PM

Re: April 2007 IBM Ponder This Challenge
 
[ QUOTE ]
So how did the solution for M happen to satisfy 1-M = 0.5+Y/2? The solution for M must have had a hidden assumption that half the integers are negative?

[/ QUOTE ]
This assumption is encoded in the fact that I used a kind of principal value limit. Let F(L) denote the fraction of sites between -L and L which the frog misses. Then F(L) -&gt; 1 - M as L -&gt; infinity. If you take, for example, the limit of the fraction of missed sites in the box [-L,2L], you will get a different limit. In this sense, the "fraction of all integers" is another ambiguity in the problem statement.

[ QUOTE ]
The solution for "Y" makes sense to me, and I consider that to be the answer to the "right" question (i.e. the best well-defined question one can form based on the wording, starting at zero and only looking forward).

[/ QUOTE ]
Since I seem to be the odd man out in my interpretation, perhaps you could help me understand yours. Specifically, where does the frog start in your interpretation? Does he start at the origin? Does he start randomly and, if so, with what distribution? Does he start "at -infinity" and, if so, what does that mean formally? Does he even start at time n = 0, or does time extend infinitely in both directions?

Siegmund 04-11-2007 07:58 PM

Re: April 2007 IBM Ponder This Challenge
 
Rereading jason's solution, this is making a bit more sense: jason has started his frog at 0, and asked how many points in (-L,L) get visited as L gets large. The vast majority of points left of the frog's starting point never get visited, since the frog drifts to the right.

That is, the fraction of points in (-L,0) that get visited approaches 0 as L becomes large, trivially; while the fraction of points in (0,L) approaches ~0.854 as L becomes large (and this is the one that takes all the work.)

Yes, the original post was a bit ill-defined.

jason1990 04-11-2007 08:34 PM

Re: April 2007 IBM Ponder This Challenge
 
[ QUOTE ]
Rereading jason's solution, this is making a bit more sense: jason has started his frog at 0, and asked how many points in (-L,L) get visited as L gets large. The vast majority of points left of the frog's starting point never get visited, since the frog drifts to the right.

That is, the fraction of points in (-L,0) that get visited approaches 0 as L becomes large, trivially; while the fraction of points in (0,L) approaches ~0.854 as L becomes large (and this is the one that takes all the work.)

Yes, the original post was a bit ill-defined.

[/ QUOTE ]
I just want to clarify something I said earlier. If the frog starts at any fixed point x, then the fraction of points in [-L,0] that get visited will also go to 0, since L will eventually be very far to the left of x. So this "principal value" limit is 1 - M, no matter where he starts. It then follows that it is also 1 - M if his starting point has any probability distribution on the integers. So starting the frog at 0 is not necessary for this result.

Moreover, the limit, as (L_1,L_2) -&gt; (-infinity, infinity), of the fraction of missed sites in [L_1,L_2] does not exist.

The only way I can see to get around this is to somehow start the frog "at -infinity." I am not claiming that it is impossible to rigorously make sense of this. I just cannot presently see a sensible way to do it.

f97tosc 04-11-2007 11:52 PM

Re: April 2007 IBM Ponder This Challenge
 
I think we can resolve this by considering a reversed time process. That is, let the frog be (note: I am not writing start) at integer 0 at time 0. We now define two frog processes, one following the frog going forward in time, and one following the frog backwards in (negative) time. Neither process has an end, and both obey the specifications of the problem. We seek the intersection of the integers missed by the two processes, in the limits of small and big t.

The forward process is defined directly by the problem.

So the question is, can we define a backward process which is unique, Markovian, and consistent with the forward process? By consistent, we mean that given any times t1 and t2 and integers n1 and n2, there is no way to (statistically) distinguish between a tape of the forward process going from (t1,n1) and (t2,n2) or a backward process going from (t2,n2) to (t1,n1). I contend that this is possible but am too tired to work out the transition probabilities etc right now.

In any case, if we do this, then the fraction of integers missed in (-infty, 0] must be the same (if the backward process was consistent) as the fraction of integers in [0,infty), and so the fraction missed is Y.

f97tosc 04-12-2007 09:20 AM

Re: April 2007 IBM Ponder This Challenge
 
Actually the backward process is quite simple: If the frog's position at time t is n, then by Bayes' theorem its position at t-1 is n+1 with probability

P=P(n+1)*0.5/(0.5P(n+1) + 0.5P(n-2))

where P(n+1) and P(n-2) are the a priori probabilities that the frog was at n+1 or n-2 at time t-1. If we have no a priori information about the frog before the time t (which is the implication of a backward process), then by symmetry P(n+1)=P(n-2), and so the transition probabilities going backwards are simply go to n+1 with probability 0.5 and go to n-2 with probability n-2.

jason1990 04-12-2007 09:23 AM

Re: April 2007 IBM Ponder This Challenge
 
If you have the time and inclination to write this up, I would interested in reading it. Just to be sure I understand you, you want to model the frog's position as a process indexed by positive and negative time, that is, by a family of random variables {..., S_{-2}, S_{-1}, S_0, S_1, S_2, ...}. You want this process to be Markov in the sense that

E[f(S_n) | S_k for all k &lt; n] = E[f(S_n) | S_{n-1}]

for all integers n. Is this right?

In order for this process to be consistent with the original problem, I would think we would need, for example, to have

P(S_n = k + 2 | S_{n-1} = k) = 1/2

for all n and k. But if the frog is at integer 0 at time 0, then this is not true for n = 0. I would think we would need the position of the frog at time 0 to be random, with a distribution which can be evolved backwards.

To be more specific, if p_n(k) = P(S_n = k), then we want

p_{n+1}(k) = (p_n(k - 2) + p_n(k + 1))/2 for all k.

We seek a family of probability distributions {..., p_{-2}, p_{-1}, p_0, p_1, p_2, ...} that satisfy this recursion. Given p_0, we know how to construct p_n for n &gt; 0. But for some choices of p_0 (e.g., p_0(0) = 1), it will not be possible to construct the distributions p_n for n &lt; 0. Are there any choices of p_0 that work?

Let me know if I am misunderstanding you. I would be interested if you have any links or references to the kind of construction you are envisioning.

f97tosc 04-12-2007 09:42 PM

Re: April 2007 IBM Ponder This Challenge
 
As you suggest, let's define a family of random variables {..., S_{-2}, S_{-1}, S_0, S_1, S_2, ...}, where S_0 is known but arbitrary (I will discuss this more below). For any t&gt;0 (btw, I am confused by your indices k and n), we can find the probability distribution of S_t by the recursion P(S_t=S_{t-1}+2)=P(S_t=S_{t-1}-1)=0.5 ("forward equation"), and for any t&lt;=0 we can identify S_{t-1} from S_{t} and the recursion P(S_{t-1)}=S_{t}-2 )= P(S_{t-1}=S_{t}+1) = 0.5 ("backward equation").

Given this stochastic process we can calculate (using the techniques in yours or mine original post) the fraction of integers such that S{t} does not equal that integer for any positive or negative t.

So we have a rigorously defined stochastic process, and we have an answer, but is it really the right probability model for the problem?

And to answer this, I would first like to emphasize an important conceptual point. Our stochastic process no longer represents the frog taking one step at a time, but rather an observer learning about the frog, one step at the time. The observer has the power to request the frog's position for any t, and the probabilities represent his information state before he makes the request, but he doesn't have to do it in order of increasing t. The limit quantity above is the observer requesting the frog's position for ever larger and smaller t's, and noting which integers were missed.

I hope some of your concerns are taken care of by embracing this perspective shift. You write

" In order for this process to be consistent with the original problem, I would think we would need, for example, to have

P(S_n = k + 2 | S_{n-1} = k) = 1/2

for all n and k.
"
The implicit implication here is that the probabilities represent the frog's steps, and that these steps can only go forward in time. But if the probabilities represent uncertainty of observations not yet made, and which can be made in any direction in time, then there is no problem if this equation (the "forward equation") does not hold for some n and k. specifically, it won't apply in the cases when I already know something about k for a greater n, for example because we are below the starting point (starting in terms of observations). The Markov property does hold if we have inspected only a sequence {...,t-2,t-1,t} and are asking about S for higher t. We can also define a "backwards Markov" property (this is what I meant originally, sorry that was very unclear), meaning that given a sequence of observations {...,t+2,t+1,t}, and asking about S_{t-1}, the probability distribution will only depend on S_t. This follows directly from how the stochastic process was defined.

Also, I am not sure I follow your argument about starting point, but in the perspective just outlined, there is nothing inconsistent or wrong with having a starting probability distribution that has nothing to do with the backward or forward equations. The starting point doesn't say anything special about the frog, it is just the first time for which the observer learns the frog's position. And since the questions we may be interested in do not depend on the point at which we start observing or the first value (additional justification for this below) we may as well just start by assuming a specific given S_0.

Finally, I still think I need to justify if this model of a wandering observer is really the relevant one for the given problem. In particular, when the observer looks at increasing t, the process is virtually identical to the jumping frog process, but what about when the observers look at decreasing t.s? I think the key question here is whether there is any test that the observer can make, which can determine whether he is looking at decreasing indices of a prerecorded forward process/jumping frog, or if he is looking at a backward process (perhaps with the new steps being generated just as he inspects them)? He can't! If I gave you a sequence of {(t1,n1),...,(t_q,n_q)} then there is no way for you to tell whether the sequence was generated from (t1,n1) by a forward process or from (t_q,n_q) by a backward process. This also provides additional support to the claim that the choice of starting observation point is arbitrary; after we generate the numbers there will be no way to see where we started.

And so in effect, we have provided a rigorous answer to the question "Assume you are an observer learning the frog's position for an arbitrary first time. From there, you can track the frog's movement arbitrary far backward and forward in time, and however far you go, the frog always appears to be following the jumping process described in the problem. In the limit of observations on all integers, what is the fraction missed completely by the frog?". I think this is a reasonable way to model the somewhat ambiguous statement that the frog is hopping from minus to plus infinity.

As for references, time-reversal of markov chains and processes is covered in standard texts on discrete stochastic processes, at least it is in Gallager (not that I particularly like that book, but it's there). However, in those cases the time reversal is for steady state problems. Obviously that doesn't apply here, although surely somebody has done something like this before. The perspective that probabilities are information states has been persuasively argued E. T. Jaynes.

alThor 04-13-2007 01:25 AM

Re: April 2007 IBM Ponder This Challenge
 
[ QUOTE ]
[ QUOTE ]
The solution for M must have had a hidden assumption that half the integers are negative?

[/ QUOTE ]
This assumption is encoded in the fact that I used a kind of principal value limit. Let F(L) denote the fraction of sites between -L and L which the frog misses. Then F(L) -&gt; 1 - M as L -&gt; infinity. If you take, for example, the limit of the fraction of missed sites in the box [-L,2L], you will get a different limit. In this sense, the "fraction of all integers" is another ambiguity in the problem statement.

[/ QUOTE ]

OK, I see it. (I'm glad my question was understandable.)

[ QUOTE ]
Since I seem to be the odd man out in my interpretation, perhaps you could help me understand yours. Specifically, where does the frog start in your interpretation? Does he start at the origin? Does he start randomly and, if so, with what distribution? Does he start "at -infinity" and, if so, what does that mean formally? Does he even start at time n = 0, or does time extend infinitely in both directions?

[/ QUOTE ]

Just to add my vote, I was thinking to start him at any finite point (arbitrarily 0) at time zero, measure time forward, ask "what percentage of points in [0,X] get hit", and let X go to infty.

jason1990 04-13-2007 11:00 AM

Re: April 2007 IBM Ponder This Challenge
 
For what I am about to write, let me finally formalize the transition function. Let p(i,j) = 1/2 if j = i + 2 or j = i - 1, and p(i,j) = 0 otherwise.

I think it is important that we distinguish between the two different processes we are talking about. On the one hand, there is the (hypothetical) Markov process {..., S_{-2}, S_{-1}, S_0, S_1, S_2, ...} which models the frog's jumps. It satisfies P(S_{t+1} = j | S_t = i) = p(i,j) for all t, i, and j. I call it "hypothetical" because I do not believe such a process exists.

On the other hand, there is the process which models the observer learning about the frog, one step at the time. The observer enters the frog's world at some time which we call t = 0, and observes the frog at some location which we call i = 0. He can then request the frog's location relative to this fixed point at any future or past time. Let me call this process {..., X_{-2}, X_{-1}, X_0, X_1, X_2, ...}.

I think the relationship between S and X is pretty clear: X is simply S conditioned to be 0 at 0. That is,

(*) P(X_t = i) = P(S_t = i | S_0 = 0).

Let us now split X into a forward and a backward process. That is, for t &gt;= 0, let F_t = X_t and B_t = X_{-t}. Let us also assume, as in the construction you described, that B_t is governed by a transition function. That is, there exists a function q (which does not depend on t) such that

P(B_{t+1} = j | B_t = i) = q(i,j).

We can calculate q by observing that

q(i,j) = P(X_{-t-1} = j | X_{-t} = i)
= P(S_{-t-1} = j | S_{-t} = i, S_0 = 0).

If we let u_t(i) = P(S_t = i) and p_s(i,j) = P(S_{t+s} = j | S_t = i), then we can use Bayes Theorem to get

q(i,j) = u_{-t-1}(j)p(j,i)p_t(i,0)/(u_{-t}(i)p_t(i,0))
= (u_{-t-1}(j)/u_{-t}(i)) p(j,i).

Similarly, if q_s(i,j) = P(B_{t+s} = j | B_t = i), then

q_s(i,j) = (u_{-t-s}(j)/u_{-t}(i)) p_s(j,i).

This is valid for any i such that u_{-t}(i) &gt; 0. Take s = 3 and j = i in the above. Since p_3(i,i) &gt; 0 for all i, we get

u_{-t-3}(i) = (q_3(i,i)/p_3(i,i)) u_{-t}(i).

This equation, in fact, is valid even when u_{-t}(i) = 0. That is, it is valid for all i. Since u_{-t} is a probability distribution for all t, we can use this to show that u_{-3t}(i) = u_0(i) for all i and t. In other words, if we restrict our attention to times which are multiples of 3, the process S_t is stationary. Moreover, we get

q_3(i,j) = (u_0(j)/u_0(i)) p_3(j,i).

Of course, the frog's process S does not have this stationarity property, so the observer's process X cannot be realized in the form of equation (*).

In general, the above shows that if S has a stationary distribution u_0, then X can be defined as follows: if t &gt;= 0, then X_t is simply S_t, starting at S_0 = 0; and if t &lt;= 0, then X_t is defined by

(**) P(X_{t-1} = j | X_t = i) = (u_0(j)/u_0(i)) p(j,i).

If S does not have a stationary distribution, then X cannot be realized by (*). However, if S has an invariant measure u_0 (which is not necessarily a probability distribution), then we can define X by (**). This, I think, is what you have essentially done, where for the invariant measure, you have taken u_0(i) = 1 for all i. (In what you have described q(i,j) = p(j,i).)

In other words, the observer's process X is not well-defined in the form (*). But if we assume (informally) that the frog's position at time t = 0 is equally likely to be any integer, then (*) reduces to the definition you gave.

I must admit that this seems to be a reasonable construction of a process which models the frog jumping over all time, both positive and negative. Technically, we cannot do it, since a uniform distribution on the integers does not exist. But this construction gives us a way to get around this technical difficulty while still defining something rigorous. I would consider it analogous to the various methods used to condition on a set of measure 0.

But one final issue concerns me. The invariant measure, u_0(i) = 1 for all i, is not unique. An invariant measure must satisfy

u(j) = \sum_i u(i)p(i,j) = (u(j - 2) + u(j + 1))/2.

Another invariant measure would be u_0(i) = c^i, where c = (1 + \sqrt{5})/2. So we could generalize the above construction in a different way, by defining q(i,j) = c^{j-i}p(j,i). That is, B_t makes a jump of +1 with probability c/2 and a jump of -2 with probability 1/(2c^2). Would this not affect the proportion of missed sites on the negative half-line?

The non-uniqueness of the invariant measure suggests that the construction you have described is not canonical, which means that ambiguity still remains. Heuristically, I imagine one might argue against the alternative invariant measures by appealing to the "tape running backwards" property you mentioned (i.e. the reversibility). Frankly, though, I am a little confused by this. I can see how S (if it existed) would be reversible. But I do not see how X can be reversible. Suppose I observe two random variable (Y,Z) and I want to determine whether I am looking at something of the form (F_t,F_{t+1}) or something of the form (B_{t+1},B_t). If I repeatedly observe that Y = 0, then I know exactly what I am looking at: (F_0,F_1). The process S (if it existed) would not necessarily present such a problem, since S_0 would not be fixed. In order to prevent me from distinguishing between (F_0,F_1) and (B_1,B_0), we must put a measure on X_0. There is no probability distribution which will do the trick. However, with X defined the way you originally defined it, the uniform measure will do it for us (at least formally). And for X defined as in the previous paragraph, the measure c^i will do it for us. Therefore, I cannot see how the reversibility property can be used to rule out the alternative invariant measures and resolve the ambiguity.

fnord_too 04-13-2007 11:19 AM

Re: April 2007 IBM Ponder This Challenge
 
Ok, I am lame, has anyone written a monte carlo to come up with an aproximation? I will if I have time after lunch if no one else has. I will start at zero, terminate at 1000, and look at the numbers between 100 and 900. I figure running that about a million times should give a pretty close aproximation of the real percentage which could help.

jason1990 04-13-2007 12:36 PM

Re: April 2007 IBM Ponder This Challenge
 
I just thought of an example which I think better illustrates what I am trying to say at the end of my previous post.

Define X_0 = 0. In the forward time direction, X jumps -1 or +2 with equal probability. In the negative time direction, X jumps +1 or -2 with probabilities c/2 and 1/(2c^2), respectively.

You get to observe a segment of the path of this process that consists of n time steps. You know that it lies either entirely on the positive time axis or entirely on the negative time axis, but you do not know which. You also know the initial location of the path is i_0, and the final location is i_n. Can you statistically determine which side of the time axis you are on?

Suppose you are on the positive side of the time axis. Conditioning only on the initial location, each path of length n going from i_0 to i_n has probability 2^{-n}. So conditioning on both initial and final locations, the probability of each path is 1/N, where N is the number of such paths.

Now suppose you are on the negative side of the time axis. Pick a path of length n going from i_0 to i_n. Let x be the number of (forward) jumps of size +2 in this path, and let y be the number of (forward) jumps of size -1. Then x + y = n and 2x - y = d := i_n - i_0. Conditioning only on the final location, this path has probability

(c/2)^y(1/(2c^2))^x = 2^{-n}c^{-d}.

This is true for all such paths. Hence, conditioning on both the initial and final locations, the probability of each path is 1/N, where N is the number of such paths.

In other words, if you look at a sequence going from (t_0,i_0) to (t_n,i_n), there is no way for you to statistically determine whether it was generated by the forward process (as described in the original puzzle) or by a time reversal of the backward process described above.

The constant c can be determined from the relations

c/2 + 1/(2c^2) = 1 and c &gt; 0.

The solutions are c = 1 and c = (1 + \sqrt{5})/2. If c = 1, then we have a process for which the fraction of missed sites is the same on both sides of the time axis. If c = (1 + \sqrt{5})/2, then we do not. This is the ambiguity I was (perhaps unsuccessfully) trying to describe.

jason1990 04-13-2007 12:42 PM

Re: April 2007 IBM Ponder This Challenge
 
I believe we are all agreed upon the real fraction of missed sites on the positive side of the integer lattice. The discussion is about how to interpret the fraction of negative integers which are missed. In particular, what does it mean when they say that the frog is "hopping on the integers from minus infinity to plus infinity?"

marv 04-13-2007 04:28 PM

Re: April 2007 IBM Ponder This Challenge
 
I don't think you can deduce S is stationary from what you've done here. If t were restricted to say [-10,10], then we could easily construct a suitable X and S process.

We can define a backwards process in this case by noting that the increments in S are IID, so we can add up their negatives in reverse order.

This isn't quite the same as conditioning S on going through 0 since it ignores the issue of a prior (but I'm sure we can show that no such S exists anyway since it'll need infinite mass for all time, so we need another way of justifying the construction).

Marv

jason1990 04-13-2007 04:48 PM

Re: April 2007 IBM Ponder This Challenge
 
[ QUOTE ]
I don't think you can deduce S is stationary from what you've done here. If t were restricted to say [-10,10], then we could easily construct a suitable X and S process.

[/ QUOTE ]
But time is not restricted to [-10,10]. The fact that time extends infinitely in the negative direction is essential to the conclusion. Under the hypothesized conditions, we know that

u_{-t-3}(i) = (q_3(i,i)/p_3(i,i)) u_{-t}(i)

for all i and all t &gt;= 0. Let c(i) = q_3(i,i)/p_3(i,i), so that

u_{-3n}(i) = c(i)^n u_0(i).

Take any i such that u_0(i) &gt; 0. Assume c(i) &gt; 1. Then for n sufficiently large, u_{-3n}(i) &gt; 1, a contradiction. Hence c(i) &lt;= 1. But

\sum_i u_0(i) = 1 = \sum_i u_{-3n}(i) = \sum_i c(i)^n u_0(i),

so each c(i) = 1. This shows that u_{-3n} = u_0 for all n, and so the process S, restricted to times which are multiples of 3, has the same distribution at all times. This is clearly not possible, given the transition function p.

By the way, I just came across this link, http://www.math.uah.edu/stat/markov/TimeReversal.xhtml, which may explain things better than I. One of the conclusions from the first section is that the time-reversed Markov chain is a time-homogeneous Markov chain if X_0 has an invariant distribution. In our case, there is no invariant distribution. But there is an invariant measure. However, it is not unique, so there is more than one way to construct the process in the negative time direction.

marv 04-13-2007 05:07 PM

Re: April 2007 IBM Ponder This Challenge
 
[ QUOTE ]
[ QUOTE ]
I don't think you can deduce S is stationary from what you've done here. If t were restricted to say [-10,10], then we could easily construct a suitable X and S process.

[/ QUOTE ]
But time is not restricted to [-10,10]. The fact that time extends infinitely in the negative direction is essential to the conclusion. Under the hypothesized conditions, we know that

u_{-t-3}(i) = (q_3(i,i)/p_3(i,i)) u_{-t}(i)

for all i and all t &gt;= 0. Let c(i) = q_3(i,i)/p_3(i,i), so that

u_{-3n}(i) = c(i)^n u_0(i).

Take any i such that u_0(i) &gt; 0. Assume c(i) &gt; 1. Then for n sufficiently large, u_{-3n}(i) &gt; 1, a contradiction. Hence c(i) &lt;= 1. But

\sum_i u_0(i) = 1 = \sum_i u_{-3n}(i) = \sum_i c(i)^n u_0(i),

so each c(i) = 1. This shows that u_{-3n} = u_0 for all n, and so the process S, restricted to times which are multiples of 3, has the same distribution at all times. This is clearly not possible, given the transition function p.

[/ QUOTE ]

OK.

jason1990 04-13-2007 05:37 PM

Re: April 2007 IBM Ponder This Challenge
 
[ QUOTE ]
We can define a backwards process in this case by noting that the increments in S are IID,

[/ QUOTE ]
Yes. In other words, p(i,j) depends only on i - j. Whenever this is the case, the uniform measure will be an invariant measure, in which case...

[ QUOTE ]
we can add up their negatives in reverse order.

[/ QUOTE ]
In other words, we can form the reverse chain by using the transitions q(i,j) = p(j,i). We can obviously do the same thing with a symmetric random walk, with equally likely steps +1 and -1. But in that case, things are much cleaner, since the uniform measure is the unique invariant measure (up to constant multiples). So adding their negatives in reverse order is the only way to construct the reverse chain.

But in our case, we have other invariant measures, such as u(i) = [(1 + \sqrt{5})/2]^i. So we can also form the reverse chain by using the transitions q(i,j) = p(j,i)u(j)/u(i).

pzhon 04-14-2007 01:08 AM

Re: April 2007 IBM Ponder This Challenge
 
[ QUOTE ]

In other words, if you look at a sequence going from (t_0,i_0) to (t_n,i_n), there is no way for you to statistically determine whether it was generated by the forward process (as described in the original puzzle) or by a time reversal of the backward process described above.

[/ QUOTE ]
This is true for any probabilities p and 1-p. You always lose the information on the drift when you pin both endpoints of a Brownian motion. This works equally well when you condition on two locations within the region with positive time. (You should find a difference when t_0 is negative and t_n is positive, which may be enough for you to discard that construction.)

This is very far from saying that you can't discuss the properties of a typical walk from -infinity to +infinity. It sounds like you are raising objections on the negative time side that apply equally to the positive side (by reversing time and the locations).

I interpret the problem as saying that the steps are independent, and have probability 1/2 each of being -1 and 2. This applies equally to the next step as to the previous step. I don't specify the location at time 0. Time doesn't have to be labelled by Z, just by a Z-torsor, a set with a principal Z-action. (In fact, the same is true of the line itself.) We're not interested in determining the probability that the frog is at a location at time 0. We're interested in the properties of the path. There are many limiting constructions which do not converge to a sensible probability, but such that the properties of the paths converge.

Random tilings of the entire line or plane are often discussed by mathematical tiling theorists and mathematical physicists, who discuss the Gibbs measure. There are some technical difficulties, but are these what you are interested in? Would you object to discussing a random tiling of the line with tiles of width 1 and 2 in equal proportion?

This problem describes a virtual tiling of the line with tiles of width -1 and 2 in equal proportion. This adds more technical difficulties. Is your interest the difference between a tiling (where everything is covered once) and a virtual tiling (where locations are covered once with multiplicity, where there are negative multiplicities, but almost surely only finitely many tiles cover a point)?

jason1990 04-14-2007 09:54 AM

Re: April 2007 IBM Ponder This Challenge
 
[ QUOTE ]
[ QUOTE ]

In other words, if you look at a sequence going from (t_0,i_0) to (t_n,i_n), there is no way for you to statistically determine whether it was generated by the forward process (as described in the original puzzle) or by a time reversal of the backward process described above.

[/ QUOTE ]
This is true for any probabilities p and 1-p.

[/ QUOTE ]
You are right. In hindsight, I now think that the example in my post you are replying to does not really illustrate anything. But I still think the discussion I presented about different invariant measures has meaningful content. In this failed example, I was trying to formalize the idea of the "tape running backwards" in a way which avoided a discussion of invariant measures with infinite mass. Perhaps that is not possible.

[ QUOTE ]
This is very far from saying that you can't discuss the properties of a typical walk from -infinity to +infinity.

[/ QUOTE ]
In case it was not clear, this is not at all what I am trying to claim.

[ QUOTE ]
It sounds like you are raising objections on the negative time side that apply equally to the positive side (by reversing time and the locations).

[/ QUOTE ]
There is an inherent asymmetry between positive and negative time. The puzzle tells us the transition probabilities for a forward time step. It is up to us to deduce the transition probabilities for a negative time step.

You appear to be assuming that, if we flip the time and space axes, then we should get a process which has the same law as the original. I contend that this is not implied by the hypothesis that the frog jumps forward by +2 or -1 with equal probability. Symmetry under "flipping" is an additional assumption we might choose to impose on the process, and it amounts to a particular choice of invariant measure.

[ QUOTE ]
I interpret the problem as saying that the steps are independent, and have probability 1/2 each of being -1 and 2. This applies equally to the next step as to the previous step.

[/ QUOTE ]
I agree that this is a reasonable interpretation. I contend that there are other reasonable interpretations. As I stated before, this interpretation is equivalent to performing the standard construction of a reverse-time Markov chain, where you use the uniform measure as your invariant measure. In the case of a symmetric walk with steps +1 and -1, this invariant measure is unique and this is the only was to reverse the chain. Here is a quote from one of f97tosc's posts:

[ QUOTE ]
Actually the backward process is quite simple: If the frog's position at time t is n, then by Bayes' theorem its position at t-1 is n+1 with probability

P=P(n+1)*0.5/(0.5P(n+1) + 0.5P(n-2))

where P(n+1) and P(n-2) are the a priori probabilities that the frog was at n+1 or n-2 at time t-1. If we have no a priori information about the frog before the time t (which is the implication of a backward process), then by symmetry P(n+1)=P(n-2), and so the transition probabilities going backwards are simply go to n+1 with probability 0.5

[/ QUOTE ]
This construction implicitly assumes that the prior probabilities of the position of the frog at time t-1 are all the same. In other words, it uses the uniform measure for the prior probabilities. This is justified because the uniform measure is an invariant measure for this process.

(Incidentally, this is not justified by some kind of indifference principle. In cases where one is time-reversing a Markov chain with a unique, non-uniform stationary distribution, one must use the stationary distribution for the prior probabilities. So the prior probabilities must be taken from an invariant measure.)

But in this case, there are other invariant measures to choose from. We could, for example, take P(n) = c^n, where c = [(1 + \sqrt{5})/2]. This would also be justified by the fact that this measure is invariant for our process. If we do that, then we get P = c/2, as I have previously described. I contend that this is a perfectly reasonable interpretation of the backward time steps. On the mathematical side, it is reasonable since there is no reason to prefer one invariant measure over another. It may even be reasonable on the heuristic side. One might informally argue that, since the frog has positive drift, then when we enter the frog's world without any prior information, we should be more likely to see him at large positive numbers than at large negative numbers. In other words, the prior probabilities should be an increasing function of n.

[ QUOTE ]
Time doesn't have to be labelled by Z, just by a Z-torsor, a set with a principal Z-action.

[/ QUOTE ]
This is interesting. Are you saying that you have a Z-torsor X, and a family of random variables {S_t: t in X}, which model the frog's jumping process? If so, I would be interested in reading about it.

[ QUOTE ]
(In fact, the same is true of the line itself.)

[/ QUOTE ]
I would guess that the issue I am describing would still occur even in that case. We could take an invariant measure P(x) for x in X such that P(x)/P(y)=c^g, whenever y.g=x. Honestly, though, I know nothing about torsor-valued Markov processes or Markov processes indexed by torsors. But I would be happy to learn.

[ QUOTE ]
Random tilings of the entire line or plane are often discussed by mathematical tiling theorists and mathematical physicists, who discuss the Gibbs measure. There are some technical difficulties, but are these what you are interested in? Would you object to discussing a random tiling of the line with tiles of width 1 and 2 in equal proportion?

[/ QUOTE ]
Well, I suppose I would not object. But I do not know anything about random tilings. Really, though, my perspective on this is from the theory of Markov processes. My main interest is in exploring the different ways of making sense of a Markov process defined on an infinite time-line.

PairTheBoard 04-14-2007 11:02 AM

Re: April 2007 IBM Ponder This Challenge
 
Watching you guys work on this together is like taking a peek into Gauss' brain.

I would like to see a satisfactory solution. It seems to me this speaks to the philisophical argument sometimes made about the necessity of an Intitial Cause. I forget which famous guy first proposed it but the idea is that since every Effect has a Cause, "therefore" you must have an Initial Cause. I've always thought the "therefore" was just bs. What's the problem with a Universe existing from time -infinity to +infinity where the laws of Cause and Effect consitently work locally in time? In this case, the local Law of Cause and Effect is that if the Frog is somewhere in time (Cause) he jumps -1 or +2 with equal probability (Effect).

So if you guys can't satisfactorily model this mathematically I might have to conclude something philisophically radical.

PairTheBoard

Pokerlogist 04-14-2007 12:15 PM

Re: April 2007 IBM Ponder This Challenge
 
For fun(!), I just ran a quick and dirty simulation that counted the number of missed integers. The frog started at -100,000 and was forced to stop after +100,000. It randomly hopped -1 or +2.

The proportion of missed integers was .57.
This exactly agrees with Jason's answer. Nice [img]/images/graemlins/smile.gif[/img]

jason1990 04-14-2007 12:47 PM

Re: April 2007 IBM Ponder This Challenge
 
From my previous post:

[ QUOTE ]
In this failed example, I was trying to formalize the idea of the "tape running backwards" in a way which avoided a discussion of invariant measures with infinite mass. Perhaps that is not possible.

[/ QUOTE ]
Here is another attempt. Let us do what marv suggested and restrict time to [-10,10]. We know that

P(S_{t+1} = i-1 | S_t = i) = 1/2.

Let us try to compute the reverse transition

q(t,i) = P(S_{t-1} = i+1 | S_t = i).

Recalling that p_t(i,j) = P(S_t = j | S_0 = i), we can write this as q(t,i) = N/D, where

N = P(S_{t-1} = i+1, S_t = i)
= \sum_j P(S_{-10} = j, S_{t-1} = i+1, S_t = i)
= \sum_j P(S_{-10} = j)p_{t+9}(j,i+1)p(i+1,i)
= (1/2)\sum_j P(S_{-10} = j)p_{t+9}(j,i+1), and

D = P(S_t = i) = \sum_j P(S_{-10} = j)p_{t+10}(j,i).

We see, then, that the transition q(t,i) is not uniquely determined without assuming something about the distribution u(j) = P(S_{-10} = j). What shall we assume about it? Well, we know the frog has been jumping for a long time before reaching t = -10. So one less jump should not matter. In other words, whatever we assume for S_{-10} should hold equally well for S_{-11}. That is, we want

u(j) = P(S_{-10} = j)
= P(S_{-11} = j+1)(1/2) + P(S_{-11} = j-2)(1/2)
= (u(j+1) + u(j-2))/2.

Unfortunately, there is no solution to this recursion with the property that \sum_j u(j) = 1. So we cannot find a satisfying probability distribution to impose on S_{-10}. We can, however, do an approximation. Note that the positive solutions to the above recursion can be written in the form

u(j) = k_1 + k_2 c^j,

where c = (1 + \sqrt{5})/2. We will therefore take two approaches in what follows.

Approach 1. Let us assume that S_{-10} is uniformly distributed on some large interval I of length L. If you like, imagine a sequence of intervals whose union is the entire set of integers. It does not matter whether these intervals are symmetric about the origin or not. The origin plays no special role here. This approach amounts to taking the solution (k_1,k_2) = (1,0).

Now, for fixed i, p_{t+9}(j,i+1) will be zero for all j outside of some bounded interval. Hence, once I is large enough,

N = (1/2)\sum_j u(j)p_{t+9}(j,i+1)
= (1/2L)\sum_j p_{t+9}(j,i+1)
= (1/2L)\sum_j p_{t+9}(0,i+1-j)
= (1/2L)\sum_j p_{t+9}(0,j) = 1/2L, and

D = \sum_j u(j)p_{t+10}(j,i)
= (1/L)\sum_j p_{t+10}(0,i-j) = 1/L.

Hence, the reverse jump of +1 occurs with limiting probability q(t,i) = N/D = 1/2. [img]/images/graemlins/spade.gif[/img]

Approach 2. Let us assume that S_{-10} is distributed on some large interval I of length L, and its distribution u(j) is proportional to c^j. As in Approach 1, this distribution is also translation invariant, in the sense that only the length L matters. The actual location of the interval I does not affect this distribution, and the origin plays no special role here either. (One way to see this is to observe that u(j) is also proportional to c^{j-a}, where a is left endpoint of the interval I.) This approach amounts to taking the solution (k_1,k_2) = (0,1).

Let M = M(I) denote the sum over all j in I of c^j, so that u(j) = (c^j)/M. In general, as above, when I is sufficiently large,

\sum_j u(j)p_s(j,i) = (1/M)\sum_j c^j p_s(0,i-j)
= (c^i/M)\sum_j c^{j-i} p_s(0,i-j)
= (c^i/M)\sum_j c^{-j} p_s(0,j).

I will leave it as an exercise to prove by induction that

\sum_j c^{-j} p_s(0,j) = 1 for all s.

(Use the relations p_{s+1}(0,j) = \sum_i p(0,i)p_s(i,j) and p_s(i,j) = p_s(0,j-i).) It now follows that

N = (1/2)\sum_j u(j)p_{t+9}(j,i+1) = c^{i+1}/2M, and
D = \sum_j u(j)p_{t+10}(j,i) = c^i/M.

Hence, in this case, the reverse jump of +1 occurs with limiting probability q(t,i) = N/D = c/2. [img]/images/graemlins/spade.gif[/img]

As I mentioned previously, I think reasonable arguments, both mathematical and heuristic, exist for justifying both of these approaches.

jason1990 04-14-2007 12:55 PM

Re: April 2007 IBM Ponder This Challenge
 
[ QUOTE ]
For fun(!), I just ran a quick and dirty simulation that counted the number of missed integers. The frog started at -100,000 and was forced to stop after +100,000. It randomly hopped -1 or +2.

The proportion of missed integers was .57.
This exactly agrees with Jason's answer. Nice [img]/images/graemlins/smile.gif[/img]

[/ QUOTE ]
In order to "honestly" get my answer, you should have started the frog at 0, not -100,000. I suspect that you only got my answer because you stopped the frog the first time he reached +100,000. The real frog would have kept jumping and backtracked over a lot of those missed sites, giving you f97tosc's answer.

jason1990 04-14-2007 02:45 PM

Re: April 2007 IBM Ponder This Challenge
 
[ QUOTE ]
What's the problem with a Universe existing from time -infinity to +infinity where the laws of Cause and Effect consitently work locally in time? In this case, the local Law of Cause and Effect is that if the Frog is somewhere in time (Cause) he jumps -1 or +2 with equal probability (Effect).

[/ QUOTE ]
There is nothing wrong with this in general. The question I am posing is this: if you know the rule for moving forward in time, what is the rule for moving backward? If the process you are describing has a unique invariant distribution, then the answer is unique: the probability of moving backward from i to j is u(j)/u(i) times the probability of moving forward from j to i, where u is the invariant distribution.

The frog puzzle poses a technical difficulty. Since there is no invariant distribution, we must use an invariant measure. But this is not a major problem. The real issue I keep mulling over is the fact that the invariant measure is not unique. So the rule for moving backward is not uniquely specified by the rule for moving forward.

If you look at my long post here, you can see that the time interval I chose, [-10,10], is not important. The same thing works for any time interval [t_1,t_2]. So I am really describing processes that have no special time or space origin, and which follow the frog's rule for moving forward. But the two processes have different rules for moving backward.


All times are GMT -4. The time now is 06:49 PM.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2024, vBulletin Solutions Inc.