#24
04-12-2007, 09:42 PM
f97tosc
Re: April 2007 IBM Ponder This Challenge

As you suggest, let's define a family of random variables {..., S_{-2}, S_{-1}, S_0, S_1, S_2, ...}, where S_0 is known but arbitrary (more on this below). For any t > 0 (incidentally, I am confused by your indices k and n), we can find the probability distribution of S_t from the recursion P(S_t = S_{t-1} + 2) = P(S_t = S_{t-1} - 1) = 0.5 (the "forward equation"), and for any t <= 0 we can identify S_{t-1} from S_t via the recursion P(S_{t-1} = S_t - 2) = P(S_{t-1} = S_t + 1) = 0.5 (the "backward equation").
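As a sanity check on these definitions, here is a short simulation sketch in Python (the function name and parameters are just my illustrative choices): it builds a finite piece S_{-n}, ..., S_n of the two-sided process by running the forward equation for t > 0 and the backward equation for t <= 0.

```python
import random

def two_sided_path(n, s0=0, seed=None):
    """Return [S_{-n}, ..., S_{-1}, S_0, S_1, ..., S_n] with S_0 = s0."""
    rng = random.Random(seed)
    fwd = [s0]
    for _ in range(n):  # forward equation: S_t = S_{t-1} + 2 or S_{t-1} - 1
        fwd.append(fwd[-1] + 2 if rng.random() < 0.5 else fwd[-1] - 1)
    bwd = [s0]
    for _ in range(n):  # backward equation: S_{t-1} = S_t - 2 or S_t + 1
        bwd.append(bwd[-1] - 2 if rng.random() < 0.5 else bwd[-1] + 1)
    # reversed backward half (excluding s0 itself) followed by the forward half
    return bwd[:0:-1] + fwd
```

Note that no matter which direction each half was generated in, every consecutive increment S_t - S_{t-1} in the combined path is +2 or -1, so the whole path looks like a legitimate frog trajectory.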

Given this stochastic process, we can calculate (using the techniques in your original post or mine) the fraction of integers k such that S_t does not equal k for any t, positive or negative.
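To make this concrete, here is a rough Monte Carlo sketch of that fraction (the specific run lengths and window are my own illustrative choices; a finite run has edge effects near the ends of the simulated time range, so the count is restricted to a window around S_0 = 0, which both halves of the path pass through):

```python
import random

def missed_fraction(steps=200_000, window=1_000, seed=1):
    """Estimate the fraction of integers near 0 that never equal any S_t."""
    rng = random.Random(seed)
    visited = {0}
    s = 0
    for _ in range(steps):  # forward in time: steps of +2 or -1
        s += 2 if rng.random() < 0.5 else -1
        visited.add(s)
    s = 0
    for _ in range(steps):  # backward in time: steps of -2 or +1
        s += -2 if rng.random() < 0.5 else 1
        visited.add(s)
    missed = sum(1 for k in range(-window, window + 1) if k not in visited)
    return missed / (2 * window + 1)
```

Since the forward half drifts toward plus infinity and the backward half toward minus infinity, a long enough run sweeps through the whole window, and the estimate should agree with the analytical calculation.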

So we have a rigorously defined stochastic process, and we have an answer, but is it really the right probability model for the problem?

And to answer this, I would first like to emphasize an important conceptual point. Our stochastic process no longer represents the frog taking one step at a time, but rather an observer learning about the frog, one step at a time. The observer has the power to request the frog's position for any t, and the probabilities represent his information state before he makes the request; but he doesn't have to make his requests in order of increasing t. The limit quantity above corresponds to the observer requesting the frog's position for ever larger and ever smaller t, and noting which integers were missed.

I hope some of your concerns are taken care of by embracing this perspective shift. You write

"In order for this process to be consistent with the original problem, I would think we would need, for example, to have

P(S_n = k + 2 | S_{n-1} = k) = 1/2

for all n and k."
The implicit assumption here is that the probabilities represent the frog's steps, and that those steps can only go forward in time. But if the probabilities represent uncertainty about observations not yet made, observations which can be made in either direction in time, then there is no problem if this equation (the "forward equation") fails to hold for some n and k. Specifically, it won't apply in cases where we already know something about the process at some greater index, for example because we are below the starting point (starting in terms of the order of observations). The Markov property does hold if we have inspected only a sequence of times {..., t-2, t-1, t} and are asking about S at higher t. We can also define a "backward Markov" property (this is what I meant originally; sorry that was very unclear): given a sequence of observations at times {..., t+2, t+1, t}, the probability distribution of S_{t-1} depends only on S_t. This follows directly from how the stochastic process was defined.

Also, I am not sure I follow your argument about the starting point, but from the perspective just outlined, there is nothing inconsistent or wrong with a starting probability distribution that has nothing to do with the backward or forward equations. The starting point doesn't say anything special about the frog; it is just the first time for which the observer learns the frog's position. And since the questions we are interested in do not depend on where we start observing or on the first value observed (additional justification for this below), we may as well start by assuming a specific given S_0.

Finally, I still need to justify whether this model of a wandering observer is really the relevant one for the given problem. When the observer looks at increasing t, the process is virtually identical to the jumping-frog process, but what about when he looks at decreasing t? The key question is whether there is any test the observer can perform to determine whether he is looking at decreasing indices of a prerecorded forward process (the jumping frog), or at a backward process (perhaps with the new steps being generated just as he inspects them). He can't! If I gave you a sequence {(t_1, n_1), ..., (t_q, n_q)}, there is no way for you to tell whether it was generated from (t_1, n_1) by the forward process or from (t_q, n_q) by the backward process. This also supports the claim that the choice of starting observation point is arbitrary: after we generate the numbers, there is no way to see where we started.
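The indistinguishability claim is easy to check empirically. In this sketch (again with illustrative names and parameters), I generate a long forward path, read it in reverse, and verify that the reversed increments look exactly like draws from the backward equation: every increment is -2 or +1, each appearing about half the time.

```python
import random

rng = random.Random(7)
fwd = [0]
for _ in range(10_000):  # forward process: +2 or -1, each with prob 1/2
    fwd.append(fwd[-1] + 2 if rng.random() < 0.5 else fwd[-1] - 1)

rev = fwd[::-1]
incs = [b - a for a, b in zip(rev, rev[1:])]
# The support matches the backward equation exactly...
assert all(d in (-2, 1) for d in incs)
# ...and because the forward steps are i.i.d., each occurs about half the time.
frac_down = incs.count(-2) / len(incs)
assert abs(frac_down - 0.5) < 0.03
```

Of course, this only checks the step distribution; the full argument is that the i.i.d. forward steps, read backward, are again i.i.d. with exactly the backward-equation distribution, so no statistic of the sequence can distinguish the two.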

And so, in effect, we have provided a rigorous answer to the question: "Assume you are an observer learning the frog's position for an arbitrary first time. From there, you can track the frog's movement arbitrarily far backward and forward in time, and however far you go, the frog always appears to be following the jumping process described in the problem. In the limit of observations on all integers, what fraction is missed completely by the frog?" I think this is a reasonable way to model the somewhat ambiguous statement that the frog is hopping from minus infinity to plus infinity.

As for references, time-reversal of Markov chains and processes is covered in standard texts on discrete stochastic processes; at least it is in Gallager (not that I particularly like that book, but it's there). However, in those treatments the time reversal is for steady-state problems, which obviously doesn't apply here, although surely somebody has done something like this before. The perspective that probabilities are information states has been persuasively argued by E. T. Jaynes.