April 2007 IBM Ponder This Challenge - Page 3

f97tosc · #21 04-11-2007, 11:52 PM

I think we can resolve this by considering a reversed time process. That is, let the frog be (note: I am not writing start) at integer 0 at time 0. We now define two frog processes, one following the frog going forward in time, and one following the frog backwards in (negative) time. Neither process has an end, and both obey the specifications of the problem. We seek the intersection of the integers missed by the two processes, in the limits of small and big t.

The forward process is defined directly by the problem.

So the question is, can we define a backward process which is unique, Markovian, and consistent with the forward process? By consistent, we mean that given any times t1 and t2 and integers n1 and n2, there is no way to (statistically) distinguish between a tape of the forward process going from (t1,n1) and (t2,n2) or a backward process going from (t2,n2) to (t1,n1). I contend that this is possible but am too tired to work out the transition probabilities etc right now.

In any case, if we do this, then the fraction of integers missed in (-infty, 0] must be the same (if the backward process was consistent) as the fraction of integers in [0,infty), and so the fraction missed is Y.

f97tosc · #22 04-12-2007, 09:20 AM

Actually the backward process is quite simple: If the frog's position at time t is n, then by Bayes' theorem its position at t-1 is n+1 with probability

P=P(n+1)*0.5/(0.5P(n+1) + 0.5P(n-2))

where P(n+1) and P(n-2) are the a priori probabilities that the frog was at n+1 or n-2 at time t-1. If we have no a priori information about the frog before the time t (which is the implication of a backward process), then by symmetry P(n+1)=P(n-2), and so the transition probabilities going backwards are simply go to n+1 with probability 0.5 and go to n-2 with probability n-2.

jason1990 · #23 04-12-2007, 09:23 AM

If you have the time and inclination to write this up, I would interested in reading it. Just to be sure I understand you, you want to model the frog's position as a process indexed by positive and negative time, that is, by a family of random variables {..., S_{-2}, S_{-1}, S_0, S_1, S_2, ...}. You want this process to be Markov in the sense that

E[f(S_n) | S_k for all k < n] = E[f(S_n) | S_{n-1}]

for all integers n. Is this right?

In order for this process to be consistent with the original problem, I would think we would need, for example, to have

P(S_n = k + 2 | S_{n-1} = k) = 1/2

for all n and k. But if the frog is at integer 0 at time 0, then this is not true for n = 0. I would think we would need the position of the frog at time 0 to be random, with a distribution which can be evolved backwards.

To be more specific, if p_n(k) = P(S_n = k), then we want

p_{n+1}(k) = (p_n(k - 2) + p_n(k + 1))/2 for all k.

We seek a family of probability distributions {..., p_{-2}, p_{-1}, p_0, p_1, p_2, ...} that satisfy this recursion. Given p_0, we know how to construct p_n for n > 0. But for some choices of p_0 (e.g., p_0(0) = 1), it will not be possible to construct the distributions p_n for n < 0. Are there any choices of p_0 that work?

Let me know if I am misunderstanding you. I would be interested if you have any links or references to the kind of construction you are envisioning.

f97tosc · #24 04-12-2007, 09:42 PM

As you suggest, let's define a family of random variables {..., S_{-2}, S_{-1}, S_0, S_1, S_2, ...}, where S_0 is known but arbitrary (I will discuss this more below). For any t>0 (btw, I am confused by your indices k and n), we can find the probability distribution of S_t by the recursion P(S_t=S_{t-1}+2)=P(S_t=S_{t-1}-1)=0.5 ("forward equation"), and for any t<=0 we can identify S_{t-1} from S_{t} and the recursion P(S_{t-1)}=S_{t}-2 )= P(S_{t-1}=S_{t}+1) = 0.5 ("backward equation").

Given this stochastic process we can calculate (using the techniques in yours or mine original post) the fraction of integers such that S{t} does not equal that integer for any positive or negative t.

So we have a rigorously defined stochastic process, and we have an answer, but is it really the right probability model for the problem?

And to answer this, I would first like to emphasize an important conceptual point. Our stochastic process no longer represents the frog taking one step at a time, but rather an observer learning about the frog, one step at the time. The observer has the power to request the frog's position for any t, and the probabilities represent his information state before he makes the request, but he doesn't have to do it in order of increasing t. The limit quantity above is the observer requesting the frog's position for ever larger and smaller t's, and noting which integers were missed.

I hope some of your concerns are taken care of by embracing this perspective shift. You write

" In order for this process to be consistent with the original problem, I would think we would need, for example, to have

P(S_n = k + 2 | S_{n-1} = k) = 1/2

for all n and k.
"
The implicit implication here is that the probabilities represent the frog's steps, and that these steps can only go forward in time. But if the probabilities represent uncertainty of observations not yet made, and which can be made in any direction in time, then there is no problem if this equation (the "forward equation") does not hold for some n and k. specifically, it won't apply in the cases when I already know something about k for a greater n, for example because we are below the starting point (starting in terms of observations). The Markov property does hold if we have inspected only a sequence {...,t-2,t-1,t} and are asking about S for higher t. We can also define a "backwards Markov" property (this is what I meant originally, sorry that was very unclear), meaning that given a sequence of observations {...,t+2,t+1,t}, and asking about S_{t-1}, the probability distribution will only depend on S_t. This follows directly from how the stochastic process was defined.

Also, I am not sure I follow your argument about starting point, but in the perspective just outlined, there is nothing inconsistent or wrong with having a starting probability distribution that has nothing to do with the backward or forward equations. The starting point doesn't say anything special about the frog, it is just the first time for which the observer learns the frog's position. And since the questions we may be interested in do not depend on the point at which we start observing or the first value (additional justification for this below) we may as well just start by assuming a specific given S_0.

Finally, I still think I need to justify if this model of a wandering observer is really the relevant one for the given problem. In particular, when the observer looks at increasing t, the process is virtually identical to the jumping frog process, but what about when the observers look at decreasing t.s? I think the key question here is whether there is any test that the observer can make, which can determine whether he is looking at decreasing indices of a prerecorded forward process/jumping frog, or if he is looking at a backward process (perhaps with the new steps being generated just as he inspects them)? He can't! If I gave you a sequence of {(t1,n1),...,(t_q,n_q)} then there is no way for you to tell whether the sequence was generated from (t1,n1) by a forward process or from (t_q,n_q) by a backward process. This also provides additional support to the claim that the choice of starting observation point is arbitrary; after we generate the numbers there will be no way to see where we started.

And so in effect, we have provided a rigorous answer to the question "Assume you are an observer learning the frog's position for an arbitrary first time. From there, you can track the frog's movement arbitrary far backward and forward in time, and however far you go, the frog always appears to be following the jumping process described in the problem. In the limit of observations on all integers, what is the fraction missed completely by the frog?". I think this is a reasonable way to model the somewhat ambiguous statement that the frog is hopping from minus to plus infinity.

As for references, time-reversal of markov chains and processes is covered in standard texts on discrete stochastic processes, at least it is in Gallager (not that I particularly like that book, but it's there). However, in those cases the time reversal is for steady state problems. Obviously that doesn't apply here, although surely somebody has done something like this before. The perspective that probabilities are information states has been persuasively argued E. T. Jaynes.

alThor · #25 04-13-2007, 01:25 AM

[ QUOTE ]
[ QUOTE ]
The solution for M must have had a hidden assumption that half the integers are negative?

[/ QUOTE ]
This assumption is encoded in the fact that I used a kind of principal value limit. Let F(L) denote the fraction of sites between -L and L which the frog misses. Then F(L) -> 1 - M as L -> infinity. If you take, for example, the limit of the fraction of missed sites in the box [-L,2L], you will get a different limit. In this sense, the "fraction of all integers" is another ambiguity in the problem statement.

[/ QUOTE ]

OK, I see it. (I'm glad my question was understandable.)

[ QUOTE ]
Since I seem to be the odd man out in my interpretation, perhaps you could help me understand yours. Specifically, where does the frog start in your interpretation? Does he start at the origin? Does he start randomly and, if so, with what distribution? Does he start "at -infinity" and, if so, what does that mean formally? Does he even start at time n = 0, or does time extend infinitely in both directions?

[/ QUOTE ]

Just to add my vote, I was thinking to start him at any finite point (arbitrarily 0) at time zero, measure time forward, ask "what percentage of points in [0,X] get hit", and let X go to infty.

jason1990 · #26 04-13-2007, 11:00 AM

For what I am about to write, let me finally formalize the transition function. Let p(i,j) = 1/2 if j = i + 2 or j = i - 1, and p(i,j) = 0 otherwise.

I think it is important that we distinguish between the two different processes we are talking about. On the one hand, there is the (hypothetical) Markov process {..., S_{-2}, S_{-1}, S_0, S_1, S_2, ...} which models the frog's jumps. It satisfies P(S_{t+1} = j | S_t = i) = p(i,j) for all t, i, and j. I call it "hypothetical" because I do not believe such a process exists.

On the other hand, there is the process which models the observer learning about the frog, one step at the time. The observer enters the frog's world at some time which we call t = 0, and observes the frog at some location which we call i = 0. He can then request the frog's location relative to this fixed point at any future or past time. Let me call this process {..., X_{-2}, X_{-1}, X_0, X_1, X_2, ...}.

I think the relationship between S and X is pretty clear: X is simply S conditioned to be 0 at 0. That is,

(*) P(X_t = i) = P(S_t = i | S_0 = 0).

Let us now split X into a forward and a backward process. That is, for t >= 0, let F_t = X_t and B_t = X_{-t}. Let us also assume, as in the construction you described, that B_t is governed by a transition function. That is, there exists a function q (which does not depend on t) such that

P(B_{t+1} = j | B_t = i) = q(i,j).

We can calculate q by observing that

q(i,j) = P(X_{-t-1} = j | X_{-t} = i)
= P(S_{-t-1} = j | S_{-t} = i, S_0 = 0).

If we let u_t(i) = P(S_t = i) and p_s(i,j) = P(S_{t+s} = j | S_t = i), then we can use Bayes Theorem to get

q(i,j) = u_{-t-1}(j)p(j,i)p_t(i,0)/(u_{-t}(i)p_t(i,0))
= (u_{-t-1}(j)/u_{-t}(i)) p(j,i).

Similarly, if q_s(i,j) = P(B_{t+s} = j | B_t = i), then

q_s(i,j) = (u_{-t-s}(j)/u_{-t}(i)) p_s(j,i).

This is valid for any i such that u_{-t}(i) > 0. Take s = 3 and j = i in the above. Since p_3(i,i) > 0 for all i, we get

u_{-t-3}(i) = (q_3(i,i)/p_3(i,i)) u_{-t}(i).

This equation, in fact, is valid even when u_{-t}(i) = 0. That is, it is valid for all i. Since u_{-t} is a probability distribution for all t, we can use this to show that u_{-3t}(i) = u_0(i) for all i and t. In other words, if we restrict our attention to times which are multiples of 3, the process S_t is stationary. Moreover, we get

q_3(i,j) = (u_0(j)/u_0(i)) p_3(j,i).

Of course, the frog's process S does not have this stationarity property, so the observer's process X cannot be realized in the form of equation (*).

In general, the above shows that if S has a stationary distribution u_0, then X can be defined as follows: if t >= 0, then X_t is simply S_t, starting at S_0 = 0; and if t <= 0, then X_t is defined by

(**) P(X_{t-1} = j | X_t = i) = (u_0(j)/u_0(i)) p(j,i).

If S does not have a stationary distribution, then X cannot be realized by (*). However, if S has an invariant measure u_0 (which is not necessarily a probability distribution), then we can define X by (**). This, I think, is what you have essentially done, where for the invariant measure, you have taken u_0(i) = 1 for all i. (In what you have described q(i,j) = p(j,i).)

In other words, the observer's process X is not well-defined in the form (*). But if we assume (informally) that the frog's position at time t = 0 is equally likely to be any integer, then (*) reduces to the definition you gave.

I must admit that this seems to be a reasonable construction of a process which models the frog jumping over all time, both positive and negative. Technically, we cannot do it, since a uniform distribution on the integers does not exist. But this construction gives us a way to get around this technical difficulty while still defining something rigorous. I would consider it analogous to the various methods used to condition on a set of measure 0.

But one final issue concerns me. The invariant measure, u_0(i) = 1 for all i, is not unique. An invariant measure must satisfy

u(j) = \sum_i u(i)p(i,j) = (u(j - 2) + u(j + 1))/2.

Another invariant measure would be u_0(i) = c^i, where c = (1 + \sqrt{5})/2. So we could generalize the above construction in a different way, by defining q(i,j) = c^{j-i}p(j,i). That is, B_t makes a jump of +1 with probability c/2 and a jump of -2 with probability 1/(2c^2). Would this not affect the proportion of missed sites on the negative half-line?

The non-uniqueness of the invariant measure suggests that the construction you have described is not canonical, which means that ambiguity still remains. Heuristically, I imagine one might argue against the alternative invariant measures by appealing to the "tape running backwards" property you mentioned (i.e. the reversibility). Frankly, though, I am a little confused by this. I can see how S (if it existed) would be reversible. But I do not see how X can be reversible. Suppose I observe two random variable (Y,Z) and I want to determine whether I am looking at something of the form (F_t,F_{t+1}) or something of the form (B_{t+1},B_t). If I repeatedly observe that Y = 0, then I know exactly what I am looking at: (F_0,F_1). The process S (if it existed) would not necessarily present such a problem, since S_0 would not be fixed. In order to prevent me from distinguishing between (F_0,F_1) and (B_1,B_0), we must put a measure on X_0. There is no probability distribution which will do the trick. However, with X defined the way you originally defined it, the uniform measure will do it for us (at least formally). And for X defined as in the previous paragraph, the measure c^i will do it for us. Therefore, I cannot see how the reversibility property can be used to rule out the alternative invariant measures and resolve the ambiguity.

fnord_too · #27 04-13-2007, 11:19 AM

Ok, I am lame, has anyone written a monte carlo to come up with an aproximation? I will if I have time after lunch if no one else has. I will start at zero, terminate at 1000, and look at the numbers between 100 and 900. I figure running that about a million times should give a pretty close aproximation of the real percentage which could help.

jason1990 · #28 04-13-2007, 12:36 PM

I just thought of an example which I think better illustrates what I am trying to say at the end of my previous post.

Define X_0 = 0. In the forward time direction, X jumps -1 or +2 with equal probability. In the negative time direction, X jumps +1 or -2 with probabilities c/2 and 1/(2c^2), respectively.

You get to observe a segment of the path of this process that consists of n time steps. You know that it lies either entirely on the positive time axis or entirely on the negative time axis, but you do not know which. You also know the initial location of the path is i_0, and the final location is i_n. Can you statistically determine which side of the time axis you are on?

Suppose you are on the positive side of the time axis. Conditioning only on the initial location, each path of length n going from i_0 to i_n has probability 2^{-n}. So conditioning on both initial and final locations, the probability of each path is 1/N, where N is the number of such paths.

Now suppose you are on the negative side of the time axis. Pick a path of length n going from i_0 to i_n. Let x be the number of (forward) jumps of size +2 in this path, and let y be the number of (forward) jumps of size -1. Then x + y = n and 2x - y = d := i_n - i_0. Conditioning only on the final location, this path has probability

(c/2)^y(1/(2c^2))^x = 2^{-n}c^{-d}.

This is true for all such paths. Hence, conditioning on both the initial and final locations, the probability of each path is 1/N, where N is the number of such paths.

In other words, if you look at a sequence going from (t_0,i_0) to (t_n,i_n), there is no way for you to statistically determine whether it was generated by the forward process (as described in the original puzzle) or by a time reversal of the backward process described above.

The constant c can be determined from the relations

c/2 + 1/(2c^2) = 1 and c > 0.

The solutions are c = 1 and c = (1 + \sqrt{5})/2. If c = 1, then we have a process for which the fraction of missed sites is the same on both sides of the time axis. If c = (1 + \sqrt{5})/2, then we do not. This is the ambiguity I was (perhaps unsuccessfully) trying to describe.

jason1990 · #29 04-13-2007, 12:42 PM

I believe we are all agreed upon the real fraction of missed sites on the positive side of the integer lattice. The discussion is about how to interpret the fraction of negative integers which are missed. In particular, what does it mean when they say that the frog is "hopping on the integers from minus infinity to plus infinity?"

marv · #30 04-13-2007, 04:28 PM

I don't think you can deduce S is stationary from what you've done here. If t were restricted to say [-10,10], then we could easily construct a suitable X and S process.

We can define a backwards process in this case by noting that the increments in S are IID, so we can add up their negatives in reverse order.

This isn't quite the same as conditioning S on going through 0 since it ignores the issue of a prior (but I'm sure we can show that no such S exists anyway since it'll need infinite mass for all time, so we need another way of justifying the construction).

Marv