jason1990 | 04-13-2007, 11:00 AM
Re: April 2007 IBM Ponder This Challenge

For what I am about to write, let me finally formalize the transition function. Let p(i,j) = 1/2 if j = i + 2 or j = i - 1, and p(i,j) = 0 otherwise.
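For concreteness, the transition function and a single jump can be written down directly (a minimal Python sketch; the names p and forward_step are mine):

```python
import random

def p(i, j):
    """Transition function: from site i the frog jumps to i + 2 or i - 1,
    each with probability 1/2."""
    return 0.5 if j in (i + 2, i - 1) else 0.0

def forward_step(i, rng=random):
    """Sample one jump of the frog from site i."""
    return i + 2 if rng.random() < 0.5 else i - 1
```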

I think it is important that we distinguish between the two different processes we are talking about. On the one hand, there is the (hypothetical) Markov process {..., S_{-2}, S_{-1}, S_0, S_1, S_2, ...} which models the frog's jumps. It satisfies P(S_{t+1} = j | S_t = i) = p(i,j) for all t, i, and j. I call it "hypothetical" because I do not believe such a process exists.

On the other hand, there is the process which models the observer learning about the frog, one step at a time. The observer enters the frog's world at some time, which we call t = 0, and observes the frog at some location, which we call i = 0. He can then request the frog's location relative to this fixed point at any future or past time. Let me call this process {..., X_{-2}, X_{-1}, X_0, X_1, X_2, ...}.

I think the relationship between S and X is pretty clear: X is simply S conditioned to be at 0 at time 0. That is,

(*) P(X_t = i) = P(S_t = i | S_0 = 0).

Let us now split X into a forward and a backward process. That is, for t >= 0, let F_t = X_t and B_t = X_{-t}. Let us also assume, as in the construction you described, that B_t is governed by a transition function. That is, there exists a function q (which does not depend on t) such that

P(B_{t+1} = j | B_t = i) = q(i,j).

We can calculate q by observing that

q(i,j) = P(X_{-t-1} = j | X_{-t} = i)
= P(S_{-t-1} = j | S_{-t} = i, S_0 = 0).

If we let u_t(i) = P(S_t = i) and p_s(i,j) = P(S_{t+s} = j | S_t = i), then we can use Bayes' theorem to get

q(i,j) = u_{-t-1}(j)p(j,i)p_t(i,0)/(u_{-t}(i)p_t(i,0))
= (u_{-t-1}(j)/u_{-t}(i)) p(j,i).
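This calculation can be checked by brute force. The sketch below gives S a definite law by starting it from an arbitrary (hypothetical) distribution at time -T, enumerates all 2^T jump sequences exactly, conditions on S_0 = 0, and compares the directly computed conditional probability P(S_{-t-1} = j | S_{-t} = i, S_0 = 0) against the formula (u_{-t-1}(j)/u_{-t}(i)) p(j,i). The starting distribution and the names are mine:

```python
from collections import defaultdict
from itertools import product

def p(i, j):
    return 0.5 if j in (i + 2, i - 1) else 0.0

T = 8
u_start = {-1: 0.3, 0: 0.5, 2: 0.2}   # hypothetical law of S_{-T}

# Enumerate every length-T path; each jump is +2 or -1 with probability 1/2.
paths = []                             # (weight, [S_{-T}, ..., S_0])
for s, w0 in u_start.items():
    for jumps in product((2, -1), repeat=T):
        path = [s]
        for d in jumps:
            path.append(path[-1] + d)
        paths.append((w0 * 0.5 ** T, path))

t = 3                                  # check the formula at one fixed time

# Left side: condition the enumerated paths on S_0 = 0 and read off the
# joint law of (S_{-t}, S_{-t-1}) together with the marginal of S_{-t}.
joint = defaultdict(float)             # over (i, j)
marg = defaultdict(float)              # over i
for w, path in paths:
    if path[-1] == 0:                  # condition on S_0 = 0
        i, j = path[T - t], path[T - t - 1]
        joint[(i, j)] += w
        marg[i] += w

# Right side: the unconditional marginals u_{-t} and u_{-t-1}.
def marginal(k):                       # law of S_{-k}
    d = defaultdict(float)
    for w, path in paths:
        d[path[T - k]] += w
    return d

u_mt, u_mt1 = marginal(t), marginal(t + 1)
for (i, j), w in joint.items():
    assert abs(w / marg[i] - (u_mt1[j] / u_mt[i]) * p(j, i)) < 1e-12
```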

Similarly, if q_s(i,j) = P(B_{t+s} = j | B_t = i), then

q_s(i,j) = (u_{-t-s}(j)/u_{-t}(i)) p_s(j,i).

This is valid for any i such that u_{-t}(i) > 0. Take s = 3 and j = i in the above. Since p_3(i,i) > 0 for all i, we get

u_{-t-3}(i) = (q_3(i,i)/p_3(i,i)) u_{-t}(i).

This equation, in fact, is valid even when u_{-t}(i) = 0; that is, it is valid for all i. Moreover, since q does not depend on t, neither does the ratio r(i) = q_3(i,i)/p_3(i,i), so iterating gives u_{-3t}(i) = r(i)^t u_0(i). Since each u_{-3t} is a probability distribution, its values are bounded by 1, which forces r(i) <= 1 wherever u_0(i) > 0; and since the values must also sum to 1 for every t, in fact r(i) = 1 there. Hence u_{-3t}(i) = u_0(i) for all i and t. In other words, if we restrict our attention to times which are multiples of 3, the process S_t is stationary. Moreover, we get

q_3(i,j) = (u_0(j)/u_0(i)) p_3(j,i).

Of course, the frog's process S cannot have this stationarity property: its jumps have mean +1/2, so the chain is transient and admits no stationary distribution on the integers. Hence the observer's process X cannot be realized in the form of equation (*).

In general, the above shows that if S has a stationary distribution u_0, then X can be defined as follows: if t >= 0, then X_t is simply S_t, starting at S_0 = 0; and if t <= 0, then X_t is defined by

(**) P(X_{t-1} = j | X_t = i) = (u_0(j)/u_0(i)) p(j,i).

If S does not have a stationary distribution, then X cannot be realized by (*). However, if S has an invariant measure u_0 (not necessarily a probability distribution), then we can define X by (**). This, I think, is essentially what you have done, taking for the invariant measure u_0(i) = 1 for all i. (In what you have described, q(i,j) = p(j,i).)
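With the uniform invariant measure u_0(i) = 1, equation (**) gives q(i,j) = p(j,i), so the backward chain jumps -2 or +1 with probability 1/2 each. A minimal sketch of the resulting two-sided process (the function name is mine):

```python
import random

def two_sided_path(n, rng=None):
    """Sample (X_{-n}, ..., X_0, ..., X_n) with X_0 = 0: forward kernel p,
    backward kernel q(i, j) = p(j, i) from the uniform invariant measure."""
    rng = rng or random.Random()
    fwd = [0]
    for _ in range(n):                 # forward in time: +2 or -1
        fwd.append(fwd[-1] + (2 if rng.random() < 0.5 else -1))
    bwd = [0]
    for _ in range(n):                 # backward in time: -2 or +1
        bwd.append(bwd[-1] + (-2 if rng.random() < 0.5 else 1))
    return bwd[::-1] + fwd[1:]         # read left to right in time
```

Read forward in time, every increment of such a path is +2 or -1, so any finite window looks like a stretch of the frog's jumps.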

In other words, the observer's process X is not well-defined in the form (*). But if we assume (informally) that the frog's position at time t = 0 is equally likely to be any integer, then (*) reduces to the definition you gave.

I must admit that this seems to be a reasonable construction of a process which models the frog jumping over all time, both positive and negative. Technically, we cannot do it, since a uniform distribution on the integers does not exist. But this construction gives us a way to get around this technical difficulty while still defining something rigorous. I would consider it analogous to the various methods used to condition on a set of measure 0.

But one final issue concerns me. The invariant measure, u_0(i) = 1 for all i, is not unique. An invariant measure must satisfy

u(j) = \sum_i u(i)p(i,j) = (u(j - 2) + u(j + 1))/2.

Another invariant measure would be u_0(i) = c^i, where c = (1 + \sqrt{5})/2. So we could generalize the above construction in a different way, by defining q(i,j) = c^{j-i}p(j,i). That is, B_t makes a jump of +1 with probability c/2 and a jump of -2 with probability 1/(2c^2). Would this not affect the proportion of missed sites on the negative half-line?
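Both claims are easy to verify numerically (a quick sketch):

```python
import math

c = (1 + math.sqrt(5)) / 2             # the golden ratio

# u_0(i) = c^i satisfies the invariance equation u(j) = (u(j-2) + u(j+1))/2,
# since c^3 - 2c^2 + 1 = (c - 1)(c^2 - c - 1) = 0.
for j in range(-10, 11):
    assert abs(c**j - (c**(j - 2) + c**(j + 1)) / 2) < 1e-9

# The alternative backward kernel is a genuine transition probability:
# jumps of +1 (prob c/2) and -2 (prob 1/(2c^2)) exhaust all the mass.
assert abs(c / 2 + 1 / (2 * c**2) - 1) < 1e-12
```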

The non-uniqueness of the invariant measure suggests that the construction you have described is not canonical, which means that ambiguity still remains. Heuristically, I imagine one might argue against the alternative invariant measures by appealing to the "tape running backwards" property you mentioned (i.e. the reversibility). Frankly, though, I am a little confused by this. I can see how S (if it existed) would be reversible. But I do not see how X can be reversible.

Suppose I observe two random variables (Y,Z) and I want to determine whether I am looking at something of the form (F_t,F_{t+1}) or something of the form (B_{t+1},B_t). If I repeatedly observe that Y = 0, then I know exactly what I am looking at: (F_0,F_1). The process S (if it existed) would not necessarily present such a problem, since S_0 would not be fixed. In order to prevent me from distinguishing between (F_0,F_1) and (B_1,B_0), we must put a measure on X_0. There is no probability distribution which will do the trick. However, with X defined the way you originally defined it, the uniform measure will do it for us (at least formally). And for X defined as in the previous paragraph, the measure c^i will do it for us. Therefore, I cannot see how the reversibility property can be used to rule out the alternative invariant measures and resolve the ambiguity.