Python：Advanced Guide to Artificial Intelligence

上QQ阅读APP看书，第一时间看更新

Forward phase

If we call p_ij the transition probability P(i → j), we define a recursive procedure considering the following probability:

The variable f_ti represents the probability that the HMM is in the state i (at time t) after t observations (from 1 to t). Considering the HMM assumptions, we can state that f_ti depends on all possible f_t-1j. More precisely, we have:

With this process, we are considering that the HMM can reach any of the states at time t-1 (with the first t-1 observations), and transition to the state i at time t with probability p_ji. We need also to consider the emission probability for the final state o_t conditioned to each of the possible previous states.

For definition, the initial and ending states are not emitting. It means that we can write any sequence of observations as 0, o₁, o₂, ..., o_{Sequence Length}, 0, where the first and the final values are null. The procedure starts with computing the forward message at time 1:

The non-emitting ending state must be also considered:

The expression for the last state x_Endingis interpreted here as the index of the ending state in both A and B matrices. For example, we indicate p_ij as A[i, j], meaning the transition probability at a generic time instant from the state x_t = i to the state x_t+1 = j. In the same way, p_i_Ending is represented as A[i, x_Ending], meaning the transition probability from the penultimate state x_{Sequence Length-1} = i to the ending onex_{Sequence Length} = Ending State.

The Forward algorithm can, therefore, be summarized in the following steps (we assume to have N states, hence we need to allocate N+2 positions, considering the initial and the ending states):

Initialization of a Forward vector with shape (N + 2, Sequence Length).
Initialization of A (transition probability matrix) with shape (N, N). Each element is P(x_i|x_j).
Initialization of B with shape (Sequence Length, N). Each element is P(o_i|x_j).
For i=1 to N:
1. Set Forward[i, 1] = A[0, i] · B[1, i]
For t=2 to Sequence Length-1:
1. For i=1 to N:
  1. Set S = 0
2. For j=1 to N:
  1. Set S = S + Forward[j, t-1] · A[j, i] · B[t, i]
3. Set Forward[i, t] = S

Set S = 0.
For i=1 to N:
1. Set S = S + Forward[i, Sequence Length] · A[i, x_Ending]
Set Forward[x_Ending, Sequence Length] = S.

Now it should be clear that the name forward derives from the procedure to propagate the information from the previous step to the next one, until the ending state, which is not emittied.