Python:Advanced Guide to Artificial Intelligence
上QQ阅读APP看书,第一时间看更新

Forward phase

If we call pij the transition probability P(i → j), we define a recursive procedure considering the following probability:

The variable fti represents the probability that the HMM is in the state i (at time t) after t observations (from 1 to t). Considering the HMM assumptions, we can state that fti depends on all possible ft-1j. More precisely, we have:

With this process, we are considering that the HMM can reach any of the states at time t-1 (with the first t-1 observations), and transition to the state i at time t with probability pji. We need also to consider the emission probability for the final state ot conditioned to each of the possible previous states.

For definition, the initial and ending states are not emitting. It means that we can write any sequence of observations as 0, o1, o2, ..., oSequence Length, 0, where the first and the final values are null. The procedure starts with computing the forward message at time 1:

The non-emitting ending state must be also considered:

The expression for the last state xEnding is interpreted here as the index of the ending state in both A and B matrices. For example, we indicate pij as A[i, j], meaning the transition probability at a generic time instant from the state xt = i to the state xt+1 = j. In the same way, piEnding is represented as A[i, xEnding], meaning the transition probability from the penultimate state xSequence Length-1 = i to the ending one xSequence Length = Ending State.

The Forward algorithm can, therefore, be summarized in the following steps (we assume to have N states, hence we need to allocate N+2 positions, considering the initial and the ending states):

  1. Initialization of a Forward vector with shape (N + 2, Sequence Length).
  2. Initialization of A (transition probability matrix) with shape (N, N). Each element is P(xi|xj).
  3. Initialization of B with shape (Sequence Length, N). Each element is P(oi|xj).
  4. For i=1 to N:
    1. Set Forward[i, 1]A[0, i] · B[1, i]
  5. For t=2 to Sequence Length-1:
    1. For i=1 to N:
      1. Set S = 0
    2. For j=1 to N:
      1. Set S = S + Forward[j, t-1] · A[j, i] · B[t, i]
    3. Set Forward[i, t] = S
  1. Set S = 0.
  2. For i=1 to N:
    1. Set S = S + Forward[i, Sequence Length] · A[i, xEnding]
  3. Set Forward[xEnding, Sequence Length] = S.

Now it should be clear that the name forward derives from the procedure to propagate the information from the previous step to the next one, until the ending state, which is not emittied.