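The Baum-Welch algorithm is an instance of the EM algorithm, so we start from the EM \(Q\) function
\[Q(\theta,\theta') = \sum_ZP(Z|Y,\theta')\log P(Y,Z|\theta)\]
Rewriting it in HMM notation, with observation sequence \(O\), hidden state sequence \(I\), and model parameters \(\lambda\), makes it easier to follow
\[Q(\lambda,\lambda') = \sum_IP(I|O,\lambda')\log P(O,I|\lambda)\]
The joint distribution of the state sequence and the observation sequence is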
\[\begin{align*}
P(O,I|\lambda) &= P(O|I,\lambda)P(I|\lambda)\\
&= \pi_{i_1}b_{i_1}(o_1)a_{i_1i_2}b_{i_2}(o_2)\dots a_{i_{T-1}i_T}b_{i_T}(o_T)
\end{align*}\]
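As an intermediate step, taking the logarithm of this product turns it into three sums, one for each group of parameters. This is only a restatement of the factorization above:
\[\log P(O,I|\lambda) = \log\pi_{i_1} + \sum_{t=1}^T\log b_{i_t}(o_t) + \sum_{t=2}^T\log a_{i_{t-1}i_t}\]
Each group of parameters appears in exactly one sum, which is why the three groups can be maximized separately in the M-step.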
Substituting the logarithm of this expression into the \(Q\) function gives
\[\begin{align*}
Q(\lambda, \lambda') &= \sum_IP(I|O,\lambda')\log\pi_{i_1}\\
&+ \sum_IP(I|O,\lambda')\sum_{t=1}^T\log b_{i_t}(o_t)\\
&+ \sum_IP(I|O,\lambda')\sum_{t=2}^T\log a_{i_{t-1}i_t}
\end{align*}\]
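This is the E-step. Next comes the M-step.
Consider the first term of the \(Q\) function. The initial state probabilities are subject to the constraint
\[\sum_{i=1}^N\pi_i = 1\]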
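Since \(P(I|O,\lambda') = P(O,I|\lambda')/P(O|\lambda')\) and \(P(O|\lambda')\) does not depend on \(\lambda\), we may use the weights \(P(O,I|\lambda')\) instead without changing the maximizer. Introducing a Lagrange multiplier \(\gamma\) for the constraint gives
\[\begin{align*}
L &= \sum_IP(O,I|\lambda')\log\pi_{i_1} + \gamma(\sum_{i=1}^N\pi_i -1)\\
&= \sum_{i=1}^NP(O,i_1=i|\lambda')\log\pi_i + \gamma(\sum_{i=1}^N\pi_i -1)
\end{align*}\]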
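The second equality can be checked by grouping the state sequences according to their first state. A short sketch, where \(f\) is a placeholder for any function of the first state (a symbol introduced here only for illustration):
\[\sum_IP(O,I|\lambda')f(i_1) = \sum_{i=1}^Nf(i)\sum_{I:\,i_1=i}P(O,I|\lambda') = \sum_{i=1}^NP(O,i_1=i|\lambda')f(i)\]
Taking \(f(i) = \log\pi_i\) gives the form used above.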
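Setting \(\dfrac{\partial L}{\partial\pi_i} = 0\), that is \(\dfrac{P(O, i_1 = i|\lambda')}{\pi_i} + \gamma = 0\), multiplying through by \(\pi_i\), and then summing over \(i\) gives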
\[\begin{align*}
P(O, i_1 = i|\lambda') + \gamma \pi_i &= 0\\
P(O, i_1 = i|\lambda') &= -\gamma \pi_i\\
\sum_{i=1}^NP(O, i_1 = i|\lambda') &= -\gamma \sum_{i=1}^N\pi_i\\
\gamma &= -P(O|\lambda')
\end{align*}\]
Substituting \(\gamma\) back gives
\[\pi_i = \dfrac{P(O, i_1=i|\lambda')}{P(O|\lambda')}\]
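In practice both probabilities are usually evaluated with the forward and backward variables. As a sketch, assuming the standard definitions \(\alpha_1(i) = \pi'_ib'_i(o_1)\) and \(\beta_1(i) = P(o_2,\dots,o_T|i_1=i,\lambda')\), where the primes mark the parameters of \(\lambda'\) (none of these symbols are introduced in the text above):
\[\pi_i = \dfrac{\alpha_1(i)\beta_1(i)}{\sum_{j=1}^N\alpha_1(j)\beta_1(j)}\]
This follows because \(P(O, i_1=i|\lambda') = \alpha_1(i)\beta_1(i)\), and summing over \(i\) recovers \(P(O|\lambda')\).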
The other parameters can be obtained in the same way.
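For reference, repeating the Lagrangian argument with the constraints \(\sum_{j=1}^Na_{ij} = 1\) and \(\sum_{k}b_j(k) = 1\) should lead to the standard Baum-Welch updates sketched below. Here \(v_k\) denotes the \(k\)-th observation symbol, \(b_j(k)\) the probability of emitting \(v_k\) in state \(j\), and \(\mathbb{1}(\cdot)\) the indicator function; this notation is assumed for a discrete-observation HMM and does not appear in the text above.
\[a_{ij} = \dfrac{\sum_{t=1}^{T-1}P(O, i_t=i, i_{t+1}=j|\lambda')}{\sum_{t=1}^{T-1}P(O, i_t=i|\lambda')}\]
\[b_j(k) = \dfrac{\sum_{t=1}^{T}P(O, i_t=j|\lambda')\mathbb{1}(o_t=v_k)}{\sum_{t=1}^{T}P(O, i_t=j|\lambda')}\]
In both cases the numerator is an expected count under \(\lambda'\) and the denominator normalizes it, mirroring the result for \(\pi_i\).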