A Markov chain is “a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event”.

Space and time can each be either discrete (𝑋_𝑡: t = 0, 1, 2, …) or continuous (𝑋_𝑡: t ≥ 0). (We will focus on Markov chains with discrete space and discrete time.)
Example of a Markov chain:
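As a concrete illustration, here is a minimal simulation sketch of a discrete-time Markov chain. The two-state "weather" chain (Sunny/Rainy) and its transition probabilities are an illustrative assumption, not taken from the original notes:

```python
import random

# Hypothetical two-state weather chain: Sunny (0) and Rainy (1).
# P[i][j] = probability of moving from state i to state j.
P = [[0.8, 0.2],
     [0.4, 0.6]]

def simulate(P, start, steps, seed=0):
    """Simulate a discrete-time Markov chain for `steps` transitions."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(steps):
        state = path[-1]
        # The next state depends only on the current state (Markov property).
        state = rng.choices([0, 1], weights=P[state])[0]
        path.append(state)
    return path

print(simulate(P, start=0, steps=10))
```

Because the sampling at each step reads only the last entry of `path`, the simulation makes the Markov property explicit in code.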

A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved (i.e. hidden) states.
State: x = (x1, x2, x3) ;
Observation: y = (y1, y2, y3);
Transition matrix: A = (aij);
Emission matrix: B = (bij)
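To make the four ingredients concrete, here is a minimal parameter sketch. The state names, observation symbols, and probability values are illustrative assumptions (the classic Rainy/Sunny toy model), not from the original notes:

```python
import numpy as np

# Hypothetical HMM: hidden states x in {Rainy, Sunny},
# observations y in {walk, shop, clean}.
states = ["Rainy", "Sunny"]
observations = ["walk", "shop", "clean"]

pi = np.array([0.6, 0.4])        # initial probabilities
A = np.array([[0.7, 0.3],        # transition matrix: A[i, j] = P(x_t = j | x_{t-1} = i)
              [0.4, 0.6]])
B = np.array([[0.1, 0.4, 0.5],   # emission matrix: B[i, j] = P(y_t = j | x_t = i)
              [0.6, 0.3, 0.1]])

# Each row of A and B is a conditional distribution, so it must sum to 1.
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
```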

Example:


Q1: How do we evaluate how well the model matches an observation sequence? (forward algorithm)
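The forward algorithm answers Q1 by computing the likelihood P(y | model) with dynamic programming. Below is a minimal sketch; the two-state toy parameters are illustrative assumptions, and underflow handling (scaling or log-space) is omitted for clarity:

```python
import numpy as np

def forward(y, pi, A, B):
    """Forward algorithm: P(y_1..y_T | model).
    alpha[t, i] = P(y_1..y_t, x_t = i)."""
    T, K = len(y), len(pi)
    alpha = np.zeros((T, K))
    alpha[0] = pi * B[:, y[0]]                 # initialisation
    for t in range(1, T):
        # Sum over all previous states, then weight by the emission prob.
        alpha[t] = (alpha[t - 1] @ A) * B[:, y[t]]
    return alpha[-1].sum()                     # likelihood of the observation

# Hypothetical toy model (two hidden states, two observation symbols).
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.8, 0.2], [0.3, 0.7]])
print(forward([0, 0, 1], pi, A, B))
```

The recursion costs O(T·K²), versus O(K^T) for naive enumeration of all hidden paths.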

Q2: How do we infer the hidden states from the observations? (Viterbi algorithm)
Observations 𝒚 = (y_1, y_2, …, y_T), initial probabilities 𝝅 = (π_1, π_2, …, π_K), transition matrix A, emission matrix B.

Viterbi algorithm (optimal solution via backtracking): at each step it retains the best sub-path leading into every state, then recovers the overall optimal state path by backtracking through those stored choices.
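The "retain the best choice, then backtrack" idea can be sketched as follows. The toy parameters are illustrative assumptions; as with the forward algorithm, log-space scoring would be used in practice to avoid underflow:

```python
import numpy as np

def viterbi(y, pi, A, B):
    """Viterbi algorithm: most likely hidden state path for observations y.
    delta[t, i] = max probability of any path ending in state i at time t;
    psi[t, i] remembers the best predecessor of state i, for backtracking."""
    T, K = len(y), len(pi)
    delta = np.zeros((T, K))
    psi = np.zeros((T, K), dtype=int)
    delta[0] = pi * B[:, y[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A      # scores[i, j]: come from i, go to j
        psi[t] = scores.argmax(axis=0)          # keep only the best predecessor
        delta[t] = scores.max(axis=0) * B[:, y[t]]
    # Backtracking: start from the best final state and follow psi backwards.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1]

# Hypothetical toy model (same shapes as the notation above: pi, A, B).
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.8, 0.2], [0.3, 0.7]])
print(viterbi([0, 0, 1, 1, 1], pi, A, B))  # → [0, 0, 1, 1, 1]
```

Replacing the sum in the forward recursion with a max (plus the `psi` bookkeeping) is the only structural change between the two algorithms.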
Example:


Q3: How do we train the model to describe the observations more accurately? (Baum–Welch algorithm)
The Baum–Welch algorithm uses the well-known EM algorithm to find maximum-likelihood estimates of the parameters of a hidden Markov model given a set of observed feature vectors.


Baum–Welch algorithm (built on the forward–backward algorithm): it approaches the optimal parameters iteratively, alternating an E-step and an M-step.
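A minimal sketch of one such iteration scheme: the E-step runs the forward and backward recursions to get the posteriors γ and ξ, and the M-step re-estimates π, A, B from them. This assumes a discrete-observation HMM and omits the scaling needed for long sequences:

```python
import numpy as np

def baum_welch(y, pi, A, B, n_iter=20):
    """Baum-Welch (EM) sketch: alternate forward-backward posteriors (E-step)
    and parameter re-estimation (M-step)."""
    T, K = len(y), len(pi)
    for _ in range(n_iter):
        # E-step: forward and backward probabilities.
        alpha = np.zeros((T, K))
        beta = np.zeros((T, K))
        alpha[0] = pi * B[:, y[0]]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[:, y[t]]
        beta[-1] = 1.0
        for t in range(T - 2, -1, -1):
            beta[t] = A @ (B[:, y[t + 1]] * beta[t + 1])
        likelihood = alpha[-1].sum()
        gamma = alpha * beta / likelihood            # gamma[t, i] = P(x_t = i | y)
        # xi[t, i, j] = P(x_t = i, x_{t+1} = j | y)
        xi = (alpha[:-1, :, None] * A[None] *
              (B[:, y[1:]].T * beta[1:])[:, None, :]) / likelihood
        # M-step: re-estimate the parameters from the posteriors.
        pi = gamma[0]
        A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        B = np.zeros_like(B)
        for t in range(T):
            B[:, y[t]] += gamma[t]
        B /= gamma.sum(axis=0)[:, None]
    return pi, A, B
```

Each iteration is guaranteed not to decrease the likelihood of the observations, which is the usual EM monotonicity property.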

In the M-step, use the Lagrange multiplier method and set the partial derivatives of the Lagrangian function to zero.
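For example, re-estimating a row of the transition matrix maximises the expected log-likelihood subject to the row-sum constraint; the sketch below (standard EM-for-HMM algebra, notation as above) shows how the familiar update falls out:

```latex
% Maximise the expected complete-data log-likelihood for row i of A,
% subject to \sum_j a_{ij} = 1, with multiplier \lambda_i:
\mathcal{L} = \sum_{t=1}^{T-1} \xi_t(i,j)\,\log a_{ij}
            + \lambda_i \Big( \sum_{j=1}^{K} a_{ij} - 1 \Big)

% Setting \partial \mathcal{L} / \partial a_{ij} = 0:
\frac{\sum_{t=1}^{T-1} \xi_t(i,j)}{a_{ij}} + \lambda_i = 0
\quad\Longrightarrow\quad
\hat{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i,j)}{\sum_{t=1}^{T-1} \gamma_t(i)}

% where \gamma_t(i) = \sum_j \xi_t(i,j), so the constraint fixes
% \lambda_i = -\sum_{t} \gamma_t(i).
```

The updates for π and B follow the same pattern with their respective normalisation constraints.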
CpG island:


References:
To be continued…