What is a CRF?

Graphical Model

G = (V, E)

\mathbb{Y} = (\mathbb{Y}_v)_{v \in V}

Markov Property :

P(\mathbb{Y}_v | \mathbb{X}, \mathbb{Y}_w, w \neq v) = P(\mathbb{Y}_v | \mathbb{X}, \mathbb{Y}_w, w \sim v)

where w \sim v means that w and v are neighbours in the graph G

For sequences, the graph G is a chain (more generally, a tree)

Fundamental Theorem of Random Fields

p_{\theta}(\mathbb{y} | \mathbb{x}) \propto \exp \left( \sum_{e\in E, k} \lambda_k f_k(e,\mathbb{y}|_e, \mathbb{x}) + \sum_{v\in V, k} \mu_k g_k (v, \mathbb{y}|_v, \mathbb{x}) \right)

- 𝕩 : data (observation) sequence
- 𝕪 : label sequence
- |Y| : dictionary of possible states
- f_k, g_k : boolean features; f_k is associated to a pair/edge (transition), g_k to a point/vertex (state)
- 𝕪|_S : set of components of 𝕪 with vertices in the subgraph S

f_{y',y}(\langle u, v \rangle, \mathbb{y}|_{\langle u, v \rangle}, \mathbb{x}) = \delta(\mathbb{y}_u, y')\,\delta(\mathbb{y}_v, y)

g_{y,x}(v, \mathbb{y}|_v, \mathbb{x}) = \delta(\mathbb{y}_v, y)\,\delta(\mathbb{x}_v, x)
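A minimal sketch of these boolean indicator features in code. The POS-style labels ("N", "V") and the word "dog" are made-up examples, not from the source:

```python
# Kronecker delta: 1 when its arguments are equal, 0 otherwise.
def delta(a, b):
    return 1.0 if a == b else 0.0

def f_edge(y_prev_target, y_target):
    # Transition feature f_{y',y}: fires when label y' is followed by y.
    def f(y_prev, y_cur):
        return delta(y_prev, y_prev_target) * delta(y_cur, y_target)
    return f

def g_vertex(y_target, x_target):
    # State feature g_{y,x}: fires when label y sits on observation x.
    def g(y_cur, x_cur):
        return delta(y_cur, y_target) * delta(x_cur, x_target)
    return g

f_NV = f_edge("N", "V")      # fires on the transition N -> V
g_Ndog = g_vertex("N", "dog")  # fires on label N over the word "dog"
```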

Calculating CRF

For a sequence, set 𝕐_0 = start and 𝕐_{n+1} = stop

M_i(𝕩) = [M_i(y′, y|𝕩)] is a |Y| × |Y| matrix, where i is the position of the observation in the sequence 𝕩

M_i(y', y|\mathbb{x}) = \exp\left( \Lambda_i(y', y|\mathbb{x}) \right) \quad \text{where} \quad \Lambda_i(y', y|\mathbb{x}) = \sum_k \lambda_k f_k(e_i, \mathbb{y}|_{e_i} = (y', y), \mathbb{x}) + \sum_k \mu_k g_k(v_i, \mathbb{y}|_{v_i} = y, \mathbb{x})
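A sketch of building one such matrix from weighted boolean features. The two-state set, the toy features, and their weights are assumptions for illustration only:

```python
import math

# Toy state set; indices double as matrix coordinates.
STATES = ["A", "B"]

def build_M_i(x_i, edge_feats, vertex_feats):
    """M_i(y', y | x) = exp( sum_k lam_k f_k(y', y) + sum_k mu_k g_k(y, x_i) )."""
    n = len(STATES)
    M = [[0.0] * n for _ in range(n)]
    for a, y_prev in enumerate(STATES):
        for b, y in enumerate(STATES):
            lam = sum(w * f(y_prev, y) for w, f in edge_feats)
            mu = sum(w * g(y, x_i) for w, g in vertex_feats)
            M[a][b] = math.exp(lam + mu)
    return M

# One edge feature (fires on A -> B, weight 0.5) and one vertex feature
# (fires on label B when the observation is "obs", weight 1.0).
edge_feats = [(0.5, lambda yp, y: 1.0 if (yp, y) == ("A", "B") else 0.0)]
vertex_feats = [(1.0, lambda y, x: 1.0 if (y, x) == ("B", "obs") else 0.0)]
M = build_M_i("obs", edge_feats, vertex_feats)
```

Each entry is the exponential of the weighted feature sum for that (y′, y) pair, so entries with no active feature are exactly 1.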

Normalisation constant :

Z_{\theta}(\mathbb{x}) = \left( M_1(\mathbb{x}) M_2(\mathbb{x}) \cdots M_{n+1}(\mathbb{x}) \right)_{\text{start}, \text{stop}}
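The normalisation constant is just one entry of a matrix product, so it can be computed without enumerating label sequences. A minimal sketch, assuming `Ms = [M_1(x), ..., M_{n+1}(x)]` with the start and stop states included as extra row/column indices:

```python
import numpy as np

def partition_function(Ms, start, stop):
    # Z_theta(x) = (M_1(x) M_2(x) ... M_{n+1}(x))_{start, stop}
    prod = Ms[0]
    for M in Ms[1:]:
        prod = prod @ M
    return prod[start, stop]
```

The (start, stop) entry of the product accumulates the sum, over every interior label path, of the product of matrix entries along that path.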

Giving :

p_{\theta}(\mathbb{y}| \mathbb{x}) = \frac{\prod_{i=1}^{n+1} M_i(\mathbb{y}_{i-1}, \mathbb{y}_i|\mathbb{x})}{\left(\prod_{i=1}^{n+1} M_i(\mathbb{x})\right)_{\text{start}, \text{stop}}}

with y_0 = start and y_{n+1} = stop
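The full formula can be sketched the same way: the numerator multiplies the matrix entries along the path (start, y_1, ..., y_n, stop), and the denominator is the matrix-product normaliser. Function and argument names here are assumptions:

```python
import numpy as np

def sequence_probability(Ms, y, start, stop):
    # Numerator: product of M_i entries along the boundary-padded label path.
    path = [start] + list(y) + [stop]
    numerator = 1.0
    for M, a, b in zip(Ms, path, path[1:]):
        numerator *= M[a, b]
    # Denominator: Z_theta(x) = (M_1 ... M_{n+1})_{start, stop}.
    Z = Ms[0]
    for M in Ms[1:]:
        Z = Z @ M
    return numerator / Z[start, stop]
```

A quick sanity check of the construction is that the probabilities of all labellings sum to 1.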

Training CRF

IIS: Improved Iterative Scaling
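As a sketch of the update (following the improved iterative scaling derivation used for CRFs in Lafferty, McCallum & Pereira, 2001; the exact form below is reconstructed from memory and should be checked against that paper), each increment \delta\lambda_k is chosen so that the model expectation, tilted by the step, matches the empirical expectation of the feature:

```latex
\tilde{E}[f_k] \;=\; \sum_{\mathbb{x},\mathbb{y}} \tilde{p}(\mathbb{x})\, p_{\theta}(\mathbb{y}|\mathbb{x})\, f_k(\mathbb{x},\mathbb{y})\, e^{\delta\lambda_k\, T(\mathbb{x},\mathbb{y})}
```

where T(\mathbb{x},\mathbb{y}) = \sum_k f_k + \sum_k g_k is the total feature count of the pair, \tilde{E}[f_k] is the empirical expectation of f_k over the training data, and the weights are then updated as \lambda_k \leftarrow \lambda_k + \delta\lambda_k (and analogously \mu_k for the state features g_k).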