# Elman Net

## Dynamics of the RNN

\begin{align*} \bm{x}_o^{(t)} &= \bm{a}_o( \bm{u}_o^{(t)} ) \\ \bm{x}_c^{(t)} &= \bm{a}_c( \bm{u}_c^{(t)} ) \\ \bm{u}_o^{(t)} &= W_{oc}\, \bm{x}_c^{(t)} + \bm{b}_o \\ \bm{u}_c^{(t)} &= (\bm{1}-\bm{e}_c)\bm{u}_c^{(t-1)} + \bm{e}_c (W_{cc}\, \bm{x}_c^{(t-1)} + W_{ci}\, \bm{x}_i^{(t)} + \bm{b}_c) \end{align*}

Note that the product of two vectors is taken element-wise, $$(\bm{x}\bm{u})_i = x_i u_i$$ , and that $$\bm{1} = (1,1,\cdots)^T$$ .
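As a concrete check of this convention, the leaky update mixes old and new potentials element-wise, which is exactly what `*` does on numpy arrays. A minimal sketch (the variable names and values are illustrative only):

```python
import numpy as np

# Element-wise product convention: (x u)_i = x_i * u_i.
e_c = np.array([1.0, 0.5, 0.1])      # per-unit "time constants" e_c
u_prev = np.array([2.0, 2.0, 2.0])   # u_c^{(t-1)}
u_new = np.array([4.0, 4.0, 4.0])    # W_cc x_c^{(t-1)} + W_ci x_i^{(t)} + b_c

# u_c^{(t)} = (1 - e_c) u_c^{(t-1)} + e_c (...), all products element-wise
u_c = (1.0 - e_c) * u_prev + e_c * u_new
print(u_c)  # [4.  3.  2.2]
```

With $$e_c = 1$$ the old potential is discarded entirely; with small $$e_c$$ the unit integrates slowly.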

Network states and biases:

| Layer | Activated values | Potentials | Activation func. | Bias |
| --- | --- | --- | --- | --- |
| Output | $$\bm{x}_o$$ | $$\bm{u}_o$$ | $$\bm{a}_o$$ | $$\bm{b}_o$$ |
| Context | $$\bm{x}_c$$ | $$\bm{u}_c$$ | $$\bm{a}_c$$ | $$\bm{b}_c$$ |
| Input | $$\bm{x}_i$$ | | | |

Network weights:

| To | From | Weights |
| --- | --- | --- |
| Output | Context | $$W_{oc}$$ |
| Context | Context | $$W_{cc}$$ |
| Context | Input | $$W_{ci}$$ |

$$\bm{e}_c$$ is a time-constant-like variable: the leak rate of each context unit. For a typical (non-leaky) RNN, set $$\bm{e}_c = \bm{1}$$ .
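Putting the definitions together, the forward dynamics can be sketched in numpy as follows. The function and argument names are mine, and the defaults (zero initial context, $$a_c = \tanh$$, identity $$a_o$$) are assumptions, not fixed by the equations above:

```python
import numpy as np

def elman_forward(x_i_seq, W_oc, W_cc, W_ci, b_o, b_c, e_c,
                  a_c=np.tanh, a_o=lambda u: u):
    """Run the leaky Elman dynamics over an input sequence.

    Names mirror the equations above. The zero initial state
    u_c^{(0)} = 0 and the default activations are assumptions.
    """
    u_c = np.zeros_like(b_c)      # u_c^{(0)}
    x_c = a_c(u_c)                # x_c^{(0)}
    x_o_seq = []
    for x_i in x_i_seq:
        # u_c^{(t)} = (1 - e_c) u_c^{(t-1)} + e_c (W_cc x_c^{(t-1)} + W_ci x_i^{(t)} + b_c)
        u_c = (1.0 - e_c) * u_c + e_c * (W_cc @ x_c + W_ci @ x_i + b_c)
        x_c = a_c(u_c)            # x_c^{(t)} = a_c(u_c^{(t)})
        u_o = W_oc @ x_c + b_o    # u_o^{(t)} = W_oc x_c^{(t)} + b_o
        x_o_seq.append(a_o(u_o))  # x_o^{(t)} = a_o(u_o^{(t)})
    return np.array(x_o_seq)
```

Note that the context update reads $$\bm{x}_c^{(t-1)}$$ before it is overwritten, so the loop body follows the same order as the equations.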

## BPTT

The backward propagation of the delta error $$dE/d u_{c}^{(t)}$$ can be written as follows:

\begin{align*} \frac{dE}{d u_c^{(t-1)}} &= \sum_{o} \frac{dE}{d u_{o}^{(t-1)}}\, w_{oc}\, a_c'(u_c^{(t-1)}) + \sum_{c'} \frac{dE}{d u_{c'}^{(t)}} \frac{d u_{c'}^{(t)}}{d u_c^{(t-1)}} \\ \frac{d u_{c'}^{(t)}}{d u_c^{(t-1)}} &= (1 - e_{c'})\, \delta_{c' c} + e_{c'}\, w_{c' c}\, a_c'(u_c^{(t-1)}) \end{align*}

Here I assume the activation is applied element-wise, i.e. $$d a_c(u_c)/d u_{c'} = 0$$ if $$c \neq c'$$ .
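The backward sweep above can be sketched in the same style. This assumes the sequences $$dE/du_o^{(t)}$$ and $$u_c^{(t)}$$ were stored during the forward pass; `da_c` is $$a_c'$$ and defaults to the tanh derivative, which is my assumption:

```python
import numpy as np

def bptt_deltas(dE_du_o_seq, u_c_seq, W_oc, W_cc, e_c,
                da_c=lambda u: 1.0 - np.tanh(u) ** 2):
    """Propagate the delta error dE/du_c^{(t)} backward through time.

    dE_du_o_seq[t] holds dE/du_o^{(t)}, u_c_seq[t] holds u_c^{(t)}.
    """
    T = len(dE_du_o_seq)
    delta_c = np.zeros(W_cc.shape[0])  # dE/du_c beyond the last step is 0
    deltas = [None] * T
    for t in reversed(range(T)):
        g = da_c(u_c_seq[t])           # a_c'(u_c^{(t)})
        # dE/du_c^{(t)} = a_c'(u_c^{(t)}) * (W_oc^T dE/du_o^{(t)}
        #                 + W_cc^T (e_c * dE/du_c^{(t+1)}))
        #                 + (1 - e_c) * dE/du_c^{(t+1)}
        delta_c = (g * (W_oc.T @ dE_du_o_seq[t] + W_cc.T @ (e_c * delta_c))
                   + (1.0 - e_c) * delta_c)
        deltas[t] = delta_c
    return deltas
```

The two sums over $$o$$ and $$c'$$ become the transposed matrix-vector products $$W_{oc}^T$$ and $$W_{cc}^T$$, and the $$(1 - e_{c'})\,\delta_{c'c}$$ term is the element-wise leak carried straight through time.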