Convergence
This section analyzes how to guarantee that $V(s)$ converges under the TD(k) update.
4.1 Bellman Equations
For a transition sequence $\langle S_{t-1}, r_t, S_t \rangle$, we can derive different TD(k) rules.
- No-action format: $k=0$, the $TD(0)$ rule
$$ V(s) = R(s) + r \sum_{s'} \Gamma(s, s')\, V(s') $$
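Here $r$ without a subscript is the discount factor, $r_t$ is the reward observed at step $t$, $\Gamma(s, s')$ is the probability of moving from $s$ to $s'$, and $\alpha_t$ is the learning rate at step $t$. The update rule for $V_t(S_{t-1})$ is as follows: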
$$
V_t(S_{t-1}) =
\begin{cases}
V_{t-1}(S_{t-1}) + \alpha_t\left[r_t + rV_{t-1}(S_t) - V_{t-1}(S_{t-1}) \right] & \\
V_{t-1}(S_{t-1}) & \text{converged case}
\end{cases}
$$
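
To make the TD(0) rule concrete, here is a minimal Python sketch that runs the update on a small no-action chain and compares the result with the fixed point of the Bellman equation. Everything named here is an assumption made for illustration: the two-state chain, its rewards, the function names `td0` and `bellman_solution`, and the use of a constant learning rate `alpha` in place of the step-dependent $\alpha_t$. The code also takes $r_t = R(S_{t-1})$, and `discount` plays the role of the unsubscripted $r$ in the text.

```python
import random


def td0(transitions, rewards, discount, alpha, steps, start=0, seed=0):
    """Estimate V(s) with TD(0) on a Markov reward process (no actions).

    transitions[s] is a list of (next_state, probability) pairs, i.e. Gamma(s, s').
    A constant alpha is an assumption; the text uses a step-dependent alpha_t.
    """
    rng = random.Random(seed)
    values = {s: 0.0 for s in transitions}  # V_0(s) = 0 for every state
    state = start
    for _ in range(steps):
        next_states, probs = zip(*transitions[state])
        next_state = rng.choices(next_states, weights=probs)[0]
        reward = rewards[state]  # assumed: r_t = R(S_{t-1})
        # TD error uses only the previous estimates V_{t-1}
        td_error = reward + discount * values[next_state] - values[state]
        values[state] += alpha * td_error
        state = next_state
    return values


def bellman_solution(transitions, rewards, discount, sweeps=1000):
    """Iterate V(s) = R(s) + discount * sum_s' Gamma(s, s') V(s') to a fixed point."""
    values = {s: 0.0 for s in transitions}
    for _ in range(sweeps):
        values = {
            s: rewards[s] + discount * sum(p * values[n] for n, p in transitions[s])
            for s in transitions
        }
    return values


# Hypothetical two-state chain: 0 -> 1 -> 0 -> ..., with R(0) = 1 and R(1) = 0.
transitions = {0: [(1, 1.0)], 1: [(0, 1.0)]}
rewards = {0: 1.0, 1: 0.0}

estimated = td0(transitions, rewards, discount=0.5, alpha=0.1, steps=5000)
exact = bellman_solution(transitions, rewards, discount=0.5)
print(estimated, exact)
```

On this deterministic chain the TD error shrinks toward zero as the estimates approach the Bellman fixed point, so each update changes $V$ less and less; that limit is the converged case of the rule above, where $V_t(S_{t-1}) = V_{t-1}(S_{t-1})$. With stochastic transitions, a constant `alpha` would typically leave the estimates fluctuating around the fixed point, which is one reason to use a decaying $\alpha_t$ instead.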