Gradient Descent on Squared Error

Define the error on a single training example, with target output $T$ and actual output $O$, as:

\begin{displaymath}
E = \frac{1}{2} (T-O)^2 = \frac{1}{2} (T-g(\sum_k W_k I_k))^2\end{displaymath}

Then, writing $Err = T - O$ and $in = \sum_k W_k I_k$, the chain rule gives:

\begin{displaymath}
\frac{\partial E}{\partial W_j} = -I_j (T-O) g'(\sum_k W_k I_k) = -I_j \times Err \times g'(in)\end{displaymath}
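
The intermediate step can be spelled out as follows. This is a sketch that takes $O = g(in)$, with $in$ the weighted sum of the inputs and $Err = T - O$, and applies the chain rule through $O$ and $in$:

\begin{displaymath}
\frac{\partial E}{\partial W_j} = \frac{\partial E}{\partial O} \times \frac{\partial O}{\partial in} \times \frac{\partial in}{\partial W_j} = -(T-O) \times g'(in) \times I_j\end{displaymath}

Only the term $W_j I_j$ of the weighted sum depends on $W_j$, which is why the last factor reduces to $I_j$.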

Each weight is then updated by a small amount, set by the learning rate $\alpha$, in the direction opposite to the gradient:

\begin{displaymath}
W_j \leftarrow W_j + \alpha \times I_j \times Err \times g'(in)\end{displaymath}
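
The update rule can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the choice of a sigmoid for $g$, the default value of `alpha`, and all function names are assumptions made for the example.

```python
import math


def sigmoid(x):
    # Assumed activation function g; the text leaves g unspecified.
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_prime(x):
    # Derivative g'(x) of the sigmoid.
    s = sigmoid(x)
    return s * (1.0 - s)


def weighted_sum(weights, inputs):
    # in = sum_k W_k * I_k
    return sum(w * i for w, i in zip(weights, inputs))


def squared_error(weights, inputs, target):
    # E = 1/2 * (T - O)^2 with O = g(in)
    return 0.5 * (target - sigmoid(weighted_sum(weights, inputs))) ** 2


def update_weights(weights, inputs, target, alpha=0.5):
    # One gradient-descent step: W_j <- W_j + alpha * I_j * Err * g'(in)
    total = weighted_sum(weights, inputs)
    err = target - sigmoid(total)   # Err = T - O
    slope = sigmoid_prime(total)    # g'(in)
    return [w + alpha * i * err * slope for w, i in zip(weights, inputs)]


weights = [0.1, -0.2]
inputs = [1.0, 0.5]
target = 1.0
new_weights = update_weights(weights, inputs, target)
```

With these hypothetical values the weighted sum is 0, so $O = 0.5$, $Err = 0.5$ and $g'(in) = 0.25$. The step moves the weights to 0.1625 and -0.16875, and the squared error on this example goes down.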

