$$\frac{\partial(\mathbf{e}^T\mathbf{e})}{\partial w_{ij}^l} = \frac{\partial(\mathbf{e}^T\mathbf{e})}{\partial s_j^l}\,\frac{\partial s_j^l}{\partial w_{ij}^l} = \delta_j^l\, a_i^{l-1},$$
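with $\delta_j^l \triangleq \partial(\mathbf{e}^T\mathbf{e})/\partial s_j^l$, leading to the weight update $w_{ij}^l \leftarrow w_{ij}^l - \mu\,\delta_j^l\, a_i^{l-1}$, where $\mu$ is the step size.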
Parameters δ are derived recursively starting from the output layer:
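$$\delta_j^L = \frac{\partial(\mathbf{e}^T\mathbf{e})}{\partial s_j^L} = \frac{\partial(\mathbf{e}^T\mathbf{e})}{\partial y_j}\,\frac{\partial y_j}{\partial s_j^L} = \frac{\partial(\mathbf{e}^T\mathbf{e})}{\partial y_j}\, f'(s_j^L), \tag{3.9}$$
where $f'$ is the derivative of the sigmoid function $f(s)$ with respect to $s$. We have also used $a_j^L = y_j$ for the output layer. At the output layer, each neuron has an explicit desired response $d_j$, so with $e_j = d_j - y_j$ we can write
$$\frac{\partial(\mathbf{e}^T\mathbf{e})}{\partial y_j} = \frac{\partial}{\partial y_j}\sum_k (d_k - y_k)^2 = -2\,(d_j - y_j) = -2e_j. \tag{3.10}$$
Substituting into Eq. (3.9) yields $\delta_j^L = -2e_j f'(s_j^L)$.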
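The output-layer result can be checked numerically. The following is a minimal sketch, assuming the error is defined as $e_j = d_j - y_j$ and the activation is the logistic sigmoid, so that the substitution gives $\delta_j^L = -2e_j f'(s_j^L)$. The function names, the pre-activation values in `s_out`, and the targets in `d` are illustrative choices, not taken from the text.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def sigmoid_prime(s):
    # Derivative of the logistic sigmoid: f'(s) = f(s) * (1 - f(s)).
    y = sigmoid(s)
    return y * (1.0 - y)

def squared_error(s_out, d):
    """e^T e as a function of the output-layer pre-activations s_j^L."""
    return sum((dj - sigmoid(sj)) ** 2 for sj, dj in zip(s_out, d))

def output_deltas(s_out, d):
    """Analytic deltas: delta_j^L = -2 * e_j * f'(s_j^L), with e_j = d_j - y_j."""
    return [-2.0 * (dj - sigmoid(sj)) * sigmoid_prime(sj)
            for sj, dj in zip(s_out, d)]

def numerical_deltas(s_out, d, h=1e-6):
    """Central-difference estimate of d(e^T e) / d s_j^L."""
    estimates = []
    for j in range(len(s_out)):
        up = list(s_out)
        down = list(s_out)
        up[j] += h
        down[j] -= h
        estimates.append((squared_error(up, d) - squared_error(down, d)) / (2 * h))
    return estimates

# Illustrative values (assumed for this sketch).
s_out = [0.3, -1.2, 2.0]
d = [1.0, 0.0, 0.5]

analytic = output_deltas(s_out, d)
numeric = numerical_deltas(s_out, d)
```

If the derivation holds, `analytic` and `numeric` should agree up to the finite-difference error.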
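To calculate the δ's for the hidden layers, we note that $\mathbf{e}^T\mathbf{e}$ is influenced by $s_j^l$ only indirectly, through all node values in the next layer. Referring to the upper part of Figure 3.3, we again employ the chain rule
$$\delta_j^l = \frac{\partial(\mathbf{e}^T\mathbf{e})}{\partial s_j^l} = \sum_k \frac{\partial(\mathbf{e}^T\mathbf{e})}{\partial s_k^{l+1}}\,\frac{\partial s_k^{l+1}}{\partial s_j^l} = \sum_k \delta_k^{l+1}\,\frac{\partial s_k^{l+1}}{\partial s_j^l}, \tag{3.11}$$
with
$$\frac{\partial s_k^{l+1}}{\partial s_j^l} = \frac{\partial s_k^{l+1}}{\partial a_j^l}\,\frac{\partial a_j^l}{\partial s_j^l} = \frac{\partial s_k^{l+1}}{\partial a_j^l}\, f'(s_j^l). \tag{3.12}$$
Recalling that $s_k^{l+1} = \sum_i w_{ik}^{l+1} a_i^l$, we get $\partial s_k^{l+1}/\partial a_j^l = w_{jk}^{l+1}$ and hence $\delta_j^l = f'(s_j^l)\sum_k \delta_k^{l+1} w_{jk}^{l+1}$. In summary, we have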
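$$w_{ij}^l \leftarrow w_{ij}^l - \mu\,\delta_j^l\, a_i^{l-1}, \tag{3.13}$$
$$\delta_j^l = \begin{cases} -2e_j f'(s_j^L), & l = L, \\ f'(s_j^l)\displaystyle\sum_k \delta_k^{l+1} w_{jk}^{l+1}, & 1 \le l \le L-1. \end{cases} \tag{3.14}$$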
For the bias weight, we note that $a_0^{l-1} = 1$ in Eq. (3.13). The above processing is illustrated in Figure 3.4, which shows the symmetry between the forward propagation of neuron activation values and the backward propagation of the δ terms.
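The forward and backward passes can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: it assumes a logistic sigmoid in every layer, the error $e_j = d_j - y_j$, a weight update of the form $w \leftarrow w - \mu\,\delta_j a_i$, and a bias handled by prepending a constant activation $a_0 = 1$. The function names, the layer sizes, the input and target values, and the step size `mu=0.05` are all hypothetical choices made for the example.

```python
import math
import random

def f(s):
    return 1.0 / (1.0 + math.exp(-s))

def f_prime(s):
    y = f(s)
    return y * (1.0 - y)

def forward(weights, x):
    """Return pre-activations s and activations a per layer (a[0] is the input)."""
    a, s = [list(x)], []
    for W in weights:                      # W[i][j]: from node i (row 0 = bias) to node j
        a_prev = [1.0] + a[-1]             # a_0 = 1 feeds the bias weight
        s_l = [sum(W[i][j] * a_prev[i] for i in range(len(a_prev)))
               for j in range(len(W[0]))]
        s.append(s_l)
        a.append([f(v) for v in s_l])
    return s, a

def backward(weights, s, a, d):
    """Deltas for every layer, computed recursively from the output layer."""
    L = len(weights)
    deltas = [None] * L
    # Output layer: delta_j = -2 * e_j * f'(s_j), with e_j = d_j - y_j.
    deltas[-1] = [-2.0 * (dj - yj) * f_prime(sj)
                  for dj, yj, sj in zip(d, a[-1], s[-1])]
    # Hidden layers: delta_j = f'(s_j) * sum_k delta_k(next) * w_jk(next).
    for l in range(L - 2, -1, -1):
        W_next = weights[l + 1]            # row 0 holds bias weights, so use row j + 1
        deltas[l] = [f_prime(s[l][j]) *
                     sum(deltas[l + 1][k] * W_next[j + 1][k]
                         for k in range(len(deltas[l + 1])))
                     for j in range(len(s[l]))]
    return deltas

def gradients(weights, x, d):
    """Gradient of e^T e for each weight: delta_j * a_i of the previous layer."""
    s, a = forward(weights, x)
    deltas = backward(weights, s, a, d)
    grads = []
    for l, W in enumerate(weights):
        a_prev = [1.0] + a[l]
        grads.append([[deltas[l][j] * a_prev[i] for j in range(len(W[0]))]
                      for i in range(len(W))])
    return grads

def update(weights, grads, mu):
    """Gradient-descent step: w <- w - mu * delta_j * a_i."""
    return [[[w - mu * g for w, g in zip(w_row, g_row)]
             for w_row, g_row in zip(W, G)]
            for W, G in zip(weights, grads)]

def loss(weights, x, d):
    _, a = forward(weights, x)
    return sum((dj - yj) ** 2 for dj, yj in zip(d, a[-1]))

# Hypothetical 2-3-2 network with random initial weights.
random.seed(0)
sizes = [2, 3, 2]
weights = [[[random.uniform(-1, 1) for _ in range(sizes[l + 1])]
            for _ in range(sizes[l] + 1)]
           for l in range(len(sizes) - 1)]
x, d = [0.5, -0.3], [1.0, 0.0]

grads = gradients(weights, x, d)
loss_before = loss(weights, x, d)
loss_after = loss(update(weights, grads, mu=0.05), x, d)
```

The structure mirrors the symmetry described above: `forward` propagates activations from input to output, and `backward` propagates the δ terms from output to input through the same weights. Comparing `grads` against finite-difference estimates of `loss` is a simple way to confirm that the recursion has been coded consistently.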