
(3.72) of Chapter 3, which we repeat here with slightly different notation:

$$
\begin{aligned}
f_t &= \sigma(W_f x_t + V_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i x_t + V_i h_{t-1} + b_i) \\
o_t &= \sigma(W_o x_t + V_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + V_c h_{t-1} + b_c) \\
c_t &= f_t c_{t-1} + i_t \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
\tag{4.25}
$$
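As a concrete reading of (4.25), the following is a minimal NumPy sketch of a single LSTM update. The names `lstm_step` and `init_params`, the dictionary layout of the parameters, and the random initialisation are illustrative assumptions, not part of the text. The products in the fifth row of (4.25) are taken elementwise, like the $\odot$ in the sixth row.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM update, row by row as in (4.25)."""
    W, V, b = params["W"], params["V"], params["b"]
    f_t = sigmoid(W["f"] @ x_t + V["f"] @ h_prev + b["f"])      # forget gate
    i_t = sigmoid(W["i"] @ x_t + V["i"] @ h_prev + b["i"])      # input gate
    o_t = sigmoid(W["o"] @ x_t + V["o"] @ h_prev + b["o"])      # output gate
    c_tilde = np.tanh(W["c"] @ x_t + V["c"] @ h_prev + b["c"])  # candidate cell
    c_t = f_t * c_prev + i_t * c_tilde                          # elementwise products
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t, (f_t, i_t, o_t)

def init_params(input_dim, hidden_dim, seed=0):
    """Hypothetical helper: small random weights, zero biases."""
    rng = np.random.default_rng(seed)
    gates = "fioc"
    return {
        "W": {g: rng.normal(scale=0.5, size=(hidden_dim, input_dim)) for g in gates},
        "V": {g: rng.normal(scale=0.5, size=(hidden_dim, hidden_dim)) for g in gates},
        "b": {g: np.zeros(hidden_dim) for g in gates},
    }

params = init_params(input_dim=3, hidden_dim=4)
h, c = np.zeros(4), np.zeros(4)          # assumed initial state h_0 = c_0 = 0
x = np.array([1.0, -0.5, 0.25])
h, c, (f, i, o) = lstm_step(x, h, c, params)
```

Because every gate lies in (0, 1) and $\tanh$ lies in (-1, 1), each entry of `h` is bounded by 1 in absolute value, which is what makes the per-word differences of $\tanh(c_j)$ used below well scaled.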

where $W_i$ is the $i$-th row of the matrix $W$.

$$
\beta_{i,j} = \exp\bigl(W_i (o_T \odot (\tanh(c_j) - \tanh(c_{j-1})))\bigr), \tag{4.27}
$$

      so that

$$
\exp(W_i h_T) = \exp\Bigl(\sum_{j=1}^{T} W_i (o_T \odot (\tanh(c_j) - \tanh(c_{j-1})))\Bigr) = \prod_{j=1}^{T} \beta_{i,j}.
$$

Since $\tanh(c_j) - \tanh(c_{j-1})$ can be viewed as the update resulting from word $j$, $\beta_{i,j}$ can be interpreted as the multiplicative contribution of word $j$ to $p_i$.
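The factorisation into the scores of (4.27) can be checked numerically. The sketch below assumes that $W$ is the weight matrix of the output layer (one row per class), that the initial state is $h_0 = c_0 = 0$ so the sum of the $\tanh$ differences telescopes to $\tanh(c_T)$, and that the weights are random rather than trained. The helper names `run_lstm` and `beta_scores` are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def run_lstm(xs, W, V, b):
    """Run the LSTM of (4.25) from h_0 = c_0 = 0 (an assumption).

    Returns the cell states c_0..c_T with shape (T+1, hidden), the last
    output gate o_T, and the last hidden state h_T.
    """
    hidden = b["f"].shape[0]
    h, c = np.zeros(hidden), np.zeros(hidden)
    cells, o = [c], np.ones(hidden)
    for x in xs:
        f = sigmoid(W["f"] @ x + V["f"] @ h + b["f"])
        i = sigmoid(W["i"] @ x + V["i"] @ h + b["i"])
        o = sigmoid(W["o"] @ x + V["o"] @ h + b["o"])
        c_tilde = np.tanh(W["c"] @ x + V["c"] @ h + b["c"])
        c = f * c + i * c_tilde
        h = o * np.tanh(c)
        cells.append(c)
    return np.array(cells), o, h

def beta_scores(W_out, cells, o_T):
    """beta[i, j-1] = exp(W_i (o_T * (tanh(c_j) - tanh(c_{j-1})))), as in (4.27)."""
    updates = np.tanh(cells[1:]) - np.tanh(cells[:-1])   # shape (T, hidden)
    return np.exp(W_out @ (o_T * updates).T)             # shape (classes, T)

# Toy setup with random (untrained) weights, for illustration only.
rng = np.random.default_rng(0)
input_dim, hidden, classes, T = 3, 4, 2, 5
W = {g: rng.normal(scale=0.5, size=(hidden, input_dim)) for g in "fioc"}
V = {g: rng.normal(scale=0.5, size=(hidden, hidden)) for g in "fioc"}
b = {g: np.zeros(hidden) for g in "fioc"}
W_out = rng.normal(size=(classes, hidden))
xs = rng.normal(size=(T, input_dim))      # stand-in for T word embeddings

cells, o_T, h_T = run_lstm(xs, W, V, b)
beta = beta_scores(W_out, cells, o_T)

# The product over words recovers exp(W_i h_T) for every class i.
product_matches = np.allclose(beta.prod(axis=1), np.exp(W_out @ h_T))
```

Column `j-1` of `beta` holds the contribution of word `j` to each class; values above 1 push the unnormalised class score up, values below 1 push it down.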

This suggests a natural definition of an alternative score to $\beta_{i,j}$, obtained by augmenting the $c_j$ terms with products of the forget gates to reflect the upstream changes made to $c_j$ after word $j$ is first processed:

$$
\begin{aligned}
\exp(W_i h_T) &= \prod_{j=1}^{T} \exp\Bigl(W_i \Bigl(o_T \odot \Bigl(\tanh\Bigl(\sum_{k=1}^{j} e_{k,T}\Bigr) - \tanh\Bigl(\sum_{k=1}^{j-1} e_{k,T}\Bigr)\Bigr)\Bigr)\Bigr) \\
&= \prod_{j=1}^{T} \exp\Bigl(W_i \Bigl(o_T \odot \Bigl(\tanh\Bigl(\Bigl(\prod_{k=j+1}^{T} f_k\Bigr) c_j\Bigr) - \tanh\Bigl(\Bigl(\prod_{k=j}^{T} f_k\Bigr) c_{j-1}\Bigr)\Bigr)\Bigr)\Bigr) \\
&= \prod_{j=1}^{T} \gamma_{i,j}
\end{aligned}
\tag{4.29}
$$
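The two forms inside (4.29) and the final product can also be checked numerically. The definition of $e_{k,T}$ does not appear in this excerpt; the sketch assumes $e_{k,T} = \bigl(\prod_{m=k+1}^{T} f_m\bigr) i_k \tilde{c}_k$, which is what unrolling the cell update in (4.25) from $c_0 = 0$ gives. It also assumes $W$ is the output-layer weight matrix and uses random, untrained weights. The helper names `run_lstm` and `gamma_scores` are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def run_lstm(xs, W, V, b):
    """Run (4.25) from h_0 = c_0 = 0 (an assumption), recording what (4.29) needs."""
    hidden = b["f"].shape[0]
    h, c = np.zeros(hidden), np.zeros(hidden)
    cells, forgets, writes, o = [c], [], [], np.ones(hidden)
    for x in xs:
        f = sigmoid(W["f"] @ x + V["f"] @ h + b["f"])
        i = sigmoid(W["i"] @ x + V["i"] @ h + b["i"])
        o = sigmoid(W["o"] @ x + V["o"] @ h + b["o"])
        c_tilde = np.tanh(W["c"] @ x + V["c"] @ h + b["c"])
        c = f * c + i * c_tilde
        h = o * np.tanh(c)
        cells.append(c)
        forgets.append(f)
        writes.append(i * c_tilde)        # i_k * c~_k, what word k writes to the cell
    return np.array(cells), np.array(forgets), np.array(writes), o, h

def gamma_scores(W_out, cells, forgets, o_T):
    """gamma[i, j-1] as in the second row of (4.29).

    decay[j] is the product of forget gates f_{j+1} ... f_T (all ones for j = T).
    """
    T = forgets.shape[0]
    decay = np.ones_like(cells)
    for j in range(T - 1, -1, -1):
        decay[j] = decay[j + 1] * forgets[j]   # forgets[j] stores f_{j+1}
    decayed = decay * cells                     # row j: (prod_{k>j} f_k) * c_j
    updates = np.tanh(decayed[1:]) - np.tanh(decayed[:-1])
    return np.exp(W_out @ (o_T * updates).T), decay

# Toy setup with random (untrained) weights, for illustration only.
rng = np.random.default_rng(0)
input_dim, hidden, classes, T = 3, 4, 2, 5
W = {g: rng.normal(scale=0.5, size=(hidden, input_dim)) for g in "fioc"}
V = {g: rng.normal(scale=0.5, size=(hidden, hidden)) for g in "fioc"}
b = {g: np.zeros(hidden) for g in "fioc"}
W_out = rng.normal(size=(classes, hidden))
xs = rng.normal(size=(T, input_dim))

cells, forgets, writes, o_T, h_T = run_lstm(xs, W, V, b)
gamma, decay = gamma_scores(W_out, cells, forgets, o_T)

# Assumed e_{k,T}: the write of word k, decayed by all later forget gates.
e = decay[1:] * writes
rows_agree = np.allclose(np.cumsum(e, axis=0), (decay * cells)[1:])
product_matches = np.allclose(gamma.prod(axis=1), np.exp(W_out @ h_T))
```

Compared with $\beta_{i,j}$, the score $\gamma_{i,j}$ discounts what word $j$ wrote to the cell by every forget gate applied afterwards, so a word whose contribution is later erased receives a score close to 1.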

      We now introduce a technique for using our variable importance scores to extract phrases from a trained LSTM. To do so, we search for phrases that consistently provide a large contribution to the prediction of a particular class relative to other classes. The utility of these patterns is validated by using them as input for a rules‐based classifier. For simplicity, we focus on the binary classification case.
