
score S and class C as

      (4.30)
      $$
      \begin{aligned}
      S_1(w_1, \ldots, w_k) &= \frac{E_{j,b}\left\{\prod_{l=1}^{k} \beta_{1,b+l,j} \,\middle|\, x_{b+i,j} = w_i,\ i = 1, \ldots, k\right\}}{E_{j,b}\left\{\prod_{l=1}^{k} \beta_{2,b+l,j} \,\middle|\, x_{b+i,j} = w_i,\ i = 1, \ldots, k\right\}};\\[4pt]
      S_2(w_1, \ldots, w_k) &= \frac{1}{S_1(w_1, \ldots, w_k)};\\[4pt]
      S(w_1, \ldots, w_k) &= \max_i S_i(w_1, \ldots, w_k); \qquad
      C(w_1, \ldots, w_k) = \arg\max_i S_i(w_1, \ldots, w_k).
      \end{aligned}
      $$

      where βi,j,k denotes βi,j applied to document k, and E denotes the average over documents and phrase positions.

      The numerator of S1 is the average contribution of the phrase to the prediction of class 1 across all occurrences of the phrase; the denominator is the same statistic for class 2. Thus, if S1 is high, then w1, …, wk is a strong signal for class 1, and likewise for S2 and class 2. It was proposed in [93] to use S as a score function in order to search for high-scoring representative phrases that provide insight into the trained LSTM, and C to denote the class corresponding to a phrase.
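      Once per-word, per-class importance scores are available, the score and class of a phrase follow directly from (4.30). The following is a minimal sketch; the `beta` layout and the helper name are illustrative assumptions, not the implementation of [93]:

```python
import numpy as np

def phrase_score(beta, occurrences, k):
    """Score a length-k phrase from per-word importance scores.

    beta[c][j] is assumed to be a 1-D array of class-c importance
    scores for the words of document j (c in {1, 2}), and
    `occurrences` a list of (document index, start position b) pairs
    where the phrase occurs, i.e. where x_{b+i,j} = w_i.
    """
    contrib = {1: [], 2: []}
    for c in (1, 2):
        for j, b in occurrences:
            # Product of the k word importances over the phrase span.
            contrib[c].append(np.prod(beta[c][j][b:b + k]))
    s1 = np.mean(contrib[1]) / np.mean(contrib[2])  # signal for class 1
    s2 = 1.0 / s1                                   # signal for class 2
    scores = {1: s1, 2: s2}
    c_star = max(scores, key=scores.get)            # C = argmax_i S_i
    return scores[c_star], c_star                   # S = max_i S_i
```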

      In practice, the number of phrases is too large to feasibly compute the score for all of them. Thus, we approximate a brute force search through a two‐step procedure. First, we construct a list of candidate phrases by searching for strings of consecutive words j with importance scores βi, j > c for any i and some threshold c. Then, we score and rank the set of candidate phrases, which is much smaller than the set of all phrases.
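      The first step of this procedure, collecting maximal runs of consecutive words whose importance exceeds the threshold c for at least one class, can be sketched as follows (the data layout and names are illustrative assumptions):

```python
def candidate_phrases(words, beta, threshold):
    """Extract maximal runs of consecutive words whose importance
    score exceeds `threshold` for at least one class.

    `words` is the token list of one document; beta[c] is assumed to
    hold the per-word importance scores for class c.
    Returns (start position, phrase) pairs.
    """
    # A word is "hot" if any class gives it a score above the threshold.
    hot = [any(beta[c][i] > threshold for c in beta) for i in range(len(words))]
    phrases, start = [], None
    for i, h in enumerate(hot):
        if h and start is None:
            start = i                                  # run begins
        elif not h and start is not None:
            phrases.append((start, tuple(words[start:i])))  # run ends
            start = None
    if start is not None:                              # run reaches the end
        phrases.append((start, tuple(words[start:])))
    return phrases
```

      The candidate list produced this way is then scored with S and ranked, avoiding a brute-force pass over all possible phrases.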

      Rules‐based classifier: The extracted patterns from Section 4.1 can be used to construct a simple rules‐based classifier that approximates the output of the original LSTM. Given a document and a list of patterns sorted by descending score given by S, the classifier sequentially searches for each pattern within the document using simple string matching. Once it finds a pattern, the classifier returns the associated class given by C, ignoring the lower‐ranked patterns. The resulting classifier is interpretable, and despite its simplicity, retains much of the accuracy of the LSTM used to build it.
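      The resulting classifier reduces to ranked substring matching, as the following minimal sketch shows (function and argument names are illustrative):

```python
def rules_classifier(document, ranked_patterns, default=None):
    """Approximate the LSTM with ranked string-matching rules.

    `ranked_patterns` is a list of (phrase, predicted_class) pairs
    sorted by descending score S. The first phrase found in the
    document decides the class; lower-ranked patterns are ignored.
    """
    for phrase, cls in ranked_patterns:
        if phrase in document:   # simple substring matching
            return cls
    return default               # no pattern matched
```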

      This section focuses on the trade‐off between accuracy and interpretability in fuzzy model‐based solutions. A fuzzy model based on an experience‐oriented learning algorithm is presented that is designed to balance these two aspects. It combines support vector regression (SVR), used to generate the initial fuzzy model, with the experience available on the training data and a standard fuzzy‐model solution.

      The support vector machine (SVM) is known to generalize well to unseen data and to strike a good balance between approximation and generalization. This has inspired researchers to combine SVM with fuzzy models (FMs) in order to take advantage of both approaches: human interpretability and good performance. Support vector learning for FMs has therefore evolved into an active area of research. Before discussing this hybrid algorithm, we review the basics of the FM and SVR approaches separately.

      4.4.1 Fuzzy Models

      A descriptive (linguistic) fuzzy model captures qualitative knowledge in the form of if‐then rules [106]:

      $$\mathcal{R}_i:\ \text{If } \tilde{x} \text{ is } A_i \text{ then } \tilde{y} \text{ is } B_i, \qquad i = 1, 2, \ldots, K.$$

      Here, $\tilde{x}$ is the input (antecedent) linguistic variable, and $A_i$ are the antecedent descriptive (linguistic) terms (constants). Similarly, $\tilde{y}$ is the output (consequent) linguistic variable, and $B_i$ are the consequent linguistic terms. The values of $\tilde{x}$ ($\tilde{y}$) and the linguistic terms $A_i$ ($B_i$) are fuzzy sets defined in the domains of their respective base variables: $x \in X \subset \mathbb{R}^p$ and $y \in Y \subset \mathbb{R}^q$. The membership functions of the antecedent (consequent) fuzzy sets are then the mappings $\mu(x)\colon X \to [0, 1]$ and $\mu(y)\colon Y \to [0, 1]$. The fuzzy sets $A_i$ define fuzzy regions in the antecedent space for which the respective consequent propositions hold. The linguistic terms $A_i$ and $B_i$ are usually selected from sets of predefined terms, such as Small, Medium, and so on. Denoting these sets by $\mathcal{A}$ and $\mathcal{B}$, respectively, we have $A_i \in \mathcal{A}$ and $B_i \in \mathcal{B}$. The rule base $\mathcal{R} = \{\mathcal{R}_i \mid i = 1, 2, \ldots, K\}$ and the sets $\mathcal{A}$ and $\mathcal{B}$ constitute the knowledge base of the linguistic model.
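      To make the notation concrete, the following is a minimal sketch of such a rule base with triangular membership functions and a weighted average of consequent peaks as a crisp output (a common simplification of Mamdani inference); the term sets and membership shapes are illustrative assumptions, not taken from [106]:

```python
def tri(a, b, c):
    """Triangular membership function mu: X -> [0, 1] with peak at b."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu

# Illustrative predefined term sets: antecedent terms A_i over
# X = [0, 10], and a representative peak for each consequent term B_i.
A = {"Small": tri(0.0, 2.0, 4.0), "Large": tri(2.0, 6.0, 10.0)}
B_peak = {"Low": 0.0, "High": 1.0}

# Rule base R = {R_i}: If x~ is A_i then y~ is B_i.
rules = [("Small", "Low"), ("Large", "High")]

def infer(x, rules, A, B_peak):
    """Fire each rule with the membership degree of x in A_i and
    combine the consequents by a weighted average of their peaks."""
    w = [A[a_i](x) for a_i, _ in rules]      # rule firing strengths
    y = [B_peak[b_i] for _, b_i in rules]    # consequent peaks
    total = sum(w)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    return sum(w_i * y_i for w_i, y_i in zip(w, y)) / total
```

      For instance, an input that is fully Small yields the Low peak, while an input between the two terms yields an intermediate output weighted by the two firing strengths.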
