Artificial Intelligence and Quantum Computing for Advanced Wireless Networks. Savo G. Glisic

ISBN: 9781119790310

Publisher: John Wiley & Sons Limited

Information gain (ig) is an impurity‐based criterion that uses the entropy (e) measure (originating from information theory) as the impurity measure:

$$ig(a_i, S) = e(y, S) - \sum_{v_{i,j} \in dom(a_i)} \frac{|\sigma_{a_i = v_{i,j}} S|}{|S|} \cdot e(y, \sigma_{a_i = v_{i,j}} S)$$

where σ_{a_i = v_{i,j}} S denotes the subset of S in which attribute a_i takes the value v_{i,j}, and

$$e(y, S) = \sum_{c_j \in dom(y)} -\frac{|\sigma_{y = c_j} S|}{|S|} \cdot \log_2 \frac{|\sigma_{y = c_j} S|}{|S|}$$

(2.14)
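The entropy and information gain computation can be sketched in a few lines of Python. This is a minimal illustration, not the book's implementation: it assumes the standard Shannon entropy in bits, represents the sample S as two parallel lists (attribute values and target labels), and uses a hypothetical toy data set with made-up attribute names.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy e(y, S) of a list of class labels, in bits."""
    n = len(labels)
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def information_gain(values, labels):
    """ig(a_i, S): entropy of S minus the size-weighted entropy of each subset
    obtained by grouping the instances on the attribute value."""
    n = len(labels)
    subsets = {}
    for value, label in zip(values, labels):
        subsets.setdefault(value, []).append(label)
    weighted = sum(len(subset) / n * entropy(subset) for subset in subsets.values())
    return entropy(labels) - weighted


# Hypothetical toy sample: "outlook" separates the classes, "windy" does not.
play = ["no", "no", "yes", "yes"]
outlook = ["sunny", "sunny", "rain", "rain"]
windy = [True, False, True, False]

print(information_gain(outlook, play))
print(information_gain(windy, play))
```

An attribute that splits the sample into pure subsets recovers the full entropy of the target, whereas an attribute whose subsets have the same class mix as S yields a gain of zero.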

Gini index: This is an impurity‐based criterion that measures the divergence between the probability distributions of the target attribute’s values. The Gini (G) index is defined as

$$G(y, S) = 1 - \sum_{c_j \in dom(y)} \left( \frac{|\sigma_{y = c_j} S|}{|S|} \right)^2$$

(2.15)

Consequently, the evaluation criterion for selecting the attribute a_i is defined as the Gini gain (GG):

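$$GG(a_i, S) = G(y, S) - \sum_{v_{i,j} \in dom(a_i)} \frac{|\sigma_{a_i = v_{i,j}} S|}{|S|} \cdot G(y, \sigma_{a_i = v_{i,j}} S)$$

(2.16)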
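As a rough illustration of the Gini index and the Gini gain, the sketch below assumes the usual form of the index (one minus the sum of squared class proportions) and the same weighted-subset structure as the information gain. The function names and the toy data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini index G(y, S): one minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain(values, labels):
    """GG(a_i, S): Gini index of S minus the size-weighted Gini index of each
    subset obtained by grouping the instances on the attribute value."""
    n = len(labels)
    subsets = {}
    for value, label in zip(values, labels):
        subsets.setdefault(value, []).append(label)
    weighted = sum(len(subset) / n * gini(subset) for subset in subsets.values())
    return gini(labels) - weighted


# Hypothetical toy sample with two balanced classes.
target = ["a", "a", "b", "b"]
informative = ["x", "x", "z", "z"]    # splits the classes perfectly
uninformative = ["x", "z", "x", "z"]  # each subset keeps the 50/50 class mix

print(gini(target))
print(gini_gain(informative, target))
print(gini_gain(uninformative, target))
```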

Likelihood ratio chi‐squared statistics: The likelihood ratio (lr) is defined as

$$lr(a_i, S) = 2 \cdot \ln(2) \cdot |S| \cdot ig(a_i, S)$$

(2.17)

This ratio is useful for measuring the statistical significance of the information gain criterion. The null hypothesis (H₀) is that the input attribute and the target attribute are conditionally independent. If H₀ holds, the test statistic is distributed as χ² with degrees of freedom equal to (|dom(a_i)| − 1) · (|dom(y)| − 1).
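The following sketch computes the statistic under the assumption that the likelihood ratio takes the common form 2 · ln(2) · |S| · ig(a_i, S), with ig measured in bits. As a sanity check it also evaluates the textbook G-test statistic 2 · Σ O · ln(O/E) directly from the contingency table; the two should agree. The degrees of freedom are taken from the values observed in the sample, which is an assumption whenever the sample does not cover the full domains. All names and data are hypothetical.

```python
from collections import Counter
from math import log, log2


def entropy(labels):
    """Shannon entropy in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after the split."""
    n = len(labels)
    subsets = {}
    for value, label in zip(values, labels):
        subsets.setdefault(value, []).append(label)
    return entropy(labels) - sum(len(s) / n * entropy(s) for s in subsets.values())


def likelihood_ratio(values, labels):
    """Assumed form of the statistic: 2 * ln(2) * |S| * ig(a_i, S)."""
    return 2.0 * log(2) * len(labels) * information_gain(values, labels)


def degrees_of_freedom(values, labels):
    """(|dom(a_i)| - 1) * (|dom(y)| - 1), using the observed values only."""
    return (len(set(values)) - 1) * (len(set(labels)) - 1)


def g_statistic(values, labels):
    """Direct G-test statistic 2 * sum(O * ln(O / E)) over observed cells."""
    n = len(labels)
    joint = Counter(zip(values, labels))
    rows, cols = Counter(values), Counter(labels)
    return 2.0 * sum(o * log(o * n / (rows[v] * cols[y])) for (v, y), o in joint.items())


# Hypothetical sample of eight instances.
attribute = ["a"] * 4 + ["b"] * 4
target = ["+", "+", "+", "-", "-", "-", "-", "+"]

print(likelihood_ratio(attribute, target), g_statistic(attribute, target))
print(degrees_of_freedom(attribute, target))
```

The resulting value would then be compared with a χ² critical value for the computed degrees of freedom (for example, 3.841 for one degree of freedom at the 0.05 level).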

Normalized impurity‐based criteria: The impurity‐based criteria described above are biased toward attributes with larger domains. That is, they prefer input attributes with many values over attributes with fewer values. For instance, an input attribute that represents a national identification number, which takes a different value for nearly every instance, will probably get the highest information gain. However, adding this attribute to a decision tree will result in poor generalization accuracy. For that reason, it is useful to “normalize” the impurity‐based measures, as described in the subsequent paragraphs.

Gain ratio (gr): This ratio “normalizes” the information gain (ig) as follows: gr(a_i, S) = ig(a_i, S)/e(a_i, S). Note that this ratio is not defined when the denominator is zero. Also, the ratio may tend to favor attributes for which the denominator is very small. Consequently, it is suggested that the selection be carried out in two stages. First, the information gain is calculated for all attributes. Then, taking into consideration only attributes that have performed at least as well as the average information gain, the attribute that has obtained the best gain ratio is selected. It has been shown that the gain ratio tends to outperform the simple information gain criterion in terms of both accuracy and classifier complexity.
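The two-stage selection described above can be sketched as follows. This is an illustrative reading of the procedure, assuming that e(a_i, S) is the entropy of the attribute's own value distribution; the function names, the small tolerance on the average, and the toy columns are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy in bits of any list of discrete values."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())


def information_gain(values, labels):
    n = len(labels)
    subsets = {}
    for value, label in zip(values, labels):
        subsets.setdefault(value, []).append(label)
    return entropy(labels) - sum(len(s) / n * entropy(s) for s in subsets.values())


def gain_ratio(values, labels):
    """gr = ig / e(a_i, S); returns None where the ratio is undefined."""
    split_entropy = entropy(values)
    if split_entropy == 0:
        return None
    return information_gain(values, labels) / split_entropy


def select_attribute(columns, labels):
    """Stage 1: keep attributes whose gain is at least the average gain.
    Stage 2: among those, return the one with the best gain ratio."""
    gains = {name: information_gain(vals, labels) for name, vals in columns.items()}
    average = sum(gains.values()) / len(gains)
    # The tolerance guards against round-off when all gains are equal.
    candidates = [name for name, g in gains.items() if g >= average - 1e-12]
    ratios = {name: gain_ratio(columns[name], labels) for name in candidates}
    defined = {name: r for name, r in ratios.items() if r is not None}
    return max(defined, key=defined.get) if defined else None


# Hypothetical sample: "id" is unique per instance, "weather" is informative.
play = ["no"] * 4 + ["yes"] * 4
columns = {
    "id": list(range(8)),
    "weather": ["sunny"] * 4 + ["rain"] * 4,
    "noise": ["a", "b"] * 4,
}

print(select_attribute(columns, play))
```

In this toy sample both "id" and "weather" reach the maximum information gain, but the large split entropy of "id" lowers its gain ratio, so the informative attribute with fewer values is selected.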

Distance measure: Similar to the gain ratio, this measure also normalizes the impurity measure. However, the method used is different:

$$DM(a_i, S) = \frac{\Delta\Phi(a_i, S)}{-\sum_{v_{i,j} \in dom(a_i)} \sum_{c_k \in dom(y)} b \cdot \log_2 b}$$

where

$$b = \frac{|\sigma_{a_i = v_{i,j} \wedge y = c_k} S|}{|S|}$$

(2.18)
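A minimal sketch of the distance measure is given below. It assumes that the impurity function Φ is the entropy, so that the numerator ΔΦ(a_i, S) equals the information gain, and that the denominator is the joint entropy of the attribute and the target over the observed value pairs. The names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy in bits of any list of discrete values."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())


def information_gain(values, labels):
    n = len(labels)
    subsets = {}
    for value, label in zip(values, labels):
        subsets.setdefault(value, []).append(label)
    return entropy(labels) - sum(len(s) / n * entropy(s) for s in subsets.values())


def distance_measure(values, labels):
    """DM = impurity reduction / joint entropy of (attribute value, class).
    Assumption: entropy is used as the impurity function."""
    # Each distinct (value, label) pair contributes one term b * log2(b).
    joint_entropy = entropy(list(zip(values, labels)))
    if joint_entropy == 0:
        return None  # undefined when only one (value, class) pair occurs
    return information_gain(values, labels) / joint_entropy


# Hypothetical sample: a two-valued attribute versus a unique identifier.
target = ["a", "a", "b", "b"]
two_valued = ["x", "x", "z", "z"]
identifier = [1, 2, 3, 4]

print(distance_measure(two_valued, target))
print(distance_measure(identifier, target))
```

Both attributes achieve the same information gain here, but the identifier is penalized by the larger joint entropy in the denominator.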

Binary criteria: These are used for creating binary decision trees. These measures are based on the division of the input attribute domain into two subdomains.

Let β(a_i, d₁, d₂, S) denote the binary criterion value for attribute a_i over sample S when d₁ and d₂ are its corresponding subdomains. The value obtained for the optimal division of the attribute domain into two mutually exclusive and exhaustive subdomains is used for comparing attributes, namely

$$\beta^*(a_i, S) = \max_{\substack{d_1, d_2 \,:\, d_1 \cup d_2 = dom(a_i) \\ d_1 \cap d_2 = \emptyset}} \beta(a_i, d_1, d_2, S)$$

(2.19)
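The search for the optimal division can be illustrated by brute-force enumeration of all splits of the observed attribute domain into two non-empty subdomains. The sketch below is only practical for small domains, since the number of splits grows exponentially. The binary criterion β is passed in as a function; the Gini gain of the two-way split is used here purely as an example choice, and all names and data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def bipartitions(domain):
    """Yield every split of the domain into two non-empty, disjoint subdomains.
    The first element is pinned to d1 so mirrored splits are not repeated."""
    items = sorted(domain)
    first, rest = items[0], items[1:]
    for size in range(len(rest)):  # d1 never absorbs all of rest, so d2 is non-empty
        for extra in combinations(rest, size):
            d1 = {first, *extra}
            yield d1, set(items) - d1


def binary_gini_gain(values, labels, d1, d2):
    """Example beta: Gini gain of the two-way split induced by d1 and d2."""
    left = [y for v, y in zip(values, labels) if v in d1]
    right = [y for v, y in zip(values, labels) if v in d2]
    n = len(labels)
    return gini(labels) - len(left) / n * gini(left) - len(right) / n * gini(right)


def best_binary_split(values, labels, beta):
    """Return (score, d1, d2) for the split maximizing beta, or None."""
    best = None
    for d1, d2 in bipartitions(set(values)):
        score = beta(values, labels, d1, d2)
        if best is None or score > best[0]:
            best = (score, d1, d2)
    return best


# Hypothetical sample: only "blue" instances belong to class "n".
color = ["red", "green", "blue", "red", "green", "blue"]
target = ["y", "y", "n", "y", "y", "n"]

print(best_binary_split(color, target, binary_gini_gain))
```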

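Twoing criterion: The Gini index may encounter problems when the domain of the target attribute is relatively wide. For this case, a binary criterion called the twoing (tw) criterion has been suggested. This criterion is defined as

$$tw(a_i, d_1, d_2, S) = 0.25 \cdot \frac{|\sigma_{a_i \in d_1} S|}{|S|} \cdot \frac{|\sigma_{a_i \in d_2} S|}{|S|} \cdot \left( \sum_{c_k \in dom(y)} \left| \frac{|\sigma_{a_i \in d_1 \wedge y = c_k} S|}{|\sigma_{a_i \in d_1} S|} - \frac{|\sigma_{a_i \in d_2 \wedge y = c_k} S|}{|\sigma_{a_i \in d_2} S|} \right| \right)^2$$

(2.20)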
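A small sketch of the twoing criterion follows. It assumes the commonly cited form of the criterion: one quarter of the product of the two branch proportions, multiplied by the squared sum over classes of the absolute difference between the class proportions in the two branches. The function name and the toy data are hypothetical.

```python
from collections import Counter


def twoing(values, labels, d1, d2):
    """Twoing value for splitting the attribute domain into d1 and d2."""
    n = len(labels)
    left = [y for v, y in zip(values, labels) if v in d1]
    right = [y for v, y in zip(values, labels) if v in d2]
    if not left or not right:
        return 0.0  # a split with an empty branch separates nothing
    left_counts, right_counts = Counter(left), Counter(right)
    # Sum of absolute differences between per-class proportions in each branch.
    difference = sum(
        abs(left_counts[c] / len(left) - right_counts[c] / len(right))
        for c in set(labels)
    )
    return 0.25 * (len(left) / n) * (len(right) / n) * difference ** 2


# Hypothetical sample with three attribute values and two classes.
color = ["red", "red", "blue", "green"]
target = ["a", "a", "b", "b"]

print(twoing(color, target, {"red"}, {"blue", "green"}))  # pure branches
print(twoing(color, target, {"red", "blue"}, {"green"}))  # mixed left branch
```

For a split into two equally sized pure branches the criterion reaches 0.25, and it falls to zero when both branches have the same class distribution.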
