      Figure 2.1 A high-level representation of the Bayes optimal classifier.

      Bootstrap aggregating, popularly referred to as bagging, is a machine learning–based ensemble technique that improves the accuracy of a base algorithm and is used mostly for classification and regression purposes. The main purpose of bagging is to avoid the overfitting problem by generalizing properly over the available data samples. Starting from any standard input dataset, new training datasets are generated by sampling the data uniformly with replacement. Because sampling is done with replacement, some observations are repeated within each generated training set, and the models trained on these sets are combined through averaging (regression) or voting (classification) mechanisms. Bagging is typically applied to unstable procedures such as artificial neural networks and regression trees, whose behavior it stabilizes. For any given application, the choice between bagging and boosting depends on the available data. The variance incurred by the base learner is reduced by combining bootstrapping and aggregation [15, 16].
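      To make the resampling step concrete, the following minimal Python sketch draws one bootstrap sample with replacement from a toy dataset; the data, the NumPy usage, and the out-of-bag remark are illustrative assumptions, not part of the cited works:

```python
import numpy as np

def bootstrap_sample(X, y, rng):
    """Draw one bootstrap sample: n indices chosen uniformly with replacement."""
    n = len(X)
    idx = rng.integers(0, n, size=n)  # repeats are expected; some rows are left out
    return X[idx], y[idx]

rng = np.random.default_rng(42)
X = np.arange(10).reshape(10, 1)   # toy dataset: 10 observations
y = np.array([0, 1] * 5)

X_b, y_b = bootstrap_sample(X, y, rng)
# On average ~63.2% of the original rows appear in each bootstrap sample;
# the remaining "out-of-bag" rows can serve as a held-out error estimate.
print(np.unique(X_b).size, "unique rows out of", len(X_b))
```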

      Bagging and boosting are considered two of the most powerful tools in ensemble machine learning. The bagging operation is commonly used together with decision trees: it increases the stability of the model by reducing variance and improves its accuracy by lowering the error rate. The set of predictions made by the ensemble members is aggregated to produce the best prediction as the final output. During bootstrapping, prominent samples are drawn with replacement, so the selection of each new observation does not depend on the previous random selections. In practice, a base learning algorithm is chosen first, and the bagging of a pool of decision trees is built on top of it. Some of the zoonotic diseases that can be identified, and thus treated promptly, with the help of bootstrap aggregating are zoonotic influenza, salmonellosis, West Nile virus, plague, rabies, Lyme disease, brucellosis, and so on [17]. A high-level representation of bootstrap aggregating is shown in Figure 2.2. It begins with the training dataset, which is distributed among multiple bootstrap sampling units. Each bootstrap sampling unit operates on its training subset, on which the learning algorithm performs the learning operation and generates a classification output. The aggregated result of the individual classifiers is produced as the final output.
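      As a hedged illustration of the pipeline in Figure 2.2, the sketch below trains a pool of decision trees on bootstrap samples and aggregates their predictions by majority vote; the synthetic dataset and the scikit-learn calls are stand-ins, since the text specifies no concrete implementation:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in data; a real study would use labeled disease-surveillance records.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
n_trees = 25
trees = []
for _ in range(n_trees):
    idx = rng.integers(0, len(X_tr), size=len(X_tr))  # bootstrap sample
    trees.append(DecisionTreeClassifier().fit(X_tr[idx], y_tr[idx]))

# Aggregate the pool of trees by majority vote (binary labels here).
votes = np.stack([t.predict(X_te) for t in trees])   # shape (n_trees, n_test)
y_hat = (votes.mean(axis=0) >= 0.5).astype(int)
print("bagged accuracy:", (y_hat == y_te).mean())
```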

      Figure 2.2 A high-level representation of bootstrap aggregating.

      Some of the advantages offered by bootstrap aggregating in diagnosing zoonotic diseases are as follows: the aggregated power of several weak learners outperforms a single strong learner; the variance incurred is reduced, since bagging efficiently handles the overfitting problem; there is no loss of precision during interoperability; it is computationally inexpensive owing to proper management of resources; computation of the overconfidence bias becomes easier; equal weights are assigned to the models, which increases performance; fewer samples are misclassified; it is highly robust to the effects of outliers and noise; the models can easily be parallelized; high accuracy is achieved through incremental development of the model; unstable methods are stabilized; it is easy from an implementation point of view; it provides an unbiased estimate of the test error; it readily overcomes the pitfalls of individual machine learning models; and so on.

      Figure 2.3 A high-level representation of Bayesian model averaging (BMA).

      Some of the advantages offered by BMA in diagnosing zoonotic diseases are as follows: it is capable of performing multi-variable selection; it avoids overconfident inferences; fewer features are selected; it scales easily to any number of classes; posterior probability efficiency is high; deployment of the model is easier; uncertainty is estimated correctly; it is suitable for handling complex applications; model uncertainty is properly accounted for; it combines estimation and prediction; it is flexible with respect to the prior distribution; it uses a mean candidate placement model; it performs multi-linear operations; it is suitable for handling heterogeneous resources; it provides a transparent interpretation of large amounts of data; error reduction happens exponentially; the variance incurred in prediction is low; flexibility is achieved in parameter inference; uncertainty about model predictions is reduced; high-speed compilation is possible; it generates high-valued output; it combines the efficiency achieved by several learners and averaged models; it is highly robust against the effects caused by misspecification of input attributes; model specification is highly dynamic; and so on.
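      Since the combination rule of BMA is not written out in the text, the following minimal numeric sketch shows the core idea: per-model predictions are averaged with posterior model weights proportional to marginal likelihood times prior. All numbers here are hypothetical:

```python
import numpy as np

# Hypothetical predictive probabilities p(disease | x, M_k) from three models,
# plus each model's marginal likelihood p(D | M_k) -- illustrative values only.
pred = np.array([0.80, 0.60, 0.30])
marg_lik = np.array([1e-4, 5e-5, 1e-5])
prior = np.array([1/3, 1/3, 1/3])          # uniform model prior p(M_k)

# Posterior model weights: p(M_k | D) is proportional to p(D | M_k) * p(M_k).
w = marg_lik * prior
w /= w.sum()

# BMA prediction: posterior-weighted average of the models' predictions.
p_bma = np.dot(w, pred)
print("posterior weights:", np.round(w, 3), "BMA prediction:", round(p_bma, 3))
```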

      Bayesian classifier combination (BCC) considers k different types of classifiers and produces a combined output. The motivation behind this classifier is that it captures the exhaustive possibilities across all forms of data and eases the computation of marginal likelihood relationships. BCC does not assume that any of the existing classifiers is true; rather, each classifier is treated probabilistically, mimicking the behavior of human experts. BCC employs different confusion matrices over the different data points for classification purposes: for hard data points, BCC uses the classifiers' own confusion matrices; otherwise, the posterior confusion matrix is used. The classifier identifies the relationship between the outputs of the models and the unknown data labels. The probabilistic models are
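      The confusion-matrix-based combination can be sketched as follows, under the simplifying assumptions that the classifier outputs are conditionally independent given the true label and that the confusion matrices are already known; the full BCC model infers these matrices jointly, and all numbers here are hypothetical:

```python
import numpy as np

# theta[k][j, c] approximates p(classifier k outputs c | true class j);
# these row-stochastic confusion matrices are assumed known for the sketch.
theta = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # classifier 1
    [[0.7, 0.3], [0.3, 0.7]],   # classifier 2
    [[0.8, 0.2], [0.4, 0.6]],   # classifier 3
])
prior = np.array([0.5, 0.5])    # prior over the unknown true class
outputs = [1, 1, 0]             # labels emitted by the three classifiers

# Posterior over the true label: multiply in each classifier's likelihood.
post = prior.copy()
for k, c in enumerate(outputs):
    post *= theta[k][:, c]      # p(output c | true class j) for each j
post /= post.sum()
print("p(true class | outputs):", np.round(post, 3))
```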
