Computational Statistics in Data Science. Group of authors. Mathematics.

Information about the work:

Title: Computational Statistics in Data Science

ISBN: 9781119561088

Author: Group of authors

Genre: Mathematics

Publisher: John Wiley & Sons Limited

of the greatest decrease, which is given by the negative gradient. The idea is that by repeating this step, we eventually arrive at a minimum. Under suitable conditions, the algorithm converges to a local minimum, but not necessarily a global one [4]; see Algorithm 1.
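The update described above can be sketched in a few lines of Python. This is only an illustrative sketch and not a reproduction of Algorithm 1: the function name `gradient_descent`, the step size `lr`, the number of steps, and the quadratic objective are all assumptions chosen for the example.

```python
from typing import Callable, List


def gradient_descent(
    grad: Callable[[List[float]], List[float]],
    theta0: List[float],
    lr: float = 0.1,      # step size (assumed value)
    n_steps: int = 200,   # fixed number of iterations (assumed stopping rule)
) -> List[float]:
    """Repeatedly move theta a small step against the gradient."""
    theta = list(theta0)
    for _ in range(n_steps):
        g = grad(theta)
        # Step in the direction of steepest decrease (the negative gradient).
        theta = [t - lr * gi for t, gi in zip(theta, g)]
    return theta


# Hypothetical objective: f(a, b) = (a - 3)^2 + (b + 1)^2, minimized at (3, -1).
def quadratic_grad(theta: List[float]) -> List[float]:
    a, b = theta
    return [2.0 * (a - 3.0), 2.0 * (b + 1.0)]


theta_hat = gradient_descent(quadratic_grad, [0.0, 0.0])
print(theta_hat)
```

For this convex objective the iterates approach the unique minimizer; for a non-convex error criterion the same loop may stop at a local minimum that depends on the starting point.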

Gradient descent is often very slow in machine learning applications, because computing the true gradient of the error criterion usually requires a pass through the entire dataset. Since the gradient must be computed at each time step of the algorithm, the entire dataset ends up being traversed a very large number of times. To speed up the process, we instead use a variation on gradient descent known as stochastic gradient descent. Stochastic gradient descent approximates the gradient at each time step with the gradient at a single observation, which significantly reduces the cost of each step [5]; see Algorithm 2.
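To make the single-observation approximation concrete, here is a minimal sketch of stochastic gradient descent for a least-squares line fit. It is not a reproduction of Algorithm 2: the function name `sgd_linear`, the squared-error criterion, the step size, the uniform random choice of observation, and the synthetic data are all assumptions made for illustration.

```python
import random
from typing import List, Tuple


def sgd_linear(
    data: List[Tuple[float, float]],
    lr: float = 0.05,      # step size (assumed value)
    n_steps: int = 5000,   # number of single-observation updates (assumed)
    seed: int = 0,
) -> Tuple[float, float]:
    """Fit y ~ a + b * x by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    a, b = 0.0, 0.0
    for _ in range(n_steps):
        # Pick one observation at random instead of scanning the whole dataset.
        x, y = data[rng.randrange(len(data))]
        resid = (a + b * x) - y
        # Gradient of 0.5 * resid^2 with respect to (a, b) at this observation.
        a -= lr * resid
        b -= lr * resid * x
    return a, b


# Hypothetical noise-free data generated from y = 1 + 2x.
data = [(k / 10.0, 1.0 + 2.0 * (k / 10.0)) for k in range(-10, 11)]
a_hat, b_hat = sgd_linear(data)
print(a_hat, b_hat)
```

Each update touches one observation, so its cost does not grow with the size of the dataset; the price is a noisier path toward the minimum than full gradient descent would take.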

3 Feedforward Neural Networks

3.1 Introduction

A feedforward neural network, also known as a multilayer perceptron (MLP), is a popular supervised learning method that provides a parameterized form for the nonlinear map $\delta$ from an input to a predicted label [6]. The form of $\delta$ here can be depicted graphically as a directed layered network, where the directed edges go upward from nodes in one layer to nodes in the next layer. Neural networks are very powerful models, as they can approximate any Borel measurable function to an arbitrary degree of accuracy, provided that there are enough hidden nodes and the parameters are chosen correctly.

3.2 Model Description

We start by describing a simple MLP with three layers, as depicted in Figure 1.

The bottom layer of a three‐layer MLP is called the input layer, with each node representing the respective element of an input vector. The top layer is known as the output layer and represents the final output of the model, a predicted vector. Each node in the output layer represents the predicted score of the respective class. The middle layer is called the hidden layer and captures the unobserved latent features of the input. This is the only layer where the number of nodes is determined by the user of the model, rather than by the problem itself.

The directed edges in the network represent weights from a node in one layer to another node in the next layer. We collect the weights from the nodes in the input layer to the nodes in the hidden layer in the matrix $\boldsymbol{W}$, and the weights from the nodes in the hidden layer to the nodes in the output layer in the matrix $\boldsymbol{V}$. In each of the input and hidden layers, we introduce an intercept node. Weights from the intercept nodes to any other node are called biases. Each node in a given layer is connected by a weight to every node in the layer above except the intercept node.

The value of each node in the hidden and output layers is determined as a nonlinear transformation of the linear combination of the values of the nodes in the previous layer and the weights from each of those nodes to the node of interest. That is, the vector of hidden-node values is given by $\gamma(\boldsymbol{W}^{T}\boldsymbol{x}_i)$, where $\gamma$ is a nonlinear transformation whose range lies in a fixed interval. Similarly, the vector of output-node values, $\hat{y}_i$, is given by $\tau(\boldsymbol{V}^{T}\gamma(\boldsymbol{W}^{T}\boldsymbol{x}_i))$, where $\tau$ is also a nonlinear transformation whose range lies in a fixed interval.
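The layer-by-layer computation can be sketched as a short forward pass. This is a sketch under stated assumptions rather than the authors' implementation: the logistic sigmoid stands in for both nonlinear transformations (the specific functions are not given here), the intercept node is handled by prepending a constant 1 to each layer's values so that row 0 of each weight matrix holds the biases, and the weight values are arbitrary.

```python
import math
from typing import List

Matrix = List[List[float]]


def sigmoid(z: float) -> float:
    """Logistic function, with range in the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(weights: Matrix, values: List[float]) -> List[float]:
    """Apply sigmoid(weights^T [1, values]); row 0 of weights holds the biases."""
    augmented = [1.0] + values  # prepend the intercept node
    n_out = len(weights[0])
    return [
        sigmoid(sum(weights[i][j] * augmented[i] for i in range(len(augmented))))
        for j in range(n_out)
    ]


def mlp_forward(W: Matrix, V: Matrix, x: List[float]) -> List[float]:
    """Three-layer MLP forward pass with sigmoid used for both transformations."""
    hidden = layer(W, x)      # hidden-layer values from the input layer
    return layer(V, hidden)   # output-layer values from the hidden layer


# Arbitrary illustrative weights: 2 inputs, 3 hidden nodes, 2 outputs.
W = [[0.1, -0.2, 0.0], [0.5, 0.3, -0.4], [-0.6, 0.2, 0.7]]  # (2 + 1) x 3
V = [[0.0, 0.1], [0.4, -0.3], [-0.2, 0.5], [0.3, 0.2]]      # (3 + 1) x 2
y_hat = mlp_forward(W, V, [1.0, 2.0])
print(y_hat)
```

Note that neither weight matrix has a column for an intercept node in the layer above, which matches the statement that intercept nodes receive no incoming weights.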

More formally, the map provided by an MLP from a sample $\boldsymbol{x}_i$ to $\hat{y}_i$ can be written as follows:

$$\delta(\boldsymbol{x}_i, \boldsymbol{\theta}_{\mathcal{M}}) = \hat{y}_i = \tau\left(\boldsymbol{V}^{T} \gamma\left(\boldsymbol{W}^{T} \boldsymbol{x}_i\right)\right)$$

where $\boldsymbol{V} = (\boldsymbol{v}_0, \ldots,$ …