
the input weights are adjusted so as to minimize the error. The error is evaluated at every iteration and is used to update the weights at the nodes. Each hidden node also applies an activation function. Some common activation functions are described below. Please see Figure 1.2.
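The weight-update idea above can be sketched as plain gradient descent on a single linear neuron. This is only an illustration, not the training procedure used in this chapter: the squared-error loss, the learning rate `lr`, the function name `train_neuron`, and the toy data are all assumptions made for the example.

```python
# Minimal sketch: adjust weights step by step to reduce the error.
# Assumptions (not from the text): squared-error loss, one linear neuron.

def train_neuron(samples, lr=0.1, epochs=200):
    """Fit y ~ w * x + b by gradient descent; returns (w, b, error_history)."""
    w, b = 0.0, 0.0
    history = []
    n = len(samples)
    for _ in range(epochs):
        grad_w = grad_b = total_error = 0.0
        for x, y in samples:
            error = (w * x + b) - y      # prediction error for this sample
            total_error += error ** 2
            grad_w += 2 * error * x      # d(error^2)/dw
            grad_b += 2 * error          # d(error^2)/db
        history.append(total_error / n)  # the error is inspected every epoch
        w -= lr * grad_w / n             # move the weights against the gradient
        b -= lr * grad_b / n
    return w, b, history


# Hypothetical toy data generated from y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b, history = train_neuron(data)
print(f"w={w:.3f}, b={b:.3f}, first error={history[0]:.3f}, last error={history[-1]:.6f}")
```

The recorded `history` plays the role of the error value that is checked at each iteration: it should shrink as the weights approach the values that generated the data.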

      1.4.1.1.1 Sigmoid

      The sigmoid function, $\sigma(z) = 1/(1 + e^{-z})$, is used because its output ranges between 0 and 1, so it can be read as a probability.

      1.4.1.1.2 Tanh

      Tanh is similar in shape to the sigmoid but ranges from −1 to 1. Because its output is centered on zero, it often trains better than the sigmoid.
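      The relationship between the two functions can be checked numerically. The sketch below uses only Python's standard `math` module; the sample points in `zs` are arbitrary choices for illustration. It relies on the identity tanh(z) = 2·sigmoid(2z) − 1, which shows that tanh is a rescaled and shifted sigmoid.

```python
import math


def sigmoid(z):
    """Logistic sigmoid; the output lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


# Arbitrary sample points chosen for illustration.
zs = [-4.0, -1.0, 0.0, 1.0, 4.0]

sig_values = [sigmoid(z) for z in zs]      # all in (0, 1)
tanh_values = [math.tanh(z) for z in zs]   # all in (-1, 1)

# tanh is a rescaled, shifted sigmoid: tanh(z) = 2 * sigmoid(2z) - 1
rescaled = [2.0 * sigmoid(2.0 * z) - 1.0 for z in zs]

for z, s, t, r in zip(zs, sig_values, tanh_values, rescaled):
    print(f"z={z:5.1f}  sigmoid={s:.4f}  tanh={t:.4f}  2*sigmoid(2z)-1={r:.4f}")
```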

[Figure: graph of the sigmoid function.]
[Figure: graph of the tanh function.]

      1.4.1.1.3 ReLU

      ReLU, defined as $\max(0, z)$, ranges from 0 to infinity.

      Using ReLU can mitigate the vanishing gradient problem. It also requires far less computation than the sigmoid and tanh. The main problem with ReLU is that when Z < 0 the gradient is 0, so the corresponding weights are not updated. To tackle this, ReLU is used only in hidden layers and not in input or output layers.
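      The two gradient behaviours described above can be compared directly. The following is a minimal sketch using only the standard library; the helper names `relu_grad` and `sigmoid_grad` and the sample inputs are assumptions for illustration. The ReLU gradient stays at 1 for any positive input but is 0 for negative input, while the sigmoid gradient shrinks towards 0 as the input grows in magnitude.

```python
import math


def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)


def relu_grad(z):
    # Derivative of max(0, z). The value at exactly z == 0 is a convention;
    # 0 is used here.
    return 1.0 if z > 0 else 0.0


def sigmoid_grad(z):
    """Derivative of the logistic sigmoid: s * (1 - s)."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)


# Arbitrary sample inputs chosen for illustration.
for z in (-5.0, -1.0, 0.5, 5.0, 10.0):
    print(
        f"z={z:5.1f}  relu={relu(z):5.1f}  "
        f"relu'={relu_grad(z):.1f}  sigmoid'={sigmoid_grad(z):.6f}"
    )
```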

[Figure: graph of the ReLU function.]
[Figure: schematic illustration of the basic Bernoulli’s restricted Boltzmann machine.]

      1.4.2 Bernoulli’s Restricted Boltzmann Machines

      Bernoulli’s RBM has binary visible units $v_i$ and binary hidden units $h_j$, connected by a matrix of weights $w$. It also has bias weights $a_i$ and $b_j$ for the visible and hidden units, respectively. With these, the energy equation can be written as follows:

      (1.1) [equation image]

      (1.2) [equation image]

      Z is a normalizing constant (the partition function) that makes the probabilities of all configurations sum to 1.
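      For reference, the energy and joint distribution of a Bernoulli RBM are usually written in the textbook form below. This is assumed, not confirmed, to correspond to Equations (1.1) and (1.2). The index convention is also an assumption: $i$ runs over visible units and $j$ over hidden units, so the biases are $a_i$ and $b_j$ and the weights are $w_{ij}$.

```latex
\begin{align}
E(v, h) &= -\sum_i a_i v_i \;-\; \sum_j b_j h_j \;-\; \sum_i \sum_j v_i \, w_{ij} \, h_j \\
P(v, h) &= \frac{1}{Z} \, e^{-E(v, h)},
\qquad Z = \sum_{v', h'} e^{-E(v', h')}
\end{align}
```

      Lower energy means higher probability, and the sum defining $Z$ runs over every joint configuration of visible and hidden units.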

      The conditional probability of h given v is as follows:

      (1.3) [equation image]

      The conditional probability of v given h is as follows:

      (1.4) [equation image]

      The individual activation probabilities are as follows:

      (1.5) [equation image]

      (1.6) [equation image]
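      In the standard Bernoulli RBM formulation, the conditionals factorize over units, and each unit switches on with probability sigmoid(bias + weighted input). The sketch below implements that textbook form, which is assumed to match Equations (1.3) to (1.6). The function names, the index convention (`w[i][j]` links visible unit `i` to hidden unit `j`, with visible biases `a` and hidden biases `b`), and the toy parameter values are all hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_probs(v, w, b):
    """P(h_j = 1 | v) = sigmoid(b_j + sum_i v_i * w[i][j])."""
    return [
        sigmoid(b[j] + sum(v[i] * w[i][j] for i in range(len(v))))
        for j in range(len(b))
    ]


def visible_probs(h, w, a):
    """P(v_i = 1 | h) = sigmoid(a_i + sum_j w[i][j] * h_j)."""
    return [
        sigmoid(a[i] + sum(w[i][j] * h[j] for j in range(len(h))))
        for i in range(len(a))
    ]


def sample_bernoulli(probs, rng):
    """Draw one binary value per unit from its activation probability."""
    return [1 if rng.random() < p else 0 for p in probs]


def gibbs_step(v, w, a, b, rng):
    """One v -> h -> v' step of block Gibbs sampling."""
    h = sample_bernoulli(hidden_probs(v, w, b), rng)
    v_new = sample_bernoulli(visible_probs(h, w, a), rng)
    return h, v_new


# Hypothetical toy model: 3 visible units, 2 hidden units.
rng = random.Random(0)
w = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8]]
a = [0.0, 0.1, -0.1]   # visible biases
b = [0.2, -0.3]        # hidden biases
v0 = [1, 0, 1]

p_h = hidden_probs(v0, w, b)
h, v1 = gibbs_step(v0, w, a, b, rng)
print("P(h=1|v0):", p_h)
print("sampled h:", h, "reconstructed v:", v1)
```

      Because all units within a layer are conditionally independent given the other layer, each layer can be sampled in a single pass, which is what `gibbs_step` does.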

      For ANN, the results are as follows.

[Figure: accuracy plot for the ANN with two hidden layers.]
[Figure: accuracy plot for the ANN with three hidden layers.]
[Figure: accuracy plot for the ANN with four hidden layers.]