Скачать книгу

rule, we get another useful formula:

Image

      which is the famous Bayes formula. Note that it also holds for more than two variables:

Image

      Using product rule, we can deduce the chain rule:

Image

      where X1, X2, …, Xn are n random variables.

      The average value of some function f(x) (where x is the value of a certain random variable) under a probability distribution P(x) is called the expectation of f(x). For a discrete distribution, it can be written as

Image

      Usually, when f(x) = x, 𝔼[x] stands for the expectation of x.

      To measure the dispersion level of f(x) around its mean value E[f(x)], we introduce the variance of f(x):

Image

      The standard deviation is the square root of variance. In some level, covariance expresses the degree to which two variables vary together:

Image

      Greater covariance indicates higher relevance between f(x) and g(y).

      Probability distributions describe the probability of a random variable or several random variables on every state. Several distributions that are useful in the area of machine learning are listed as follows.

      • Gaussian distribution: it is also known as normal distribution and can be expressed as:

Image

      where μ is the mean of variable x and σ2 is the variance.

      • Bernoulli distribution: random variable X can either be 0 or 1, with a probability P(X = 1) = p. Then the distribution function is

Image

      It is quite obvious that E(X = p and Var(X) = p(1 − p).

      • Binomial distribution: repeat the Bernoulli experiment for N times and the times that X equals to 1 is denote by Y, then

Image

      is the Binomial distribution satisfying that E(Y = np) and Var(Y) = np(1 − p).

      • Laplace distribution: the Laplace distribution is described as

Image

      Graphs are the basic subjects in the study of GNNs. Therefore, to get a comprehensive understanding of GNN, the fundamental graph theory is required.

      A graph is often denoted by G = (V, E), where V is the set of vertices and E is the set of edges. An edge e = u, v has two endpoints u and v, which are said to be joined by e. In this case, u is called a neighbor of v, or in other words, these two vertices are adjacent. Note that an edge can either be directed or undirected. A graph is called a directed graph if all edges are directed or undirected graph if all edges are undirected. The degree of vertice v, denoted by d(v), is the number of edges connected with v.

      There are a few useful algebra representations for graphs, which are listed as follows.

      • Adjacency matrix: for a simple graph G = (V, E) with n-vertices, it can be described by an adjacency matrix A ∈ ℝn × n, where

Image

      It is obvious that such matrix is a symmetric matrix when G is an undirected graph.

      • Degree matrix: for a graph G = (V, E) with n-vertices, its degree matrix D ∈ ℝn × n is a diagonal matrix, where

Image

      • Laplacian matrix: for a simple graph G = (V, E) with n-vertices, if we consider all edges in G to be undirected, then its Laplacian matrix L = ℝn × n can be defined as

Image

      Thus, we have the elements:

Image

      Note that the graph is considered to be an undirected graph for the adjacency matrix.

      • Symmetric normalized Laplacian: the symmetric normalized Laplacian is defined as:

Image

      The elements are given by:

Image

      • Random walk normalized Laplacian: it is defined as:

Image

      The elements can be computed by:

Image

      • Incidence matrix: another common used matrix in representing a graph is incidence matrix. For a directed graph G = (V, E) with n-vertices and m-edges, the corresponding incidence matrix is M ∈ ℝn × m, where

Image

      For a undirected graph, the corresponding incidence matrix satisfies that

Image

      CHAPTER 3

      Basics of Neural Networks

      Neural networks are one of the most important models in machine learning. The structure of artificial neural networks, which consists of numerous neurons with connections to each other, bears great resemblance to that of biological neural networks.

Скачать книгу