Artificial Intelligence and Quantum Computing for Advanced Wireless Networks. Savo G. Glisic

Information about the work:

Title: Artificial Intelligence and Quantum Computing for Advanced Wireless Networks

ISBN: 9781119790310

Author: Savo G. Glisic

Genre: Software

Publisher: John Wiley & Sons Limited

The graph attention network (GAT) [35] incorporates the attention mechanism into the propagation step. It computes the hidden state of each node by attending over its neighbors, following a self‐attention strategy. The work defines a single graph attentional layer and constructs arbitrary GATs by stacking this layer. The layer computes the attention coefficient of the node pair (i, j) by

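\[ \alpha_{ij} = \frac{\exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{T}\left[\mathbf{W}h_i \,\|\, \mathbf{W}h_j\right]\right)\right)}{\sum_{k \in \mathcal{N}_i} \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{T}\left[\mathbf{W}h_i \,\|\, \mathbf{W}h_k\right]\right)\right)} \tag{5.24} \]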

where α_ij is the attention coefficient of node j to node i, and 𝒩_i represents the neighborhood of node i in the graph. The input set of node features to the layer is h = {h₁, h₂, …, h_N}, h_i ∈ ℝ^F, where N is the number of nodes and F is the number of features of each node; the layer produces a new set of node features (of potentially different cardinality F′), h′ = {h′₁, h′₂, …, h′_N}, h′_i ∈ ℝ^F′, as its output. W ∈ ℝ^(F′×F) is the weight matrix of a shared linear transformation that is applied to every node, and a ∈ ℝ^(2F′) is the weight vector of a single‐layer feedforward neural network (FNN). The output of this network passes through the LeakyReLU nonlinearity (with negative input slope 0.2), and the resulting scores are normalized over 𝒩_i by a softmax function. After applying a nonlinearity σ, the final output features of each node can be obtained as

\[ h'_i = \sigma\left( \sum_{j \in \mathcal{N}_i} \alpha_{ij} \mathbf{W} h_j \right) \tag{5.25} \]
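As an illustration of Eqs. (5.24) and (5.25), the following is a minimal NumPy sketch of a single graph attentional layer. It is not the reference implementation of [35]: the function names, the dense 0/1 adjacency input `adj`, and the choice of tanh for the nonlinearity are assumptions made for this example. The sketch also assumes every node has at least one neighbor (for instance, a self‐loop), since the softmax is undefined otherwise.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    # LeakyReLU with negative input slope 0.2, as stated in the text
    return np.where(x > 0, x, slope * x)

def gat_layer(h, adj, W, a, activation=np.tanh):
    """Single-head graph attentional layer (illustrative sketch).

    h:   (N, F) input node features
    adj: (N, N) array, adj[i, j] > 0 if j is in the neighborhood of i
         (assumed to contain at least one nonzero entry per row)
    W:   (F_out, F) shared weight matrix
    a:   (2 * F_out,) attention weight vector
    """
    z = h @ W.T                                  # rows are W h_i, shape (N, F_out)
    f_out = z.shape[1]
    # a^T [W h_i || W h_j] splits into a term in i plus a term in j
    score = (z @ a[:f_out])[:, None] + (z @ a[f_out:])[None, :]
    e = leaky_relu(score)
    e = np.where(adj > 0, e, -np.inf)            # restrict to the neighborhood
    e = e - e.max(axis=1, keepdims=True)         # numerical stability
    w = np.exp(e)
    alpha = w / w.sum(axis=1, keepdims=True)     # Eq. (5.24)
    return activation(alpha @ z), alpha          # Eq. (5.25)
```

Each row of `alpha` sums to one and is zero outside the neighborhood, so the output of node i is a convex combination of the transformed features of its neighbors before the nonlinearity is applied.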

The layer utilizes multi‐head attention similarly to [33] to stabilize the learning process. It applies K independent attention mechanisms to compute the hidden states and then concatenates their features (or computes the average), resulting in the following two output representations:

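\[ h'_i = \Big\Vert_{k=1}^{K} \, \sigma\left( \sum_{j \in \mathcal{N}_i} \alpha_{ij}^{k} \mathbf{W}^{k} h_j \right) \tag{5.26} \]

\[ h'_i = \sigma\left( \frac{1}{K} \sum_{k=1}^{K} \sum_{j \in \mathcal{N}_i} \alpha_{ij}^{k} \mathbf{W}^{k} h_j \right) \tag{5.27} \]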

where α_ij^k is the normalized attention coefficient computed by the k‐th attention mechanism. The attention architecture in [35] has several properties: (i) the computation over node‐neighbor pairs is parallelizable, which makes the operation efficient; (ii) it can be applied to graph nodes with different degrees by assigning arbitrary weights to neighbors; and (iii) it can be easily applied to inductive learning problems.
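The two multi‐head variants in Eqs. (5.26) and (5.27) can be sketched as follows. This is an illustrative NumPy example under stated assumptions, not the implementation of [35]: the helper names, the `mode` argument, the dense adjacency input, and the tanh nonlinearity are all choices made here, and each node is assumed to have at least one neighbor.

```python
import numpy as np

def attention_head(h, adj, W, a, slope=0.2):
    """One attention head: returns sum_j alpha_ij W h_j for every node i."""
    z = h @ W.T
    f_out = z.shape[1]
    e = (z @ a[:f_out])[:, None] + (z @ a[f_out:])[None, :]
    e = np.where(e > 0, e, slope * e)            # LeakyReLU
    e = np.where(adj > 0, e, -np.inf)            # neighbors only
    w = np.exp(e - e.max(axis=1, keepdims=True))
    alpha = w / w.sum(axis=1, keepdims=True)
    return alpha @ z

def multi_head_gat(h, adj, Ws, As, mode="concat", activation=np.tanh):
    """Combine K independent heads by concatenation or by averaging."""
    heads = [attention_head(h, adj, W, a) for W, a in zip(Ws, As)]
    if mode == "concat":
        # Eq. (5.26): nonlinearity per head, then concatenate features
        return np.concatenate([activation(x) for x in heads], axis=1)
    if mode == "average":
        # Eq. (5.27): average the heads, then apply the nonlinearity
        return activation(np.mean(heads, axis=0))
    raise ValueError("mode must be 'concat' or 'average'")
```

With K heads of output size F′, concatenation yields K·F′ features per node, whereas averaging keeps F′ features; averaging is therefore the natural choice when the layer output must have a fixed size, such as in a final prediction layer.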

Apart from different variants of GNNs, several general frameworks have been proposed that aim to integrate different models into a single framework.

Message passing neural networks (MPNNs) [36]: This framework abstracts the commonalities between several of the most popular models for graph‐structured data, such as spectral and non‐spectral approaches in graph convolution, gated GNNs, interaction networks, molecular graph convolutions, and deep tensor neural networks. The model contains two phases, a message passing phase and a readout phase. The message passing phase (namely, the propagation step) runs for T time steps and is defined in terms of the message function M_t and the vertex update function U_t. Using messages m_v^t, the updating functions of the hidden states are

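\[ m_v^{t+1} = \sum_{w \in \mathcal{N}_v} M_t\left(h_v^{t}, h_w^{t}, e_{vw}\right), \qquad h_v^{t+1} = U_t\left(h_v^{t}, m_v^{t+1}\right) \tag{5.28} \]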

where e_vw represents features of the edge from node v to w. The readout phase computes a feature vector for the whole graph using the readout function R according to

\[ \hat{y} = R\left(\left\{ h_v^{T} \mid v \in G \right\}\right) \tag{5.29} \]

where T denotes the total number of time steps. The message function M_t, vertex update function U_t, and readout function R can each take different settings. Hence, the MPNN framework can generalize several different models via different function settings. Here, we give an example of generalizing GGNN; the function settings of other models can be found in [36]. The function settings for GGNNs are
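Before turning to the GGNN settings, the two phases of Eqs. (5.28) and (5.29) can be sketched with pluggable functions. The following Python example is only a sketch: the function name `mpnn_forward`, the edge‐list representation, and the assumption that messages have the same dimension as the hidden states are choices made here for illustration and are not prescribed by [36].

```python
import numpy as np

def mpnn_forward(h0, edges, edge_feats, message_fn, update_fn, readout_fn, T):
    """Generic message passing followed by a readout (illustrative sketch).

    h0:         (N, D) initial hidden states
    edges:      list of (v, w) pairs, meaning w is a neighbor of v
    edge_feats: dict mapping (v, w) to the edge feature e_vw
    message_fn: M_t, called as message_fn(t, h_v, h_w, e_vw)
    update_fn:  U_t, called as update_fn(t, h_v, m_v)
    readout_fn: R, called on the (N, D) array of final hidden states
    """
    h = np.array(h0, dtype=float)
    for t in range(T):
        m = np.zeros_like(h)                     # assumes messages have size D
        for v, w in edges:
            m[v] += message_fn(t, h[v], h[w], edge_feats[(v, w)])
        # all nodes are updated synchronously from the step-t states
        h = np.stack([update_fn(t, h[v], m[v]) for v in range(len(h))])
    return readout_fn(h)

# Toy usage: scalar edge weights, additive update, sum readout.
h0 = np.array([[1.0], [2.0]])
edges = [(0, 1), (1, 0)]
edge_feats = {(0, 1): 1.0, (1, 0): 1.0}
y = mpnn_forward(
    h0, edges, edge_feats,
    message_fn=lambda t, hv, hw, e: e * hw,
    update_fn=lambda t, hv, m: hv + m,
    readout_fn=lambda h: h.sum(axis=0),
    T=2,
)
```

Different models are then obtained purely by swapping `message_fn`, `update_fn`, and `readout_fn`, which is the sense in which the framework generalizes them.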

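\[ M_t\left(h_v^{t}, h_w^{t}, e_{vw}\right) = A_{e_{vw}} h_w^{t}, \qquad U_t = \mathrm{GRU}\left(h_v^{t}, m_v^{t+1}\right), \qquad R = \sum_{v \in V} \sigma\left(i\left(h_v^{T}, h_v^{0}\right)\right) \odot \left(j\left(h_v^{T}\right)\right) \tag{5.30} \]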

where A_{e_vw} is the adjacency matrix, one for each edge label e. GRU is the gated recurrent unit introduced in [25], and i and j are neural networks in the function R.
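The GGNN settings in Eq. (5.30) can be sketched in NumPy as follows. Several details are assumptions made for this example and are not taken from the text: the GRU omits bias terms and uses the convention h′ = (1 − z) ⊙ h + z ⊙ h̃, the networks i and j are reduced to single linear maps `Wi` and `Wj`, and each matrix in `A` is treated as a D × D parameter indexed by edge label.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(h, m, p):
    """Bias-free GRU update with input m and previous state h (row vectors)."""
    z = sigmoid(m @ p["Wz"] + h @ p["Uz"])               # update gate
    r = sigmoid(m @ p["Wr"] + h @ p["Ur"])               # reset gate
    h_tilde = np.tanh(m @ p["Wh"] + (r * h) @ p["Uh"])   # candidate state
    return (1.0 - z) * h + z * h_tilde

def ggnn_forward(h0, edges, A, gru, Wi, Wj, T):
    """GGNN expressed as an MPNN (illustrative sketch of Eq. (5.30)).

    h0:    (N, D) initial hidden states
    edges: list of (v, w, label) triples, w being a neighbor of v
    A:     dict mapping an edge label to a (D, D) matrix
    gru:   dict of (D, D) GRU matrices Wz, Uz, Wr, Ur, Wh, Uh
    Wi:    (2 * D, D_out) linear stand-in for network i
    Wj:    (D, D_out) linear stand-in for network j
    """
    h = np.array(h0, dtype=float)
    for _ in range(T):
        m = np.zeros_like(h)
        for v, w, label in edges:
            m[v] += A[label] @ h[w]              # M_t = A_{e_vw} h_w^t
        h = gru_cell(h, m, gru)                  # U_t = GRU(h_v^t, m_v^{t+1})
    gate = sigmoid(np.concatenate([h, h0], axis=1) @ Wi)
    return (gate * (h @ Wj)).sum(axis=0)         # R: gated sum over nodes
```

The readout acts as a soft attention over nodes: the sigmoid gate, which sees both the final and the initial state of a node, decides how much of that node's transformed state contributes to the graph‐level vector.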

Non‐local neural networks (NLNN) are proposed for capturing long‐range dependencies with deep neural networks by computing the response at a position as a weighted sum of the features at all positions (in space, time, or spacetime). The generic non‐local operation is defined as

\[ h'_i = \frac{1}{C(h)} \sum_{\forall j} f\left(h_i, h_j\right) g\left(h_j\right) \tag{5.31} \]

where i is the index of an output position, and j is the index that enumerates all possible positions. f(h_i, h_j) computes a scalar representing the relation between positions i and j, g(h_j) denotes a transformation of the input h_j, and the factor 1/C(h) normalizes the result.

There are several instantiations with different f and g settings. For simplicity, a linear transformation can be used as the function g, that is, g(h_j) = W_g h_j, where W_g is a learned weight matrix. The Gaussian function is a natural choice for the function f, giving f(h_i, h_j) = exp(h_i^T h_j), where h_i^T h_j is the dot‐product similarity and C(h) = ∑_∀j f(h_i, h_j). It is straightforward to extend the Gaussian function by computing similarity in an embedding space, giving f(h_i, h_j) = exp(θ(h_i)^T φ(h_j)) with θ(h_i) = W_θ h_i, φ(h_j) = W_φ h_j, and C(h) = ∑_∀j f(h_i, h_j). The function f can also be implemented as a dot‐product similarity, f(h_i, h_j) = θ(h_i)^T φ(h_j). Here, the factor C(h) = N, where N is the number of positions in h. Concatenation can also be used, defined as f(h_i, h_j) = ReLU(w_f^T [θ(h_i) ‖ φ(h_j)]), where w_f is a weight vector projecting the concatenated vector to a scalar and C(h) = N.
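The generic operation of Eq. (5.31) and the four choices of f discussed above can be sketched in a few lines of NumPy. The names `non_local`, `make_f`, `row_sum`, and `count` are hypothetical helpers introduced for this example, and the explicit double loop is written for clarity rather than speed.

```python
import numpy as np

def make_f(kind, W_theta=None, W_phi=None, w_f=None):
    """Return a pairwise function f(h_i, h_j) for the named instantiation."""
    theta = lambda x: x if W_theta is None else W_theta @ x
    phi = lambda x: x if W_phi is None else W_phi @ x
    if kind == "gaussian":
        return lambda hi, hj: np.exp(hi @ hj)
    if kind == "embedded_gaussian":
        return lambda hi, hj: np.exp(theta(hi) @ phi(hj))
    if kind == "dot":
        return lambda hi, hj: theta(hi) @ phi(hj)
    if kind == "concat":
        return lambda hi, hj: max(0.0, w_f @ np.concatenate([theta(hi), phi(hj)]))
    raise ValueError("unknown kind: " + kind)

# Normalization factors C(h), computed from the (N, N) matrix of f values
row_sum = lambda F: F.sum(axis=1, keepdims=True)   # Gaussian variants
count = lambda F: F.shape[1]                       # dot product, concatenation

def non_local(h, f, g, normalizer):
    """h'_i = (1 / C(h)) * sum_j f(h_i, h_j) g(h_j), for all i at once."""
    N = h.shape[0]
    F = np.array([[f(h[i], h[j]) for j in range(N)] for i in range(N)])
    G = np.stack([g(h[j]) for j in range(N)])      # (N, D_out)
    return (F @ G) / normalizer(F)
```

With the embedded Gaussian and the row‐sum normalizer, the weights f/C(h) form a softmax over j, so this instantiation coincides with dot‐product self‐attention over all positions.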

5.1.3 Graph Networks

The Graph Network (GN) framework [37] generalizes and extends various GNN, MPNN, and NLNN approaches. A graph is defined as a 3‐tuple G = (u, H, E) (H is used instead of V for notational consistency). u is a global attribute,
