
The parameters w affect four components of vec(A_{n,u}), that is, a_3, V_2, a_2, and V_1. By the properties of derivatives of matrix products and the chain rule,

      (5.86)   ∂vec(A_{n,u})/∂w = (∂vec(A_{n,u})/∂a_3) · (∂a_3/∂w) + (∂vec(A_{n,u})/∂vec(V_2)) · (∂vec(V_2)/∂w) + (∂vec(A_{n,u})/∂a_2) · (∂a_2/∂w) + (∂vec(A_{n,u})/∂vec(V_1)) · (∂vec(V_1)/∂w)

      holds. Thus, (vec(R_{u,v}))^T · ∂vec(A_{n,u})/∂w is the sum of four contributions. In order to derive a method of computing those terms, let I_a denote the a × a identity matrix, let ⊗ be the Kronecker product, and suppose that P_a is an a² × a matrix such that vec(diag(v)) = P_a · v for any vector v ∈ R^a. By the properties of the Kronecker product, vec(AB) = (B^T ⊗ I_a) · vec(A) holds for matrices A, B, and I_a having compatible dimensions [67]. Thus, we have

equation

      which implies

equation
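
      For concreteness, the following is a hedged sketch of the second of the four contributions, the one associated with V_2. It assumes a parametrization that is suggested by the parameter names but is not stated in this excerpt: the network generating A_{n,u} has one hidden layer, vec(A_{n,u}) = c · (V_2 · σ(V_1 y + a_2) + a_3) for some scalar factor c, with hidden activation h = σ(V_1 y + a_2) ∈ R^{d_h} and state dimension s, so that vec(A_{n,u}) ∈ R^{s²}. Under this assumption, the identity vec(AB) = (B^T ⊗ I_a) · vec(A) gives

% Hedged sketch; the parametrization vec(A_{n,u}) = c (V_2 h + a_3),
% h = sigma(V_1 y + a_2), is an assumption, not the book's definition.
\begin{align*}
  \operatorname{vec}(V_2 h) &= (h^\top \otimes I_{s^2})\,\operatorname{vec}(V_2),\\
  \frac{\partial\operatorname{vec}(A_{n,u})}{\partial\operatorname{vec}(V_2)}
    &= c\,(h^\top \otimes I_{s^2}),\\
  (\operatorname{vec}(R_{u,v}))^\top\,
  \frac{\partial\operatorname{vec}(A_{n,u})}{\partial\operatorname{vec}(V_2)}
    &= c\,(\operatorname{vec}(R_{u,v}))^\top (h^\top \otimes I_{s^2}),
\end{align*}

      which is a row vector of the same size as vec(V_2), as required for a gradient contribution.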

      Similarly, using the properties vec(ABC) = (C^T ⊗ A) · vec(B) and vec(AB) = (I_a ⊗ A) · vec(B), it follows that

equation

      where d_h is the number of hidden neurons. Then, we have

      (5.89) equation

      where the aforementioned Kronecker product properties have been used. A similar reasoning can also be applied to the third contribution.
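
      The vec/Kronecker identities invoked above are easy to check numerically. The following sketch (illustrative only; all variable names are ours) verifies vec(AB) = (B^T ⊗ I_a) · vec(A), vec(ABC) = (C^T ⊗ A) · vec(B), vec(AB) = (I ⊗ A) · vec(B), and the defining property vec(diag(v)) = P_a · v, using the column-major vec convention under which these identities hold:

import numpy as np

rng = np.random.default_rng(0)

def vec(M):
    # Column-major (Fortran-order) vectorization, the convention
    # under which the Kronecker identities below hold.
    return M.reshape(-1, order="F")

a, n, p = 3, 4, 5
A = rng.standard_normal((a, n))
B = rng.standard_normal((n, p))
C = rng.standard_normal((p, 2))

# vec(AB) = (B^T kron I_a) vec(A)
assert np.allclose(vec(A @ B), np.kron(B.T, np.eye(a)) @ vec(A))
# vec(ABC) = (C^T kron A) vec(B)
assert np.allclose(vec(A @ B @ C), np.kron(C.T, A) @ vec(B))
# vec(AB) = (I_p kron A) vec(B)
assert np.allclose(vec(A @ B), np.kron(np.eye(p), A) @ vec(B))

def make_P(a):
    # P_a is the a^2 x a matrix with vec(diag(v)) = P_a v: in the
    # column-major vectorization, the i-th diagonal entry of diag(v)
    # lands at position i*(a+1).
    P = np.zeros((a * a, a))
    for i in range(a):
        P[i * (a + 1), i] = 1.0
    return P

v = rng.standard_normal(a)
assert np.allclose(vec(np.diag(v)), make_P(a) @ v)
print("all identities verified")

      The explicit construction of P_a also shows why it is sparse: each column holds a single 1, at the position where the corresponding diagonal entry lands in the vectorization.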

      1. Instructions b = (∂e_w/∂o) · (∂G_w/∂x)(x, l_N) and c = (∂e_w/∂o) · (∂G_w/∂w)(x, l_N): The terms b and c can be calculated by backpropagating ∂e_w/∂o through the network that implements g_w. Since such an operation must be repeated for each node, the time complexity of instructions b and c is the same for all the GNN models.
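
      As an illustration in our notation (not the book's): since the output network g_w is applied nodewise, a single backward pass of ∂e_w/∂o_n through g_w yields both the input-gradient part, which is the row of b associated with node n, and the weight-gradient part, which is node n's contribution to c:

\begin{align*}
  b_n &= \frac{\partial e_w}{\partial o_n}\,\frac{\partial g_w}{\partial x_n}(x_n, l_n),
  &
  c &= \sum_{n \in N} \frac{\partial e_w}{\partial o_n}\,\frac{\partial g_w}{\partial w}(x_n, l_n).
\end{align*}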

      2. Instruction d = z(t) · (∂F_w/∂w)(x, l): By definition of F_w, f_w, and BP, we have

      (5.92) equation

      where y = [l_n, x_u, l_{(n,u)}, l_u] and BP_1 indicates that we are considering only the first part of the output of BP. Similarly,

      (5.93) equation

      where y = [l_n, x_u, l_{(n,u)}, l_u]. These two equations provide a direct method to compute d in positional and nonlinear GNNs, respectively.
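
      To make the role of BP and BP_1 concrete, here is a hedged sketch; the one-hidden-layer form of h_w and every name in it are our assumptions, not the book's code. A backward pass through a small network implementing h_w returns a pair: the gradient with respect to the weights, which plays the role of BP_1, and the gradient with respect to the input y. Accumulating the weight part over all arcs, with z_n(t) as the backpropagated vector, yields d as in the nonlinear case (5.93):

import numpy as np

# Hypothetical one-hidden-layer transition network
# h_w(y) = W2 @ tanh(W1 @ y + b1) + b2. Its BP method backpropagates a
# vector delta and returns (gradient w.r.t. the weights, gradient w.r.t.
# the input y); the first element plays the role of BP_1 in the text.
class TransitionNet:
    def __init__(self, dim_y, dim_h, dim_x, rng):
        self.W1 = rng.standard_normal((dim_h, dim_y)) * 0.1
        self.b1 = np.zeros(dim_h)
        self.W2 = rng.standard_normal((dim_x, dim_h)) * 0.1
        self.b2 = np.zeros(dim_x)

    def forward(self, y):
        self.y = y
        self.h = np.tanh(self.W1 @ y + self.b1)
        return self.W2 @ self.h + self.b2

    def BP(self, delta):
        # Backward pass of delta (here z_n(t)) through the network.
        gW2 = np.outer(delta, self.h)                   # d(delta^T out)/dW2
        gb2 = delta.copy()                              # d(delta^T out)/db2
        dz = (self.W2.T @ delta) * (1.0 - self.h ** 2)  # through tanh'
        gW1 = np.outer(dz, self.y)                      # d(delta^T out)/dW1
        gb1 = dz                                        # d(delta^T out)/db1
        gy = self.W1.T @ dz                             # gradient w.r.t. y
        return (gW1, gb1, gW2, gb2), gy                 # (BP_1, input grad)

rng = np.random.default_rng(1)
net = TransitionNet(dim_y=6, dim_h=8, dim_x=4, rng=rng)

# d = z(t) * (dF_w/dw): for each arc (n, u), evaluate h_w on
# y = [l_n, x_u, l_(n,u), l_u] and backpropagate z_n(t); then sum
# the weight-gradient parts over all arcs.
arcs = [(rng.standard_normal(6), rng.standard_normal(4)) for _ in range(5)]
d = None
for y, z_n in arcs:
    net.forward(y)
    grads_w, _ = net.BP(z_n)
    flat = np.concatenate([g.ravel() for g in grads_w])
    d = flat if d is None else d + flat
print("d has", d.size, "components")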

      For linear GNNs, let