
rel="nofollow" href="#ulink_dbf2a90e-1d8b-569a-8c0e-e64612c04be7">Table 1.5 shows its parameters.

| Layer name | Input size | Filter size | Window size | # Filters | Stride | Padding | Output size | # Feature maps | # Connections |
|---|---|---|---|---|---|---|---|---|---|
| Conv 1 | 224 × 224 | 7 × 7 | - | 96 | 2 | 0 | 110 × 110 | 96 | 14,208 |
| Max-pooling 1 | 110 × 110 | - | 3 × 3 | - | 2 | 0 | 55 × 55 | 96 | 0 |
| Conv 2 | 55 × 55 | 5 × 5 | - | 256 | 2 | 0 | 26 × 26 | 256 | 614,656 |
| Max-pooling 2 | 26 × 26 | - | 3 × 3 | - | 2 | 0 | 13 × 13 | 256 | 0 |
| Conv 3 | 13 × 13 | 3 × 3 | - | 384 | 1 | 1 | 13 × 13 | 384 | 885,120 |
| Conv 4 | 13 × 13 | 3 × 3 | - | 384 | 1 | 1 | 13 × 13 | 384 | 1,327,488 |
| Conv 5 | 13 × 13 | 3 × 3 | - | 256 | 1 | 1 | 13 × 13 | 256 | 884,992 |
| Max-pooling 3 | 13 × 13 | - | 3 × 3 | - | 2 | 0 | 6 × 6 | 256 | 0 |
| Fully connected 1 | - | - | - | - | - | - | 4,096 neurons | - | 37,752,832 |
| Fully connected 2 | - | - | - | - | - | - | 4,096 neurons | - | 16,781,312 |
| Fully connected 3 | - | - | - | - | - | - | 1,000 neurons | - | 4,097,000 |
| Softmax | - | - | - | - | - | - | 1,000 classes | - | 62,357,608 (total) |
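
The output sizes in Table 1.5 follow the standard convolution arithmetic, output = (input + 2 × padding − filter size) / stride + 1. The short Python sketch below is purely illustrative: it reproduces the listed sizes, assuming that fractional results are rounded up (Caffe-style rounding) rather than truncated.

import math

def out_size(in_size, kernel, stride, padding, ceil_mode=True):
    # Spatial output size of a convolution/pooling layer.
    # ceil_mode=True (round up, Caffe-style) reproduces the sizes in Table 1.5;
    # many frameworks round down by default.
    frac = (in_size + 2 * padding - kernel) / stride
    return (math.ceil(frac) if ceil_mode else math.floor(frac)) + 1

# Kernel, stride, and padding values are taken from Table 1.5.
print(out_size(224, 7, 2, 0))  # Conv 1:        224 -> 110
print(out_size(110, 3, 2, 0))  # Max-pooling 1: 110 -> 55
print(out_size(55, 5, 2, 0))   # Conv 2:         55 -> 26
print(out_size(26, 3, 2, 0))   # Max-pooling 2:  26 -> 13
print(out_size(13, 3, 1, 1))   # Conv 3-5:       13 -> 13
print(out_size(13, 3, 2, 0))   # Max-pooling 3:  13 -> 6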

      1.2.5 GoogLeNet

      Each image is resized so that the input to the network is a 224 × 224 × 3 image, and the mean is subtracted before the training image is fed to the network. The dataset contains 1,000 categories, with 1.2 million images for training, 100,000 for testing, and 50,000 for validation. GoogLeNet is 22 layers deep and uses nine inception modules, with global average pooling instead of fully connected layers to go from 7 × 7 × 1,024 to 1 × 1 × 1,024, which saves a huge number of parameters. It also includes several auxiliary softmax output units to enforce regularization. The network was trained on high-end GPUs within a week and achieved a top-5 error rate of 6.67%. GoogLeNet also trains faster than VGG.
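
To make the inception and global-average-pooling ideas concrete, the following PyTorch sketch shows a simplified inception module (parallel 1 × 1, 3 × 3, and 5 × 5 convolutions plus pooling, concatenated along the channel axis) followed by global average pooling from 7 × 7 × 1,024 down to 1 × 1 × 1,024. The class name and channel counts are illustrative assumptions, not GoogLeNet's exact configuration, and the auxiliary softmax heads are omitted.

import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    # Simplified inception block: four parallel branches whose outputs are
    # concatenated along the channel dimension. The 1 x 1 convolutions act as
    # bottlenecks that keep the parameter count low.
    def __init__(self, in_ch, c1, c3_red, c3, c5_red, c5, pool_proj):
        super().__init__()
        self.b1 = nn.Sequential(nn.Conv2d(in_ch, c1, 1), nn.ReLU(inplace=True))
        self.b2 = nn.Sequential(nn.Conv2d(in_ch, c3_red, 1), nn.ReLU(inplace=True),
                                nn.Conv2d(c3_red, c3, 3, padding=1), nn.ReLU(inplace=True))
        self.b3 = nn.Sequential(nn.Conv2d(in_ch, c5_red, 1), nn.ReLU(inplace=True),
                                nn.Conv2d(c5_red, c5, 5, padding=2), nn.ReLU(inplace=True))
        self.b4 = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                nn.Conv2d(in_ch, pool_proj, 1), nn.ReLU(inplace=True))

    def forward(self, x):
        # All branches preserve the spatial size, so their feature maps stack cleanly.
        return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1)

# Illustrative channel counts: the module maps 192 input channels to 64+128+32+32 = 256.
block = InceptionModule(192, 64, 96, 128, 16, 32, 32)
y = block(torch.randn(1, 192, 28, 28))             # -> (1, 256, 28, 28)

# Global average pooling replaces the fully connected stack:
# a 7 x 7 x 1,024 feature map is averaged down to 1 x 1 x 1,024,
# then a single linear layer produces the 1,000-way class scores.
features = torch.randn(1, 1024, 7, 7)               # output of the last inception stage
pooled = nn.AdaptiveAvgPool2d(1)(features)          # -> (1, 1024, 1, 1)
logits = nn.Linear(1024, 1000)(pooled.flatten(1))   # -> (1, 1000)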
