Transfer Learning Part 6.2: Implementing MobileNet in PyTorch

RAVI SHEKHAR TIWARI
Jun 5, 2022 · 46 min read
Figure 1. Transfer Learning

In Part 6.0 of the Transfer Learning series we discussed the MobileNet pre-trained model in depth, so in this part we will implement that model in PyTorch. This part is going to be a little long because we will implement the pre-trained MobileNet model in four different ways, which we discuss further in this article. For setting up the Colab notebook, it is advisable to go through the article of the Transfer Learning series mentioned below.

It is also advisable to go through the ResNet article, mentioned below, before reading this one:

1. Implementing the MobileNet Pre-trained Model

In this section we will see how we can implement the MobileNet model in PyTorch to build a foundation for the real implementation.

1.1. Image to predict

We will use the image of a coffee mug to predict its label with the MobileNet architecture. Below I have demonstrated the code to load and preprocess the image.

import torch                                          #Line 1
import torchvision.models as models                   #Line 2
from PIL import Image                                  #Line 3
import torchvision.transforms.functional as TF         #Line 4
from torchsummary import summary                       #Line 5
!pip install torchviz                                   #Line 6
from torchviz import make_dot                           #Line 7
import numpy as np

Line 1: The above snippet is used to import the PyTorch library, which we use to implement the MobileNet network.

Line 2: The above snippet is used to import the PyTorch pre-trained models.

Line 3: The above snippet is used to import the PIL library, which we use to load the image.

Line 4: The above snippet is used to import the PyTorch transformation utilities, which we use to transform the dataset for training and testing.

Line 5: The above snippet is used to import the library which shows the summary of models.

Line 6: The above snippet is used to install torchviz to visualise the network.

Line 7: The above snippet is used to import torchviz to visualize the network.

image = Image.open(link_of_image)   #Line 8
image = image.resize((224,224))     #Line 9
x = TF.to_tensor(image)             #Line 10
x.unsqueeze_(0)                     #Line 11
x = x.to(device)                    #Line 12
print(x.shape)                      #Line 13

Line 8: This snippet loads the image from the given path.

Line 9: This snippet resizes the image to (224, 224), the input size required by the model.

Line 10: This snippet converts the image into a PyTorch tensor.

Line 11: This snippet adds a batch dimension, converting the tensor from (channel, height, width) to (batch_size, channel, height, width), i.e. from (3, 224, 224) to (1, 3, 224, 224).

Line 12: This snippet moves the image tensor to the device on which the model is placed.

Line 13: This snippet displays the tensor shape, as shown below:

torch.Size([1, 3, 224, 224])
Figure 2. Image to be predicted
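One caveat worth noting: the torchvision pre-trained models were trained on ImageNet images normalized with a fixed per-channel mean and standard deviation, so predictions are usually more reliable if the input tensor is normalized as well. Below is a minimal sketch of such a preprocessing pipeline (an optional refinement, not part of the original walkthrough; image and device are assumed to be defined as in the surrounding snippets).

import torchvision.transforms as transforms

# ImageNet normalization statistics expected by torchvision's pre-trained weights
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

x = preprocess(image).unsqueeze(0).to(device)   # shape: (1, 3, 224, 224)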

1.2. Mobilenet Implementation

Here we will use the MobileNet network to predict the class of the coffee-mug image; the code is demonstrated below.

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')      #LINE 0
mobilenet_pretrained = models.mobilenet_v2(pretrained=True).to(device)     #LINE 1
summary(mobilenet_pretrained, (3, 224, 224))                               #LINE 2
mobilenet_pretrained                                                       #LINE 3

Line 0: This line checks which device (GPU or CPU) is available in our environment and saves it so we can utilize the resources better.

Line 1: This snippet creates an object for the MobileNet model including all its layers. Setting pretrained=True loads the default weights of the model trained on the ImageNet dataset, and the model is attached to the available device, i.e. GPU or CPU. The model accepts data in channel-first format, i.e. (channel, height, width), in this case (3, 224, 224).

Line 2: This snippet shows the summary of the network, as shown below:

----------------------------------------------------------------         Layer (type)               Output Shape         Param # ================================================================             Conv2d-1              [-1, 32, 112, 112]             864        BatchNorm2d-2         [-1, 32, 112, 112]              64              ReLU6-3               [-1, 32, 112, 112]               0             Conv2d-4              [-1, 32, 112, 112]             288        BatchNorm2d-5         [-1, 32, 112, 112]              64              ReLU6-6               [-1, 32, 112, 112]               0             Conv2d-7              [-1, 16, 112, 112]             512        BatchNorm2d-8         [-1, 16, 112, 112]              32   InvertedResidual-9    [-1, 16, 112, 112]               0            Conv2d-10             [-1, 96, 112, 112]           1,536       BatchNorm2d-11        [-1, 96, 112, 112]             192             ReLU6-12              [-1, 96, 112, 112]               0            Conv2d-13               [-1, 96, 56, 56]             864       BatchNorm2d-14          [-1, 96, 56, 56]             192             ReLU6-15                [-1, 96, 56, 56]               0            Conv2d-16               [-1, 24, 56, 56]           2,304       BatchNorm2d-17          [-1, 24, 56, 56]              48  InvertedResidual-18     [-1, 24, 56, 56]               0            Conv2d-19              [-1, 144, 56, 56]           3,456       BatchNorm2d-20         [-1, 144, 56, 56]             288             ReLU6-21               [-1, 144, 56, 56]               0            Conv2d-22              [-1, 144, 56, 56]           1,296       BatchNorm2d-23         [-1, 144, 56, 56]             288             ReLU6-24               [-1, 144, 56, 56]               0            Conv2d-25               [-1, 24, 56, 56]           3,456       BatchNorm2d-26          [-1, 24, 56, 56]              48  InvertedResidual-27     [-1, 24, 56, 56]               0            Conv2d-28              [-1, 144, 56, 56]           3,456       BatchNorm2d-29         [-1, 144, 56, 56]             288             ReLU6-30               [-1, 144, 56, 56]               0            Conv2d-31              [-1, 144, 28, 28]           1,296       BatchNorm2d-32         [-1, 144, 28, 28]             288             ReLU6-33               [-1, 144, 28, 28]               0            Conv2d-34               [-1, 32, 28, 28]           4,608       BatchNorm2d-35          [-1, 32, 28, 28]              64  InvertedResidual-36     [-1, 32, 28, 28]               0            Conv2d-37              [-1, 192, 28, 28]           6,144       BatchNorm2d-38         [-1, 192, 28, 28]             384             ReLU6-39               [-1, 192, 28, 28]               0            Conv2d-40              [-1, 192, 28, 28]           1,728       BatchNorm2d-41         [-1, 192, 28, 28]             384             ReLU6-42               [-1, 192, 28, 28]               0            Conv2d-43               [-1, 32, 28, 28]           6,144       BatchNorm2d-44          [-1, 32, 28, 28]              64  InvertedResidual-45     [-1, 32, 28, 28]               0            Conv2d-46              [-1, 192, 28, 28]           6,144       BatchNorm2d-47         [-1, 192, 28, 28]             384             ReLU6-48               [-1, 192, 28, 28]               0            Conv2d-49              [-1, 192, 28, 28]           1,728       BatchNorm2d-50         [-1, 192, 28, 28]             384             ReLU6-51               [-1, 192, 28, 28]               0        
    Conv2d-52               [-1, 32, 28, 28]           6,144       BatchNorm2d-53          [-1, 32, 28, 28]              64  InvertedResidual-54     [-1, 32, 28, 28]               0            Conv2d-55              [-1, 192, 28, 28]           6,144       BatchNorm2d-56         [-1, 192, 28, 28]             384             ReLU6-57               [-1, 192, 28, 28]               0            Conv2d-58              [-1, 192, 14, 14]           1,728       BatchNorm2d-59         [-1, 192, 14, 14]             384             ReLU6-60               [-1, 192, 14, 14]               0            Conv2d-61               [-1, 64, 14, 14]          12,288       BatchNorm2d-62          [-1, 64, 14, 14]             128  InvertedResidual-63     [-1, 64, 14, 14]               0            Conv2d-64              [-1, 384, 14, 14]          24,576       BatchNorm2d-65         [-1, 384, 14, 14]             768             ReLU6-66               [-1, 384, 14, 14]               0            Conv2d-67              [-1, 384, 14, 14]           3,456       BatchNorm2d-68         [-1, 384, 14, 14]             768             ReLU6-69               [-1, 384, 14, 14]               0            Conv2d-70               [-1, 64, 14, 14]          24,576       BatchNorm2d-71          [-1, 64, 14, 14]             128  InvertedResidual-72     [-1, 64, 14, 14]               0            Conv2d-73              [-1, 384, 14, 14]          24,576       BatchNorm2d-74         [-1, 384, 14, 14]             768             ReLU6-75               [-1, 384, 14, 14]               0            Conv2d-76              [-1, 384, 14, 14]           3,456       BatchNorm2d-77         [-1, 384, 14, 14]             768             ReLU6-78               [-1, 384, 14, 14]               0            Conv2d-79               [-1, 64, 14, 14]          24,576       BatchNorm2d-80          [-1, 64, 14, 14]             128  InvertedResidual-81     [-1, 64, 14, 14]               0            Conv2d-82              [-1, 384, 14, 14]          24,576       BatchNorm2d-83         [-1, 384, 14, 14]             768             ReLU6-84               [-1, 384, 14, 14]               0            Conv2d-85              [-1, 384, 14, 14]             768             ReLU6-87               [-1, 384, 14, 14]               0            Conv2d-88               [-1, 64, 14, 14]          24,576       BatchNorm2d-89          [-1, 64, 14, 14]             128  InvertedResidual-90     [-1, 64, 14, 14]               0            Conv2d-91              [-1, 384, 14, 14]          24,576       BatchNorm2d-92         [-1, 384, 14, 14]             768             ReLU6-93               [-1, 384, 14, 14]               0            Conv2d-94              [-1, 384, 14, 14]           3,456       BatchNorm2d-95         [-1, 384, 14, 14]             768             ReLU6-96               [-1, 384, 14, 14]               0            Conv2d-97               [-1, 96, 14, 14]          36,864       BatchNorm2d-98          [-1, 96, 14, 14]             192  InvertedResidual-99     [-1, 96, 14, 14]               0           Conv2d-100             [-1, 576, 14, 14]          55,296      BatchNorm2d-101        [-1, 576, 14, 14]           1,152            ReLU6-102              [-1, 576, 14, 14]               0           Conv2d-103             [-1, 576, 14, 14]           5,184      BatchNorm2d-104        [-1, 576, 14, 14]           1,152            ReLU6-105              [-1, 576, 14, 14]               0           Conv2d-106              [-1, 96, 14, 14]          55,296      BatchNorm2d-107         
[-1, 96, 14, 14]             192 InvertedResidual-108    [-1, 96, 14, 14]               0           Conv2d-109             [-1, 576, 14, 14]          55,296      BatchNorm2d-110        [-1, 576, 14, 14]           1,152            ReLU6-111              [-1, 576, 14, 14]               0           Conv2d-112             [-1, 576, 14, 14]           5,184      BatchNorm2d-113        [-1, 576, 14, 14]           1,152            ReLU6-114              [-1, 576, 14, 14]               0           Conv2d-115              [-1, 96, 14, 14]          55,296      BatchNorm2d-116         [-1, 96, 14, 14]             192 InvertedResidual-117    [-1, 96, 14, 14]               0           Conv2d-118             [-1, 576, 14, 14]          55,296      BatchNorm2d-119        [-1, 576, 14, 14]           1,152            ReLU6-120              [-1, 576, 14, 14]               0           Conv2d-121               [-1, 576, 7, 7]           5,184      BatchNorm2d-122          [-1, 576, 7, 7]           1,152            ReLU6-123                [-1, 576, 7, 7]               0           Conv2d-124               [-1, 160, 7, 7]          92,160      BatchNorm2d-125          [-1, 160, 7, 7]             320 InvertedResidual-126     [-1, 160, 7, 7]               0           Conv2d-127               [-1, 960, 7, 7]         153,600      BatchNorm2d-128          [-1, 960, 7, 7]           1,920            ReLU6-129                [-1, 960, 7, 7]               0           Conv2d-130               [-1, 960, 7, 7]           8,640      BatchNorm2d-131          [-1, 960, 7, 7]           1,920            ReLU6-132                [-1, 960, 7, 7]               0           Conv2d-133               [-1, 160, 7, 7]         153,600      BatchNorm2d-134          [-1, 160, 7, 7]             320 InvertedResidual-135     [-1, 160, 7, 7]               0           Conv2d-136               [-1, 960, 7, 7]         153,600      BatchNorm2d-137          [-1, 960, 7, 7]           1,920            ReLU6-138                [-1, 960, 7, 7]               0           Conv2d-139               [-1, 960, 7, 7]           8,640      BatchNorm2d-140          [-1, 960, 7, 7]           1,920            ReLU6-141                [-1, 960, 7, 7]               0           Conv2d-142               [-1, 160, 7, 7]         153,600      BatchNorm2d-143          [-1, 160, 7, 7]             320 InvertedResidual-144     [-1, 160, 7, 7]               0           Conv2d-145               [-1, 960, 7, 7]         153,600      BatchNorm2d-146          [-1, 960, 7, 7]           1,920            ReLU6-147                [-1, 960, 7, 7]               0           Conv2d-148               [-1, 960, 7, 7]           8,640      BatchNorm2d-149          [-1, 960, 7, 7]           1,920            ReLU6-150                [-1, 960, 7, 7]               0           Conv2d-151               [-1, 320, 7, 7]         307,200      BatchNorm2d-152          [-1, 320, 7, 7]             640 InvertedResidual-153     [-1, 320, 7, 7]               0           Conv2d-154              [-1, 1280, 7, 7]         409,600      BatchNorm2d-155         [-1, 1280, 7, 7]           2,560            ReLU6-156               [-1, 1280, 7, 7]               0          Dropout-157                   [-1, 1280]               0           Linear-158                    [-1, 1000]       1,281,000 ================================================================ Total params: 3,504,872 
Trainable params: 3,504,872
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 152.87
Params size (MB): 13.37
Estimated Total Size (MB): 166.81
----------------------------------------------------------------

Line 3: This line is used to display the full model, showing each layer with its type and parameters:

MobileNetV2(   (features): Sequential(     (0): ConvBNActivation(       (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)       (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       (2): ReLU6(inplace=True)     )     (1): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)           (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): Conv2d(32, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)         (2): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (2): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(16, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(96, 96, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=96, bias=False)           (1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (3): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(24, 144, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(144, 144, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=144, bias=False)           (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(144, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (4): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(24, 144, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(144, 144, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=144, bias=False)           (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(144, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (5): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(32, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192, bias=False)           (1): BatchNorm2d(192, 
eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(192, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (6): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(32, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192, bias=False)           (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(192, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (7): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(32, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(192, 192, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=192, bias=False)           (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(192, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (8): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384, bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(384, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (9): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384, bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(384, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (10): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(384, 
eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384, bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(384, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (11): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384, bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(384, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (12): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(576, 576, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=576, bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(576, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (13): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(576, 576, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=576, bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(576, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (14): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(576, 576, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=576, bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(576, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): 
BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (15): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(960, 960, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960, bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(960, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (16): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(960, 960, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960, bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(960, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (17): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(960, 960, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960, bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(960, 320, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (18): ConvBNActivation(       (0): Conv2d(320, 1280, kernel_size=(1, 1), stride=(1, 1), bias=False)       (1): BatchNorm2d(1280, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       (2): ReLU6(inplace=True)     )   )   (classifier): Sequential(     (0): Dropout(p=0.2, inplace=False)     (1): Linear(in_features=1280, out_features=1000, bias=True)

Now that we have loaded the model and set up the parameters, it is time to predict the image class, as demonstrated below.

mobilenet_prediction = mobilenet_pretrained(x)                             #Line 4
mobilenet_prediction_numpy = mobilenet_prediction.detach().cpu().numpy()   #Line 5
predicted_class_max = np.argmax(mobilenet_prediction_numpy)                #Line 6
predicted_class_max                                                        #Line 7

Line 4: This snippet sends the pre-processed image through the MobileNet network to get the prediction.

Line 5: This line detaches the prediction from the computation graph, moves it from the GPU to the CPU, and converts it from a torch tensor to a NumPy array so we can manipulate it.

Line 6: This snippet gets the index of the class with the highest score.

Line 7: This snippet displays the index of the highest-probability class:

504
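The value 504 is the index of the highest-scoring class among the 1,000 ImageNet classes. The raw outputs are unnormalized scores (logits); if you also want probabilities or the top few candidate classes, a small optional sketch building on the variables above could look like this:

probabilities = torch.nn.functional.softmax(mobilenet_prediction[0], dim=0)   # logits -> probabilities
top5_prob, top5_idx = torch.topk(probabilities, 5)                            # five most likely classes
print(top5_idx.tolist())
print(top5_prob.tolist())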

The below snippet reads the labels from a text file and displays the label name of the predicted class, as shown below:

lines = []
with open('/content/imagenet1000_clsidx_to_labels.txt', 'r') as fp:
    line_numbers = [predicted_class_max]
    for i, line in enumerate(fp):
        if i in line_numbers:
            lines.append(line.strip())
            break
print(lines)
Output>>>>
["504: 'coffee mug',"]

2.1. As a Feature Extraction Model

Since we discussed the MobileNet model in detail in our previous article, i.e. in Part 6.0 of the Transfer Learning series, we know the model has been trained on a huge dataset named ImageNet, which has 1,000 object classes. So we can use the pre-trained MobileNet to extract features from images and feed those features into another machine learning model for classification, self-supervised learning, or many other applications (a minimal sketch follows the list below). This gives us the following benefits:

  1. No need to train a very deep Deep Learning model.
  2. Easy to implement any algorithm even if the dataset is small.
  3. No need for high computing resources.
  4. Less development time as well as less deployment time.
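Here is that sketch of the feature-extraction workflow. The helper name extract_features and the batch variables are hypothetical, and scikit-learn's logistic regression merely stands in for "another machine learning model"; any lightweight classifier would do:

import torch
from sklearn.linear_model import LogisticRegression

def extract_features(model, images):
    # images: tensor of shape (N, 3, 224, 224), on the same device as the model
    model.eval()
    with torch.no_grad():
        feats = model.features(images)      # (N, 1280, 7, 7) for MobileNetV2
        feats = feats.mean(dim=[2, 3])      # global average pooling -> (N, 1280)
    return feats.cpu().numpy()

# hypothetical labelled batch: train_images (N, 3, 224, 224), train_labels (N,)
# features = extract_features(mobilenet_pretrained, train_images)
# clf = LogisticRegression(max_iter=1000).fit(features, train_labels)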

2.1.1 Image to extract feature

We will use the image of the coffee mug to extract features with the MobileNet architecture. Below I have demonstrated the code to load and preprocess the image.

import torch                                          #Line 1
import torchvision.models as models                   #Line 2
from PIL import Image                                  #Line 3
import torchvision.transforms.functional as TF         #Line 4
from torchsummary import summary                       #Line 5
!pip install torchviz                                   #Line 6
from torchviz import make_dot                           #Line 7
import numpy as np

Line 1: The above snippet is used to import the PyTorch library, which we use to implement the MobileNet network.

Line 2: The above snippet is used to import the PyTorch pre-trained models.

Line 3: The above snippet is used to import the PIL library, which we use to load the image.

Line 4: The above snippet is used to import the PyTorch transformation utilities, which we use to transform the dataset for training and testing.

Line 5: The above snippet is used to import the library which shows the summary of models.

Line 6: The above snippet is used to install torchviz to visualise the network.

Line 7: The above snippet is used to import torchviz to visualize the network.

image = Image.open(link_of_image)   #Line 8
image = image.resize((224,224))     #Line 9
x = TF.to_tensor(image)             #Line 10
x.unsqueeze_(0)                     #Line 11
x = x.to(device)                    #Line 12
print(x.shape)                      #Line 13

Line 8: This snippet loads the image from the given path.

Line 9: This snippet resizes the image to (224, 224), the input size required by the model.

Line 10: This snippet converts the image into a PyTorch tensor.

Line 11: This snippet adds a batch dimension, converting the tensor from (channel, height, width) to (batch_size, channel, height, width), i.e. from (3, 224, 224) to (1, 3, 224, 224).

Line 12: This snippet moves the image tensor to the device on which the model is placed.

Line 13: This snippet displays the tensor shape, as shown below:

torch.Size([1, 3, 224, 224])

2.1.2 MobileNet Implementation as Feature Extraction (code)

Here we will use the MobileNet network to extract features from the coffee-mug image; the code is demonstrated below.

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')      #LINE 0
mobilenet_pretrained = models.mobilenet_v2(pretrained=True).to(device)     #LINE 1
mobilenet_pretrained.features                                              #LINE 2
summary(mobilenet_pretrained, (3, 224, 224))                               #LINE 3

Line 0: This line checks which device (GPU or CPU) is available in our environment and saves it so we can utilize the resources better.

Line 1: This snippet creates an object for the MobileNet model including all its layers. Setting pretrained=True loads the default weights of the model trained on the ImageNet dataset, and the model is attached to the available device, i.e. GPU or CPU. The model accepts data in channel-first format, i.e. (channel, height, width), in this case (3, 224, 224).

Line 2: This snippet displays the feature-extractor part of the network (the features block), as shown below:

MobileNetV2(   (features): Sequential(     (0): ConvBNActivation(       (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)       (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       (2): ReLU6(inplace=True)     )     (1): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)           (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): Conv2d(32, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)         (2): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (2): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(16, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(96, 96, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=96, bias=False)           (1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (3): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(24, 144, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(144, 144, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=144, bias=False)           (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(144, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (4): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(24, 144, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(144, 144, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=144, bias=False)           (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(144, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (5): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(32, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192, bias=False)           (1): BatchNorm2d(192, 
eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(192, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (6): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(32, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192, bias=False)           (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(192, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (7): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(32, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(192, 192, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=192, bias=False)           (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(192, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (8): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384, bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(384, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (9): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384, bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(384, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (10): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(384, 
eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384, bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(384, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (11): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384, bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(384, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (12): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(576, 576, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=576, bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(576, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (13): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(576, 576, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=576, bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(576, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (14): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(576, 576, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=576, bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(576, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): 
BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (15): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(960, 960, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960, bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(960, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (16): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(960, 960, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960, bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(960, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (17): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(960, 960, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960, bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(960, 320, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (18): ConvBNActivation(       (0): Conv2d(320, 1280, kernel_size=(1, 1), stride=(1, 1), bias=False)       (1): BatchNorm2d(1280, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       (2): ReLU6(inplace=True)     )   )   (classifier): Sequential(     (0): Dropout(p=0.2, inplace=False)     (1): Linear(in_features=1280, out_features=1000, bias=True)

Line 3: This line shows the layer-by-layer summary of the network with output shapes and parameter counts, as shown below:

----------------------------------------------------------------         Layer (type)               Output Shape         Param # ================================================================             Conv2d-1              [-1, 32, 112, 112]             864        BatchNorm2d-2         [-1, 32, 112, 112]              64              ReLU6-3               [-1, 32, 112, 112]               0             Conv2d-4              [-1, 32, 112, 112]             288        BatchNorm2d-5         [-1, 32, 112, 112]              64              ReLU6-6               [-1, 32, 112, 112]               0             Conv2d-7              [-1, 16, 112, 112]             512        BatchNorm2d-8         [-1, 16, 112, 112]              32   InvertedResidual-9    [-1, 16, 112, 112]               0            Conv2d-10             [-1, 96, 112, 112]           1,536       BatchNorm2d-11        [-1, 96, 112, 112]             192             ReLU6-12              [-1, 96, 112, 112]               0            Conv2d-13               [-1, 96, 56, 56]             864       BatchNorm2d-14          [-1, 96, 56, 56]             192             ReLU6-15                [-1, 96, 56, 56]               0            Conv2d-16               [-1, 24, 56, 56]           2,304       BatchNorm2d-17          [-1, 24, 56, 56]              48  InvertedResidual-18     [-1, 24, 56, 56]               0            Conv2d-19              [-1, 144, 56, 56]           3,456       BatchNorm2d-20         [-1, 144, 56, 56]             288             ReLU6-21               [-1, 144, 56, 56]               0            Conv2d-22              [-1, 144, 56, 56]           1,296       BatchNorm2d-23         [-1, 144, 56, 56]             288             ReLU6-24               [-1, 144, 56, 56]               0            Conv2d-25               [-1, 24, 56, 56]           3,456       BatchNorm2d-26          [-1, 24, 56, 56]              48  InvertedResidual-27     [-1, 24, 56, 56]               0            Conv2d-28              [-1, 144, 56, 56]           3,456       BatchNorm2d-29         [-1, 144, 56, 56]             288             ReLU6-30               [-1, 144, 56, 56]               0            Conv2d-31              [-1, 144, 28, 28]           1,296       BatchNorm2d-32         [-1, 144, 28, 28]             288             ReLU6-33               [-1, 144, 28, 28]               0            Conv2d-34               [-1, 32, 28, 28]           4,608       BatchNorm2d-35          [-1, 32, 28, 28]              64  InvertedResidual-36     [-1, 32, 28, 28]               0            Conv2d-37              [-1, 192, 28, 28]           6,144       BatchNorm2d-38         [-1, 192, 28, 28]             384             ReLU6-39               [-1, 192, 28, 28]               0            Conv2d-40              [-1, 192, 28, 28]           1,728       BatchNorm2d-41         [-1, 192, 28, 28]             384             ReLU6-42               [-1, 192, 28, 28]               0            Conv2d-43               [-1, 32, 28, 28]           6,144       BatchNorm2d-44          [-1, 32, 28, 28]              64  InvertedResidual-45     [-1, 32, 28, 28]               0            Conv2d-46              [-1, 192, 28, 28]           6,144       BatchNorm2d-47         [-1, 192, 28, 28]             384             ReLU6-48               [-1, 192, 28, 28]               0            Conv2d-49              [-1, 192, 28, 28]           1,728       BatchNorm2d-50         [-1, 192, 28, 28]             384             ReLU6-51               [-1, 192, 28, 28]               0        
        ... (per-layer rows for the remaining InvertedResidual blocks truncated for brevity) ...
           Conv2d-154             [-1, 1280, 7, 7]         409,600
      BatchNorm2d-155             [-1, 1280, 7, 7]           2,560
            ReLU6-156             [-1, 1280, 7, 7]               0
          Dropout-157                   [-1, 1280]               0
           Linear-158                   [-1, 1000]       1,281,000
================================================================
Total params: 3,504,872
Trainable params: 3,504,872
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 152.87
Params size (MB): 13.37
Estimated Total Size (MB): 166.81
----------------------------------------------------------------

Now, after loading the model and setting up its parameters, it is time to run the image through the network, as demonstrated below.

mobilenet_prediction = mobilenet_pretrained.features(x)                    #Line 4
mobilenet_prediction_numpy = mobilenet_prediction.detach().cpu().numpy()  #Line 5

Line 4: This snippet feeds the image to the feature-extractor layers of the Mobilenet network.

Line 5: This snippet detaches the output from the computation graph and moves it from the GPU to the CPU as a NumPy array.

This gives us the features extracted from the image. For a (1, 3, 224, 224) input, `features(x)` returns a feature map of shape (1, 1280, 7, 7); passing the image through the full model instead returns class scores of shape (1, 1000). Over the whole CIFAR-10 training set the class-score output would therefore be of size (50000, 1000), and over the test set (10000, 1000).
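To turn these raw scores into an actual label prediction, a minimal sketch (using only the x and mobilenet_pretrained objects defined above; mapping the resulting indices to human-readable names would additionally require the ImageNet labels file) could look like this:

with torch.no_grad():                                    # no gradients needed for inference
    logits = mobilenet_pretrained(x)                     # shape: (1, 1000)
    probs = torch.nn.functional.softmax(logits, dim=1)   # convert scores to probabilities
    top5_prob, top5_idx = probs.topk(5, dim=1)           # five most likely ImageNet classes
print(top5_idx)    # look these indices up in the ImageNet class list
print(top5_prob)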

2.2 Using Mobilenet Architecture (without weights)

In this section we will see how we can implement Mobilenet as an architecture in PyTorch. We will use the state-of-the-art Mobilenet network architecture and train it on our dataset from scratch, i.e. we will not use pre-trained weights; the weights will be optimised from scratch during training. The code is explained below:

2.2.1 Datasets

For training we will use the CIFAR-10 dataset, composed of 60K images: 50K for training and 10K for testing/evaluation.

import os
import torch
import torchvision
import torchvision.models as models
import tarfile
from torchvision.datasets.utils import download_url
from torch.utils.data import random_split
from skimage import io, transform
import torchvision.transforms as transforms
from torchvision.datasets import ImageFolder
from torchvision.transforms import ToTensor, Resize
import matplotlib
import matplotlib.pyplot as plt
from torchvision.utils import make_grid
from torch.utils.data.dataloader import DataLoader
%matplotlib inline
matplotlib.rcParams['figure.facecolor'] = '#ffffff'

The above snippet imports the libraries which we will need for the PyTorch implementation.

dataset_url = "https://s3.amazonaws.com/fast-ai-imageclas/cifar10.tgz"
download_url(dataset_url, '.')
with tarfile.open('./cifar10.tgz', 'r:gz') as tar:
    tar.extractall(path='./data')

The above snippet downloads the dataset from the AWS server into our environment and extracts the downloaded archive into the folder named data.
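As a quick sanity check (a sketch; the fast.ai archive is assumed to extract into a cifar10 folder with train and test sub-folders), one can list the extracted directories:

print(os.listdir('./data'))                     # expected: ['cifar10']
print(os.listdir('./data/cifar10'))             # expected: 'train' and 'test' (plus a labels file)
print(os.listdir('./data/cifar10/train')[:5])   # a few of the per-class sub-folders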

data_dir = './data/cifar10'   # folder created by the extraction step above
transform = transforms.Compose([Resize((224, 224)), ToTensor()])
dataset = ImageFolder(data_dir + '/train', transform=transform)
print(dataset.classes)

The above snippet builds a PyTorch dataset, resizing each image to (224, 224), and displays the class names as below:

['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

The lines below split the dataset into two sets, i.e. a training set and a validation set, and wrap them in data loaders.

val_size = 5000
train_size = len(dataset) - val_size
train_ds, val_ds = random_split(dataset, [train_size, val_size])
len(train_ds), len(val_ds)

batch_size = 32
train_dl = DataLoader(train_ds, batch_size, shuffle=True)
val_dl = DataLoader(val_ds, batch_size * 2)
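Note that the data loaders yield CPU tensors; if a GPU is available, each batch has to be moved to the same device as the model before the forward pass. A minimal sketch (re-using the device object created in section 1.2):

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
for images, labels in train_dl:
    images = images.to(device)   # move the batch of images to the GPU/CPU
    labels = labels.to(device)   # move the labels to the same device
    # ... forward pass, loss, backward pass ...
    break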

The lines below plot a sample batch from the dataset, as shown below:

def show_batch(dl):
    for images, labels in dl:
        fig, ax = plt.subplots(figsize=(12, 6))
        ax.set_xticks([]); ax.set_yticks([])
        ax.imshow(make_grid(images, nrow=16).permute(1, 2, 0))
        break

show_batch(train_dl)
Figure 2. Sample CIFAR-10 dataset

If you want more insight into the visualization library, please follow the article series mentioned below:

2.2.2 Mobilenet Architecture (code)

In this section we will see how we can implement Mobilenet as an architecture in PyTorch.

mobilenet_v2_pretrained = models.mobilenet_v2()   # no pre-trained weights are loaded here
mobilenet_v2_pretrained.classifier[1] = torch.nn.Linear(
    mobilenet_v2_pretrained.classifier[1].in_features, 10)   # 10 CIFAR-10 classes
for param in mobilenet_v2_pretrained.parameters():
    param.requires_grad = True                    # every layer is trained from scratch

The above snippet instantiates the Mobilenet model. Since we are using Mobilenet as an architecture with our custom dataset, we replace the final classification layer with our own dense layer so that we can classify the objects in our dataset. The new layer has 10 neurons, one per CIFAR-10 class; a Softmax over its outputs gives the class probabilities (here we keep raw logits, since the cross-entropy loss used during training applies the softmax internally). Note that torchvision's Mobilenet exposes this layer as classifier[1]; there is no fc attribute. A truncated view of the resulting architecture is shown below:

MobileNetV2(
  (features): Sequential(
    (0): ConvBNActivation(
      (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU6(inplace=True)
    )
    ... (17 InvertedResidual blocks and a final ConvBNActivation, identical to the listing in section 2.3.2) ...
  )
  (classifier): Sequential(
    (0): Dropout(p=0.2, inplace=False)
    (1): Linear(in_features=1280, out_features=10, bias=True)
  )
)
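Before training, it can be useful to double-check the new head and count the parameters that will be optimised. A small sketch:

print(mobilenet_v2_pretrained.classifier)   # should end in Linear(in_features=1280, out_features=10)
num_trainable = sum(p.numel() for p in mobilenet_v2_pretrained.parameters() if p.requires_grad)
print(f'Trainable parameters: {num_trainable:,}')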

Now, after creating the model, we have to check that it produces output of the correct shape, which can be done with the code below:

for images, labels in train_dl:
    print('images.shape:', images.shape)
    out = mobilenet_v2_pretrained(images)
    print('out.shape:', out.shape)
    print('out[0]:', out[0])
    break

The output of the above code is:

images.shape: torch.Size([32, 3, 224, 224])
out.shape: torch.Size([32, 10])
out[0]: tensor([ 0.0603, -0.5612, 0.3957, -0.0069, -0.1256, -0.6125, 0.3528, -0.5256,
-0.0646, -0.1953], grad_fn=<SelectBackward>)

Finally, we train the model with the following code snippet, using a batch size of 32:

NUM_EPOCHS = 3
best_accuracy = 0.0
import torch.optim as optim
import torch.nn.functional as F

optimizer = optim.SGD(mobilenet_v2_pretrained.parameters(), lr=0.001, momentum=0.9)
for epoch in range(NUM_EPOCHS):
    a = 0
    for images, labels in iter(train_dl):
        print(epoch, a)
        a = a + 1
        optimizer.zero_grad()
        outputs = mobilenet_v2_pretrained(images)
        loss = F.cross_entropy(outputs, labels)
        loss.backward()
        optimizer.step()
    test_error_count = 0.0
    for images, labels in iter(val_dl):
        outputs = mobilenet_v2_pretrained(images)
        test_error_count += float(torch.sum(labels != outputs.argmax(1)))   # count misclassified samples
    test_accuracy = 1.0 - float(test_error_count) / float(len(val_ds))
    print('%d: %f' % (epoch, test_accuracy))

Now that we have trained our model, it is time for prediction; for this we disable gradient tracking with torch.no_grad(), as shown below:

with torch.no_grad():
    images, labels = next(iter(val_dl))          # take one batch from the validation loader
    prediction = mobilenet_v2_pretrained(images)
    predicted_class = prediction.argmax(dim=1)   # predicted class index for each image
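To evaluate on the whole validation set rather than a single batch, one could loop over the loader in the same no_grad context; a short sketch:

correct, total = 0, 0
with torch.no_grad():
    for images, labels in val_dl:
        preds = mobilenet_v2_pretrained(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.size(0)
print('validation accuracy:', correct / total)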

Finally, we have trained the Mobilenet architecture from scratch on our custom dataset.

2.3. Fine-Tuning Mobilenet Architecture with Custom Fully Connected Layers

In this section we will see how we can fine-tune Mobilenet in PyTorch. We will use the state-of-the-art Mobilenet network architecture with its pre-trained weights: the pre-trained layers are kept frozen, and only the newly added fully connected layers are optimised during training. The code is explained below:

2.3.1 Datasets

For fine-tuning we will use the CIFAR-10 dataset, composed of 60K images: 50K for training and 10K for testing/evaluation.

import os
import torch
import torchvision
import torchvision.models as models
import tarfile
from torchvision.datasets.utils import download_url
from torch.utils.data import random_split
from skimage import io, transform
import torchvision.transforms as transforms
from torchvision.datasets import ImageFolder
from torchvision.transforms import ToTensor, Resize
import matplotlib
import matplotlib.pyplot as plt
from torchvision.utils import make_grid
from torch.utils.data.dataloader import DataLoader
%matplotlib inline
matplotlib.rcParams['figure.facecolor'] = '#ffffff'

The above snippet imports the libraries which we will need for the PyTorch implementation.

dataset_url = "https://s3.amazonaws.com/fast-ai-imageclas/cifar10.tgz"
download_url(dataset_url, '.')
with tarfile.open('./cifar10.tgz', 'r:gz') as tar:
    tar.extractall(path='./data')

The above snippet downloads the dataset from the AWS server into our environment and extracts the downloaded archive into the folder named data.

data_dir = './data/cifar10'   # folder created by the extraction step above
transform = transforms.Compose([Resize((224, 224)), ToTensor()])
dataset = ImageFolder(data_dir + '/train', transform=transform)
print(dataset.classes)

The above snippet builds a PyTorch dataset, resizing each image to (224, 224), and displays the class names as below:

['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

The lines below split the dataset into a training set and a validation set and wrap them in data loaders.

val_size = 5000
train_size = len(dataset) - val_size
train_ds, val_ds = random_split(dataset, [train_size, val_size])
len(train_ds), len(val_ds)

batch_size = 32
train_dl = DataLoader(train_ds, batch_size, shuffle=True)
val_dl = DataLoader(val_ds, batch_size * 2)

The lines below plot a sample batch from the dataset, as shown below:

def show_batch(dl):
    for images, labels in dl:
        fig, ax = plt.subplots(figsize=(12, 6))
        ax.set_xticks([]); ax.set_yticks([])
        ax.imshow(make_grid(images, nrow=16).permute(1, 2, 0))
        break

show_batch(train_dl)
Figure 2. Sample CIFAR-10 dataset

If you want more insight into the visualisation library, please follow the article series mentioned below:

2.3.2 Mobilenet Fully Connected Layer Optimisation (code)

In this section we will see how we can fine-tune the pre-trained Mobilenet in PyTorch.

from collections import OrderedDict

mobilenet_v2_pretrained = models.mobilenet_v2(pretrained=True)
for param in mobilenet_v2_pretrained.parameters():
    param.requires_grad = False                   # freeze the pre-trained layers
mobilenet_v2_pretrained.classifier = torch.nn.Sequential(OrderedDict([
    ('fc1', torch.nn.Linear(mobilenet_v2_pretrained.classifier[1].in_features, 10)),
    ('activation1', torch.nn.Softmax(dim=1))]))   # note: F.cross_entropy below already applies a softmax internally
for param in mobilenet_v2_pretrained.classifier.parameters():
    param.requires_grad = True                    # only the new head is optimised

The above snippet instantiates the pre-trained Mobilenet model. Since we are using Mobilenet with our custom dataset, we add our own classification head so that we can classify the objects in our dataset. The pre-trained weights are frozen with param.requires_grad = False, so the loss is not propagated back to those layers, whereas in the new fully connected layer param.requires_grad = True allows the loss to propagate back only through this layer. The head has 10 neurons with a Softmax activation function, which lets us predict the probability of each class. The architecture is shown below:

MobileNetV2(   (features): Sequential(     (0): ConvBNActivation(       (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)       (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       (2): ReLU6(inplace=True)     )     (1): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)           (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): Conv2d(32, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)         (2): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (2): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(16, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(96, 96, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=96, bias=False)           (1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (3): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(24, 144, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(144, 144, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=144, bias=False)           (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(144, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (4): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(24, 144, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(144, 144, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=144, bias=False)           (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(144, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (5): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(32, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192, bias=False)           (1): BatchNorm2d(192, 
eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(192, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (6): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(32, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192, bias=False)           (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(192, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (7): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(32, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(192, 192, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=192, bias=False)           (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(192, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (8): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384, bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(384, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (9): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384, bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(384, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (10): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(384, 
eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384, bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(384, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (11): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384, bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(384, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (12): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(576, 576, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=576, bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(576, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (13): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(576, 576, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=576, bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(576, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (14): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(576, 576, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=576, bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(576, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): 
BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (15): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(960, 960, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960, bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(960, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (16): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(960, 960, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960, bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(960, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (17): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(960, 960, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960, bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(960, 320, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (18): ConvBNActivation(       (0): Conv2d(320, 1280, kernel_size=(1, 1), stride=(1, 1), bias=False)       (1): BatchNorm2d(1280, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       (2): ReLU6(inplace=True)     )   )   (classifier): Sequential(     (0): Dropout(p=0.2, inplace=False)     (1): Linear(in_features=1280, out_features=10, bias=True)   ) )
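To verify that the freezing worked as intended, one can count how many parameters remain trainable (a quick sketch):

trainable = sum(p.numel() for p in mobilenet_v2_pretrained.parameters() if p.requires_grad)
frozen = sum(p.numel() for p in mobilenet_v2_pretrained.parameters() if not p.requires_grad)
print(f'trainable: {trainable:,}   frozen: {frozen:,}')   # only the new head should be trainable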

Now, after creating the model, we have to check that it produces output of the correct shape, which can be done with the code below:

for images, labels in train_dl:
    print('images.shape:', images.shape)
    out = mobilenet_v2_pretrained(images)
    print('out.shape:', out.shape)
    print('out[0]:', out[0])
    break

The output of the above code is:

images.shape: torch.Size([32, 3, 224, 224])
out.shape: torch.Size([32, 10])
out[0]: tensor([ 0.0603, -0.5612, 0.3957, -0.0069, -0.1256, -0.6125, 0.3528, -0.5256,
-0.0646, -0.1953], grad_fn=<SelectBackward>)

Finally, we train the model with the following code snippet, using a batch size of 32:

NUM_EPOCHS = 3
best_accuracy = 0.0
import torch.optim as optim
import torch.nn.functional as F

optimizer = optim.SGD(mobilenet_v2_pretrained.parameters(), lr=0.001, momentum=0.9)
for epoch in range(NUM_EPOCHS):
    a = 0
    for images, labels in iter(train_dl):
        print(epoch, a)
        a = a + 1
        optimizer.zero_grad()
        outputs = mobilenet_v2_pretrained(images)
        loss = F.cross_entropy(outputs, labels)
        loss.backward()
        optimizer.step()
    test_error_count = 0.0
    for images, labels in iter(val_dl):
        outputs = mobilenet_v2_pretrained(images)
        test_error_count += float(torch.sum(labels != outputs.argmax(1)))   # count misclassified samples
    test_accuracy = 1.0 - float(test_error_count) / float(len(val_ds))
    print('%d: %f' % (epoch, test_accuracy))
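Since only the new head has requires_grad = True, one might also restrict the optimiser to those parameters; a small sketch of this variant:

optimizer = optim.SGD(
    filter(lambda p: p.requires_grad, mobilenet_v2_pretrained.parameters()),
    lr=0.001, momentum=0.9)   # the optimiser only sees the trainable head parameters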

Now that we have trained our model, it is time for prediction; for this we disable gradient tracking with torch.no_grad(), as shown below:

with torch.no_grad():
    images, labels = next(iter(val_dl))          # take one batch from the validation loader
    prediction = mobilenet_v2_pretrained(images)
    predicted_class = prediction.argmax(dim=1)   # predicted class index for each image

Finally, we have fine-tuned the Mobilenet architecture on our custom dataset.

2.4. Mobilenet weights as a neural network weight initializer

In this section we will see how we can use Mobilenet's pre-trained weights as a weight initializer in PyTorch. We will use the state-of-the-art Mobilenet network architecture, load the pre-trained weights as the starting point, and then optimise all layers (the pre-trained ones as well as the new fully connected layer) while training on our dataset. The code is explained below, and a sketch of how this differs from the previous approaches follows this paragraph.
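The only difference from the previous two approaches is how the weights start out and which of them are updated. A minimal sketch of the three starting points (the variable names are illustrative only):

model_scratch = models.mobilenet_v2()                    # section 2.2: random initialisation
model_finetune = models.mobilenet_v2(pretrained=True)    # section 2.3: pre-trained, features frozen
model_initialised = models.mobilenet_v2(pretrained=True) # section 2.4: pre-trained weights used only as
                                                         # starting values; every layer stays trainable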

2.4.1 Datasets

For this experiment we will use the CIFAR-10 dataset, composed of 60K images: 50K for training and 10K for testing/evaluation.

import os
import torch
import torchvision
import torchvision.models as models
import tarfile
from torchvision.datasets.utils import download_url
from torch.utils.data import random_split
from skimage import io, transform
import torchvision.transforms as transforms
from torchvision.datasets import ImageFolder
from torchvision.transforms import ToTensor, Resize
import matplotlib
import matplotlib.pyplot as plt
from torchvision.utils import make_grid
from torch.utils.data.dataloader import DataLoader
%matplotlib inline
matplotlib.rcParams['figure.facecolor'] = '#ffffff'

The above snippet imports the libraries which we will need for the PyTorch implementation.

dataset_url = "https://s3.amazonaws.com/fast-ai-imageclas/cifar10.tgz"
download_url(dataset_url, '.')
with tarfile.open('./cifar10.tgz', 'r:gz') as tar:
    tar.extractall(path='./data')

The above snippet downloads the dataset from the AWS server into our environment and extracts the downloaded archive into the folder named data.

data_dir = './data/cifar10'   # folder created by the extraction step above
transform = transforms.Compose([Resize((224, 224)), ToTensor()])
dataset = ImageFolder(data_dir + '/train', transform=transform)
print(dataset.classes)

The above snippet builds a PyTorch dataset, resizing each image to (224, 224), and displays the class names as below:

['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

The lines below split the dataset into a training set and a validation set and wrap them in data loaders.

val_size = 5000
train_size = len(dataset) - val_size
train_ds, val_ds = random_split(dataset, [train_size, val_size])
len(train_ds), len(val_ds)

batch_size = 32
train_dl = DataLoader(train_ds, batch_size, shuffle=True)
val_dl = DataLoader(val_ds, batch_size * 2)

The lines below plot a sample batch from the dataset, as shown below:

def show_batch(dl):
    for images, labels in dl:
        fig, ax = plt.subplots(figsize=(12, 6))
        ax.set_xticks([]); ax.set_yticks([])
        ax.imshow(make_grid(images, nrow=16).permute(1, 2, 0))
        break

show_batch(train_dl)
Figure 2. Sample CIFAR-10 dataset

If you want more insight into the visualisation library, please follow the article series mentioned below:

2.4.2 Mobilenet weights as an initialiser (code)

In this section we will see how we can use the Mobilenet pre-trained weights as an initialiser in PyTorch.

from collections import OrderedDict

pretrained = models.mobilenet_v2(pretrained=True)   # ImageNet weights serve as the starting point
for param in pretrained.parameters():
    param.requires_grad = True                      # every layer stays trainable
pretrained.classifier = torch.nn.Sequential(OrderedDict([
    ('fc1', torch.nn.Linear(pretrained.classifier[1].in_features, 10)),
    ('activation1', torch.nn.Softmax(dim=1))]))
pretrained

The above snippet instantiates the Mobilenet model with its pre-trained weights, which here serve only as starting values. Since we are using Mobilenet with our custom dataset, we add our own classification head so that we can classify the objects in our dataset. Unlike the previous section, param.requires_grad = True is kept for all layers, so the loss is propagated back through the whole network and the pre-trained weights simply act as an initialiser. The head has 10 neurons with a Softmax activation function, which lets us predict the probability of each class. The architecture is shown below:

MobileNetV2(   (features): Sequential(     (0): ConvBNActivation(       (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)       (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       (2): ReLU6(inplace=True)     )     (1): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)           (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): Conv2d(32, 16, kernel_size=(1, 1), stride=(1, 1), bias=False)         (2): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (2): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(16, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(96, 96, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=96, bias=False)           (1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (3): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(24, 144, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(144, 144, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=144, bias=False)           (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(144, 24, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(24, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (4): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(24, 144, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(144, 144, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=144, bias=False)           (1): BatchNorm2d(144, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(144, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (5): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(32, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192, bias=False)           (1): BatchNorm2d(192, 
eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(192, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (6): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(32, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192, bias=False)           (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(192, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (7): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(32, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(192, 192, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=192, bias=False)           (1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(192, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (8): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384, bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(384, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (9): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384, bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(384, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (10): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(384, 
eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384, bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(384, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (11): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384, bias=False)           (1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(384, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (12): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(576, 576, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=576, bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(576, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (13): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(576, 576, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=576, bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(576, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (14): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(576, 576, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=576, bias=False)           (1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(576, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): 
BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (15): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(960, 960, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960, bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(960, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (16): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(960, 960, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960, bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(960, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (17): InvertedResidual(       (conv): Sequential(         (0): ConvBNActivation(           (0): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (1): ConvBNActivation(           (0): Conv2d(960, 960, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960, bias=False)           (1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)           (2): ReLU6(inplace=True)         )         (2): Conv2d(960, 320, kernel_size=(1, 1), stride=(1, 1), bias=False)         (3): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       )     )     (18): ConvBNActivation(       (0): Conv2d(320, 1280, kernel_size=(1, 1), stride=(1, 1), bias=False)       (1): BatchNorm2d(1280, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)       (2): ReLU6(inplace=True)     )   )   (classifier): Sequential(     (0): Dropout(p=0.2, inplace=False)     (1): Linear(in_features=1280, out_features=10, bias=True)   ) )

Now, after creating the model, we have to check that it produces output of the correct shape, which can be done with the code below:

for images, labels in train_dl:
    print('images.shape:', images.shape)
    out = pretrained(images)
    print('out.shape:', out.shape)
    print('out[0]:', out[0])
    break

The output of the above code is:

images.shape: torch.Size([32, 3, 224, 224])
out.shape: torch.Size([32, 10])
out[0]: tensor([ 0.0603, -0.5612, 0.3957, -0.0069, -0.1256, -0.6125, 0.3528, -0.5256,
-0.0646, -0.1953], grad_fn=<SelectBackward>)

Finally, we train the model with the following code snippet, using a batch size of 32:

NUM_EPOCHS = 3
best_accuracy = 0.0
import torch.optim as optim
import torch.nn.functional as F

optimizer = optim.SGD(pretrained.parameters(), lr=0.001, momentum=0.9)
for epoch in range(NUM_EPOCHS):
    a = 0
    for images, labels in iter(train_dl):
        print(epoch, a)
        a = a + 1
        optimizer.zero_grad()
        outputs = pretrained(images)
        loss = F.cross_entropy(outputs, labels)
        loss.backward()
        optimizer.step()
    test_error_count = 0.0
    for images, labels in iter(val_dl):
        outputs = pretrained(images)
        test_error_count += float(torch.sum(labels != outputs.argmax(1)))   # count misclassified samples
    test_accuracy = 1.0 - float(test_error_count) / float(len(val_ds))
    print('%d: %f' % (epoch, test_accuracy))

Now that we have trained our model, it is time for prediction; for this we disable gradient tracking with torch.no_grad(), as shown below:

with torch.no_grad():
    images, labels = next(iter(val_dl))          # take one batch from the validation loader
    prediction = pretrained(images)
    predicted_class = prediction.argmax(dim=1)   # predicted class index for each image

Finally, we have used the Mobilenet weights as an initialiser and trained the full network on our custom dataset.
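Once training is done, the learned weights can be saved for later reuse; a one-line sketch (the file name is arbitrary):

torch.save(pretrained.state_dict(), 'mobilenet_v2_cifar10.pth')   # reload later with load_state_dict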

In this article we have discussed the pre-trained Mobilenet model and its implementation in PyTorch. In the next article we will discuss the Xception model. Stay tuned!!!!

Special Thanks:

As we say, "A car is useless if it doesn't have a good engine"; similarly, a student is useless without proper guidance and motivation. I would like to thank, from the bottom of my heart, my Guru as well as my Idol "Dr. P. Supraja" and "A. Helen Victoria", who guided me throughout the journey. As a Guru, she has lit the best available path for me and motivated me whenever I encountered a failure or roadblock; without her support and motivation this would have been an impossible task for me.

References

Pytorch: Link

Keras: Link

Tensorflow: Link

ResNet Paper:

Imagenet Dataset: Link

ILSVRC : Link

If you have any query, feel free to contact me with any of the below-mentioned options:

YouTube : Link

Website: www.rstiwari.com

Medium: https://tiwari11-rst.medium.com

Github Pages: https://happyman11.github.io/

Articles: https://laptrinhx.com/author/ravi-shekhar-tiwari/

Google Form: https://forms.gle/mhDYQKQJKtAKP78V7
