PyTorch CNN Explanation

Link to the Image_classification_CIFAR10_CNN

Link to the Image_classification_Animal_EfficientNetV2

Table of Contents

Cheat Sheet

Image_classification_CIFAR10_CNN

Brief Summary
Hardware requirement
Data statistic
Learning curve
Metrics
Finetuning technique
Input
Result(Output & Demo)

Image_classification_Animal_EfficientNetV2

Brief Summary
Hardware requirement
Data statistic
Learning curve
Metrics
Finetuning technique
Input
Result(Output & Demo)

Summary

Cheat Sheet

Example Usage	Description
torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')	Selects the first CUDA GPU ('cuda:0') when one is available, otherwise falls back to the CPU.
transform = transforms.Compose([transforms.ToTensor(), transforms.Resize((32, 32))])	Composes several transforms together (transforms comes from torchvision and applies to images only). ToTensor converts the image to a tensor and divides pixel values by 255; Resize scales it to 32×32. This transform does not support torchscript.
torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)	CIFAR10 Dataset.
torch.utils.data.random_split(trainvalset, [40000, 10000])	Randomly split a dataset into non-overlapping new datasets of given lengths. If a list of fractions that sum up to 1 is given, the lengths will be computed automatically as floor(frac * len(dataset)) for each fraction provided. After computing the lengths, if there are any remainders, 1 count will be distributed in round-robin fashion to the lengths until there are no remainders left.
torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True)	Data loader combines a dataset and a sampler, and provides an iterable over the given dataset. The DataLoader supports both map-style and iterable-style datasets with single- or multi-process loading, customizing loading order and optional automatic batching (collation) and memory pinning. See torch.utils.data documentation page for more details.
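nn.Conv2d(3, 6, 5)	Creates a 2D convolution layer with 3 input channels, 6 output channels, and a 5×5 kernel.
nn.MaxPool2d(2, 2)	Creates a max-pooling layer with a 2×2 window and stride 2.
nn.Linear(400, 120)	Creates a fully connected (linear) layer with 400 inputs and 120 outputs.
torch.nn.Softmax(dim=1)	Creates a softmax activation layer applied along dimension 1 (the class dimension).
F.relu	Applies the ReLU activation function.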
criterion = nn.CrossEntropyLoss()	This criterion computes the cross entropy loss between input logits and target.
optimizer = optim.SGD(net.parameters(), lr=1e-2, momentum=0.9)	Implements stochastic gradient descent (optionally with momentum)
M = confusion_matrix(y_labels, y_predict)	Creates a confusion matrix from the true labels and the predicted labels.
transforms.RandomRotation(30,)	Rotate the image by angle. If the image is torch Tensor, it is expected to have […, H, W] shape, where … means an arbitrary number of leading dimensions.
transforms.RandomCrop(224)	Crop the given image at a random location.
transforms.RandomHorizontalFlip()	Horizontally flip the input with a given probability.
transforms.RandomVerticalFlip()	Vertically flip the input with a given probability.
net = torchvision.models.efficientnet_v2_s(weights = pretrain_weight)	Loads the predefined model 'efficientnet_v2_s', initialized with the given pretrained weights.
scheduler = lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.5)	Decays the learning rate of each parameter group by gamma every step_size epochs.
Full model definition. Creates a neural network model with the layers Input → Conv2d → ReLU → MaxPool2d → Conv2d → ReLU → MaxPool2d → Flatten → Linear → ReLU → Linear → ReLU → Linear → Softmax → Output:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CNN(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv1 = nn.Conv2d(3, 6, 5)         # 3 input channels, 6 output channels, 5x5 kernel
            self.pool = nn.MaxPool2d(2, 2)          # 2x2 kernel, stride 2
            self.conv2 = nn.Conv2d(6, 16, 5)
            self.fc1 = nn.Linear(400, 120)          # dense input 400 (16*5*5), output 120
            self.fc2 = nn.Linear(120, 84)           # dense input 120, output 84
            self.fc3 = nn.Linear(84, 10)            # dense input 84, output 10
            # note: nn.CrossEntropyLoss already applies log-softmax to its input
            self.softmax = torch.nn.Softmax(dim=1)  # softmax over dim 1 of (batch, class)

        def forward(self, x):
            x = self.pool(F.relu(self.conv1(x)))
            x = self.pool(F.relu(self.conv2(x)))
            x = torch.flatten(x, start_dim=1)       # flatten every dimension except batch (dim 0)
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            x = self.fc3(x)
            x = self.softmax(x)
            return x

    net = CNN().to(device)                          # device comes from the torch.device(...) call above
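The 400 input features of the first linear layer follow from the layer sizes in the model above. As a rough check, assuming the 32×32 input produced by transforms.Resize((32, 32)) and the default stride 1 and padding 0 of nn.Conv2d, the arithmetic can be sketched in plain Python. The helper name conv_out is made up for this illustration and is not part of PyTorch.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling window (hypothetical helper)."""
    return (size + 2 * padding - kernel) // stride + 1

side = 32                      # assumed input side after Resize((32, 32))
side = conv_out(side, 5)       # conv1, 5x5 kernel        -> 28
side = conv_out(side, 2, 2)    # pool, 2x2 window, stride 2 -> 14
side = conv_out(side, 5)       # conv2, 5x5 kernel        -> 10
side = conv_out(side, 2, 2)    # pool, 2x2 window, stride 2 -> 5

flat_features = 16 * side * side   # 16 channels from conv2, each side x side
print(side, flat_features)         # 5 400
```

The flattened size is channels × height × width, which is why nn.Linear takes 400 inputs.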

Image_classification_CIFAR10_CNN

Brief Summary

Builds a CNN in PyTorch from custom layers that classifies CIFAR10 images into 10 categories ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck'). The resulting model is not very accurate, but it performs better than random guessing (1/10).

Hardware requirement

A CUDA-capable GPU, or a CPU

Data statistic

40,000 training images, 10,000 validation images, 10,000 test images
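The training and validation counts come from the random_split call in the cheat sheet, applied to the 50,000-image CIFAR10 training set. The cheat sheet also describes how fractions are turned into lengths (floor, then round-robin distribution of the remainder). The following is a minimal sketch of that length computation only, written from the description and not taken from the library's code; split_lengths is a hypothetical name.

```python
import math

def split_lengths(total, fractions):
    """Turn fractions that sum to 1 into integer lengths that sum to total."""
    lengths = [math.floor(frac * total) for frac in fractions]
    remainder = total - sum(lengths)
    # Hand out the leftover items one at a time, round-robin.
    for i in range(remainder):
        lengths[i % len(lengths)] += 1
    return lengths

print(split_lengths(50000, [0.8, 0.2]))       # [40000, 10000]
print(split_lengths(10, [0.33, 0.33, 0.34]))  # [4, 3, 3]
```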

Learning curve

Metrics

Loss: computed using CrossEntropyLoss
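For intuition, cross entropy for one sample is the negative log of the softmax probability assigned to the true class, and nn.CrossEntropyLoss averages it over the batch by default. The sketch below reproduces that arithmetic in plain Python on made-up logits; the function names are illustrative and not part of PyTorch.

```python
import math

def cross_entropy(logits, target):
    """Cross entropy for one sample: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def mean_cross_entropy(batch_logits, targets):
    """Average the per-sample losses over a batch."""
    losses = [cross_entropy(l, t) for l, t in zip(batch_logits, targets)]
    return sum(losses) / len(losses)

# A model that scores all 10 classes equally is guessing at random.
uniform_loss = cross_entropy([0.0] * 10, target=3)
print(round(uniform_loss, 3))  # 2.303, i.e. ln(10)
```

A loss below ln(10) ≈ 2.303 therefore indicates the model is doing better than uniform guessing over 10 classes.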

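Acc: computed as the number of correct predictions divided by the total number of predictions

F1-score: the harmonic mean of precision and recall, F1 = 2 * precision * recall / (precision + recall)

Both range over [0, 1]; higher is better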
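Both metrics can be read off a confusion matrix. The sketch below assumes rows are true labels and columns are predicted labels, and it assumes the per-class F1 scores are macro-averaged; the source does not say which averaging was used, so treat that choice as an assumption. The matrix values are made up.

```python
def accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def macro_f1(matrix):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    n = len(matrix)
    scores = []
    for k in range(n):
        tp = matrix[k][k]
        fp = sum(matrix[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(matrix[k]) - tp                       # true k, predicted class differs
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n

example = [[8, 2],   # true class 0: 8 correct, 2 predicted as class 1
           [1, 9]]   # true class 1: 1 predicted as class 0, 9 correct
print(accuracy(example))  # 0.85
print(macro_f1(example))
```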

Finetuning technique

SGD with learning rate = 0.01 and momentum = 0.9
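To show what these two hyperparameters do, here is a minimal sketch of the SGD-with-momentum update on a single scalar parameter, using the same learning rate and momentum. It follows the form velocity = momentum * velocity + grad, param = param - lr * velocity; the toy objective (w - 3)^2 and the function name are assumptions made for illustration.

```python
def sgd_momentum_step(param, grad, velocity, lr=0.01, momentum=0.9):
    """One SGD update with momentum for a scalar parameter."""
    velocity = momentum * velocity + grad  # accumulate past gradients
    param = param - lr * velocity          # step against the accumulated direction
    return param, velocity

# Toy objective f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w, v = 0.0, 0.0
for _ in range(500):
    w, v = sgd_momentum_step(w, 2 * (w - 3), v)

print(round(w, 6))  # 3.0
```

Momentum reuses a decaying sum of earlier gradients, so consecutive steps in the same direction grow larger than plain SGD steps at the same learning rate.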

Input(Example)

Result(Output & Demo)

An array of predicted category indices, one for each input image
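Assuming the predicted category is the index of the largest model output for each image, the mapping from outputs to category indices and names can be sketched as follows. The output rows are invented for illustration; the class names are the ones listed in the Brief Summary.

```python
CLASSES = ['plane', 'car', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck']

def predict_indices(outputs):
    """Index of the largest score in each row (one row per image)."""
    return [max(range(len(row)), key=row.__getitem__) for row in outputs]

# Made-up softmax outputs for two images.
outputs = [
    [0.05, 0.05, 0.05, 0.55, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05],
    [0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.82, 0.02],
]

indices = predict_indices(outputs)
names = [CLASSES[i] for i in indices]
print(indices, names)  # [3, 8] ['cat', 'ship']
```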

Confusion Matrix:

Image_classification_Animal_EfficientNetV2