Link to the Image_classification_CIFAR10_CNN
Link to the Image_classification_Animal_EfficientNetV2
Table of Contents
- Image_classification_CIFAR10_CNN
  - Brief Summary
  - Hardware requirement
  - Data statistic
  - Learning curve
  - Metrics
  - Finetuning technique
  - Input
  - Result(Output & Demo)
- Image_classification_Animal_EfficientNetV2
  - Brief Summary
  - Hardware requirement
  - Data statistic
  - Learning curve
  - Metrics
  - Finetuning technique
  - Input
  - Result(Output & Demo)
Cheat Sheet
Example Usage | Description |
torch.device('cuda:0' if torch.cuda.is_available() else 'cpu') | Selects the first CUDA GPU (cuda:0) when one is available; otherwise falls back to the CPU. |
transform = transforms.Compose( # transform is from torchvision (only for image)
[transforms.ToTensor(), # image to tensor --> divide by 255
transforms.Resize((32, 32))]) | Composes several transforms together. This transform does not support torchscript. |
torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform) | CIFAR10 Dataset. |
torch.utils.data.random_split(trainvalset, [40000, 10000]) | Randomly split a dataset into non-overlapping new datasets of given lengths.
If a list of fractions that sum up to 1 is given, the lengths will be computed automatically as floor(frac * len(dataset)) for each fraction provided.
After computing the lengths, if there are any remainders, 1 count will be distributed in round-robin fashion to the lengths until there are no remainders left. |
torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True) | Data loader combines a dataset and a sampler, and provides an iterable over the given dataset.
The DataLoader supports both map-style and iterable-style datasets with single- or multi-process loading, customizing loading order and optional automatic batching (collation) and memory pinning.
See torch.utils.data documentation page for more details. |
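nn.Conv2d(3, 6, 5) | Creates a 2D convolution layer (3 input channels, 6 output channels, 5×5 kernel) |
nn.MaxPool2d(2, 2) | Creates a max-pooling layer (2×2 kernel, stride 2) |
nn.Linear(400, 120) | Creates a fully connected (linear) layer with 400 inputs and 120 outputs |
torch.nn.Softmax(dim=1) | Creates a softmax activation applied over dimension 1 (the class dimension) |
F.relu | Applies the ReLU activation function |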
criterion = nn.CrossEntropyLoss() | This criterion computes the cross entropy loss between input logits and target. |
optimizer = optim.SGD(net.parameters(), lr=1e-2, momentum=0.9) | Implements stochastic gradient descent (optionally with momentum) |
M = confusion_matrix(y_labels, y_predict) | Creates the confusion matrix from the true labels and the predictions |
transforms.RandomRotation(30,) | Rotate the image by angle. If the image is torch Tensor, it is expected to have […, H, W] shape, where … means an arbitrary number of leading dimensions. |
transforms.RandomCrop(224) | Crop the given image at a random location. |
transforms.RandomHorizontalFlip() | Horizontally flip the input with a given probability. |
transforms.RandomVerticalFlip() | Vertically flip the input with a given probability. |
net = torchvision.models.efficientnet_v2_s(weights = pretrain_weight) | Loads the built-in efficientnet_v2_s model with the given pre-trained weights |
scheduler = lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.5) | Decays the learning rate of each parameter group by gamma every step_size epochs. |
import torch
import torch.nn as nn
import torch.nn.functional as F
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5) # 3 input channels, 6 output channels, 5*5 kernel size
        self.pool = nn.MaxPool2d(2, 2) # 2*2 kernel size, stride 2
        self.conv2 = nn.Conv2d(6, 16, 5) # 6 input channels, 16 output channels, 5*5 kernel size
        self.fc1 = nn.Linear(400, 120) # dense input 400 (16*5*5), output 120
        self.fc2 = nn.Linear(120, 84) # dense input 120, output 84
        self.fc3 = nn.Linear(84, 10) # dense input 84, output 10 (one per category)
        self.softmax = torch.nn.Softmax(dim=1) # softmax over dim 1 of (batch, class)
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = torch.flatten(x, start_dim=1) # flatten all dimensions except batch (dim 0)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        x = self.softmax(x) # class probabilities; note that nn.CrossEntropyLoss expects raw logits
        return x
net = CNN().to(device) | Creates a neural network model with the following layers:
Input → Conv2d → ReLU → MaxPool2d → Conv2d → ReLU → MaxPool2d → Flatten → Linear → ReLU → Linear → ReLU → Linear → Softmax → Output |
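The input size of nn.Linear(400, 120) follows from the shapes produced by the layers before it. The following is a minimal sketch in plain Python, assuming 32×32 inputs (as set by transforms.Resize((32, 32))), stride 1 and no padding for the convolutions. The helper names conv_out, pool_out and flattened_features are hypothetical and only illustrate the arithmetic.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output side length of a square convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size: int, kernel: int, stride: int) -> int:
    """Output side length of a square max-pool (no padding)."""
    return (size - kernel) // stride + 1


def flattened_features(input_size: int = 32) -> int:
    """Number of features reaching the first linear layer."""
    side = conv_out(input_size, kernel=5)      # conv1: 32 -> 28
    side = pool_out(side, kernel=2, stride=2)  # pool:  28 -> 14
    side = conv_out(side, kernel=5)            # conv2: 14 -> 10
    side = pool_out(side, kernel=2, stride=2)  # pool:  10 -> 5
    channels = 16                              # output channels of conv2
    return channels * side * side              # 16 * 5 * 5


print(flattened_features())  # 400
```

If the input resolution changes, the first linear layer's input size has to change with it.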
Image_classification_CIFAR10_CNN
Brief Summary
Builds a CNN with PyTorch, using our own custom layers, that classifies CIFAR10 images into 10 categories ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck'). The resulting model is not very accurate, but it performs better than random guessing (1/10).
Hardware requirement
A CUDA-capable GPU, or a CPU
Data statistic
40,000 training images, 10,000 validation images, 10,000 test images
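The 40,000 / 10,000 train/validation split matches the random_split(trainvalset, [40000, 10000]) call in the cheat sheet, applied to the 50,000-image CIFAR10 training set. The sketch below reproduces, in plain Python, the length rule described there for fractional splits (floor each share, then hand out the remainder round-robin). The helper name split_lengths is hypothetical; it mirrors the documented rule rather than calling PyTorch.

```python
import math
from typing import List, Sequence


def split_lengths(total: int, fractions: Sequence[float]) -> List[int]:
    """Turn fractions that sum to 1 into integer split lengths."""
    lengths = [math.floor(frac * total) for frac in fractions]
    remainder = total - sum(lengths)
    # Hand out leftover items one at a time, round-robin over the splits.
    for i in range(remainder):
        lengths[i % len(lengths)] += 1
    return lengths


print(split_lengths(50000, [0.8, 0.2]))  # [40000, 10000]
print(split_lengths(10, [0.35, 0.65]))   # [4, 6]
```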
Learning curve
Metrics
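Loss: computed using CrossEntropyLoss
Acc: computed as the number of correct predictions divided by the total number of predictions
F1-score: 2 × precision × recall / (precision + recall), which ranges over [0, 1]; higher is better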
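As a rough illustration of the loss, cross entropy for a single sample is the negative log of the softmax probability assigned to the target class. The plain-Python sketch below assumes raw logits as input, which is what nn.CrossEntropyLoss expects; the function names and the example logits are hypothetical. It also shows what happens when outputs that were already passed through softmax are fed in, as would be the case for a network whose last layer is a softmax: the loss is still defined, but it is computed on a flattened distribution.

```python
import math
from typing import List, Sequence


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Cross entropy for one sample: -log(softmax(logits)[target])."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert logits into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [4.0, 1.0, 0.0]
loss_from_logits = cross_entropy(logits, target=0)
loss_from_probs = cross_entropy(softmax(logits), target=0)  # softmax applied twice
print(loss_from_logits < loss_from_probs)  # True
```

Returning raw logits from the network and applying softmax only when probabilities are needed for display is one way to avoid the double softmax.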
Finetuning technique
SGD with learning rate = 0.01 and momentum = 0.9
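To make the optimizer settings concrete, here is a minimal sketch of one SGD-with-momentum update on a single scalar parameter, using lr = 0.01 and momentum = 0.9. It follows the update form used by optim.SGD with default dampening (velocity = momentum × velocity + gradient), but the function name and the toy objective are hypothetical and chosen only for illustration.

```python
def sgd_momentum_step(param, grad, velocity, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update on a single scalar parameter."""
    velocity = momentum * velocity + grad  # accumulate a running gradient
    param = param - lr * velocity          # step against the accumulated gradient
    return param, velocity


# Toy objective: minimize f(w) = w**2 (gradient 2*w), starting from w = 1.0.
w, v = 1.0, 0.0
for _ in range(200):
    w, v = sgd_momentum_step(w, 2 * w, v)
print(abs(w) < 1e-3)  # True
```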
Input(Example)
Result(Output & Demo)
An array of predicted category indices, one for each input image
Confusion Matrix:
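The two outputs can be sketched in plain Python as follows: the predicted category index is the position of the highest score in each output row, and the confusion matrix counts (true, predicted) pairs. The helper names predict_indices and confusion_counts and the sample scores are hypothetical; the row = true label, column = prediction layout is an assumption that follows the common convention.

```python
from typing import List, Sequence


def predict_indices(scores: Sequence[Sequence[float]]) -> List[int]:
    """Pick the highest-scoring category index for each row of model outputs."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Rows are true categories, columns are predicted categories."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


scores = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6], [0.2, 0.5, 0.3], [0.6, 0.3, 0.1]]
y_labels = [0, 2, 1, 1]
y_predict = predict_indices(scores)
print(y_predict)  # [0, 2, 1, 0]
print(confusion_counts(y_labels, y_predict, num_classes=3))
# [[1, 0, 0], [1, 1, 0], [0, 0, 1]]
```

Off-diagonal entries are the misclassifications; here one image of category 1 was predicted as category 0.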
Image_classification_Animal_EfficientNetV2
Brief Summary
Builds a CNN with PyTorch and the EfficientNet_V2 model that classifies animal images into 10 categories ['butterfly','cat','chicken','cow','dog','elephant','horse','sheep','spider','squirrel']. The resulting model is accurate because it starts from the pre-trained weights EfficientNet_V2_S_Weights.IMAGENET1K_V1.
Hardware requirement
A CUDA-capable GPU, or a CPU
Data statistic
1,400 training images, 300 validation images, 300 test images
Learning curve
Metrics
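Loss: computed using CrossEntropyLoss
Acc: computed as the number of correct predictions divided by the total number of predictions
F1-score: 2 × precision × recall / (precision + recall), which ranges over [0, 1]; higher is better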
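Accuracy and F1-score can both be computed directly from the arrays of true labels and predictions. The plain-Python sketch below is illustrative only: the function names are hypothetical, and averaging the per-class F1 scores with equal weight (macro averaging) is an assumption, since the averaging mode used for the reported score is not stated here.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int], num_classes: int) -> float:
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    scores = []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); taken as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes


labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
print(accuracy(labels, preds))
print(macro_f1(labels, preds, num_classes=3))
```

The expression 2·TP / (2·TP + FP + FN) is algebraically the same as 2 × precision × recall / (precision + recall).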
Finetuning technique
An optimizer with learning rate = 0.02 and momentum = 0.9, combined with lr_scheduler.StepLR (step_size=7, gamma=0.5), which halves the learning rate every 7 epochs
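The resulting learning-rate schedule can be written in closed form. The sketch below is plain Python, not the PyTorch scheduler itself; the function name step_lr is hypothetical, and it assumes epochs are counted from 0 with the scheduler stepped once per epoch.

```python
def step_lr(initial_lr: float, epoch: int, step_size: int = 7, gamma: float = 0.5) -> float:
    """Learning rate in effect during a given epoch (0-indexed) under step decay."""
    return initial_lr * gamma ** (epoch // step_size)


for epoch in (0, 6, 7, 14, 21):
    print(epoch, step_lr(0.02, epoch))
```

With these settings the rate stays at 0.02 for epochs 0 to 6, drops to 0.01 at epoch 7, and halves again every 7 epochs after that.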
Input(Example)
Result(Output & Demo)
An array of predicted category indices, one for each input image
Confusion Matrix:
Summary
- Prepare data
  - Find data
  - Create data pipeline
- Create model
  - Either build our own custom layers or use a pre-trained model
  - Select loss function
  - Select optimizer
- Train model
  - Train
  - Validate
  - Show all metrics (loss, accuracy, F1-score)
- Test model
  - Show result (confusion matrix)