Multiclass Image Classification with PyTorch

Nandan Pandey
Published in Analytics Vidhya
8 min read · Jun 30, 2020

Intel Classification Challenge

The data for this tutorial comes from Kaggle. It was originally published on Analytics Vidhya by Intel to host an image classification challenge.

About Dataset

This dataset contains around 25k images of size 150x150, distributed across 6 categories:
{'buildings': 0, 'forest': 1, 'glacier': 2, 'mountain': 3, 'sea': 4, 'street': 5}

The Train, Test and Prediction data come in separate zip files. There are around 14k images in Train, 3k in Test and 7k in Prediction.

Challenge

It’s a multiclass image classification problem. The objective is to classify these images into the correct category with high accuracy.

Prerequisite

A basic understanding of Python, PyTorch and classification problems.

Approach

  1. Do some exploratory data analysis (EDA) to analyze and visualize the data for better understanding.
  2. Define utility functions for the various tasks so that the code stays modular.
  3. Load several pre-trained models and fine-tune them for our problem.
  4. Try various hyperparameters for each model.
  5. Save each model’s weights and record the metrics.
  6. Conclusion
  7. Future Work

So let’s dive into the code!

1. Libraries

First of all, import all the important libraries.

2. Image folder to Dataset

Our data is stored in folders, so let’s convert each folder into a dataset.

3. Exploratory Data Analysis (EDA)

Let’s answer a few questions as part of EDA. EDA is not covered extensively here; if you want to go deeper, you can clone the notebook and explore it for practice.

a) How many images are there in the dataset?

Answer:

There are 14034 images for training, 3000 images for test/validation and 7301 images for prediction.

b) Can you tell me the image size?

Answer:

It means that the image size is 150 * 150 with three channels, and its label is 0.

c) Can you print a batch of training images?

Answer: This question is answered after creating a dataloader, so move on to the next heading.

4. Creating a DataLoader

Create a dataloader for each dataset so that the data is loaded in batches.

Next, create a dataloader that can be used to print the images of one batch, as asked in the question above.
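Here is a rough sketch of the batching step. It uses random tensors as a stand-in for the image datasets, and the batch size of 16 is an assumed value, not the one from the notebook.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for the image dataset: 64 random 3x150x150 "images", labels 0..5.
images = torch.rand(64, 3, 150, 150)
labels = torch.randint(0, 6, (64,))
train_ds = TensorDataset(images, labels)

batch_size = 16  # assumed value; tune it for your GPU memory

# Shuffle the training data every epoch; keep evaluation order fixed.
train_dl = DataLoader(train_ds, batch_size=batch_size, shuffle=True)
val_dl = DataLoader(train_ds, batch_size=batch_size * 2)

# Pull a single batch, e.g. to inspect or plot it.
xb, yb = next(iter(train_dl))
print(xb.shape, yb.shape)
print(len(train_dl), len(val_dl))  # number of batches per epoch
```

To actually display the batch, one option is to pass `xb` to `torchvision.utils.make_grid` and show the result with matplotlib.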

5. Generate Class names

The class names could be listed by hand by looking at the folder names, but as good practice we should write code for this.
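One way to do this, sketched with the standard library only: list the sub-folders of the data directory and sort them, which matches the alphabetical label order shown at the top of this post. The helper name `get_class_names` and the temporary folder tree are made up for illustration.

```python
import os
import tempfile


def get_class_names(data_dir):
    """Return the sorted sub-folder names; each sub-folder is one class."""
    return sorted(
        entry for entry in os.listdir(data_dir)
        if os.path.isdir(os.path.join(data_dir, entry))
    )


# Demonstration on a temporary folder tree that mimics the dataset layout.
root = tempfile.mkdtemp()
for name in ["street", "sea", "mountain", "glacier", "forest", "buildings"]:
    os.makedirs(os.path.join(root, name))

class_names = get_class_names(root)
class_to_idx = {name: idx for idx, name in enumerate(class_names)}
print(class_names)
print(class_to_idx)
```

If the dataset was built with torchvision's ImageFolder, its `classes` attribute gives the same list without any extra code.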

6. Create accuracy function

Define a function that calculates the accuracy of our model.
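A possible version of such a function: take the class with the highest score for each sample and compare it with the label. The sample logits below are made up.

```python
import torch


def accuracy(outputs, labels):
    """Fraction of samples whose highest-scoring class equals the label."""
    preds = torch.argmax(outputs, dim=1)
    return (preds == labels).float().mean()


# Four samples, three classes; the third sample is predicted wrongly.
logits = torch.tensor([[2.0, 0.1, 0.3],
                       [0.2, 0.1, 3.0],
                       [0.5, 1.5, 0.1],
                       [1.0, 0.9, 0.8]])
labels = torch.tensor([0, 2, 0, 0])

print(accuracy(logits, labels).item())  # 3 of 4 correct -> 0.75
```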

7. Download Pretrained model

Download any pretrained model of your choice. Here I have chosen two models, VGG and ResNet50, to experiment with. Let’s download them.

8. Freeze all layers

After downloading the models, you can train the whole architecture if you want. Another possible strategy is to train some layers of the pretrained model and leave the others untouched. Here I have chosen not to train any existing layer on the new inputs, so I froze all layers by setting requires_grad to False on each of the model’s parameters.

When requires_grad is True, gradients are computed for that parameter during the backward pass, so the optimizer can update it. When it is False, the parameter stays fixed.
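The freezing loop can be sketched as follows. The small network is a made-up stand-in, not VGG or ResNet50, so that the example runs without downloading any weights. The loop itself is the same for a real pretrained model.

```python
import torch.nn as nn

# Small stand-in for a pretrained network (hypothetical architecture).
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 1000),
)

total = sum(p.numel() for p in model.parameters())

# Freeze every existing parameter so training cannot change it.
for param in model.parameters():
    param.requires_grad = False

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(total, trainable)

# A layer created afterwards is trainable by default.
new_head = nn.Linear(8, 6)
print(all(p.requires_grad for p in new_head.parameters()))
```

Because new layers start with requires_grad set to True, a classifier added after freezing is the only part that gets trained.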

9. Add your own classifier layer

To use a downloaded pretrained model as your own classifier, you have to change it, for two reasons. First, the number of classes you want to predict may differ from the number of classes the model was trained on. Second, in almost every case the model was trained to detect one kind of thing while you want to detect something different.

So the change is to give the model your own classification layer, which classifies according to your requirements.

What you add to the pretrained model is entirely up to you. Here I have chosen the most common strategy: replace the last layer of the model with your own classification layer.

Another strategy is to remove several layers from the end, say the last three, and add your own classification layer in their place.

For a better understanding, see below.

Pretrained VGG model:

The image above shows the last two layers (avgpool and classifier) of the VGG model. You can see that this pretrained model is designed to classify 1000 classes. We only need 6 classes, so let’s change the model slightly.

New model after replacing last layer:

I have replaced the classifier layer with my own. As you can see, it has 6 out_features, meaning 6 outputs, whereas the pretrained model had 1000 because it was trained to classify that many classes.

You might ask why some of the in_features and out_features inside the classifier layer have changed.

You can choose any number for those, but remember that the in_features of the first Linear layer, 25088, must stay the same. It is the number of features produced by the preceding layers (512 channels × 7 × 7 after avgpool), and those layers are not changed.

Same for ResNet50:

Pretrained model (last two layers):

My new model after replacing the last layer:

Notice that in_features of the first Linear layer is still 2048 and out_features of the last Linear layer is 6.

Any in_features and out_features other than those mentioned above can be changed as you like.
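The replacement can be sketched like this. To avoid downloading weights, the example uses a made-up `TinyVGGLike` class that only copies VGG's features/avgpool/classifier layout; the hidden size of 128 and the dropout rate are arbitrary choices. Reading `in_features` from the existing layer, instead of typing it by hand, gives 25088 on the real VGG and 1568 on this small stand-in.

```python
import torch
import torch.nn as nn


class TinyVGGLike(nn.Module):
    """Hypothetical stand-in with the same layout as VGG's head."""

    def __init__(self, num_classes=1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=4), nn.ReLU()
        )
        self.avgpool = nn.AdaptiveAvgPool2d((7, 7))
        self.classifier = nn.Sequential(
            nn.Linear(32 * 7 * 7, 64), nn.ReLU(), nn.Linear(64, num_classes)
        )

    def forward(self, x):
        x = self.avgpool(self.features(x))
        return self.classifier(torch.flatten(x, 1))


model = TinyVGGLike()

# The input size of the first Linear layer is fixed by the layers before it,
# so read it from the model rather than hard-coding it.
in_features = model.classifier[0].in_features

# Hidden sizes are free choices; only in_features and the final
# out_features (6 classes) are dictated by the problem.
model.classifier = nn.Sequential(
    nn.Linear(in_features, 128),
    nn.ReLU(),
    nn.Dropout(0.4),
    nn.Linear(128, 6),
)

out = model(torch.rand(2, 3, 150, 150))
print(in_features, out.shape)
```

For a ResNet-style model the same idea applies to its final fully connected layer, whose in_features is 2048 in ResNet50.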

10. Create base class

Create a base class that contains all the functions we will need later. This follows the DRY (Don’t Repeat Yourself) principle: both models need these functions, and without a shared base class we would have to define them separately for each model.

11. Inherit base class

Create a class for each model by inheriting from the base class, which has all the functions required while training any model.

12. Create object of inherited class

Instantiate each class.
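The last three steps (base class, subclass, instance) could look roughly like this. The class and method names are assumptions borrowed from a common PyTorch tutorial pattern, and `TinyClassifier` is a made-up model standing in for the VGG and ResNet50 subclasses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def accuracy(outputs, labels):
    preds = torch.argmax(outputs, dim=1)
    return (preds == labels).float().mean()


class ImageClassificationBase(nn.Module):
    """Shared training/validation helpers (names are assumptions)."""

    def training_step(self, batch):
        images, labels = batch
        return F.cross_entropy(self(images), labels)

    def validation_step(self, batch):
        images, labels = batch
        out = self(images)
        return {"val_loss": F.cross_entropy(out, labels).detach(),
                "val_acc": accuracy(out, labels)}

    def validation_epoch_end(self, outputs):
        # Average the per-batch metrics into one number per epoch.
        losses = torch.stack([x["val_loss"] for x in outputs])
        accs = torch.stack([x["val_acc"] for x in outputs])
        return {"val_loss": losses.mean().item(), "val_acc": accs.mean().item()}


class TinyClassifier(ImageClassificationBase):
    """Each concrete model only defines its network and forward pass."""

    def __init__(self, num_classes=6):
        super().__init__()
        self.network = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, num_classes))

    def forward(self, x):
        return self.network(x)


model = TinyClassifier()
batch = (torch.rand(4, 3, 8, 8), torch.randint(0, 6, (4,)))
loss = model.training_step(batch)
result = model.validation_epoch_end([model.validation_step(batch)])
print(loss.item(), result)
```

A second model would inherit from the same base class and reuse every helper unchanged.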

13. Check device

Create a function that checks which device is available. If a GPU is present, choose it; otherwise use the CPU as the working device.

Here I am using a GPU, hence the device type is shown as CUDA.
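A minimal sketch of such a check; the function name `get_default_device` is an assumption.

```python
import torch


def get_default_device():
    """Pick the GPU if one is available, otherwise fall back to the CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")


device = get_default_device()
print(device)
```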

14. Move to Device

Create a function that can move tensors and models to a specific device.

15. DeviceDataLoader

Create a DeviceDataLoader class that wraps a DataLoader, moves each batch to the specified device and yields it from there.

Here you can see that the tensors and both models have been sent to the available device. In my case this device is a GPU.
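The move-to-device function and the wrapper class from the last two steps might be written as below. This is a sketch under the assumption that a batch is a tensor or a list/tuple of tensors; the random dataset is only there to make it runnable.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def to_device(data, device):
    """Move a tensor, a model, or a (nested) list/tuple of tensors to device."""
    if isinstance(data, (list, tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)


class DeviceDataLoader:
    """Wrap a DataLoader so each batch is moved to device as it is yielded."""

    def __init__(self, dl, device):
        self.dl = dl
        self.device = device

    def __iter__(self):
        for batch in self.dl:
            yield to_device(batch, self.device)

    def __len__(self):
        return len(self.dl)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
dataset = TensorDataset(torch.rand(10, 3, 8, 8), torch.randint(0, 6, (10,)))
train_dl = DeviceDataLoader(DataLoader(dataset, batch_size=4), device)

for images, labels in train_dl:
    print(images.shape, images.device.type, labels.device.type)
```

Moving batches lazily inside `__iter__` means only one batch at a time has to fit in GPU memory.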

16. Evaluate and fit functions

Let’s define an evaluate function that measures the performance of our model on unseen data, and a fit function that trains our model.
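A compact sketch of the two functions, written so it runs on its own. The signatures, the Adam default and the toy two-class data are assumptions for illustration, not the notebook's exact code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset


@torch.no_grad()
def evaluate(model, val_loader):
    """Average loss and accuracy over a loader, without tracking gradients."""
    model.eval()
    losses, correct, total = [], 0, 0
    for images, labels in val_loader:
        out = model(images)
        losses.append(F.cross_entropy(out, labels).item())
        correct += (out.argmax(dim=1) == labels).sum().item()
        total += labels.size(0)
    return {"val_loss": sum(losses) / len(losses), "val_acc": correct / total}


def fit(epochs, lr, model, train_loader, val_loader, opt_func=torch.optim.Adam):
    """Train for a number of epochs and record validation metrics per epoch."""
    history = []
    # Only hand trainable parameters to the optimizer; frozen ones are skipped.
    optimizer = opt_func([p for p in model.parameters() if p.requires_grad], lr=lr)
    for epoch in range(epochs):
        model.train()
        for images, labels in train_loader:
            loss = F.cross_entropy(model(images), labels)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
        result = evaluate(model, val_loader)
        result["epoch"] = epoch
        history.append(result)
    return history


# Toy data: the class is decided by the sign of the first feature.
torch.manual_seed(0)
x = torch.randn(120, 4)
y = (x[:, 0] > 0).long()
train_dl = DataLoader(TensorDataset(x[:96], y[:96]), batch_size=32, shuffle=True)
val_dl = DataLoader(TensorDataset(x[96:], y[96:]), batch_size=32)

model = nn.Linear(4, 2)
before = evaluate(model, val_dl)
history = fit(5, 0.1, model, train_dl, val_dl)
print(before["val_loss"], history[-1]["val_loss"])
```

Calling `fit` a second time on the same model continues training from the current weights, which is how the later training phases build on the earlier ones.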

17. Training (phase-1)

Let’s train our first model, VGG, for a few epochs.

18. Training (phase-2)

Let’s train for a few more epochs and evaluate the model.

19. Training (phase-3)

Let’s train our second model, model2 (ResNet50), for a few epochs.

20. Training (phase-4)

Let’s train for a few more epochs and evaluate the model.

21. Predict Single Image

Define a function that a model can use to predict on a single image.
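One way to write it: add a batch dimension, run the model and map the highest-scoring index back to a class name. The function name and the untrained stand-in model are assumptions, so the printed class is arbitrary.

```python
import torch
import torch.nn as nn

classes = ["buildings", "forest", "glacier", "mountain", "sea", "street"]


@torch.no_grad()
def predict_single(image, model, classes, device=torch.device("cpu")):
    """Return the predicted class name for one image tensor of shape (3, H, W)."""
    model.eval()
    xb = image.unsqueeze(0).to(device)  # add batch dimension: (1, 3, H, W)
    logits = model(xb)
    pred_idx = torch.argmax(logits, dim=1).item()
    return classes[pred_idx]


# Untrained stand-in model, only to show the call.
model = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(3, 6))
image = torch.rand(3, 150, 150)
print(predict_single(image, model, classes))
```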

22. Do prediction

Let’s predict.

As can be seen, VGG currently gives a wrong prediction even though it has a good validation accuracy (val_acc), whereas ResNet gives the right prediction. That still does not mean ResNet will be right on every image.

So let’s train both models for more epochs so that the error is minimized, i.e. val_loss is reduced as much as possible, and both models perform more accurately.

How far you take it is up to you, but the whole procedure is the one presented here.

Now it’s your turn to predict on the whole pred folder/dataset.

Hint: Use pred_dl as a dataloader to load the pred data in batches for prediction. Practice it, and also try ensembling the predictions to get more of them correct.
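As a starting point for the hint, here is one simple way to ensemble: average the softmax probabilities of both models and take the argmax. This is only one of several ensembling schemes. The two tiny models and the random `pred_dl` below are stand-ins, and the function name is made up.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset


@torch.no_grad()
def ensemble_predict(models, loader):
    """Average class probabilities across models, then pick the best class."""
    for m in models:
        m.eval()
    all_preds = []
    for batch in loader:
        images = batch[0]  # works whether or not the loader also yields labels
        probs = torch.stack([F.softmax(m(images), dim=1) for m in models])
        all_preds.append(probs.mean(dim=0).argmax(dim=1))
    return torch.cat(all_preds)


# Stand-ins for the two trained models and the prediction dataloader.
model_a = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 6))
model_b = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 6))
pred_dl = DataLoader(TensorDataset(torch.rand(10, 3, 32, 32)), batch_size=4)

preds = ensemble_predict([model_a, model_b], pred_dl)
print(preds)
```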

23. Save model

Once the model is trained well, let’s save it so that we can use it in the future work described under the next heading.
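A minimal sketch of saving and restoring weights through the model's state_dict. The file name and the small Linear model are placeholders.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 6)  # placeholder for a trained model

# Save only the learned weights (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "model_weights.pth")
torch.save(model.state_dict(), path)

# Later: rebuild the same architecture, then load the weights into it.
restored = nn.Linear(4, 6)
restored.load_state_dict(torch.load(path, map_location="cpu"))
restored.eval()

print(torch.equal(model.weight, restored.weight))
```

Saving the state_dict rather than the whole model object means the class definition must be available when loading, but the file stays independent of how the code is organized.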

24. Future Work

Ensemble both models’ predictions to make the final prediction, and convert this project into a Flask/Streamlit web app using our saved model. So stay tuned for the next blog. Till then, happy learning!

Resources

If you want the notebook, you can get it here.

Contact

For any queries, you can ping me on Twitter or LinkedIn.
