Food Mnist Classification

Photo by Sophia Mii L, ILLO and Animagic Studios on Dribbble

This dataset consists of 10 food categories, with 5,000 images in total. For each class, 125 manually reviewed test images are provided, as well as 375 training images. The training images were deliberately not cleaned, and thus still contain some amount of noise.

This noise comes mostly in the form of intense colors and occasional wrong labels. All images were rescaled to have a maximum side length of 512 pixels. Our aim is to train a deep learning model that can successfully classify food images.

Food Images
Credit: Rohit Sharma

Table of Contents

  • Introduction to cAInvas
  • Importing the Dataset
  • Data Preprocessing
  • Model Training
  • Introduction to DeepC
  • Compilation with DeepC

Introduction to cAInvas

cAInvas is an integrated development platform for creating intelligent edge devices. Not only can we train our deep learning model using TensorFlow, Keras, or PyTorch, we can also compile our model with its edge compiler, DeepC, to deploy the working model on edge devices for production.

The Food Mnist Classification model is also part of the cAInvas gallery. All the dependencies you will need for this project come pre-installed.

cAInvas also offers various other deep learning notebooks in its gallery, which one can use for reference or to gain insight into deep learning. It also has GPU support, which makes it one of the best of its kind.

Importing the Dataset

One of the key features of cAInvas is its UseCases Gallery: when working on any of its use cases, you don’t have to look for data manually. We will be cloning a repository for the dataset. To load the data, we just have to enter the following command:

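The repository URL is not reproduced here; the notebook cell below is only a placeholder for the clone command (the URL is hypothetical).

# Placeholder for the dataset clone command used in the cAInvas notebook
!git clone <dataset-repository-url>
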
Running the above command will load the data into your workspace, ready to be used for model training.

Data Preprocessing

In this step, we will convert the inputs of the train set and test set from lists into NumPy arrays and normalize the pixel values of both sets for better training results and model performance. Finally, we will one-hot encode our labels, since there are ten classes of images. All of this can be done by executing the following commands:

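A minimal sketch of this step is shown below; the variable names (train_images, train_labels, test_images, test_labels) are assumptions, not the notebook's actual identifiers.

import numpy as np
from tensorflow.keras.utils import to_categorical

NUM_CLASSES = 10

# Assumed inputs: lists of equally-sized image arrays and integer labels in [0, 9].
X_train = np.array(train_images, dtype="float32") / 255.0   # list -> NumPy array, pixels scaled to [0, 1]
X_test  = np.array(test_images,  dtype="float32") / 255.0

y_train = to_categorical(train_labels, NUM_CLASSES)          # one-hot encode the 10 classes
y_test  = to_categorical(test_labels,  NUM_CLASSES)
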
Model Training

After creating the train set and test set, the next step is to pass our training data into our deep learning model so that it learns to classify food images. We will be creating two models:

In one model we will apply transfer learning using EfficientNet; the other model will be a plain CNN.

The architectures of the two models are as follows (a rough Keras reconstruction is shown after the printed summaries):

Model: "efficient_net"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
efficientnet-b0 (Functional) (None, 7, 7, 1280)        4049564   
_________________________________________________________________
global_average_pooling2d (Gl (None, 1280)              0         
_________________________________________________________________
dropout (Dropout)            (None, 1280)              0         
_________________________________________________________________
dense (Dense)                (None, 10)                12810     
=================================================================
Total params: 4,062,374
Trainable params: 4,020,358
Non-trainable params: 42,016
_________________________________________________________________
Model: "cnn_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 217, 217, 32)      6176      
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 210, 210, 64)      131136    
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 203, 203, 128)     524416    
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 101, 101, 128)     0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 94, 94, 256)       2097408   
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 47, 47, 256)       0         
_________________________________________________________________
global_average_pooling2d_1 ( (None, 256)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 64)                16448     
_________________________________________________________________
dense_2 (Dense)              (None, 10)                650       
=================================================================
Total params: 2,776,234
Trainable params: 2,776,234
Non-trainable params: 0
_________________________________________________________________

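For reference, here is a rough reconstruction of how the two models could be defined in Keras. This is only a sketch based on the summaries above: the 8×8 kernels and layer sizes are inferred from the printed shapes and parameter counts, and tf.keras.applications.EfficientNetB0 stands in for whichever EfficientNet implementation the original notebook uses (its parameter count differs slightly from the summary).

import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 10
INPUT_SHAPE = (224, 224, 3)  # inferred from the output shapes in the summaries above

def build_efficient_net():
    # Transfer-learning model: EfficientNet-B0 backbone + pooling + dropout + softmax head.
    base = tf.keras.applications.EfficientNetB0(
        include_top=False, weights="imagenet", input_shape=INPUT_SHAPE)
    base.trainable = True  # the summary shows the backbone weights are trainable (fine-tuning)
    return models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dropout(0.2),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ], name="efficient_net")

def build_cnn_model():
    # Plain CNN: stacked 8x8 convolutions with max pooling, as in the second summary.
    return models.Sequential([
        layers.Conv2D(32, 8, activation="relu", input_shape=INPUT_SHAPE),
        layers.Conv2D(64, 8, activation="relu"),
        layers.Conv2D(128, 8, activation="relu"),
        layers.MaxPooling2D(2),
        layers.Conv2D(256, 8, activation="relu"),
        layers.MaxPooling2D(2),
        layers.GlobalAveragePooling2D(),
        layers.Dense(64, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ], name="cnn_model")
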
The loss function used for both models is “categorical_crossentropy” and the optimizer used is “Adam”. For training, we used the Keras API with TensorFlow as the backend; a sketch of the compile-and-fit call follows the metrics below. Here are the final performance metrics for both models:

Efficient Net:

loss: 0.0803 - acc: 0.9816 - val_loss: 0.4097 - val_acc: 0.8800

CNN Model:

loss: 0.2559 - acc: 0.9167 - val_loss: 1.7814 - val_acc: 0.6240

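A sketch of the compile-and-fit step, using the loss and optimizer named above and the arrays from the preprocessing sketch (the epoch count and batch size are assumptions; the article does not state them):

model = build_efficient_net()            # or build_cnn_model()
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["acc"])
history = model.fit(X_train, y_train,
                    validation_data=(X_test, y_test),
                    epochs=10,           # assumed number of epochs
                    batch_size=32)       # assumed batch size
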
Introduction to DeepC

The DeepC compiler and inference framework is designed to enable and run deep learning neural networks by focusing on the features of small form-factor devices such as microcontrollers, eFPGAs, and CPUs, and other embedded devices like the Raspberry Pi, Odroid, Arduino, SparkFun Edge, RISC-V boards, mobile phones, and x86 and ARM laptops, among others.

DeepC also offers an ahead-of-time compiler that produces optimized executables based on the LLVM compiler toolchain, specialized for deep neural networks, with ONNX as the front end.

Compilation with DeepC

After training, the model was saved in the H5 format using Keras, as it conveniently stores the weights and model configuration in a single file.

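For example (the filename is an assumption):

model.save("food_mnist_model.h5")   # stores the architecture and weights in a single HDF5 file
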
After saving the file in H5 format, we can easily compile our model using the DeepC compiler, which comes as part of the cAInvas platform, so that our saved model is converted into a format that can be deployed to edge devices. All of this can be done with a single, simple command.

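In cAInvas notebooks this is typically done by invoking the deepCC compiler on the saved model from a notebook cell; a sketch, assuming the filename used above:

!deepCC food_mnist_model.h5
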
And that’s it, our Food MNIST Classification Model is trained and ready for deployment on edge devices.

Link for the cAInvas Notebook:

Credit: Ashish Arya

Also Read: Captcha recognition — on cAInvas