Hand Sign Recognition with CNNs

Photo by Kristen Brittain on Dribbble

Using a CNN to recognize hand-sign digits has several practical uses. Smartwatches are one of them: a watch that can recognize how many fingers are held up can use those signs to control media playback on the watch or on a paired phone. The same idea could work for TV remotes. And, just for fun, it lets us play hand cricket against a computer.

DATASET

Kaggle Dataset — https://www.kaggle.com/koryakinp/fingers

The dataset contains 12 labels:

  1. 0R
  2. 1R
  3. 2R
  4. 3R
  5. 4R
  6. 5R
  7. 0L
  8. 1L
  9. 2L
  10. 3L
  11. 4L
  12. 5L

Here the number is the count of raised fingers, and R or L indicates the right or left hand.
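The label scheme can be sketched in a few lines of Python. This is a minimal sketch, not the notebook's code: it assumes each image file name ends with the two-character label just before the extension, and the file name `abc123_3L.png` and the helpers `parse_label` and `one_hot` are hypothetical names made up for illustration.

```python
# The 12 class labels, in the same order as the list above.
LABELS = [f"{n}{hand}" for hand in "RL" for n in range(6)]

def parse_label(filename):
    """Return (finger_count, hand) from a name like 'abc123_3L.png'."""
    stem = filename.rsplit(".", 1)[0]   # drop the extension
    code = stem[-2:]                    # assumption: label is the last two characters
    if code not in LABELS:
        raise ValueError(f"unexpected label {code!r} in {filename!r}")
    return int(code[0]), code[1]

def one_hot(code):
    """One-hot vector for a label code such as '3L' (for categorical cross-entropy)."""
    vec = [0] * len(LABELS)
    vec[LABELS.index(code)] = 1
    return vec

fingers, hand = parse_label("abc123_3L.png")   # hypothetical file name
target = one_hot(f"{fingers}{hand}")
```

One-hot targets like `target` are what a categorical cross-entropy loss expects, one position per class.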

IMPORTS

I will be using TensorFlow to implement the CNN, scikit-image (skimage) to read the image data, Matplotlib to plot graphs and display images, and Seaborn to display the heatmap.

MODEL

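The model contains 2 Conv2D layers with 64 and 128 filters respectively, each followed by an ELU activation and batch normalization. Dropout follows the first block, and max pooling plus dropout follow the second. A global max-pooling layer then feeds 2 Dense layers, the last of which outputs one score per label.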

The loss function is categorical cross-entropy and the optimizer is Adam.
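A model of this shape could be written in Keras roughly as follows. This is a sketch under stated assumptions, not the code from the notebook: the kernel size, input shape, padding, dropout rate, the activation after the first Dense layer and the softmax on the output are not given in the text. A 4x4 kernel with "same" padding on a 64x64 single-channel input is assumed because it reproduces the parameter counts in the summary that follows.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_model(input_shape=(64, 64, 1), num_classes=12, drop_rate=0.25):
    # Assumptions: input_shape, 4x4 kernels, "same" padding and drop_rate
    # are guesses chosen to match the layer shapes and parameter counts.
    return tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        layers.Conv2D(64, kernel_size=4, padding="same"),
        layers.Activation("elu"),
        layers.BatchNormalization(),
        layers.Dropout(drop_rate),
        layers.Conv2D(128, kernel_size=4, padding="same"),
        layers.Activation("elu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(pool_size=2),
        layers.Dropout(drop_rate),
        layers.GlobalMaxPooling2D(),
        layers.Dense(128),
        layers.Activation("elu"),          # assumed; the text only names ELU for the conv layers
        layers.Dense(num_classes, activation="softmax"),
    ])

model = build_model()
model.compile(
    optimizer="adam",
    loss="categorical_crossentropy",       # expects one-hot encoded labels
    metrics=["accuracy"],
)
```

Calling `model.summary()` on this sketch should print a table with the same layer order as the one shown next.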

Model Architecture:

Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 64, 64, 64)        1088      
_________________________________________________________________
activation (Activation)      (None, 64, 64, 64)        0         
_________________________________________________________________
batch_normalization (BatchNo (None, 64, 64, 64)        256       
_________________________________________________________________
dropout (Dropout)            (None, 64, 64, 64)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 64, 64, 128)       131200    
_________________________________________________________________
activation_1 (Activation)    (None, 64, 64, 128)       0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 64, 64, 128)       512       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 32, 32, 128)       0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 32, 32, 128)       0         
_________________________________________________________________
global_max_pooling2d (Global (None, 128)               0         
_________________________________________________________________
dense (Dense)                (None, 128)               16512     
_________________________________________________________________
activation_2 (Activation)    (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 12)                1548      
=================================================================
Total params: 151,116
Trainable params: 150,732
Non-trainable params: 384
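The parameter counts in the summary can be checked by hand. The sketch below assumes 4x4 kernels and a single-channel input, which is inferred from the counts rather than stated in the article; the helper names are made up for illustration.

```python
def conv2d_params(kernel, in_channels, filters):
    # kernel*kernel*in_channels weights plus one bias, per filter
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(in_units, out_units):
    # one weight per input plus one bias, per output unit
    return (in_units + 1) * out_units

def batchnorm_params(channels):
    # (trainable, non_trainable): gamma and beta are trained,
    # the moving mean and variance are only updated, not trained
    return 2 * channels, 2 * channels

bn_train, bn_frozen = batchnorm_params(64)
bn1_train, bn1_frozen = batchnorm_params(128)

trainable = (
    conv2d_params(4, 1, 64)       # conv2d    -> 1088
    + bn_train
    + conv2d_params(4, 64, 128)   # conv2d_1  -> 131200
    + bn1_train
    + dense_params(128, 128)      # dense     -> 16512
    + dense_params(128, 12)       # dense_1   -> 1548
)
non_trainable = bn_frozen + bn1_frozen
total = trainable + non_trainable
```

The non-trainable parameters come entirely from the two batch normalization layers; activation, dropout and pooling layers have no parameters.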

Train:

https://gist.github.com/IamRash-7/030f81119c6e02590dd3245655339013

RESULT

The model achieved 99.75% accuracy, which is amazing!
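A score like this is usually the fraction of correct predictions, and a confusion matrix (the kind of table a Seaborn heatmap displays) shows where the remaining errors fall. The sketch below shows the arithmetic in plain Python. The class indices are toy values invented for illustration with only 3 classes; they are not the article's results.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """matrix[t][p] counts samples of true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def accuracy(matrix):
    """Correct predictions sit on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Toy class indices, made up for this example only.
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 1, 2, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
acc = accuracy(cm)
```

With the real model, `y_true` and `y_pred` would be the argmax of the one-hot labels and of the predicted probabilities, and the matrix would be 12 by 12.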

PREDICTIONS

Let's look at some predictions made by our model on randomly chosen images.

https://gist.github.com/IamRash-7/6f98170a6591407638aa8282bd1ca1ff
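Turning a model output into a readable label is a matter of picking the most probable class. The sketch below assumes the 12 output positions follow the label order listed in the DATASET section; the `decode` helper and the probability vector are hypothetical, not taken from the gist.

```python
# Assumed class order: 0R..5R followed by 0L..5L.
LABELS = ["0R", "1R", "2R", "3R", "4R", "5R",
          "0L", "1L", "2L", "3L", "4L", "5L"]

def decode(probs):
    """Return (label, confidence) for one vector of 12 class probabilities."""
    if len(probs) != len(LABELS):
        raise ValueError("expected one probability per label")
    best = max(range(len(probs)), key=lambda i: probs[i])   # argmax
    return LABELS[best], probs[best]

# Made-up probabilities standing in for one row of model.predict(...) output.
example = [0.01] * 12
example[9] = 0.89
label, confidence = decode(example)
```

Here `label` is the two-character code, so its first character is the finger count and its second is the hand.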

Notebook : Here

Credit : Rasswanth Shankar