KEY FACIAL POINTS DETECTION AND FACE RECOGNITION

Key Facial Points Detection has many applications in the field of IoT and is widely used in face-related tasks such as face recognition, drowsiness detection, and many more. Face recognition is used for security verification and has many other applications as well.

Photo by Gleb Kuznetsov on Dribbble

Table of Contents

  • Introduction to cAInvas
  • Importing the Dataset
  • Data Analysis
  • Data Augmentation
  • Model Training
  • Introduction to DeepC
  • Compilation with DeepC
  • Face Recognition

Introduction to cAInvas

cAInvas is an integrated development platform for creating intelligent edge devices. Not only can we train our deep learning model using TensorFlow, Keras, or PyTorch, we can also compile it with the platform's edge compiler, DeepC, to deploy the working model on edge devices for production. The Key Facial Points Detection model is also a part of the cAInvas gallery, and all the dependencies you will need for this project come pre-installed.

cAInvas also offers various other deep learning notebooks in its gallery which one can use for reference or to gain insight into deep learning. It also has GPU support, which makes it one of the best platforms of its kind.

Importing the Dataset

One of cAInvas's key features is its UseCases Gallery. When working on any of its use cases, you don't have to fetch data manually: the dataset is imported into your workspace automatically. To load the data we will use the pandas library and read the CSV dataset, which includes the coordinates of the key facial points; the last column contains the pixel values of the image. We just have to run the following commands:

https://gist.github.com/Gunjan933/cd1f9c20f70e9d50117d19ee9f2a2e3c
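The gist above has the exact commands; as a rough sketch, loading looks like this (the column names and the two-row sample here are assumptions mimicking the standard facial-keypoints CSV layout):

```python
import io
import pandas as pd

# Hypothetical two-row sample mimicking the dataset layout:
# keypoint coordinate columns followed by an "Image" column
# holding space-separated pixel values.
csv_text = (
    "left_eye_center_x,left_eye_center_y,Image\n"
    '66.0,39.0,"0 0 0 0"\n'
    '64.5,37.2,"255 255 255 255"\n'
)
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)        # (2, 3)
print(df["Image"][0])  # 0 0 0 0
```

In the real notebook the argument to `pd.read_csv` would simply be the path of the imported CSV file.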

Data Analysis

In this step we will analyze our data. The CSV data consists of 31 columns. The first 30 columns contain the coordinates of the fifteen key facial points, and the last column contains the pixel values of the face image. Since the pixel values are given as a space-separated string, we will convert them into a NumPy array and reshape the resulting 1D array into a 2D array of shape (96, 96). The plotted image along with the key facial points looks like this:
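Converting one row's pixel string into an image can be sketched as follows; the 96×96 size comes from the dataset, while the sample string here is synthetic:

```python
import numpy as np

# Synthetic stand-in for one row's "Image" column:
# 96 * 96 = 9216 space-separated pixel values.
pixel_string = " ".join(["128"] * (96 * 96))

# Split the string, cast to a float array, and reshape 1D -> 2D (96, 96).
img = np.array(pixel_string.split(), dtype="float32").reshape(96, 96)

print(img.shape)  # (96, 96)
```

The resulting array can then be shown with `matplotlib.pyplot.imshow`, with the keypoint coordinates scattered on top.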

Data Augmentation

For data augmentation we will mirror the images horizontally and increase or decrease their brightness. Since a horizontal flip mirrors the image across its vertical axis, the y-coordinate values stay the same and only the x-coordinate values change: all we have to do is subtract each initial x-coordinate from the width of the image (96). The original image, the flipped image, and the brightened image look like this:
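A minimal sketch of both augmentations, following the article's rule of subtracting x from the width (96); `np.fliplr` mirrors the pixel columns:

```python
import numpy as np

img = np.arange(96 * 96, dtype="float32").reshape(96, 96)
keypoints = np.array([[30.0, 40.0], [66.0, 40.0]])  # (x, y) pairs

# Mirror the image across its vertical axis: left <-> right.
flipped_img = np.fliplr(img)

# y stays the same; x is subtracted from the image width (96).
flipped_kp = keypoints.copy()
flipped_kp[:, 0] = 96 - flipped_kp[:, 0]

# Brightness augmentation: scale pixel values, then clip to the valid range.
brighter = np.clip(img * 1.5, 0.0, 255.0)
```

After the flip, the two x-coordinates 30 and 66 swap to 66 and 30, so the points still land on the mirrored features.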

Next we will pass the pixel values of the images as input to our model and train it to predict the coordinates of the fifteen key facial points.

Model Training

After creating the train and test sets, the next step is to pass our training data to the deep learning model so it learns to predict the coordinates of the key facial points. The model architecture used was:

Model: "functional_1"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 96, 96, 1)] 0
__________________________________________________________________________________________________
zero_padding2d (ZeroPadding2D) (None, 102, 102, 1) 0 input_1[0][0]
__________________________________________________________________________________________________
conv1 (Conv2D) (None, 48, 48, 64) 3200 zero_padding2d[0][0]
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization) (None, 48, 48, 64) 256 conv1[0][0]
__________________________________________________________________________________________________
activation (Activation) (None, 48, 48, 64) 0 bn_conv1[0][0]
__________________________________________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 23, 23, 64) 0 activation[0][0]
__________________________________________________________________________________________________
res_2_conv_a (Conv2D) (None, 23, 23, 64) 4160 max_pooling2d[0][0]
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D) (None, 11, 11, 64) 0 res_2_conv_a[0][0]
__________________________________________________________________________________________________
bn_2_conv_a (BatchNormalization (None, 11, 11, 64) 256 max_pooling2d_1[0][0]
__________________________________________________________________________________________________
activation_1 (Activation) (None, 11, 11, 64) 0 bn_2_conv_a[0][0]
__________________________________________________________________________________________________
res_2_conv_b (Conv2D) (None, 11, 11, 64) 36928 activation_1[0][0]
__________________________________________________________________________________________________
bn_2_conv_b (BatchNormalization (None, 11, 11, 64) 256 res_2_conv_b[0][0]
__________________________________________________________________________________________________
activation_2 (Activation) (None, 11, 11, 64) 0 bn_2_conv_b[0][0]
__________________________________________________________________________________________________
res_2_conv_copy (Conv2D) (None, 23, 23, 256) 16640 max_pooling2d[0][0]
__________________________________________________________________________________________________
res_2_conv_c (Conv2D) (None, 11, 11, 256) 16640 activation_2[0][0]
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D) (None, 11, 11, 256) 0 res_2_conv_copy[0][0]
__________________________________________________________________________________________________
bn_2_conv_c (BatchNormalization (None, 11, 11, 256) 1024 res_2_conv_c[0][0]
__________________________________________________________________________________________________
bn_2_conv_copy (BatchNormalizat (None, 11, 11, 256) 1024 max_pooling2d_2[0][0]
__________________________________________________________________________________________________
add (Add) (None, 11, 11, 256) 0 bn_2_conv_c[0][0]
bn_2_conv_copy[0][0]
__________________________________________________________________________________________________
activation_3 (Activation) (None, 11, 11, 256) 0 add[0][0]
__________________________________________________________________________________________________
res_2_identity_1_a (Conv2D) (None, 11, 11, 64) 16448 activation_3[0][0]
__________________________________________________________________________________________________
bn_2_identity_1_a (BatchNormali (None, 11, 11, 64) 256 res_2_identity_1_a[0][0]
__________________________________________________________________________________________________
activation_4 (Activation) (None, 11, 11, 64) 0 bn_2_identity_1_a[0][0]
__________________________________________________________________________________________________
res_2_identity_1_b (Conv2D) (None, 11, 11, 64) 36928 activation_4[0][0]
__________________________________________________________________________________________________
bn_2_identity_1_b (BatchNormali (None, 11, 11, 64) 256 res_2_identity_1_b[0][0]
__________________________________________________________________________________________________
activation_5 (Activation) (None, 11, 11, 64) 0 bn_2_identity_1_b[0][0]
__________________________________________________________________________________________________
res_2_identity_1_c (Conv2D) (None, 11, 11, 256) 16640 activation_5[0][0]
__________________________________________________________________________________________________
bn_2_identity_1_c (BatchNormali (None, 11, 11, 256) 1024 res_2_identity_1_c[0][0]
__________________________________________________________________________________________________
add_1 (Add) (None, 11, 11, 256) 0 bn_2_identity_1_c[0][0]
activation_3[0][0]
__________________________________________________________________________________________________
activation_6 (Activation) (None, 11, 11, 256) 0 add_1[0][0]
__________________________________________________________________________________________________
res_2_identity_2_a (Conv2D) (None, 11, 11, 64) 16448 activation_6[0][0]
__________________________________________________________________________________________________
bn_2_identity_2_a (BatchNormali (None, 11, 11, 64) 256 res_2_identity_2_a[0][0]
__________________________________________________________________________________________________
activation_7 (Activation) (None, 11, 11, 64) 0 bn_2_identity_2_a[0][0]
__________________________________________________________________________________________________
res_2_identity_2_b (Conv2D) (None, 11, 11, 64) 36928 activation_7[0][0]
__________________________________________________________________________________________________
bn_2_identity_2_b (BatchNormali (None, 11, 11, 64) 256 res_2_identity_2_b[0][0]
__________________________________________________________________________________________________
activation_8 (Activation) (None, 11, 11, 64) 0 bn_2_identity_2_b[0][0]
__________________________________________________________________________________________________
res_2_identity_2_c (Conv2D) (None, 11, 11, 256) 16640 activation_8[0][0]
__________________________________________________________________________________________________
bn_2_identity_2_c (BatchNormali (None, 11, 11, 256) 1024 res_2_identity_2_c[0][0]
__________________________________________________________________________________________________
add_2 (Add) (None, 11, 11, 256) 0 bn_2_identity_2_c[0][0]
activation_6[0][0]
__________________________________________________________________________________________________
activation_9 (Activation) (None, 11, 11, 256) 0 add_2[0][0]
__________________________________________________________________________________________________
res_3_conv_a (Conv2D) (None, 11, 11, 128) 32896 activation_9[0][0]
__________________________________________________________________________________________________
max_pooling2d_3 (MaxPooling2D) (None, 5, 5, 128) 0 res_3_conv_a[0][0]
__________________________________________________________________________________________________
bn_3_conv_a (BatchNormalization (None, 5, 5, 128) 512 max_pooling2d_3[0][0]
__________________________________________________________________________________________________
activation_10 (Activation) (None, 5, 5, 128) 0 bn_3_conv_a[0][0]
__________________________________________________________________________________________________
res_3_conv_b (Conv2D) (None, 5, 5, 128) 147584 activation_10[0][0]
__________________________________________________________________________________________________
bn_3_conv_b (BatchNormalization (None, 5, 5, 128) 512 res_3_conv_b[0][0]
__________________________________________________________________________________________________
activation_11 (Activation) (None, 5, 5, 128) 0 bn_3_conv_b[0][0]
__________________________________________________________________________________________________
res_3_conv_copy (Conv2D) (None, 11, 11, 512) 131584 activation_9[0][0]
__________________________________________________________________________________________________
res_3_conv_c (Conv2D) (None, 5, 5, 512) 66048 activation_11[0][0]
__________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D) (None, 5, 5, 512) 0 res_3_conv_copy[0][0]
__________________________________________________________________________________________________
bn_3_conv_c (BatchNormalization (None, 5, 5, 512) 2048 res_3_conv_c[0][0]
__________________________________________________________________________________________________
bn_3_conv_copy (BatchNormalizat (None, 5, 5, 512) 2048 max_pooling2d_4[0][0]
__________________________________________________________________________________________________
add_3 (Add) (None, 5, 5, 512) 0 bn_3_conv_c[0][0]
bn_3_conv_copy[0][0]
__________________________________________________________________________________________________
activation_12 (Activation) (None, 5, 5, 512) 0 add_3[0][0]
__________________________________________________________________________________________________
res_3_identity_1_a (Conv2D) (None, 5, 5, 128) 65664 activation_12[0][0]
__________________________________________________________________________________________________
bn_3_identity_1_a (BatchNormali (None, 5, 5, 128) 512 res_3_identity_1_a[0][0]
__________________________________________________________________________________________________
activation_13 (Activation) (None, 5, 5, 128) 0 bn_3_identity_1_a[0][0]
__________________________________________________________________________________________________
res_3_identity_1_b (Conv2D) (None, 5, 5, 128) 147584 activation_13[0][0]
__________________________________________________________________________________________________
bn_3_identity_1_b (BatchNormali (None, 5, 5, 128) 512 res_3_identity_1_b[0][0]
__________________________________________________________________________________________________
activation_14 (Activation) (None, 5, 5, 128) 0 bn_3_identity_1_b[0][0]
__________________________________________________________________________________________________
res_3_identity_1_c (Conv2D) (None, 5, 5, 512) 66048 activation_14[0][0]
__________________________________________________________________________________________________
bn_3_identity_1_c (BatchNormali (None, 5, 5, 512) 2048 res_3_identity_1_c[0][0]
__________________________________________________________________________________________________
add_4 (Add) (None, 5, 5, 512) 0 bn_3_identity_1_c[0][0]
activation_12[0][0]
__________________________________________________________________________________________________
activation_15 (Activation) (None, 5, 5, 512) 0 add_4[0][0]
__________________________________________________________________________________________________
res_3_identity_2_a (Conv2D) (None, 5, 5, 128) 65664 activation_15[0][0]
__________________________________________________________________________________________________
bn_3_identity_2_a (BatchNormali (None, 5, 5, 128) 512 res_3_identity_2_a[0][0]
__________________________________________________________________________________________________
activation_16 (Activation) (None, 5, 5, 128) 0 bn_3_identity_2_a[0][0]
__________________________________________________________________________________________________
res_3_identity_2_b (Conv2D) (None, 5, 5, 128) 147584 activation_16[0][0]
__________________________________________________________________________________________________
bn_3_identity_2_b (BatchNormali (None, 5, 5, 128) 512 res_3_identity_2_b[0][0]
__________________________________________________________________________________________________
activation_17 (Activation) (None, 5, 5, 128) 0 bn_3_identity_2_b[0][0]
__________________________________________________________________________________________________
res_3_identity_2_c (Conv2D) (None, 5, 5, 512) 66048 activation_17[0][0]
__________________________________________________________________________________________________
bn_3_identity_2_c (BatchNormali (None, 5, 5, 512) 2048 res_3_identity_2_c[0][0]
__________________________________________________________________________________________________
add_5 (Add) (None, 5, 5, 512) 0 bn_3_identity_2_c[0][0]
activation_15[0][0]
__________________________________________________________________________________________________
activation_18 (Activation) (None, 5, 5, 512) 0 add_5[0][0]
__________________________________________________________________________________________________
Averagea_Pooling (AveragePoolin (None, 2, 2, 512) 0 activation_18[0][0]
__________________________________________________________________________________________________
flatten (Flatten) (None, 2048) 0 Averagea_Pooling[0][0]
__________________________________________________________________________________________________
dense (Dense) (None, 4096) 8392704 flatten[0][0]
__________________________________________________________________________________________________
dropout (Dropout) (None, 4096) 0 dense[0][0]
__________________________________________________________________________________________________
dense_1 (Dense) (None, 2048) 8390656 dropout[0][0]
__________________________________________________________________________________________________
dropout_1 (Dropout) (None, 2048) 0 dense_1[0][0]
__________________________________________________________________________________________________
dense_2 (Dense) (None, 30) 61470 dropout_1[0][0]
==================================================================================================
Total params: 18,016,286
Trainable params: 18,007,710
Non-trainable params: 8,576
__________________________________________________________________________________________________

The loss function used was MSE and the optimizer was Adam. For training the model we used the Keras API with TensorFlow as the backend. Here are the training plot and some of the test results:
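The training setup can be sketched as follows. The tiny Dense network here is a hypothetical stand-in for the full ResNet-style architecture above, kept small so the snippet runs quickly; the loss and optimizer match the article:

```python
import numpy as np
from tensorflow import keras

# Hypothetical stand-in for the full model: 96x96x1 grayscale input,
# 30 outputs (x, y coordinates for the fifteen key facial points).
inputs = keras.Input(shape=(96, 96, 1))
x = keras.layers.Flatten()(inputs)
x = keras.layers.Dense(64, activation="relu")(x)
outputs = keras.layers.Dense(30)(x)
model = keras.Model(inputs, outputs)

# Same choices as in the article: MSE loss with the Adam optimizer.
model.compile(optimizer="adam", loss="mse")

# Random data stands in for the real train set in this sketch.
X_train = np.random.rand(8, 96, 96, 1).astype("float32")
y_train = np.random.rand(8, 30).astype("float32")
history = model.fit(X_train, y_train, epochs=1, batch_size=4, verbose=0)
```

With the real architecture and data, the same `compile`/`fit` calls apply; only the model definition and the number of epochs change.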

Introduction to DeepC

DeepC, a compiler and inference framework, is designed to enable and perform deep learning on small form-factor devices such as microcontrollers, eFPGAs, and CPUs, and on other embedded devices like Raspberry Pi, Odroid, Arduino, SparkFun Edge, RISC-V boards, mobile phones, and x86 and ARM laptops, among others.

DeepC also offers an ahead-of-time compiler that produces an optimized executable based on the LLVM compiler toolchain, specialized for deep neural networks, with ONNX as the front end.

Compilation with DeepC

After training, the model was saved in the H5 format using Keras, as it conveniently stores the weights and the model configuration in a single file.
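Saving and reloading in H5 is a one-liner each way; a minimal sketch, where the model is a trivial placeholder and the temp-file path is chosen just for this example:

```python
import os
import tempfile
from tensorflow import keras

# Trivial placeholder model, just to demonstrate saving.
model = keras.Sequential([keras.Input(shape=(9216,)), keras.layers.Dense(30)])

# The .h5 extension tells Keras to use the single-file HDF5 format,
# which bundles the architecture and the weights together.
path = os.path.join(tempfile.mkdtemp(), "keypoints_model.h5")
model.save(path)

# Reloading restores both the configuration and the weights.
reloaded = keras.models.load_model(path)
```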

After saving the file in H5 format, we can compile our model with the DeepC compiler, which comes as part of the cAInvas platform; it converts the saved model into a format that can easily be deployed to edge devices. All of this can be done with a single simple command:

https://gist.github.com/Gunjan933/8bb521ccfe37d0a5b1573fe91f82feb2

And that’s it, our Key Facial Point Detection Model is trained and ready for deployment on edge devices.

Face Recognition

For face recognition we are going to use a Python library called dlib. dlib provides models that were trained to find key facial points on a very large dataset with a more complex neural network architecture. Having been trained extensively, they can be used for various face applications, face recognition among them. To see how to use dlib for face recognition, visit the cAInvas notebook linked at the bottom.
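dlib's face recognition pipeline boils down to computing a 128-dimensional embedding per face (via `dlib.face_recognition_model_v1`'s `compute_face_descriptor`) and comparing embeddings by Euclidean distance, with roughly 0.6 as the commonly used threshold. A sketch of the comparison step, with synthetic vectors standing in for real dlib descriptors:

```python
import numpy as np

def is_same_person(desc_a, desc_b, threshold=0.6):
    """Compare two 128-d face descriptors by Euclidean distance.

    dlib's convention: distances below ~0.6 usually mean the same person.
    """
    dist = float(np.linalg.norm(np.asarray(desc_a) - np.asarray(desc_b)))
    return dist < threshold

# Synthetic descriptors standing in for real dlib embeddings:
anchor = np.zeros(128)
same = anchor + 0.01       # distance ~0.11 -> same person
different = anchor + 0.1   # distance ~1.13 -> different person

print(is_same_person(anchor, same))       # True
print(is_same_person(anchor, different))  # False
```

In the real pipeline, each descriptor would be produced from a detected face (via dlib's frontal face detector and shape predictor) before this comparison step.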

Link for the cAInvas Notebook : https://cainvas.ai-tech.systems/use-cases/face-recognition-app/

Credit : Ashish Arya