Classification of Organic Compounds

In this article we build a simple artificial neural network with TensorFlow and Keras that classifies organic compounds as either musk or non-musk.

Aim

To develop a deep learning model that classifies organic compounds as either musk or non-musk, using the Python programming language and deep learning libraries.

Prerequisites

Before getting started, you should have a good understanding of:

  1. The Python programming language
  2. Deep learning libraries (TensorFlow, Keras)

Dataset

Link to download the dataset:

https://datahub.io/machine-learning/musk

Get the data

https://gist.github.com/omchaithanyav/b4521240dcf9d8d518142794c1ea6176

out

--2021-07-06 11:17:20--  https://cainvas-static.s3.amazonaws.com/media/user_data/vomchaithany/musk.csv
Resolving cainvas-static.s3.amazonaws.com (cainvas-static.s3.amazonaws.com)... 52.219.160.35
Connecting to cainvas-static.s3.amazonaws.com (cainvas-static.s3.amazonaws.com)|52.219.160.35|:443... connected.
HTTP request sent, awaiting response... 304 Not Modified
File ‘musk.csv’ not modified on server. Omitting download.
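
The log above comes from a shell download. For readers who prefer to stay in Python, here is a minimal sketch of the same step. The URL and the filename `musk.csv` are taken from the log; the function name `fetch_dataset` is hypothetical, and the existence check is a simplification of the "not modified on server" behavior shown above.

```python
import os
import urllib.request

# URL and local filename as they appear in the download log above.
DATA_URL = "https://cainvas-static.s3.amazonaws.com/media/user_data/vomchaithany/musk.csv"
LOCAL_PATH = "musk.csv"


def fetch_dataset(url: str = DATA_URL, path: str = LOCAL_PATH) -> str:
    """Download `url` to `path` unless a file already exists there."""
    if os.path.exists(path):
        # Simplification: we only check that the file exists,
        # we do not compare timestamps with the server.
        return path
    urllib.request.urlretrieve(url, path)
    return path
```

Calling `fetch_dataset()` once at the top of the notebook should leave `musk.csv` in the working directory, provided the URL is still reachable.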

Import the required libraries

https://gist.github.com/omchaithanyav/3e159ea5b2106cdeade45ee2702de653

Load the data

https://gist.github.com/omchaithanyav/3ab1c5fa8774e965e0217ce6dfd66594
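
Since the gist is not reproduced here, the following is a rough sketch of how the CSV could be loaded with pandas. It assumes the file was saved as `musk.csv` (the name in the download log); the helper name `load_musk` is hypothetical.

```python
import pandas as pd


def load_musk(path: str = "musk.csv") -> pd.DataFrame:
    """Read the CSV into a DataFrame and print a quick summary."""
    df = pd.read_csv(path)
    print(df.shape)                   # (rows, columns)
    print(df.dtypes.value_counts())   # how many numeric vs. text columns
    return df
```

Looking at `df.head()` and the column dtypes right after loading is a cheap way to spot identifier or text columns that must not be fed to the network.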

Preprocessing the Data

https://gist.github.com/omchaithanyav/122ac88559094628b2c9b65c0f2b1960
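
One plausible preprocessing step is to separate the numeric features from the label and drop the identifier columns. The sketch below assumes the column names `ID`, `molecule_name`, `conformation_name` and `class`; these are assumptions about the CSV, not something shown in this article, so check them against `df.columns` first. Any scaling the original notebook may have applied is not shown and is not included here.

```python
import pandas as pd

# Assumed column names; verify against df.columns before relying on them.
ID_COLUMNS = ["ID", "molecule_name", "conformation_name"]
LABEL_COLUMN = "class"


def split_features_and_label(df: pd.DataFrame):
    """Return (X, y): a float feature matrix and a 0/1 label vector."""
    # errors="ignore" lets this run even if an identifier column is absent.
    features = df.drop(columns=ID_COLUMNS + [LABEL_COLUMN], errors="ignore")
    X = features.to_numpy(dtype="float32")
    y = df[LABEL_COLUMN].to_numpy(dtype="int64")
    return X, y
```

If the assumptions hold, `X` ends up with 166 columns, which matches the feature count in the training shape reported in the next step.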

Split the data into training and test sets

https://gist.github.com/omchaithanyav/313ed2aa587f8fdecfb728e0cfbd6578

out

((4618, 166), (4618,))
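
The training shape above, together with the 1,980 test rows reported later in the classification report, is consistent with a 70/30 split of 6,598 rows. Here is a small sketch with scikit-learn on a synthetic array of the same size; `test_size=0.3` and `random_state=42` are assumptions, not values taken from the original notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in with the dataset's dimensions (6,598 rows = 4,618 + 1,980).
X = np.zeros((6598, 166), dtype="float32")
y = np.zeros(6598, dtype="int64")

# test_size and random_state are assumed values for illustration.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

print((X_train.shape, y_train.shape))  # ((4618, 166), (4618,))
print((X_test.shape, y_test.shape))    # ((1980, 166), (1980,))
```

Because only about 15% of the test rows belong to class 1, passing `stratify=y` on the real labels is worth considering so that both splits keep the same class ratio.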

Build, train, and save the model

https://gist.github.com/omchaithanyav/5187ccd1a2ad246610e7255d8c2077fa

out

Epoch 1/15
145/145 [==============================] - 0s 3ms/step - loss: 0.4019 - accuracy: 0.8441 - val_loss: 0.2712 - val_accuracy: 0.9157
Epoch 2/15
145/145 [==============================] - 0s 2ms/step - loss: 0.2232 - accuracy: 0.9309 - val_loss: 0.1938 - val_accuracy: 0.9444
Epoch 3/15
145/145 [==============================] - 0s 2ms/step - loss: 0.1743 - accuracy: 0.9461 - val_loss: 0.1602 - val_accuracy: 0.9480
Epoch 4/15
145/145 [==============================] - 0s 2ms/step - loss: 0.1454 - accuracy: 0.9530 - val_loss: 0.1333 - val_accuracy: 0.9601
Epoch 5/15
145/145 [==============================] - 0s 2ms/step - loss: 0.1238 - accuracy: 0.9591 - val_loss: 0.1167 - val_accuracy: 0.9641
Epoch 6/15
145/145 [==============================] - 0s 2ms/step - loss: 0.1095 - accuracy: 0.9632 - val_loss: 0.1033 - val_accuracy: 0.9682
Epoch 7/15
145/145 [==============================] - 0s 2ms/step - loss: 0.0946 - accuracy: 0.9693 - val_loss: 0.0946 - val_accuracy: 0.9646
Epoch 8/15
145/145 [==============================] - 0s 2ms/step - loss: 0.0848 - accuracy: 0.9725 - val_loss: 0.0859 - val_accuracy: 0.9717
Epoch 9/15
145/145 [==============================] - 0s 2ms/step - loss: 0.0758 - accuracy: 0.9766 - val_loss: 0.0798 - val_accuracy: 0.9732
Epoch 10/15
145/145 [==============================] - 0s 2ms/step - loss: 0.0700 - accuracy: 0.9783 - val_loss: 0.0737 - val_accuracy: 0.9737
Epoch 11/15
145/145 [==============================] - 0s 2ms/step - loss: 0.0611 - accuracy: 0.9831 - val_loss: 0.0670 - val_accuracy: 0.9783
Epoch 12/15
145/145 [==============================] - 0s 2ms/step - loss: 0.0548 - accuracy: 0.9842 - val_loss: 0.0622 - val_accuracy: 0.9803
Epoch 13/15
145/145 [==============================] - 0s 2ms/step - loss: 0.0505 - accuracy: 0.9857 - val_loss: 0.0612 - val_accuracy: 0.9798
Epoch 14/15
145/145 [==============================] - 0s 2ms/step - loss: 0.0452 - accuracy: 0.9861 - val_loss: 0.0564 - val_accuracy: 0.9823
Epoch 15/15
145/145 [==============================] - 0s 2ms/step - loss: 0.0404 - accuracy: 0.9883 - val_loss: 0.0546 - val_accuracy: 0.9828
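
The original model definition lives in the gist, so the sketch below only illustrates the general shape of such a network. The hidden layer sizes, the Adam optimizer, the binary cross-entropy loss and the filename `musk_model.h5` are all assumptions; the single sigmoid output is consistent with the probabilities printed in the Predictions section. Random data stands in for the real splits so the block runs on its own.

```python
import numpy as np
import tensorflow as tf


def build_model(n_features: int = 166) -> tf.keras.Model:
    """Small fully connected binary classifier (layer sizes are illustrative)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(32, activation="relu"),
        # One sigmoid unit: an estimated probability of the positive class.
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


# Tiny random stand-in data, for illustration only.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(64, 166)).astype("float32")
y_demo = rng.integers(0, 2, size=(64, 1)).astype("float32")

model = build_model()
history = model.fit(
    X_demo, y_demo, validation_split=0.25, epochs=2, batch_size=32, verbose=0
)
model.save("musk_model.h5")  # hypothetical filename
```

With the real splits the call would look like `model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=15)`. The 145 steps per epoch in the log are consistent with the Keras default batch size of 32 on 4,618 training rows.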

Graphs

Loss vs validation loss

https://gist.github.com/omchaithanyav/5c21f0492cf249d01e76f66f4e9aad53

Accuracy vs validation accuracy

https://gist.github.com/omchaithanyav/e2f574809e4dfdf87303671cd8d0b642
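
Both curves, loss and accuracy, can be drawn from the dictionary that Keras stores in `history.history`. The helper below is a sketch with a hypothetical name, `plot_history`; to keep it runnable without training, it is fed the first three epochs copied from the training log above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt


def plot_history(history: dict, metric: str):
    """Plot a training metric and its validation counterpart per epoch."""
    epochs = range(1, len(history[metric]) + 1)
    fig, ax = plt.subplots()
    ax.plot(epochs, history[metric], label=f"training {metric}")
    ax.plot(epochs, history[f"val_{metric}"], label=f"validation {metric}")
    ax.set_xlabel("epoch")
    ax.set_ylabel(metric)
    ax.legend()
    return fig


# First three epochs copied from the training log above.
history = {
    "loss": [0.4019, 0.2232, 0.1743],
    "val_loss": [0.2712, 0.1938, 0.1602],
    "accuracy": [0.8441, 0.9309, 0.9461],
    "val_accuracy": [0.9157, 0.9444, 0.9480],
}
loss_fig = plot_history(history, "loss")
acc_fig = plot_history(history, "accuracy")
```

When the validation curve tracks the training curve closely, as it does in the log, there is little sign of overfitting within these 15 epochs.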

Accuracy of our model

https://gist.github.com/omchaithanyav/232584f073b860a4fe7ecc81de3c854c

out

62/62 [==============================] - 0s 1ms/step - loss: 0.0546 - accuracy: 0.9828
[0.0546199269592762, 0.9828282594680786]
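
The two numbers in the list are the loss followed by the accuracy, matching the progress line above them. To make clear what they measure, here is a NumPy sketch of both quantities. It assumes the model was compiled with binary cross-entropy, which is not shown in this article, and it uses made-up labels and probabilities.

```python
import numpy as np


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    y_true = np.asarray(y_true, dtype="float64")
    y_prob = np.clip(np.asarray(y_prob, dtype="float64"), eps, 1 - eps)
    losses = y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob)
    return float(-np.mean(losses))


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction equals the label."""
    y_pred = (np.asarray(y_prob) > threshold).astype(int)
    return float(np.mean(y_pred == np.asarray(y_true)))


# Made-up labels and probabilities, for illustration only.
y_true = [0, 1, 0, 1]
y_prob = [0.1, 0.8, 0.6, 0.9]
print([binary_crossentropy(y_true, y_prob), accuracy(y_true, y_prob)])
```

In this toy example three of the four thresholded predictions are correct, so the accuracy is 0.75, while the loss also penalizes how confident the one wrong prediction was.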

Predictions

https://gist.github.com/omchaithanyav/39d5a22f0fd3763693e2843b88d6b075

out

array([[1.4935225e-03],
       [6.3299501e-01],
       [3.2852648e-03],
       [7.9143688e-04],
       [2.2959751e-04]], dtype=float32)
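
These are sigmoid outputs, so each value can be read as an estimated probability of class 1. Turning them into 0/1 labels needs a cutoff. The sketch below applies the conventional 0.5 threshold to the five values printed above; the threshold used in the original notebook is not shown, so 0.5 is an assumption.

```python
import numpy as np

# The five probabilities printed above.
probabilities = np.array(
    [[1.4935225e-03], [6.3299501e-01], [3.2852648e-03], [7.9143688e-04], [2.2959751e-04]],
    dtype="float32",
)

# 0.5 is the conventional cutoff (an assumption here).
labels = (probabilities > 0.5).astype(int).ravel().tolist()
print(labels)  # [0, 1, 0, 0, 0]
```

Only the second value exceeds 0.5, so it is the only one of these five mapped to class 1.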

https://gist.github.com/omchaithanyav/103d89899bae0388a3b2b7a20c25a72e

out

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]

https://gist.github.com/omchaithanyav/2e08119289ad55a430c14bc1fd26b5d4

out

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0])

Here we can see that, for these 15 samples, the predicted labels match the actual labels.

Classification report and heat map

https://gist.github.com/omchaithanyav/4a8bd74fbb74bf66a3cbe14123ce73d8

out

              precision    recall  f1-score   support

           0       0.99      0.99      0.99      1673
           1       0.94      0.95      0.95       307

    accuracy                           0.98      1980
   macro avg       0.96      0.97      0.97      1980
weighted avg       0.98      0.98      0.98      1980
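
A report in this layout is what scikit-learn's `classification_report` prints. Here is a minimal sketch of the call on made-up labels; with the real data, `y_true` would be the test labels and `y_pred` the thresholded predictions.

```python
from sklearn.metrics import classification_report, confusion_matrix

# Made-up labels for illustration; substitute the real test labels and predictions.
y_true = [0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 0, 1, 1, 1, 0, 0]

print(classification_report(y_true, y_pred))

# Rows are actual classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)
print(cm)
```

In the toy example the confusion matrix is `[[4, 1], [1, 2]]`: two of the three predicted positives are correct (precision 2/3) and two of the three actual positives are found (recall 2/3).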

Heat Map

https://gist.github.com/omchaithanyav/20f146148f432db8ec09f13e97953a79
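
The heat map is a picture of the confusion matrix. The sketch below draws one with plain matplotlib, which may differ from the plotting library used in the original gist. The function name `plot_confusion_heatmap`, the class names (0 as Non-Musk, 1 as Musk) and the counts are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np


def plot_confusion_heatmap(cm, class_names=("Non-Musk", "Musk")):
    """Draw a confusion matrix as an annotated heat map."""
    cm = np.asarray(cm)
    fig, ax = plt.subplots()
    im = ax.imshow(cm, cmap="Blues")
    fig.colorbar(im, ax=ax)
    ax.set_xticks(range(len(class_names)))
    ax.set_xticklabels(class_names)
    ax.set_yticks(range(len(class_names)))
    ax.set_yticklabels(class_names)
    ax.set_xlabel("predicted")
    ax.set_ylabel("actual")
    # Write each count in the middle of its cell.
    for i in range(cm.shape[0]):
        for j in range(cm.shape[1]):
            ax.text(j, i, str(cm[i, j]), ha="center", va="center")
    return fig


fig = plot_confusion_heatmap([[4, 1], [1, 2]])  # made-up counts
```

The diagonal cells hold the correct predictions, so a good classifier shows dark cells on the diagonal and light cells elsewhere.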

Conclusion

We’ve trained a simple ANN with TensorFlow and Keras to classify compounds as musk or non-musk, and it reached an accuracy of about 98% on the 1,980 test samples.

Notebook Link : Here

Credit: Om Chaithanya V