Fake News Classifier using Bidirectional LSTM

Building a Deep Learning Model to identify unreliable news articles

Photo by Kait Cooper on Dribbble

What is Fake news?

Fake news is false or misleading information presented as news. It often aims to damage the reputation of a person or entity or make money through advertising revenue.

However, the term has no fixed definition. It has been applied more broadly to any type of false information, including information spread unintentionally or unconsciously, and high-profile individuals have also used it to describe any news unfavorable to their personal views.

Aim

To develop a fake news classifier with a Bidirectional Long Short-Term Memory (LSTM) network, using the Python programming language and Keras on the Cainvas platform.

Prerequisites

Before getting started, you should have a good understanding of:

  1. Python programming language
  2. Keras — Deep learning library

Dataset

We are going to use the train.csv dataset to train the model and then make predictions for the test.csv dataset.

You can download these CSV files from Kaggle:

URL: https://www.kaggle.com/c/fake-news/data

Importing all the required libraries

Let’s import all the required libraries:

Load and Process Data

Let’s load our data file train.csv using pandas.

Output:

Figure: Load and Process Data

Drop the NaN values:

Load X and y with the independent and dependent features:
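The three steps above (load the CSV, drop rows with missing values, separate the features from the label) can be sketched as follows. This is a minimal illustration, not the notebook's own code: the inline sample stands in for train.csv, and the column names (`id`, `title`, `author`, `text`, `label`) are an assumption about the dataset layout.

```python
import io

import pandas as pd

# Tiny stand-in for train.csv; the column names are assumed, not taken from the article.
sample_csv = io.StringIO(
    "id,title,author,text,label\n"
    "0,Budget vote delayed,Jane Roe,Lawmakers postponed the vote.,0\n"
    "1,Miracle cure found,,Doctors hate this trick.,1\n"
    "2,,John Doe,Text without a headline.,1\n"
    "3,Storm nears coast,Ann Lee,Residents were told to prepare.,0\n"
)

df = pd.read_csv(sample_csv)   # with the real data: pd.read_csv("train.csv")
df = df.dropna()               # drop every row that has a missing value

X = df.drop(columns="label")   # independent features
y = df["label"]                # dependent feature (the target)

print(X.shape, y.tolist())
```

Rows 1 and 2 of the sample each have an empty field, so only two rows survive `dropna()`.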

One-hot Representation:

Vocabulary size:
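The vocabulary size caps how many distinct word indices the model can see. One common way to turn words into indices under such a cap is to hash each word into that range. The sketch below shows the idea in plain Python; the value `5000`, the helper names, and the use of MD5 are all assumptions made for illustration, not the notebook's settings.

```python
import hashlib

voc_size = 5000  # assumed vocabulary size; the article's value is not shown


def word_to_index(word: str, vocab_size: int) -> int:
    """Hash a word to a stable integer in the range [1, vocab_size - 1]."""
    digest = hashlib.md5(word.encode("utf-8")).hexdigest()
    return int(digest, 16) % (vocab_size - 1) + 1  # index 0 stays free for padding


def encode(sentence: str, vocab_size: int) -> list[int]:
    """Turn a whitespace-separated sentence into a list of word indices."""
    return [word_to_index(w, vocab_size) for w in sentence.lower().split()]


encoded = encode("storm nears coast as storm grows", voc_size)
print(encoded)
```

The same word always maps to the same index, so both occurrences of "storm" get one value. Two different words can collide on the same index; a larger vocabulary size makes that less likely.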

Getting a copy of Independent features:

Downloading stopwords:

We use NLTK’s stopwords corpus to remove stopwords from our data, NumPy for array operations, and pandas to process the data.

Dataset Preprocessing:
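A minimal sketch of what such a preprocessing pass might look like: keep letters only, lowercase, and drop stopwords. To stay runnable without any downloads, a tiny hand-written stopword set stands in for NLTK's list here; `STOPWORDS` and `clean_text` are hypothetical names, and the sample texts are made up.

```python
import re

# Tiny stand-in stopword set so the sketch runs without downloads;
# the article uses NLTK's stopword list instead.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "in", "on", "and"}


def clean_text(text: str) -> str:
    """Keep letters only, lowercase, and drop stopwords."""
    letters_only = re.sub(r"[^a-zA-Z]", " ", text)
    words = letters_only.lower().split()
    return " ".join(w for w in words if w not in STOPWORDS)


texts = ["The Storm is Nearing the Coast!", "5 Signs of an Economic Slowdown"]
corpus = [clean_text(t) for t in texts]
print(corpus)  # ['storm nearing coast', 'signs economic slowdown']
```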

Output:

Figure: Preprocessing

Output:

Figure: Preprocessing

Embedding Representation:

For background on embeddings, refer to: https://towardsdatascience.com/neural-network-embeddings-explained-4d028e6f0526
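An embedding layer expects every input sequence to have the same length, so the encoded sentences are usually padded first. Keras provides a `pad_sequences` utility for this; the sketch below does the same left-padding by hand in NumPy to make the idea visible. The length `6`, the helper `pad_left`, and the sample indices are assumptions for illustration.

```python
import numpy as np

sent_length = 6  # assumed fixed length; pick it from your own data


def pad_left(sequences: list[list[int]], length: int) -> np.ndarray:
    """Left-pad with zeros; long sequences keep only their last `length` indices."""
    padded = np.zeros((len(sequences), length), dtype="int32")
    for row, seq in enumerate(sequences):
        tail = seq[-length:]  # truncate from the front if too long
        if tail:
            padded[row, length - len(tail):] = tail
    return padded


encoded_docs = [[12, 7, 431], [5, 88, 19, 230, 4, 77, 301, 9]]
embedded_docs = pad_left(encoded_docs, sent_length)
print(embedded_docs)
```

The short sentence gains three leading zeros, and the long one loses its first two indices.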

Output:

Figure: Embedding Representation

Building the model:
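A minimal sketch of a Bidirectional LSTM classifier in Keras, assuming a binary label and integer-encoded, padded inputs. Every size below (vocabulary size, embedding dimension, LSTM units, dropout rate) is an assumed value chosen for illustration, not the notebook's configuration.

```python
from tensorflow.keras.layers import LSTM, Bidirectional, Dense, Dropout, Embedding
from tensorflow.keras.models import Sequential

# All sizes below are assumed values for illustration, not the article's settings.
voc_size = 5000      # number of distinct word indices
embedding_dim = 40   # length of each learned word vector
lstm_units = 100     # units per direction in the LSTM

model = Sequential([
    Embedding(voc_size, embedding_dim),  # word index -> dense vector
    Bidirectional(LSTM(lstm_units)),     # read the sequence in both directions
    Dropout(0.3),                        # regularization
    Dense(1, activation="sigmoid"),      # probability of the positive class
])
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
```

The `Bidirectional` wrapper runs one LSTM forward and one backward over the sequence and concatenates their outputs, so the dense layer sees context from both ends of the text.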

Output:

Figure: Building Model

Train-test split:

Here we use the sklearn.model_selection module to split the data into training data and test data.
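A short sketch of such a split with `train_test_split`. The arrays are placeholders for the padded sequences and labels, and `test_size` and `random_state` are assumed values.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 20 padded sequences of length 6 and their labels.
X_final = np.arange(120).reshape(20, 6)
y_final = np.array([0, 1] * 10)

# test_size and random_state are assumed values, chosen only for illustration.
X_train, X_test, y_train, y_test = train_test_split(
    X_final, y_final, test_size=0.25, random_state=42
)
print(X_train.shape, X_test.shape)  # (15, 6) (5, 6)
```

Fixing `random_state` makes the split reproducible between runs.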

Training Model:
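A minimal, self-contained sketch of a training call. To keep it quick, it fits a deliberately tiny model on random synthetic data, so the numbers it produces mean nothing; the point is the shape of the `fit` call. The epoch count and batch size are assumed values.

```python
import numpy as np
from tensorflow.keras.layers import LSTM, Bidirectional, Dense, Embedding
from tensorflow.keras.models import Sequential

rng = np.random.default_rng(0)
# Synthetic stand-ins for the padded sequences and labels (not real news data).
X_train = rng.integers(1, 50, size=(64, 10))
y_train = rng.integers(0, 2, size=(64,))
X_val = rng.integers(1, 50, size=(16, 10))
y_val = rng.integers(0, 2, size=(16,))

model = Sequential([
    Embedding(50, 8),
    Bidirectional(LSTM(4)),
    Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

# epochs and batch_size are assumed values; tune them for the real dataset.
history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=2, batch_size=16, verbose=0,
)
print(sorted(history.history))
```

Passing `validation_data` makes Keras report loss and accuracy on held-out data after every epoch, which helps spot overfitting early.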

Output:

Figure: Training Model

Predicting and Heat Map:
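A sigmoid output is a probability, so it has to be thresholded before it can be compared with the true labels. The sketch below uses made-up values in place of `y_test` and the model's predictions, with an assumed cut-off of 0.5, and builds the confusion matrix that a heat map would visualize.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up labels and sigmoid outputs, standing in for y_test and model.predict(X_test).
y_true = np.array([0, 0, 1, 1, 1, 0])
y_prob = np.array([0.1, 0.7, 0.8, 0.4, 0.9, 0.2])

y_pred = (y_prob > 0.5).astype(int)    # 0.5 threshold is an assumed cut-off
cm = confusion_matrix(y_true, y_pred)  # rows: true class, columns: predicted class
print(cm)
# [[2 1]
#  [1 2]]
```

The diagonal holds the correct predictions; the off-diagonal cells are the false positives (top right) and false negatives (bottom left).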

Output:

Figure: Prediction and Heatmap

Accuracy of the Model:
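Accuracy and the per-class precision, recall, and support can be computed with scikit-learn. The labels below are made up purely to show the calls; they are not the model's results.

```python
from sklearn.metrics import accuracy_score, classification_report

# Made-up labels for illustration only.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 1, 0, 0, 0]

acc = accuracy_score(y_true, y_pred)            # fraction of correct predictions
report = classification_report(y_true, y_pred)  # precision, recall, F1, support per class
print(f"accuracy = {acc:.2f}")  # accuracy = 0.75
print(report)
```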

Output:

Figure: Accuracy of Model

Output:

Figure: precision, recall, support

Loading the test data:

Output:

Figure: Test Data

Making Predictions for test data:

Joining the test data and predicted labels:
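A small sketch of attaching predicted labels to the test rows with pandas. The ids, the probabilities, and the 0.5 threshold are all made-up stand-ins; in the real workflow the probabilities would come from the trained model.

```python
import numpy as np
import pandas as pd

# Stand-ins: a few test-set ids and the model's sigmoid outputs for them.
test_df = pd.DataFrame({"id": [20800, 20801, 20802]})
y_prob = np.array([[0.91], [0.12], [0.55]])  # shape (n, 1), like a Dense(1) output

# Threshold (assumed 0.5), flatten to 1-D, and attach as a new column.
test_df["label"] = (y_prob > 0.5).astype(int).ravel()
print(test_df)
```

Flattening with `ravel()` matters because a `Dense(1)` layer returns a column of shape `(n, 1)`, while a DataFrame column is one-dimensional.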

Output:

Figure: Test Data and Predicted Labels

URL to access the Notebook: https://cainvas.ai-tech.systems/use-cases/fake-news-classification-app-using-lstm/

Conclusion

We’ve trained our simple Bidirectional LSTM model on a fake news dataset and got an accuracy of 90%. There are many other machine learning models that perform much better, but let’s admit it: classical machine learning models require a lot of feature engineering and data wrangling. We use a deep learning model so that the model can figure out the useful features on its own.

Credit: Om Chaithanya V
