Cardiovascular diseases

Cardiovascular diseases
Photo by Mat Voyce on Dribbble

Are the most common cause of deaths globally, taking an estimated 17.9 million lives each year, which accounts for 31% of all deaths worldwide. Heart failure is a common event caused by Cardiovascular diseases.

It is characterized by the heart’s inability to pump an adequate supply of blood to the body. Without sufficient blood flow, all major body functions are disrupted. Heart failure is a condition or a collection of symptoms that weaken the heart.







About the data:

Age: Age of the patient

anaemia: If the patient had the hemoglobin below the normal range creatinine phosphokinase: The level of the creatine phosphokinase in the blood in mcg/L

diabetes: If the patient was diabetic ejection fraction: Ejection fraction is a measurement of how much blood the left ventricle pumps out with each contraction

high_blood_pressure: If the patient had hypertension

platelets: Platelet count of blood in kilo platelets/mL

serum_creatinine: The level of serum creatinine in the blood in mg/dL

serum_sodium: The level of serum sodium in the blood in mEq/L

sex: The sex of the patient

smoking: If the patient smokes actively or ever did in past

time: It is the time of the patient’s follow-up visit for the disease in months

DEATH_EVENT: If the patient deceased during the follow-up period


Steps in data analysis and visualization:

We begin our analysis by plotting a count plot of the targer attribute. A correlation matrix of the various attributes to examine the feature importance.

plotting a count plot of the targer attribute

corelation matrix

Notable points:

Time of the patient’s follow-up visit for the disease is crucial in as initial diagnosis with cardiovascular issue and treatment reduces the chances of any fatality. It holds and inverse relation.

Ejection fraction is the second most important feature. It is quite expected as it is basically the efficiency of the heart.

Age of the patient is the third most correlated feature. Clearly as heart’s functioning declines with ageing

Next, we will examine the count plot of age.

examine the count plot of age


swarm plot


swarm plot


Boxen and swarm plot
Boxen and swarm plot


Boxen and swarm plot


Boxen and swarm plot
Boxen and swarm plot


I spotted outliers on our dataset. I didn’t remove them yet as it may lead to overfitting. Though we may end up with better statistics. In this case, with medical data, the outliers may be an important deciding factor.

Next, we examine the kde plot of time and age as they both are significant features.

kde plot of time and age



Steps involved in Data Preprocessing

  • Dropping the outliers based on data analysis
  • Assigning values to features as X and target as y
  • Perform the scaling of the features
  • Split test and training sets

Data Preprocessing


In this project, we build an artificial neural network.

The following steps are involved in the model building

  • Initialising the ANN
  • Defining by adding layers
  • Compiling the ANN
  • Train the ANN

Model Building


Training and validation loss

Plotting training and validation accuracy over epochs

Training and validation accuracy


Concluding the model with:

  • Testing on the test set
  • Evaluating the confusion matrix
  • Evaluating the classification report




Saving the model

deep CC

deep CC


Notebook Link: Here

Credit: Hrithikgupta

Also Read: Heart Disease Prediction using Neural Networksfake watch