Cardiovascular diseases

Photo by Mat Voyce on Dribbble

Cardiovascular diseases are the most common cause of death globally, taking an estimated 17.9 million lives each year, which accounts for 31% of all deaths worldwide. Heart failure is a common event caused by cardiovascular diseases.

Heart failure is characterized by the heart’s inability to pump an adequate supply of blood to the body. Without sufficient blood flow, all major body functions are disrupted. It is a condition, or a collection of symptoms, that weakens the heart.

TABLE OF CONTENTS

IMPORTING LIBRARIES

LOADING DATA

DATA ANALYSIS

DATA PREPROCESSING

MODEL BUILDING

CONCLUSIONS

IMPORTING LIBRARIES

https://gist.github.com/sgsg704/374aae2b24b33339538da91f5c2d943a

LOADING DATA

https://gist.github.com/sgsg704/098a78a23d6b33e532cebb247289d813

https://gist.github.com/sgsg704/5cc8078d7fd86ef680b4d5caaa0c0c6c

About the data:

Age: Age of the patient

anaemia: If the patient’s hemoglobin was below the normal range

creatinine phosphokinase: The level of creatine phosphokinase in the blood in mcg/L

diabetes: If the patient was diabetic

ejection fraction: A measurement of how much blood the left ventricle pumps out with each contraction

high_blood_pressure: If the patient had hypertension

platelets: Platelet count of the blood in kiloplatelets/mL

serum_creatinine: The level of serum creatinine in the blood in mg/dL

serum_sodium: The level of serum sodium in the blood in mEq/L

sex: The sex of the patient

smoking: If the patient smokes actively or has ever smoked in the past

time: The time of the patient’s follow-up visit for the disease, in days

DEATH_EVENT: If the patient died during the follow-up period
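As a rough sketch of the loading step, the snippet below reads a CSV with these columns into pandas and checks its shape and missing values. Everything concrete here is an assumption: the rows are invented stand-ins rather than records from the real dataset, and the underscore-style column names are inferred from the attribute list above.

```python
import io

import pandas as pd

# Invented sample rows standing in for the real CSV file (illustration only).
SAMPLE_CSV = """age,anaemia,creatinine_phosphokinase,diabetes,ejection_fraction,high_blood_pressure,platelets,serum_creatinine,serum_sodium,sex,smoking,time,DEATH_EVENT
62,0,310,1,35,0,250000,1.2,137,1,0,90,0
71,1,120,0,25,1,198000,1.8,132,0,1,15,1
54,0,540,0,45,0,305000,0.9,140,1,1,210,0
80,1,95,1,30,1,221000,2.1,134,0,0,28,1
"""

# With a real file this would be pd.read_csv(<path to the CSV>).
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

print(df.shape)            # (rows, columns)
print(df.dtypes)           # every attribute should be numeric
missing = df.isnull().sum()
print(missing)             # missing values per column
```

Checking the shape, dtypes, and missing values first makes later surprises in scaling or training less likely.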

DATA ANALYSIS

Steps in data analysis and visualisation:

We begin our analysis by plotting a count plot of the target attribute, followed by a correlation matrix of the various attributes to examine feature importance.

https://gist.github.com/sgsg704/7df0a1276d67161d242ff2cf8d9b065c

https://gist.github.com/sgsg704/ae56f2e8c82e35f8350cb18856d623e1

Notable points:

The time of the patient’s follow-up visit is crucial, as early diagnosis and treatment of a cardiovascular issue reduce the chances of fatality. It holds an inverse relation with the death event.

Ejection fraction is the second most important feature. This is expected, as it essentially measures the efficiency of the heart.

Age of the patient is the third most correlated feature, which makes sense as the heart’s functioning declines with ageing.
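To make the ranking above concrete, here is a minimal sketch of how the numbers behind a count plot and a correlation ranking can be computed with pandas. The data is synthetic: the relationship between the features and `DEATH_EVENT` is built in by assumption to mirror the pattern described above, so the output says nothing about the real dataset.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 500

# Synthetic, standardised features (invented for illustration only).
time = rng.normal(size=n)
ejection_fraction = rng.normal(size=n)
age = rng.normal(size=n)

# Built-in assumption: risk falls with follow-up time and ejection fraction
# and rises with age, plus some noise.
risk = -2.0 * time - 1.0 * ejection_fraction + 0.5 * age + rng.normal(size=n)

df = pd.DataFrame({
    "time": time,
    "ejection_fraction": ejection_fraction,
    "age": age,
    "DEATH_EVENT": (risk > 0).astype(int),
})

# Class balance of the target: the numbers a count plot would draw.
print(df["DEATH_EVENT"].value_counts())

# Correlation of each feature with the target, ranked by absolute strength.
target_corr = df.corr()["DEATH_EVENT"].drop("DEATH_EVENT")
ranking = target_corr.abs().sort_values(ascending=False)
print(ranking)
```

Ranking by absolute value keeps strongly negative correlations, such as the one for `time`, at the top of the list instead of the bottom.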

Next, we will examine the count plot of age.

https://gist.github.com/sgsg704/406b630a55ccc6f5fc14e5ed70982088

https://gist.github.com/sgsg704/202267011b8e315b9361737083f99eb2

I spotted outliers in our dataset. I didn’t remove them yet: dropping them might give better statistics, but it may also lead to overfitting, and with medical data the outliers may be an important deciding factor.
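One common way to flag such outliers is the 1.5 × IQR rule that box plots use for their whiskers. The text does not say which method was used, so treat the rule, the helper name `iqr_outlier_mask`, and the invented ages below as assumptions of this sketch.

```python
import pandas as pd

def iqr_outlier_mask(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Return True for values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)

# Invented ages; 130 is deliberately implausible.
ages = pd.Series([45, 50, 55, 60, 60, 65, 70, 75, 130], name="age")

mask = iqr_outlier_mask(ages)
print(ages[mask])    # flagged values, kept in the data for now
```

Returning a boolean mask instead of dropping rows keeps the decision open, which suits the cautious approach described above.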

Next, we examine the KDE plots of time and age, as both are significant features.

https://gist.github.com/sgsg704/b55361572d1302d6848edb4f84ae3b58

https://gist.github.com/sgsg704/80de95ae5abc43f9f6cce1450a1e11b8
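For readers unfamiliar with KDE plots, the sketch below shows what the curve represents: an average of Gaussian bumps centred on each observation. It is written from scratch in NumPy for illustration and is not the plotting call used in the notebook; the follow-up times and the hand-picked bandwidth of 20 are invented assumptions.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Average of Gaussian bumps centred on each sample, evaluated on grid."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    z = (grid[:, None] - samples[None, :]) / bandwidth
    bumps = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return bumps.mean(axis=1) / bandwidth

# Invented follow-up times, grouped by outcome.
time_died = [10, 20, 30, 40, 60]
time_survived = [120, 150, 180, 200, 250]

grid = np.linspace(-100, 400, 1001)
dx = grid[1] - grid[0]
density_died = gaussian_kde_1d(time_died, grid, bandwidth=20.0)
density_survived = gaussian_kde_1d(time_survived, grid, bandwidth=20.0)

# Location of the highest point of each curve.
peak_died = grid[np.argmax(density_died)]
peak_survived = grid[np.argmax(density_survived)]
print(peak_died, peak_survived)
```

Plotting one density per outcome on the same axis makes it easy to see whether the two groups separate along a feature.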

DATA PREPROCESSING

Steps involved in Data Preprocessing

Dropping the outliers based on data analysis

Assigning values to features as X and target as y

Scaling the features

Splitting the data into training and test sets
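The steps above can be sketched with scikit-learn as follows. The data is synthetic, the 25% test size and the random seed are assumptions, and outlier dropping is omitted. This sketch also splits before scaling, so that the scaler is fitted on the training set only; that order is a choice of this example, made to avoid leaking test-set statistics, and may differ from the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n = 200

# Synthetic stand-in for the dataset (values invented for illustration).
df = pd.DataFrame({
    "age": rng.integers(40, 95, n),
    "ejection_fraction": rng.integers(15, 80, n),
    "time": rng.integers(4, 285, n),
    "DEATH_EVENT": rng.integers(0, 2, n),
})

# Features as X, target as y.
X = df.drop(columns="DEATH_EVENT")
y = df["DEATH_EVENT"]

# Hold out a stratified test set so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Fit the scaler on the training data only, then apply it to both splits.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.shape, X_test_scaled.shape)
```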

https://gist.github.com/sgsg704/3b512de625797199c09d17913318a388

https://gist.github.com/sgsg704/7dff0b5f11829002f7d69fad7747ea72

https://gist.github.com/sgsg704/0f03b0cc015b2b588be184361af7eba4

MODEL BUILDING

In this project, we build an artificial neural network.

The following steps are involved in building the model:

Initialising the ANN

Defining the ANN by adding layers

Compiling the ANN

Training the ANN
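To show what these four steps amount to under the hood, here is a from-scratch NumPy sketch of a small network. It is a stand-in for illustration, not the deep learning library or architecture used in the notebook: the layer sizes, learning rate, epoch count, and synthetic data are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, already-scaled features and a binary target (not the real data).
X = rng.normal(size=(200, 3))
y = (X[:, 0] - X[:, 1] + 0.5 * rng.normal(size=200) > 0).astype(float).reshape(-1, 1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Initialising and adding layers: 3 inputs -> 8 ReLU units -> 1 sigmoid unit.
W1 = rng.normal(scale=0.5, size=(3, 8))
b1 = np.zeros((1, 8))
W2 = rng.normal(scale=0.5, size=(8, 1))
b2 = np.zeros((1, 1))

def forward(inputs):
    hidden = np.maximum(0.0, inputs @ W1 + b1)   # hidden-layer activations
    prob = sigmoid(hidden @ W2 + b2)             # predicted probability of class 1
    return hidden, prob

# "Compiling": choose the loss (binary cross-entropy) and the optimiser
# (plain full-batch gradient descent with a fixed learning rate).
def binary_cross_entropy(prob, target):
    eps = 1e-12
    return float(-np.mean(target * np.log(prob + eps)
                          + (1 - target) * np.log(1 - prob + eps)))

learning_rate = 0.3
losses = []

# Training: forward pass, backpropagation, weight update.
for epoch in range(500):
    hidden, prob = forward(X)
    losses.append(binary_cross_entropy(prob, y))
    d_out = (prob - y) / len(X)                  # gradient at the output pre-activation
    dW2 = hidden.T @ d_out
    db2 = d_out.sum(axis=0, keepdims=True)
    d_hidden = (d_out @ W2.T) * (hidden > 0)     # back through the ReLU
    dW1 = X.T @ d_hidden
    db1 = d_hidden.sum(axis=0, keepdims=True)
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2

_, prob = forward(X)
accuracy = float(((prob > 0.5) == (y > 0.5)).mean())
print(losses[0], losses[-1], accuracy)
```

A high-level library wraps each of these stages in a single call, but the sequence is the same: define the layers, pick a loss and optimiser, then iterate over the data.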

https://gist.github.com/sgsg704/2ab656c59b9ecbfa379546809d196323

https://gist.github.com/sgsg704/6ea5615b5acf805b12111949056c1d2d

https://gist.github.com/sgsg704/b921cb160e1e738c14d4812788bc9ae8

Plotting training and validation accuracy over epochs

https://gist.github.com/sgsg704/e21aa6fda3e101514fadb0b5483a8b39

CONCLUSIONS

Concluding the model with:

Testing on the test set

Evaluating the confusion matrix

Evaluating the classification report
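The evaluation steps above can be sketched with scikit-learn's metrics. The labels and predicted probabilities below are invented, and the 0.5 decision threshold is an assumption; in practice they would come from the held-out test set and the trained model.

```python
from sklearn.metrics import classification_report, confusion_matrix

# Invented test labels and model outputs (illustration only).
y_test = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.4, 0.6, 0.2, 0.8, 0.7, 0.4, 0.9]

# Turn probabilities into class labels with an assumed 0.5 threshold.
y_pred = [int(p > 0.5) for p in y_prob]

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Per-class precision, recall, and F1 score.
print(classification_report(y_test, y_pred))
```

For a medical target like this one, recall on the positive class deserves particular attention, since a missed death event is usually the costlier error.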

https://gist.github.com/sgsg704/118e6de040ef907c3bc4bcf72b2e00ba

https://gist.github.com/sgsg704/f882b51b991f6f387bdcf355e3e9bf1f

https://gist.github.com/sgsg704/00a5c4f0639d19bcb988ab656775f367

Saving the model

https://gist.github.com/sgsg704/829964f0a09a832ffad9d9de4af1cc9b

deep CC

https://gist.github.com/sgsg704/700c3760d473ace67587cc6d6e0eca91

Notebook Link : Here

Credit : Hrithikgupta