The World Health Organization has estimated that four out of five cardiovascular disease (CVD) deaths are due to heart attacks and strokes. This article aims to estimate the share of patients who are likely to be affected by CVD and to predict the overall risk using Logistic Regression.
What is Logistic Regression?
Logistic Regression is a statistical and machine-learning technique that classifies the records of a dataset based on the values of the input fields. It predicts a dependent variable from one or more independent variables. It can be used for both binary and multi-class classification.
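To make the idea concrete, here is a minimal sketch of how a binary logistic regression model turns input fields into a class label: a weighted sum of the inputs is passed through the sigmoid function, and the resulting probability is compared with a 0.5 threshold. The weights, bias and feature values are hypothetical and are not fitted to any data.

```python
import math


def sigmoid(z: float) -> float:
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability of the positive class for a single record."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical weights for two made-up input fields (not fitted to real data).
weights = [0.8, -0.5]
bias = -0.2

p = predict_proba([1.0, 2.0], weights, bias)
label = 1 if p >= 0.5 else 0  # threshold the probability to get a class
print(f"probability={p:.3f}, predicted class={label}")
```

Training a model amounts to choosing the weights and bias so that these probabilities agree with the observed labels as closely as possible.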
Data Preparation :
The dataset is publicly available on the Kaggle website, and it is from an ongoing cardiovascular study on residents of the town of Framingham, Massachusetts. The classification goal is to predict whether the patient has a 10-year risk of future coronary heart disease (CHD). The dataset provides the patients’ information. It includes over 4,000 records and 15 attributes.
Loading the Dataset :
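The loading code is not reproduced here, so the following is only a sketch of one way to do it with pandas. It assumes the data is a CSV file and that rows with missing values are dropped, which would be consistent with the row count falling below the 4,000-plus records mentioned above. The helper name `load_chd_data`, the file name in the comment, and the inline sample rows are illustrative assumptions; only the column names come from the article's output.

```python
import io

import pandas as pd


def load_chd_data(source) -> pd.DataFrame:
    """Read the CSV and drop every row that contains a missing value."""
    df = pd.read_csv(source)
    return df.dropna(axis=0).reset_index(drop=True)


# Tiny made-up stand-in so the sketch runs without the real file.
# With the downloaded data you would pass a path instead (file name assumed):
#     disease_df = load_chd_data("framingham.csv")
sample_csv = io.StringIO(
    "Sex_male,age,currentSmoker,heartRate,glucose,TenYearCHD\n"
    "1,39,0,80,77,0\n"
    "0,46,0,95,76,0\n"
    "1,48,1,75,,0\n"  # glucose is missing, so this row is dropped
    "0,61,1,65,103,1\n"
)

disease_df = load_chd_data(sample_csv)
print(disease_df.head())
print(disease_df.shape)
print(disease_df["TenYearCHD"].value_counts())
```

If the raw file uses different column names or carries extra columns, renaming or dropping them right after `read_csv` keeps the rest of the pipeline unchanged.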
Output :
   Sex_male  age  currentSmoker  ...  heartRate  glucose  TenYearCHD
0         1   39              0  ...       80.0     77.0           0
1         0   46              0  ...       95.0     76.0           0
2         1   48              1  ...       75.0     70.0           0
3         0   61              1  ...       65.0    103.0           1
4         0   46              1  ...       85.0     85.0           0

[5 rows x 15 columns]
(3751, 15)
0    3179
1     572
Name: TenYearCHD, dtype: int64
Code: Ten-year CHD record of all the patients available in the dataset :
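The plotting code itself is missing from this copy. A minimal sketch, assuming the graph is a simple line plot of the `TenYearCHD` column against the record index, could look like the following. The small DataFrame is a made-up stand-in for the loaded dataset, and the axis labels are illustrative.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for the TenYearCHD column of the loaded dataset.
disease_df = pd.DataFrame({"TenYearCHD": [0, 0, 0, 1, 0, 1, 0, 0]})

fig, ax = plt.subplots(figsize=(7, 5))
disease_df["TenYearCHD"].plot(ax=ax)  # one point per patient record
ax.set_xlabel("Record index")
ax.set_ylabel("TenYearCHD (0 = not affected, 1 = affected)")
# In an interactive session, drop the Agg line above and call plt.show() here.
```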
Output : Graph Display :
Code: Counting number of patients affected by CHD where (0= Not Affected ; 1= Affected) :
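As a sketch of this counting step, `value_counts` gives the number of records per class, and a bar chart makes the imbalance visible. In the output printed earlier, 572 of 3,751 records (about 15%) belong to class 1. The DataFrame below is a made-up stand-in, and the plot styling is an assumption.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for the TenYearCHD column of the loaded dataset.
disease_df = pd.DataFrame({"TenYearCHD": [0, 0, 0, 1, 0, 1, 0, 0]})

# Number of records per class, ordered as 0 (not affected), 1 (affected).
counts = disease_df["TenYearCHD"].value_counts().sort_index()
share = counts / counts.sum()  # fraction of records in each class
print(counts)
print(share)

fig, ax = plt.subplots(figsize=(7, 5))
ax.bar(counts.index.astype(str), counts.values)
ax.set_xlabel("TenYearCHD (0 = not affected, 1 = affected)")
ax.set_ylabel("Number of patients")
# In an interactive session, drop the Agg line above and call plt.show() here.
```

Checking the class shares before modeling is useful because a classifier can reach high accuracy on imbalanced data simply by favoring the majority class.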
Output: Graph Display :
Code : Training and Test Sets: Splitting Data | Normalization of the Dataset
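The split and normalization code is not shown in this copy. The sketch below uses scikit-learn's `train_test_split` and `StandardScaler` on synthetic data with the same number of rows (3,751) and six feature columns, so that a 70/30 split produces the shapes reported in this section. Which six columns the article selects, the `random_state` value, and the choice to fit the scaler on the training set only are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=10.0, size=(3751, 6))  # stand-in for six features
y = (rng.random(3751) < 0.15).astype(int)             # stand-in labels, ~15% positive

# Hold out 30% of the records for testing (random_state is an arbitrary choice).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=4
)

# Fit the scaler on the training set only, then apply it to both sets,
# so that no statistics from the test set leak into training.
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

print("Train Set :", X_train.shape, y_train.shape)
print("Test Set :", X_test.shape, y_test.shape)
```

Standardizing puts all features on a comparable scale (zero mean, unit variance on the training data), which generally helps the logistic regression solver converge.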
Train Set : (2625, 6) (2625, )
Test Set : (1126, 6) (1126, )
Code: Modeling of the Dataset | Evaluation and Accuracy :
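The modeling code is missing here, so the following is a sketch of a typical fit-and-evaluate step with scikit-learn's `LogisticRegression`, run on synthetic data. The reported score is labeled a Jaccard similarity score; as far as I know, scikit-learn's former `jaccard_similarity_score` has been removed in later releases and was equivalent to plain accuracy for single-label classification, so the sketch uses `accuracy_score` instead. The synthetic features, the coefficients used to generate labels, and the default model settings are all assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 6))  # stand-in for six patient features

# Synthetic labels whose probability depends on the first two features.
logits = 1.5 * X[:, 0] - 1.0 * X[:, 1] - 1.0
y = (rng.random(1000) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=4
)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

logreg = LogisticRegression()        # default settings; tune as needed
logreg.fit(X_train, y_train)
y_pred = logreg.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)  # fraction of correct predictions
print("Accuracy of the model =", accuracy)
```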
Accuracy of the model in jaccard similarity score is = 0.8490230905861457
Code: Using Confusion Matrix to find the Accuracy of the model :
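A minimal sketch of this step with scikit-learn's `confusion_matrix` and `classification_report` follows. The label lists are made up so that the counts are easy to verify by hand; with the real model you would pass the test labels and the model's predictions instead.

```python
from sklearn.metrics import classification_report, confusion_matrix

# Made-up true labels and predictions (0 = not affected, 1 = affected).
y_test = [0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 0, 1, 0, 1, 0, 0]

# Rows are true classes, columns are predicted classes:
# [[true negatives, false positives],
#  [false negatives, true positives]]
cm = confusion_matrix(y_test, y_pred)
tn, fp, fn, tp = cm.ravel()
print(cm)

# Recall for class 1: the share of actual positives that the model finds.
recall_pos = tp / (tp + fn)
print("Recall for class 1 =", recall_pos)

print("The details for confusion matrix is =")
print(classification_report(y_test, y_pred))
```

Per-class recall is worth reading alongside accuracy. In the report shown in this section, overall accuracy is 0.85, yet recall for class 1 is only 0.08, which means the model misses most patients who actually develop CHD.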
The details for confusion matrix is =
              precision    recall  f1-score   support

           0       0.85      0.99      0.92       951
           1       0.61      0.08      0.14       175

    accuracy                           0.85      1126
   macro avg       0.73      0.54      0.53      1126
weighted avg       0.82      0.85      0.80      1126
Confusion Matrix :