HumnaAhmed/heart-disease-prediction-ml

❤️ Heart Disease Prediction

Task Objective

Predict whether a patient is at risk of heart disease using medical attributes and machine learning. The goal is to analyze patient health data and build a model that can assist in early detection of heart disease risk.


Dataset

The data is derived from the UCI Heart Disease dataset.

File used:

  • heart_disease_uci.csv

Features used:

  • age – Age of the patient
  • sex – Sex of the patient
  • cp – Chest pain type
  • trestbps – Resting blood pressure
  • chol – Serum cholesterol level
  • fbs – Fasting blood sugar
  • restecg – Resting electrocardiographic results
  • thalch – Maximum heart rate achieved
  • exang – Exercise induced angina
  • oldpeak – ST depression induced by exercise
  • slope – Slope of the peak exercise ST segment
  • ca – Number of major vessels colored by fluoroscopy
  • thal – Thalassemia

Target variable:

  • target

    • 0 → No Heart Disease
    • 1 → Heart Disease Present
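
The UCI data records disease severity on a 0 to 4 scale, so a binary target has to be derived from it. The snippet below is a minimal sketch of one common way to do that. The raw column name `num` and the sample values are assumptions for illustration, not taken from this repository.

```python
import pandas as pd

# Hypothetical raw labels: 0 = no disease, 1-4 = increasing severity.
# The column name "num" is an assumption, not confirmed by this README.
df = pd.DataFrame({"num": [0, 2, 1, 0, 4, 3]})

# Any severity above 0 is treated as "heart disease present".
df["target"] = (df["num"] > 0).astype(int)

print(df["target"].tolist())
```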

Tools Used

  • Python
  • Pandas
  • Matplotlib
  • Seaborn
  • Scikit-learn

Steps Performed

  1. Loaded the heart disease dataset using Pandas.
  2. Explored dataset structure and summary statistics.
  3. Handled missing values in numerical and categorical columns.
  4. Encoded categorical variables using LabelEncoder.
  5. Converted the target column into binary classification.
  6. Performed Exploratory Data Analysis (EDA) using visualizations.
  7. Split the dataset into training and testing sets.
  8. Standardized numerical features using StandardScaler.
  9. Trained a Decision Tree Classifier.
  10. Evaluated the model using accuracy, confusion matrix, classification report, and ROC curve.
  11. Identified important features affecting predictions.
  12. Visualized the trained decision tree.
  13. Allowed real-time patient input for prediction.
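
The steps above can be sketched end to end as follows. This is an illustrative outline, not the repository's actual script: the tiny DataFrame is made up and stands in for `heart_disease_uci.csv`, only three of the features are used, and the imputation choices (median and most frequent value) are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Made-up sample standing in for heart_disease_uci.csv (values are illustrative).
df = pd.DataFrame({
    "age": [63, 41, 57, 45, 68, 52, 39, 60, 49, 71, 55, 44],
    "chol": [233, 204, np.nan, 260, 274, 199, 220, np.nan, 188, 302, 245, 210],
    "cp": ["typical angina", "non-anginal", "asymptomatic", None,
           "asymptomatic", "non-anginal", "atypical angina", "asymptomatic",
           "typical angina", "asymptomatic", "asymptomatic", "non-anginal"],
    "target": [1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0],
})
num_cols = ["age", "chol"]
cat_cols = ["cp"]
df[num_cols] = df[num_cols].astype(float)

# Step 3: fill numeric gaps with the median, categorical gaps with the mode
# (both strategies are assumptions).
for col in num_cols:
    df[col] = df[col].fillna(df[col].median())
for col in cat_cols:
    df[col] = df[col].fillna(df[col].mode()[0])

# Step 4: one LabelEncoder per categorical column, kept for reuse at prediction time.
encoders = {}
for col in cat_cols:
    encoders[col] = LabelEncoder()
    df[col] = encoders[col].fit_transform(df[col])

X = df.drop(columns="target")
y = df["target"]

# Step 7: stratified split so both classes appear in train and test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)
X_train, X_test = X_train.copy(), X_test.copy()

# Step 8: fit the scaler on training data only, then apply it to the test data.
scaler = StandardScaler()
X_train[num_cols] = scaler.fit_transform(X_train[num_cols])
X_test[num_cols] = scaler.transform(X_test[num_cols])

# Step 9: train the tree with the parameters listed in this README.
model = DecisionTreeClassifier(
    max_depth=5, min_samples_leaf=3, class_weight="balanced", random_state=42
)
model.fit(X_train, y_train)

# Step 13: a new patient must pass through the same encoder and scaler.
patient = pd.DataFrame([{"age": 58.0, "chol": 250.0, "cp": "asymptomatic"}])
patient["cp"] = encoders["cp"].transform(patient["cp"])
patient[num_cols] = scaler.transform(patient[num_cols])
prediction = int(model.predict(patient)[0])
print("Heart disease present" if prediction == 1 else "No heart disease")
```

Keeping the fitted encoders and scaler around matters: a new patient's values are only comparable to the training data if they go through exactly the same transformations.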

Model Applied

  • Decision Tree Classifier

Key parameters:

  • max_depth = 5
  • min_samples_leaf = 3
  • class_weight = balanced
  • random_state = 42
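
As a rough sketch, these parameters map onto scikit-learn's `DecisionTreeClassifier` as shown below. The data is synthetic (generated with `make_classification`) and is only there so the example runs on its own.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the 13 medical features.
X, y = make_classification(
    n_samples=300, n_features=13, weights=[0.7, 0.3], random_state=42
)

model = DecisionTreeClassifier(
    max_depth=5,              # cap tree depth to limit overfitting
    min_samples_leaf=3,       # each leaf must cover at least 3 training samples
    class_weight="balanced",  # weight classes inversely to their frequency
    random_state=42,          # make the fitted tree reproducible
)
model.fit(X, y)

print(model.get_depth(), model.get_n_leaves())
```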

Key Results and Findings

  • The model classifies each patient as either at risk of heart disease or not at risk.
  • Evaluation metrics such as accuracy, confusion matrix, and ROC-AUC provide insight into model performance.
  • Feature importance analysis highlights the most influential medical attributes.
  • Visualization of the decision tree helps understand the model's decision-making process.

Insights

  • Health indicators such as chest pain type, cholesterol level, and maximum heart rate strongly influence the heart disease prediction.
  • Decision Trees provide interpretable results, which is useful in healthcare applications.
  • Machine learning can assist doctors by providing early risk prediction.

Project Files


📊 Output & Results

1️⃣ Dataset Shape

This output shows the total number of rows and columns in the dataset after loading it into the program.
It helps understand the size and structure of the dataset used for training the machine learning model.

image
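
A minimal sketch of how the shape can be read with Pandas. The three-row frame is made up. In the project the data would presumably come from `pd.read_csv("heart_disease_uci.csv")`.

```python
import pandas as pd

# Illustrative frame standing in for the loaded CSV.
df = pd.DataFrame(
    {"age": [63, 41, 57], "chol": [233, 204, 236], "target": [1, 0, 1]}
)

# DataFrame.shape is a (rows, columns) tuple.
n_rows, n_cols = df.shape
print(f"{n_rows} rows, {n_cols} columns")
```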

2️⃣ Data Summary

This section displays statistical information about the dataset including:

  • Mean
  • Standard deviation
  • Minimum and maximum values
  • Quartiles

It helps understand the distribution and range of medical attributes in the dataset.

image
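
The statistics listed above are what `DataFrame.describe()` reports for numerical columns. A small sketch with made-up values:

```python
import pandas as pd

# Made-up values for two numerical attributes.
df = pd.DataFrame({"age": [40, 50, 60, 70], "chol": [200, 220, 240, 260]})

# describe() returns count, mean, std, min, quartiles, and max per column.
summary = df.describe()
print(summary)
```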

3️⃣ Heart Disease Distribution

This graph shows the distribution of patients with and without heart disease. It helps understand the balance of classes in the dataset.

image
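
The same class balance can be checked numerically. The sketch below uses a made-up target column. A bar chart like the one described could then be drawn with Seaborn's `countplot`, which is omitted here to keep the example free of plotting dependencies.

```python
import pandas as pd

# Made-up binary labels: 0 = no heart disease, 1 = heart disease present.
target = pd.Series([0, 1, 1, 0, 1, 1, 0, 1], name="target")

# Absolute counts and relative shares per class.
counts = target.value_counts().sort_index()
shares = target.value_counts(normalize=True).sort_index()

print(counts)
print(shares)
```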

4️⃣ Feature Correlation Heatmap

The heatmap shows correlations between numerical features in the dataset. It helps identify relationships between medical attributes.

image
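
A heatmap of this kind is typically built from a correlation matrix. The sketch below computes one with Pandas on made-up values. Passing `corr` to `seaborn.heatmap` would render it as a plot.

```python
import pandas as pd

# Made-up values; thalch falls steadily as age rises in this toy sample.
df = pd.DataFrame({
    "age": [40, 50, 60, 70, 80],
    "thalch": [180, 170, 160, 150, 140],
    "chol": [200, 260, 210, 250, 230],
})

# Pearson correlation between every pair of numerical columns.
corr = df.corr(numeric_only=True)
print(corr.round(2))
```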

5️⃣ Model Performance

This section shows the evaluation results of the trained machine learning model including:

  • Accuracy Score
  • Precision
  • Recall
  • F1-score

These metrics help measure how well the model predicts heart disease risk.

image
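
For reference, these metrics can be computed with `sklearn.metrics` as sketched below. The labels and predictions are made up and do not reflect this project's results.

```python
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    f1_score,
    precision_score,
    recall_score,
)

# Made-up ground truth and predictions (1 = heart disease present).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

accuracy = accuracy_score(y_true, y_pred)    # share of correct predictions
precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall

print(classification_report(y_true, y_pred, target_names=["No disease", "Disease"]))
```

In a screening setting, recall on the disease class is often the metric to watch, since a false negative means a missed at-risk patient.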

6️⃣ Confusion Matrix

The confusion matrix evaluates the classification results by showing:

  • True Positives
  • True Negatives
  • False Positives
  • False Negatives

image
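
The four counts can be read directly from scikit-learn's `confusion_matrix`. The sketch below uses made-up labels. For binary labels 0 and 1, the flattened matrix is ordered TN, FP, FN, TP.

```python
from sklearn.metrics import confusion_matrix

# Made-up ground truth and predictions (1 = heart disease present).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Rows are actual classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()

print(cm)
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
```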

7️⃣ ROC Curve

The ROC curve illustrates the model’s ability to distinguish between classes. A higher AUC score indicates better model performance.

image
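
A minimal sketch of how the curve and its AUC can be computed. The scores are made up. With a fitted classifier they would typically be the positive-class probabilities from `model.predict_proba(X_test)[:, 1]`.

```python
from sklearn.metrics import roc_auc_score, roc_curve

# Made-up labels and predicted probabilities for the positive class.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

# False positive rate and true positive rate at each score threshold.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# Area under the curve: 0.5 is chance level, 1.0 is perfect separation.
auc = roc_auc_score(y_true, y_score)
print(f"AUC = {auc:.2f}")
```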

8️⃣ Important Features

This visualization shows which features have the highest impact on heart disease prediction according to the decision tree model.

image
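
Feature importances of a fitted tree are available through its `feature_importances_` attribute. The sketch below uses synthetic data labeled with this README's feature names, so the resulting ranking is meaningless and only demonstrates the mechanics.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

feature_names = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
                 "thalch", "exang", "oldpeak", "slope", "ca", "thal"]

# Synthetic data; the column names are attached for illustration only.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=5, random_state=42
)
X = pd.DataFrame(X, columns=feature_names)

model = DecisionTreeClassifier(
    max_depth=5, min_samples_leaf=3, class_weight="balanced", random_state=42
).fit(X, y)

# Impurity-based importances, normalized to sum to 1.
importances = pd.Series(
    model.feature_importances_, index=feature_names
).sort_values(ascending=False)
print(importances.head(5))
```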

9️⃣ Decision Tree Visualization

This diagram represents the structure of the trained Decision Tree model and how it makes classification decisions.

image
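
As a text-only alternative to the diagram, `sklearn.tree.export_text` prints the learned rules. The sketch below fits a shallow tree on synthetic data with four illustrative feature names. A graphical version can be drawn with `sklearn.tree.plot_tree`, which requires Matplotlib.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic data with four features, named for illustration only.
names = ["age", "chol", "thalch", "oldpeak"]
X, y = make_classification(
    n_samples=200, n_features=4, n_informative=2, n_redundant=0, random_state=42
)

# A depth-2 tree keeps the printed rules short.
model = DecisionTreeClassifier(max_depth=2, random_state=42).fit(X, y)

# Each indented line is a threshold test; leaves show the predicted class.
rules = export_text(model, feature_names=names)
print(rules)
```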

About

Implementation of Heart Disease Prediction using Machine Learning including data preprocessing, EDA, model training, and evaluation - AI/ML Internship task
