Predictive Modeling of Heart Disease Using Machine Learning Models

Authors' Affiliations

Adetayo Folasole, Department of Computing, College of Business and Technology, East Tennessee State University, Johnson City, TN; Onyeka Chukwudalu Ekwebene, Department of Biostatistics and Epidemiology, East Tennessee State University, Johnson City, TN.

Location

D.P. Culp Center Ballroom

Start Date

4-5-2024 9:00 AM

End Date

4-5-2024 11:30 AM

Poster Number

26

Name of Project's Faculty Sponsor

Ghaith Husari

Faculty Sponsor's Department

Computing

Classification of First Author

Graduate Student-Master’s

Competition Type

Competitive

Type

Poster Presentation

Presentation Category

Health

Abstract or Artist's Statement

Cardiovascular diseases (CVDs) are the leading cause of mortality worldwide, claiming approximately 17.9 million lives annually. This figure accounts for 31% of all global deaths, highlighting the critical public health challenge posed by CVDs. Most of these deaths, about 80%, are attributed to heart attacks and strokes. Nearly one-third of these fatalities occur prematurely in individuals under the age of 70, underscoring the urgent need for effective prevention and treatment strategies. We use machine learning models to predict the presence or absence of heart disease in patients based on categorical and numerical features such as age, sex, resting blood pressure, cholesterol, chest pain type, and fasting blood sugar. Our dataset was sourced from the Cleveland database in the UC Irvine Machine Learning Repository and had 918 rows and 12 columns. It included 13 independent variables and 1 target variable, i.e., heart disease. Using exploratory data analysis, we examined correlations between variables and found that Age and Old Peak had the highest correlation. Our initial findings showed that patients with heart disease tended to have a low Max Heart Rate, and vice versa, which we attribute to the increased stiffness of large arteries that comes with age. We further identified patterns and risk factors using algorithms such as Logistic Regression, Naive Bayes, and Random Forest. These algorithms predicted the presence or absence of heart disease from patient features with accuracies ranging from 75% to 90%, with Logistic Regression achieving the highest accuracy following hyperparameter tuning. Furthermore, our analysis highlights the importance of individual features in predicting heart disease risk.
Factors such as age, sex, resting blood pressure, cholesterol levels, and chest pain type emerge as crucial indicators, informing targeted interventions for high-risk individuals. By identifying these key predictors, healthcare professionals can prioritize interventions and optimize the allocation of limited healthcare resources. The integration of machine learning into cardiovascular care holds the promise of reducing the global burden of CVDs by enhancing the accuracy of diagnosis, optimizing treatment strategies, and, ultimately, saving lives.
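
The abstract does not give implementation details, so the following is only a minimal sketch of the fit-and-evaluate step for one of the named models, Logistic Regression. It assumes a plain batch gradient-descent fit and uses synthetic data. The two features (age and maximum heart rate), the coefficients used to generate labels, and every function name are illustrative assumptions, not the authors' pipeline or dataset. Naive Bayes, Random Forest, and hyperparameter tuning are not shown.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_synthetic_patients(n, seed=0):
    """SYNTHETIC data (an assumption, not the study's dataset):
    risk rises with age and falls with maximum heart rate."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        age = rng.uniform(30, 80)
        max_hr = rng.uniform(80, 200)
        logit = 0.12 * (age - 55) - 0.06 * (max_hr - 140)
        y.append(1 if rng.random() < sigmoid(logit) else 0)
        X.append([age, max_hr])
    return X, y


def fit_scaler(X):
    """Per-column mean and standard deviation, computed on training rows only."""
    cols = list(zip(*X))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds


def scale(X, means, stds):
    """Standardize rows using previously fitted means and standard deviations."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Batch gradient descent on the mean log loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Predict class 1 when the modeled probability is at least 0.5."""
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hold out the last 100 synthetic rows for testing.
X, y = make_synthetic_patients(400)
X_train, X_test, y_train, y_test = X[:300], X[300:], y[:300], y[300:]

means, stds = fit_scaler(X_train)
w, b = fit_logistic(scale(X_train, means, stds), y_train)
test_acc = accuracy(y_test, predict(scale(X_test, means, stds), w, b))

print(f"held-out accuracy on synthetic data: {test_acc:.2f}")
print(f"standardized weights (age, max_hr): {w[0]:.2f}, {w[1]:.2f}")
```

Because the features are standardized before fitting, the sign and magnitude of each weight give a rough indication of how strongly that feature pushes the prediction. This is one simple way to read feature importance from a linear model. In practice an established library would normally replace the hand-written fit, and the accuracy printed here reflects only the synthetic data, not the study's results.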
