Predictive Modeling of Heart Disease Using Machine Learning Models
Location
D.P. Culp Center Ballroom
Start Date
4-5-2024 9:00 AM
End Date
4-5-2024 11:30 AM
Poster Number
26
Name of Project's Faculty Sponsor
Ghaith Husari
Faculty Sponsor's Department
Computing
Competition Type
Competitive
Type
Poster Presentation
Presentation Category
Health
Abstract or Artist's Statement
Cardiovascular diseases (CVDs) are the leading cause of mortality worldwide, claiming approximately 17.9 million lives annually. This figure accounts for 31% of all global deaths, highlighting the critical public health challenge posed by CVDs. Most of these deaths, about 80%, are attributed to heart attacks and strokes. Nearly one-third of these fatalities occur prematurely in individuals under the age of 70, underscoring the urgent need for effective prevention and treatment strategies. We use machine learning models to predict the presence or absence of heart disease in a patient based on categorical and numerical features such as age, sex, resting blood pressure, cholesterol, chest pain type, and fasting blood sugar. Our dataset was sourced from the Cleveland database in the UC Irvine Machine Learning Repository and had 918 rows and 12 columns. It included 13 independent variables and 1 target variable, heart disease. Using exploratory data analysis, we examined correlations between variables and found that Age and Old Peak had the highest correlation. Our initial findings showed that patients with heart disease had a low Max Heart Rate, and vice versa, which we attribute to the increased stiffness of large arteries that comes with age. We further identified patterns and risk factors using Logistic Regression, Naive Bayes, and Random Forest. These algorithms predicted the presence or absence of heart disease from patient features with accuracy ranging from 75% to 90%, with Logistic Regression achieving the highest accuracy after hyperparameter tuning. Our analysis also shows the importance of individual features in predicting heart disease risk.
Factors such as age, sex, resting blood pressure, cholesterol levels, and chest pain type emerge as crucial indicators that can inform targeted interventions for high-risk individuals. By identifying these key predictors, healthcare professionals can prioritize interventions and make better use of limited healthcare resources. The integration of machine learning into cardiovascular care holds the promise of significantly reducing the global burden of CVDs by enhancing the accuracy of diagnosis, optimizing treatment strategies, and ultimately saving lives.
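The model comparison described in the abstract could look roughly like the sketch below. It is an illustration only, not the authors' code: it assumes scikit-learn, uses synthetic data from `make_classification` in place of the patient table (918 rows to match the abstract, with an arbitrary feature count), and the function name `compare_models`, the train/test split, and the grid over `C` are all hypothetical choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_models(random_state=0):
    """Fit three classifiers on synthetic data and return test accuracies."""
    # Assumption: synthetic stand-in for the real dataset, which is not
    # reproduced here. The feature count is an arbitrary placeholder.
    X, y = make_classification(
        n_samples=918,
        n_features=10,
        n_informative=5,
        n_redundant=2,
        random_state=random_state,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=random_state
    )

    # Logistic regression with a small, hypothetical grid over the
    # regularization strength C, tuned by 5-fold cross-validation.
    logreg = GridSearchCV(
        make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        param_grid={"logisticregression__C": [0.01, 0.1, 1.0, 10.0]},
        cv=5,
        scoring="accuracy",
    )

    models = {
        "logistic_regression": logreg,
        "naive_bayes": GaussianNB(),
        "random_forest": RandomForestClassifier(
            n_estimators=200, random_state=random_state
        ),
    }

    accuracies = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        accuracies[name] = accuracy_score(y_test, model.predict(X_test))
    return accuracies


results = compare_models()
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
```

Because the data here are synthetic, the printed accuracies say nothing about the 75% to 90% range reported above; the sketch only shows the shape of the workflow: hold out a test set, tune one model by cross-validation, and score all three on the same held-out rows.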