Degree Name

MS (Master of Science)

Program

Mathematical Sciences

Date of Award

5-2017

Committee Chair or Co-Chairs

Nicole Lewis

Committee Members

Anant Godbole, Bob Price Jr., JeanMarie Hendrickson

Abstract

Missing data is a persistent challenge in building valid statistical models. It reduces the representativeness of the sample, so population estimates and model parameters estimated from such data are likely to be biased.

However, the missing data problem is an active area of study, and improved statistical procedures have been proposed to mitigate its effects. In this paper, we review the causes of missing data and various methods of handling it. Our main focus is evaluating multiple imputation (MI) methods from the multiple imputation by chained equations (MICE) package in the statistical software R. We assess how these MI methods perform with different percentages of missing data. We fit a multiple regression model to the imputed data sets and to the complete data set, and then statistically compare the regression coefficients from the models based on imputed data with those from the model based on complete data.
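
The workflow described above might look roughly like the following R sketch. It is an illustration under stated assumptions, not the thesis's actual code: the simulated data, the variable names (`y`, `x1`, `x2`), the helper `compare_coefs`, the MCAR missingness mechanism, the choice of `m = 5` imputations, and the specific imputation methods and missingness levels are all hypothetical. Only `ampute`, `mice`, `with`, and `pool` are functions from the `mice` package.

```r
# Minimal sketch: simulated data stand in for the thesis data set (assumption).
library(mice)

set.seed(2017)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - x2 + rnorm(n)
full <- data.frame(y = y, x1 = x1, x2 = x2)

# Reference model fitted on the complete data.
fit_full <- lm(y ~ x1 + x2, data = full)

# Hypothetical helper: delete values, impute, refit, and pool.
# `prop` is the proportion of rows made incomplete by ampute();
# MCAR is assumed here purely for simplicity.
compare_coefs <- function(prop, method = "pmm") {
  amp    <- ampute(full, prop = prop, mech = "MCAR")$amp
  imp    <- mice(amp, m = 5, method = method, printFlag = FALSE, seed = 1)
  pooled <- pool(with(imp, lm(y ~ x1 + x2)))  # combine the m fits
  cbind(complete = coef(fit_full),
        imputed  = summary(pooled)$estimate)
}

# Compare coefficients at two illustrative missingness levels and methods.
compare_coefs(prop = 0.10, method = "pmm")
compare_coefs(prop = 0.40, method = "norm")
```

Each call returns the complete-data coefficients side by side with the pooled estimates from the imputed data, which is the kind of comparison the abstract describes; a fuller study would repeat this over many simulated deletions rather than a single one.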

Document Type

Thesis - Open Access

Copyright

Copyright by the authors.
