Degree Name

MS (Master of Science)

Program

Mathematical Sciences

Date of Award

5-2020

Committee Chair or Co-Chairs

Christina Nicole Lewis

Committee Members

Robert M. Price Jr, JeanMarie L. Hendrickson

Abstract

We compare multiple imputation methods for categorical variables using the MICE package in R. Starting from a complete data set, we remove values to create different levels of missingness and evaluate the imputation methods at each level. Logistic regression imputation and linear discriminant analysis (LDA) are used for binary variables, multinomial logit imputation and LDA for nominal variables, and ordered logit imputation and LDA for ordinal variables. After imputation, the regression coefficients, percent deviation index (PDI) values, and relative frequency tables are computed for each imputed data set at each level of missingness and compared with those of the corresponding complete data set. Logistic regression outperformed LDA for binary variables, while LDA outperformed multinomial logit imputation for nominal variables and ordered logit imputation for ordinal variables. Simulations were run to confirm the validity of the results.
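
The first step described in the abstract, removing values from a complete categorical variable at several levels of missingness and comparing relative frequency tables, might look roughly like the sketch below. It is an illustration only and not the authors' code: the abstract does not state the missingness mechanism, the missingness levels, or the data, so the completely-at-random removal, the 10%, 30%, and 50% rates, the three-category variable, and the helper names `ampute_mcar` and `relative_frequencies` are all assumptions. The imputation step itself (done with MICE in R in the thesis) is not reproduced here.

```python
import random
from collections import Counter


def ampute_mcar(values, rate, rng):
    """Return a copy of `values` with round(rate * n) entries replaced by None.

    Positions are chosen uniformly at random (an assumed MCAR mechanism;
    the abstract does not say which mechanism the thesis used).
    """
    n_missing = round(rate * len(values))
    missing_idx = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(values)]


def relative_frequencies(values):
    """Relative frequency of each category among the non-missing entries."""
    observed = [v for v in values if v is not None]
    counts = Counter(observed)
    return {cat: counts[cat] / len(observed) for cat in sorted(counts)}


# Hypothetical complete ordinal variable with three categories.
rng = random.Random(2020)
complete = rng.choices(["low", "medium", "high"], weights=[5, 3, 2], k=1000)

print("complete:", relative_frequencies(complete))

# Illustrative missingness levels (assumed, not taken from the thesis).
for rate in (0.10, 0.30, 0.50):
    amputed = ampute_mcar(complete, rate, rng)
    n_missing = sum(v is None for v in amputed)
    print(f"rate={rate:.2f} missing={n_missing}", relative_frequencies(amputed))
```

In the workflow the abstract describes, each data set with missing values would then be imputed, and the relative frequency table of every imputed data set compared with the table from the complete data.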

Document Type

Dissertation - unrestricted

Copyright

Copyright by the authors.
