Degree Name

MS (Master of Science)


Computer and Information Science

Date of Award


Committee Chair or Co-Chairs

Debra J. Knisley, Phillip E. Pfeiffer IV

Committee Members

Cecilia A. McIntosh, Christopher D. Wallace, Yali Liu


Machine learning is applied to a challenging and biologically significant protein classification problem: predicting the acceptor regioselectivity of flavonoid UDP-glycosyltransferases (UGTs) from primary protein sequence. Novel indices characterizing graphical models of protein residues are introduced. The indices are compared with existing amino acid indices and found to cluster residues appropriately. Several models employing the indices are then evaluated using nearest neighbor, support vector machine, and Bayesian neural network classifiers. Improvements over nearest neighbor classifications relying on standard alignment similarity scores are reported.

Document Type

Thesis - Open Access


Copyright by the authors.