Degree Name

MS (Master of Science)

Program

Computer and Information Science

Date of Award

12-2009

Committee Chair or Co-Chairs

Debra J. Knisley, Phillip E. Pfeiffer IV

Committee Members

Cecilia A. McIntosh, Christopher D. Wallace, Yali Liu

Abstract

Machine learning is applied to a challenging and biologically significant protein classification problem: the prediction of flavonoid UGT acceptor regioselectivity from primary protein sequence. Novel indices characterizing graphical models of protein residues are introduced. The indices are compared with existing amino acid indices and found to cluster residues appropriately. A variety of models employing the indices are then evaluated using nearest neighbor, support vector machine, and Bayesian neural network classifiers. Improvements over nearest neighbor classification based on standard alignment similarity scores are reported.

Document Type

Thesis - unrestricted

Copyright

Copyright by the authors.
