Using a Neural Network to Identify Secondary RNA Structures Quantified by Graphical Invariants

Graphs have been used extensively in theoretical computer science to model discrete structures, most notably data structures. Chemists have also used graphs to represent and quantify molecules, and more recently graphs have appeared in the literature as models of biomolecules such as RNA and protein structures. In this work, we quantify a graphical representation of secondary RNA structures using trees. By identifying a subset of the trees whose elements are known to model secondary RNA structure, we train a neural network to recognize the patterns of graphical invariants of trees that are RNA-like in structure. Of particular interest is that these invariants are tools drawn primarily from theoretical computer science. We then identify additional trees that are potential representations of RNA secondary structure; these may either occur naturally but have not yet been identified, or may be viable candidates for synthetically produced RNA.
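To illustrate what quantifying a tree by graphical invariants can look like, the following minimal Python sketch computes a small feature vector for a tree given as an edge list. The four invariants shown (order, number of leaves, maximum degree, diameter) and the function names are assumptions chosen for illustration. The abstract does not state which invariants were used, and the neural network itself is not reproduced here.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Return the distance from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def tree_invariants(edges):
    """Compute a few illustrative invariants of a tree (hypothetical feature set)."""
    adj = build_adjacency(edges)
    degrees = {v: len(nbrs) for v, nbrs in adj.items()}
    # In a tree, the vertex farthest from any start vertex is an endpoint
    # of a longest path, so two BFS passes give the diameter.
    start = next(iter(adj))
    first_pass = bfs_distances(adj, start)
    far = max(first_pass, key=first_pass.get)
    second_pass = bfs_distances(adj, far)
    return {
        "order": len(adj),
        "leaves": sum(1 for d in degrees.values() if d == 1),
        "max_degree": max(degrees.values()),
        "diameter": max(second_pass.values()),
    }


# Two trees on four vertices: a star and a path.
star = [(0, 1), (0, 2), (0, 3)]
path = [(0, 1), (1, 2), (2, 3)]
print(tree_invariants(star))
print(tree_invariants(path))
```

The two example trees have the same number of vertices but different invariant values. A classifier trained on such feature vectors relies on exactly this kind of difference to separate trees from one another.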