Multiscale Modeling of RNA Structures Using NMR Chemical Shifts
Zhang, Kexin
2020
Abstract
Structure determination is an important step in understanding the mechanisms of functional non-coding ribonucleic acids (ncRNAs). Experimental observables in solution-state nuclear magnetic resonance (NMR) spectroscopy provide valuable information about the structural and dynamic properties of RNAs. In particular, NMR-derived chemical shifts are considered structural "fingerprints" of RNA conformational state(s). In my thesis, I have developed computational tools to model RNA structures (mainly secondary structures) using structural information extracted from NMR chemical shifts. Inspired by methods that incorporate chemical-mapping data into RNA secondary structure prediction, I have developed a framework, CS-Fold, for using assigned chemical shift data to conditionally guide secondary structure folding algorithms. First, I developed neural network classifiers, CS2BPS (Chemical Shift to Base Pairing Status), that take assigned chemical shifts as input and output the predicted base pairing status of individual residues in an RNA. Then I used the base pairing status predictions as folding restraints to guide RNA secondary structure prediction. Extensive testing indicates that from assigned NMR chemical shifts, we could accurately predict the secondary structures of RNAs and map distinct conformational states of a single RNA. Another way to utilize experimental data like NMR chemical shifts in structure modeling is probabilistic modeling, that is, using experimental data to recover native-like structure from a structural ensemble that contains a set of low energy structure models. I first developed a model, SS2CS (Secondary Structure to Chemical Shift), that takes secondary structure as input and predicts chemical shifts with high accuracies. Using Bayesian/maximum entropy (BME), I was able to reweight secondary structure models based on the agreement between the measured and reweighted ensemble-averaged chemical shifts. Results indicate that BME could identify the native or near-native structure from a set of low energy structure models as well as recover some of the non-canonical interactions in tertiary structures. We could also probe the conformational landscape by studying the weight pattern assigned by BME. Finally, I explored RNA structural annotation using assigned NMR chemical shifts. Using multitask learning, eleven structural properties were annotated by classifying individual residues in terms of each structural property. The results indicate that our method, CS-Annotate, could predict the structural properties with reasonable accuracy. We believe that CS-Annotate could be used for assessing the quality of a structure model by comparing the structure derived structural properties with the CS-Annotate derived structural properties. One major limitation of the tools developed is that they require assigned chemical shifts. And to assign chemical shifts, a secondary structure model is typically assumed. However, with the recent advances in singly labeled RNA synthesis, chemical shifts could be assigned without the assumption about the secondary structure. We envision that using the chemical shifts derived from singly labeled NMR experiments, CS-Fold could be used for modeling the secondary structure of RNA. We also believe that unassigned chemical shifts could be used for selecting structure models. Native-like structures could be recovered by comparing optimally assigned chemical shifts with computed chemical shifts (generated by SS2CS). Overall, the results presented in this thesis indicate we could extract crucial structural information of the residues in an RNA based on its NMR chemical shifts. Moreover, with the tools like CS-Fold, SS2CS, and CS-Annotate, we could accurately predict the secondary structure, model conformational landscape, and study structural properties of an RNA.Subjects
RNA Structure Modeling NMR Chemical Shifts Machine Learning
Types
Thesis
Metadata
Show full item recordCollections
Remediation of Harmful Language
The University of Michigan Library aims to describe its collections in a way that respects the people and communities who create, use, and are represented in them. We encourage you to Contact Us anonymously if you encounter harmful or problematic language in catalog records or finding aids. More information about our policies and practices is available at Remediation of Harmful Language.
Accessibility
If you are unable to use this file in its current format, please select the Contact Us link and we can modify it to make it more accessible to you.