Decoding the Non-coding Genome: Novel Technologies for the Characterization of Non-coding Elements and Variation
Nishizaki, Sierra
2020
Abstract
One of the key frontiers in genomics research is decoding the function of non-coding sequence and variation. Non-coding sequence, once thought to be junk DNA, is now known to regulate gene expression in a tissue-specific manner, and is frequently found to be mutated in cases of complex human disease. Despite their importance in human disease, non-coding regions are vastly understudied compared to protein-coding regions. This is in part due to the abundance of non-coding sequence, currently predicted to comprise 98.8% of the genome, compared to protein-coding regions, which make up only 1.2%. To complicate things further, most of this sequence is non-functional. A non-coding mutation may lead to a change in gene expression or a difference in human phenotype, or it may have no effect on gene expression at all. Therefore, there is considerable demand for novel computational and experimental tools focused on identifying functional non-coding sequences and prioritizing variation associated with gene expression regulation and human disease. The focus of the work in this dissertation is the development of novel tools to identify functional non-coding regulatory sequences and to prioritize the variation that falls within these sequences. I will introduce two computational tools: the SNP Effect Matrix Pipeline (SEMpl) and the SNP Effect Matrix Pipeline with Methylation (SEMplMe). These methods integrate data from genome-wide annotations of functional elements, such as sites of transcription factor protein binding (ChIP-seq), open chromatin (DNase-seq), and DNA methylation (WGBS), to generate predictions of the consequences of nucleotide and methylation changes to binding affinity in transcription factor binding sites. As transcription factor binding sites are the building blocks of larger regulatory sequences, such as regulatory elements, functional alterations caused by the introduction of a variant or DNA methylation may lead to aberrant gene expression.
SEMpl and SEMplMe are easy-to-use tools that help researchers prioritize the hundreds of putative regulatory variants that emerge from high-throughput studies, such as genome-wide association studies. This will greatly increase the rate at which non-coding variation can be experimentally validated. I will also introduce experimental tools focused on identifying larger blocks of regulatory non-coding sequence: cis-regulatory elements. Cis-regulatory elements are sequences that are able to alter or drive gene expression. Currently, a large body of information exists for regulatory elements that are associated with an increase in gene expression, known as positive regulatory elements. However, regulatory elements associated with a decrease in gene expression, known as negative regulatory elements, are comparatively understudied. To help fill this gap in knowledge between positive and negative regulatory elements, I helped develop two novel methodologies that are able to invert negative regulation into a positive reporter signal. The observed positive output allows negative regulatory elements to be characterized in a spatio-temporal manner in vivo in whole animals. For the first time, this advancement will allow negative regulatory elements to be studied in a manner similar to what has already been achieved for positive regulatory elements. Together, the studies in this dissertation investigate non-coding regulatory sequence genome-wide through the development of novel tools that prioritize regulatory variation and that identify and characterize regulatory elements.
Subjects
non-coding regulatory sequence; complex human disease and variation; transcription factor binding sites; noncoding variant annotation; negative regulatory elements; zebrafish reporter assays
Types
Thesis