Epigenomic and Transcriptomic Profiling for the Study of Monogenic and Polygenic Traits and Disease

dc.contributor.author: Orchard, Peter
dc.date.accessioned: 2020-10-04T23:26:29Z
dc.date.available: NO_RESTRICTION
dc.date.available: 2020-10-04T23:26:29Z
dc.date.issued: 2020
dc.identifier.uri: https://hdl.handle.net/2027.42/163000
dc.description.abstract: Many trait-associated genomic loci are in non-coding regions of the genome. Determining which genetic variants in these regions are causally related to a trait and elucidating their downstream effects can be difficult. Layering transcriptomic and epigenomic data on top of genetic variation data can help nominate causal phenotype-associated variants and generate hypotheses about their effects in different cellular contexts. In this thesis, I first apply RNA-sequencing (RNA-seq) and the assay for transposase-accessible chromatin using sequencing (ATAC-seq) to investigate gene expression and chromatin accessibility in the Danforth mouse, a model of caudal birth defects. The Danforth phenotype results from an endogenous retroviral insertion near the Ptf1a gene. I identify 49 genes differentially expressed between Danforth and WT E9.5 tailbuds, including increased expression of Ptf1a and the nearby Gm13344 lncRNA in Danforth. A gene ontology enrichment analysis indicates differentially expressed genes are enriched in the hedgehog signaling pathway, suggesting disruption of hedgehog signaling may cause the Danforth phenotype. I identify one region of increased chromatin accessibility in Danforth relative to WT mice, localizing to the Gm13344 promoter. This region is orthologous to a human PTF1A enhancer, suggesting it may mediate Ptf1a overexpression in the Danforth mouse. Next, I apply a software package for the quality control of ATAC-seq data (developed in our lab) to public datasets to measure heterogeneity, and analyze GM12878 ATAC-seq data to quantify the impact of Tn5 transposase concentration and sequencing lane cluster density. I find that increasing cluster density shifts the ATAC-seq fragment length distribution towards shorter fragments and results in greater transcription start site enrichment.
I show that increasing Tn5 transposase concentration increases the enrichment of reads in enhancers and promoters, with ~80% of ATAC-seq peaks showing increased signal with increasing Tn5 concentration (5% FDR). Peaks bound by the CTCF transcription factor are less sensitive to Tn5 concentration than those bound by other transcription factors. This analysis demonstrates the difficulties in reliably quantifying chromatin accessibility and utilizing public datasets. I then apply single-nucleus ATAC-seq and RNA-seq to human and rat skeletal muscle to generate cell-type-specific transcriptomic and chromatin accessibility maps. I integrate these maps with UK Biobank genome-wide association study (GWAS) data to explore enrichment of GWAS signals in cell-type-specific ATAC-seq peaks. I demonstrate the utility of these maps by nominating causal genetic variants and cell types at several GWAS loci, including the T2D-associated ARL15 locus. At the ARL15 locus I nominate a credible set variant in an ATAC-seq peak that is highly specific to mesenchymal stem cells. Lastly, to gain insight into the genetic regulation of chromatin architecture and its association with aerobic exercise capacity, I analyze skeletal muscle ATAC-seq (n = 129) and RNA-seq (n = 143) from a rat model for untrained running capacity. Although no genes associate with running capacity at 5% FDR, a gene ontology enrichment analysis indicates that the genes with the strongest association are enriched in fatty acid oxidation pathways, consistent with previous findings in this rat model. I identify no ATAC-seq peaks associated with running capacity (5% FDR) but find that 4,477 ATAC-seq peaks associate with at least one SNP (5% FDR). Together, these projects demonstrate the value of epigenomic and transcriptomic data in the investigation of monogenic and polygenic traits, as well as the challenges and limitations of applying epigenomic and transcriptomic data in this context.
dc.language.iso: en_US
dc.subject: genome
dc.subject: epigenome
dc.subject: transcriptome
dc.subject: ATAC-seq
dc.subject: complex disease
dc.title: Epigenomic and Transcriptomic Profiling for the Study of Monogenic and Polygenic Traits and Disease
dc.type: Thesis
dc.description.thesisdegreename: PhD [en_US]
dc.description.thesisdegreediscipline: Bioinformatics
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember: Parker, Stephen CJ
dc.contributor.committeemember: Terhorst, Jonathan
dc.contributor.committeemember: Burant, Charles
dc.contributor.committeemember: Kang, Hyun Min
dc.contributor.committeemember: Li, Jun
dc.subject.hlbsecondlevel: Genetics
dc.subject.hlbsecondlevel: Science (General)
dc.subject.hlbtoplevel: Health Sciences
dc.subject.hlbtoplevel: Science
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/163000/1/porchard_1.pdf [en_US]
dc.identifier.orcid: 0000-0001-6097-1106
dc.identifier.name-orcid: Orchard, Peter; 0000-0001-6097-1106 [en_US]
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)

