Model-Based Genomic Studies of Protein Sequence Evolution: Convergence, Epistasis, and Amino Acid Acceptance Rates

dc.contributor.author: Zou, Zhengting
dc.date.accessioned: 2017-10-05T20:26:12Z
dc.date.available: NO_RESTRICTION
dc.date.available: 2017-10-05T20:26:12Z
dc.date.issued: 2017
dc.date.submitted: 2017
dc.identifier.uri: https://hdl.handle.net/2027.42/138483
dc.description.abstract: Protein sequence changes are a major contributor to phenotypic evolution and biodiversity. While the genomic revolution has drastically increased the amount of protein sequence data available for comparative studies, the development of analytic tools lags behind. In particular, current mathematical models of sequence evolution are oversimplified and typically ignore many heterogeneities in evolutionary processes. As a result, they often describe evolution inadequately and lead to misleading conclusions. My thesis uncovers some of these heterogeneities and demonstrates that incorporating them into mathematical models of protein sequence evolution offers new insights into evolutionary mechanisms. For instance, convergent evolution of morphological traits has long interested biologists because it is a strong indicator of common natural selection in independent evolutionary lineages. Similarly, convergent evolution of protein sequences is commonly thought to have resulted from natural selection. In Chapter 2 of this thesis, however, I show that such interpretations are problematic, because sequence convergence can be explained by neutral evolution as long as among-site variation in amino acid composition is considered. I also find that the level of convergence decreases with genetic distance. In Chapter 3, I evaluate two hypotheses that could explain the diminishing convergence with genetic distance: (i) divergent epistasis in distantly related organisms and (ii) gene tree discordance. I demonstrate that both hypotheses are at work, but their contributions vary depending on how closely related the species of interest are. In Chapter 4, I revisit a high-profile claim of genome-wide adaptive protein sequence convergence for echolocation in three lineages of mammals. I discover that the amount of convergence observed is no greater than that in proper negative controls, suggesting that these sequence convergences are largely neutral and unrelated to echolocation. A widely believed but never critically tested hypothesis in phylogenetics is that morphological data contain more convergence and hence are less suitable for phylogenetic inference than molecular data. Analyzing a large dataset including thousands of morphological traits and thousands of molecular traits, I find unequivocal evidence for this hypothesis and uncover its underlying cause in Chapter 5. I subsequently design a method to identify and remove highly convergent traits, which improves phylogenetic accuracy. In Chapter 6, I report a new type of evolutionary heterogeneity that potentially contributes to phylogenetic error: between-species variation in the probability with which a mutation between a specific pair of amino acids is fixed. In Chapter 7, I find that this heterogeneity leads to another previously unknown heterogeneity among species: the fitness disadvantage of nonsynonymous transversions relative to that of nonsynonymous transitions, a subject that has been studied since the dawn of the field of molecular evolution. These six chapters, along with the introductory and concluding chapters, provide an integrative study of previously unknown or neglected heterogeneities in protein sequence evolution. Together, they correct misconceptions in molecular evolution, help improve phylogenetic inference, and deepen our understanding of evolutionary mechanisms.
dc.language.iso: en_US
dc.subject: protein sequence evolution
dc.subject: convergence
dc.title: Model-Based Genomic Studies of Protein Sequence Evolution: Convergence, Epistasis, and Amino Acid Acceptance Rates
dc.type: Thesis [en_US]
dc.description.thesisdegreename: PhD [en_US]
dc.description.thesisdegreediscipline: Bioinformatics
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember: Zhang, George
dc.contributor.committeemember: Wittkopp, Trisha
dc.contributor.committeemember: Boyle, Alan P
dc.contributor.committeemember: Burns Jr, Daniel M
dc.contributor.committeemember: Smith, Stephen A
dc.subject.hlbsecondlevel: Ecology and Evolutionary Biology
dc.subject.hlbsecondlevel: Genetics
dc.subject.hlbtoplevel: Science
dc.description.bitstreamurl: https://deepblue.lib.umich.edu/bitstream/2027.42/138483/1/ztzou_1.pdf
dc.identifier.orcid: 0000-0003-1716-5090
dc.identifier.name-orcid: Zou, Zhengting; 0000-0003-1716-5090 [en_US]
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)

