Bioinformatics and Machine Learning Guided Biocatalyst Development
dc.contributor.author | Chiang, Chang-Hwa | |
dc.date.accessioned | 2025-05-12T17:41:28Z | |
dc.date.available | 2025-05-12T17:41:28Z | |
dc.date.issued | 2025 | |
dc.date.submitted | 2025 | |
dc.identifier.uri | https://hdl.handle.net/2027.42/197297 | |
dc.description.abstract | Biocatalysis offers precise, efficient, and sustainable pathways for chemical transformations but is often limited by challenges in identifying and engineering enzymes for specific reactions. This thesis addresses these challenges by developing bioinformatics and machine learning strategies to enhance biocatalyst development across various stages, including enzyme identification, mechanistic understanding, and engineering. In Chapter 2, we focused on deciphering the stereocontrol mechanism of an oxidative dearomatization reaction catalyzed by flavin-dependent monooxygenases (FDMOs). To overcome common limitations encountered when performing mutagenesis on extant enzymes, such as stability issues and epistasis, we employed an ancestral sequence reconstruction (ASR)-based approach that introduces the dimension of time into the analysis. By resurrecting ancestral FDMOs along key evolutionary trajectories leading to enzymes with complementary stereoselectivity, we pinpointed a critical Phe-to-Tyr switch in the active site that controls stereoselectivity. Based on our experience working with ancestral enzymes, we envision that ASR can not only be used for investigating enzymatic mechanisms but also be adopted for enzyme engineering. In Chapter 3, targeting a biocatalytic synthesis of azaphilones involving an FDMO and an acyltransferase (AT), we leveraged bioinformatics-guided ASR to navigate ancestral sequence space, generating superior ancestral FDMO and AT to enable two key transformations in the synthesis. In Chapter 4, we optimized variational autoencoder-based latent space models, demonstrating their ability to capture both local and global phylogenetic and functional relationships within enzyme families, thus providing a superior tool for enzyme sampling and discovery. In Chapter 5, we aimed to optimize a commonly used enzyme engineering workflow by directed evolution. First, we developed an automated primer generation pipeline on Google Colab for site-saturation mutagenesis, which greatly reduces the time and effort required for manual primer design. Moreover, by gathering sequence information of variants of a bacterial cytochrome P450 via next-generation sequencing, we tested whether protein fitness models such as ESM-1v and EVcouplings can predict mutational effects on reactivity for a P450-catalyzed non-native dimerization reaction of apigenin. While the models showed only weak predictive power for individual mutations, they demonstrated potential at the positional level, offering a strategy to identify mutational hotspots for enzyme engineering. Collectively, this work advances biocatalysis by introducing novel strategies and tools for enzyme identification, mechanistic understanding, and engineering. These approaches facilitate the efficient development of biocatalysts for targeted chemical transformations, addressing key challenges and paving the way for future innovations in biocatalysis. | |
dc.language.iso | en_US | |
dc.subject | Biocatalysis | |
dc.subject | Enzymes | |
dc.subject | Ancestral sequence reconstruction | |
dc.subject | Bioinformatics | |
dc.subject | Machine learning | |
dc.title | Bioinformatics and Machine Learning Guided Biocatalyst Development | |
dc.type | Thesis | |
dc.description.thesisdegreename | PhD | |
dc.description.thesisdegreediscipline | Chemistry | |
dc.description.thesisdegreegrantor | University of Michigan, Horace H. Rackham School of Graduate Studies | |
dc.contributor.committeemember | Brooks III, Charles L | |
dc.contributor.committeemember | Narayan, Alison Rae Hardin | |
dc.contributor.committeemember | Smith, Janet L | |
dc.contributor.committeemember | Koutmos, Markos | |
dc.contributor.committeemember | Mapp, Anna K | |
dc.subject.hlbsecondlevel | Chemistry | |
dc.subject.hlbtoplevel | Science | |
dc.contributor.affiliationumcampus | Ann Arbor | |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/197297/1/cdchiang_1.pdf | |
dc.description.bitstreamurl | http://deepblue.lib.umich.edu/bitstream/2027.42/197297/2/cdchiang_2.pdf | |
dc.identifier.doi | https://dx.doi.org/10.7302/25723 | |
dc.identifier.orcid | 0000-0003-1363-9148 | |
dc.identifier.name-orcid | Chiang, Chang-Hwa; 0000-0003-1363-9148 | en_US |
dc.working.doi | 10.7302/25723 | en |
dc.owningcollname | Dissertations and Theses (Ph.D. and Master's) |