Bioinformatics and Machine Learning Guided Biocatalyst Development

Chiang, Chang-Hwa

dc.contributor.author	Chiang, Chang-Hwa
dc.date.accessioned	2025-05-12T17:41:28Z
dc.date.available	2025-05-12T17:41:28Z
dc.date.issued	2025
dc.date.submitted	2025
dc.identifier.uri	https://hdl.handle.net/2027.42/197297
dc.description.abstract	Biocatalysis offers precise, efficient, and sustainable pathways for chemical transformations but is often limited by challenges in identifying and engineering enzymes for specific reactions. This thesis addresses these challenges by developing bioinformatics and machine learning strategies to enhance biocatalyst development across various stages, including enzyme identification, mechanistic understanding, and engineering. In Chapter 2, we focused on deciphering the stereocontrol mechanism of an oxidative dearomatization reaction catalyzed by flavin-dependent monooxygenases (FDMOs). To overcome common limitations encountered when performing mutagenesis on extant enzymes, such as stability issues and epistasis, we employed an ancestral sequence reconstruction (ASR)-based approach that introduces the dimension of time into the analysis. By resurrecting ancestral FDMOs along key evolutionary trajectories leading to enzymes with complementary stereoselectivity, we pinpointed a critical Phe-to-Tyr switch in the active site that controls stereoselectivity. Based on our experience working with ancestral enzymes, we envision that ASR can not only be used for investigating enzymatic mechanisms but also be adopted for enzyme engineering. In Chapter 3, targeting a biocatalytic synthesis of azaphilones involving an FDMO and an acyltransferase (AT), we leveraged bioinformatics-guided ASR to navigate ancestral sequence space, generating superior ancestral FDMO and AT to enable two key transformations in the synthesis. In Chapter 4, we optimized variational autoencoder-based latent space models, demonstrating their ability to capture both local and global phylogenetic and functional relationships within enzyme families, thus providing a superior tool for enzyme sampling and discovery. In Chapter 5, we aimed to optimize a commonly used enzyme engineering workflow by directed evolution. First, we developed an automated primer generation pipeline on Google Colab for site-saturation mutagenesis, which greatly reduces the time and effort required for manual primer design. Moreover, by gathering sequence information of variants of a bacterial cytochrome P450 via next-generation sequencing, we tested whether protein fitness models such as ESM-1v and EVcouplings can predict mutational effects on reactivity for a P450-catalyzed non-native dimerization reaction of apigenin. While the models showed only weak predictive power for individual mutations, they demonstrated potential at the positional level, offering a strategy to identify mutational hotspots for enzyme engineering. Collectively, this work advances biocatalysis by introducing novel strategies and tools for enzyme identification, mechanistic understanding, and engineering. These approaches facilitate the efficient development of biocatalysts for targeted chemical transformations, addressing key challenges and paving the way for future innovations in biocatalysis.
dc.language.iso	en_US
dc.subject	Biocatalysis
dc.subject	Enzymes
dc.subject	Ancestral sequence reconstruction
dc.subject	Bioinformatics
dc.subject	Machine learning
dc.title	Bioinformatics and Machine Learning Guided Biocatalyst Development
dc.type	Thesis
dc.description.thesisdegreename	PhD
dc.description.thesisdegreediscipline	Chemistry
dc.description.thesisdegreegrantor	University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember	Brooks III, Charles L
dc.contributor.committeemember	Narayan, Alison Rae Hardin
dc.contributor.committeemember	Smith, Janet L
dc.contributor.committeemember	Koutmos, Markos
dc.contributor.committeemember	Mapp, Anna K
dc.subject.hlbsecondlevel	Chemistry
dc.subject.hlbtoplevel	Science
dc.contributor.affiliationumcampus	Ann Arbor
dc.description.bitstreamurl	http://deepblue.lib.umich.edu/bitstream/2027.42/197297/1/cdchiang_1.pdf
dc.description.bitstreamurl	http://deepblue.lib.umich.edu/bitstream/2027.42/197297/2/cdchiang_2.pdf
dc.identifier.doi	https://dx.doi.org/10.7302/25723
dc.identifier.orcid	0000-0003-1363-9148
dc.identifier.name-orcid	Chiang, Chang-Hwa; 0000-0003-1363-9148	en_US
dc.working.doi	10.7302/25723	en
dc.owningcollname	Dissertations and Theses (Ph.D. and Master's)

Files in this item

Name: cdchiang_1.pdf
Size: 53.94MB
Format: PDF

Name: cdchiang_2.pdf
Size: 49.35MB
Format: PDF
