
Advancement of Molecular Mechanics Based Drug Discovery Through the Use of Machine Learning

dc.contributor.author Jones, Murchtricia
dc.date.accessioned 2021-09-24T19:12:24Z
dc.date.available 2021-09-24T19:12:24Z
dc.date.issued 2021
dc.date.submitted 2021
dc.identifier.uri https://hdl.handle.net/2027.42/169793
dc.description.abstract Drug discovery is the leading motivation for the development of new chemical entities. Improving computational methodologies is an important scientific endeavor for facilitating the development and optimization of new therapeutic agents. In particular, this dissertation focuses on increasing the accuracy of molecular dynamics simulations that employ molecular mechanics force fields (MMFFs). MMFFs provide an atomistic representation of drug-target binding, which enables the elucidation of the structural information necessary to evolve compounds into viable drug candidates. The accuracy and efficiency of such computational assays depend heavily on the initial set of force field parameters required to begin the simulation. Through many years of training and refinement, the parameters for macromolecules have become well developed; however, generating force field parameters for novel chemical scaffolds can be challenging because of the vastness of small-molecule chemical space. The work herein addresses this obstacle by employing machine learning models to develop a framework that facilitates small-molecule parametrization across various MMFFs. The presented framework, Machine learning based Multipurpose AtomTyper for CHARMM (ML-MATCH), considers each molecule from an atom-centric viewpoint. The framework has two components. The first is the machine learning application: using Random Forest, two key parameters can be predicted, atom types and partial charges. With the CHARMM General Force Field (CGenFF) as the training set, we found an average accuracy score of 96% for the classification of atom types, and a Pearson R-value of 0.974 and an RMSE of 0.028 e for the assignment of partial charges. To validate the models, we compared ML-MATCH-derived parameters to those from ParamChem, the current gold standard for CGenFF-based parameterization, for molecules within the FreeSolv database. This resulted in an accuracy score of 90% for atom types and an RMSE of 0.049 e for partial charges. The second component of the framework is the MATCHing algorithm, which identifies the closest MATCH between the bonded parameters of the query and those that exist in the force field's training set. ML-MATCH-derived bonded parameters were validated by conducting hydration free energy calculations for benzene derivatives within FreeSolv; the results were compared both to experimental free energies and to hydration free energies computed using ParamChem-derived parameters. With the GBMV2 implicit solvent model, we found average Pearson R-values against experiment of 0.7223 for ML-MATCH and 0.4635 for ParamChem. Similarly, for the FACTS model, we found average Pearson R-values of 0.7505 and 0.5353, respectively. These findings show that ML-MATCH-derived parameters are well suited for reproducing experimental data in simulation. Applying ML-MATCH-derived parameters in more complex simulations and retraining on various force fields shows that this framework goes beyond the status quo of current atom parameterization software in its ability to identify the underlying rules and assumptions of a given force field without being explicitly programmed to do so. Therefore, the newly developed ML-MATCH platform for small-molecule parametrization will be particularly useful for ligands in computer-aided drug design studies and in the development of therapeutic agents.
dc.language.iso en_US
dc.subject Advancement of Molecular Mechanics Based Drug Discovery Through the Use of Machine Learning
dc.title Advancement of Molecular Mechanics Based Drug Discovery Through the Use of Machine Learning
dc.type Thesis
dc.description.thesisdegreename PhD en_US
dc.description.thesisdegreediscipline Bioinformatics
dc.description.thesisdegreegrantor University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember Brooks III, Charles L
dc.contributor.committeemember Nikolovska-Coleska, Zaneta
dc.contributor.committeemember Frank, Aaron Terrence
dc.contributor.committeemember Najarian, Kayvan
dc.contributor.committeemember Walter, Nils G
dc.subject.hlbsecondlevel Science (General)
dc.subject.hlbtoplevel Science
dc.description.bitstreamurl http://deepblue.lib.umich.edu/bitstream/2027.42/169793/1/murchkia_1.pdf
dc.identifier.doi https://dx.doi.org/10.7302/2838
dc.identifier.orcid 0000-0002-7193-6282
dc.identifier.name-orcid Jones, Murchtricia ; 0000-0002-7193-6282 en_US
dc.working.doi 10.7302/2838 en
dc.owningcollname Dissertations and Theses (Ph.D. and Master's)
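
Note on the reported metrics: the abstract evaluates partial-charge predictions with a Pearson R-value and an RMSE. The sketch below shows how these two quantities are commonly computed. It is not taken from the dissertation; the function names and the charge values are hypothetical and serve only as an illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def rmse(xs, ys):
    """Root-mean-square error between two equal-length sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Hypothetical partial charges (units of e); NOT data from the dissertation.
reference = [-0.27, 0.09, 0.09, 0.09, -0.115, 0.115]
predicted = [-0.25, 0.09, 0.10, 0.08, -0.115, 0.120]

print(f"Pearson R = {pearson_r(reference, predicted):.3f}")
print(f"RMSE      = {rmse(reference, predicted):.3f} e")
```

Pearson R is dimensionless, while RMSE carries the units of the compared quantity (here, the elementary charge e).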

