Digging Deeper into the Methods of Computational Chemistry
Kammeraad, Joshua
2020
Abstract
This dissertation applies a skeptical but hopeful analytical paradigm and the tools of linear algebra, numerical methods, and machine learning to a diversity of problems in computational chemistry. When the foundation underlying a project is undermined, the primary purpose of the project becomes digging into the nature and structure of the problem. A common theme emerges in which assumptions in an area are challenged and a deeper understanding of the problem structure leads to new insights. In chapter 2, this approach is exploited to approximate derivative coupling vectors, which together with the difference gradient span the branching planes of conical intersections between electronic states. While gradients are commonly available in many electronic structure methods, the derivative coupling vectors are not always implemented and ready for use in characterizing conical intersections. An approach is introduced which computes the derivative coupling vector with high accuracy (direction and magnitude) using energy and gradient information. The new method is based on the combination of a linear-coupling two-state Hamiltonian and a finite-difference Davidson approach for computing the branching plane. Benchmark cases are provided showing these vectors can be efficiently computed near conical intersections. In chapter 3, this approach yields a countercultural explanation for what machine learning algorithms have learned in modeling a chemical reactivity dataset. Data-driven models of chemical reactions, a departure from conventional chemical approaches, have recently been shown to be statistically successful using machine learning. These models, however, are largely black box in character and have not provided the kind of chemical insights that historically advanced the field of chemistry. 
The chapter examines the knowledge base of machine learning models (what does the machine learn?) by deconstructing black-box machine learning models of a diverse chemical reaction dataset. Through experimentation with chemical representations and modeling techniques, the analysis provides insight into how statistical accuracy can arise even when the model lacks informative physical principles. By peeling back the layers of these complicated models, we arrive at a minimal, chemically intuitive model with no machine learning involved. This model is based on systematic reaction type classification and Evans-Polanyi relationships within reaction types, which are easily visualized and interpreted. Through exploring this simple model, we gain a deeper understanding of the dataset and uncover a means for expert interactions to improve the model’s reliability. In chapter 4, human-algorithm interaction is explored as a paradigm for generating representative ensembles of conformers for organic compounds, a challenging problem in computational chemistry with implications for the ability to understand and predict reactivity. The approach utilizes the molecular editor IQmol as an interface between chemists and reinforcement learning algorithms, with the cheminformatics package RDKit as a backbone. Conformer ensembles are evaluated by uniqueness and by the approximation they yield of the partition function. Prototype results are presented for a standard reinforcement learning algorithm tested on linear alkanes and for chemist manipulation of a fragment of the biomolecule lignin. Future aims and directions for this young project are discussed. The concluding chapter reflects on the broader lessons learned from conducting the dissertation. It discusses open questions and potential paradigms for pursuing them.
Subjects
computational chemistry
data science
Types
Thesis