Reinforcement Learning based Sequential and Robust Bayesian Optimal Experimental Design
Shen, Wanggang
2023
Abstract
Optimal experimental design (OED) is a statistical approach aimed at designing experiments in order to extract maximum information from them. It entails carefully selecting experimental conditions to effectively achieve specific objectives, such as minimizing the uncertainty associated with the model parameters. OED is highly valuable in various fields such as engineering, physics, chemistry, and biology to optimize the performance of a system or to gain a deeper understanding of a phenomenon. While conventional OED approaches predominantly focus on batch experimental designs that maximize expected information gain on model parameters, there remain active research questions that merit further investigation: (1) How can we optimally design a sequence of experiments, and fully capture the information offered by earlier experiments to adaptively update the later ones? (2) How can we expand the OED objective function to include other design metrics beyond model parameter inference, such as model discrimination and goal-oriented predictions? (3) How can we incorporate robustness into OED? To address these questions, we first present a mathematical framework and computational methods to optimally design a finite number of sequential experiments. We formulate this sequential OED (sOED) problem as a finite-horizon partially observable Markov decision process (POMDP) in a Bayesian setting and with information-theoretic utilities. sOED then seeks an optimal design policy that incorporates elements of both feedback and lookahead, generalizing the suboptimal batch and greedy designs. We solve for the sOED policy numerically via policy gradient (PG) methods from reinforcement learning, and provide a derivation of the PG expression for sOED. Adopting an actor-critic approach, we parameterize the policy and value functions using deep neural networks and improve them using gradient estimates produced from simulated episodes of designs and observations.
The overall PG-sOED method is validated on a linear-Gaussian benchmark, and its advantages over batch and greedy designs are demonstrated through a contaminant source inversion problem in a convection-diffusion field. Building upon sOED, we introduce variational sequential OED (vsOED) to further accelerate the design process. Specifically, we adopt a lower bound estimator for the expected utility through variational approximation to the Bayesian posteriors. The optimal design policy is solved numerically by simultaneously maximizing the variational lower bound and performing policy gradient updates. We demonstrate this general methodology for a range of OED problems targeting parameter inference, model discrimination, and goal-oriented prediction. These cases encompass explicit and implicit likelihoods, nuisance parameters, and physics-based partial differential equation models. Our vsOED results indicate substantially improved sample efficiency and a reduced number of forward model simulations compared to previous sequential design algorithms. In order to design experiments in a robust manner, we further introduce robust OED (rOED). We employ the utility variance as a measure of design robustness and introduce a variance-penalized objective formulation that trades off between maximizing expected utility (optimality) and minimizing utility variance (robustness). To accurately estimate the variance-penalized objective, we propose a double-nested Monte Carlo estimator, enhanced by efficient sampling techniques. The accuracy and convergence of the proposed estimator are validated on benchmark examples and a sensor placement problem for source inversion in a diffusion field with building obstacles. Lastly, we formulate robust sequential OED (rsOED) that combines the principles of sequential design with the variance-penalized robust objective.
We provide a solution algorithm enabled by deriving the policy gradient expressions of rsOED, and validate its performance on a nonlinear numerical example.
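As a rough illustration of the variance-penalized objective described in the abstract, the sketch below estimates the expected utility minus a penalty times the utility variance for a one-dimensional linear-Gaussian model. Everything in it is an assumption made for illustration and is not taken from the thesis: the model y = theta * d + noise with a standard normal prior, the penalty weight `lam`, the function names, and the use of self-normalized importance sampling with prior draws to approximate the Kullback-Leibler utility. It is a simplified nested Monte Carlo stand-in, not the double-nested estimator or the sampling techniques proposed in the thesis.

```python
import math
import random


def log_likelihood(y, theta, d, sigma):
    """Log density of y ~ N(theta * d, sigma^2) (hypothetical toy model)."""
    z = (y - theta * d) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def kl_utility(y, d, sigma, prior_samples):
    """Estimate KL(posterior || prior) for a single observation y.

    Uses KL = E_{theta|y}[log p(y|theta,d)] - log p(y|d), with both terms
    approximated from prior draws via self-normalized importance weights.
    """
    logs = [log_likelihood(y, t, d, sigma) for t in prior_samples]
    shift = max(logs)  # log-sum-exp shift for numerical stability
    weights = [math.exp(v - shift) for v in logs]
    total = sum(weights)
    log_evidence = shift + math.log(total / len(logs))
    posterior_mean_loglik = sum(w * v for w, v in zip(weights, logs)) / total
    return posterior_mean_loglik - log_evidence


def variance_penalized_objective(d, lam, sigma=1.0, n_outer=400,
                                 n_inner=400, seed=0):
    """Return (mean - lam * variance, mean, variance) of the utility at design d."""
    rng = random.Random(seed)
    utilities = []
    for _ in range(n_outer):
        # Outer loop: simulate one experiment outcome from the prior predictive.
        theta = rng.gauss(0.0, 1.0)
        y = theta * d + rng.gauss(0.0, sigma)
        # Inner loop: fresh prior draws to approximate the utility for this y.
        inner = [rng.gauss(0.0, 1.0) for _ in range(n_inner)]
        utilities.append(kl_utility(y, d, sigma, inner))
    mean_u = sum(utilities) / n_outer
    var_u = sum((u - mean_u) ** 2 for u in utilities) / (n_outer - 1)
    return mean_u - lam * var_u, mean_u, var_u


if __name__ == "__main__":
    objective, mean_u, var_u = variance_penalized_objective(d=1.0, lam=1.0)
    print(objective, mean_u, var_u)
```

For this toy model the expected utility has the closed form 0.5 * log(1 + d^2 / sigma^2), which gives a simple check on the mean estimate; larger values of `lam` favor designs whose utility varies less across possible outcomes.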
Subjects
optimal experimental design; sequential optimal experimental design; robust optimal experimental design; Bayesian experimental design; policy gradient; variational estimator
Types
Thesis