
Reinforcement Learning based Sequential and Robust Bayesian Optimal Experimental Design

dc.contributor.author: Shen, Wanggang
dc.date.accessioned: 2023-09-22T15:35:49Z
dc.date.available: 2023-09-22T15:35:49Z
dc.date.issued: 2023
dc.date.submitted: 2023
dc.identifier.uri: https://hdl.handle.net/2027.42/177996
dc.description.abstract: Optimal experimental design (OED) is a statistical approach aimed at designing experiments to extract maximum information from them. It entails carefully selecting experimental conditions to effectively achieve specific objectives, such as minimizing the uncertainty associated with model parameters. OED is highly valuable in fields such as engineering, physics, chemistry, and biology, whether to optimize the performance of a system or to gain a deeper understanding of a phenomenon. While conventional OED approaches predominantly focus on batch experimental designs that maximize the expected information gain on model parameters, several active research questions merit further investigation: 1. How can we optimally design a sequence of experiments, fully capturing the information offered by earlier experiments to adaptively update later ones? 2. How can we expand the OED objective function to include design metrics beyond model parameter inference, such as model discrimination and goal-oriented prediction? 3. How can we incorporate robustness into OED?

To address these questions, we first present a mathematical framework and computational methods to optimally design a finite number of sequential experiments. We formulate this sequential OED (sOED) problem as a finite-horizon partially observable Markov decision process (POMDP) in a Bayesian setting with information-theoretic utilities. sOED then seeks an optimal design policy that incorporates elements of both feedback and lookahead, generalizing the suboptimal batch and greedy designs. We solve for the sOED policy numerically via policy gradient (PG) methods from reinforcement learning, and provide a derivation of the PG expression for sOED. Adopting an actor-critic approach, we parameterize the policy and value functions with deep neural networks and improve them using gradient estimates produced from simulated episodes of designs and observations. The overall PG-sOED method is validated on a linear-Gaussian benchmark, and its advantages over batch and greedy designs are demonstrated on a contaminant source inversion problem in a convection-diffusion field.

Building upon sOED, we introduce variational sequential OED (vsOED) to further accelerate the design process. Specifically, we adopt a lower bound estimator for the expected utility through variational approximations to the Bayesian posteriors. The optimal design policy is obtained numerically by simultaneously maximizing the variational lower bound and performing policy gradient updates. We demonstrate this general methodology on a range of OED problems targeting parameter inference, model discrimination, and goal-oriented prediction. These cases encompass explicit and implicit likelihoods, nuisance parameters, and physics-based partial differential equation models. Our vsOED results indicate substantially improved sample efficiency and a reduced number of forward model simulations compared to previous sequential design algorithms.

To design experiments in a robust manner, we further introduce robust OED (rOED). We employ the utility variance as a measure of design robustness and introduce a variance-penalized objective formulation that trades off maximizing expected utility (optimality) against minimizing utility variance (robustness). To accurately estimate the variance-penalized objective, we propose a double-nested Monte Carlo estimator, enhanced with sampling techniques that improve its efficiency. The accuracy and convergence of the proposed estimator are validated on benchmark examples and a sensor placement problem for source inversion in a diffusion field with building obstacles. Lastly, we formulate robust sequential OED (rsOED), which combines the principles of sequential design with the variance-penalized robust objective. We derive the policy gradient expressions for rsOED to enable a solution algorithm, and validate its performance on a nonlinear numerical example.
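For orientation, the objectives described in the abstract can be sketched in the following notation; the symbols (design policy pi, history I_N, variational density q_phi, utility u, and penalty weight alpha) are illustrative shorthand, not the dissertation's own notation.

% Notational sketch (assumed symbols, not taken from the dissertation):
% I_N = (d_0, y_0, ..., d_{N-1}, y_{N-1}) is the history of designs and observations
% generated by following a design policy \pi over N experiments; \theta are the
% model parameters with prior p(\theta).
\begin{align}
  % Expected terminal information gain on \theta under policy \pi:
  U(\pi) &= \mathbb{E}_{I_N \mid \pi}\!\left[ D_{\mathrm{KL}}\!\big( p(\theta \mid I_N) \,\|\, p(\theta) \big) \right], \\
  % A variational lower bound using an approximate posterior q_\phi, tightened by
  % maximizing over \phi alongside the policy gradient updates:
  U(\pi) &\geq \mathbb{E}_{\theta,\, I_N \mid \pi}\!\left[ \ln q_\phi(\theta \mid I_N) - \ln p(\theta) \right], \\
  % A variance-penalized robust objective that trades expected utility against
  % utility variance through a weight \alpha \geq 0:
  J_\alpha &= \mathbb{E}\!\left[ u \right] - \alpha\, \mathrm{Var}\!\left[ u \right].
\end{align}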
dc.language.iso: en_US
dc.subject: optimal experimental design
dc.subject: sequential optimal experimental design
dc.subject: robust optimal experimental design
dc.subject: Bayesian experimental design
dc.subject: policy gradient
dc.subject: variational estimator
dc.title: Reinforcement Learning based Sequential and Robust Bayesian Optimal Experimental Design
dc.type: Thesis
dc.description.thesisdegreename: PhD (en_US)
dc.description.thesisdegreediscipline: Mechanical Engineering
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember: Huan, Xun
dc.contributor.committeemember: Banovic, Nikola
dc.contributor.committeemember: Garikipati, Krishna
dc.contributor.committeemember: Marzouk, Youssef
dc.subject.hlbsecondlevel: Mechanical Engineering
dc.subject.hlbtoplevel: Engineering
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/177996/1/wgshen_1.pdf
dc.identifier.doi: https://dx.doi.org/10.7302/8453
dc.identifier.orcid: 0000-0002-6824-9393
dc.identifier.name-orcid: Shen, Wanggang; 0000-0002-6824-9393 (en_US)
dc.working.doi: 10.7302/8453 (en)
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)

