Synthesis and Evaluation of Automated Vehicles
Zhang, Songan
2021
Abstract
This dissertation focuses on the synthesis of a decision-making system for Automated Vehicles (AVs) and then evaluates the safety and robustness of that system with an eye toward improving its design.

We begin by synthesizing an AV's decision-making system in a specific driving environment. We model the environment as a Markov Decision Process (MDP), with the goal of determining the optimal strategy (that is, policy) for this particular MDP. We propose a novel Reinforcement Learning (RL) method using model-based exploration, which lets the training agent explore the MDP state space by treating its surprise about its experiences as an intrinsic motivation to be maximized. The resulting strategy approximates a globally optimal policy under which the AV travels more efficiently.

We then evaluate the decision-making system in a naturalistic driving environment. We focus on lane-change maneuvers, modeling the differences between AVs and Human-controlled Vehicles (HVs) using the Safety Pilot Model Deployment Program's naturalistic driving data. The probability of a crash serves as the primary metric for evaluating the safety of AV systems. Testing a system in a naturalistic driving environment is, in general, time-consuming and not cost-effective. To overcome this problem, we propose an accelerated evaluation method called Subset Simulation (SS), which significantly reduces evaluation time and outperforms the baseline Importance Sampling (IS) method. This technique is not only capable of evaluating a system with a high-dimensional state space, but also has the potential to evaluate more complicated systems (e.g., object detection systems).

The SS method is limited, however, in that the "danger regions" are discovered only as the test procedure unfolds: if the environmental statistics change, the crash rate can no longer be estimated accurately. We therefore prefer to evaluate the decision-making system without relying on the environmental statistics. To this end, we propose an evaluation method based on a two-player Markov game. We introduce an attacker into the environment that keeps "attacking" the AV in a socially acceptable fashion, trying to lure the AV into AV-responsible crashes (as opposed to "crazy" crashes of the attacker's own making). Once trained, the attacker is introduced into the environment to evaluate the AV. The crash rate of the system is 50 times greater in the environment with the attacker, which exposes fatal flaws in the original training environment design. Introducing attackers capable of generating socially acceptable attacks also makes the behavior of the surrounding vehicles more diverse.

Finally, our goal is to improve the original policy so as to obtain a decision-making system that remains safe and robust across environments with different types of drivers, different traffic densities, and differing numbers of surrounding vehicles. We tackle this problem with a state-of-the-art Meta-Reinforcement Learning (MRL) method that trains an agent to adapt quickly to different environments with limited data. The MRL-trained policy significantly decreases the crash rate with a small amount of data across different environments, and the technique has tremendous potential for helping the AV adapt quickly to varying conditions such as different locations, weather, and lighting.
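To make the surprise-maximizing exploration concrete, the following minimal Python sketch adds a model-prediction-error bonus to the environment reward. The linear dynamics model, the names, and the parameter beta are assumptions for illustration, not code from the dissertation.

import numpy as np

# Illustrative sketch only: a learned dynamics model predicts the next state,
# and its prediction error ("surprise") is added to the environment reward as
# an intrinsic bonus, pushing the agent toward poorly modeled regions of the
# MDP state space. The linear model is a stand-in for the learned model.

class LinearDynamicsModel:
    def __init__(self, state_dim, action_dim, lr=1e-2):
        self.W = np.zeros((state_dim, state_dim + action_dim))
        self.lr = lr

    def predict(self, state, action):
        return self.W @ np.concatenate([state, action])

    def surprise(self, state, action, next_state):
        # Intrinsic reward: squared prediction error of the dynamics model.
        return float(np.sum((next_state - self.predict(state, action)) ** 2))

    def update(self, state, action, next_state):
        x = np.concatenate([state, action])
        error = next_state - self.W @ x
        self.W += self.lr * np.outer(error, x)  # one SGD step on the model

def shaped_reward(model, s, a, s_next, r_env, beta=0.1):
    # Reward the RL agent trains on: environment reward plus surprise bonus.
    bonus = model.surprise(s, a, s_next)
    model.update(s, a, s_next)
    return r_env + beta * bonus

As the model improves on well-visited transitions, their bonus shrinks, so exploration is steered toward states the agent still predicts poorly.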
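The structure of the Subset Simulation estimator can be sketched as follows. SS writes a rare crash probability as a product of larger conditional probabilities over nested intermediate "danger" levels, each repopulated by Markov chains. This generic numpy version (the limit-state function g, the standard-Gaussian disturbance model, and all parameter choices are assumptions for illustration) is not the dissertation's implementation.

import numpy as np

def subset_simulation(g, dim, threshold, n=1000, p0=0.1, sigma=1.0, seed=0):
    # Estimates P(g(X) >= threshold) for X ~ N(0, I) as a product of
    # conditional probabilities over nested intermediate levels, each
    # repopulated with Metropolis chains seeded at the previous level.
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, dim))            # level 0: plain Monte Carlo
    G = np.array([g(x) for x in X])
    p_f, n_keep = 1.0, int(p0 * n)

    while True:
        order = np.argsort(G)[::-1]              # sort samples by g, descending
        b = G[order[n_keep - 1]]                 # next intermediate level
        if b >= threshold:                       # final level reached
            return p_f * np.mean(G >= threshold)
        p_f *= p0                                # one conditional factor per level
        seeds, seeds_g = X[order[:n_keep]], G[order[:n_keep]]
        X, G = [], []                            # repopulate {x : g(x) >= b}
        for x, gx in zip(seeds, seeds_g):
            for _ in range(n // n_keep):
                cand = x + sigma * rng.standard_normal(dim)
                # Metropolis accept under the N(0, I) target...
                if rng.random() < np.exp(0.5 * (x @ x - cand @ cand)):
                    g_cand = g(cand)
                    if g_cand >= b:              # ...rejecting moves that leave the level
                        x, gx = cand, g_cand
                X.append(x.copy()); G.append(gx)
        X, G = np.array(X), np.array(G)

For example, subset_simulation(lambda x: x[0] + x[1], dim=2, threshold=6.0) estimates a probability near 1e-5 from a few thousand samples per level, where plain Monte Carlo would need millions of samples.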
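The attacker's incentive in the two-player Markov game can be illustrated with a simple reward shape; the function and its arguments below are assumptions for the sketch, not the dissertation's code.

# The attacker is paid only for crashes the AV is responsible for and is
# penalized for causing "crazy" crashes itself, which keeps its attacks
# socially acceptable.

def attacker_reward(crashed: bool, av_at_fault: bool, step_cost: float = 0.01) -> float:
    if crashed and av_at_fault:
        return 1.0       # lured the AV into an AV-responsible crash
    if crashed:
        return -1.0      # the attacker caused the crash itself: penalized
    return -step_cost    # small per-step cost encourages efficient attacks

The asymmetry between the two crash outcomes is what distinguishes a socially acceptable attacker from an adversary that simply rams the AV.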
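One common MRL instantiation is gradient-based meta-learning in the style of MAML; whether the dissertation uses this exact algorithm is an assumption. The toy numpy sketch below meta-trains parameters on a stand-in quadratic objective so that a single inner gradient step on a new environment variant already adapts them, mirroring "quick adaptation with limited data."

import numpy as np

def inner_grad(theta, task_optimum):
    # Gradient of the stand-in task loss 0.5 * ||theta - task_optimum||^2.
    return theta - task_optimum

def maml(task_optima, theta, inner_lr=0.1, meta_lr=0.01, meta_iters=500):
    # First-order MAML: meta-train theta so that one inner gradient step
    # per task already lands near that task's optimum.
    for _ in range(meta_iters):
        meta_grad = np.zeros_like(theta)
        for task in task_optima:
            adapted = theta - inner_lr * inner_grad(theta, task)  # inner step
            meta_grad += inner_grad(adapted, task)                # first-order approx.
        theta = theta - meta_lr * meta_grad / len(task_optima)
    return theta

# Meta-train across three environment variants, then adapt to a new one
# with a single gradient step, standing in for limited-data adaptation.
task_optima = [np.array([1.0, -2.0]), np.array([3.0, 0.5]), np.array([-1.0, 1.5])]
theta_meta = maml(task_optima, theta=np.zeros(2))
new_task = np.array([2.0, 2.0])
theta_adapted = theta_meta - 0.1 * inner_grad(theta_meta, new_task)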
Subjects
Decision Making; Reinforcement Learning; Adversarial Attack; Meta-Reinforcement Learning
Types
Thesis