Efficient and Dependable Deep Learning Systems
Latifi, Salar
2023
Abstract
Deep learning is transforming a wide range of applications, such as voice assistants, autonomous vehicles and driver-assistance technologies, and e-commerce. Given this impact on daily life, researchers are pushing towards deep learning models with higher quality and better performance. However, the trend of training bigger models could lead to higher demands on the hardware resources needed to process and serve them in an efficient and timely manner. While there are also significant efforts in the hardware domain to adapt to future, more expensive models and applications, considering at-scale inference performance while designing novel deep learning architectures could also have a vital impact on the efficacy and practicality of these models. At the same time, deep neural networks (DNNs) are starting to emerge in mission-critical applications, including autonomous vehicles and precision medicine. Therefore, another important question is the dependability of DNNs and the trustworthiness of their predictions. Considering the irreparable damage that can be caused by mispredictions, assessing their potential misbehavior is necessary for safe deployment. In this dissertation, I aim to tackle both of these problems. The goal is to optimize different deep learning applications with respect to the reliability of their predictions, and to improve their inference performance by reducing their latency and energy requirements. In the first two parts of this dissertation, I focus on vision models, which are widely used in mission-critical applications such as self-driving cars and precision medicine, with the aim of their efficient and safe deployment.
In the final part, I focus mainly on the performance and efficiency of at-scale inference for recommendation systems, which are widely used in e-commerce and online advertising and are receiving significant attention due to their large financial impact on the industry. In Chapter II, I focus on the dependability of image classifiers. First, I characterize modern image classifiers and explore the root causes of the misclassifications they exhibit. Then, I analyze traditional confidence threshold checking as a reliability metric and show its deficiencies. Finally, I propose a heterogeneous solution based on modular redundancy that detects up to 50% of the mispredictions while preserving the original accuracy of the baseline model. In Chapter III, I extend the lessons from the classification space to object detection. I analyze state-of-the-art object detectors for different causes of unreliability. Next, I modify the modular-redundancy-based heterogeneous system proposed in Chapter II to meet the low-latency and high-throughput requirements of object detectors. In addition, I propose a new fusion algorithm that combines the predictions of the individual modules, with the goals of dependability and of recovering objects that the baseline model fails to detect. Finally, in Chapter IV, I focus on designing hardware-aware and efficient Transformer architectures for language modeling tasks. First, I discuss the significant performance cost of multi-head attention. Then I introduce the PLANER optimizer, which takes an existing Transformer-based network and a user-defined latency target and automatically produces an optimized, sparsely-activated version of the original network that tries to meet the latency target while maintaining baseline accuracy.
Subjects
Optimizing Deep Learning Applications; Deep Learning Reliability; Energy Efficiency and Latency Optimization
Types
Thesis