High-Performance Process-in-Memory Architectures Design and Security Analysis
Wang, Ziyu
2024
Abstract
The performance of processor-centric von Neumann architectures is greatly hindered by data movement between memory and processor, especially for data-intensive tasks. Memory-centric process-in-memory (PIM) architectures perform computations directly within the memory modules. Hence, the performance and energy penalties associated with data access can be mitigated by minimizing data movement and leveraging high internal bandwidth. In addition to the benefits in performance and energy efficiency, PIM architectures facilitate extensive computing parallelism and scalability, while also providing enhanced resilience against bus-snooping attacks. PIM architectures have shown their capability in many machine learning applications. Nonetheless, effectively accommodating ultra-large deep neural network (DNN) models, such as Transformers, remains an ongoing challenge, and with the continued adoption of PIM architectures, security and vulnerability issues are poised to become looming threats. This dissertation focuses on high-performance PIM architecture design for data-intensive applications. To facilitate PIM architecture design and security studies, the dissertation first proposes event-driven, cycle-accurate simulators and their implementations for PIM architectures based on dynamic random-access memory (DRAM) and resistive random-access memory (RRAM), and shows how these simulators can be used for architecture design. The PIM-GPT architecture is then introduced, which offers high performance, high energy efficiency and end-to-end acceleration of GPT inference. PIM-GPT leverages DRAM-based PIM solutions to perform multiply-accumulate (MAC) operations on the DRAM chips, working together with an application-specific integrated circuit (ASIC) that supports data communication and other necessary arithmetic computations. 
At the software level, the mapping scheme is designed to maximize data locality and computation parallelism by partitioning a matrix among DRAM channels and banks to utilize all in-bank computation resources concurrently. Overall, across 8 GPT models, PIM-GPT achieves 41-137x speedup and 123-383x higher energy efficiency over the GPU baseline, and 631-1074x speedup and 320-602x higher energy efficiency over the CPU baseline. Two security and vulnerability investigations are then conducted on RRAM-based analog PIM architectures. These studies employ a dynamic power trace modeling approach at runtime, enabling efficient power and timing side-channel analysis. The susceptibility of PIM architectures to side-channel attacks is analyzed, and the first study reveals the possibility of extracting complete DNN model architectural information solely from power trace measurements, without prior DNN knowledge. Furthermore, another potential security vulnerability is identified, wherein an adversary can reconstruct a user's private input data through a power side-channel attack, given proper data acquisition and pre-processing. This second study employs a machine learning-based attack approach utilizing a generative adversarial network (GAN) to enhance data reconstruction. Notably, these findings illustrate the effectiveness of specific attack methodologies in extracting DNN model structures and user inputs from analog PIM accelerator power leakage, even in the presence of substantial noise levels. Countermeasures against these side-channel attacks are also discussed. In light of these security challenges, there is a growing demand for secure hardware systems capable of providing robust solutions for identification, authentication, and protection against counterfeiting and unauthorized modifications. Physical unclonable functions (PUFs) emerge as a valuable technique for hardware root-of-trust. 
A PUF system built upon fingerprint-like random planar structures is developed, demonstrating compatibility with the back-end-of-line (BEOL) process and showing promise as a hardware security primitive in the IoT industry. Finally, guiding principles and proposals for future work are discussed, focusing on three key aspects: 1) hardware modeling and simulation of emerging PIM architectures; 2) hardware/software co-optimization for Transformer models; and 3) security and vulnerabilities in neuromorphic computing systems.
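The mapping idea mentioned in the abstract, partitioning a matrix among DRAM channels and banks so that every bank performs MAC operations on its own slice, can be illustrated with a minimal sketch. The abstract does not specify the partitioning scheme, so the row-wise split below is an assumption, and all names (`partition_rows`, `bank_mac`, `pim_matvec`) are hypothetical rather than taken from the dissertation.

```python
# Illustrative sketch only: row-wise partitioning is an assumption, not the
# dissertation's actual mapping scheme. All names here are hypothetical.

def partition_rows(num_rows, num_channels, banks_per_channel):
    """Assign contiguous row ranges to (channel, bank) pairs as evenly as possible."""
    units = num_channels * banks_per_channel
    base, extra = divmod(num_rows, units)
    mapping = {}
    start = 0
    for unit in range(units):
        # The first `extra` banks take one additional row each.
        count = base + (1 if unit < extra else 0)
        channel, bank = divmod(unit, banks_per_channel)
        mapping[(channel, bank)] = range(start, start + count)
        start += count
    return mapping


def bank_mac(matrix, vector, rows):
    """Multiply-accumulate performed by one bank on the rows it stores."""
    return [sum(w * x for w, x in zip(matrix[r], vector)) for r in rows]


def pim_matvec(matrix, vector, num_channels, banks_per_channel):
    """Matrix-vector product assembled from per-bank partial results."""
    mapping = partition_rows(len(matrix), num_channels, banks_per_channel)
    result = []
    # Banks are visited sequentially here; in hardware they would run concurrently.
    for key in sorted(mapping):
        result.extend(bank_mac(matrix, vector, mapping[key]))
    return result


if __name__ == "__main__":
    weights = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [1, 0, 1], [0, 1, 0]]
    activations = [1, 2, 3]
    print(pim_matvec(weights, activations, num_channels=2, banks_per_channel=2))
```

In this sketch each bank only touches the rows it holds, which mirrors the data-locality goal stated in the abstract; the input vector is the only data shared with every bank.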
Subjects
process-in-memory; machine learning accelerator; side-channel attack; hardware security; dynamic random-access memory; resistive random-access memory
Types
Thesis