High-Performance Process-in-Memory Architectures Design and Security Analysis

Wang, Ziyu

2024

View/Open

ziwa_1.pdf (19.9MB PDF)

Abstract

The performance of processor-centric von Neumann architectures is greatly hindered by data movement between memory and processor, especially for data-intensive tasks. Memory-centric process-in-memory (PIM) architectures perform computations directly within the memory modules, so the performance and energy penalties associated with data access can be mitigated by minimizing data movement and leveraging high internal bandwidth. In addition to the benefits in performance and energy efficiency, PIM architectures facilitate extensive computing parallelism and scalability, while also providing enhanced resilience against bus-snooping attacks. PIM architectures have shown their capability in many machine learning applications. Nonetheless, effectively accommodating ultra-large deep neural network (DNN) models, such as Transformers, remains an ongoing challenge, and with the continued adoption of PIM architectures, security vulnerabilities are poised to become looming threats.

This dissertation focuses on high-performance PIM architecture design for data-intensive applications. To facilitate PIM architecture design and security studies, the dissertation first proposes event-driven, cycle-accurate simulators for PIM architectures based on dynamic random-access memory (DRAM) and resistive random-access memory (RRAM), describes their implementations, and shows how these simulators can be used for architecture design. The PIM-GPT architecture is then introduced, which offers high performance, high energy efficiency, and end-to-end acceleration of GPT inference. PIM-GPT leverages DRAM-based PIM solutions to perform multiply-accumulate (MAC) operations on the DRAM chips, working together with an application-specific integrated circuit (ASIC) that supports data communication and other necessary arithmetic computations. At the software level, the mapping scheme is designed to maximize data locality and computation parallelism by partitioning a matrix among DRAM channels and banks to utilize all in-bank computation resources concurrently. Overall, on 8 GPT models, PIM-GPT achieves speedups of 41-137x and 631-1074x and energy-efficiency gains of 123-383x and 320-602x over GPU and CPU baselines, respectively.

Two security and vulnerability investigations are then conducted on RRAM-based analog PIM architectures. These studies employ a dynamic power trace modeling approach at runtime, enabling efficient power and timing side-channel analysis. The susceptibility of PIM architectures to side-channel attacks is analyzed, and the study reveals that complete DNN model architectural information can be extracted solely from power trace measurements, without prior knowledge of the DNN. A further potential security vulnerability is identified, wherein an adversary can reconstruct a user's private input data through a power side-channel attack, given proper data acquisition and pre-processing. The study employs a machine learning-based attack approach that uses a generative adversarial network (GAN) to enhance data reconstruction. Notably, these findings illustrate the effectiveness of specific attack methodologies in extracting DNN model structures and user inputs from analog PIM accelerator power leakage, even in the presence of substantial noise. Countermeasures against these side-channel attacks are also discussed.

In light of these security challenges, there is a growing demand for secure hardware systems capable of providing robust solutions for identification, authentication, and protection against counterfeiting and unauthorized modifications. Physical unclonable functions (PUFs) emerge as a valuable technique for establishing a hardware root of trust. A PUF system built upon fingerprint-like random planar structures is developed, demonstrating compatibility with the back-end-of-line (BEOL) process and showing promise as a hardware security primitive for the IoT industry. Finally, guiding principles and proposals for future work are discussed, focusing on three key aspects: 1) hardware modeling and simulation of emerging PIM architectures; 2) hardware/software co-optimization for Transformer models; and 3) security and vulnerabilities in neuromorphic computing systems.
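
As a rough illustration of the mapping idea described in the abstract (partitioning a matrix among DRAM channels and banks so that all in-bank compute resources work at once), the following Python sketch splits the rows of a matrix across hypothetical (channel, bank) pairs and assembles a matrix-vector product from per-bank multiply-accumulate results. The function names, the channel and bank counts, and the row-wise split are assumptions made for illustration; the abstract does not specify the dissertation's actual mapping scheme.

```python
# Illustrative sketch only: names, channel/bank counts, and the row-wise split
# are assumptions, not the dissertation's actual mapping scheme.

def partition_rows(num_rows, num_channels, num_banks):
    """Assign contiguous row ranges to (channel, bank) pairs as evenly as possible."""
    units = num_channels * num_banks
    base, extra = divmod(num_rows, units)
    mapping = {}
    start = 0
    for unit in range(units):
        # The first `extra` units take one additional row each.
        size = base + (1 if unit < extra else 0)
        channel, bank = divmod(unit, num_banks)
        mapping[(channel, bank)] = range(start, start + size)
        start += size
    return mapping


def bank_mac(matrix, vector, rows):
    """Multiply-accumulate carried out by one bank on the rows it stores."""
    return [sum(w * x for w, x in zip(matrix[r], vector)) for r in rows]


def pim_matvec(matrix, vector, num_channels, num_banks):
    """Matrix-vector product assembled from per-bank partial results."""
    mapping = partition_rows(len(matrix), num_channels, num_banks)
    result = []
    # Real banks would operate concurrently; this loop visits them in sequence.
    for key in sorted(mapping):
        result.extend(bank_mac(matrix, vector, mapping[key]))
    return result
```

For example, `pim_matvec(weights, activations, num_channels=2, num_banks=2)` spreads the rows of `weights` over four banks. Because each bank holds a disjoint set of rows, every bank needs only the shared input vector and no partial sums are exchanged between banks, which is one simple way to obtain both locality and parallelism.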

Deep Blue DOI

https://dx.doi.org/10.7302/23046

Subjects

process-in-memory

machine learning accelerator

side-channel attack

hardware security

dynamic random-access memory

resistive random-access memory

Types

Thesis

Handle

https://hdl.handle.net/2027.42/193401

Collections

Dissertations and Theses (Ph.D. and Master's)
