
In-SRAM Computing for Neural Network Acceleration

dc.contributor.author: Eckert, Charles
dc.date.accessioned: 2023-01-30T16:10:18Z
dc.date.available: 2023-01-30T16:10:18Z
dc.date.issued: 2022
dc.date.submitted: 2022
dc.identifier.uri: https://hdl.handle.net/2027.42/175620
dc.description.abstract: For decades, the computing paradigm has been composed of separate memory and compute units. Processing-in-Memory (PIM) has often been proposed as a way to break past the memory wall: compute logic is moved near the memory, reducing data movement. In-memory computing expands on PIM by morphing the memory into hybrid memory/compute units, where data can be stored and computed on in place. Recent work has modified SRAM arrays to allow logical operations to be performed directly inside the arrays; our work extends these basic logical operations and adds support for arithmetic operations. The growth of on-chip memory and the rising focus on near- and in-memory computing coincide with the ascendance of neural networks, which are highly data-parallel applications that are challenging to accelerate because they are data-bound, compute-bound, or both. In-memory computation reduces on-chip data movement and increases both the compute capacity and the storage available on a custom chip, which can greatly alleviate compute and data bottlenecks.

First, this thesis observes that SRAM has come to dominate the on-chip area of general-purpose processors. This area comes at the cost of compute potential, and it can be repurposed to serve as dual storage/compute units. Such repurposing greatly expands the chip's parallel compute capability and reduces on-chip data movement, all with minimal area increase, while the storage capacity can still be reclaimed with minimal overhead. Modifications to the SRAM arrays are presented that allow them to function as hybrid compute/storage units capable of arithmetic operations, along with a mapping strategy for running CNNs on the hybrid arrays.

Second, this thesis proposes a custom ASIC called Eidetic that uses hybrid compute/storage SRAM arrays as both its primary storage and its compute units. Repurposing a processor's cache is hamstrung by the need to preserve the cache's original function and area footprint; too many modifications would render the design undesirable to chip designers. By customizing the SRAM further, Eidetic builds more efficient processing elements (PEs), and the increased SRAM capacity allows more weights to be stored on-chip. Finally, the custom ASIC's control logic supports a graph-based programming model that further reduces off-chip data movement. These customizations allow Eidetic to target data-bound applications such as RNNs and MLPs.

Third, we present a detailed comparison of the in-cache and ASIC approaches to ML acceleration. Between repurposing the cache and building an SRAM-based custom ASIC, in-SRAM computing offers multiple viable approaches: in-cache computing is cheaper but comes with limitations, while an ASIC design carries a higher total cost of ownership (TCO). We compare the performance and energy efficiency of our repurposed cache against a server-class GPU and the baseline CPU, and we evaluate our custom ASIC against other state-of-the-art ASIC DNN accelerators. For both designs, we develop cycle-accurate simulators to determine performance.
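The bit-line computing idea in the abstract can be illustrated in miniature. The Python sketch below is written for this record, not taken from the thesis; SRAMArray, bitline_and_nor, and bit_serial_add are illustrative names. It models the commonly described in-SRAM primitive: activating two word-lines at once lets each sense amplifier read the wired-AND of the two cells on the bit-line and their NOR on the complementary bit-line, and with operands stored in a transposed (bit-serial) layout, a per-column carry latch turns those primitives into word-wide addition. The exact circuits in the thesis may differ.

    import numpy as np

    class SRAMArray:
        """Model an SRAM subarray: rows are word-lines, columns are bit-lines."""
        def __init__(self, rows, cols):
            self.cells = np.zeros((rows, cols), dtype=np.uint8)

        def write_row(self, row, bits):
            self.cells[row] = bits

        def bitline_and_nor(self, row_a, row_b):
            # Activating two word-lines together: the bit-line stays high only
            # if both cells hold 1 (wired-AND), while the complementary
            # bit-line yields NOR. Both results arrive per column.
            a, b = self.cells[row_a], self.cells[row_b]
            return a & b, 1 - (a | b)

    def bit_serial_add(arr, a_rows, b_rows, out_rows):
        # Operands are stored transposed: bit i of every word sits in one row,
        # one word per column. The carry lives in a per-column latch.
        carry = np.zeros(arr.cells.shape[1], dtype=np.uint8)
        for a_r, b_r, o_r in zip(a_rows, b_rows, out_rows):  # LSB first
            and_ab, nor_ab = arr.bitline_and_nor(a_r, b_r)
            xor_ab = 1 - (and_ab | nor_ab)      # bits differ
            arr.write_row(o_r, xor_ab ^ carry)  # sum bit, written back in place
            carry = and_ab | (xor_ab & carry)   # carry out
        return carry                            # final carry-out per column

    # Four 4-bit additions happen simultaneously, one per column.
    arr = SRAMArray(rows=12, cols=4)
    a_vals, b_vals = [3, 5, 9, 15], [1, 7, 6, 1]
    for bit in range(4):
        arr.write_row(bit,     [(v >> bit) & 1 for v in a_vals])
        arr.write_row(4 + bit, [(v >> bit) & 1 for v in b_vals])
    bit_serial_add(arr, range(0, 4), range(4, 8), range(8, 12))
    sums = [sum(int(arr.cells[8 + b, c]) << b for b in range(4)) for c in range(4)]
    print(sums)  # [4, 12, 15, 0]  (15 + 1 wraps mod 16; the carry-out flags it)

Each column acts as an independent lane, which is where the parallelism claimed for repurposed caches comes from: every subarray computes one word per bit-line, and all subarrays operate in lockstep.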
dc.language.iso: en_US
dc.subject: Computer Architecture
dc.subject: Neural Network Accelerator
dc.subject: In-memory computing
dc.title: In-SRAM Computing for Neural Network Acceleration
dc.type: Thesis
dc.description.thesisdegreename: PhD
dc.description.thesisdegreediscipline: Computer Science & Engineering
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember: Das, Reetuparna
dc.contributor.committeemember: Sylvester, Dennis Michael
dc.contributor.committeemember: Mudge, Trevor N
dc.contributor.committeemember: Tang, Lingjia
dc.subject.hlbsecondlevel: Computer Science
dc.subject.hlbtoplevel: Engineering
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/175620/1/eckertch_1.pdf
dc.identifier.doi: https://dx.doi.org/10.7302/6834
dc.identifier.orcid: 0000-0002-8839-9890
dc.identifier.name-orcid: Eckert, Charles; 0000-0002-8839-9890
dc.working.doi: 10.7302/6834
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)

