
Algorithm-Architecture Co-Design for Domain-Specific Accelerators in Communication and Artificial Intelligence

dc.contributor.author: Tao, Yaoyu
dc.date.accessioned: 2022-05-25T15:21:21Z
dc.date.available: 2022-05-25T15:21:21Z
dc.date.issued: 2022
dc.date.submitted: 2022
dc.identifier.uri: https://hdl.handle.net/2027.42/172593
dc.description.abstract: The past decade has witnessed an explosive growth of data and of the need for high-speed data communication and processing. This need continues to drive the development of new hardware for transmitting more data reliably and for processing more data to obtain a higher level of intelligence. This thesis explores an algorithm-architecture co-design approach to derive efficient solutions for domain-specific communication and machine learning accelerators. It focuses on advanced, compute-intensive accelerator designs for: 1) channel coding for data transmission, including polar codes and low-density parity-check (LDPC) codes, and 2) neural networks for machine learning, including the differentiable neural computer (DNC) and neural ordinary differential equations (NODE). It also covers the interdisciplinary area of AI-aided communication, exploring DNC-aided flip decoding for polar codes.

This work introduces a split-tree successive-cancellation list (SCL) decoder that divides a polar code's decoding tree into sub-trees following a split-tree algorithm. Through algorithm-architecture co-optimization, a 0.64mm2 40nm test chip implements a split-4, list-2, 8-frame-interleaved decoder that supports configurable code lengths up to 1024 bits and variable code rates.

This work advances LDPC codes in two aspects. 1) Taking inspiration from simulated annealing, we generalize the post-processor design using three methods: quenching, extended heating, and focused heating, each of which targets a different decoding error structure. The resulting post-processor is demonstrated to lower the error rate by two orders of magnitude. 2) We also present a fully parallel decoder for a (160, 80) regular-(2, 4) NB-LDPC code over the Galois field GF(64) in 65nm CMOS. The decoder employs fine-grained dynamic clock gating and decoder early termination to achieve a throughput of 1.22Gb/s and an energy efficiency of 3.03nJ/b.

This work contributes to the hardware acceleration of the DNC. We present HiMA, a tiled, history-based memory access engine with distributed memories in tiles. HiMA incorporates a traffic-aware multi-mode network-on-chip (NoC), an optimal submatrix-based memory partition, and a two-stage usage sort method that leverages the distributed tiles. We also create a distributed DNC (DNC-D) that allows almost all memory operations to be applied to local memories. In a 40nm design, HiMA running DNC and DNC-D demonstrates 39.1x higher speed, 164.3x better area efficiency, and 61.2x better power efficiency than state-of-the-art accelerators.

This work contributes to the hardware acceleration of NODE for improved modeling of continuous-time events. We carry out algorithm-architecture co-design on two fronts: 1) we propose adaptive neural activation sparsity for up to 80% complexity reduction while maintaining excellent training accuracy, and 2) we develop a multi-mode processing element (PE) design for NODE compute kernels, with configurable interconnects between PEs to handle a variety of numerical ODE solvers. Hardware efficiency is further enhanced by exploiting hardware reuse and a hierarchical memory.

Lastly, this work investigates the interdisciplinary area of AI-aided communication, applying the DNC to flip decoding of polar codes. We develop a new state and action encoding with a two-phase decoding flow. Simulation results show that the proposed DNC-aided SCL-Flip (DNC-SCLF) decoding achieves up to 0.34dB of additional coding gain, or a 54.2% reduction in the average number of decoding attempts, compared to prior works.
The five pieces of work presented in this thesis tackle hardware acceleration challenges in domain-specific computing through algorithm-architecture co-design. The results contribute to the development of hardware accelerators for channel coding and deep learning, carrying next-generation communication and artificial intelligence from theory to efficient hardware. (Two illustrative algorithm sketches follow the item record below.)
dc.language.iso: en_US
dc.subject: Algorithm-architecture co-design, channel coding, machine learning, polar decoder, LDPC decoder, nonbinary LDPC decoder, post-processing
dc.subject: successive-cancellation, differentiable neural computer, neural ordinary differential equation, DNC-aided flip polar decoding
dc.title: Algorithm-Architecture Co-Design for Domain-Specific Accelerators in Communication and Artificial Intelligence
dc.type: Thesis
dc.description.thesisdegreename: PhD (en_US)
dc.description.thesisdegreediscipline: Electrical and Computer Engineering
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember: Zhang, Zhengya
dc.contributor.committeemember: Mahlke, Scott
dc.contributor.committeemember: Flynn, Michael
dc.contributor.committeemember: Kim, Hun Seok
dc.subject.hlbsecondlevel: Electrical Engineering
dc.subject.hlbtoplevel: Engineering
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/172593/1/taoyaoyu_1.pdf
dc.identifier.doi: https://dx.doi.org/10.7302/4622
dc.identifier.orcid: 0000-0001-7500-5250
dc.identifier.name-orcid: Tao, Yaoyu; 0000-0001-7500-5250 (en_US)
dc.working.doi: 10.7302/4622 (en)
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)
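
Illustrative algorithm sketches

The abstract describes, among other techniques, NODE acceleration with adaptive activation sparsity and a simulated-annealing-inspired LDPC post-processor. The two Python sketches below are minimal, hypothetical illustrations of those general ideas only; they are not taken from the thesis, and every name in them (sparsify, node_forward, bit_flip_decode, post_process, the toy matrices, and all parameter choices) is an assumption made for illustration.

First, a fixed-step forward-Euler pass through a toy neural ODE, with a top-k magnitude threshold standing in for "adaptive neural activation sparsity" (the thesis's actual sparsity rule and its 80% complexity-reduction mechanism are not reproduced here):

import numpy as np

# Hypothetical sketch: a neural ODE dynamics function f(h, t) = tanh(W h + b),
# integrated with a fixed-step forward-Euler solver. sparsify() is a stand-in
# for the abstract's "adaptive neural activation sparsity": it keeps only the
# largest-magnitude activations so sparse hardware can skip the zeroed work.

rng = np.random.default_rng(0)
DIM = 64
W = rng.standard_normal((DIM, DIM)) / np.sqrt(DIM)  # toy dynamics weights
b = np.zeros(DIM)

def sparsify(h, keep_ratio=0.2):
    """Keep the top keep_ratio fraction of activations by magnitude (assumed rule)."""
    k = max(1, int(keep_ratio * h.size))
    thresh = np.partition(np.abs(h), -k)[-k]
    return np.where(np.abs(h) >= thresh, h, 0.0)

def dynamics(h, t):
    """f(h, t): the learned vector field a NODE block evaluates at each solver step."""
    return np.tanh(W @ sparsify(h) + b)

def node_forward(h0, t0=0.0, t1=1.0, steps=10):
    """Forward-Euler integration: h <- h + dt * f(h, t)."""
    h, dt = h0.copy(), (t1 - t0) / steps
    for i in range(steps):
        h = h + dt * dynamics(h, t0 + i * dt)
    return h

print(node_forward(rng.standard_normal(DIM))[:4])

A multi-mode PE array of the kind the abstract mentions would map the h + dt * f(h, t) update of different solvers (Euler, Heun, Runge-Kutta) onto configurable PE interconnects; this sketch shows only the simplest Euler case.

Second, a toy bit-flipping LDPC decoder wrapped in a post-processing loop that loosely mirrors the "heating" (perturb a stuck decision vector) and "quenching" (resume normal decoding) vocabulary from simulated annealing. The thesis's actual extended-heating and focused-heating methods, which target specific decoding error structures, are not reproduced:

import numpy as np

rng = np.random.default_rng(1)

# Tiny toy parity-check matrix (not the thesis's code); rows = checks, cols = bits.
H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 1, 0, 0, 1]], dtype=np.uint8)

def bit_flip_decode(y, max_iters=20):
    """Plain Gallager-style bit flipping: flip the bits failing the most checks."""
    x = y.copy()
    for _ in range(max_iters):
        syndrome = H @ x % 2
        if not syndrome.any():
            return x, True                     # all parity checks satisfied
        fails = H.T @ syndrome                 # per-bit count of failed checks
        x[fails == fails.max()] ^= 1           # flip the worst offenders
    return x, False

def post_process(y, rounds=5, heat_flips=2):
    """Annealing-flavored post-processing (illustrative only): 'heat' a stuck
    decision vector with random flips, then 'quench' by re-running the plain
    decoder, hoping to escape a trapping set."""
    x, ok = bit_flip_decode(y)
    for _ in range(rounds):
        if ok:
            break
        perturbed = x.copy()
        perturbed[rng.choice(len(x), heat_flips, replace=False)] ^= 1  # heating
        x, ok = bit_flip_decode(perturbed)                             # quenching
    return x, ok

y = np.array([1, 0, 1, 1, 0, 1], dtype=np.uint8)  # toy received hard decisions
print(post_process(y))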

