
Algorithm-Architecture Co-Design for Domain-Specific Accelerators in Communication and Artificial Intelligence

dc.contributor.author: Tao, Yaoyu
dc.date.accessioned: 2022-05-25T15:21:21Z
dc.date.available: 2022-05-25T15:21:21Z
dc.date.issued: 2022
dc.date.submitted: 2022
dc.identifier.uri: https://hdl.handle.net/2027.42/172593
dc.description.abstract: The past decade has witnessed an explosive growth of data and of the need for high-speed data communication and processing. This need continues to drive the development of new hardware for transmitting more data reliably and for processing more data to obtain a higher level of intelligence. This thesis explores an algorithm-architecture co-design approach to derive efficient solutions for domain-specific communication and machine learning accelerators. It focuses on advanced, compute-intensive accelerator designs for: 1) channel coding for data transmission, including polar codes and low-density parity-check (LDPC) codes, and 2) neural networks for machine learning, including the differentiable neural computer (DNC) and neural ordinary differential equations (NODE). It also covers the interdisciplinary area of AI-aided communication, exploring DNC-aided flip decoding for polar codes.

This work introduces a split-tree successive-cancellation list (SCL) decoder that divides a polar code's decoding tree into sub-trees following a split-tree algorithm. Through algorithm-architecture co-optimization, a 0.64mm2 40nm test chip implements a split-4, list-2, 8-frame-interleaved decoder that supports configurable code lengths up to 1024 bits and variable code rates.

This work advances LDPC codes in two aspects. 1) Taking inspiration from simulated annealing, we generalize the post-processor design using three methods: quenching, extended heating, and focused heating, each of which targets a different decoding error structure. The resulting post-processor is demonstrated to lower the error rate by two orders of magnitude. 2) We also present a fully parallel decoder for a (160, 80) regular-(2, 4) NB-LDPC code over the Galois field GF(64) in 65nm CMOS. The decoder employs fine-grained dynamic clock gating and decoder early termination to achieve a throughput of 1.22Gb/s and an energy efficiency of 3.03nJ/b.

This work contributes to the hardware acceleration of the DNC. We present HiMA, a tiled, history-based memory access engine with distributed memories in tiles. HiMA incorporates a traffic-aware multi-mode network-on-chip (NoC), an optimal submatrix-based memory partition, and a two-stage usage sort method that leverages the distributed tiles. We also create a distributed DNC (DNC-D) that allows almost all memory operations to be applied to local memories. In a 40nm design, HiMA running DNC and DNC-D demonstrates 39.1x higher speed, 164.3x better area efficiency, and 61.2x better power efficiency than state-of-the-art accelerators.

This work contributes to the hardware acceleration of NODE for improved modeling of continuous-time events. We carry out algorithm-architecture co-design on two fronts: 1) we propose adaptive neural activation sparsity for up to 80% complexity reduction while maintaining excellent training accuracy, and 2) we develop a multi-mode processing element (PE) design for NODE compute kernels, with configurable interconnects between PEs to handle a variety of numerical ODE solvers. Hardware efficiency is further enhanced by exploiting hardware reuse and a hierarchical memory.

Lastly, this work investigates the interdisciplinary area of AI-aided communication, applying the DNC to flip decoding of polar codes. We develop a new state and action encoding with a two-phase decoding flow. Simulation results show that the proposed DNC-aided SCL-Flip (DNC-SCLF) decoding achieves up to 0.34dB of additional coding gain, or a 54.2% reduction in the average number of decoding attempts, compared to prior works.
The five pieces of work presented in this thesis tackle hardware acceleration challenges in domain-specific computing through algorithm-architecture co-design. The results contribute to the development of hardware accelerators for channel coding and deep learning, carrying next-generation communication and artificial intelligence from theory to efficient hardware. (Two illustrative algorithm sketches follow the item record below.)
dc.language.iso: en_US
dc.subject: Algorithm-architecture co-design, channel coding, machine learning, polar decoder, LDPC decoder, nonbinary LDPC decoder, post-processing
dc.subject: successive-cancellation, differentiable neural computer, neural ordinary differential equation, DNC-aided flip polar decoding
dc.title: Algorithm-Architecture Co-Design for Domain-Specific Accelerators in Communication and Artificial Intelligence
dc.type: Thesis
dc.description.thesisdegreename: PhD (en_US)
dc.description.thesisdegreediscipline: Electrical and Computer Engineering
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.contributor.committeemember: Zhang, Zhengya
dc.contributor.committeemember: Mahlke, Scott
dc.contributor.committeemember: Flynn, Michael
dc.contributor.committeemember: Kim, Hun Seok
dc.subject.hlbsecondlevel: Electrical Engineering
dc.subject.hlbtoplevel: Engineering
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/172593/1/taoyaoyu_1.pdf
dc.identifier.doi: https://dx.doi.org/10.7302/4622
dc.identifier.orcid: 0000-0001-7500-5250
dc.identifier.name-orcid: Tao, Yaoyu; 0000-0001-7500-5250 (en_US)
dc.working.doi: 10.7302/4622 (en)
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)
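
Illustrative algorithm sketches

The abstract describes, among other techniques, NODE acceleration with adaptive activation sparsity and a simulated-annealing-inspired LDPC post-processor. The two Python sketches below are minimal, hypothetical illustrations of those general ideas only; they are not taken from the thesis, and every name in them (sparsify, node_forward, bit_flip_decode, post_process, the toy matrices, and all parameter choices) is an assumption made for illustration.

First, a fixed-step forward-Euler pass through a toy neural ODE, with a top-k magnitude threshold standing in for "adaptive neural activation sparsity" (the thesis's actual sparsity rule and its 80% complexity-reduction mechanism are not reproduced here):

import numpy as np

# Hypothetical sketch: a neural ODE dynamics function f(h, t) = tanh(W h + b),
# integrated with a fixed-step forward-Euler solver. sparsify() is a stand-in
# for the abstract's "adaptive neural activation sparsity": it keeps only the
# largest-magnitude activations so sparse hardware can skip the zeroed work.

rng = np.random.default_rng(0)
DIM = 64
W = rng.standard_normal((DIM, DIM)) / np.sqrt(DIM)  # toy dynamics weights
b = np.zeros(DIM)

def sparsify(h, keep_ratio=0.2):
    """Keep the top keep_ratio fraction of activations by magnitude (assumed rule)."""
    k = max(1, int(keep_ratio * h.size))
    thresh = np.partition(np.abs(h), -k)[-k]
    return np.where(np.abs(h) >= thresh, h, 0.0)

def dynamics(h, t):
    """f(h, t): the learned vector field a NODE block evaluates at each solver step."""
    return np.tanh(W @ sparsify(h) + b)

def node_forward(h0, t0=0.0, t1=1.0, steps=10):
    """Forward-Euler integration: h <- h + dt * f(h, t)."""
    h, dt = h0.copy(), (t1 - t0) / steps
    for i in range(steps):
        h = h + dt * dynamics(h, t0 + i * dt)
    return h

print(node_forward(rng.standard_normal(DIM))[:4])

A multi-mode PE array of the kind the abstract mentions would map the h + dt * f(h, t) update of different solvers (Euler, Heun, Runge-Kutta) onto configurable PE interconnects; this sketch shows only the simplest Euler case.

Second, a toy bit-flipping LDPC decoder wrapped in a post-processing loop that loosely mirrors the "heating" (perturb a stuck decision vector) and "quenching" (resume normal decoding) vocabulary from simulated annealing. The thesis's actual extended-heating and focused-heating methods, which target specific decoding error structures, are not reproduced:

import numpy as np

rng = np.random.default_rng(1)

# Tiny toy parity-check matrix (not the thesis's code); rows = checks, cols = bits.
H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 1, 0, 0, 1]], dtype=np.uint8)

def bit_flip_decode(y, max_iters=20):
    """Plain Gallager-style bit flipping: flip the bits failing the most checks."""
    x = y.copy()
    for _ in range(max_iters):
        syndrome = H @ x % 2
        if not syndrome.any():
            return x, True                     # all parity checks satisfied
        fails = H.T @ syndrome                 # per-bit count of failed checks
        x[fails == fails.max()] ^= 1           # flip the worst offenders
    return x, False

def post_process(y, rounds=5, heat_flips=2):
    """Annealing-flavored post-processing (illustrative only): 'heat' a stuck
    decision vector with random flips, then 'quench' by re-running the plain
    decoder, hoping to escape a trapping set."""
    x, ok = bit_flip_decode(y)
    for _ in range(rounds):
        if ok:
            break
        perturbed = x.copy()
        perturbed[rng.choice(len(x), heat_flips, replace=False)] ^= 1  # heating
        x, ok = bit_flip_decode(perturbed)                             # quenching
    return x, ok

y = np.array([1, 0, 1, 1, 0, 1], dtype=np.uint8)  # toy received hard decisions
print(post_process(y))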

