Distributed, Intelligent Audio Sensing Enabled by Low-Power Integrated Technologies
Cho, Minchang
2020
Abstract
Distributed audio sensing promises to enable a wide variety of applications that improve human life. Despite continued efforts, however, state-of-the-art audio sensor node systems remain centimeter-scale in size, preventing truly ubiquitous and unobtrusive deployment. Meanwhile, silicon technology has advanced remarkably under Moore's Law, opening a new opportunity to realize millimeter-scale computing. In this dissertation, we explore how to develop a millimeter-scale wireless audio sensor node system by combining integrated silicon technology, machine learning, and low-power circuit techniques.

This dissertation first presents an audio processing IC that performs audio acquisition and compression while consuming 4.7 uW. A new low-power compression algorithm and its hardware accelerator consume only 1.5 uW to provide 4-32x real-time audio compression, and a newly designed custom 8Mb embedded NOR Flash enables seamless audio streaming through a ping-pong buffering scheme.

Second, a neural network processor with picowatt-level standby power is introduced for sensor applications. By combining a custom instruction set architecture, a compact SIMD microarchitecture, and ultra-low-leakage SRAM, the processor consumes only 440 pW in standby mode while achieving 400 GOPS/W energy efficiency in active mode, suitable for modest neural network workloads on miniaturized sensor platforms. The proposed neural network processor is integrated into an acoustic object detection sensor system and successfully demonstrates a >90% positive detection rate and a <3% false alarm rate in detecting 5 acoustic targets.

The next part of this dissertation presents a voice and acoustic activity detector that uses a mixer-based architecture and an ultra-low-power neural-network-based classifier. By sequentially scanning the 4 kHz frequency band and down-converting it to below 500 Hz, feature extraction power consumption is reduced by 4x.
The neural network processor employs computational sprinting, enabling a 12x power reduction. The system also features inaudible acoustic signature detection for intentional, silent remote wakeup of the system, re-using a subset of the same system components. Measurement results show 91.5%/90% speech/non-speech hit rates at 10 dB SNR in babble noise with 142 nW power consumption. Acoustic signature detection consumes 66 nW and successfully detects a signature 10 dB below the noise level.

Finally, two generations of complete, fully functional, energy-autonomous audio sensor nodes with millimeter-scale form factors are demonstrated. The systems integrate the proposed audio processing ICs and neural network processor with a MEMS microphone, a general-purpose microprocessor, 8Mb Flash memories, an RF transceiver with a custom antenna, PV cells for energy harvesting and optical communication, and millimeter-sized batteries. The complete stand-alone systems achieve 1 hour (1st gen.) and 3.2 hours (2nd gen.) of continuous speech recording with energy-autonomous operation under room light. The research in this dissertation paves the way toward distributed, intelligent audio sensing and computing.

Subjects
audio sensor node; low power circuits; voice activity detector; neural network processor; acoustic object detector; audio compression
Types
Thesis