## Ultra-Low Power Circuit Design for Miniaturized IoT Platform

by

## Wootaek Lim

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Electrical Engineering) in The University of Michigan 2018

Doctoral Committee:

Professor David Blaauw, Chair

Professor Hun-Seok Kim

Professor Jeffrey Scruggs

Professor Dennis M. Sylvester

Wootaek Lim

imhotep@umich.edu

ORCID iD : 0000-0002-8189-7838

© Wootaek Lim 2018

To my everything mother, great father, only brother, and lovely Sook

To my wife and life partner, Injung

# TABLE OF CONTENTS

- DEDICATION
- LIST OF FIGURES
- LIST OF TABLES
- ABSTRACT
- CHAPTER 1 Introduction
  - 1.1 Miniaturized mm-scale Internet of Things (IoT)
  - 1.2 Outline of the Dissertation
- CHAPTER 2 Battery-Less, sub-nW Cortex M0<sup>+</sup> Processor with Dynamic Leakage-Suppression Logic
  - 2.1 Introduction
  - 2.2 Dynamic Leakage-suppression Logic
  - 2.3 Battery-Less, sub-nW Cortex M0<sup>+</sup> Processor
  - 2.4 Measurement Result
  - 2.5 Conclusion
- CHAPTER 3 A Sub-nW 80mlx-to-1.26Mlx Self-Referencing Light-to-Digital Converter with AlGaAs Photodiode
  - 3.1 Introduction
  - 3.2 Temperature-compensated and self-referencing LDC
  - 3.3 Measurement Results
  - 3.4 Conclusion
- CHAPTER 4 A 380pW Dual Mode Optical Wake-up Receiver with Ambient Noise Cancellation
  - 4.1 Introduction
  - 4.2 Wake-up Receiver Operation Including Noise Cancellation
  - 4.3 Analog Front End (AFE) Implementation
  - 4.4 Digital Back End (DBE) Implementation
  - 4.5 Measurement Results
  - 4.6 Conclusions
- CHAPTER 5 A 13.8pJ/b Implantable Optical Transmitter Achieving 65kbps with < 10<sup>-6</sup> BER at 10mm Tissue Depth
  - 5.1 Introduction
  - 5.2 Energy-efficient implantable optical transmitter
  - 5.3 Measurement Result
- CHAPTER 6 Conclusions and Future Directions
  - 6.1 Conclusions
  - 6.2 Future Directions
- BIBLIOGRAPHY

# LIST OF FIGURES

- Figure 2.1. Issues with Battery-based WSN
- Figure 2.2. Limited life time of the battery by the number of charge cycle
- Figure 2.3. Low leakage circuit techniques
- Figure 2.4. Power consumption of chain of inverters implemented with different style logics
- Figure 2.5. Dynamic leakage-suppression (DLS) logic schematic
- Figure 2.6. DLS Inverter Steady State
- Figure 2.7. Dynamic operation of DLS inverter when IN $0 \rightarrow V_{DD}$
- Figure 2.8. Static noise margin of DLS inverter and standard inverter
- Figure 2.9. Leakage current comparison across V<sub>DD</sub>
- Figure 2.10. I<sub>ON</sub>/I<sub>OFF</sub> ratio of DLS inverter compared to stacked and standard topology
- Figure 2.11. Leakage current ratio of standard two-stacked inverter to DLS inverter
- Figure 2.12. Worst threshold skewed condition for DLS NAND2
- Figure 2.13. Transistor sizing methodology based on worst case $V_{TH}$ skew for DLS NAND2
- Figure 2.14. $(\delta_{HL} \times \delta_{LH})^{0.5}$ plot seeking to co-optimize robustness in both transitions
- Figure 2.15. Timing, leakage, and robustness results of DLS gates
- Figure 2.16. System block diagram with ARM® Cortex® M0+ implemented with DLS logic
- Figure 2.17. Measured waveforms of the system running at 5Hz, powered by 0.09mm<sup>2</sup> solar cell
- Figure 2.18. Measured power consumption of the system across supply voltage
- Figure 2.19. Measured power consumption across temperature at $V_{DD} = 0.55V$
- Figure 2.20. Measured energy per operation and maximum operating frequency of processor
- Figure 2.21. Power consumption variation across 28 different chips
- Figure 2.22. Die photo of battery-less Cortex M0+ processor fabricated in 180nm CMOS
- Figure 3.1. Sensing modalities of wearable sensors
- Figure 3.2. Basic operation principle of the proposed LDC
- Figure 3.3. Simulated results of DLS RO
- Figure 3.4. Basic structure of the proposed LDC
- Figure 3.5. Measured frequencies of two ROs
- Figure 3.6. Different temperature dependencies of coefficients $\alpha$ and $\beta$
- Figure 3.7. The effect of PV cell shunt resistance and loading from the DLS RO to $V_{PV}$ drop
- Figure 3.8. Overall block diagram of the proposed LDC implementation
- Figure 3.9. Measured log of C<sub>MEAS</sub> and RMS noise in code
- Figure 3.10. Linearity error of LDC in two different modes
- Figure 3.11. Quantization error of C<sub>LE_CUM</sub> across the logging time
- Figure 3.12. Light-to-digital conversion time and power consumption of LDC
- Figure 3.13. C<sub>MEAS</sub> error across temperature at 50klx light intensity
- Figure 3.14. Measured result of a 24-hr C<sub>LE_CUM</sub> with a sensor attached to a window
- Figure 4.1. Target application scenario of the proposed ULP optical receiver
- Figure 4.2. The system architecture of the ULP wireless optical receiver
- Figure 4.3. Always-on voltage mode operation of AFE
- Figure 4.4. Tunable unary-coded resistor bank
- Figure 4.5. Light threshold and open circuit voltage of PV cell across SEL bit
- Figure 4.6. The detailed block diagram of the always-on voltage mode
- Figure 4.7. The timing diagram of two comparators of the always-on voltage mode
- Figure 4.8. The block diagram of the fast RX current mode
- Figure 4.9. The schematic of trans-impedance amplifier with ambient light cancelling
- Figure 4.10. Manchester Code and estimated period notation
- Figure 4.11. The die photograph of the chip fabricated in 180nm CMOS
- Figure 4.12. Measured ambient light tracking (voltage/current mode) of the optical receiver
- Figure 4.13. Measured BER and energy per bit of fast RX current mode
- Figure 4.14. The light pattern to activate the fast RX current mode
- Figure 5.1. Target design space for ultra-small implantable transmitter applications
- Figure 5.2. The proposed system block diagram
- Figure 5.3. Optical TX system architecture
- Figure 5.4. Measured average pulse rate of PMT with 10mm tissue
- Figure 5.5. Symbol rate of PMT across symbol amplitude (LED power) with 10mm tissue
- Figure 5.6. Energy per bit across data rate with 10mm tissue
- Figure 5.7. Schematic of switched capacitor-based LED Driver
- Figure 5.8. Testing setup with chicken tissue
- Figure 5.9. Energy efficiency of proposed TX with different modulation scheme
- Figure 5.10. Measured energy efficiency when using 8-ary PPM scheme for TX
- Figure 5.11. Measured BER and energy efficiency of 8-ary PPM
- Figure 5.12. Energy efficiency and corresponding data rate at different tissue depth

# LIST OF TABLES

- Table 2.1. Comparison with previous works in low power digital circuits
- Table 3.1. Summary of LDC performance and comparison with prior work
- Table 4.1. Performance summary & comparison with previous wake-up receiver works
- Table 5.1. Performance summary & comparison

## ABSTRACT

This thesis examines ultra-low power circuit techniques for mm-scale Internet of Things (IoT) platforms. IoT devices are characterized by small form factors and by limited battery capacity and lifespan. Because these devices adopt aggressive duty-cycling for high power efficiency and long lifespan, their always-on blocks must consume ultra-low power. Several problems need to be addressed in IoT device design, such as ultra-low power circuit techniques for sleep mode and energy-efficient, high-data-rate transmission for active-mode communication. Therefore, this thesis highlights ultra-low power always-on systems and energy-efficient optical transmission in order to miniaturize IoT systems.

First, this thesis presents a battery-less sub-nW microcontroller for an always-operating system, implemented with a newly proposed logic family. Second, it proposes an always-operating sub-nW light-to-digital converter that exploits the characteristics of this logic family to measure instantaneous light intensity and cumulative light exposure. Third, it presents an ultra-low standby power optical wake-up receiver that cancels ambient light using dual-mode operation. Finally, it proposes an energy-efficient, low-power optical transmitter for an implantable IoT device. Implications for future research are also provided.

# **CHAPTER 1**

# Introduction

#### **1.1 Miniaturized mm-scale Internet of Things (IoT)**

The IoT has attracted significant research attention over the last few years. Some IoT devices have been standardized, such as a thermostat that saves energy by controlling the temperature automatically [1] and a production-line monitoring sensor that reports the output speed, mechanical defects, and fluctuations in temperature [2]. Continued progress in electrical engineering will lead to more advanced IoT technologies, which can be applied to smart homes, wearable devices, smart cities, and even connected cars [3]. Moreover, the connectivity of IoT devices has been enhanced remarkably as communication technologies develop. The advent of Low-Power Wi-Fi and Low-Power Wide Area Networks (LPWAN) [4] will enable long-range communication among connected IoT devices. Accordingly, LPWANs are expected to cover the wireless connections among 700 million IoT devices by 2021 [5].

Continued scaling in semiconductor technology has decreased the cost of IoT devices and spurred the development of various IoT platforms. Among the numerous IoT platforms, the miniaturized, cubic millimeter-scale IoT device is a new type of platform that operates at extremely low power and can last for several months without battery replacement. Moreover, both its size and its cost will be significantly lower than those of previous IoT devices. Because of these benefits, it can be used for a wide range of new applications, including continuous monitoring of temperature [6], humidity [7], illumination [8], motion [9], pressure [10], flow rate [11], and even gas concentrations [12]. The miniaturized IoT device can also be applied to implantable medical devices for surgeries, with the benefit of minimizing incisions. If its size becomes smaller than the diameter of a needle, the miniaturized implantable device can be delivered by needle injection [13]. Furthermore, with its potential for low cost, the cubic millimeter-scale IoT device can find broad use, from health monitoring to effective maintenance of equipment in industry [14].

However, the strict limit on form-factor volume restricts the battery size of IoT devices. Recent cubic millimeter-scale IoT devices have mostly relied on thin-film Li batteries [9], [15]. For example, the 0.76-mm<sup>3</sup> Li thin-film battery mentioned in [15] provides almost six orders of magnitude lower energy capacity than one alkaline AA battery. Therefore, the average power consumption of a standard cubic millimeter-scale sensor node should be at the nW or sub-nW level for the node to remain operational for several months. In order to operate with the minimum level of energy and power, innovative low power design techniques have been developed (e.g., extensive standby power-oriented design, aggressive power duty cycling) [16], [17].
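As a rough sanity check on the nW-level budget, the sketch below divides an assumed battery energy by a target lifetime. Only the six-orders-of-magnitude ratio comes from the text; the AA energy of about 10 kJ and the 90-day lifetime are illustrative assumptions, not values taken from [15].

```python
# Rough average-power budget for a mm-scale battery (illustrative numbers only).
AA_ENERGY_J = 10e3            # ASSUMPTION: ~10 kJ usable energy in one alkaline AA cell
CAPACITY_RATIO = 1e-6         # from the text: about six orders of magnitude lower
LIFETIME_S = 90 * 24 * 3600   # ASSUMPTION: "several months" taken as 90 days


def average_power_budget(energy_j: float, lifetime_s: float) -> float:
    """Average power (W) that drains energy_j in exactly lifetime_s seconds."""
    return energy_j / lifetime_s


battery_energy_j = AA_ENERGY_J * CAPACITY_RATIO
budget_w = average_power_budget(battery_energy_j, LIFETIME_S)
print(f"Battery energy: {battery_energy_j * 1e3:.1f} mJ")
print(f"Average power budget: {budget_w * 1e9:.2f} nW")
```

Under these assumptions the budget comes out at roughly 1 nW, which is consistent with the nW or sub-nW target stated above.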

The power cycle of a miniaturized IoT device has two operation modes: active and sleep. Active mode runs for a short period at the $\mu$W to sub-mW power level and is dominated by the operation of the processor, sensor interface, and data transmitter. Since cubic millimeter-scale IoT devices consume several orders of magnitude higher power during active mode than during sleep mode, the energy dissipated in active mode still dominates the total energy. Given the fixed battery capacity, which is dictated by the minimum required battery size, energy efficiency should be improved to maximize the amount of active-mode operation, which comprises data collection and conversion, digital signal processing, and data transmission.
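The trade-off between the two modes can be illustrated with a simple time-weighted average. The sketch below is a minimal example; the 10 µW active power, 1 nW sleep power, and 0.1% duty cycle are hypothetical values chosen only to fall inside the ranges quoted in this section.

```python
def average_power(p_active_w: float, p_sleep_w: float, duty_cycle: float) -> float:
    """Time-averaged power of a node that is active for a fraction duty_cycle of the time."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return duty_cycle * p_active_w + (1.0 - duty_cycle) * p_sleep_w


def active_energy_fraction(p_active_w: float, p_sleep_w: float, duty_cycle: float) -> float:
    """Share of the total energy that is spent in active mode."""
    return duty_cycle * p_active_w / average_power(p_active_w, p_sleep_w, duty_cycle)


# HYPOTHETICAL operating point, for illustration only.
P_ACTIVE_W = 10e-6   # 10 uW while awake
P_SLEEP_W = 1e-9     # 1 nW while asleep
DUTY_CYCLE = 1e-3    # awake 0.1% of the time

avg_w = average_power(P_ACTIVE_W, P_SLEEP_W, DUTY_CYCLE)
fraction = active_energy_fraction(P_ACTIVE_W, P_SLEEP_W, DUTY_CYCLE)
print(f"Average power: {avg_w * 1e9:.2f} nW")
print(f"Energy spent in active mode: {fraction * 100:.1f} %")
```

With these example numbers the node sleeps 99.9% of the time, yet about 90% of its energy is still spent in active mode, which is why both modes have to be optimized.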

Reducing the power consumption in active mode is also imperative for the miniaturized IoT system. The internal resistance of a millimeter-scale battery increases rapidly as the battery size becomes smaller, affecting the functionality of the system and the overall energy efficiency. For instance, a sub-mm<sup>3</sup> Li thin-film battery typically has an internal resistance of 7 k$\Omega$ [18], which causes a 700 mV drop from the 4.2 V nominal output voltage at a 100 $\mu$A load current. This voltage drop can cause the system to fail or degrade its performance. The internal resistance also increases as the battery is recharged repeatedly, ultimately making battery-based operation less sustainable. Furthermore, the high internal resistance dissipates part of the battery's energy.
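The voltage drop quoted above follows directly from Ohm's law. The short sketch below reproduces it and also estimates the power dissipated inside the battery; the function names are illustrative, and the battery is modeled as an ideal source in series with a fixed resistance, which is a simplifying assumption.

```python
def terminal_voltage(v_nominal: float, r_internal_ohm: float, i_load_a: float) -> float:
    """Battery terminal voltage under load for a series-resistance battery model."""
    return v_nominal - r_internal_ohm * i_load_a


def internal_loss(r_internal_ohm: float, i_load_a: float) -> float:
    """Power (W) dissipated in the internal resistance."""
    return i_load_a ** 2 * r_internal_ohm


V_NOMINAL = 4.2       # V, from the text
R_INTERNAL = 7e3      # ohm, from the text [18]
I_LOAD = 100e-6       # A, from the text

v_terminal = terminal_voltage(V_NOMINAL, R_INTERNAL, I_LOAD)
p_lost_w = internal_loss(R_INTERNAL, I_LOAD)
p_delivered_w = v_terminal * I_LOAD

print(f"Terminal voltage: {v_terminal:.2f} V")
print(f"Lost in battery: {p_lost_w * 1e6:.0f} uW, delivered: {p_delivered_w * 1e6:.0f} uW")
```

In this simple model, about one sixth of the energy drawn at 100 µA is dissipated inside the battery rather than delivered to the load.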

The miniaturized IoT system operates mostly in sleep mode. Sleep-mode power is generally at the nW to sub-nW level and is determined by the leakage of the power-gated blocks, the static data-retentive random-access memory, and the power consumption of the always-on blocks. Since a miniaturized IoT system adopts aggressive duty-cycling, it is critical to reduce the static power consumption in sleep mode to secure high energy efficiency for the system. A standby power-oriented design can significantly extend the lifetime of an intermittently activated sensor system. Furthermore, minimizing the leakage current reduces the amount of harvested power that is required. If the power from the energy harvesting source is sufficient to run the ultra-low power always-on blocks, the system could continue to operate indefinitely, and the size of the required harvesting source could be minimized. Therefore, an ultra-low power system is an essential component of battery-based miniaturized IoT devices. In addition, novel ultra-low power circuit techniques should be developed to minimize the energy per operation and the power consumption of the miniaturized IoT device.

#### **1.2 Outline of the Dissertation**

This dissertation comprises six chapters, presenting an ultra-low power processor and a light-to-digital converter based on a novel logic family, an optical wake-up receiver, and an optical transmitter for implantable applications.

Chapter 2 presents a novel logic family for sub-nW always-on circuits. The new logic implementation, referred to as dynamic leakage-suppression logic (DLSL), consumes 10fW active power per gate, a two-orders-of-magnitude improvement over recently published work. Power is reduced through a super-cutoff feedback mechanism, and minimum power is achieved at a 350-to-550mV supply voltage. This supply voltage range eliminates the need for ultra-low voltage operation, which increases robustness. It also allows the circuits to be directly connected to various harvesting sources without DC-DC conversion. DLSL is used to implement a Cortex M0+ processor that consumes 295pW, which is the lowest reported to date for a microcontroller. We show full functionality across –5 to 65°C and demonstrate autonomous operation when powered by a 0.09mm<sup>2</sup> solar cell in room lighting (240lx).

Chapter 3 proposes an always-on, ultra-low power light-to-digital converter (LDC). The proposed LDC measures a 1.9× wider light intensity range than the previous design while consuming 7,200× lower power. It exploits a unique property of the dynamic leakage-suppression logic ring oscillator (DLS RO): its frequency decays exponentially with increasing supply voltage, and its power is extremely low (10s of pW). By using two DLS ROs directly powered by the photodiode, the LDC generates an 11-bit output code that is linear with light intensity in the log-log domain. The proposed circuit also accumulates the light energy in a 32-bit code with 0.38% resolution error, with the digital control block consuming 550pW.

Chapter 4 presents a sub-nW standby power optical receiver with ambient light noise canceling. The proposed optical receiver uses dual-mode operation to achieve both ultra-low standby power and energy-efficient, fast data reception. At distances below 0.3m, the noise canceling circuit enables the receiver to operate with BER $<10^{-4}$ over a wide range (3 orders of magnitude) of ambient light conditions. The best energy efficiency of the receiver, 112.25pJ/bit, is achieved at the maximum bit rate of 250kbps.

Chapter 5 presents an implantable optical transmitter (TX) that transfers data from low power, minimally invasive implants. The proposed optical TX extracts the clock from an 850nm optical signal sent from the external reader and encodes data with pulse position modulation, generating the LED current with a switched capacitor-based current source and current multiplier. When paired with a photo-multiplier tube receiver, the TX consumes 0.89µW at 65kbps (13.8pJ/bit).

Finally, chapter 6 summarizes the contributions and presents ongoing work.

# **CHAPTER 2**

# Battery-Less, sub-nW Cortex M0<sup>+</sup> Processor with Dynamic Leakage-Suppression Logic

## 2.1 Introduction

Recent low-voltage design techniques have enabled dramatic improvements in miniaturization and lifetime of wireless sensor nodes [19]–[21]. These systems typically use a secondary battery to provide energy when the sensor is awake and operating; the battery is then recharged from a harvesting source when the sensor is asleep. In these systems, the key requirement is to minimize energy per operation of the sensor. This extends the number of operations on one battery charge and/or reduces the time to recharge the battery between awake cycles. This requirement has driven significant advances in energy efficiency [19], [20] and standby power consumption [21].

Figure 2.1 shows the issues that arise with the battery as its size shrinks to sub-mm<sup>3</sup>. The capacity per volume of the battery decreases super-linearly as the size becomes smaller, necessitating frequent recharging from the harvesting source of the wireless sensor node system. The internal resistance of the battery rises above $1k\Omega$ when the battery size falls below $1mm^3$. Batteries also suffer from limited endurance (e.g., 2.5k discharge cycles limiting lifetime to 1.5 months with a 30 min wakeup period) as shown in Figure 2.2. In addition, batteries face scalability challenges in the sub-5 mm range due to their sealing requirements [22].

Figure 2.1. Issues with Battery-based WSN: (a) Capacity per volume (b) internal resistance of different size batteries.

This work therefore focuses on a battery-less sensor system that operates directly from the energy harvesting source. In these systems, power is consumed as it is obtained, and hence the key requirement is to limit the maximum power draw, thereby reducing the size of the required harvesting source. While significant advances have been made in low power systems [23], the minimum power draw per logic gate remains in the 1 - 30pW range, resulting in 10s of nW consumed by a microcontroller. This in turn requires a relatively large harvesting source, limiting the ability to scale a sensor system to true miniature sizes (e.g., a 4mm<sup>2</sup> solar cell @240 lux is needed to produce 30nW [24]). Note that reducing supply voltage further in these systems is ineffective since they become leakage power dominated. Robustness concerns also often limit voltage scalability.
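To see how the power draw translates into harvester size, the sketch below scales the reference point quoted above (a 4mm<sup>2</sup> cell producing 30nW at 240 lux) to the 295pW processor and the 0.09mm<sup>2</sup> solar cell reported in this chapter. It assumes that harvested power scales linearly with cell area at fixed illuminance, which is a simplification and not a claim made in [24].

```python
REF_AREA_MM2 = 4.0     # from the text: reference solar cell area at 240 lux [24]
REF_POWER_W = 30e-9    # from the text: power produced by that cell


def harvested_power(area_mm2: float) -> float:
    """Harvested power (W), ASSUMING linear scaling with area at 240 lux."""
    return REF_POWER_W * area_mm2 / REF_AREA_MM2


def required_area(power_w: float) -> float:
    """Cell area (mm^2) needed for power_w under the same linear assumption."""
    return REF_AREA_MM2 * power_w / REF_POWER_W


area_needed_mm2 = required_area(295e-12)    # 295 pW DLS processor
power_from_cell_w = harvested_power(0.09)   # 0.09 mm^2 cell used in this work

print(f"Area needed for 295 pW: {area_needed_mm2:.3f} mm^2")
print(f"Power from a 0.09 mm^2 cell: {power_from_cell_w * 1e12:.0f} pW")
```

Under this assumption a 0.09mm<sup>2</sup> cell would supply roughly twice the 295pW that the processor draws, whereas a 30nW microcontroller would need a cell about 100 times larger in area.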

### 2.2 Dynamic Leakage-suppression Logic

This work proposes a new logic implementation, referred to as dynamic leakage-suppression logic (DLSL), that consumes 10fW active power per gate, marking a two-orders-of-magnitude improvement over recently published work. Power is reduced through a super-cutoff feedback mechanism, and minimum power is achieved at a 350 – 550mV supply voltage. This supply voltage range eliminates the need for ultra-low voltage operation, which increases robustness. It also allows the circuits to be directly connected to various harvesting sources without DC-DC conversion. DLS logic is used to implement a Cortex M0+ processor that consumes 295pW, which is the lowest reported to date for a microcontroller. We show full functionality across –5 to 65°C and demonstrate autonomous operation when powered by a 0.09mm<sup>2</sup> solar cell in room lighting (240lux). While the operating speed is low (~15Hz), the processor never needs to sleep, and its computational performance is sufficient for control operation and continuous compression of data from an 8-bit sensor at 11 samples/minute, employing a 10 instructions/bit compression algorithm.

Figure 2.2. Limited life time of the battery by the number of charge cycle.

Figure 2.3. Low leakage circuit techniques. (a) Transistor stacking (b) negative body-biasing (c) Schmitt trigger logic
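The throughput claim can be checked with simple arithmetic. The sketch below computes the instruction rate needed for the compression workload described above, assuming one instruction per clock cycle; that assumption is ours and is not stated in the text.

```python
SAMPLES_PER_MINUTE = 11      # from the text
BITS_PER_SAMPLE = 8          # from the text: 8-bit sensor
INSTRUCTIONS_PER_BIT = 10    # from the text: compression algorithm cost
CLOCK_HZ = 15.0              # from the text: approximate operating speed


def required_instruction_rate(samples_per_min: float, bits: int, instr_per_bit: int) -> float:
    """Instructions per second needed to keep up with the sensor data stream."""
    return samples_per_min * bits * instr_per_bit / 60.0


required_rate_hz = required_instruction_rate(
    SAMPLES_PER_MINUTE, BITS_PER_SAMPLE, INSTRUCTIONS_PER_BIT
)
# ASSUMPTION: one instruction retires per clock cycle.
print(f"Required: {required_rate_hz:.2f} instructions/s, available: {CLOCK_HZ:.0f} cycles/s")
```

The workload needs about 14.7 instructions per second, so a clock of roughly 15Hz is just sufficient under the one-instruction-per-cycle assumption.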

Various circuit techniques already exist to lower power consumption by reducing leakage, and Figure 2.3 shows three of them. Figure 2.3 (a) shows the transistor stacking technique, which needs more transistors and special sizing for low-$V_{DD}$ operation. Figure 2.3 (b) shows negative body biasing, which needs a bias generation circuit for each bias and a triple well for PMOS biasing. Figure 2.3 (c) shows the technique recently developed by [23], which enables operation at a very low supply voltage (62mV). However, the power of this logic gate is mostly consumed by the leakage quenching paths, shown as the two red dashed lines. Figure 2.4 shows the power consumed by chains of inverters implemented with the techniques described in Figure 2.3 while operating at a 1-kHz frequency. As shown in Figure 2.4, the power consumption of the chain of stacked inverters is $3\times$ lower than that of the normal inverter chain at a 0.25V supply voltage. The power consumed by the chain of inverters with negative body-biasing is $4\times$ lower than that of the normal inverter chain at 0.25V $V_{DD}$, while the chain of Schmitt trigger-based inverters shows the same power consumption as the normal inverter chain. At a 65mV supply voltage, the inverter chain implemented with the Schmitt trigger-based logic can still operate, and it consumes less power than the inverter chains designed with the other two techniques at 0.25V $V_{DD}$. Therefore, in order to develop a system operating at the lowest power, new logic gates should have the lowest leakage as well as signal integrity at low operating voltage.

Figure 2.5 (a) shows the basic structure of DLS logic, and Figure 2.5 (b) shows the DLS inverter schematic. The output voltage of the gate is fed back to the bottom PMOS and top NMOS, placing all leaking transistors in a super-cutoff state. Figure 2.6 shows the steady state of the DLS inverter for IN = 0 and IN = $V_{DD}$ when $V_{DD}$ = 400mV. When IN = 0, the leakage current is determined by the pull-down path through $M_{NB}$ and $M_{PB}$. Since the gate of $M_{PB}$ is connected to a high OUT voltage, node *n2* settles to roughly $V_{DD}/2$, placing both $M_{NB}$ and $M_{PB}$ into super-cutoff. The same dual super-cutoff effect occurs with $M_{NT}$ and $M_{PT}$ when IN = $V_{DD}$.

Figure 2.5. Dynamic leakage-suppression (DLS) logic schematic (a) Feedback structure to reduce leakage (b) DLS inverter

Figure 2.6. DLS Inverter Steady State.
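To give a feel for why super-cutoff is so effective, the sketch below uses a textbook first-order subthreshold model, in which the current depends exponentially on the gate-source voltage. The model and its parameters (slope factor n = 1.5, thermal voltage 25.85mV, threshold 0.45V) are generic assumptions and are not extracted from the 180nm process used in this work; drain-induced barrier lowering, body effect, and the drain-voltage term are ignored.

```python
import math


def subthreshold_current(v_gs: float, v_th: float = 0.45, i0: float = 1.0,
                         n: float = 1.5, v_t: float = 0.02585) -> float:
    """First-order subthreshold current, in units of i0 (ASSUMED generic parameters)."""
    return i0 * math.exp((v_gs - v_th) / (n * v_t))


V_DD = 0.4  # V, from the text

# Ordinary cutoff: gate and source at the same potential (V_GS = 0).
i_cutoff = subthreshold_current(0.0)
# Super-cutoff: internal node near V_DD/2 makes V_GS roughly -V_DD/2.
i_super_cutoff = subthreshold_current(-V_DD / 2)

suppression = i_cutoff / i_super_cutoff
print(f"Leakage suppression from V_GS = -V_DD/2: about {suppression:.0f}x")
```

With these generic parameters the negative gate-source bias alone reduces leakage by more than two orders of magnitude, and the factor grows exponentially with $V_{DD}$ because the internal node tracks $V_{DD}/2$.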

During dynamic operation of DLS logic, the output node transitions using the leakage currents of the top and bottom transistors, which are initially in super-cutoff ($M_{NT}$ for rising and $M_{PB}$ for falling output transitions). Figure 2.7 shows the simulated timing waveform of the DLS inverter when the input IN changes from 0 to $V_{DD}$. As IN transitions from 0V to $V_{DD}$, $M_{NB}$ switches from super-cutoff to weak-inversion and starts to equalize the voltages of *n2* and OUT. This has two effects: 1) $M_{PB}$ switches from super-cutoff to a traditional cutoff bias point. 2) As $M_{NB}$ pulls *n2* up, OUT is also discharged, as is *n1* to some degree. This causes $M_{NT}$ and $M_{PT}$ to enter super-cutoff, sharply reducing the leakage from $V_{DD}$ to OUT. At the same time, the leakage through $M_{PB}$, which is no longer in super-cutoff, continues to pull OUT low, further suppressing the leakage of $M_{NT}$ and $M_{PT}$ and accelerating the overall discharge of OUT.

Due to this super-cutoff feedback effect, DLS logic naturally has different rising and falling switch points, resulting in hysteresis and a 1.45× increase in static noise margin over a standard CMOS inverter as shown in Figure 2.8.



Figure 2.7. Dynamic operation of DLS inverter when IN  $0 \rightarrow V_{DD}$ .



Figure 2.8. Static noise margin of DLS inverter and standard inverter.

strong super-cutoff effect that increases with higher $V_{DD}$. In comparison, in stacks of two NMOS or PMOS transistors the intermediate node settles to ~20mV at $V_{DD}$ = 0.4V, resulting in a much weaker super-cutoff effect that is independent of $V_{DD}$ to first order. The strong super-cutoff effect also increases the I<sub>ON</sub>/I<sub>OFF</sub> ratio compared to a stacked topology, improving static robustness as shown in Figure 2.10. Figure 2.11 shows the simulated leakage current ratio of a standard two-stacked inverter to the DLS inverter at different corners across temperature. The improvement ranges from 125× at the SS corner to 425× at the FS corner.

Figure 2.9. Leakage current comparison across V<sub>DD</sub>.

Figure 2.10. I<sub>ON</sub>/I<sub>OFF</sub> ratio of DLS inverter compared to stacked and standard topology.

Figure 2.11. Leakage current ratio of standard two-stacked inverter to DLS inverter.

Since DLS logic operates using leakage currents, it is sensitive to threshold voltage shifts due to process variations. Figure 2.12 shows the worst-case threshold skew condition of each transistor in a DLS NAND2 gate during the transition of the output node OUT from 0 to V<sub>DD</sub>. In simulation, the transistor threshold voltages are skewed by an increasing value $\delta$ until the gate fails to transition. When all the threshold voltages in the DLS NAND2 are skewed in the worst-case direction and the amount of skew increases from $2\sigma_{VTH}$ to $5\sigma_{VTH}$ ($\sigma_{VTH}$ is the standard deviation of threshold voltage for the target CMOS process), the OUT and *n1* nodes eventually fail to transition to the supply voltage, as expected.
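The skew-until-failure procedure can be sketched as a simple sweep. In the example below, `gate_switches` stands in for a circuit simulation of the skewed gate; the toy predicate that fails at 4.3 $\sigma_{VTH}$, the step size, and the sweep limit are hypothetical and serve only to make the loop runnable.

```python
from typing import Callable


def max_tolerated_skew(gate_switches: Callable[[float], bool],
                       step: float = 0.25, limit: float = 10.0) -> float:
    """Largest tested skew (in units of sigma_VTH) at which the gate still switches.

    The skew is increased in fixed steps until the gate fails to transition
    or the sweep limit is reached.
    """
    last_pass = 0.0
    k = 1
    while k * step <= limit:
        delta = k * step
        if not gate_switches(delta):
            break
        last_pass = delta
        k += 1
    return last_pass


def toy_gate_switches(delta: float) -> bool:
    """HYPOTHETICAL stand-in for a SPICE run of the worst-case skewed NAND2."""
    return delta < 4.3


delta_max = max_tolerated_skew(toy_gate_switches)
print(f"Gate tolerates up to {delta_max} sigma_VTH of worst-case skew")
```

In a real flow the predicate would launch a transient simulation with every threshold shifted in its worst-case direction and check whether OUT reaches the expected level.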

Figure 2.13 shows how transistor sizing for optimal robustness is performed for a DLS 2-input NAND gate. It plots the value of $\delta$ (normalized to $\sigma_{VTH}$ for the target CMOS process) found for different sizing combinations of $M_{NT}$ and $M_{PB}$. As mentioned, the critical point in the transition occurs when transistors $M_{NT}$ and $M_{PB}$ flip between cutoff and super-cutoff, while the internal transistors controlled by A and B are in weak-inversion. Hence, the robustness of the gate depends principally on the sizing of $M_{NT}$ and $M_{PB}$, and all internal transistors can be set to minimum size to reduce input loading. Increasing $W_{PB}$ and decreasing $W_{NT}$ improves the OUT falling transition while degrading its rising transition (top contour plots in Figure 2.13). Since the actual $W_{NT}$ and $W_{PB}$ design space is limited by standard cell pitch, $M_{NT}$ is laid out in a multi-finger configuration. The $(\delta_{HL} \times \delta_{LH})^{0.5}$ plot seeks to co-optimize robustness in both transitions, and the selected design point exceeds $\delta = 4\sigma_{VTH}$ to ensure robust switching, as shown in Figure 2.14.

Figure 2.12. Worst threshold skewed condition for DLS NAND2.

Figure 2.13. Transistor sizing methodology based on worst case $V_{TH}$ skew for DLS NAND2.
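The co-optimization step amounts to picking the sizing pair that maximizes the geometric mean of the two robustness figures. The sketch below illustrates this with a grid search; the two toy robustness functions only mimic the qualitative trend described above (a larger $W_{PB}/W_{NT}$ ratio helps the falling transition and hurts the rising one) and are not fitted to simulation data.

```python
import math
from itertools import product


def toy_delta_hl(w_nt: float, w_pb: float) -> float:
    """HYPOTHETICAL falling-transition robustness (sigma_VTH units)."""
    return 4.0 + math.log2(w_pb / w_nt)


def toy_delta_lh(w_nt: float, w_pb: float) -> float:
    """HYPOTHETICAL rising-transition robustness (sigma_VTH units)."""
    return 4.0 - math.log2(w_pb / w_nt)


def combined_robustness(w_nt: float, w_pb: float) -> float:
    """Geometric mean (delta_HL * delta_LH)^0.5; zero if either transition fails."""
    d_hl = max(0.0, toy_delta_hl(w_nt, w_pb))
    d_lh = max(0.0, toy_delta_lh(w_nt, w_pb))
    return math.sqrt(d_hl * d_lh)


# Candidate widths in multiples of the minimum width (illustrative grid).
widths = [1.0, 2.0, 4.0, 8.0]
best_w_nt, best_w_pb = max(product(widths, widths),
                           key=lambda pair: combined_robustness(*pair))
best_score = combined_robustness(best_w_nt, best_w_pb)
print(f"Best sizing: W_NT = {best_w_nt}, W_PB = {best_w_pb}, score = {best_score}")
```

The geometric mean penalizes design points where one transition is very robust and the other is marginal, which is why it is a natural figure of merit for balancing the two.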

## 2.3 Battery-Less, sub-nW Cortex M0<sup>+</sup> Processor

The Cortex M0+ processor is synthesized with a DLS standard-cell library containing an inverter, 2-input NAND and NOR gates, and a D flip-flop with asynchronous reset. The standard cells are verified with 500k Monte Carlo simulations, and Figure 2.15 presents the key timing, leakage, and robustness results.



Figure 2.14.  $(\delta_{HL} \times \delta_{LH})^{0.5}$  plot seeking to co-optimize robustness in both transitions.

A 32-bit RISC ARM Cortex M0+ processor is implemented using the DLS library, including an on-chip clock generator, address decoder and 128B latch-based (for low  $V_{DD}$ 

500k Monte Carlo simulation results at V<sub>DD</sub> = 0.4V.

Leakage power and robustness of DLS gates:

|              | Inverter      | NAND2         | NOR2          | D flip-flop      |
|--------------|---------------|---------------|---------------|------------------|
| μ (fW)       | 2.4           | 2.7           | 6.3           | 51.4             |
| σ/μ          | 0.28          | 0.2           | 0.24          | 0.34             |
| # of failure | 0<sup>†</sup> | 0<sup>†</sup> | 0<sup>†</sup> | 21<sup>††</sup>  |

<sup>†</sup>Rise and fall transition failure. <sup>††</sup>D-Q write failure.

Timing information of DLS D flip-flop (1 FO4 = 0.7ms):

|         | Setup time | Hold time | C-Q delay | D-Q delay |
|---------|------------|-----------|-----------|-----------|
| μ (FO4) | 2.72       | -0.88     | 3.62      | 6.34      |
| σ/μ     | 0.42       | 0.37      | 0.39      | 0.4       |

Figure 2.15. Timing, leakage, and robustness results of DLS gates.

robustness) instruction and data memories [25]. All logic is implemented using a standard-cell approach with fully automatic place and route. The latch-based memory has a DLS logic-based read-in/out path and a negative-edge write scheme to ensure write operation free of timing violations. Figure 2.16 shows the system block diagram. The measured waveforms in Figure 2.17 show the execution of a program on the M0<sup>+</sup> processor.



Figure 2.16. System block diagram with ARM® Cortex® M0+ implemented with DLS logic.



Figure 2.17. Measured waveforms of the system running at 5Hz, powered by 0.09mm<sup>2</sup> solar cell.

## 2.4 Measurement Result

As expected, leakage power is dominant at the operating frequency (~15Hz) of this system as shown in Figure 2.18. The lowest functional operating voltage is 0.16V. As the supply voltage increases from 0.16V (with a fixed frequency), the power consumption decreases exponentially due to the stronger super-cutoff effect. The minimum power consumption is 295pW, occurring at 0.55V. At higher V<sub>DD</sub>, DIBL effects and p-n junction diode leakage cancel out the increased super-cutoff effect. Dynamic power increases quadratically with V<sub>DD</sub> as expected, reaching 13.5% of the total power at V<sub>DD</sub> = 1.15V.

The system is fully functional across -5 to  $65^{\circ}$ C as shown in Figure 2.19. Due to the high temperature sensitivity of subthreshold current, total power increases exponentially from 50pW at  $-5^{\circ}$ C to 4.4nW at 65°C. Due to the relatively constant operating frequency, energy per operation



Figure 2.18. Measured power consumption of the system across supply voltage.



Figure 2.19. Measured power consumption across temperature at  $V_{DD} = 0.55V$ 

follows the power consumption curve with a minimum of 44.7pJ/instruction at the minimum power point,  $V_{DD} = 0.55V$  as shown in Figure 2.20. Figure 2.21 gives the power distribution across



Figure 2.20. Measured energy per operation and maximum operating frequency of processor.



Figure 2.21. Power consumption variation across 28 different chips.

28 different dies, with sigma/mean of 6.35%.

To highlight its extremely low power consumption, the core was powered directly by a 0.09mm<sup>2</sup> bulk silicon solar cell. Powered only by this solar cell, the processor operates at 12Hz at 0.32V, consuming 970pW during program execution. The minimum light intensity required for the solar cell to successfully power the processor is 240 lux, which is equivalent to dim indoor light. Table 2.1 compares the proposed low power processor to prior work in ultra-low power digital systems, showing an 80× improvement in active power per gate. Figure 2.22 includes the test chip die photo. The M0<sup>+</sup> processor occupies 1.19mm<sup>2</sup> (0.96mm×1.24mm) and the latch-based memory occupies 0.39mm<sup>2</sup> (0.54mm×0.72mm). The 0.09mm<sup>2</sup> (0.3mm×0.3mm) solar cell is also fabricated in the same process.
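The 80× figure can be checked from the minimum power and gate count entries of Table 2.1. The sketch below simply divides one by the other for this work and for [18]; the function name is an illustrative choice.

```python
def active_power_per_gate_w(total_power_w, gate_count):
    """Active power per gate in watts: total power divided by gate count."""
    return total_power_w / gate_count


# Values taken from Table 2.1.
this_work = active_power_per_gate_w(295e-12, 40_800)  # about 7.23 fW
ref_18 = active_power_per_gate_w(90e-9, 156_750)      # about 574 fW

improvement = ref_18 / this_work                      # roughly 80x

if __name__ == "__main__":
    print(f"this work: {this_work * 1e15:.2f} fW/gate")
    print(f"[18]:      {ref_18 * 1e15:.0f} fW/gate")
    print(f"ratio:     {improvement:.1f}x")
```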

### 2.5 Conclusion

Dynamic leakage-suppression logic is presented to enable ultra-low power battery-less operation of wireless sensing systems. A 32-bit Cortex M0<sup>+</sup> processor and 256B memory are implemented using this logic family and consume 295pW at 0.55V. The processor operates from a 0.09mm<sup>2</sup> bulk silicon solar cell under 240 lux, similar to indoor light conditions.

|                                 | This work            | [18]                  | [19]                   | [20]                               | [23]                    |
|---------------------------------|----------------------|-----------------------|------------------------|------------------------------------|-------------------------|
| Technology                      | 0.18µm               | 0.18µm                | 0.13µm                 | 0.18µm                             | 0.13µm                  |
| Architecture                    | ARM Cortex M0+       | 16-bit FFT processor  | 8-bit processor        | ARM Cortex M0                      | 8-bit multiplier        |
| Operating Voltage               | 160mV ~ 1.15V        | 180mV ~ 1.8V          | 450mV ~ 900mV          | 600mV                              | 62mV ~ 1.2V             |
| Operating Frequency             | 2Hz ~ 15Hz           | 164Hz ~ 5.5MHz        | 40kHz ~ 3MHz           | 160kHz ~ 330kHz                    | 5kHz ~ 650kHz           |
| Minimum Operating<br>Power      | 295pW<br>@550mV, 2Hz | 90nW<br>@180mV,164Hz  | 100nW<br>@450mV, 40kHz | 3.3μW<br>@600mV, 1kHz <sup>†</sup> | 17.9nW<br>@62mV, 5.2kHz |
| Gate Count                      | 40,800 <sup>††</sup> | 156,750 <sup>††</sup> | 23,120                 | 35,240                             | 775                     |
| Active Power<br>Per Gate        | 7.23fW @550mV        | 574fW @180mV          | 4.32pW @450mV          | 93.6pW @600mV                      | 2.31pW @62mV            |
| Minimum Energy<br>Per Operation | 44.7pJ/inst @550mV   | 151nJ/FFT @350mV      | 2.8pJ/inst @350mV      | 17.2pJ/inst @260mV                 | 1.31pJ/inst @260mV      |
| Area                            | 2.04mm <sup>2</sup>  | 5.46mm <sup>2</sup>   | 0.90mm <sup>2</sup>    | 1.7mm <sup>2</sup>                 | 0.04mm <sup>2</sup>     |

Table 2.1. Comparison with previous works in low power digital circuits.

<sup>†</sup>Operating frequency at minimum reported power of the system.

<sup>††</sup>Gate count is obtained by dividing the total number of transistors by 4 for CMOS and 6 for DLS, based on the number of transistors in a 2-input NAND gate.



Figure 2.22. Die photo of battery-less Cortex M0+ processor fabricated in 180nm CMOS.

# **CHAPTER 3**

# A Sub-nW 80mlx-to-1.26Mlx Self-Referencing Light-to-Digital Converter with AlGaAs Photodiode

## 3.1 Introduction

Wearable sensors are increasingly common and continue to grow more diverse in their sensing modalities, ranging from glucose to heart rate monitoring as shown in Figure 3.1. One compelling sensing modality for wearable sensors is cumulative light exposure. For instance,



Copyright MotionX® by Fullpower®

Figure 3.1. Sensing modalities of wearable sensors.

reduced light exposure exacerbates depression, while high levels of sunlight exposure are well known as the primary cause of skin cancer [26]–[30]. Several investigations have shown that exposure to light of specific wavelengths or intensity may induce severe damage to the retina [31]–[33]. In order for wearable sensors to determine how much light radiation the skin or retina is receiving, a light-to-digital converter (LDC) is needed to both measure instantaneous light intensity and record the total accumulated light energy.

LDCs targeting wearable devices have tight constraints on area (cost) and power. At the same time, the wide dynamic range of light (~10<sup>6</sup>), from 0.1lx to 100klx, places difficult performance requirements on the LDC. Previous LDCs integrate photodiode current and use a trans-impedance amplifier followed by ADC conversion [34], [35], resulting in continuous power consumption of up to 100s of  $\mu$ W, which is not sustainable for wearable devices with small batteries. Recently, a low-power LDC based on logarithmic digital-to-resistance conversion was proposed for heart rate monitoring [36], but its input dynamic range is too small for light energy monitoring. An alternate approach converts light intensity to frequency [37]–[41], which is then measured with a frequency-to-digital converter. However, this approach consumes significant power at strong light levels because the oscillation frequency increases linearly with light intensity. It also requires an accurate reference timer circuit that may not be available during the sleep/standby modes common in wearable devices.

## 3.2 Temperature-compensated and self-referencing LDC

This work presents an LDC based on the unique property of a dynamic leakage suppression (DLS) inverter [42], i.e., that its frequency exponentially decays with increasing supply voltage. In addition, since DLS ring oscillator (RO) power is extremely low (10s of pW), the oscillator can be directly powered by the photodiode without significantly impacting the photodiode open-circuit voltage, making the oscillator insensitive to system supply voltage. Photodiode open-circuit voltage ( $V_{OC}$ ) is a log function of light intensity while DLS RO frequency is an exponential function of supply voltage. Hence, the resulting frequency is linear with light intensity in the log-log domain. By combining two of these oscillators, we present a temperature-compensated and self-referencing LDC covering a very wide light intensity range (80mlx to 1.26Mlx) with < 0.069% linearity error and < 1.52% deviation error due to temperature variation (-20 to 80°C). The LDC has 0.95s conversion time and consumes 550pW when measuring typical sunlight intensity (100klx).

Figure 3.2 explains the conversion method, which relies on the following observations: (1)  $V_{OC}$  of the photodiode logarithmically increases with incident power; (2) DLS RO frequency is an exponentially decaying function of supply voltage; and (3) RO current is small enough that the loaded photodiode voltage ( $V_{PV}$ ) can be approximated as  $V_{OC}$ . In a DLS RO, the current switching the output is subthreshold leakage from the super-cutoff header or footer. As oscillator supply voltage rises, the header/footer gate-source voltages become more negative, enhancing the super-cutoff effect and exponentially increasing inverter delay. Further, the super-cutoff operation results in a total current draw from the 7-stage DLS RO of < 50pA at 25°C across the 0.3-to-0.45V



Figure 3.2. Basic operation principle of the proposed LDC.

supply voltage range. Figure 3.3 (a) shows the current draw of the 7-stage DLS RO across the possible supply voltage range at different temperatures, and Figure 3.3 (b) shows the drop of  $V_{PV}$  from  $V_{OC}$  caused by this load current at different temperatures. Figure 3.3 (b) shows that the loaded  $V_{PV}$  can be approximated as  $V_{OC}$  provided the photodiode is large enough that the current draw of the DLS RO is negligible. Therefore, by connecting  $V_{PV}$  directly to the DLS RO supply voltage, the log of its frequency takes on a linear relationship with the log of light intensity, which allows measurement of light intensity over a very wide dynamic range.
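The log-log linearity follows in two steps from the observations listed earlier. As a sketch, assume an ideal-diode form for the photodiode voltage and a simple exponential model for the RO frequency, where $L$ is the light intensity, $k$ and $L_0$ are fitting constants introduced here only for illustration, $\alpha$ is a frequency scale factor, and $\beta$ is the decay constant:

$$
V_{PV} \approx V_{OC} \approx k \ln\frac{L}{L_0}, \qquad f = \alpha\, e^{-\beta V_{PV}}
$$

Taking the logarithm of the frequency and substituting the first expression gives

$$
\ln f = \ln\alpha - \beta V_{PV} \approx \ln\alpha - \beta k \ln\frac{L}{L_0}
$$

so $\ln f$ is a straight line in $\ln L$ with slope $-\beta k$. Under these assumptions the approximation holds as long as the photocurrent dominates the diode saturation current and the RO load current is negligible, which matches the conditions stated in the text.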

The LDC uses a two-oscillator strategy in which one oscillator (DLS  $RO_{REF}$ ) serves as a reference and determines when the other oscillator (DLS  $RO_{MEAS}$ ) is sampled, producing output code  $C_{MEAS}$ . Both DLS ROs are connected to  $V_{PV}$  and therefore require different supply-voltage-to-frequency characteristics. This is achieved by using different body bias connections for the two ROs. Figure 3.4 shows the basic structure of the proposed LDC. By connecting the body of the



Figure 3.3. Simulated results of DLS RO.

(a) Current draw from the 7-stage DLS RO. (b)  $V_{PV}$  drop from  $V_{OC}$  due to the load current.

PMOS footer to  $V_{PV}$  in RO<sub>REF</sub> and to ground in RO<sub>MEAS</sub> as shown in Figure 3.4, the two ROs have different decay constants,  $\beta_{REF}$  and  $\beta_{MEAS}$  ( $\beta_{REF} > \beta_{MEAS}$ ), and the final measured code is an exponential function of  $V_{PV}$  (equation shown in Figure 3.4). Figure 3.5 shows the measured frequencies of the two ROs across the operating supply voltage. Since the ROs operate directly from  $V_{PV}$  and have no connection to the system supply voltage, line sensitivity is negligible.
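A minimal numerical sketch of the two-oscillator scheme is given below. It assumes each RO follows $f = \alpha e^{-\beta V_{PV}}$ and that the code is the number of RO<sub>MEAS</sub> cycles counted during a fixed number of RO<sub>REF</sub> cycles. All constants (`ALPHA_*`, `BETA_*`, `N_REF_CYCLES`) are hypothetical values picked so that the code fits in 11 bits; they are not the measured parameters of the chip.

```python
import math

# Hypothetical fitting constants, chosen only so the code fits in 11 bits.
ALPHA_REF_HZ, BETA_REF_PER_V = 2000.0, 20.0
ALPHA_MEAS_HZ, BETA_MEAS_PER_V = 1000.0, 8.0
N_REF_CYCLES = 16  # assumed length of the sampling window, in RO_REF cycles


def ro_frequency(alpha_hz, beta_per_v, v_pv):
    """Assumed DLS RO model: frequency decays exponentially with supply."""
    return alpha_hz * math.exp(-beta_per_v * v_pv)


def code_ratio(v_pv):
    """Ideal (unquantized) number of RO_MEAS cycles inside the RO_REF window."""
    f_ref = ro_frequency(ALPHA_REF_HZ, BETA_REF_PER_V, v_pv)
    f_meas = ro_frequency(ALPHA_MEAS_HZ, BETA_MEAS_PER_V, v_pv)
    window_s = N_REF_CYCLES / f_ref
    return f_meas * window_s


def measured_code(v_pv):
    """Counter output: whole RO_MEAS cycles counted in the window."""
    return int(code_ratio(v_pv))


if __name__ == "__main__":
    for v_pv in (0.30, 0.35, 0.40, 0.45):
        print(f"V_PV = {v_pv:.2f} V -> C_MEAS = {measured_code(v_pv)}")
```

In this model the logarithm of the code grows linearly with $V_{PV}$ with slope $\beta_{REF} - \beta_{MEAS}$, and the absolute time base cancels out, which is why no external reference timer is needed.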

However, the temperature sensitivity of the two ROs needs to match to achieve a temperature-compensated  $C_{MEAS}$. Three factors impact the temperature sensitivity of  $C_{MEAS}$: (1) the ratio of  $\alpha_{MEAS}$  to  $\alpha_{REF}$  over temperature, (2) the dependence of  $\Delta\beta = \beta_{REF} - \beta_{MEAS}$  on temperature, and (3) the dependence of V<sub>PV</sub> on temperature. These dependencies are simulated and plotted in Figure 3.6 and show both P<sub>TAT</sub> and C<sub>TAT</sub> behavior. Hence, sizing is used to cancel these opposite effects,



Figure 3.4. Basic structure of the proposed LDC.



Figure 3.5. Measured frequencies of two ROs.

resulting in a  $C_{MEAS}$  error within 1.3% across -20 to 80°C (simulated) as shown in Figure 3.6. In addition, sizing is also used to maximize  $\Delta\beta$  in order to minimize resolution error.

This work uses an AlGaAs photodiode, which offers higher power conversion efficiency (21%) than Si or GaAs photodiodes along with a  $V_{OC}$  of > 0.65V at low-light (10lx) conditions.



Figure 3.6. Different temperature dependencies of coefficients  $\alpha$  and  $\beta$ .

These characteristics allow for high dynamic range in measurement, including extremely low (sub-lx) light levels. Figure 3.7 shows that the innate PV cell shunt resistance and loading from the DLS oscillators only result in deviation from the ideal log dependence of  $V_{PV}$  on light intensity at sub-100mlx levels.

Figure 3.8 shows the overall LDC implementation, which includes a PMOS diode-connected divider circuit to shift V<sub>PV</sub> into the desired DLS RO operating voltage range for high light levels. An FSM changes the divider state (3/7 or 2/7) when output code underflow or overflow is detected. Overlapping light ranges of 80mlx to 800lx (3/7 ratio) and 5lx to 1.26Mlx (2/7 ratio) provide hysteresis and avoid frequent state toggling. Level converters convert the low voltage swing of the oscillators to the full-voltage counters. Every measurement period, the 11b C<sub>MEAS</sub> and RO<sub>MEAS</sub> phase information bits are converted to C<sub>LE</sub>, the light energy (lx\*s, linear scale), using a 23b unsigned fixed-point (Q11.12) log function and a 25b unsigned fixed-point (Q19.6) exponential function, guaranteeing 1lx precision. Constants A and B are found after



Figure 3.7. The effect of PV cell shunt resistance and loading from the DLS RO to V<sub>PV</sub> drop.



Figure 3.8. Overall block diagram of the proposed LDC implementation.

obtaining the  $\alpha$-ratio and  $\Delta\beta$  from simple DLS RO measurements. The measured light energy can be directly read out or accumulated over longer periods of time using a 32b output register (C<sub>LE\_CUM</sub>). All digital blocks operate at D<sub>VDD</sub> = 0.7V and are clocked from a separate 70Hz DLS RO clock source. To reduce leakage power, all logic is synthesized with 3V I/O high-V<sub>TH</sub> cells.
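The range-switching behavior of the divider FSM can be sketched as follows. This is an illustrative model only: it assumes that code overflow in the 3/7 state moves the divider to 2/7 and that code underflow in the 2/7 state moves it back, and it approximates the underflow and overflow flags by comparing the light level against the range limits quoted in the text. The function names are hypothetical.

```python
MODE_LOW_LIGHT = "3/7"   # covers 80 mlx to 800 lx
MODE_HIGH_LIGHT = "2/7"  # covers 5 lx to 1.26 Mlx

# Light range of each divider state, in lx, as stated in the text.
RANGES_LX = {
    MODE_LOW_LIGHT: (0.08, 800.0),
    MODE_HIGH_LIGHT: (5.0, 1.26e6),
}


def range_flags(mode, light_lx):
    """Stand-in for the code underflow/overflow detection (assumption)."""
    low, high = RANGES_LX[mode]
    return light_lx < low, light_lx > high


def next_mode(mode, underflow, overflow):
    """Assumed FSM rule: change the divider only when the code leaves range."""
    if mode == MODE_LOW_LIGHT and overflow:
        return MODE_HIGH_LIGHT
    if mode == MODE_HIGH_LIGHT and underflow:
        return MODE_LOW_LIGHT
    return mode


def track(light_trace_lx, mode=MODE_LOW_LIGHT):
    """Return the divider state used after each measurement in the trace."""
    modes = []
    for light_lx in light_trace_lx:
        underflow, overflow = range_flags(mode, light_lx)
        mode = next_mode(mode, underflow, overflow)
        modes.append(mode)
    return modes


modes = track([100, 1000, 100, 10, 3, 100])
```

In this trace the 100 lx samples are measured in whichever state the FSM already occupies, because 100 lx lies inside the 5 lx to 800 lx overlap. That overlap is what prevents the divider from toggling on every sample near a single threshold.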

#### **3.3 Measurement Results**

The LDC is fabricated in 180nm CMOS. Figure 3.9 shows that the LDC can measure across a very wide light intensity range from 80mlx to 1.26Mlx. Figure 3.11 shows a small linearity error of < 0.069% in both modes. The quantization error of  $C_{LE\_CUM}$  is a function of logging time; it drops below 1% for accumulated light energy measurements over a few seconds, which is acceptable for wearable applications. Under low light conditions (3/7 division mode) the quantization error is slightly smaller due to the higher dividing ratio, which covers a narrower light intensity range than the 2/7 mode, as shown in Figure 3.10.

Accuracy (RMS noise) is below 3 codes as shown in Figure 3.9. The LDC has 0.95s conversion time and consumes 550pW when measuring 100klx light intensity as shown in Figure 3.12. Figure 3.13 shows the output code stability, with error <1.52% across temperature at 50klx. Figure 3.14 shows the result of a 24-hr  $C_{LE\_CUM}$  measurement with a sensor attached to a window. Table 3.1 summarizes LDC performance and compares it with prior work, showing a >10<sup>3</sup>×



Figure 3.9. Measured log of C<sub>MEAS</sub> and RMS noise in code.

reduction in power consumption and lowest conversion energy with high dynamic range and resolution.



Figure 3.11. Linearity error of LDC in two different modes.



Figure 3.10. Quantization error of  $C_{LE\_CUM}$  across the logging time.



Figure 3.12. Light-to-digital conversion time and power consumption of LDC.



Figure 3.13. C<sub>MEAS</sub> error across temperature at 50klx light intensity.



Figure 3.14. Measured result of a 24-hr  $C_{LE\_CUM}$  with a sensor attached to a window.

|                   | [25]                 | [26]                   | [27]               | [28]                 | This Work          |
|-------------------|----------------------|------------------------|--------------------|----------------------|--------------------|
| Conversion Method | VGA +<br>12b ADC     | Integrating<br>16b ADC | DRC+LIQAF          | AMP + PFM            | DLS OSC.           |
| Dynamic Range     | 10<sup>6.92</sup>    | 10<sup>4.82</sup>      | 10<sup>2.94</sup>  | 10<sup>6.30</sup>    | 10<sup>7.20</sup>  |
| Resolution        | 0.75mlx<sup>1</sup>  | 15mlx                  | 0.5%<sup>2</sup>   | 0.15mlx<sup>3</sup>  | 0.38%<sup>4</sup>  |
| Measure Time      | 800ms                | 90ms                   | N/R                | N/R                  | 7.5s               |
| Power             | 5.4μW                | 210μW                  | 4μW                | 2mW                  | 550pW              |
| Conversion Energy | 4.32μJ               | 1.638µJ                | N/R                | N/R                  | 4.13nJ             |

Table 3.1. Summary of LDC performance and comparison with prior work.

1. Minimum resolution for range from 0.75mlux to 3.07lux (1.2nW/cm<sup>2</sup> ~ 4.914 $\mu$ W/cm<sup>2</sup>), estimated from the assumption that the luminous efficacy for 505nm LED source is 60lm/W.
2. Resolution error when I<sub>PV</sub> = 4nA.
3. Minimum measurable input light intensity  $(0.5 \text{nW/cm}^2)$  with  $550 \text{Hz/}\mu\text{W/cm}^2$  sensitivity, estimated from the assumption that the luminous efficacy for 850nm LED source is 30lm/W.
4. 32-bit C<sub>LE\_CUM</sub> (cumulative light energy) resolution error for 7.5 second logging time at 100klx light intensity.

#### 3.4 Conclusion

The proposed light-to-digital converter measures a 1.9× wider light intensity range than previous designs with 7,200× lower power consumption. It uses the unique property of a dynamic leakage suppression logic ring oscillator (DLS RO), i.e., that its frequency decays exponentially with increasing supply voltage while its power is extremely low (10s of pW). By using two DLS ROs directly powered by the photodiode, the LDC generates an 11-bit output code that is linear with light intensity in the log-log domain. The proposed circuit also accumulates the light energy in a 32-bit code with 0.38% resolution error, with 550pW consumed by the digital control block.
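The headline ratios can be reproduced from the entries of Table 3.1. The sketch below assumes that the 1.9× range figure is taken against the widest prior dynamic range in the table (10<sup>6.92</sup>) and that the 7,200× power figure is taken against the lowest prior power in the table (4μW); the text does not state these baselines explicitly.

```python
import math


def dynamic_range_decades(min_lx, max_lx):
    """Dynamic range expressed as log10(max / min)."""
    return math.log10(max_lx / min_lx)


# This work: 80 mlx to 1.26 Mlx.
dr_this_work = dynamic_range_decades(80e-3, 1.26e6)   # about 7.20 decades

# Assumed baselines, read from Table 3.1.
range_ratio = 10 ** (dr_this_work - 6.92)             # about 1.9x
power_ratio = 4e-6 / 550e-12                          # about 7,270x

# Conversion energy = power x measurement time (550 pW over 7.5 s).
conversion_energy_j = 550e-12 * 7.5                   # about 4.13 nJ
```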

## **CHAPTER 4**

# A 380pW Dual Mode Optical Wake-up Receiver with Ambient Noise Cancellation

#### 4.1 Introduction

As sensor nodes become smaller, it becomes hard to connect to them by wire for communication after deployment. Therefore, one of the key components in a wireless sensor node is a wake-up receiver. Its main functions are initial programming after sensor node assembly and rescue of the sensor node by reprogramming and by resynchronizing timers for timing-sensitive node-to-node RF communication. It can also be used to restart a sensor node after a power shutdown due to poor harvesting conditions or when the CPU requires a hard reset. An always-on wake-up receiver enables near-zero standby power and asynchronous wake-up triggered by external interrupt signals, a clear advantage over a timer-based duty-cycling scheme, where long synchronization latency is inevitable.

Wake-up receivers for wireless sensor nodes have a number of unique requirements. Since the wake-up receiver is the lifeline to the sensor node, it should remain on at all times. Hence, minimizing the standby power of the wake-up receiver is critical. Also, for applications that require very small size, such as implantable sensors, the wake-up receiver should scale to very small size. Finally, a reasonable data rate is required to shorten the programming time.

RF-based approaches have been widely adopted [43]–[47] to realize ultra-low power (ULP) wake-up receivers because they offer high throughput. However, the RF oscillator and/or RF amplifier limit their energy efficiency to several nJ/bit, with relatively high standby power consumption, typically on the order of 10s of  $\mu$ W. Also, the antenna is difficult to scale to sub-mm<sup>2</sup> size.

Recently, an ultrasound solution was proposed [48], but it also exhibits high standby power compared to the nW power budgets in small sensor nodes. This approach also requires a piezoelectric transducer to be bonded to the CMOS die, complicating system integration.

Optical wake-up receivers that use free-space, line-of-sight optical communication were recently proposed [49]. They can scale to very small size and have ultra-low, sub-nW power consumption. They can also be applied to implantable sensor systems using a near-infrared photodiode. However, the low data rate of ~100 bps in [49] was limited by the use of bandwidth-inefficient pulse-width modulation. Furthermore, that work does not address the need for optical wake-up receivers to compensate for highly variable ambient lighting conditions, which can vary from 100s of lux to 100klux (three orders of magnitude).

#### 4.2 Wake-up Receiver Operation Including Noise Cancellation

This work presents a ULP wireless optical receiver that enables sub-nW asynchronous node wake-up, high data rate communication, and ambient background light tracking. The receiver consumes 380pW in always-on wake-up (i.e., standby) mode and 28.1µW in fast RX mode at 80kbps maximum data rate. In order to achieve both ultra-low standby power and high data rates, the proposed receiver employs dual mode operation: 1) Voltage mode for passcode verification: the photodiode voltage is used as the input signal and is directly sensed by a clocked comparator and digital demodulation logic to avoid power-hungry analog components. This voltage mode is used for wake-up passcode verification with ultra-low power consumption at low bit rate. The purpose of the passcode verification is to prevent the system from waking up on false triggers caused by noise. 2) Current mode for fast RX: the diode current is used as the input signal and is sensed by a trans-impedance amplifier to achieve a high bit rate at higher power consumption. This current mode for fast RX is activated only after successful passcode verification.

Passcode verification (i.e., always-on voltage mode) operates at 5bps maximum bit rate for a 3.2 second long verification process with a 16-bit passcode consuming 80pJ per bit. The fast RX current mode exhibits 250kbps maximum bit rate at 112.25pJ/bit energy consumption; this mode enables fast node programming and data communication. Both passcode verification mode and fast RX mode support a flexible data rate that is dynamically tracked by the clock recovery algorithm employed in the proposed receiver.
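The figures in this paragraph are related by simple arithmetic: verification time is the passcode length divided by the bit rate, and energy per bit is power divided by bit rate. The sketch below uses the rounded power values quoted in the text (380pW and 28.1µW), so it reproduces the quoted energies only approximately; the function names are illustrative.

```python
def passcode_time_s(passcode_bits, bit_rate_bps):
    """Time needed to send a passcode at a given bit rate."""
    return passcode_bits / bit_rate_bps


def energy_per_bit_j(power_w, bit_rate_bps):
    """Energy spent per received bit at a given power and bit rate."""
    return power_w / bit_rate_bps


verify_time = passcode_time_s(16, 5)             # 3.2 s for a 16-bit passcode
standby_energy = energy_per_bit_j(380e-12, 5)    # about 76 pJ/bit
fast_rx_energy = energy_per_bit_j(28.1e-6, 250e3)  # about 112 pJ/bit

if __name__ == "__main__":
    print(f"verification time: {verify_time:.1f} s")
    print(f"voltage mode:      {standby_energy * 1e12:.0f} pJ/bit")
    print(f"fast RX mode:      {fast_rx_energy * 1e12:.1f} pJ/bit")
```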

A visible light optical receiver is known to be vulnerable to in-band noise from various ambient light sources such as sunlight, incandescent, and fluorescent lighting, resulting in inferior bit rate and sensitivity. Addressing this critical challenge, this work proposes noise canceling circuitry in both the digital voltage-based passcode verification mode and the current-based analog fast RX mode, improving bit rate and input sensitivity while enabling operation across 0.3 - 100klux background light conditions.

Figure 4.1 shows a target application scenario of the proposed ULP optical receiver. Using the photodiode as a signal receptor, we can wake up, (re-)program, and re-synchronize the timer with other wireless sensor nodes using an LED transmitter attached to the ceiling or a smartphone. We



Figure 4.1. Target application scenario of the proposed ULP optical receiver.

use the parasitic photodiode as a signal receptor, providing much smaller size compared to a typical inductor/antenna for RF receivers. This enables miniaturization of the entire optical receiver system to the millimeter scale. The spectral response of the photodiode includes visible light. Therefore, it enables the use of ubiquitously available LED lights (e.g., ceiling lights and smartphone flash) to wake up, communicate with, and (re-)program the wireless sensor node with minimal investment in infrastructure.

Figure 4.2 shows the system architecture. The ULP wireless optical receiver consists of two parts: analog front-end (AFE) and digital back-end (DBE). The main comparator (MAIN\_COMP) in AFE and related circuits are power-gated (turned off) when not in use. DBE is always powered on; it runs off BASE\_CLK by default and switches to MAIN\_CLK when needed. Right after power-on-reset is released, the optical receiver goes into voltage mode, where



Figure 4.2. The system architecture of the ULP wireless optical receiver.

AFE only enables base comparator (BASE\_COMP) and base clock (BASE\_CLK), and DBE is clocked by BASE\_CLK. Using BASE\_CLK, the proposed optical receiver constantly samples the incoming light signal. Once the optical receiver detects the passcode while in voltage mode, it transitions to fast RX current mode, where AFE enables MAIN\_COMP and MAIN\_CLK, and DBE is clocked by MAIN\_CLK. Since MAIN\_CLK has a much higher frequency, the receiver throughput is significantly enhanced. Once the receiver enters fast RX current mode, the user has to send the passcode again, and then send the actual data bits (i.e., header, address, data, etc.).

The proposed ULP wireless optical receiver oversamples the incoming light signals. Hence, the incoming light pattern frequency must be much slower than BASE\_CLK (when in always-on voltage mode) or MAIN\_CLK (when in fast RX current mode).

#### 4.3 Analog Front End (AFE) Implementation

The passcode verification (i.e., wake-up) process consists of detecting a predefined preamble sequence (i.e. passcode) and matching the on-off keying (OOK) Manchester coded signal to the expected 16-bit passcode. The fast RX current mode block is power-gated until a



Figure 4.3. Always-on voltage mode operation of AFE.

valid passcode is successfully verified by the always-on passcode verification circuits. Upon verification, control signals disconnect the diode from the passcode verification block and turn on the fast RX current mode block to activate high data rate communication. The fast RX mode extracts both the clock and data from the Manchester coded input signal. Since the clock is embedded in the input signal itself, the clock frequency of the signal source does not have to be strictly synchronized with the ULP receiver local clock, and the data rate can range from 0.85 to 80kbps.
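Manchester coding places a transition in the middle of every bit, which is what lets the receiver recover the clock from the data. The sketch below shows an idealized encoder and decoder operating on already-sampled half-bit values. The polarity convention (a 1 is sent as light on then off) and the example passcode are assumptions made for illustration; the receiver described here additionally oversamples the input and tracks the bit period, which this sketch does not model.

```python
def manchester_encode(bits):
    """Encode bits as OOK half-bit pairs.

    Assumed convention: 1 -> (on, off) = (1, 0), 0 -> (off, on) = (0, 1).
    """
    half_bits = []
    for bit in bits:
        half_bits.extend((1, 0) if bit else (0, 1))
    return half_bits


def manchester_decode(half_bits):
    """Decode half-bit pairs back to bits, rejecting invalid pairs."""
    if len(half_bits) % 2 != 0:
        raise ValueError("expected an even number of half-bits")
    bits = []
    for first, second in zip(half_bits[0::2], half_bits[1::2]):
        if first == second:
            # A valid Manchester symbol always has a mid-bit transition.
            raise ValueError("missing mid-bit transition")
        bits.append(1 if (first, second) == (1, 0) else 0)
    return bits


# Hypothetical 16-bit passcode, for illustration only.
passcode = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1]
received = manchester_decode(manchester_encode(passcode))
passcode_ok = received == passcode
```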

In always-on voltage mode, the input optical signal is detected by comparing the diode voltage to a reference voltage ( $V_{REF}$ ) as shown in Figure 4.3. To adapt to different ambient light levels and avoid saturation of the diode voltage, the diode is loaded with a tunable unary-coded resistor bank that consists of 36 off-state medium  $V_T$  ( $MV_T$ ) transistors with geometric growth as



Figure 4.4. Tunable unary-coded resistor bank.



Figure 4.5. Light threshold and open circuit voltage of PV cell across SEL bit.

shown in Figure 4.4. This guarantees a monotonic increase of the resistor value with the selection code (SEL) while covering a background light level equivalent to 1nA to 375nA short circuit current of the photodiode (equivalent to a range of 250lux to 93klux for a 0.1mm<sup>2</sup> n+/pw/nw parasitic diode in 180nm CMOS). The use of off-transistors for diode loading was previously shown to provide a steep light/voltage response [49], thereby improving sensitivity. Figure 4.5 shows the simulated light intensity threshold (in lx) across the selection codes (SEL) of the tunable resistor bank, along with the simulated steepness of the light/voltage response of the proposed circuit across light intensity.
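To give a feel for the geometric spacing, the sketch below spreads 36 threshold currents evenly on a log scale between 1nA and 375nA and converts them to lux with the 250 lux per nA scale implied by the range quoted above. Treating each SEL step as one threshold, with the threshold rising as SEL increases, is an assumption made for illustration; the actual transistor sizes and code mapping are not given in the text.

```python
def geometric_thresholds(i_min_na=1.0, i_max_na=375.0, n_steps=36):
    """Return n_steps threshold currents (nA) with a constant step ratio."""
    ratio = (i_max_na / i_min_na) ** (1.0 / (n_steps - 1))
    return [i_min_na * ratio ** k for k in range(n_steps)]


# Scale implied by the text: 1 nA corresponds to 250 lux.
LUX_PER_NA = 250.0

thresholds_na = geometric_thresholds()
step_ratio = thresholds_na[1] / thresholds_na[0]       # about 1.18 per step
thresholds_lux = [i_na * LUX_PER_NA for i_na in thresholds_na]

if __name__ == "__main__":
    print(f"step ratio: {step_ratio:.4f}")
    print(f"lowest threshold:  {thresholds_lux[0]:.0f} lux")
    print(f"highest threshold: {thresholds_lux[-1]:.0f} lux")
```

A constant ratio between neighboring steps gives the same relative resolution at every ambient level, which suits a signal that spans more than two orders of magnitude.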

The detailed block diagram of the always-on voltage mode is shown in Figure 4.6. The diode voltage is compared to the reference voltage using two comparators, one for ambient light tracking and one for passcode detection. The clocked comparators are designed with 3V I/O transistors using a PMOS input pair suitable for input voltage levels < 500mV. The voltage reference circuit uses two zero-V<sub>T</sub> NMOS transistors in series with an  $SV_T$  always-on-state PMOS [50]. The 50Hz clock is generated using an always-on leakage-based differential thyristor



Figure 4.6. The detailed block diagram of the always-on voltage mode.

oscillator [51] with 50pW power consumption. Ambient light tracking and pass-code verification operate in parallel. The ambient light tracking logic searches for the SEL code that has a 50% probability of  $D_{OUT\_AMB} = 1$  as shown in Figure 4.6 and biases the diode output voltage exactly at  $V_{REF}$ . Since the underlying modulation is OOK, the light threshold for data detection must reside above the ambient light level. Hence, after the ambient light tracker has tested the light level and updated the SEL code, the code is increased by a fixed value (2 in our implementation) for subsequent data signal detection. It reverts back to the original code before the next ambient light tracking adjustment. The ambient light condition is assumed to be slowly varying, allowing the

ambient comparator to be clocked every 256 cycles (CLK\_AMB) as shown in Figure 4.7. Since the ambient comparator and data comparator operate with different SEL settings for the resistor bank, they cannot fire at the same time. Hence, Figure 4.7 shows how the ambient comparison is interspersed between data comparisons (data comparator fires on falling edge of CLK and ambient comparator fires on falling edge of CLK\_AMB, which corresponds to the rising edge of CLK). This allows the resistor bank to be reused for both comparators (avoiding mismatch issues if two banks are used) while maintaining constant data sampling rate, which is critical for clock/data recovery. Passcode verification is activated only after detecting three consecutive falling edges followed by a '101' sequence, which trains the internal clock recovery. After successfully verifying the next 16-bit passcode, three signals (power\_gate\_off, oscillator\_on, and digital\_reset



Figure 4.7. The timing diagram of two comparators of the always-on voltage mode.



Figure 4.8. The block diagram of the fast RX current mode.

as shown in Figure 4.6) are enabled sequentially with sufficient delay to power up and properly initialize the fast RX mode logic.
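The ambient light tracking loop described above can be sketched as a simple up/down search on the SEL code. This is a minimal sketch under stated assumptions, not the fabricated logic: the function names, the single-step up/down update, the comparator polarity, the log-spaced threshold model, and the 36-step code range (suggested by the SEL[35:0] bus) are all hypothetical. Only the 250lux to 93klux range and the fixed data offset of 2 come from the text.

```python
MAX_SEL = 36      # assumed code range, suggested by the SEL[35:0] bus
DATA_OFFSET = 2   # fixed offset applied for data detection (from the text)


def threshold_lux(sel):
    """Hypothetical monotonic light threshold for a SEL code (log-spaced)."""
    lo, hi = 250.0, 93_000.0  # covered background light range from the text
    return lo * (hi / lo) ** (sel / MAX_SEL)


def ambient_step(sel, light_lux):
    """One ambient comparison: move SEL toward the 50% point of D_OUT_AMB.

    Assumed polarity: D_OUT_AMB = 1 when the light exceeds the threshold.
    """
    d_out_amb = light_lux > threshold_lux(sel)
    if d_out_amb:
        return min(sel + 1, MAX_SEL)
    return max(sel - 1, 0)


def data_sel(sel):
    """SEL code applied while the data comparator samples."""
    return min(sel + DATA_OFFSET, MAX_SEL)


def track(light_lux, steps=100, sel=0):
    """Run repeated ambient comparisons and return the final SEL code."""
    for _ in range(steps):
        sel = ambient_step(sel, light_lux)
    return sel
```

In this model the tracked code toggles between the two codes that bracket the ambient level, and the data comparator always sees a threshold above the ambient level, which is the property the OOK detection relies on.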

The proposed receiver is robust against process variation because the CLK frequency does not have to be accurate as long as it provides sufficient oversampling (e.g., 8×) with respect to the passcode data rate. All passcode verification block logic uses 3V I/O transistors to achieve ultralow leakage and standby power.

Figure 4.8 shows the block diagram of the fast RX current mode and illustrates the concept of current mode operation. Instead of using the photodiode voltage as the signal to be decoded, the current mode uses the photodiode current, which is proportional to the light signal intensity, by adding a trans-impedance amplifier to the photodiode. This configuration can achieve a much higher data rate since the photodiode voltage is held constant, which eliminates the impact of the intrinsic capacitance. The analog front-end (AFE) circuitry of the fast RX current mode block employs a trans-impedance amplifier (TIA) with a DC noise canceling feedback circuit that amplifies the current signal and converts it to rail-to-rail voltage. The AFE for fast RX current mode consists of three amplifiers as shown in Figure 4.9: 1) the light diode regulation amplifier regulates  $V_{DIODE}$  to the  $V_{REF}$  from the passcode verification block, independent of the noise level, 2) the ambient cancelation amplifier enforces the DC value of the  $V_{SIG}$  node to be the same as the

 $V_{SIG_{REF}}$ and generates an adequate gate voltage to sink the DC ambient current to ground, 3) the post amplifier consists of current mirror-based differential amplifiers in cascaded fashion to compare  $V_{SIG}$  with  $V_{SIG_{REF}}$  and amplify their difference to full rail. During the fast RX current mode, the CDR circuit is clocked by a current-starved ring oscillator at 700 kHz. This CDR circuit takes the voltage input data stream from the TIA block and generates the decoded Data and CLK with a flexible data rate (i.e., CLK rate) ranging from 0.85 kHz to 80 kHz. The diode regulation and ambient cancelation amplifiers have an identical two-input cascaded structure, which uses the same three biases generated by the on-chip voltage reference generator. However, these two amplifiers are sized differently to meet their gain and bandwidth requirements. Decoupling capacitors ( $C_1 \& C_2 = 3.2pF$ ) are inserted for feedback loop stability and also for controlling the bandwidth of the noise cancelation. During fast RX current mode, the feedback loop cancels out



Figure 4.9. The schematic of trans-impedance amplifier with ambient light cancelling.

ambient light noise if the noise spectrum is narrower than the loop bandwidth, designed to be 100Hz.

#### 4.4 Digital Back End (DBE) Implementation

The digital back end circuit takes the Manchester-encoded output stream from the AFE described in the previous section and decodes the stream into the recovered GOC\_CLK and GOC\_DATA as shown in Figure 4.2. Figure 4.10 shows the Manchester-encoded stream and the corresponding bit information. To send 'bit 1', there should be "no intense light" for the first half period followed by "intense light" during the second half period. To send 'bit 0', there should be "intense light" for the first half period, followed by "no intense light" during the second half period.
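The bit-to-light mapping above can be written down directly. The following is a minimal sketch of that mapping only, with `1` standing for "intense light" and `0` for "no intense light" in each half period; the function names are hypothetical and the sketch ignores the oversampling and filtering performed by the DBE.

```python
def manchester_encode(bits):
    """Map each bit to two half-period light levels (1 = intense light)."""
    halves = []
    for b in bits:
        # bit 1: dark then light; bit 0: light then dark
        halves.extend([0, 1] if b else [1, 0])
    return halves


def manchester_decode(halves):
    """Invert manchester_encode; reject pairs without a mid-bit transition."""
    if len(halves) % 2:
        raise ValueError("expected an even number of half periods")
    bits = []
    for first, second in zip(halves[0::2], halves[1::2]):
        if (first, second) == (0, 1):
            bits.append(1)
        elif (first, second) == (1, 0):
            bits.append(0)
        else:
            raise ValueError("invalid Manchester symbol")
    return bits
```

Because every symbol contains a transition in the middle of the bit period, the receiver can recover the clock from the data stream itself.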

The DBE has a moving average filter to minimize the effect of noise. It stores the last 8 samples and compares the number of "Light (i.e., cmp = 1)" samples with the value specified by GOC\_MAVG\_THRESHOLD. If the number of "Light" samples in the last 8 samples exceeds GOC\_MAVG\_THRESHOLD, the filter sets its output *mavg* to 1, which is then fed into the matched filter. Otherwise, *mavg* is set to 0. The matched filter has a 16-bit shift register which is fed by the moving average output (i.e., *mavg*). At every clock edge, the pattern stored in the shift register is compared with the pre-defined pattern. Basically, this is






an XNOR operation, where a location is regarded as "matched" when the corresponding locations in the shift register and the pre-defined pattern have the same value. Then, the number of "matched" locations is compared with the threshold value specified by GOC\_MF\_THRESHOLD. If it exceeds GOC\_MF\_THRESHOLD, the matched filter sets its output (MF) to 1, which is fed into the FSM for further processing.
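The two filtering stages described above can be sketched in software as follows. This is a behavioral sketch, not the synthesized logic: the class names are hypothetical, the thresholds stand in for GOC\_MAVG\_THRESHOLD and GOC\_MF\_THRESHOLD, and the pre-defined pattern is supplied by the caller because the text does not specify it.

```python
from collections import deque


class MovingAverage:
    """8-sample majority-style filter on the comparator output."""

    def __init__(self, threshold, window=8):
        self.threshold = threshold  # stands in for GOC_MAVG_THRESHOLD
        self.samples = deque([0] * window, maxlen=window)

    def step(self, cmp_bit):
        """Shift in one sample; return mavg (1 if the count exceeds the threshold)."""
        self.samples.append(1 if cmp_bit else 0)
        return 1 if sum(self.samples) > self.threshold else 0


class MatchedFilter:
    """Shift register compared bit by bit (XNOR) with a fixed pattern."""

    def __init__(self, pattern, threshold):
        self.pattern = list(pattern)  # pre-defined pattern (caller supplied)
        self.threshold = threshold    # stands in for GOC_MF_THRESHOLD
        self.shift_reg = deque([0] * len(self.pattern), maxlen=len(self.pattern))

    def step(self, mavg):
        """Shift in mavg; return MF (1 if the number of matches exceeds the threshold)."""
        self.shift_reg.append(mavg)
        matches = sum(1 for a, b in zip(self.shift_reg, self.pattern) if a == b)
        return 1 if matches > self.threshold else 0
```

Counting matches against a threshold, rather than requiring all 16 locations to agree, is what lets the detector tolerate a few noisy samples.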

The proposed DBE oversamples the incoming light pattern. There is no fixed oversampling ratio, since the DBE automatically adjusts to the incoming light period. However, there are upper and lower limits on the oversampling ratio that the user can use. The oversampling ratio is calculated based on the clock frequency of the proposed ULP optical receiver. The lower limit of the oversampling ratio is governed by the number of bits in the shift register in the DBE. Thus, it can be 16 in the ideal situation; however, 20 is recommended in practice due to noise/jitter concerns. The upper limit of the oversampling ratio is governed by the time-out feature implemented in the DBE.

'Consistency' is much more important than the oversampling ratio, since the proposed receiver trains itself using the incoming light pattern to extract the period information. If the period varies significantly during a training transaction, it might result in synchronization errors. In order to avoid the synchronization error, the following condition must be met.

$$\frac{|T(n)-T(n-1)|}{T(n-1)} < \frac{1}{2^{(GOC\_TRAIN\_MAX\_ERROR+1)}}$$
(4.1)

where n is an integer denoting each occurrence of the symbols (i.e., "bit 1" and "bit 0" in Figure 4.1), GOC\_TRAIN\_MAX\_ERROR is a predefined value that guarantees the minimum consistency required by the DBE, T(n) is the current period detected by the DBE, and T(n-1) is the estimated period information internally stored in the DBE. T(n-1) is updated at the end of every period using the equation below.

$$T(n+1) = \frac{3}{4}T(n-1) + \frac{1}{4}T(n)$$
(4.2)

Although the condition given in (4.1) must always be met, the user should put a much stricter constraint on the period variation, since more variation in the period increases the chance of detection errors.
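Condition (4.1) and the update in (4.2) can be checked numerically with a short sketch. The function names and the example periods are hypothetical; `t_est` plays the role of the stored estimate T(n-1) and `t_new` the newly detected period T(n).

```python
def period_consistent(t_new, t_est, goc_train_max_error):
    """Condition (4.1): relative period change below 1 / 2**(E + 1)."""
    limit = 1.0 / (1 << (goc_train_max_error + 1))
    return abs(t_new - t_est) / t_est < limit


def update_estimate(t_new, t_est):
    """Update (4.2): weighted average that favors the stored estimate."""
    return 0.75 * t_est + 0.25 * t_new


# Example with GOC_TRAIN_MAX_ERROR = 3, i.e. a limit of 1/16 = 6.25%.
estimate = 100.0
for detected in (104.0, 98.0, 101.0):
    if period_consistent(detected, estimate, 3):
        estimate = update_estimate(detected, estimate)
```

The 3/4 to 1/4 weighting means a single noisy period only moves the stored estimate by a quarter of its deviation, which is consistent with the emphasis on consistency over the absolute oversampling ratio.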

#### 4.5 Measurement Results

The proposed design is fabricated in  $0.18\mu m$  CMOS with total area of  $0.85mm^2$ . The die photograph of the fabricated chip is shown in Figure 4.11. Figure 4.12 shows the measured waveform of SEL[35:0] and V<sub>CTRL</sub> signals as the ambient light intensity changes. As expected, SEL[35:0] toggles in steady state and V<sub>CTRL</sub> tracks the ambient light immediately to cancel the DC noise current. Measured bit error rates (BER) are shown in Figure 4.13 for the fast RX mode using a 3W LED source operating across an ambient light intensity ranging from low office light (500 lux) to full sun (100 klux). The DC noise canceling feedback circuit limits the BER increase to  $3.4\times$  across 500lux to 100klux at 0.5m TX-RX distance. Minimum required incident power on the



Figure 4.11. The die photograph of the chip fabricated in 180nm CMOS.

0.01mm<sup>2</sup> PV diode is 900nW @ 850nm wavelength. This enables transmit distance of 25m with a standard 3W LED. The proposed receiver achieves its maximum energy efficiency of 112.5pJ/bit at the highest achieved bit rate, 250kbps as shown in Figure 4.13.

The light pattern to activate the fast RX current mode from always-on voltage mode is shown in Figure 4.14. The 'Header/Address/Data in fast RX mode' part denotes the initial programming bit sequence. In Figure 4.14,  $T_{BASE}$  indicates the period used in voltage mode, and  $T_{MAIN}$  indicates the period used in current mode. Table 4.1 compares the proposed free-space optical receiver to prior



Figure 4.12. Measured ambient light tracking (voltage/current mode) of the optical receiver.

work, showing  $1.8 \times$  improvement in standby power,  $2700 \times$  faster bit rate, and  $1.25 \times$  higher energy efficiency.



Figure 4.13. Measured BER and energy per bit of fast RX current mode.



Figure 4.14. The light pattern to activate the fast RX current mode.

|                 | [34]                      | [35]              | [36]              | [37]              | This Work         |
|-----------------|---------------------------|-------------------|-------------------|-------------------|-------------------|
| Transmit Method | RF                        | RF                | RF                | Optical           | Optical           |
| Technology      | 90nm                      | 180nm             | 130nm             | 180nm             | 180nm             |
| Supply Voltage  | 0.5V                      | 1.8V              | 1.2V              | 1.2V              | 1.2V              |
| Power           | 52μW                      | 8.5µW/1078µW      | 16.4μW/22.9μW     | 695pW             | 380pW/28.1μW      |
| Max. Data Rate  | 100kbps                   | 1k/200kbps        | 10k/200kbps       | 91bps             | 5bps/250kbps      |
| Energy/Bit      | 0.52nJ/bit                | 8.5/5.39nJ/bit    | 1.64/0.11nJ/bit   | 140pJ/b           | 80/112.25pJ/b     |
| BER             | @-72dBm <10 <sup>-3</sup> | <10 <sup>-6</sup> | <10 <sup>-5</sup> | <10 <sup>-5</sup> | <10 <sup>-5</sup> |

Table 4.1. Performance summary & comparison with previous wake-up receiver works.
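The improvement factors quoted against the prior optical receiver [37] follow from the entries of Table 4.1. The short calculation below reproduces them as a sanity check; the dictionary keys are hypothetical names, and the values are copied from the table.

```python
# Values copied from Table 4.1 ([37] versus this work).
prior = {"standby_w": 695e-12, "max_rate_bps": 91.0, "energy_j_per_bit": 140e-12}
this_work = {"standby_w": 380e-12, "max_rate_bps": 250e3, "energy_j_per_bit": 112.25e-12}

standby_gain = prior["standby_w"] / this_work["standby_w"]
rate_gain = this_work["max_rate_bps"] / prior["max_rate_bps"]
energy_gain = prior["energy_j_per_bit"] / this_work["energy_j_per_bit"]

# Fast RX mode energy per bit from the listed power and data rate.
fast_mode_energy = 28.1e-6 / 250e3

print(f"standby power: {standby_gain:.2f}x lower")
print(f"bit rate:      {rate_gain:.0f}x faster")
print(f"energy/bit:    {energy_gain:.2f}x better")
print(f"fast mode:     {fast_mode_energy * 1e12:.1f} pJ/bit")
```

The ratios come out at roughly 1.8×, 2700× and 1.25×, and the fast-mode power divided by the bit rate gives about 112 pJ/bit, in line with the energy figure listed in the table.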

#### 4.6 Conclusions

This work presents a sub-nW standby power optical receiver with ambient light noise canceling. The proposed optical receiver uses dual-mode operation in order to achieve both ultralow standby power and energy-efficient fast data reception. At distances below 0.3m, the noise canceling circuit enables the receiver to operate with BER  $<10^{-4}$  over a wide range (3 orders of magnitude) of ambient light conditions. The best energy efficiency of the receiver, 112.25pJ/bit, is achieved at the maximum bit rate of 250kbps. Compared with the prior ultra-low power optical receiver, it shows a 1.8× improvement in standby power, a 2700× faster bit rate, and 1.25× higher energy efficiency, consistent with Table 4.1.

### **CHAPTER 5**

# A 13.8pJ/b Implantable Optical Transmitter Achieving 65kbps with < 10<sup>-6</sup> BER at 10mm Tissue Depth

#### 5.1 Introduction

The next generation of implantable medical devices focuses on minimally invasive miniaturized solutions and ultra-low power operation. Traditional wireless implantable systems have used radio frequency (RF) communication. However, RF efficiency and link distance degrade super-linearly as antenna size shrinks with system size. For instance, > 10mm link distance requires an antenna size of at least 1mm<sup>2</sup> and power consumption of 10s of mW, significantly constraining battery lifetime of miniaturized systems [52]–[54]. Recently, ultrasound implantable communication has been proposed using piezoelectric transducers, offering relatively low propagation loss [55]–[57]. By operating at MHz frequencies, the transducer can be reduced in size to mm-scale; however, power consumption remains in the 100mW range and energy efficiency is relatively low at hundreds of pJ/bit.



Figure 5.1. Target design space for ultra-small implantable transmitter applications.

This work proposes an optical transmitter for ultra-small implantable applications with the target design space shown in Figure 5.1. Optical communication has the advantage that it uses an LED as the transmitter element; LEDs can be easily scaled to < 0.1mm<sup>2</sup> while maintaining high efficiency [58], [59]. Unlike RF and ultrasound systems, which require custom-designed RF antennas and transducers, the proposed optical transmitter uses a commercially available LED. We propose the use of a photo-multiplier tube (PMT) as the receiver element which, with its capability of detecting a single photon, has extreme sensitivity, thereby reducing transmit power. Furthermore, in contrast to RF and ultrasound, optical transmission does not require the generation (and precise control) of a carrier frequency, thus significantly reducing the power of the transmit electronics and greatly simplifying the signal modulation circuits. We build on these advantages of optical communication by using an energy-efficient pulse-position based modulation scheme. By optimizing the modulation point, the system achieves 3bits/pulse, increasing energy efficiency by 2.42× compared to a conventional binary pulse position modulation (PPM) with 10mm tissue. The proposed transmitter was implemented in 180nm CMOS technology, uses a  $0.22 \times 0.27$ mm<sup>2</sup>



Figure 5.2. The proposed system block diagram.

transmit LED, and achieves 13.8pJ/bit and 890nW power consumption at 65kbps with BER  $< 10^{-6}$  and a link distance of 10mm tissue / 5cm free space.

#### 5.2 Energy-efficient implantable optical transmitter

The system architecture consists of an optical reader device (ORD) and subcutaneous implanted sensor as shown in Figure 5.2. An 850nm wavelength IR-LED on the ORD transmits to a custom AlGaAs photodiode on the sensor node for downlink communication. The implanted sensor node demodulates both data and the embedded clock from the downlink [8]. Using the harvested clock, the sensor node then transmits to the ORD using a 450nm blue LED, creating channel separation between transmit and receive in a frequency division full duplex fashion. The PMT receiver on the ORD uses the photoelectric effect to amplify and detect individual photons achieving extremely high receiver sensitivity. For each detected photon the PMT generates a ~10ns pulse. The resulting pulse rate is proportional to LED light intensity and uplink information bits are then decoded by translating the observed PMT pulse rate to transmitted LED pulse position.



Figure 5.3. Optical TX system architecture.

Figure 5.3 shows the structure of the optical TX system of the sensor. Using the clock CLK<sub>EXTR</sub> extracted from the downlink signal, the optical-TX encodes data and then generates the LED current using a switched capacitor-based current source and controllable current multiplier. A key design parameter that affects the TX transmission efficiency is the LED power when transmitting a data symbol (or pulse). Figure 5.4 shows the pulse rate generated by the PMT as a function of LED power when transmitting through 10mm of tissue. Ideally, the pulse rate is linear with LED power since an increase in power should generate proportionally more photons and hence higher PMT pulse rate. However, at very low power levels, the LED efficiency degrades and the graph shows non-linear behavior. Similarly, at very high power levels, the photon rate becomes so high that the PMT pulses overlap and the pulse rate saturates. Both conditions degrade energy efficiency and therefore the aim is to operate the TX in the linear portion of the graph in Figure 5.4.



Figure 5.4. Measured average pulse rate of PMT with 10mm tissue.

Various combinations of pulse width and amplitude result in distinct photon rate distributions at the PMT. Given the PMT measured pulse rate shown in Figure 5.4 and assuming a Poisson distribution of photon arrival time interval, the bit error rate (BER) is computed and



Figure 5.5. Symbol rate of PMT across symbol amplitude (LED power) with 10mm tissue.



Figure 5.6. Energy per bit across data rate with 10mm tissue.

shown in Figure 5.5 for different symbol widths (shown on the y-axis as 1/symbol rate) and symbol amplitudes (shown as  $P_{LED}$ ). As expected, a lower BER requires longer symbols (lower symbol rate) or a higher symbol amplitude ( $P_{LED}$ ). Then, by integrating the LED power over the symbol length, we obtain the energy per information bit as shown in Figure 5.6, assuming simple binary-PPM. The energy-optimal symbol shape has a clear dependency on the BER and also depends on transmit distance. For example, for BER =  $10^{-5}$ , near-energy-optimal operation of 20pJ/b is achieved for data rates at 300kbps with 5.66µW LED transmit power. As expected, as we lower the target BER, the optimal energy/bit increases and the energy-efficient data rate decreases as well. Since the TX has different optimal energy points depending on the target BER, link distance, and/or tissue thickness, the LED power and symbol rate of the TX need to be adapted for each specific operating scenario.
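The kind of BER calculation described above can be illustrated with a small Poisson counting model. This is a sketch under assumptions that are not taken from the text: binary PPM with two slots, independent Poisson pulse counts in the signal slot and the empty slot, a decision for the slot with the higher count, and ties split evenly. The mean counts would be obtained from the measured PMT pulse rate multiplied by the slot width; the numbers used in the test are illustrative only.

```python
import math


def poisson_pmf(k, mean):
    """P(X = k) for a Poisson variable, evaluated in the log domain."""
    if mean == 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def binary_ppm_ber(signal_mean, background_mean, k_max=None):
    """BER when the slot with more counts is chosen (ties split evenly).

    signal_mean: expected PMT pulses in the slot that carries the LED pulse.
    background_mean: expected PMT pulses in the empty slot (assumed model).
    """
    if k_max is None:
        top = max(signal_mean, background_mean)
        # Truncate the sums far into the Poisson tail.
        k_max = int(top + 10 * math.sqrt(top + 1) + 20)
    ber = 0.0
    cdf_b = 0.0
    for k in range(k_max + 1):
        p_s = poisson_pmf(k, signal_mean)
        p_b = poisson_pmf(k, background_mean)
        cdf_b += p_b
        p_b_greater = max(0.0, 1.0 - cdf_b)  # P(background count > k)
        ber += p_s * (p_b_greater + 0.5 * p_b)
    return ber
```

In this model, increasing the mean count in the signal slot, either through a longer symbol or a higher LED power, lowers the BER, which mirrors the trade-off between symbol width and amplitude discussed above.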

The LED driver consists of a switched-capacitor current source, current mirror and modulation circuit as shown in Figure 5.7. Operation is performed in two non-overlapping phases: During  $\phi 2$ , the capacitor array is discharged while the voltage on C<sub>1</sub> is forced to the reference



Figure 5.7. Schematic of switched capacitor-based LED Driver.

voltage,  $V_{REF}$ , by the amplifier. Then, during  $\phi 1$ , the output of the operational amplifier is constant and charges the capacitor array and C<sub>1</sub> with a constant current via the mirror M<sub>3</sub> and M<sub>4</sub>. In steady state, the voltages on both C<sub>1</sub> and the capacitor array at the end of  $\phi 1$  will equal V<sub>REF</sub> and I<sub>BIAS</sub> =  $2 \times V_{REF} \times C_{ARRAY} \times F_{CLK}$ , assuming a 50% duty cycle of the non-overlapping clocks  $\phi 1$  and  $\phi 2$ . A current mirror with two stacked NMOS transistors then multiplies I<sub>BIAS</sub> from the switched-capacitor current source and draws N×I<sub>BIAS</sub> from the 450nm InGaN LED to transmit the symbol data.

The proposed TX circuit structure has the following advantages: 1) Since the capacitor has a small temperature dependence and the clock frequency is extracted from the external ORD, we obtain a very stable current reference [9], 2) the current I<sub>BIAS</sub> is directly proportional to the clock frequency  $F_{CLK}$ , resulting in inversely correlated pulse width and amplitude. Hence, energy per pulse is constant and is modulated by the capacitance array as given by  $N \times V_{REF} \times C_{ARRAY} \times V_{LED}$  ( $V_{LED}$  is the operating voltage of the LED). This enables effective modulation of the light signal within the energy optimal region of the symbol length / amplitude space shown in Figure 5.6.
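The frequency independence of the pulse energy can be checked with a few lines of arithmetic. The sketch below assumes that each LED pulse lasts half a clock period, which is what makes the product of N×I<sub>BIAS</sub>, V<sub>LED</sub> and the pulse width reduce to the expression given above; that pulse width and all component values are illustrative assumptions, not values from the design.

```python
import math


def bias_current(v_ref, c_array, f_clk):
    """I_BIAS = 2 * V_REF * C_ARRAY * F_CLK (50% duty cycle, from the text)."""
    return 2.0 * v_ref * c_array * f_clk


def energy_per_pulse(n, v_ref, c_array, f_clk, v_led):
    """LED energy for one pulse lasting half a clock period (assumed width)."""
    pulse_width = 1.0 / (2.0 * f_clk)
    return n * bias_current(v_ref, c_array, f_clk) * v_led * pulse_width


# Illustrative values only: 0.5 V reference, 1 pF array, 2.8 V LED, N = 4.
e_slow = energy_per_pulse(4, 0.5, 1e-12, 100e3, 2.8)
e_fast = energy_per_pulse(4, 0.5, 1e-12, 1e6, 2.8)
assert math.isclose(e_slow, e_fast)  # both equal N * V_REF * C_ARRAY * V_LED
```

Raising the clock frequency increases the pulse amplitude and shortens the pulse by the same factor, so the energy per pulse is set only by N, V<sub>REF</sub>, C<sub>ARRAY</sub> and V<sub>LED</sub>.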

#### 5.3 Measurement Result

The proposed optical TX is fabricated in 180nm CMOS. Figure 5.8 shows the testing setup with chicken tissue used to measure the TX energy efficiency and the BER of a given M-ary PPM modulation scheme. Unlike the analysis in Figure 5.6, the shown measurements also include the energy of the circuits on the TX chip. As the number of positions in M-PPM increases, LED energy per bit decreases as a higher order M-PPM encodes more bits per pulse. However, the TX circuit energy overhead increases because of the longer symbol duration. As shown in Figure 5.9, 8-ary PPM achieves the best energy efficiency in this particular scenario as it balances LED and TX circuit energy, reaching an energy point of 13.8pJ/bit as shown in Figure 5.10. As tissue thickness increases, energy/bit of the TX rises while the data rate achieving the optimal energy efficiency decreases to meet the target BER as shown in Figure 5.12. Figure 5.11 illustrates the measured result of BER vs. energy efficiency



Figure 5.8. Testing setup with chicken tissue.

[Plot for Figure 5.9: energy/bit (J) with 10mm chicken tissue for binary, 4-ary, 8-ary, 16-ary, 32-ary and 64-ary PPM; annotated point: 13.8pJ/bit @ 65kbps.]

Figure 5.9. Energy efficiency of proposed TX with different modulation scheme.

across the free space TX-ORD distance. Up to 16cm free space distance can be achieved with BER $< 10^{-3}$ at 20pJ/bit. Table 5.1 summarizes TX performance and compares with prior work.




Figure 5.10. Measured energy efficiency when using 8-ary PPM scheme for TX.

It shows the smallest TX element size and lowest power consumption among those listed while achieving comparable link distance through tissue / free space. Energy efficiency is improved by  $\geq 24\times$  compared to the other optical TX designs listed [58], [60].
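The M-ary PPM scheme evaluated in this section encodes log<sub>2</sub>(M) bits in the position of a single pulse, so 8-ary PPM carries 3 bits per pulse. A minimal sketch of the mapping is shown below; the natural binary, MSB-first mapping from bit groups to slot indices and the function names are assumptions for illustration, not the encoding used on the chip.

```python
def bits_per_symbol(m):
    """Number of bits carried by one M-ary PPM pulse (M a power of two)."""
    if m < 2 or m & (m - 1):
        raise ValueError("M must be a power of two")
    return m.bit_length() - 1


def ppm_encode(bits, m=8):
    """Group bits (MSB first) and return the pulse slot index per symbol."""
    k = bits_per_symbol(m)
    if len(bits) % k:
        raise ValueError("bit count must be a multiple of bits per symbol")
    slots = []
    for i in range(0, len(bits), k):
        value = 0
        for b in bits[i:i + k]:
            value = (value << 1) | b
        slots.append(value)
    return slots


def ppm_decode(slots, m=8):
    """Recover the bit stream from the detected pulse slot indices."""
    k = bits_per_symbol(m)
    bits = []
    for s in slots:
        bits.extend((s >> shift) & 1 for shift in range(k - 1, -1, -1))
    return bits


def average_power(energy_per_bit_j, data_rate_bps):
    """Average TX power implied by an energy per bit and a data rate."""
    return energy_per_bit_j * data_rate_bps
```

As a consistency check on the reported figures, `average_power(13.8e-12, 65e3)` gives about 0.897 µW, which agrees with the reported 890nW total TX power to within rounding.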



Figure 5.11. Measured BER and energy efficiency of 8-ary PPM.

(a) Measured BER (b) measured energy efficiency across TX-ORD distance



Figure 5.12. Energy efficiency and corresponding data rate at different tissue depth.

|                       | <b>[41]</b><br>IEEE TBCAS, 2015 | <b>[43]</b><br>Neuron, 2016 | <b>[45]</b><br>CICC, 2014 | G. Hwang,<br>JSSC, 2017 | <b>[44]</b><br>IEEE TCAS-I, 2017 | This Work                             |
|-----------------------|---------------------------------|-----------------------------|---------------------------|-------------------------|----------------------------------|---------------------------------------|
| Tx Size†              | 100mm <sup>2</sup>              | 0.5625mm <sup>2</sup>       | 7mm <sup>2</sup>          | 5.4mm <sup>2</sup>      | 0.1mm <sup>2</sup>               | 0.0594mm <sup>2</sup>                 |
| Transmit Method       | RF<br>OOK/BPSK                  | Ultrasound                  | Optical                   | Optical, TDMA<br>& OOK  | Optical OOK                      | Optical, M-ary<br>PPM                 |
| Frequency / $\lambda$ | 3.1 ~ 7GHz                      | 1.85MHz                     | 850nm                     | 591/624nm               | 770nm                            | 450nm                                 |
| Data rate             | 500Mbps                         | 0.5Mbps                     | 16kbps                    | 2.048Mbps               | 100bps                           | 65kbps                                |
| Energy (pJ/b)         | 10.8                            | 240                         | 331                       | 880                     | 1000                             | 13.8                                  |
| Total TX<br>Power     | 5.4mW                           | 0.12mW                      | 5.4μW                     | 1.8mW                   | 100nW                            | 890nW                                 |
| Medium                | skin, fat, bone,<br>and brain   | rat tissue                  | pig skin                  | free space              | free<br>space                    | chicken tissue<br>/free space         |
| Tx-Rx<br>Distance     | 6mm                             | 8.9mm                       | 4.75mm                    | 41cm                    | 10cm                             | <sup>††</sup>10mm/5cm<br>(35mm/45cm)  |

Table 5.1. Performance summary & comparison

<sup>†</sup>RF TX: antenna size; Ultrasonic TX: piezoelectric transducer size; Optical TX: LED size

<sup>††</sup>Distance at minimum energy per bit (maximum distance at 7nJ/bit)

## **CHAPTER 6**

## **Conclusions and Future Directions**

#### 6.1 Conclusions

A miniaturized cubic millimeter-scale device, which is a new type of IoT platform, has received attention due to its low power, small size, and low cost. These qualities make it useful for a wide range of new applications, from health monitoring to effective maintenance of industrial equipment. However, there are several challenges that need to be addressed to transform such a device into a bona fide IoT platform. This thesis proposed novel solutions for these challenges, adopting ultra-low power circuit techniques to minimize the energy per operation and power consumption of the miniaturized IoT device.

Due to the restricted size limit on the battery of cubic millimeter-scale IoT devices, the average power consumption of a standard miniaturized sensor node should be at nW or sub-nW levels. Therefore, an extensively standby power-oriented design and an aggressive power duty cycling scheme were applied to a cubic millimeter-scale IoT system in order to operate with the minimum level of energy and power. The power cycle of a miniaturized IoT device has two operation modes: long-duration sleep mode and short-duration active mode. Thus, it is critical to

reduce the standby power consumption of the always-on blocks and to design energy-efficient circuits for both modes. This thesis suggested several circuit design techniques to reduce the static power and to increase the energy efficiency of the blocks needed to make the miniaturized sensor system. In chapter 2, the dynamic leakage-suppression logic was presented to enable ultra-low power battery-less operation of wireless sensing systems. A 32-bit Cortex M0+ processor and a 256-B memory were implemented using this logic family while consuming 295 pW at 0.55 V. We demonstrated the battery-less operation of the processor, which operates directly from a 0.09-mm<sup>2</sup> bulk silicon solar cell under 240 lux, similar to indoor light conditions.

Chapter 3 discussed the ultra-low power, always-on sensor interface design techniques. We proposed an always-on light-to-digital converter (LDC) and found that it measured a 1.9x wider light intensity range and consumed 7,200x less power than the previous design. The proposed LDC uses the unique property of the Dynamic Leakage Suppression logic Ring Oscillator (DLS RO), discussed in chapter 2. We demonstrated that the proposed circuit accumulated the light energy in 32-bit code with 0.38% resolution error while consuming 550 pW.

In chapters 4 and 5, the low-power free-space optical transceiver circuit design was presented. More specifically, an always-on sub-nW standby power optical wake-up receiver was suggested in chapter 4. The proposed optical receiver adopted a dual-mode (voltage and current mode) operation for ultra-low standby power and energy-efficient fast data reception. This optical receiver could cancel in-band light noise from 200 lx to 100 klx, improving the BER under various ambient light conditions. Finally, chapter 5 described an energy-efficient optical transmitter (TX) for cubic millimeter-scale implantable sensor systems designed to transfer data from low-power minimally invasive implants. When paired with a photo-multiplier tube receiver, the TX consumed 0.89µW at 65 kbps while achieving 13.8 pJ/bit energy efficiency.

#### 6.2 Future Directions

The system that was synthesized with the proposed dynamic leakage suppression logic family consumed extremely low power while exhibiting poor performance. Since the current that switches the output of each logic gate was subthreshold leakage from the super-cutoff transistors, the FO4 delay was 0.7 ms, which dictated the performance of the entire system. For the library implementation, we used only a DLS 2-input NAND, 2-input NOR, inverter, latch, and flip-flop. Moreover, we inserted buffers into most of the paths in the processor to provide a hold-time margin in excess of what was expected to be necessary. Making a library with various types and sizes of DLSL gates and relaxing the hold time margin would enhance the performance and reduce the size compared to the present version. Implementing DLSL in a more advanced technology will be the next step of this line of research.

The low-power optical wake-up receiver proposed in chapter 4 was integrated into the current wireless sensor node system for test purposes. We measured how well the unary-coded resistor bank in the proposed wake-up receiver could track the target ambient light intensity across the operating temperature range (0-8°C). In order to track the target ambient light intensity range (100~100 klx) properly, more parallel shunt resistors should be added to the bank. In addition, we verified the required settling time of the analog feedback circuit in the current mode operation. From this settling time information, the preamble duration between voltage-mode and current-mode operation can be determined. The final step of this work would be to test the revised wake-up receiver integrated with the wireless sensor node system.

Lastly, this thesis presented an implantable optical transmitter with a PMT as the receiver. Thanks to the extremely sensitive PMT, the proposed optical transmitter achieved low power operation with a reasonable energy per bit of 13.8 pJ/bit at 65 kbps transmit speed. Adding pulse-amplitude modulation to the existing pulse-position modulation scheme will lead to higher energy efficiency. The implantable optical transmitter should control the pulse width and the pulse position to achieve high energy efficiency so that it can operate under various circumstances (e.g., different depths of implantation or tissue characteristics). The modified modulation can be implemented and tested to validate the energy efficient operation of the optical transmitter, enabling an implantable application. Additionally, we performed the transmitter experiments with chicken breast tissue, expecting that chicken breast has absorption characteristics similar to those of real human skin tissue for the given LED light wavelengths [61]. Since the distribution of arteries and veins in chicken breast tissue is not identical to that of real human skin tissue, it is necessary to test the transmitter performance under in-vivo implantation in rat.

# **BIBLIOGRAPHY**

- [1] "What makes a Nest thermostat a Nest thermostat?," *Nest*. [Online]. Available: https://www.nest.com/thermostats/. [Accessed: 18-Feb-2018].
- [2] "Production Line Monitoring | Monnit Corp." [Online]. Available: https://www.monnit.com/solutions/production-line-monitoring. [Accessed: 22-Feb-2018].
- [3] J. Manyika *et al.*, "Unlocking the potential of the Internet of Things | McKinsey & Company." [Online]. Available: https://www.mckinsey.com/business-functions/digitalmckinsey/our-insights/the-internet-of-things-the-value-of-digitizing-the-physical-world.
- [4] S. Andreev et al., "Understanding the IoT connectivity landscape: a contemporary M2M radio technology roadmap," IEEE Communications Magazine, vol. 53, no. 9, pp. 32–40, Sep. 2015.
- [5] B. I. Intelligence, "Two tech companies have launched a network to connect low-power IoT devices," *Business Insider*. [Online]. Available: http://www.businessinsider.com/inmarsatacility-lpwan-iot-2017-2. [Accessed: 22-Feb-2018].
- [6] S. Jeong, Z. Foo, Y. Lee, J. Y. Sim, D. Blaauw, and D. Sylvester, "A Fully-Integrated 71nW CMOS Temperature Sensor for Low Power Wireless Sensor Nodes," *IEEE J. Solid-State Circuits*, vol. 49, no. 8, pp. 1682–1693, Aug. 2014.
- [7] A. M. C. Lee, C. T. Angeles, M. C. R. Talampas, L. G. Sison, and M. N. Soriano, "MotesArt: Wireless Sensor Network for Monitoring Relative Humidity and Temperature in an Art Gallery," in 2008 IEEE International Conference on Networking, Sensing and Control, 2008, pp. 1263–1268.
- [8] W. Lim, D. Sylvester, and D. Blaauw, "4.4 A sub-nW 80mlx-to-1.26Mlx self-referencing light-to-digital converter with AlGaAs photodiode," in 2017 IEEE International Solid-State Circuits Conference (ISSCC), 2017, pp. 72–73.

- [9] G. Kim *et al.*, "A millimeter-scale wireless imaging system with continuous motion detection and energy harvesting," in 2014 Symposium on VLSI Circuits Digest of Technical Papers, 2014, pp. 1–2.
- [10] "A Dual-Slope Capacitance-to-Digital Converter Integrated in an Implantable Pressure-Sensing System - IEEE Journals & Magazine." [Online]. Available: http://ieeexplore.ieee.org.proxy.lib.umich.edu/document/7122368/. [Accessed: 15-Feb-2018].
- [11] Y. Hu, J. Yang, Z. Huang, R. Sokolovskij, and F. Wang, "Wireless sensor node with hybrid energy harvesting for air-flow rate sensing," in *2017 IEEE SENSORS*, 2017, pp. 1–3.
- [12] S. Edward Jero and A. Balaji Ganesh, "PIC18LF4620 based customizable wireless sensor node to detect hazardous gas pipeline leakage," 2011, pp. 563–566.
- [13] Y. Shi *et al.*, "A 10mm<sup>3</sup> syringe-implantable near-field radio system on glass substrate," in *2016 IEEE International Solid-State Circuits Conference (ISSCC)*, 2016, pp. 448–449.
- [14] "Crypto-anchors and Blockchain IBM Research US." [Online]. Available: https://www.research.ibm.com/5-in-5/crypto-anchors-and-blockchain/. [Accessed: 24-Apr-2018].
- [15] T. Jang, M. Choi, Y. Shi, I. Lee, D. Sylvester, and D. Blaauw, "Millimeter-scale computing platform for next generation of Internet of Things," in 2016 IEEE International Conference on RFID (RFID), 2016, pp. 1–4.
- [16] Y. Lee, D. Blaauw, and D. Sylvester, "Ultralow Power Circuit Design for Wireless Sensor Nodes for Structural Health Monitoring," *Proc. IEEE*, vol. 104, no. 8, pp. 1529–1546, Aug. 2016.
- [17] M. Seok *et al.*, "The Phoenix Processor: A 30pW platform for sensor applications," in 2008 IEEE Symposium on VLSI Circuits, 2008, pp. 188–189.
- [18] "Thin Film Battery, Solid State Batteries | Cymbet Corporation." [Online]. Available: http://www.cymbet.com/. [Accessed: 23-Feb-2018].

- [19] A. Wang and A. Chandrakasan, "A 180mV FFT processor using subthreshold circuit techniques," in 2004 IEEE International Solid-State Circuits Conference (IEEE Cat. No.04CH37519), 2004, pp. 292-529 Vol.1.
- [20] S. Hanson *et al.*, "A Low-Voltage Processor for Sensing Applications With Picowatt Standby Mode," *IEEE J. Solid-State Circuits*, vol. 44, no. 4, pp. 1145–1155, Apr. 2009.
- [21] Y. Lee *et al.*, "A modular 1mm<sup>3</sup> die-stacked sensing platform with optical communication and multi-modal energy harvesting," in 2012 IEEE International Solid-State Circuits Conference, 2012, pp. 402–404.
- [22] R. Hahn *et al.*, "Development of near hermetic silicon/glass cavities for packaging of integrated lithium micro batteries," in 2009 Symposium on Design, Test, Integration Packaging of MEMS/MOEMS, 2009, pp. 292–299.
- [23] N. Lotze and Y. Manoli, "A 62mV 0.13um CMOS standard-cell-based design technique using schmitt-trigger logic," in 2011 IEEE International Solid-State Circuits Conference, 2011, pp. 340–342.
- [24] W. Jung, S. Oh, S. Bang, Y. Lee, D. Sylvester, and D. Blaauw, "A 3nW fully integrated energy harvester based on self-oscillating switched-capacitor DC-DC converter," in 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2014, pp. 398–399.
- [25] D. Jeon, M. Seok, C. Chakrabarti, D. Blaauw, and D. Sylvester, "A Super-Pipelined Energy Efficient Subthreshold 240 MS/s FFT Core in 65 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 47, no. 1, pp. 23–34, Jan. 2012.
- [26] M. Terman and J. S. Terman, "Light Therapy for Seasonal and Nonseasonal Depression: Efficacy, Protocol, Safety, and Side Effects," CNS Spectr., vol. 10, no. 8, pp. 647–663, Aug. 2005.
- [27] D. E. Brash *et al.*, "A role for sunlight in skin cancer: UV-induced p53 mutations in squamous cell carcinoma," *Proc. Natl. Acad. Sci.*, vol. 88, no. 22, pp. 10124–10128, Nov. 1991.
- [28] R. B. Setlow, "The Wavelengths in Sunlight Effective in Producing Skin Cancer: A Theoretical Analysis," *Proc. Natl. Acad. Sci.*, vol. 71, no. 9, pp. 3363–3366, Sep. 1974.

- [29] K. H. Kraemer, M.-M. Lee, A. D. Andrews, and W. C. Lambert, "The Role of Sunlight and DNA Repair in Melanoma and Nonmelanoma Skin Cancer: The Xeroderma Pigmentosum Paradigm," *Arch. Dermatol.*, vol. 130, no. 8, pp. 1018–1021, Aug. 1994.
- [30] K. H. Kraemer, "Sunlight and skin cancer: Another link revealed," *Proc. Natl. Acad. Sci.*, vol. 94, no. 1, pp. 11–14, Jan. 1997.
- [31] T. Kuwabara, "Retinal Damage by Visible Light: An Electron Microscopic Study," *Arch. Ophthalmol.*, vol. 79, no. 1, p. 69, Jan. 1968.
- [32] W. K. Noell, V. S. Walker, B. S. Kang, and S. Berman, "Retinal Damage by Light in Rats," *Invest. Ophthalmol. Vis. Sci.*, vol. 5, no. 5, pp. 450–473, Oct. 1966.
- [33] R. W. Young, "Sunlight and age-related eye disease," *J. Natl. Med. Assoc.*, vol. 84, no. 4, pp. 353–358, Apr. 1992.
- [34] "OPT3002, Light-to-Digital Sensor."
- [35] "ISL29035, Integrated Digital Light Sensor with Interrupt."
- [36] M. Alhawari, N. Albelooshi, and M. H. Perrott, "A 0.5V < 4uW CMOS photoplethysmographic heart-rate sensor IC based on a non-uniform quantizer," in 2013 IEEE International Solid-State Circuits Conference Digest of Technical Papers, 2013, pp. 384–385.
- [37] F. Tang *et al.*, "A Linear 126-dB Dynamic Range Light-to-Frequency Converter With Dark Current Suppression Upto 125°C for Blood Oxygen Concentration Detection," *IEEE Trans. Electron Devices*, vol. 63, no. 10, pp. 3983–3988, Oct. 2016.
- [38] R. G. Correia, S. Pimenta, and G. Minas, "CMOS Integrated Photodetectors and Light-to-Frequency Converters for Spectrophotometric Measurements," *IEEE Sens. J.*, vol. 17, no. 11, pp. 3438–3445, Jun. 2017.
- [39] C. T. Chiang, "A CMOS auto-calibrated light-to-frequency converter," in 2012 IEEE Symposium on Industrial Electronics and Applications, 2012, pp. 54–57.
- [40] J. H. Correia, G. de Graaf, M. Bartek, and R. F. Wolffenbuttel, "A single-chip CMOS optical microspectrometer with light-to-frequency converter and bus interface," *IEEE J. Solid-State Circuits*, vol. 37, no. 10, pp. 1344–1347, Oct. 2002.

- [41] C. J. Aswell, J. Berlien, E. Dierschke, and M. Hassan, "A monolithic light-to-frequency converter with a scalable sensor array," in *Solid-State Circuits Conference, 1994. Digest of Technical Papers. 41st ISSCC., 1994 IEEE International*, 1994, pp. 158–159.
- [42] W. Lim, I. Lee, D. Sylvester, and D. Blaauw, "Batteryless Sub-nW Cortex-M0+ processor with dynamic leakage-suppression logic," in 2015 IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, 2015, pp. 1–3.
- [43] N. M. Pletcher, S. Gambini, and J. M. Rabaey, "A 2GHz 52µW Wake-Up Receiver with -72dBm Sensitivity Using Uncertain-IF Architecture," in 2008 IEEE International Solid-State Circuits Conference - Digest of Technical Papers, 2008, pp. 524–633.
- [44] X. Huang, S. Rampu, X. Wang, G. Dolmans, and H. de Groot, "A 2.4GHz/915MHz 51μW wake-up receiver with offset and noise suppression," in 2010 IEEE International Solid-State Circuits Conference - (ISSCC), 2010, pp. 222–223.
- [45] D. Y. Yoon *et al.*, "A New Approach to Low-Power and Low-Latency Wake-Up Receiver System for Wireless Sensor Nodes," *IEEE J. Solid-State Circuits*, vol. 47, no. 10, pp. 2405–2419, Oct. 2012.
- [46] Y. L. Tsou, N. C. D. Cheng, and C. F. Jou, "A 32.4µW RF front end for 2.4 GHz wakeup receiver," in 2013 IEEE International Symposium on Circuits and Systems (ISCAS2013), 2013, pp. 125–128.
- [47] S. E. Chen, C. L. Yang, and K. W. Cheng, "A 4.5µW 2.4 GHz wake-up receiver based on complementary current-reuse RF detector," in 2015 IEEE International Symposium on Circuits and Systems (ISCAS), 2015, pp. 1214–1217.
- [48] H. Fuketa, S. O'uchi, and T. Matsukawa, "A 0.3-V 1-μW Super-Regenerative Ultrasound Wake-Up Receiver With Power Scalability," *IEEE Trans. Circuits Syst. II Express Briefs*, vol. 64, no. 9, pp. 1027–1031, Sep. 2017.
- [49] G. Kim *et al.*, "A 695 pW standby power optical wake-up receiver for wireless sensor nodes," in Proceedings of the IEEE 2012 Custom Integrated Circuits Conference, 2012, pp. 1–4.

- [50] M. Seok, G. Kim, D. Sylvester, and D. Blaauw, "A 0.5V 2.2pW 2-transistor voltage reference," in 2009 IEEE Custom Integrated Circuits Conference, 2009, pp. 577–580.
- [51] G. Chen *et al.*, "Millimeter-scale nearly perpetual sensor system with stacked battery and solar cells," in *2010 IEEE International Solid-State Circuits Conference (ISSCC)*, 2010, pp. 288–289.
- [52] W. Biederman *et al.*, "A Fully-Integrated, Miniaturized (0.125 mm<sup>2</sup>) 10.5 μW Wireless Neural Sensor," *IEEE J. Solid-State Circuits*, vol. 48, no. 4, pp. 960–970, Apr. 2013.
- [53] R. Muller *et al.*, "A Minimally Invasive 64-Channel Wireless uECoG Implant," *IEEE J. Solid-State Circuits*, vol. 50, no. 1, pp. 344–359, Jan. 2015.
- [54] S. A. Mirbozorgi, H. Bahrami, M. Sawan, L. A. Rusch, and B. Gosselin, "A Single-Chip Full-Duplex High Speed Transceiver for Multi-Site Stimulating and Recording Neural Implants," *IEEE Trans. Biomed. Circuits Syst.*, vol. 10, no. 3, pp. 643–653, Jun. 2016.
- [55] D. Seo *et al.*, "Wireless Recording in the Peripheral Nervous System with Ultrasonic Neural Dust," *Neuron*, vol. 91, no. 3, pp. 529–539, Aug. 2016.
- [56] T. C. Chang, M. L. Wang, J. Charthad, M. J. Weber, and A. Arbabian, "A 30.5mm<sup>3</sup> fully packaged implantable device with duplex ultrasonic data and power links achieving 95kb/s with < 10<sup>-4</sup> BER at 8.5cm depth," in 2017 IEEE International Solid-State Circuits Conference (ISSCC), 2017, pp. 460–461.
- [57] J. Charthad, M. J. Weber, T. C. Chang, and A. Arbabian, "A mm-Sized Implantable Medical Device (IMD) With Ultrasonic Power Transfer and a Hybrid Bi-Directional Data Link," *IEEE J. Solid-State Circuits*, vol. 50, no. 8, pp. 1741–1753, Aug. 2015.
- [58] I. Haydaroglu, M. T. Ozgun, and S. Mutlu, "Optically Powered Optical Transmitter Using a Single Light-Emitting Diode," *IEEE Trans. Circuits Syst. Regul. Pap.*, vol. 64, no. 8, pp. 2003–2012, Aug. 2017.
- [59] W. Lim, T. Jang, I. Lee, H.-S. Kim, D. Sylvester, and D. Blaauw, "A 380pW dual mode optical wake-up receiver with ambient noise cancellation," in 2016 IEEE Symposium on VLSI Circuits (VLSI-Circuits), 2016, pp. 1–2.
- [60] K. Sankaragomathi, L. Perez, R. Mirjalili, B. Parviz, and B. Otis, "A 27uW subcutaneous wireless biosensing platform with optical power and data transfer," in *Proceedings of the IEEE 2014 Custom Integrated Circuits Conference*, 2014, pp. 1–4.

- [61] S. Nioka *et al.*, "Simulation study of breast tissue hemodynamics during pressure perturbation," *Adv. Exp. Med. Biol.*, vol. 566, pp. 17–22, 2005.