# Ultra Low Power Analog Circuits for Wireless Sensor Node System 

by

## Dongmin Yoon

A dissertation submitted in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy
(Electrical Engineering)
in The University of Michigan
2015

Doctoral Committee:
Professor David Blaauw, Chair
Professor Joanna Mirecki Millunchick
Professor Dennis Michael Sylvester
Associate Professor David D. Wentzloff

To my wife, parents, and brother

## TABLE OF CONTENTS

DEDICATION
LIST OF FIGURES
LIST OF TABLES
CHAPTER
1 Recent Trends in Wireless Sensor Node Systems
1.1 Design Constraints in Wireless Sensor Node System
1.2 Circuit Design Challenges
1.3 Scope of the Thesis
2 Pulse-Driven Crystal Oscillator
2.1 Motivation
2.2 Previous Works
2.3 Circuit Operation of DLL-Assisted Pulse-Driven Crystal Oscillator
2.4 Circuit Implementation Detail
2.5 Measurement Result
2.6 Conclusion
3 Schmitt Trigger Based Pulse-Driven Crystal Oscillator
3.1 Motivation
3.2 Circuit Implementation Detail
3.3 Layout and Simulation Result
4 Sub-nW 8-bit SAR ADC with Transistor-Stack DAC
4.1 Motivation
4.2 Previous Works
4.3 Circuit Implementation in 65 nm CMOS Technology
4.4 Implementation in 180nm CMOS Technology
4.5 Measurement Result
5 Multiple Output Level Switched Capacitor Network Voltage Regulator
5.1 Motivation
5.2 Architecture Choice
5.3 Circuit Implementation
5.4 Measurement Result
6 Ongoing and Future Work
6.1 Future Work for Crystal Oscillator
6.2 Future Work for ADC
APPENDIX
BIBLIOGRAPHY

## LIST OF FIGURES

Figure
1.1 Relative technology improvement from 1990-2003.
1.2 Block diagram of a wireless sensor node system.
1.3 Traditional folded cascode operational amplifier
2.1 Operation scenario of WSN.
2.2 Conventional XO circuit.
2.3 Popular low power XO circuit.
2.4 Preamplifier concept of the proposed circuit.
2.5 Block diagram of the proposed circuit.
2.6 Circuit in medium voltage supply.
2.7 Front end block circuit schematic.
2.8 DLL circuit schematic.
2.9 Edge detector circuit schematic.
2.10 Charge pump circuit schematic.
2.11 Level converter circuit schematic.
2.12 SCN and clock dividers.
2.13 Switch configuration of SCN1 and SCN2.
2.14 Measured waveform.
2.15 Power at different power supply voltage.
2.16 Normalized frequency at different pulse location.
2.17 Frequency at different temperature.
2.18 Frequency measurement for eight hours.
2.19 Allan deviation measurement and comparison to prior ultra-low power RTC.
3.1 Electrical model of 32.768 kHz crystal used in simulation.
3.2 Power consumption of crystal oscillating at different voltage level
3.3 Block diagram of the proposed pulse-driven crystal oscillator without DLL.
3.4 Schmitt trigger based pulse generator.
3.5 Bootstrap circuit schematic diagram.
3.6 SCN schematic for the Schmitt trigger based pulse-driven crystal oscillator system.
3.7 Layout of the Schmitt trigger based pulse-driven crystal oscillator system.
4.1 Conventional SAR ADC block diagram
4.2 Block diagram of the proposed circuit.
4.3 Bootstrap circuit for sampling.
4.4 Transistor stack schematic and layout floorplan.
4.5 Switching sequence for the last four clock cycles.
4.6 Switch connection status at the end of conversion cycle for code 0000010x.
4.7 Control logic circuit and associated timing diagram.
4.8 Comparator circuit
4.9 Block diagram of 180 nm CMOS test chip.
4.10 Die photo of 65 nm CMOS test chip.
4.11 Die photo of 180 nm CMOS test chip.
4.12 Performance with different sampling frequencies with low current DAC in 65 nm test chip
4.13 Performance with different sampling frequencies with high current DAC in 65 nm test chip
4.14 Performance at $F_{\text{sample}} = 1 \mathrm{kHz}$ with low current DAC in 65 nm test chip
4.15 Performance at $F_{\text{sample}} = 1.024 \mathrm{kHz}$ with high current DAC in 65 nm test chip
4.16 DNL with low current DAC in 65 nm test chip
4.17 INL with low current DAC in 65 nm test chip
4.18 DNL with high current DAC in 65 nm test chip
4.19 INL with high current DAC in 65 nm test chip
4.20 Power and FoM at different sampling frequency with high current DAC in 65 nm test chip
4.21 Power breakdown with high current DAC in 65 nm test chip
4.22 DC signal measurement result in 180 nm test chip
5.1 Voltage doubler configuration
5.2 Circuit block diagram.
5.3 Circuit schematic
5.4 Step up converter switch configuration in detail
5.5 Voltage doubler configuration
5.6 $\mathrm{R}_{\text{out}}$ when $\mathrm{R}_{\mathrm{sw}}=0.25 \Omega$, $\mathrm{R}_{\mathrm{w}}=3 \Omega$, $\mathrm{C}_{\mathrm{f}}=1 \mu \mathrm{F}$, $\mathrm{C}_{\mathrm{o}}=10 \mu \mathrm{F}$.
5.7 Die photograph of the circuit
5.8 Measured output voltage while decreasing battery voltage
5.9 Measured output voltage while increasing up battery voltage
5.10 Measured efficiency

## LIST OF TABLES

Table
1.1 Battery specification comparison
1.2 Traditional Scaling Results for Circuit Performance
1.3 ITRS Low Power Technology Requirement
2.1 Performance summary.
3.1 Monte-Carlo simulation for delay variation ($\mu \mathrm{s}$)
3.2 Monte-Carlo simulation for power variation (pW)
3.3 Simulated power consumption of the Schmitt trigger based pulse-driven crystal oscillator system.
4.1 Performance summary.
5.1 Inductor and capacitor comparison
5.2 Configurations for step down converter to achieve fractional voltages
5.3 Maximum current the circuit can support

## CHAPTER 1

## Recent Trends in Wireless Sensor Node Systems

### 1.1 Design Constraints in Wireless Sensor Node System

Bell's law of computing classes [1] states that approximately every decade a new computer class with lower price and smaller volume replaces the previous one. Extending this prediction to the next decade, the next computing class is expected to be orders of magnitude smaller than today's cell phones. One of the best examples of this next class is the wireless sensor node. With similar or improved computing power implemented within a small volume (hence the term "smart dust" [2]) and at lower cost, it has the potential to change how people interact with their environment. For example, there have already been efforts to use wireless sensor nodes to measure environmental change [3], to monitor the structural integrity of infrastructure [4], and to record a patient's status continuously from inside the human body [5], among many other proposed applications. However, many issues must still be solved before wireless sensor nodes can be realized and used as continuous monitoring devices, and these issues are the focus of active research [6, 7]. The most significant challenge arises from the node's biggest benefit: its small volume.

Advances in electronics over the past few decades show that circuit size has been shrinking rapidly and steadily, and this trend is expected to continue for the foreseeable future [8]. However, as seen in Figure 1.1 [9], the same has not been true for battery technology. Although it has improved steadily, battery capacity per volume has been increasing much more slowly than transistor count per area. Even for today's handheld devices such as cell phones, battery capacity limits key metrics such as system volume and standby time. During the last computing class change, from personal computers (PCs) to handheld devices, even Intel Corporation, the largest chip maker and famous for its PC CPUs, was not fast enough to react to the paradigm shift. It now faces a challenge from companies more adept at computing under limited power, such as Qualcomm and ARM [10].

Figure 1.1: Relative technology improvement from 1990-2003.

For wireless sensor nodes, the volume that can be allotted to the battery, if any, is orders of magnitude smaller than that in a cell phone. In some applications, the system has no battery at all and operates from an energy harvester or a wirelessly coupled power source. Although energy harvesting and wireless power transmission are actively investigated topics, they still suffer from low efficiency. The system designer therefore has to pay close attention to the energy efficiency of the circuit. Another aspect, which was not much of an issue for the previous computing class, also challenges the designer: the maximum instantaneous current a battery can support, which receives less attention than the battery's energy capacity. Table 1.1 shows that as battery size becomes smaller, so does the maximum current it can support.

| Manufacturer | Product Name | Output Voltage (V) | Capacity (mAh) | Maximum Current (mA) | Size ($\mathrm{mm}^{3}$) |
| :---: | :---: | :---: | :---: | :---: | :---: |
| PowerStream [11] | PGEB016144 | 3.7 | 200 | 400 | 2684 |
| PowerStream [12] | GM300910 | 3.7 | 15 | 120 | 270 |
| ST [13] | EFL700A39 | 3.9 | 0.7 | 5 | 129 |
| Cymbet [14] | CBC050 | 3.8 | 0.05 | 0.3 | 57.6 |
| Cymbet [15] | CBC012 | 3.8 | 0.012 | 0.1 | 18.75 |

Table 1.1: Battery specification comparison.

If one circuit block in a wireless sensor node system draws an instantaneous current that exceeds the maximum current the battery can support, it causes an unexpected voltage drop. For systems with the battery as the only energy source, this drop affects the operation of the other blocks as well, possibly leading to total system failure. The system designer can integrate a large capacitor to handle a momentary spike in current demand without a significant drop in battery voltage. However, this comes at the cost of increased volume, which is not readily available in all wireless sensor nodes. Therefore, each circuit block has to be not only energy efficient but also power efficient, with extra attention to instantaneous power.

### 1.2 Circuit Design Challenges

Figure 1.2 shows a typical block diagram of a wireless sensor node system. At the front end of the system are analog components such as sensors and data converters. They monitor changes in the system's environment and hand off the information in digital form. The digital core then processes the acquired data and stores it in memory if necessary. The data is then delivered to the outside world so that the user can gather the acquired information. This communication is usually done through a wireless channel, since the small volume of the system prohibits the large interface fixtures needed for wired communication. In the background, a real time clock keeps track of time for purposes such as logging when data was acquired and synchronizing wireless sensor nodes. All of these operations have to be done within the energy and power budget set by the power source, such as a battery or an energy harvester, and by the power management circuit.

Figure 1.2: Block diagram of a wireless sensor node system.

Traditional scaling theory [16] asserts that the active power of digital circuits benefits directly from technology improvement. As shown in Table 1.2 [16], scaling enabled digital circuits to run faster at lower power with a reduced supply voltage. This trend, however, is no longer sustainable due to several factors. To decrease delay with a reduced supply voltage, the transistor's threshold voltage has to be reduced as well. However, this causes an exponential increase in leakage power, which can potentially dominate power consumption. Therefore, the supply voltage, and especially the threshold voltage, cannot be scaled down in proportion to device dimension, as shown in Table 1.3 [8].

| Device or Circuit Parameter | Scaling Factor |
| :---: | :---: |
| Device dimension | $1 / \kappa$ |
| Doping concentration | $\kappa$ |
| Voltage | $1 / \kappa$ |
| Current | $1 / \kappa$ |
| Capacitance | $1 / \kappa$ |
| Delay time | $1 / \kappa$ |
| Power dissipation | $1 / \kappa^{2}$ |
| Power density | 1 |

Table 1.2: Traditional Scaling Results for Circuit Performance

| Year of Production | 2013 | 2015 | 2017 | 2019 | 2021 | 2023 | 2025 | 2027 |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Logic Industry Node Range Labeling (nm) | $16 / 14$ | $11 / 10$ | $8 / 7$ | $6 / 5$ | $4 / 3$ | $3 / 2.5$ | $2 / 1.5$ | $1 / 0.75$ |
| Power Supply Voltage (V) | 0.86 | 0.83 | 0.80 | 0.77 | 0.74 | 0.71 | 0.68 | 0.65 |
| $V_{t, \text { sat }}(\mathrm{V})$ | 0.446 | 0.453 | 0.461 | 0.461 | 0.446 | 0.453 | 0.454 | 0.461 |
| NMOSFET Intrinsic Delay (ps) | 1.622 | 1.573 | 1.587 | 1.525 | 1.424 | 1.474 | 1.422 | 1.413 |

Table 1.3: ITRS Low Power Technology Requirement

Figure 1.3: Traditional folded cascode operational amplifier
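As a brief check of the power entries in Table 1.2, the scaling factors for voltage, current, and device dimension can be combined directly. This is only a sketch: the symbols $P$, $V$, $I$, and $A$ (circuit area) are introduced here for illustration, primed quantities denote scaled values, and the device dimension is assumed to scale as $1 / \kappa$, as in the constant-field scaling of [16].

$$
P' = V' I' = \frac{V}{\kappa} \cdot \frac{I}{\kappa} = \frac{P}{\kappa^{2}}, \qquad
A' = \frac{A}{\kappa^{2}}, \qquad
\frac{P'}{A'} = \frac{P}{A}
$$

Power per circuit falls by $\kappa^{2}$ while power density stays constant, but only as long as the supply voltage keeps scaling together with the device dimension.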

Within this technology environment, many techniques have been researched to achieve high energy efficiency. In particular, it is known that the maximum energy efficiency of a digital circuit is achieved at a sub-threshold supply voltage [17]. There are also well-known techniques to achieve extremely low average power in digital processor cores [7, 18]. As a result, digital circuits have adapted successfully to the scaling trend, even with the much larger number of transistors usually present in a system. It can even be said that the slow scaling of threshold voltage has been beneficial for ultra-low power digital design, given that aggressive voltage scaling and power gating are widely used.

However, traditional analog circuits have problems that digital circuits do not have to handle. A traditional analog circuit, best represented by an operational amplifier such as the one in Figure 1.3 [19], is designed with many transistors in saturation. To meet this condition, a portion of the supply voltage, referred to as voltage headroom, has to be reserved to keep the transistors in saturation. When this constraint has to be met with traditional techniques such as cascoding, circuits like the one in Figure 1.3 are very hard to implement under the same operating conditions as an ultra-low power system like [7]. Also, the performance of many traditional analog circuits is determined by device matching between transistors. Because matching improves with increased device size, analog circuits cannot benefit directly from a smaller minimum feature size [20]. As for power consumption, a transistor in saturation in a conventional design has to be supplied with a constant bias current. These factors pose a significant challenge for analog circuits to achieve good power and energy efficiency in a wireless sensor node system.

### 1.3 Scope of the Thesis

This thesis discusses essential analog circuit blocks required in ultra-low power wireless sensor node systems, as shown in Figure 1.2. As discussed in the previous section, these circuit blocks require non-traditional techniques to be compatible with the system environment.

In Chapter 2, a low power real time clock using a pulse-driven crystal oscillator is discussed. Even with advances in CMOS-based timers, a crystal oscillator still provides accuracy that is superior by orders of magnitude. However, its power consumption needs to be kept minimal, because the real time clock has to run all the time and cannot be put into sleep mode. Moreover, previous low power crystal oscillators require an operational amplifier with a monitoring circuit. Because these require constant bias current, they limit the minimum power consumption of the circuit. Therefore, the operational amplifier was replaced with pulsed drivers, and the pulses were generated at precise timing using a DLL. These peripheral circuits were operated at a reduced supply voltage to minimize their power. To maximize pulse strength and compensate for the energy lost in each oscillation cycle, the drivers were operated at the highest supply voltage available. All supply voltage levels were generated on chip using a switched capacitor network. The circuit was tested at different supply voltages and temperatures. Its frequency characteristics and power consumption were measured and compared to a traditional circuit. Its performance as a real time clock was also verified by measuring Allan deviation.

In Chapter 3, a simplified version of the pulse-driven crystal oscillator is discussed. Recent works showed that a crystal oscillator can maintain oscillation even with inaccurate pulse position, thanks to the crystal's high quality factor. The DLL in Chapter 2 is replaced with a Schmitt trigger, greatly reducing design effort. Implementation details, layout, and simulation results are presented.

In Chapter 4, a sub-nW 8-bit SAR ADC using a transistor-stack DAC is discussed. The conventional SAR ADC is well known for its excellent energy efficiency. However, most works focus on high sampling speed, which is not always necessary in environmental monitoring. Therefore, a new SAR ADC architecture was designed for a moderate sampling rate of $1 \mathrm{kS} / \mathrm{s}$. To ease design effort and reduce layout-dependent effects, the conventional capacitive DAC was replaced with a transistor-stack DAC and a 255:1 multiplexer. The control logic was designed with both TSPC and CMOS logic to minimize transistor count. The design has been implemented in 65 nm CMOS as a stand-alone version and in 180 nm CMOS as part of a WSN system. The ADC was tested at different sampling rates and input signal frequencies, and its linearity and power consumption were measured. Lastly, its performance was compared to ADCs with similar sampling rate and resolution.

In Chapter 5, a voltage regulator using a switched capacitor network is discussed. The circuit was designed for a production-cost-driven project. To generate the voltage used for digital circuits at reduced cost, a switched capacitor network based voltage regulator was chosen in place of a boost converter. The circuit generates an output voltage of $3.2 \mathrm{~V} \pm 10 \%$ from an input voltage of $1.7-3.4 \mathrm{~V}$ with 15 mA load current.

## CHAPTER 2

## Pulse-Driven Crystal Oscillator

### 2.1 Motivation

In wireless sensor node (WSN) applications, one goal is to build a network of tiny sensor nodes to collect data. Due to its small size, each sensor node has to operate at extremely low power. In one typical scenario, shown in Figure 2.1 (a), the processor is activated every 20 minutes to take a measurement for 100 ms at $3 \mu \mathrm{~W}$. Once every hour, the processor transmits this data for 1 ms at 1 mW. For the rest of the time, the whole system is put into sleep mode, where it consumes an idle power of 1 nW. If a base station can listen for the transmitted signal all the time, each sensor node does not need accurate timing and consumes an average power of 1.5 nW. For some applications, however, the sensor nodes have to communicate with each other. In such cases, each sensor node has its own timing reference so that it activates its communication module exactly one hour later. Unfortunately, two independently operating timing references are likely to suffer from mismatch due to jitter and other random effects. Consider the scenario shown in Figure 2.1 (b). Given a silicon based timing reference operating in the sub-nW range (e.g., 660 pW in [21]), two nodes can be expected to drift apart by 200 ms in an hour due to the reference's high frequency instability, which corresponds to 56 ppm. The communication module of one node is then activated earlier and has to wait until the other node initiates communication. In this case the average power consumption becomes 57 nW, which is 38 times larger than in the previous case. To avoid this, each node has to be equipped with a high accuracy timing source. Such timing accuracy can be offered by a quartz crystal oscillator (XO). However, an XO usually consumes much more power than the whole sensor node system, typically in the range of hundreds of nW to $\mu \mathrm{W}$.

(a) Node-to-node communication with mismatch

Figure 2.1: Operation scenario of WSN.

Figure 2.2: Conventional XO circuit.
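The average power figures quoted for the two scenarios can be reproduced with a short energy budget over one hour. The sketch below is illustrative only: it assumes three measurements per hour (one every 20 minutes) and assumes that a node waiting for its peer keeps its communication module on at the full 1 mW for the whole 200 ms mismatch. The function and variable names are hypothetical.

```python
HOUR_S = 3600.0  # length of the repeating schedule, in seconds

def average_power_w(extra_radio_on_s=0.0):
    """Average node power over one hour, in watts."""
    measure_s = 3 * 100e-3                    # three 100 ms measurements per hour
    measure_j = measure_s * 3e-6              # processor at 3 uW while measuring
    radio_on_s = 1e-3 + extra_radio_on_s      # 1 ms transmission plus any waiting time
    radio_j = radio_on_s * 1e-3               # assumption: 1 mW whenever the radio is on
    sleep_j = (HOUR_S - measure_s - radio_on_s) * 1e-9   # 1 nW idle power
    return (measure_j + radio_j + sleep_j) / HOUR_S

base = average_power_w()                           # base station always listening
mismatch = average_power_w(extra_radio_on_s=200e-3)  # 200 ms timing mismatch
mismatch_ppm = 200e-3 / HOUR_S * 1e6

print(f"base station scenario: {base * 1e9:.2f} nW")
print(f"node-to-node scenario: {mismatch * 1e9:.2f} nW")
print(f"timing mismatch:       {mismatch_ppm:.1f} ppm")
```

Under these assumptions the budget gives about 1.5 nW and 57 nW, and a mismatch of about 56 ppm, in line with the values in the text. The waiting energy of the radio dominates the second case, which is why timing accuracy matters more than the power of the timer itself.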

### 2.2 Previous Works

To analyze the problems of a conventional XO, its basic configuration is shown in Figure 2.2. The inverter in the closed loop amplifies noise at its input to kick-start the oscillation. At the same time, it provides a 180 degree phase shift and replenishes the energy lost each time the crystal oscillates. In this configuration, the inverter gain is uncontrolled and the output is a square wave. Therefore, a series resistor at the output is required to control the drive level. The resistor also forms a low pass filter so that the crystal does not oscillate at a higher overtone frequency. However, there are three noticeable sources of power loss in this circuit. The first is the driver itself: because its input is a sinusoidal wave, the driver is never completely turned off and consumes static power. The second is the uncontrolled, high oscillation amplitude, which leads to more power loss per cycle. The last is the series resistor: as the waveform shows, the voltage difference across it is dissipated in the resistor. Overall, this configuration typically consumes power in the $\mu \mathrm{W}$ range for a 32.768 kHz crystal.

To cope with this problem, a new circuit was proposed in 1977 [22], which is shown in Figure 2.3. This circuit monitors the oscillation amplitude and controls the bias current of the driver. By using an active feedback control loop, it reduces the driver current, reduces the oscillation amplitude, and eliminates the series resistor. The lowest power consumption reported using this scheme is 27 nW, which is unfortunately still a little too high for WSN applications, for the following reasons. First, although smaller than in the conventional circuit, the driver still consumes static power. Second, there is a lower bound on how small the amplitude can get. A smaller oscillation means that the driver input also becomes smaller. Because the driver operates in the sub-threshold region, a smaller input makes the driver much weaker, eventually to the point where it can no longer regenerate enough energy to sustain the oscillation. Last, the closed loop control system that made 27 nW possible consumes a good portion of the power by itself.

Figure 2.3: Popular low power XO circuit.

### 2.3 Circuit Operation of DLL-Assisted Pulse-Driven Crystal Oscillator

To reduce the power consumption further to a level suitable for wireless sensor node applications, a new circuit was designed using a preamplifier concept, which is the first of two key components in the circuit. Starting from the conventional circuit with only a single inverter, a preamplifier is placed before the output driver as shown in Figure 2.4. A new VDD and ground pair, smaller than the nominal VDD and ground, is then supplied so that the oscillation amplitude is kept small. This pair is called the low voltage supply from now on and is targeted to be about 150 mV. A larger voltage is supplied to the preamplifier so that the small input signal can be amplified before being sent to the output driver. This provides the output driver with the maximum gate-source voltage, increasing its transconductance. This larger supply is called the high voltage supply and is targeted to be about 1 V. In this configuration, a smaller oscillation amplitude no longer weakens the driver: with a larger input signal, the driver has higher transconductance and can maintain oscillation at a lower voltage.

This approach, however, has two remaining issues. If the preamplifier is implemented as an operational amplifier with enough bandwidth to keep the phase difference low, the preamplifier alone consumes a bias current larger than 10 nA. Also, the amplified signal makes the output driver too strong, so that it again requires a series resistor for safe operation. As a result, a pulsing scheme, the second key component, is implemented. The operational amplifier discussed above is replaced with an inverter and a pulse generator. Instead of turning on the driver prematurely and perturbing the sinusoidal oscillation, the driver is turned on at each oscillation peak to regenerate energy only when it is needed. Since this requires precise timing, a DLL is locked to the crystal's resonant frequency. From the DLL, pulses can be generated at the preferred timing, and these signals drive the NMOS and PMOS drivers separately. For example, at an oscillation maximum, the PMOS turns on briefly and clamps the top level of the oscillation back to VDDL. At a minimum, the NMOS turns on and clamps the bottom level down to VSSL. In this way, the series resistor can be removed. In addition, only one transistor of the driver is on at a time, and only when VDS is small, so the output driver power is significantly reduced. However, if the DLL and other circuits run from the high voltage supply of around 1 V, their power consumption is too high and negates any benefit from the lower driver power.

Figure 2.4: Preamplifier concept of the proposed circuit.

Figure 2.5: Block diagram of the proposed circuit.

Therefore, another voltage pair between the high voltage supply and the low voltage supply is introduced, called the medium voltage supply. By keeping the medium voltage supply at a near-threshold level, the power consumption of the DLL and other circuits can be reduced. Since the output driver still needs input signals at the high voltage supply level, level converters are placed in between. Unfortunately, this means four extra supply voltages. For practical applications, these voltages should be generated efficiently on chip. In this circuit, a switched capacitor network (SCN) is used so that the whole circuit can operate from just one set of VDD and ground. The block diagram of the final circuit is shown in Figure 2.5. Overall, the circuit maintains the oscillation in the following way. Once the DLL is locked to the crystal's resonant frequency, it produces a set of 32.768 kHz square waves with different phases. The circuit picks a pair of these square waves to make a precise pulse at a predefined timing. These pulses are then applied to the output driver to inject energy at the oscillation peaks, thereby sustaining the oscillation.

Figure 2.6: Circuit in medium voltage supply.
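The way a pair of phase-shifted square waves defines a pulse can be illustrated with a small behavioral model. This is a sketch under stated assumptions, not the implemented circuit: the number of delay taps (`N_TAPS = 16`), the chosen tap pairs, and the AND-with-inverse logic are all hypothetical, and drive polarity for the PMOS is not modeled. Phase is measured in fractions of one crystal period, with phase 0 at the rising zero crossing, so the oscillation maximum falls at 1/4 and the minimum at 3/4.

```python
N_TAPS = 16      # hypothetical number of DLL delay cells spanning one period
SAMPLES = 1600   # phase samples per period used for this check

def tap(phase, k):
    """Square wave at the output of delay cell k (delay of k/N_TAPS periods)."""
    return ((phase - k / N_TAPS) % 1.0) < 0.5

def pulse(phase, first, second):
    """High between the rising edge of tap `first` and the rising edge of tap `second`."""
    return tap(phase, first) and not tap(phase, second)

def pulse_window(first, second):
    """Return (center, width) of the pulse, both in fractions of a period."""
    phases = [(i + 0.5) / SAMPLES for i in range(SAMPLES)]
    high = [p for p in phases if pulse(p, first, second)]
    return sum(high) / len(high), len(high) / SAMPLES

# Taps 3 and 5 bracket the maximum; taps 11 and 13 bracket the minimum.
pmos_center, pmos_width = pulse_window(3, 5)
nmos_center, nmos_width = pulse_window(11, 13)

print(f"PMOS pulse: center {pmos_center:.3f}, width {pmos_width:.3f} of a period")
print(f"NMOS pulse: center {nmos_center:.3f}, width {nmos_width:.3f} of a period")
```

In this model, choosing a different pair of taps shifts the pulse earlier or later or changes its width, which corresponds to the selectable pulse timing described above.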

### 2.4 Circuit Implementation Detail

In this section, the details of each block are discussed. The blocks in the medium voltage supply include the front end, the DLL, and the pulse generator, which form the core of the system. These blocks are shown in Figure 2.6. The level converter and the SCN are explained afterwards.

Figure 2.7 shows the front end (FE) block. Since the crystal oscillates as a sine wave within the low voltage supply, the sine wave needs to be transformed into a square wave within the medium voltage supply for the DLL to work. The front end operates at a near-threshold level, where the circuit is more prone to PVT variation. A simple inverter may not produce a square wave at all when either the NMOS or the PMOS is much stronger than the other. Therefore, the FE needs to trip exactly in the middle of the low voltage supply, even with PVT variation, and a self-adaptive body biasing technique is used for the FE [23]. A body biased inverter amplifies the sine wave from the crystal while a separate body bias generator dynamically adjusts the body bias so that the trip voltage is set as required. However, the body bias generator itself consumes more than a few nW if it is left running continuously. Therefore, the body bias voltage is held on an on-chip 60 pF capacitor and adjusted only periodically. The reference voltage that sets the trip voltage is generated on chip using a diode stack built from transistors, which can operate in the range of a few pW.

Figure 2.7: Front end block circuit schematic.

Figure 2.8 shows the DLL block. The delay cells are made with current starved inverters, double stacked to minimize leakage. A pair of outputs from the delay cells is then chosen for the pulse generator. Depending on which cells are chosen, the pulse generator uses the appropriate logic gates so that correct pulses are sent to the drivers. Figure 2.9 shows the edge detector part of the DLL. Instead of a standard flip-flop, a custom cell was designed that reduces the number of transistors to 13, from the 36 of a standard cell flip-flop. This reduced flip-flop power by $55 \%$. Similar to the body bias generator in the FE, the edge detector operates only periodically. This may raise concern about degraded frequency stability. However, as explained in the next section, long term stability is not noticeably affected, due to the innate frequency stability of the quartz crystal.

Figure 2.8: DLL circuit schematic.

Figure 2.9: Edge detector circuit schematic.

Figure 2.10 shows the charge pump part of the DLL. As in the other parts shown previously, double stacking is used where needed, and the circuit operates periodically while relying on large capacitors to hold the bias voltage. Although not shown in the figure for simplicity, the bias voltage for the charge pump is also generated internally with the assistance of decoupling capacitors. Overall, when the DLL and FE need to recalibrate their bias voltages, all of the circuit parts explained above are activated. Measurement confirmed that the circuit can sustain oscillation while running in this mode for only 2 out of every 32 cycles. For the remaining 30 cycles, only the blocks in the main signal path, which are the FE, delay cells, and pulse generator (PG), remain running to save power.

Figure 2.10: Charge pump circuit schematic.

Figure 2.11 shows the level converter. The pulses generated by the previous blocks are still at the medium voltage supply level. To obtain the maximum transconductance, the pulses need to be up-converted to the high voltage supply level. Since each pulse is sent to a different type of transistor, the up-conversion requirement differs for each signal. For example, when the pulse for the NMOS driver is converted, its maximum voltage is raised to VDD of the high voltage supply, whereas its minimum voltage level does not matter since it is already lower than VSS of the low voltage supply. Similarly, for the PMOS pulse, the minimum voltage needs to be lowered to the supply ground level. To make the level converter robust against PVT variation, a new level converter was designed that reduces the contention between the pull-up and pull-down paths during transitions. For the PMOS driver in particular, a delay element is inserted in the circuit to completely turn off the unnecessary path [24]. Once the level conversion is done, the signal is applied to the gate terminal of the output driver transistors.

Figure 2.11: Level converter circuit schematic.

Figure 2.12: SCN and clock dividers.

As explained earlier, the circuit has its own SCN to generate the different voltage pairs. The voltage generation itself must not consume too much power. Considering the parasitic capacitance of the switching components, it is better to use larger switching capacitors operating at a lower frequency, provided the circuit can tolerate the resulting ripple on the supply voltages. Therefore, frequency dividers are used to obtain a 4.096 kHz signal for operating the SCN, as shown in Figure 2.12. Although not shown here, the SCN requires a separate clock source before the crystal starts to oscillate. This can be provided by a low power ring oscillator, which was not implemented in this circuit. Since it can be turned off after the crystal starts to oscillate, the power cost of the ring oscillator will be minimal during normal operation.
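The relation between the crystal frequency and the SCN clock can be checked with a few lines. The sketch assumes the divider is a chain of divide-by-2 stages, which the text does not state explicitly; the names are hypothetical.

```python
import math

F_XO_HZ = 32_768.0   # crystal frequency from the text
F_SCN_HZ = 4_096.0   # SCN clock frequency from the text

def divide(freq_hz, n_stages):
    """Frequency after n cascaded divide-by-2 stages (assumed divider structure)."""
    return freq_hz / (2 ** n_stages)

ratio = F_XO_HZ / F_SCN_HZ
stages = int(math.log2(ratio))   # number of divide-by-2 stages needed

print(f"division ratio {ratio:.0f}, {stages} stages, "
      f"SCN clock {divide(F_XO_HZ, stages):.0f} Hz")
```

With these values the division ratio is 8, so three divide-by-2 stages would be enough under that assumption.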

Figure 2.13 shows the details of SCN1, which generates the medium voltage supply from the high voltage supply, and SCN2, which generates the low voltage supply from the medium voltage supply. The switch configuration can be dynamically adjusted as needed, and only a few examples are shown in the


Figure 2.13: Switch configuration of SCN1 and SCN2.


Figure 2.14: Measured waveform.
figure. For the measurement results discussed in the following section, the medium voltage supply was set to half of the high voltage supply, and the low voltage supply was set to one-third of the medium voltage supply.

### 2.5 Measurement Result

The proposed circuit was fabricated in a 180 nm triple-well CMOS technology. Figure 2.14 shows the measured waveform, in which pulse injection at each oscillation peak can be observed. Figure 2.15 shows the power consumption of the entire circuit, including on-chip voltage generation by the SCN, at different supply voltages. A minimum power of 5.58 nW was observed at room temperature. For the remaining tests, such as the temperature sweep, a supply voltage of 1.08 V was used.

The experimental result in Figure 2.16 shows the frequency change under different supply voltages and pulse locations. In Figure 2.16, normal means the pulse was generated around the oscillation minima and maxima. Early is the case where the pulse was generated before the oscillation peaks, and late is the case where the pulse was generated after the oscillation peaks. Full means the pulse was generated around


Figure 2.15: Power at different power supply voltage.


Figure 2.16: Normalized frequency at different pulse location.


Figure 2.17: Frequency at different temperature.
oscillation peaks with a three times wider pulse width. The graph shows the frequency change normalized to the frequency with the normal pulse at a supply voltage of 1 V. The circuit shows $30.3 \mathrm{ppm}/\mathrm{V}$ over $0.94-1.2 \mathrm{~V}$. Within $10\%$ of the supply voltage at 1.08 V, the frequency drift is approximately 7 ppm. In comparison, the conventional circuit (which has only one inverter and a power consumption in the $\mu \mathrm{W}$ range) shows $1.89 \mathrm{ppm}/\mathrm{V}$, and the CMOS gate leakage based timer published in ISSCC 2011 [21] shows $4200 \mathrm{ppm}/\mathrm{mV}$. Figure 2.16 shows that pulse location and width affect the oscillation frequency by only a few ppm. This indicates that the circuit can maintain acceptable frequency performance even if the DLL performance degrades.
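The quoted drift can be cross-checked with simple arithmetic. The sketch below assumes that "within 10% of the supply voltage" means a window of plus or minus 10% around 1.08 V, and that the 30.3 ppm/V slope is linear across that window; both are assumptions, and the function names are illustrative.

```python
SENSITIVITY_PPM_PER_V = 30.3  # measured supply sensitivity from the text
V_NOMINAL = 1.08              # nominal supply voltage in volts, from the text
TOLERANCE = 0.10              # assumed +/-10 % supply window


def drift_ppm(sensitivity_ppm_per_v: float, v_nominal: float, tolerance: float) -> float:
    """Frequency drift across a +/- tolerance supply window, assuming a linear slope."""
    v_span = 2.0 * tolerance * v_nominal  # total width of the window in volts
    return sensitivity_ppm_per_v * v_span


def seconds_per_day(ppm: float) -> float:
    """Timekeeping error accumulated over one day for a given fractional offset."""
    return ppm * 1e-6 * 86_400.0


d = drift_ppm(SENSITIVITY_PPM_PER_V, V_NOMINAL, TOLERANCE)
print(f"drift across window: {d:.2f} ppm")
print(f"worst-case error: {seconds_per_day(d):.2f} s/day")
```

Under these assumptions the result is about 6.5 ppm, in line with the approximately 7 ppm reported above.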

Figure 2.17 shows the frequency dependence on temperature. The blue line shows the ideal frequency characteristic, as specified by the crystal's datasheet. The shaded region marks the specification boundary arising from process variation of the crystal itself. The solid red line shows the result of the proposed circuit. The graph suggests that the circuit stays within the specification boundary. The black line shows the result of the conventional circuit.


Figure 2.18: Frequency measurement for eight hours.


Figure 2.19: Allan deviation measurement and comparison to prior ultra-low power RTC.

| Minimum power consumption | 5.58 nW |
| :---: | :---: |
| Operating temperature tested | $-20^{\circ} \mathrm{C}-80^{\circ} \mathrm{C}$ |
| Frequency drift within <br> testing temperature | $-4.56 \mathrm{ppm}-133.3 \mathrm{ppm}$ |
| Supply voltage dependence | $30.3 \mathrm{ppm} / \mathrm{V}$ |
| Minimum oscillation amplitude | $100 \mathrm{mV}_{\mathrm{p-p}}$ |


|  | Frequency | Area | Power consumption | Technology |
| :---: | :---: | :---: | :---: | :---: |
| $[26]$ | 32.768 kHz | N/A | 220 nW | $2 \mu \mathrm{~m}$ |
| $[27]$ | 2.1 MHz | $0.41 \mathrm{~mm}^{2}$ | 700 nW | $2 \mu \mathrm{~m}$ |
| $[28]$ | 32.768 kHz | N/A | 27 nW (32 nW with clock divider) | $2 \mu \mathrm{~m}$ |
| This work | 32.768 kHz | $0.3 \mathrm{~mm}^{2}$ | 5.58 nW | $0.18 \mu \mathrm{~m}$ |

Table 2.1: Performance summary.

To use this circuit as a real-time clock in a wireless sensor node, long-term frequency stability is of the utmost importance. The circuit was continuously monitored at room temperature for eight hours, and Figure 2.18 shows that the frequency is tightly controlled, within a fraction of a ppm. Figure 2.19 shows the Allan deviation of the proposed circuit and of the basic circuit. The red line shows the measured result of the proposed circuit, and the black line shows the measured result of the basic crystal oscillator operating at 32.768 kHz. For comparison, Allan deviation measurements of ultra-low power silicon-based timers are also shown [21,25]. The result shows that the circuit maintains the stability needed for WSN, even with periodic recalibration of the bias point.
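The text does not describe how the Allan deviation in Figure 2.19 was computed. As a minimal sketch, the standard non-overlapping estimator can be written as follows, assuming evenly spaced fractional-frequency samples; the function name and the synthetic data are illustrative only and are not the measured data.

```python
import math
from typing import Sequence


def allan_deviation(y: Sequence[float], m: int = 1) -> float:
    """Non-overlapping Allan deviation of fractional-frequency samples.

    y : evenly spaced fractional-frequency samples (sample interval tau0)
    m : averaging factor, so the averaging time is tau = m * tau0
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    n_bins = len(y) // m
    if n_bins < 2:
        raise ValueError("need at least two averaging bins")
    # Average the samples in consecutive, non-overlapping bins of length m.
    means = [sum(y[k * m:(k + 1) * m]) / m for k in range(n_bins)]
    # Half the mean squared difference between adjacent bin averages.
    sq_diffs = [(means[k + 1] - means[k]) ** 2 for k in range(n_bins - 1)]
    return math.sqrt(sum(sq_diffs) / (2.0 * (n_bins - 1)))


# Synthetic example: samples alternating by +/-1 ppm around the nominal frequency.
samples = [1e-6, -1e-6] * 4
print(allan_deviation(samples, m=1))  # adjacent samples differ by 2 ppm
print(allan_deviation(samples, m=2))  # averaging pairs removes the alternation
```

Evaluating the function over a range of `m` values yields a stability-versus-averaging-time curve of the kind plotted in Figure 2.19.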

### 2.6 Conclusion

Table 2.1 shows the performance summary, along with a comparison to previous works [26-28]. A new XO architecture was proposed that uses a low voltage supply to keep the oscillation amplitude small and a high voltage supply for a strong driver. Together with the pulse injection scheme realized by the DLL, power is reduced by 4.84x relative to the best previously known work. The long-term frequency stability is confirmed and meets the requirement for WSN applications.

## CHAPTER 3

## Schmitt Trigger Based Pulse-Driven Crystal Oscillator

### 3.1 Motivation

The details of the DLL-assisted pulse-driven crystal oscillator were published in 2012 [29]. That circuit uses a DLL to generate pulses at precise timing relative to the oscillation. Inaccurate pulse timing can potentially increase power consumption because the capacitor connected to the crystal is then charged by the driver rather than by the resonant oscillation of the crystal. However, the work published in 2014 [30] used an inverter delay for timing. By using a pulse with higher amplitude, as proposed in the DLL-assisted pulse-driven crystal oscillator, eliminating the precise timing circuit, and using 28 nm technology, it reduced the total power consumption with a 32.768 kHz crystal. As shown in the measurement results of the DLL-assisted pulse-driven crystal oscillator, the location of the pulse has limited impact on the accuracy of the operating frequency.

Utilizing this tradeoff between accurate timing and frequency accuracy, a new circuit has been simulated in 180 nm CMOS technology, the same technology as the previous DLL-assisted pulse-driven crystal oscillator. Specifically, the circuit was designed to be used as a real-time clock for a $\mathrm{mm}^{3}$-scale system as in [7]. The system operates from a single rechargeable battery and internally generates 0.6 V and 1.2 V. The new design has to maintain stable oscillation using the limited number of supply voltage levels available in the system.

Figure 3.1 shows electrical model of 32.768 kHz crystal used in simulation, and Figure 3.2 shows power consumption of crystal oscillating at different voltage level. At oscillation amplitude of 0.6 V , the crystal alone dissipates 3.7 nW . While this level of power consumption is acceptable


Figure 3.1: Electrical model of 32.768 kHz crystal used in simulation.


Figure 3.2: Power consumption of crystal oscillating at different voltage level
for the given application, the peripheral circuit's power consumption is not. In the given technology, standard threshold voltage devices suffer from a prohibitive level of leakage current, whereas high threshold voltage devices suffer functional failure under process variation. Therefore an additional voltage level of 0.3 V was chosen. This can be easily generated from 0.6 V by an SCN with a $2:1$ division ratio, and standard threshold voltage devices can safely operate with the help of a dynamic compensation circuit such as 2.7.

With a peak-to-peak oscillation amplitude of 0.6 V, the total power consumption would be larger than 10 nW. Considering that the real-time clock has to run all the time, this would significantly increase the average power consumption of a $\mathrm{mm}^{3}$-scale system. Therefore, the new system has been designed to operate at 0.3 V, which can be easily generated by an SCN with a $2:1$ division ratio. With a 0.3 V peak-to-peak oscillation amplitude across the crystal and 5 pF of capacitance to ground at each end, the crystal was simulated to consume 0.9 nW without any peripheral circuit.
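The two simulated crystal power figures (3.7 nW at 0.6 V and 0.9 nW at 0.3 V) can be related by a simple scaling check. The sketch below assumes that the crystal's dissipation scales with the square of the oscillation amplitude, as it would for a fixed resistive loss; this scaling law is an assumption and is not stated in the text.

```python
def scale_crystal_power(p_ref_nw: float, v_ref: float, v_new: float) -> float:
    """Scale a reference dissipation to a new amplitude, assuming P ~ V^2."""
    return p_ref_nw * (v_new / v_ref) ** 2


P_AT_0V6_NW = 3.7  # simulated crystal dissipation at 0.6 V, from the text

p_at_0v3 = scale_crystal_power(P_AT_0V6_NW, v_ref=0.6, v_new=0.3)
print(f"estimated dissipation at 0.3 V: {p_at_0v3:.3f} nW")
```

Halving the amplitude gives about 0.93 nW under this assumption, close to the simulated 0.9 nW.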

The testing result in the next section, as well as work of [30], showed that pulse location has

| Type | Temp. $\left({ }^{\circ} \mathrm{C}\right)$ | Min. | Max. | Mean | Sigma |
| :---: | :---: | :---: | :---: | :---: | :---: |
| Schmitt trigger | -20 | 2.77 | 8.72 | 4.62 | 0.641 |
|  | 27 | 1.50 | 4.48 | 2.87 | 0.396 |
|  | 100 | 0.469 | 2.79 | 1.49 | 0.261 |
| Inverter chain | -20 | 3.25 | 25.6 | 8.46 | 2.57 |
|  | 27 | 0.747 | 8.11 | 2.96 | 0.788 |
|  | 100 | 0.001 | 2.28 | 0.828 | 0.308 |

Table 3.1: Monte-Carlo simulation for delay variation ($\mu \mathrm{s}$)

| Type | Temp. $\left({ }^{\circ} \mathrm{C}\right)$ | Min. | Max. | Mean | Sigma |
| :---: | :---: | :---: | :---: | :---: | :---: |
| Schmitt trigger | -20 | 33.9 | 87.4 | 49.6 | 6.50 |
|  | 27 | 45.7 | 206 | 93.2 | 19.0 |
|  | 100 | 170 | 1035 | 407 | 104 |
| Inverter chain | -20 | 295 | 331 | 311 | 4.50 |
|  | 27 | 320 | 520 | 361 | 19.5 |
|  | 100 | 719 | 3079 | 1380 | 308 |

Table 3.2: Monte-Carlo simulation for power variation (pW)
limited effect on oscillation stability. However, it still needs to stay within an acceptable range, that is, close to the oscillation maxima and minima. When the pulse is far from the ideal location, the oscillator suffers from a different problem depending on the driving scheme. If the crystal is driven at one terminal only, as in the work described in the previous section, the capacitors at both ends of the crystal are charged by the driver, not by the resonant current coming from the crystal. The circuit therefore wastes additional power every time the crystal oscillates. If the crystal is driven at both terminals as in [30], the oscillation amplitude grows as the pulse moves further from the ideal location, which also leads to higher power consumption. A tradeoff therefore exists between the accuracy of the pulse location and power consumption.

In [30], an inverter chain was used to generate the delay for the pulse. However, an inverter chain operating at such a low voltage is known to suffer from large variation. In addition, it requires a long chain of inverters to generate a delay at the $\mu \mathrm{s}$ level, as required in a 32.768 kHz crystal application.

Figure 3.3 shows the block diagram of the proposed Schmitt trigger based pulse-driven crystal oscillator, which operates without a DLL. In this work, a Schmitt trigger replaces the inverter chain. Tables 3.1 and 3.2 show Monte-Carlo simulation results for one Schmitt trigger compared to 19 inverters connected in series. The number of inverters was chosen to match the delay of a single Schmitt trigger in the nominal case. The area of one Schmitt trigger in 180 nm technology is $207 \mu \mathrm{~m}^{2}$, whereas 19 inverters require $147 \mu \mathrm{~m}^{2}$.

Figure 3.3: Block diagram of the proposed pulse-driven crystal oscillator without DLL.

Figure 3.4: Schmitt trigger based pulse generator.
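One way to read the delay statistics in Table 3.1 is through the relative spread (sigma divided by mean) of each delay element. The sketch below computes that ratio from the table values; choosing this metric is an illustrative assumption and not a method described in the text.

```python
# (mean, sigma) of the delay in microseconds, copied from Table 3.1,
# keyed by temperature in degrees Celsius.
DELAY_US = {
    "Schmitt trigger": {-20: (4.62, 0.641), 27: (2.87, 0.396), 100: (1.49, 0.261)},
    "Inverter chain": {-20: (8.46, 2.57), 27: (2.96, 0.788), 100: (0.828, 0.308)},
}


def coefficient_of_variation(mean: float, sigma: float) -> float:
    """Relative spread of a distribution: sigma divided by mean."""
    return sigma / mean


cv = {
    name: {temp: coefficient_of_variation(*stats) for temp, stats in rows.items()}
    for name, rows in DELAY_US.items()
}

for name, rows in cv.items():
    for temp, value in rows.items():
        print(f"{name:16s} {temp:4d} C  sigma/mean = {value:.1%}")
```

By this metric the Schmitt trigger's relative spread is roughly 14% to 18% across the three temperatures, while the inverter chain's is roughly 27% to 37%.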

### 3.2 Circuit Implementation Detail

Figure 3.4 shows the pulse generator using a Schmitt trigger. The Schmitt trigger generates a delay from the sine-wave input coming from the crystal's oscillation. Followed by an inverter chain of selectable length, this circuit can generate pulses of different widths. Unlike the DLL-assisted pulse-driven crystal oscillator, it cannot change the pulse location. To save power, part of the inverter chain can be power gated when not used.

The bootstrap circuit schematic is shown in Figure 3.5. This circuit can provide faster rise/fall times and lower power consumption than a level converter operating across the same voltage domains (i.e., from 0.3 V to 0.6 V). It requires a larger area than a level converter because of its four capacitors, but the small capacitance (only 10 fF) keeps the area overhead small relative to the area of the whole system.

Figure 3.6 shows SCN circuit for generating 0.3 V for the system. The crystal's oscillation voltage is level-converted to generate SCN clock signals. As explained previously, standard-threshold device operating at 0.6 V may lead to prohibitively large amount of leakage current at certain process and temperature variation. Therefore, high-threshold device operating at 1.2 V was chosen for SCN clock signal generation.


Figure 3.5: Bootstrap circuit schematic diagram.



Figure 3.6: SCN schematic for the Schmitt trigger based pulse-driven crystal oscillator system.


Table 3.3: Simulated power consumption of the Schmitt trigger based pulse-driven crystal oscillator system.

### 3.3 Layout and Simulation Result

The layout of the system is shown in Figure 3.7. The total area including the SCN is $0.04 \mathrm{~mm}^{2}$, and the area without the SCN is $0.003 \mathrm{~mm}^{2}$. The simulation result in Table 3.3 shows that the system consumes 5.61 nW at the nominal process corner at $25^{\circ} \mathrm{C}$. Of this, 2.34 nW is consumed by the core circuit blocks, including the pulse generator, bootstrap circuit, driver, and crystal. Increased pulse width at $-20^{\circ} \mathrm{C}$ makes the oscillation amplitude larger and increases power consumption compared to the case at $0^{\circ} \mathrm{C}$. At higher temperatures, leakage power starts to increase and dominates the overall power consumption.


Figure 3.7: Layout of the Schmitt trigger based pulse-driven crystal oscillator system.

## CHAPTER 4

## Sub-nW 8-bit SAR ADC with Transistor-Stack DAC

### 4.1 Motivation

Wireless sensor nodes (WSN), as the name suggests, act as measurement devices for the real world. At the front of this interface is a sensor, which transforms information about environmental change into an electrical parameter. For efficient handling of the acquired information, the electrical parameter in analog form needs to be converted to digital form. The analog-to-digital converter (ADC) is therefore also a key component in WSN applications. As with every other component in a WSN, the instantaneous power consumption and overall energy efficiency of the ADC are critical metrics because of the limited energy budget. Although it depends on the application, many environmental measurements (such as pressure, humidity, inertial movement, and light intensity) and bio-signals (such as ECG, EMG, and EEG [31,32]) require relatively low sampling rates of $1 \mathrm{kS}/\mathrm{s}$ or lower. More interestingly, the power scheduling behavior common in WSN imposes another requirement. When a WSN is put into standby state to minimize its average energy consumption, it can only wake up from external triggering, such as a radio signal, if the ADC is also turned off. If the ADC can be designed with minimal operating power, however, it can continuously monitor the environment (without recording or processing data) and wake the system only when it detects an important event. For such a purpose, the ADC sampling rate can be brought much lower than $1 \mathrm{kS}/\mathrm{s}$. Since the ADC power consumption will then be only slightly more than its standby power, the new ADC also needs very low power in standby mode.

Unfortunately, much previous ADC research has focused on a power range that is relatively high for WSN applications [33-35]. Typical ADCs are designed to run at much higher sampling rates of $1 \mathrm{MS}/\mathrm{s}$ and above. Though they can achieve very high energy efficiency, their high instantaneous power consumption often exceeds sensor node budgets. To achieve the target resolution at a higher sampling frequency, standby power is sacrificed. As a result, slowing down the sampling clock of previous designs does not provide sufficient power and energy efficiency for WSN, and a new ADC is required to satisfy the requirements above. For the ADC discussed in the following sections, the target sampling rate was set to $1 \mathrm{kS}/\mathrm{s}$ with a moderate resolution of 8 bits. In Section 4.2, I will give a brief overview of previous work on low power ADCs. In Section 4.3, I will discuss the circuit operation principle and implementation details in 65 nm CMOS technology. In Section 4.4, I will discuss how the circuit was migrated to 180 nm technology for use in a WSN application. Finally, in Section 4.5, I will show the measurement results.

### 4.2 Previous Works

SAR ADCs are known to offer some of the highest energy efficiencies among modern data converter topologies. The SAR ADC has two distinctive advantages over other architectures. First, it resolves one bit at each comparison. This reduces conversion time significantly compared to architectures such as the single-slope ADC. Some architectures, such as pipeline or flash ADCs, offer similar or faster conversion times, but at a cost that the SAR ADC's second advantage avoids: the SAR ADC converts data with a single comparator. As an ADC's resolution increases, the comparator's power starts to dominate the total power, constrained by the comparator's noise performance. Therefore, the SAR ADC architecture was chosen for ultra-low power and energy consumption.

To further improve the power and energy efficiency of the SAR ADC under the target operating condition, the sources of its power consumption have to be analyzed. As shown in Figure 4.1, a SAR ADC can be divided into three distinct blocks: DAC, control logic, and comparator. The impact of each block is discussed in the following paragraphs. First, the DAC in a conventional SAR ADC is constructed from an array of capacitors, namely a capacitive DAC (CDAC). Since CDAC power is directly proportional to the size of the unit capacitance, recent SAR ADC designs use capacitors built with metal wire [36]. However, this makes the ADC vulnerable to mismatch and layout dependent effects (LDE) such


Figure 4.1: Conventional SAR ADC block diagram
as parasitic capacitance [37]. Although there has been much discussion of CDAC switching techniques [38], there is little margin left for decreasing the minimum size of the unit capacitance, and therefore for improving the CDAC. Second, control logic also accounts for a significant share of power consumption in a moderate resolution ADC. In particular, its large number of transistors, compared to the DAC and comparator blocks, means it has a large impact on the standby power of the whole ADC system. Recent works therefore use logic families other than conventional CMOS logic [39,40]. Third, comparator power in a SAR ADC depends on the target resolution. To improve noise performance, explicit capacitors have to be added at the output of the comparator's back-to-back inverter pair. Since a clocked comparator discharges one of the capacitors at each comparison, this directly affects power. However, comparator power is likely to be similar to that of the other blocks in a moderate resolution ADC. Therefore the new ADC was designed using the SAR architecture, with more focus on new DAC and control logic structures. The details are explained in the next section.

### 4.3 Circuit Implementation in 65nm CMOS Technology

The new ADC design starts by replacing the conventional CDAC with a transistor-stack DAC (TS-DAC). The TS-DAC is constructed from 256 low-Vth transistors configured as forward-biased diodes, followed by an analog 255:1 MUX (Figure 4.2). Furthermore, since all voltages are static in the TS-DAC, its output voltage is not affected by parasitic routing capacitance. This allows it to


Figure 4.2: Block diagram of the proposed circuit.
be routed using automatic routing tools which, combined with automatic placement scripts, results in automatic layout generation that greatly eases technology migration. The 255:1 MUX is designed using a new incremental indexing method. In addition, the digital control logic is optimized using true single-phase clocked (TSPC) and CMOS logic circuits that reduce device count by $81\%$ compared to conventional logic. Figure 4.2 shows the overall ADC operation. During the sampling period, $\mathrm{C}_{\text{sample}}$ stores $\mathrm{SIGNAL} - \mathrm{V}_{\mathrm{DD}}/2$. During conversion, $\mathrm{C}_{\text{sample}}$ is connected in series with DAC_OUT, which sets one comparator input to $\mathrm{DAC\_OUT} - \mathrm{SIGNAL} + \mathrm{V}_{\mathrm{DD}}/2$. Since this value is compared to $\mathrm{V}_{\mathrm{DD}}/2$, the comparator triggers when DAC_OUT and SIGNAL are equal. This sampling approach has the advantage that the comparison is made to a fixed value $\mathrm{V}_{\mathrm{DD}}/2$, eliminating common-mode dependent comparator offset error. $\mathrm{C}_{\text{sample}}$ is chosen to be 100 fF based on the leakage of the switches; this small value results in a low sampling energy. Despite the extremely small current supported by the TS-DAC, the small comparator input capacitance allows for sufficient speed in the targeted application space.
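The sampling and comparison scheme described above can be sketched as a small behavioral model. The model below is idealized: it assumes the transistor stack provides 256 uniformly spaced taps between ground and $\mathrm{V}_{\mathrm{DD}}$, ignores leakage, settling, and noise, and uses a plain binary search for the successive approximation. The function name and the default 0.7 V supply (the value used for the DAC in the measurements) are illustrative choices.

```python
def sar_convert(signal: float, vdd: float = 0.7, bits: int = 8) -> int:
    """Idealized successive-approximation conversion with series sampling.

    Assumptions (not taken from the text): ideal uniformly spaced DAC taps,
    no leakage, and complete settling at every step.
    """
    lsb = vdd / (1 << bits)
    v_cap = signal - vdd / 2.0       # stored on C_sample during sampling
    code = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)    # tentatively set the next bit
        dac_out = trial * lsb        # ideal stack tap selected by the MUX
        comp_in = dac_out - v_cap    # equals DAC_OUT - SIGNAL + VDD/2
        if comp_in <= vdd / 2.0:     # DAC_OUT has not exceeded SIGNAL
            code = trial             # keep the bit
    return code


VDD = 0.7
for k in (0.0, 100.5, 255.9):
    vin = VDD * k / 256              # input placed k LSBs above ground
    print(f"input at {k:5.1f} LSB -> code {sar_convert(vin, VDD)}")
```

Because the comparator reference stays at $\mathrm{V}_{\mathrm{DD}}/2$ for every input level in this model, the decision threshold does not move with the signal, which is the property the text credits with removing common-mode dependent offset.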

As will be explained later, the MUX settling speed and the leakage current flowing into the MUX output greatly affect overall performance. To mitigate this problem, that is, to achieve faster settling of the MUX output and sampled input and lower leakage during conversion, the bootstrap circuit in Figure 4.3 is used to drive the switches connected to $\mathrm{C}_{\text{sample}}$. The circuit takes SMP_EN and its complementary signal from the control logic, and generates SMP and its complementary signal to drive the transmission gates. With this bootstrap circuit, the gate terminals of the transistors in the transmission gates are driven to voltages higher than $\mathrm{V}_{\mathrm{DD}}$ and lower than ground.

For the TS-DAC shown in Figure 4.4, transistor matching is crucial to linearity performance.


Figure 4.3: Bootstrap circuit for sampling.


| A1 | A2 | $\cdots$ | A15 | A16 | C256 | C255 | $\cdots$ | C242 | C241 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| A17 | A18 | $\cdots$ | A31 | A32 | C240 | C239 | $\cdots$ | C226 | C225 |
| $\vdots$ | $\vdots$ | $\ddots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\ddots$ | $\vdots$ | $\vdots$ |
| A225 | A226 | $\cdots$ | A239 | A240 | C32 | C31 | $\cdots$ | C18 | C17 |
| A241 | A242 | $\cdots$ | A255 | A256 | C16 | C15 | $\cdots$ | C2 | C1 |
| B16 | B15 | $\cdots$ | B2 | B1 | D241 | D242 | $\cdots$ | D255 | D256 |
| B32 | B31 | $\cdots$ | B18 | B17 | D225 | D226 | $\cdots$ | D239 | D240 |
| $\vdots$ | $\vdots$ | $\ddots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\ddots$ | $\vdots$ | $\vdots$ |
| B240 | B239 | $\cdots$ | B226 | B225 | D17 | D18 | $\cdots$ | D31 | D32 |
| B256 | B255 | $\cdots$ | B242 | B241 | D1 | D2 | $\cdots$ | D15 | D16 |

Figure 4.4: Transistor stack schematic and layout floorplan.

To achieve good matching, four sets of series-connected transistors are placed in common-centroid fashion by an automatic script, and routing is performed by an automatic place-and-route (APR) tool. In addition, the nodes in the stack are stabilized using decoupling MIM and metal finger capacitors. This allows the stack to have a lower driving current, as the decoupling capacitors provide the instantaneous charge required when the DAC changes its output. By placing the capacitors above the transistors, no area overhead is incurred. A conventional 256:1 MUX is not suitable for two reasons. First, in a conventional MUX every input passes through an equal number of switches to the output. Since the decoupling capacitors in the stack provide sufficient charge during MUX transitions, the settling speed is limited by the MUX switch on-resistances. Earlier stages of the conversion require a larger output voltage shift than later stages, and hence they are the limiting factor in ADC performance. Second, in a conventional MUX many irrelevant switches are turned on/off at every transition (e.g., when the LSB changes value, 128 switches are impacted). To address these two issues, we propose a new MUX structure in which the number of switches from input to output increases linearly from MSB to LSB, resulting in balanced settling time while minimizing unnecessary switch transitions.
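The difference in signal path length between the two MUX styles can be sketched with a simple count. The model below assumes a conventional MUX is a binary tree in which every input passes through one switch per bit, and that the proposed structure starts with a single series switch at the MSB decision and adds one per step. The starting value of one switch and the function names are assumptions made for illustration.

```python
def conventional_path_lengths(bits: int) -> list:
    """Series switches seen at each conversion step in a binary-tree MUX."""
    return [bits] * bits


def incremental_path_lengths(bits: int, first_step_switches: int = 1) -> list:
    """Series switches per step when the path grows by one switch per step.

    first_step_switches is an assumed starting value for the MSB decision.
    """
    return [first_step_switches + step for step in range(bits)]


BITS = 8
conv = conventional_path_lengths(BITS)
incr = incremental_path_lengths(BITS)

print("step (MSB first):", list(range(1, BITS + 1)))
print("conventional    :", conv)
print("incremental     :", incr)
```

In this simplified picture the early steps, which need the largest output voltage shift, see the fewest series on-resistances, which is the balancing effect the text describes.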

Figure 4.5 shows a 4-bit sub-section of the proposed MUX structure. During the first conversion step, CK[3] is enabled only for that conversion step, selecting ST[8] (assume BP0[4] was enabled in a prior conversion step). In the next cycle, either B0[3] or BP0[3] is selected and remains on for the remainder of the conversion. In the example shown, BP0[3] is selected based on the comparator result. At the same time, CK[2] is enabled, selecting ST[4]. In the next conversion step, CK[1] and B00[2] are enabled to select ST[6], and so on. Note that the number of switches the signal passes through increases by exactly one in each conversion step, leading to a balanced settling time; this provides more than one order of magnitude performance improvement (in simulation). In addition, the number of active switches is reduced with gating logic based on the state of the switch that is two stages upstream. For example, in the last conversion step of Figure 4.5, the two switches BP000[1] are enabled since BP00[3] is enabled, whereas the two switches BP001[1] are off because B00[3] is off. Hence, only 1 unnecessary switch is enabled, compared to 3 without switch gating. Note that looking one level upstream, rather than two, can further reduce the number of switches unnecessarily turned on. However, this also increases the amount of gating logic, resulting in higher total power consumption. The chosen two-level approach was found to minimize overall power consumption, disabling $89\%$ of all unused switches (Figure 4.6) and yielding $51\%$ lower power consumption (in simulation).

Figure 4.5: Switching sequence for the last four clock cycles.

Figure 4.6: Switch connection status at the end of conversion cycle for code 0000010x.

Figure 4.7 shows the control logic circuit. Although asynchronous logic has shown good performance with low power [39, 40], it requires either a detection or delay circuit to ensure DAC output voltage settling. Due to the relatively long settling period in the moderate sampling rate applications targeted, these circuits consume much more power than synchronous control logic. However, normal CMOS logic using standard flip-flops requires a large transistor count, which increases static power consumption due to leakage. We therefore implemented a hybrid control logic using TSPC logic to hold state information and regular CMOS logic to generate outputs. The resulting controller requires 221 transistors, which is $81 \%$ lower than conventional CMOS logic.

Figure 4.8 shows the comparator circuit. For an ADC to have good accuracy, the comparator should induce little kick-back noise on the input side. One common solution to this problem is to use a preamplifier to isolate the input from fast switching nodes. In this system, however, the preamplifier itself would consume a considerable amount of power, easily dominating the overall power budget at the target sampling frequency. In order to achieve low kick-back noise without a preamplifier,


Figure 4.7: Control logic circuit and associated timing diagram.


Figure 4.8: Comparator circuit.


Figure 4.9: Block diagram of 180 nm CMOS test chip.
the source terminal of the input transistor pair is connected to ground, and the reset transistor is connected to the drain terminal [41]. This reduces the gate-drain voltage change, leading to reduced kick-back noise and thus eliminating the need for a preamplifier.

### 4.4 Implementation in 180nm CMOS Technology

The design discussed in the previous section was implemented in 65 nm CMOS technology. In order to use the same design in a WSN application such as [7], the circuit was migrated to 180 nm CMOS technology. The main purpose in this application was to measure relatively static signals such as battery voltage. In addition, the ADC was required to support a sleep mode while reacting quickly after being called back to active mode. Since the ADC was required to operate independently of other circuit blocks, a dedicated LDO provides its supply voltage. Additionally, ADC power consumption was not a dominant source of power dissipation in the given application.

To satisfy these requirements, the TS-DAC was sized differently so that it can support approximately 10 nA of current when it is on. This allows the TS-DAC to settle faster when the ADC is called back to active mode, and makes it less susceptible to leakage current of the MUX. Additionally, the control logic was designed with standard CMOS logic so that the design flow is more compatible with the standard design procedure used for the rest of the system, at the expense of increased power consumption. Unlike the previously discussed design, the whole ADC circuit operates from a single voltage supply provided by the dedicated LDO. All of these factors allowed easier implementation with the APR tool. With only the comparator and the TS-DAC unit cell (a single transistor with appropriate spacing for adjacent n-wells) requiring manual layout, the whole design could be implemented with a simpler APR process. In the 65 nm design, by contrast, the TSPC logic required manual layout of seven basic cells, and the use of dual voltage supplies increased the layout effort for cell placement. Figure 4.9 shows the block diagram of the implemented design.


Figure 4.10: Die photo of 65 nm CMOS test chip.

### 4.5 Measurement Result

Figure 4.10 shows the die photo of the design in 65 nm CMOS, with an active area of $0.106 \mathrm{~mm}^{2}$. The TS-DAC occupies $79\%$ of the area. The test structure includes a scan chain and level converters for the off-chip data interface. Two designs with different transistor-stack currents were implemented; one has approximately twice the transistor-stack current of the other. Because both the width and the length of the transistors were adjusted to achieve the higher current, the active area is identical for the two designs.

Figure 4.11 shows the die photo of the design in 180 nm CMOS with active area of $0.152 \mathrm{~mm}^{2}$. The area includes LDO, battery voltage divider to ADC input, and interface circuit to the system register such as level converters and additional logic for generating interrupt signal for the processor core when the conversion is done.

For the 65 nm test chip, the design with the smaller transistor-stack current was tested with 0.7 V supplied to the DAC, MUX, and comparator, while the control logic used 0.5 V. This test condition represents minimum power consumption with maximum efficiency. The other design, with the higher transistor-stack current, was supplied with 0.7 V for all circuits. This design provides higher accuracy with minimal loss of efficiency. For the higher current DAC, five different chips were measured.


Figure 4.11: Die photo of 180 nm CMOS test chip.

As shown in Figure 4.12 and Figure 4.13, SFDR starts to decrease at a sampling rate of $1 \mathrm{kS}/\mathrm{s}$, reducing ENOB below 7 bits at higher sampling rates. At $1 \mathrm{kS}/\mathrm{s}$ ($1.024 \mathrm{kS}/\mathrm{s}$ for the high current DAC), the ADC provides relatively constant SNDR regardless of the input frequency, as shown in Figure 4.14 and Figure 4.15.


Figure 4.12: Performance with different sampling frequencies with low current DAC in 65 nm test chip


Figure 4.13: Performance with different sampling frequencies with high current DAC in 65 nm test chip


Figure 4.14: Performance at $\mathrm{F}_{\text{sample}} = 1 \mathrm{kHz}$ with low current DAC in 65 nm test chip


Figure 4.15: Performance at $\mathrm{F}_{\text{sample}} = 1.024 \mathrm{kHz}$ with high current DAC in 65 nm test chip


Figure 4.16: DNL with low current DAC in 65 nm test chip


Figure 4.17: INL with low current DAC in 65 nm test chip


Figure 4.18: DNL with high current DAC in 65 nm test chip


Figure 4.19: INL with high current DAC in 65 nm test chip


Figure 4.20: Power and FoM at different sampling frequency with high current DAC in 65 nm test chip


Figure 4.21: Power breakdown with high current DAC in 65 nm test chip


Figure 4.22: DC signal measurement result in 180nm test chip

At $1 \mathrm{kS} / \mathrm{s}$ with the low current DAC, the DNL is $+0.76 /-0.74$ LSB and the INL is $+1.09 /-1.28$ LSB, as shown in Figure 4.16 and Figure 4.17. At $1.024 \mathrm{kS} / \mathrm{s}$ with the high current DAC, the DNL is $+0.46 /-0.46$ LSB and the INL is $+0.58 /-0.70$ LSB, as shown in Figure 4.18 and Figure 4.19. The linearity is limited by leakage current from the MUX into the DAC, as well as by the slower speed of the switches at mid-rail voltages. With the low current DAC, the maximum ENOB of 7.1 bits is observed at a power consumption of 466 pW, resulting in an FoM of $3.4 \mathrm{fJ} /$ conversion-step. With the high current DAC, the maximum ENOB of 7.3 bits is observed at a power consumption of 673 pW, resulting in an FoM of $4.1 \mathrm{fJ} /$ conversion-step. When the low current DAC version of the ADC operates from a single supply of 0.7 V at $1 \mathrm{kS} / \mathrm{s}$, the ENOB becomes 7.2 bits with an FoM of $4.4 \mathrm{fJ} /$ conversion-step.
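The figures of merit quoted above can be cross-checked with a short calculation. The sketch below assumes the commonly used definition FoM = P / (2^ENOB · f_s); this section does not spell the formula out, so the definition and the function name `walden_fom` are assumptions made for illustration.

```python
def walden_fom(power_w: float, enob_bits: float, f_sample_hz: float) -> float:
    """Energy per conversion step in joules: P / (2**ENOB * f_s)."""
    return power_w / (2.0 ** enob_bits * f_sample_hz)

# Operating points quoted in the text (power in W, ENOB in bits, rate in S/s).
low_current = walden_fom(466e-12, 7.1, 1.0e3)
high_current = walden_fom(673e-12, 7.3, 1.024e3)

print(f"low current DAC:  {low_current * 1e15:.2f} fJ/conversion-step")
print(f"high current DAC: {high_current * 1e15:.2f} fJ/conversion-step")
```

With the rounded ENOB and power values from the text, this definition lands within about 0.1 fJ/conversion-step of the reported numbers; the residual difference is consistent with rounding of the ENOB.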

Figure 4.20 shows the power consumption and FoM at different sampling frequencies with the high current DAC. With 7.3-bit ENOB and 246 pW power consumption, the FoM increases to $12.5 \mathrm{fJ} /$ conversion-step. The power breakdown with the high current DAC in Figure 4.21 shows a balanced distribution among design components, with a standby power of 194 pW. For the low current DAC, the standby power decreases to 113 pW. Table 4.1 summarizes the measured ADC performance and compares it to previous work.

| Parameter | This work (low current) | This work (high current) | [42] | [43] | [44] | [45] |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Process (nm) | 65 |  | 180 | 130 | 65 | 65 |
| Area ( $\mathrm{mm}^{2}$ ) | 0.106 |  | 0.06 | 0.04 | 0.046 | 0.037 |
| Resolution (bit) | 8 |  | 8 | 10 | 10 | 10 |
| Sample Frequency (kHz) | 1 | 1.024 | 2 | 1 | 1 | 1 |
| DNL (LSB) | $+0.76 /-0.74$ | $+0.46 /-0.46$ | $+0.28 /-0.17$ | $+0.54 /-0.61$ | $+0.59 /-0.62$ | $+0.48 /-0.55$ |
| INL (LSB) | $+1.09 /-1.28$ | $+0.58 /-0.70$ | $+0.81 /-0.27$ | $+0.45 /-0.46$ | $+0.89 /-0.41$ | $+0.52 /-0.61$ |
| ENOB (bit) | 7.1 | 7.3 | 7.4 | 9.1 | 9.1 | 9.1 |
| Power (nW) | 0.466 | 0.673 | 27 | 53 | 5.8 | 3 |
| Standby Power (nW) | 0.113 | 0.194 | 7.7 | 13 | 2.34 | 0.67 |
| FoM (fJ/conv.-step) | 3.4 | 4.1 | 79.9 | 94.5 | 10.9 | 5.5 |

Table 4.1: Performance summary.

Since the test chip in the 180 nm process was designed to measure static signals with an internally generated clock, only DC measurements were taken to evaluate its performance. Figure 4.22 shows the result with an externally provided input signal. With 4.0 V supplied to mimic the battery input voltage, the result shows a coefficient of determination of $\mathrm{R}^{2}=0.9998$ against the linear fit.
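For readers unfamiliar with the metric, the coefficient of determination of a linear fit can be computed as sketched below. The sweep used here is synthetic and purely hypothetical (an ideal 8-bit transfer curve over an assumed 0 V to 1 V range with a small wobble added); it is not the measured data behind Figure 4.22, and the function name is illustrative.

```python
def linear_fit_r2(xs, ys):
    """Least-squares line y = a*x + b and its coefficient of determination."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return a, b, 1.0 - ss_res / ss_tot

# Hypothetical sweep, NOT measured data: ideal codes plus a +/-1.5 code wobble.
inputs_v = [i / 20 for i in range(21)]
codes = [255 * v + (1.5 if i % 2 else -1.5) for i, v in enumerate(inputs_v)]

slope, offset, r2 = linear_fit_r2(inputs_v, codes)
print(f"slope={slope:.1f} codes/V, offset={offset:.2f}, R^2={r2:.5f}")
```

A value close to 1 means that nearly all of the variation in the output code is explained by the straight line, which is the sense in which the DC measurement is described as linear.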

## CHAPTER 5

## Multiple Output Level Switched Capacitor Network Voltage Regulator

### 5.1 Motivation

As mobile computing devices become more common, power management circuits need to adapt to the new trend. Unlike in a plugged-in device, the input source is a battery whose output voltage gradually decreases with use. In addition, the output loading condition is vastly different. First, the load is lower than that of a plugged-in device. Because a mobile device has to operate from a limited energy budget set by the battery capacity, its instantaneous power load is kept low through more energy-efficient design. Although a lower load eases the design of a new power management circuit, it also means the circuit itself has to be more energy efficient. Therefore, previous designs for plugged-in devices cannot be used directly. Second, the load is likely to span a wider range. Power gating is a very commonly used technique to extend the time between battery recharges (battery life). It allows a gated circuit block to reduce its power consumption by several orders of magnitude when the block is not needed. If the power management circuit operates inefficiently at such a low load, the battery life may be limited by the power management circuit rather than by the system's functional blocks.

The circuit proposed in this section is designed for a battery-operated system in the "Talking book" project [46]. The project aimed to build a low-cost audio computer for information dissemination among illiterate people groups, with the main focus on lower production cost (the initial goal for the cost was $\$ 10$ ). Although most of the functional blocks can be integrated into a single chip, the flash memory had to be purchased as a separate chip. Because the flash chip required $3.2 \mathrm{~V} \pm 10 \%$ as its supply voltage, a voltage regulator circuit had to generate this from a battery voltage of $1.7 \mathrm{~V}-3.2 \mathrm{~V}$. I will explain why the specific architecture was chosen, and how it was implemented, in the following sections.

| Component | Price (U.S. cents) | Area $\left(\mathrm{mm}^{2}\right)$ |
| :---: | :---: | :---: |
| Inductor $(10\ \mu \mathrm{H})$ | 28.9 | 5.12 |
| Capacitor $(100 \mathrm{nF})$ | 0.275 | 0.43 |

Table 5.1: Inductor and capacitor comparison

Figure 5.1: Voltage doubler configuration

### 5.2 Architecture Choice

A boost converter is an obvious candidate for the voltage conversion requirement mentioned above. However, it requires a discrete inductor, which increases the cost of the system. As shown in Table 5.1 [47], an inductor for a boost converter is much more expensive than a discrete capacitor.

On the other hand, a charge pump based circuit can be built with only a few capacitors, and thus at much lower cost. Figure 5.1 shows the basic configuration of a charge pump. In phase 1, a capacitor is connected in parallel with the input voltage source and charged to the input voltage. In phase 2, the capacitor is connected in series with the input voltage source and generates twice the input voltage. This basic configuration is often called a voltage doubler. However, this architecture can only generate integer multiples of its input voltage, as commonly seen in the Dickson charge pump [48].

Figure 5.2: Circuit block diagram.

With the relatively relaxed requirement on output voltage for the given application, the charge pump was chosen for its low cost and reasonable efficiency. To generate an output voltage in our target range, the circuit needs to support as many conversion ratios as possible, with a keen focus on efficiency.
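The two-phase operation of the voltage doubler in Figure 5.1 can be illustrated with an idealized charge-sharing model. This is a minimal sketch assuming ideal switches, no load current, and an output capacitor that starts discharged; the function name and the component values are hypothetical.

```python
def doubler_output(v_in: float, c_f: float, c_o: float, cycles: int) -> float:
    """Unloaded output voltage of an ideal two-phase voltage doubler."""
    v_out = 0.0
    for _ in range(cycles):
        # Phase 1: C_f sits in parallel with the source and charges to v_in.
        v_cf = v_in
        # Phase 2: C_f is stacked on the source, so its top plate starts at
        # v_in + v_cf, and then shares charge with the output capacitor C_o.
        v_out = (c_f * (v_in + v_cf) + c_o * v_out) / (c_f + c_o)
    return v_out

for n in (1, 10, 100):
    print(n, round(doubler_output(1.7, 1e-6, 10e-6, n), 3))
```

Under these assumptions the output climbs toward twice the input over successive cycles, which is why the basic configuration only offers an integer conversion ratio.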

### 5.3 Circuit Implementation

Figure 5.2 shows the block diagram of the proposed circuit. Unlike in a conventional charge pump, the bottom plate of $\mathrm{C}_{\mathrm{f}}$ is now connected to the output of an independent charge pump. In this configuration, a fraction of the battery voltage is generated by the down-converter block. The output of the down-converter is then used as one input of the voltage doubler (up-converter), which generates the final output voltage. However, this scheme can only generate a discrete number of output levels. To overcome a limited set of conversion ratios, a circuit can be deliberately made inefficient to generate an otherwise unavailable voltage. In [49], the on-resistance of the switches is dynamically adjusted to induce more voltage drop. Similarly, an LDO can be placed at the output of the charge pump [50], which also comes at the expense of efficiency. These approaches can be useful when an accurate output voltage is more important than efficiency. In this design, however, efficiency is prioritized and the output accuracy tolerance is relaxed. Thus, the down-converter is made to generate six discrete levels only.

Figure 5.3 shows the schematic of the converter. The down-converter can generate $25 \%$, $33 \%$, $50 \%$, $66 \%$, $75 \%$, or $100 \%$ (bypassing) of the input. To change the conversion ratio on demand, multiple switches are placed with different gating logic. As shown in Table 5.2, the clock signals are distributed differently for each conversion ratio setting.

Figure 5.3: Circuit schematic.

Figure 5.4: Step up converter switch configuration in detail

|  | $25 \%$ | $33 \%$ | $50 \%$ | $66 \%$ | $75 \%$ | $100 \%$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| CK1 | $\phi 1$ | $\phi 1$ | $\phi 1$ | $\phi 1$ | $\phi 1$ | On |
| CK2 | $\phi 2$ | $\phi 2$ | $\phi 2$ | $\phi 2$ | $\phi 2$ | On |
| CK3 | $\phi 2$ | On | On | On | $\phi 1$ | On |
| CK4 | $\phi 2$ | $\phi 2$ | On | $\phi 1$ | $\phi 1$ | On |
| CK5 | $\phi 1$ | $\phi 1$ | Off | $\phi 2$ | $\phi 2$ | Off |
| CK6 | $\phi 1$ | $\phi 1$ | Off | $\phi 2$ | $\phi 2$ | Off |
| CK7 | $\phi 2$ | $\phi 2$ | On | $\phi 1$ | $\phi 1$ | On |
| CK8 | $\phi 2$ | $\phi 2$ | On | $\phi 1$ | $\phi 1$ | On |
| CK9 | $\phi 2$ | $\phi 2$ | $\phi 2$ | $\phi 2$ | $\phi 2$ | On |
| CK10 | $\phi 1$ | $\phi 1$ | $\phi 1$ | $\phi 1$ | $\phi 1$ | Off |
| CK11 | On | Off | On | Off | On | On |

Table 5.2: Configurations for step down converter to achieve fractional voltages
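The clock assignments in Table 5.2 lend themselves to a simple lookup structure. The sketch below transcribes the table into a Python dictionary; the helper names are hypothetical, and treating "On" as "the switch conducts in both phases" is an assumption about polarity that the table itself does not state.

```python
# Clock assignment per down-converter ratio, transcribed from Table 5.2.
# "phi1"/"phi2" are the two clock phases; "on"/"off" are static states.
RATIOS = ("25%", "33%", "50%", "66%", "75%", "100%")

SWITCH_TABLE = {
    "CK1":  ("phi1", "phi1", "phi1", "phi1", "phi1", "on"),
    "CK2":  ("phi2", "phi2", "phi2", "phi2", "phi2", "on"),
    "CK3":  ("phi2", "on",   "on",   "on",   "phi1", "on"),
    "CK4":  ("phi2", "phi2", "on",   "phi1", "phi1", "on"),
    "CK5":  ("phi1", "phi1", "off",  "phi2", "phi2", "off"),
    "CK6":  ("phi1", "phi1", "off",  "phi2", "phi2", "off"),
    "CK7":  ("phi2", "phi2", "on",   "phi1", "phi1", "on"),
    "CK8":  ("phi2", "phi2", "on",   "phi1", "phi1", "on"),
    "CK9":  ("phi2", "phi2", "phi2", "phi2", "phi2", "on"),
    "CK10": ("phi1", "phi1", "phi1", "phi1", "phi1", "off"),
    "CK11": ("on",   "off",  "on",   "off",  "on",   "on"),
}

def switch_config(ratio: str) -> dict:
    """Return {clock name: drive} for one conversion ratio."""
    col = RATIOS.index(ratio)
    return {name: row[col] for name, row in SWITCH_TABLE.items()}

def closed_in_phase(ratio: str, phase: str) -> list:
    """Clock names assumed to conduct during the given phase."""
    return [ck for ck, drive in switch_config(ratio).items()
            if drive in (phase, "on")]

print(closed_in_phase("50%", "phi1"))
```

Keeping the table as data rather than as hard-wired logic mirrors the way the gating logic selects a different clock distribution for each ratio setting.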

To obtain maximum transconductance from smaller transistors, the up-converter uses its internal voltage to drive the switches. In addition, the boosted voltage level allows PMOS switches to be used, which eliminates the threshold voltage drop commonly seen in diode-connected transistors as in [48]. Because the clock signals are generated at the 1.8 V logic level, appropriate drivers and level converters are required. Furthermore, the up-converter's internal nodes (as well as its output) need to be charged to a certain voltage before it can operate as a charge pump. Therefore, the body voltage of the switches is biased appropriately for each condition, as shown in Figure 5.4.

The non-ideal characteristics of the charge pump can be divided into two parts. The first is the parasitic capacitance of the transistors and metal wires. These parasitic capacitances are periodically charged and discharged, which leads to energy loss. The second is the non-zero resistance of the charge transfer path, which includes both the on-resistance of the transistors and the resistance of the routing metal. These two characteristics are analyzed in detail in the following paragraphs.

Figure 5.5: Voltage doubler configuration

First to be analyzed is the parasitic capacitance. For the voltage doubler example shown in Figure 5.1, both plates of $\mathrm{C}_{\mathrm{f}}$ swing by as much as VBAT. In addition, the transistor switches are driven by pulse signals, so the parasitic capacitance associated with the gate terminal of each transistor consumes unwanted power. Since this power is proportional to $C_{\text {par }} \cdot V_{\text {swing }}^{2} \cdot f_{c l k}$, its effect can be minimized by using smaller switches and narrower wires (lowering $\mathrm{C}_{\mathrm{par}}$ ), reducing the transistor gate voltage swing, or running the charge pump at a lower frequency.
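The proportionality above can be made concrete with a few numbers. The sketch below uses unity as the proportionality constant and purely illustrative values for the parasitic capacitance, swing, and clock frequency; none of them are taken from the design.

```python
def parasitic_power(c_par: float, v_swing: float, f_clk: float) -> float:
    """Dynamic loss in watts from charging a parasitic capacitance each cycle."""
    return c_par * v_swing ** 2 * f_clk

# Illustrative values only (assumptions, not taken from the design).
base = parasitic_power(c_par=10e-12, v_swing=3.2, f_clk=1.0e6)

print(f"baseline:        {base * 1e6:.1f} uW")
print(f"half the swing:  {parasitic_power(10e-12, 1.6, 1.0e6) / base:.2f}x")
print(f"half the clock:  {parasitic_power(10e-12, 3.2, 0.5e6) / base:.2f}x")
print(f"half the C_par:  {parasitic_power(5e-12, 3.2, 1.0e6) / base:.2f}x")
```

Because the swing enters quadratically, reducing the gate voltage swing is the most effective of the three knobs per unit change, while capacitance and frequency scale the loss linearly.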

Next, Figure 5.5 shows the voltage doubler configuration, where $\mathrm{R}_{\mathrm{w}}$ and $\mathrm{R}_{\mathrm{on}}$ represent the wire resistance and the transistor on-resistance, respectively. Once the output voltage has stabilized after a sufficient number of clock cycles of period $\mathrm{T}=1 / \mathrm{f}_{\text {clk }}$, the circuit can be represented by its Thévenin equivalent, with open-circuit voltage $\mathrm{V}_{\text {oc }}$ and output resistance $\mathrm{R}_{\text {out }}$. Due to the non-zero $\mathrm{R}_{\mathrm{w}}$ and $\mathrm{R}_{\mathrm{on}}$, charge may not be fully transferred to the capacitors while the switches are on, which effectively increases $\mathrm{R}_{\text {out }}$. In this configuration, $\mathrm{R}_{\text {out }}$ can be expressed as

$$
\begin{equation*}
R_{\text {out }}=\frac{T}{C_{f}}\left(\frac{1}{1-e^{-\frac{T / 2}{\left(2 R_{o n}+4 R_{w}\right) C_{f}}}}-\frac{1}{2}\right)+\frac{T}{C_{f} \| C_{o}}\left(\frac{1}{1-e^{-\frac{T / 2}{\left(2 R_{o n}+4 R_{w}\right)\left(C_{f} \| C_{o}\right)}}}-\frac{1}{2}\right) \tag{5.1}
\end{equation*}
$$

In (5.1), $R_{\text {sum }}=2 R_{\text {on }}+4 R_{w}$ is the total series resistance that appears in the exponents. If we assume $C_{o}$ was chosen to be much larger than $\mathrm{C}_{\mathrm{f}}$ for lower output ripple and $\mathrm{R}_{\text {sum }} \ll \mathrm{T} / \mathrm{C}_{\mathrm{f}}$ (i.e., $\mathrm{R}_{\text {sum }} \mathrm{C}_{\mathrm{f}} \ll \mathrm{T}$ ), then (5.1) can be simplified as,

$$
\begin{equation*}
R_{\text {out }}=\frac{T}{C_{f}} \tag{5.2}
\end{equation*}
$$

Conversely, if $\mathrm{R}_{\text {sum }} \gg \mathrm{T} / \mathrm{C}_{\mathrm{f}}$, (5.1) can be simplified as,

$$
\begin{equation*}
R_{\text {out }}=4 R_{\text {sum }} \tag{5.3}
\end{equation*}
$$
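The two limits (5.2) and (5.3) follow from (5.1) in a few steps. The sketch below assumes that the exponent in (5.1) is negative and that $C_{f} \| C_{o}$ denotes the series combination $C_{f} C_{o} /\left(C_{f}+C_{o}\right)$, which approaches $C_{f}$ when $C_{o} \gg C_{f}$. The shorthand $x$ and $g(x)$ is introduced here only for illustration, with $x_{f}$ and $x_{fo}$ denoting $x$ evaluated at $C=C_{f}$ and $C=C_{f} \| C_{o}$.

$$
\begin{aligned}
x &= \frac{T / 2}{R_{\text {sum }} C}, \qquad g(x)=\frac{1}{1-e^{-x}}-\frac{1}{2}, \qquad R_{\text {out }}=\frac{T}{C_{f}}\, g\left(x_{f}\right)+\frac{T}{C_{f} \| C_{o}}\, g\left(x_{fo}\right) \\
x \gg 1: &\quad g(x) \rightarrow \frac{1}{2} \;\Rightarrow\; R_{\text {out }} \rightarrow \frac{T}{2 C_{f}}+\frac{T}{2\left(C_{f} \| C_{o}\right)} \approx \frac{T}{C_{f}} \\
x \ll 1: &\quad g(x) \approx \frac{1}{x} \;\Rightarrow\; R_{\text {out }} \rightarrow 2 R_{\text {sum }}+2 R_{\text {sum }}=4 R_{\text {sum }}
\end{aligned}
$$

The first limit corresponds to a clock period long enough for the capacitors to settle fully in each phase, and the second to a period so short that the series resistance alone sets the output resistance.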

Figure 5.6 shows a plot of $\mathrm{R}_{\text {out }}$ as expressed in (5.1), (5.2), and (5.3). Equation (5.2) indicates that, to support more output power at the same voltage drop, the designer can either use a larger capacitor or increase the clock frequency. However, this is effective only while $\mathrm{R}_{\text {sum }} \ll \mathrm{T} / \mathrm{C}_{\mathrm{f}}$ holds. Considering that $\mathrm{R}_{\text {sum }}$ depends mostly on parameters fixed at design time, the performance of the whole circuit can be limited by this parameter if it is not designed appropriately. To reduce $\mathrm{R}_{\text {out }}$, the circuit has to run at a higher frequency (lowering $T$ ), use wider wires (lowering $R_{w}$ ), or use larger transistors with higher gate voltage (lowering $\mathrm{R}_{\text {on }}$ ).

Figure 5.6: $\mathrm{R}_{\text {out }}$ when $\mathrm{R}_{\mathrm{sw}}=0.25 \Omega, \mathrm{R}_{\mathrm{w}}=3 \Omega, \mathrm{C}_{\mathrm{f}}=1 \mu \mathrm{~F}, \mathrm{C}_{\mathrm{o}}=10 \mu \mathrm{~F}$.

These contradictory requirements (small switches, narrow wires, and a slow clock to limit parasitic loss, versus large switches, wide wires, and a fast clock to limit $\mathrm{R}_{\text {out }}$ ) suggest that the circuit has to operate at an optimal design point. Because the output load is not fixed, the circuit was designed so that its parameters can be adjusted dynamically at run time, which requires it to be highly reconfigurable. As for transistor sizing, each switch is composed of five independent transistors connected in parallel. The number of transistors that switch can be increased when the circuit experiences a large output voltage drop, and decreased when the output load is small so that efficiency improves. In addition, the clock speed can be chosen from eight stages of a clock divider to manage dynamic power loss.
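The trade-off between conduction loss and parasitic switching loss can be explored numerically. The following sketch evaluates (5.1) with a negative exponent, reads $C_{f} \| C_{o}$ as the series combination, models conduction loss as the load current squared times $R_{out}$, and assumes the eight clock settings are binary multiples of 65.28 kHz. All parameter values and function names are hypothetical and chosen only to show the shape of the trade-off.

```python
import math

def r_out(t_clk, r_sum, c_f, c_o):
    """Output resistance from (5.1); C_f || C_o is taken as the series value."""
    c_series = c_f * c_o / (c_f + c_o)
    total = 0.0
    for c in (c_f, c_series):
        x = (t_clk / 2.0) / (r_sum * c)
        # 1 / (1 - exp(-x)) written with expm1 for accuracy at small x
        total += (t_clk / c) * (-1.0 / math.expm1(-x) - 0.5)
    return total

def total_loss(f_clk, i_load, r_sum, c_f, c_o, c_par, v_swing):
    """Conduction loss in R_out plus parasitic switching loss, in watts."""
    conduction = i_load ** 2 * r_out(1.0 / f_clk, r_sum, c_f, c_o)
    switching = c_par * v_swing ** 2 * f_clk
    return conduction + switching

# Hypothetical parameters; the eight clocks assume binary division steps.
clocks = [65.28e3 * 2 ** k for k in range(8)]
params = dict(i_load=15e-3, r_sum=1.0, c_f=1e-6, c_o=10e-6,
              c_par=100e-12, v_swing=3.2)
losses = [total_loss(f, **params) for f in clocks]
best = min(range(len(clocks)), key=losses.__getitem__)

for f, p in zip(clocks, losses):
    print(f"{f / 1e3:9.2f} kHz  {p * 1e3:6.3f} mW")
print("lowest loss at", clocks[best] / 1e3, "kHz")
```

With these assumed values the lowest total loss falls at an intermediate clock setting rather than at either extreme, which is the behavior that motivates making the clock frequency and switch width selectable at run time.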

### 5.4 Measurement Result

Figure 5.7 shows a die photograph of the circuit. The layout area is $1.16 \mathrm{~mm}^{2}$.

Figure 5.7: Die photograph of the circuit

Figure 5.8: Measured output voltage while decreasing battery voltage

Figure 5.9: Measured output voltage while increasing battery voltage

Figures 5.8 and 5.9 show the output voltage at different input voltages and load currents. Figure 5.8 is measured while decreasing the battery input voltage, so the circuit changes to a higher conversion ratio when the output voltage drops below $3.2 \mathrm{~V}-10 \%$. The output voltage stays within the specified range, except when there is no load current. The circuit needs to be put into sleep mode to save power at zero load, so this case does not violate the target requirement. Figure 5.9 is measured while increasing the battery input voltage, so the conversion ratio is decreased when the output voltage exceeds $3.2 \mathrm{~V}+10 \%$. Again, the output voltage stays within the specified range except in the zero load case.
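The ratio-switching behavior described for Figures 5.8 and 5.9 amounts to a window comparison on the output voltage. The sketch below is one possible way to express it; stepping by a single ratio at a time, ordering the settings by down-converter fraction, and all names are assumptions for illustration, not a description of the implemented controller.

```python
V_TARGET = 3.2                 # flash supply target in volts
V_LOW = V_TARGET * 0.9         # lower bound, 3.2 V - 10%
V_HIGH = V_TARGET * 1.1        # upper bound, 3.2 V + 10%

# Down-converter settings from Table 5.2, ordered from lowest to highest.
RATIOS = ("25%", "33%", "50%", "66%", "75%", "100%")

def next_ratio_index(index: int, v_out: float) -> int:
    """Step the ratio up or down by one when v_out leaves the window."""
    if v_out < V_LOW and index < len(RATIOS) - 1:
        return index + 1       # output sagging: move to a higher ratio
    if v_out > V_HIGH and index > 0:
        return index - 1       # output too high: move to a lower ratio
    return index               # inside the window, or already at a limit

index = 2
for v_out in (3.20, 2.85, 3.10, 3.60):   # hypothetical output samples
    index = next_ratio_index(index, v_out)
    print(f"v_out={v_out:.2f} V -> down-converter ratio {RATIOS[index]}")
```

Because the ratio changes only when the output leaves the ±10% window, the controller has built-in hysteresis and does not toggle between settings while the output sits inside the window.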

Figure 5.10 shows the efficiency at different input voltages and load currents. Since the nominal load current for this application was 10-20 mA, the circuit was designed to provide higher efficiency in that range. The circuit provides $85.2-90.7 \%$ efficiency over the battery voltage range for a 15 mA load current. Although the circuit can support higher current, this comes at the cost of decreased efficiency and a reduced battery voltage range.

Table 5.3 shows the maximum current the circuit can support at different switching frequencies. Since an external capacitor was used, the capacitor was kept large enough that the maximum performance is limited only by $\mathrm{R}_{\text {sum }}$. In conclusion, a switched capacitor network voltage regulator supporting a battery voltage range of 1.7-3.4 V was designed. It shows a maximum efficiency of $91 \%$ at a load current of 15 mA. The actual cost of the discrete components could be decreased to $\$ 0.03$ from $\$ 0.38$ for the boost converter case [46].


Figure 5.10: Measured efficiency

| Switching Frequency | Max Current (mA) at 3.2 V | Max Current (mA) at 2.5 V | Max Current (mA) at 1.7 V |
| :---: | :---: | :---: | :---: |
| 65.28 kHz | 110.4 | 67.38 | 14.11 |
| 130.56 kHz | 128.9 | 79.18 | 18.27 |
| 261.12 kHz | 135.1 | 83.5 | 20 |
| 522.2 kHz | 136.7 | 83.5 | 20 |
| 1.044 MHz | 134.7 | 83 | 19.4 |
| 2.089 MHz | 131.8 | 74.37 | 17.85 |
| 8.356 MHz | 111.6 | 62.88 | 15 |

Table 5.3: Maximum current the circuit can support

## CHAPTER 6

## Ongoing and Future Work

### 6.1 Future Work for Crystal Oscillator

Although a 32.768 kHz quartz crystal is well known for its superior frequency characteristics over silicon-based timers, it shows a quadratic dependence on temperature, as shown in 2.17. It is widely known that the size of the capacitors at both ends of the crystal strongly affects the crystal oscillation frequency. Fortunately, the capacitance is in the range of a few pF, which can be easily implemented on chip. In addition, the simulation results suggest that the phase of the pulse can also affect the oscillation frequency. Therefore, the oscillator can potentially be modified to work as a temperature compensated crystal oscillator (TCXO) if it is implemented together with a temperature sensor circuit.
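To give a sense of scale for the quadratic temperature dependence, the sketch below evaluates a parabolic model of the frequency error. The turnover temperature of 25 °C and the coefficient of 0.034 ppm/°C² are typical datasheet figures for 32.768 kHz tuning-fork crystals and are assumptions here, not values measured in this work; the function names are illustrative.

```python
def tuning_fork_error_ppm(temp_c: float,
                          turnover_c: float = 25.0,
                          beta_ppm_per_c2: float = 0.034) -> float:
    """Frequency deviation in ppm for a parabolic temperature curve."""
    return -beta_ppm_per_c2 * (temp_c - turnover_c) ** 2

def seconds_per_day(error_ppm: float) -> float:
    """Clock drift accumulated over one day for a given ppm error."""
    return error_ppm * 1e-6 * 86400.0

for temp in (-20.0, 0.0, 25.0, 45.0, 70.0):
    ppm = tuning_fork_error_ppm(temp)
    print(f"{temp:6.1f} C  {ppm:8.2f} ppm  {seconds_per_day(ppm):7.3f} s/day")
```

A TCXO would use a temperature reading to estimate this error and cancel it, for example by trimming the load capacitance, which is the direction the paragraph above proposes.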

To make the oscillation frequency controllable, the circuit has to be modified to include a pair of digitally variable capacitor arrays. Its frequency dependence on pulse phase also has to be thoroughly investigated. With careful characterization and collaboration with a low power temperature sensor designer, the wireless sensor node can be equipped with a more precise real time clock.

With the reduced power consumption provided by the Schmitt trigger based pulse-driven crystal oscillator discussed in Chapter 3 and a low power temperature sensor such as [51], a low power TCXO can be implemented in a WSN.

### 6.2 Future Work for ADC

The design has been implemented in a 180 nm process inside a WSN system. Its supply voltage is generated by an LDO operating from the battery voltage, and the ADC is controlled through the system's processor core. Because of this, it may have issues when interfacing with other blocks of the system. First, the LDO and its reference voltage generator have to be carefully characterized so that they provide a stable output voltage across different battery voltages and temperatures. Second, the ADC's input MUX should have minimal effect on the ADC itself and on any analog voltage nodes it is connected to. Third, the ADC's internal clock generator should provide an adequate clock frequency so that the ADC can minimize energy per conversion without losing accuracy. Fourth, the ADC and LDO have to consume low leakage power when put to sleep, and respond fast enough when a conversion is requested while they are still in the sleep state. Overall, more careful characterization is necessary to confirm that the design operates properly.

## APPENDIX

## APPENDIX A

## Related Papers

1. Z. Foo, D. Devescery, M. Ghaed, I. Lee, A. Madhavan, Y. Park, A. Rao, Z. Renner, N. Roberts, A. Schulman, V. Vinay, M. Wieckowski, Dongmin Yoon, C. Schmidt, T. Schmid, P. Dutta, P. Chen, D. Blaauw, "A Low-Cost Audio Computer for Information Dissemination Among Illiterate People Groups," Circuits and Systems I: Regular Papers, IEEE Transactions on, vol. 60, no. 8, pp. 2039-2050, Aug. 2013
2. Yoonmyung Lee, Dongmin Yoon, Yejoong Kim, David Blaauw, Dennis Sylvester, "Circuit and System Design Guidelines for Ultra-Low Power Sensor Nodes" IPSJ Transactions on System LSI Design Methodology (TSLDM), February 2013, invited paper
3. Z. Foo, D. Devescery, M. Ghaed, I. Lee, A. Madhavan, Y. Park, A. Rao, Z. Renner, N. Roberts, A. Schulman, V. Vinay, M. Wieckowski, Dongmin Yoon, C. Schmidt, T. Schmid, P. Dutta, P. Chen, D. Blaauw, "A Low-cost Audio Computer for Information Dissemination among Illiterate People Groups," Custom Integrated Circuits Conference (CICC), September 2012
4. Yoonmyung Lee, Yejoong Kim, Dongmin Yoon, Dennis Sylvester, David Blaauw, "Circuit and System Design Guidelines for Ultra-Low Power Sensor Nodes," ACM/IEEE Design Automation Conference (DAC), June 2012, invited paper
5. Dongmin Yoon, Dennis Sylvester, David Blaauw, "A 5.58nW 32.768kHz DLL-Assisted XO for Real Time Clocks in Wireless Sensing Applications," IEEE International Solid-State Circuits Conference (ISSCC), February 2012
6. Z. Foo, D. Devecsery, T. Schmid, N. Clark, M. Ghaed, Y. Kuo, I. Lee, Y. Park, N. Slottow, V. Vinay, M. Wieckowski, Dongmin Yoon, C. Schmidt, D. Blaauw, P. Chen, P. Dutta, "A Case for Custom Silicon in Enabling Low-Cost Information Technology for Developing Regions," ACM Symposium on Computing for Development, December 2010

## BIBLIOGRAPHY

[1] G. Bell, "Bell's law for the birth and death of computer classes," Commun. ACM, vol. 51, no. 1, pp. 86-94, Jan. 2008. [Online]. Available: http://doi.acm.org/10.1145/1327452.1327453
[2] J. M. Kahn, R. H. Katz, and K. S. J. Pister, "Next century challenges: Mobile networking for 'smart dust'," in Proceedings of the 5th Annual ACM/IEEE International Conference on Mobile Computing and Networking, ser. MobiCom '99. New York, NY, USA: ACM, 1999, pp. 271-278. [Online]. Available: http://doi.acm.org/10.1145/313451.313558
[3] S. Bhattacharya, S. Sridevi, and R. Pitchiah, "Indoor air quality monitoring using wireless sensor network," in Sensing Technology (ICST), 2012 Sixth International Conference on, Dec 2012, pp. 422-427.
[4] R. Morello, C. De Capua, and A. Meduri, "Remote monitoring of building structural integrity by a smart wireless sensor network," in Instrumentation and Measurement Technology Conference (I2MTC), 2010 IEEE, May 2010, pp. 1150-1154.
[5] G. Chen, H. Ghaed, R. Haque, M. Wieckowski, Y. Kim, G. Kim, D. Fick, D. Kim, M. Seok, K. Wise, D. Blaauw, and D. Sylvester, "A cubic-millimeter energy-autonomous wireless intraocular pressure monitor," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2011 IEEE International, Feb 2011, pp. 310-312.
[6] E. Chow, S. Chakraborty, W. Chappell, and P. Irazoqui, "Mixed-signal integrated circuits for self-contained sub-cubic millimeter biomedical implants," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2010 IEEE International, Feb 2010, pp. 236-237.
[7] Y. Lee, G. Kim, S. Bang, Y. Kim, I. Lee, P. Dutta, D. Sylvester, and D. Blaauw, "A modular $1 \mathrm{~mm}^{3}$ die-stacked sensing platform with optical communication and multi-modal energy harvesting," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 IEEE International, Feb 2012, pp. 402-404.
[8] ITRS, "International technology roadmap for semiconductors, 2013 edition," International Technology Roadmap for Semiconductors, Tech. Rep., 2013.
[9] J. Paradiso and T. Starner, "Energy scavenging for mobile and wireless electronics," Pervasive Computing, IEEE, vol. 4, no. 1, pp. 18-27, Jan 2005.
[10] D. Travlos. Arm holdings and qualcomm: The winners in mobile. [Online]. Available: http://www.forbes.com/sites/darcytravlos/2013/02/28/arm-holdings-and-qualcomm-the-winners-in-mobile/
[11] [Online]. Available: http://www.powerstream.com/p/PGEB014461.pdf
[12] [Online]. Available: http://www.powerstream.com/p/GMB300910.pdf
[13] [Online]. Available: http://www.st.com/st-webui/static/active/en/resource/technical/document/datasheet/CD00270103.pdf
[14] [Online]. Available: http://www.cymbet.com/pdfs/DS-72-01.pdf
[15] [Online]. Available: http://www.cymbet.com/pdfs/DS-72-02.pdf
[16] R. Dennard, F. Gaensslen, V. Rideout, E. Bassous, and A. LeBlanc, "Design of ion-implanted mosfet's with very small physical dimensions," Solid-State Circuits, IEEE Journal of, vol. 9, no. 5, pp. 256-268, Oct 1974.
[17] L. Nazhandali, B. Zhai, J. Olson, A. Reeves, M. Minuth, R. Helfand, S. Pant, T. Austin, and D. Blaauw, "Energy optimization of subthreshold-voltage sensor network processors," in Computer Architecture, 2005. ISCA '05. Proceedings. 32nd International Symposium on, June 2005, pp. 197-207.
[18] M. Seok, S. Hanson, Y.-S. Lin, Z. Foo, D. Kim, Y. Lee, N. Liu, D. Sylvester, and D. Blaauw, "The phoenix processor: A 30pw platform for sensor applications," in VLSI Circuits, 2008 IEEE Symposium on, June 2008, pp. 188-189.
[19] B. Razavi, Design of Analog CMOS Integrated Circuits. McGraw-Hill, 2001.
[20] I. Young, "Analog mixed-signal circuits in advanced nano-scale cmos technology for microprocessors and socs," in ESSCIRC, 2010 Proceedings of the, Sept 2010, pp. 61-70.
[21] Y. Lee, B. Giridhar, Z. Foo, D. Sylvester, and D. Blaauw, "A 660pw multi-stage temperature-compensated timer for ultra-low-power wireless sensor node synchronization," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2011 IEEE International, Feb 2011, pp. 46-48.
[22] E. Vittoz and J. Fellrath, "Cmos analog integrated circuits based on weak inversion operations," Solid-State Circuits, IEEE Journal of, vol. 12, no. 3, pp. 224-231, Jun 1977.
[23] S. Chatterjee, Y. Tsividis, and P. Kinget, "0.5-v analog circuit techniques and their application in ota and filter design," Solid-State Circuits, IEEE Journal of, vol. 40, no. 12, pp. 2373-2387, Dec 2005.
[24] Y. Kim, D. Sylvester, and D. Blaauw, "Lc2: Limited contention level converter for robust wide-range voltage conversion," in VLSI Circuits (VLSIC), 2011 Symposium on, June 2011, pp. 188-189.
[25] Y.-S. Lin, D. Sylvester, and D. Blaauw, "A 150pw program-and-hold timer for ultra-low-power sensor platforms," in Solid-State Circuits Conference - Digest of Technical Papers, 2009. ISSCC 2009. IEEE International, Feb 2009, pp. 326-327,327a.
[26] D. Lanfranchi, E. Dijkstra, and D. Aebischer, "A microprocessor-based analog wristwatch chip with 3 seconds/year accuracy," in Solid-State Circuits Conference, 1994. Digest of Technical Papers. 41st ISSCC., 1994 IEEE International, Feb 1994, pp. 92-93.
[27] D. Aebischer, H. Oguey, and V. Von Kaenel, "A 2.1 mhz crystal oscillator time base with a current consumption under 500 na," in Solid-State Circuits Conference, 1996. ESSCIRC '96. Proceedings of the 22nd European, Sept 1996, pp. 60-63.
[28] W. Thommen, "An improved low power crystal oscillator," in Solid-State Circuits Conference, 1999. ESSCIRC '99. Proceedings of the 25th European, Sept 1999, pp. 146-149.
[29] D. Yoon, D. Sylvester, and D. Blaauw, "A 5.58nw 32.768khz dll-assisted xo for real-time clocks in wireless sensing applications," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 IEEE International, Feb 2012, pp. 366-368.
[30] K.-J. Hsiao, "17.7 a 1.89nw/0.15v self-charged xo for real-time clock generation," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2014 IEEE International, Feb 2014, pp. 298-299.
[31] X. Zou, X. Xu, L. Yao, and Y. Lian, "A 1-v 450-nw fully integrated programmable biomedical sensor interface chip," Solid-State Circuits, IEEE Journal of, vol. 44, no. 4, pp. 1067-1077, April 2009.
[32] D. Jeon, Y.-P. Chen, Y. Lee, Y. Kim, Z. Foo, G. Kruger, H. Oral, O. Berenfeld, Z. Zhang, D. Blaauw, and D. Sylvester, "24.3 an implantable 64nw ecg-monitoring mixed-signal soc for arrhythmia diagnosis," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2014 IEEE International, Feb 2014, pp. 416-417.
[33] B. Hershberg, S. Weaver, K. Sobue, S. Takeuchi, K. Hamashita, and U.-K. Moon, "Ring amplifiers for switched capacitor circuits," Solid-State Circuits, IEEE Journal of, vol. 47, no. 12, pp. 2928-2942, Dec 2012.
[34] Y. Chai and J.-T. Wu, "A 5.37mw 10b 200ms/s dual-path pipelined adc," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 IEEE International, Feb 2012, pp. 462-464.
[35] B. Verbruggen, M. Iriguchi, and J. Craninckx, "A 1.7mw 11b 250ms/s 2× interleaved fully dynamic pipelined sar adc in 40 nm digital cmos," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 IEEE International, Feb 2012, pp. 466-468.
[36] P. Harpe, E. Cantatore, and A. van Roermund, "11.1 an oversampled 12/14b sar adc with noise reduction and linearity enhancements achieving up to 79.1 db sndr," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2014 IEEE International, Feb 2014, pp. 194-195.
[37] P. Nuzzo, C. Nani, C. Armiento, A. Sangiovanni-Vincentelli, J. Craninckx, and G. Van der Plas, "A 6-bit 50-ms/s threshold configuring sar adc in 90-nm digital cmos," Circuits and Systems I: Regular Papers, IEEE Transactions on, vol. 59, no. 1, pp. 80-92, Jan 2012.
[38] H.-Y. Tai, Y.-S. Hu, H.-W. Chen, and H.-S. Chen, "11.2 a 0.85fj/conversion-step 10b 200ks/s subranging sar adc in 40 nm cmos," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2014 IEEE International, Feb 2014, pp. 196-197.
[39] P. Harpe, C. Zhou, X. Wang, G. Dolmans, and H. de Groot, "A 30fj/conversion-step 8b 0-to-10ms/s asynchronous sar adc in 90 nm cmos," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2010 IEEE International, Feb 2010, pp. 388-389.
[40] J.-H. Tsai, Y.-J. Chen, M.-H. Shen, and P.-C. Huang, "A 1-v, 8b, 40ms/s, 113µw charge-recycling sar adc with a 14µw asynchronous controller," in VLSI Circuits (VLSIC), 2011 Symposium on, June 2011, pp. 264-265.
[41] H. Zhang, Y. Qin, S. Yang, and Z. Hong, "Design of an ultra-low power sar adc for biomedical applications," in Solid-State and Integrated Circuit Technology (ICSICT), 2010 10th IEEE International Conference on, Nov 2010, pp. 460-462.
[42] W. Hu, Y.-T. Liu, T. Nguyen, D. Lie, and B. Ginsburg, "An 8-bit single-ended ultra-low-power sar adc with a novel dac switching method and a counter-based digital control circuitry," Circuits and Systems I: Regular Papers, IEEE Transactions on, vol. 60, no. 7, pp. 1726-1739, July 2013.
[43] D. Zhang, A. Bhide, and A. Alvandpour, "A 53-nw 9.1-enob 1-ks/s sar adc in 0.13-µm cmos for medical implant devices," Solid-State Circuits, IEEE Journal of, vol. 47, no. 7, pp. 1585-1593, July 2012.
[44] H. Tang, Z. Sun, K. Chew, and L. Siek, "A 5.8 nw 9.1-enob 1-ks/s local asynchronous successive approximation register adc for implantable medical device," Very Large Scale Integration (VLSI) Systems, IEEE Transactions on, vol. PP, no. 99, pp. 1-1, 2013.
[45] D. Zhang and A. Alvandpour, "A 3-nw 9.1-enob sar adc at 0.7 v and 1ks/s," in ESSCIRC (ESSCIRC), 2012 Proceedings of the, Sept 2012, pp. 369-372.
[46] Z. Foo, D. Devescery, M. Ghaed, I. Lee, A. Madhavan, Y. S. Park, A. Rao, Z. Renner, N. Roberts, A. Schulman, V. Vinay, M. Wieckowski, D. Yoon, C. Schmidt, T. Schmid, P. Dutta, P. Chen, and D. Blaauw, "A low-cost audio computer for information dissemination among illiterate people groups," Circuits and Systems I: Regular Papers, IEEE Transactions on, vol. 60, no. 8, pp. 2039-2050, Aug 2013.
[47] [Online]. Available: http://www.digikey.com/
[48] J. Dickson, "On-chip high-voltage generation in mnos integrated circuits using an improved voltage multiplier technique," Solid-State Circuits, IEEE Journal of, vol. 11, no. 3, pp. 374-378, Jun 1976.
[49] V. Ng and S. Sanders, "A 92%-efficiency wide-input-voltage-range switched-capacitor dc-dc converter," in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 IEEE International, Feb 2012, pp. 282-284.
[50] M. Wieckowski, G. Chen, M. Seok, D. Blaauw, and D. Sylvester, "A hybrid dc-dc converter for sub-microwatt sub-1v implantable applications," in VLSI Circuits, 2009 Symposium on, June 2009, pp. 166-167.
[51] S. Jeong, Z. Foo, Y. Lee, J.-Y. Sim, D. Blaauw, and D. Sylvester, "A fully-integrated 71 nw cmos temperature sensor for low power wireless sensor nodes," Solid-State Circuits, IEEE Journal of, vol. 49, no. 8, pp. 1682-1693, Aug 2014.

