# An Ultra Low Power DLL with Operating Range from 500 kHz, 117 nW to 166 MHz, 20 uW

Yanqing Zhang University of Virginia yanqing@virginia.edu

Abstract—In this paper, we describe an ultra low power DLL suitable for ultra low power applications such as multiple clock phase generation for ultra low power SoCs and pulse generation in low power timing schemes. The ADDLL features a current starved VCDL and aside from the VCDL, all other components are synthesizable. Designed in a 45nm PTM, simulation results show its ultra low power operation from 166 MHz at 0.5V supply consuming 20 uW, down to 500 kHz at 0.3V supply consuming 117 nW. Jitter is controlled to <5% of the clock period at our target main frequency of 100 MHz.

Keywords-DLL, ultra low power, pulse generation

## I. INTRODUCTION

For decades, DLLs have been recognized as reliable circuits with widespread applications such as clock synchronization and de-skewing[1][2] and memory timing optimization[3]. However, with the advent of ultra low power circuits[4], the design space for an ultra low power DLL has not been explored in depth. DLLs are important in the ultra low power regime because they produce multiple phase clocks for SoCs and can provide the pulse generation needed for various near-threshold and sub-threshold timing techniques such as latch-based timing and time borrowing[5]. To the best of the author's knowledge, existing DLLs consume so much power that using one for these purposes would defeat the purpose of ultra low power design, as Table I shows. Therefore, a low power and reliable DLL is much needed.

TABLE I. A SAMPLE OF DLL POWER CONSUMPTION IN LITERATURE

| Ref. | Frequency | Power  |
|------|-----------|--------|
| [6]  | 300 MHz   | 70 mW  |
| [7]  | 150 MHz   | 36 mW  |
| [2]  | 133 MHz   | 30 mW  |
| [3]  | 100 MHz   | 0.3 mW |

## II. PROPOSED DLL DESIGN

### A. Design Specifications

The ADDLL presented in [2] consumes 300uW operating at 100MHz and, because it has the lowest power at frequencies close to the ultra low power regime, served as a design guideline. Thus, the main frequency was chosen to be 100MHz. It is anticipated that the full range of low power frequencies (10kHz-100MHz, as exemplified in [8][9]) can be reached through voltage scaling. Given the application space of SoCs and pulse generation for timing optimization, the jitter constraint was set to <5% of the clock period, since in the author's experience clock uncertainties above this level quickly lead to unreasonable increases in power, commonly in the form of buffer insertion or logic resizing. The power constraint was set to <50uW, to ensure at least an order of magnitude in power savings from the guideline[3]. Further consideration of the application space adds false lock prevention and portability as two additional constraints. Table IV summarizes these constraints.

### B. Architecture of DLL

Digital circuits provide great opportunities for power scaling, since quadratic savings in dynamic power are achieved simply by reducing the supply voltage[9]. Common all-digital DLL (ADDLL) architectures do not reap the benefits of voltage scaling because they operate the circuit at the nominal supply voltage[2][7]. Thus, we chose an ADDLL architecture in which the supply voltage is scaled down. To make the entire loop fully digital, a digital control word to the VCDL, generated by a digital counter, replaces the analog control voltage generated by a charge pump and loop filter. A bang-bang phase detector (PD) was chosen, as is common in DLL circuits, because it is inherently digital.
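The quadratic dependence follows from the standard first-order expression for switching power. As a rough illustration only, assume a nominal supply of 1.0 V for comparison (an assumed value, not one stated in this paper) and hold the activity factor $\alpha$, switched capacitance $C$, and clock frequency $f$ fixed:

$$
P_{\rm dyn} = \alpha C V_{\rm DD}^{2} f
\qquad\Rightarrow\qquad
\frac{P_{\rm dyn}(0.5\,\mathrm{V})}{P_{\rm dyn}(1.0\,\mathrm{V})}
= \left(\frac{0.5}{1.0}\right)^{2} = 0.25
$$

This estimate ignores leakage power and the lower maximum operating frequency at reduced supply voltage, so it should be read as a trend rather than a prediction.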

Careful consideration was given to the design of the VCDL, which in ADDLLs is mostly inverter chain based. The VCDLs in ADDLLs such as [2][7] tend to have more than 50 inverter stages, and the loop locks by choosing the stage that supplies the desired delay. However, this method consumes considerable power. We observed that the number of inverter stages need only be sufficient to achieve the desired phase resolution, and that the delay through the VCDL can be controlled by the current supplied to the inverters instead of by the number of stages. This is easily accomplished in digital circuits through header/footer insertion, much like in a DVFS configuration. The digital control word, in turn, controls the on/off switching of the headers/footers. Thus, power can be saved by supplying just enough current to the delay line for the desired amount of delay. Fig. 1 shows a block diagram of the proposed architecture.



Figure 1. Architecture of proposed DLL.

A reset input to the digital counter was included, forcing the delay line to start at its smallest delay value and increment to the desired value, which addresses the false lock issue. Aside from the delay line, all other blocks are easily synthesizable for portability.
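The loop behavior described above can be sketched with a small behavioral model. This is only an illustration under stated assumptions: the linear delay model, the constants `d_min_ps` and `step_ps`, and the up/down sign convention are hypothetical and are not taken from the circuit. The sketch shows a counter that starts from reset at the smallest delay, counts up until the delay first reaches the reference period, and then dithers by one code.

```python
def vcdl_delay_ps(word, d_min_ps=2600, step_ps=250):
    """Hypothetical linear delay model: a larger word means less current and more delay."""
    return d_min_ps + step_ps * word


def bang_bang_pd(delay_ps, t_ref_ps):
    """Return +1 (add delay) if the delayed edge arrives early, otherwise -1."""
    return 1 if delay_ps < t_ref_ps else -1


def simulate_lock(t_ref_ps, cycles, bits=6):
    """Start from reset (word 0, smallest delay) and record the control word each cycle."""
    word = 0
    max_word = (1 << bits) - 1
    history = []
    for _ in range(cycles):
        history.append(word)
        step = bang_bang_pd(vcdl_delay_ps(word), t_ref_ps)
        word = min(max(word + step, 0), max_word)  # saturating up/down counter
    return history


history = simulate_lock(t_ref_ps=10_000, cycles=40)
print(history[-4:])  # [30, 29, 30, 29]
```

Because the delay only grows from its minimum after reset, the first lock point this model reaches is the one closest to the minimum delay, which is the intuition behind using reset to avoid false lock.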

### C. Block Design

Since the design scales to ultra low voltages, static CMOS was chosen as the logic style for all synthesized gates in the PD and counter. A further design consideration is that the setup and hold times of the bang-bang PD register should be optimized to increase the resolution and decrease the dithering jitter. This is done by logical effort sizing of the two gates in the register, as depicted in Fig. 2.



Figure 2. Schematic of PD with gates that should be optimized circled.

A diagram of the VCDL is shown in Fig. 3; its replica delay line configuration is inspired by [2]. The top delay line generates the signal that locks to the reference and accounts for 0°-180° of delay. The bottom line is the complement of the top and provides 180°-360° of delay. The weak latches reduce jitter induced by process variations, as they pull the delays of the top and bottom delay lines closer to equal. In this design the phase resolution is 60°, so 2 latches were inserted.
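As a quick check of the phase arithmetic, assuming the output phases are evenly spaced (an assumption about the structure in Fig. 3), a 60° resolution at the 10 ns reference period corresponds to:

$$
N_{\rm phases} = \frac{360^\circ}{60^\circ} = 6, \qquad
\Delta t = \frac{T_{\text{ref\_clk}}}{N_{\rm phases}} = \frac{10\,\mathrm{ns}}{6} \approx 1.67\,\mathrm{ns}
$$

Under the same assumption, each of the two delay lines spans $180^\circ / 60^\circ = 3$ phase steps.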



Figure 3. Diagram of VCDL.

Sizing of the inverters and headers was key to ensuring correct locking and an acquisition range around our main frequency. Simulations showed a $V_{\rm DD}$ of 0.5V to be the minimum for 100MHz locking, and the 6 bit digital control word was anticipated to be around 100000 when in lock. From there, the inverters were sized to sink the maximum current supplied by the headers when the control word equals 000000. Header lengths were sized up to decrease leakage, and footers were sized to match the slew rates of transitions of different polarity. Weak latches were sized so that the nodes connected to the latches passed a 500 point Monte-Carlo simulation. The sizing of the transistors is summarized in Table II.

TABLE II. TRANSISTOR SIZES IN VCDL

| Inv PMOS   | W/L=670/50nm  | Inv NMOS   | W/L=540/50nm |
|------------|---------------|------------|--------------|
| Header     | W/L=120/800nm | Footer     | W/L=90/800nm |
| Latch PMOS | W/L=270/50nm  | Latch NMOS | W/L=90/50nm  |

## III. SIMULATION RESULTS



Figure 4. Transient simulation showing the DLL starting out of lock, then attaining lock state.  $T_{\text{ref\_clk}}$ =10ns.

Fig. 4 shows a transient simulation of the DLL locking to 100MHz at $V_{\rm DD}$=0.5V. The DLL consumes 15 uW, has 230ps of deterministic dithering jitter, takes 30 clock cycles to acquire lock, and has an average control word of 011110 in lock. Fig. 5(a) and (b) show how power scales with frequency, which is an important feature of the DLL. They show that the DLL is capable of operating across a wide range of ultra low power frequencies, from 500kHz ($V_{DD}$=0.3V) to 166MHz ($V_{DD}$=0.5V).

Figure 5. Frequency vs. power trends showing continued scalability for the full frequency range of 166MHz to 500 kHz from (a) 0.5V supply voltage down to (b) 0.3V supply voltage.

Three main contributors to jitter were identified for this DLL design: dithering jitter, caused by the resolution of the bang-bang PD; supply noise sensitivity, caused by the fully digital architecture; and process variations, exacerbated by scaling to ultra low voltages.
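As a side calculation on the operating points quoted in this paper, the reported power and frequency pairs can be converted to average energy per reference clock cycle. The sketch below uses only the three (power, frequency) pairs stated in the text; the function and dictionary names are illustrative.

```python
def energy_per_cycle_pj(power_w, freq_hz):
    """Average energy drawn per reference clock cycle, in picojoules."""
    return power_w / freq_hz * 1e12


# (power in W, frequency in Hz) as reported in the text
operating_points = {
    "166 MHz @ 0.5 V": (20e-6, 166e6),
    "100 MHz @ 0.5 V": (15e-6, 100e6),
    "500 kHz @ 0.3 V": (117e-9, 500e3),
}

energies = {
    name: energy_per_cycle_pj(p, f) for name, (p, f) in operating_points.items()
}
for name, e in energies.items():
    print(f"{name}: {e:.3f} pJ/cycle")
```

On these numbers the energy per cycle is higher at 500 kHz than at 100 MHz even though the power is far lower, which would be consistent with leakage taking a larger share at low frequency.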

Figures 6(a) and 6(b) show the characteristics of dithering jitter and suggest that, for a given locking frequency, the lowest available $V_{DD}$ should be used. For the same frequency, dithering jitter decreases as $V_{DD}$ decreases because the resolution of the bang-bang PD is not affected by voltage scaling, while the resolution of the current injected into the delay line becomes finer. For a fixed $V_{DD}$, jitter as a percentage of the cycle period increases at lower frequencies because of leakage current: at the lower frequencies for a given $V_{DD}$, most of the injected current is leakage, so the dithering effects, which change the amount of active current, are more prominent. For these reasons, the lowest $V_{DD}$ for a known locking frequency is most appropriate for minimizing dithering jitter.



Figure 6. Dithering jitter analysis for (a) the same locking frequency across different  $V_{\rm DD}$ , and (b) different frequencies at set  $V_{\rm DD}$ .

Table III shows that the DLL is sensitive to supply noise. In the worst case, where the supply is generated by a DC-DC converter ($10\%\ V_{DD}$, 10%f noise), jitter quickly increases to unacceptable levels. In fact, at $10\ MHz$ and $V_{DD}=0.4V$, the loop fails to lock, because the duty cycle distortion caused by the supply noise is so high that the bang-bang PD always outputs a 'down' signal. Fortunately, when the supply comes from a regulator ($0.67\%\ V_{DD}$, 10%f), the jitter caused by supply noise stays on the same order as the dithering jitter.

TABLE III. DLL SUPPLY NOISE JITTER

| $f@V_{\rm DD}$ | Dither Jitter | $V_{\rm DD}$ Noise         | Jitter w/ $V_{\rm DD}$ Noise |
|----------------|---------------|----------------------------|------------------------------|
| 100MHz@0.5V    | 230ps         | $10\%\ V_{\rm DD}, 10\% f$   | 1.99ns                       |
| 10MHz@0.4V     | 6.5ns         | $10\%\ V_{\rm DD}, 10\% f$   | Fails to lock                |
| 3MHz@0.3V      | 8ns           | $10\%\ V_{\rm DD}, 10\% f$   | 223ns                        |
| 100MHz@0.5V    | 230ps         | $0.67\%\ V_{\rm DD}, 10\% f$ | 373ps                        |
| 10MHz@0.4V     | 6.5ns         | $0.67\%\ V_{\rm DD}, 10\% f$ | 10.4ns                       |
| 3MHz@0.3V      | 8ns           | $0.67\%\ V_{\rm DD}, 10\% f$ | 20ns                         |
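To make the entries of Table III easier to compare across frequencies, the jitter values can be expressed as a percentage of the clock period. The sketch below simply restates the table values; the variable and function names are illustrative, and `None` marks the case that fails to lock.

```python
def percent_of_period(jitter_s, freq_hz):
    """Jitter expressed as a percentage of the clock period 1/f."""
    return 100.0 * jitter_s * freq_hz


# (frequency in Hz, dither jitter in s, jitter with supply noise in s, noise case)
table_iii = [
    (100e6, 230e-12, 1.99e-9, "10% VDD"),
    (10e6, 6.5e-9, None, "10% VDD"),  # fails to lock
    (3e6, 8e-9, 223e-9, "10% VDD"),
    (100e6, 230e-12, 373e-12, "0.67% VDD"),
    (10e6, 6.5e-9, 10.4e-9, "0.67% VDD"),
    (3e6, 8e-9, 20e-9, "0.67% VDD"),
]

report = []
for freq_hz, dither_s, noisy_s, noise in table_iii:
    dither_pct = round(percent_of_period(dither_s, freq_hz), 2)
    noisy_pct = None if noisy_s is None else round(percent_of_period(noisy_s, freq_hz), 2)
    report.append((freq_hz, noise, dither_pct, noisy_pct))

for row in report:
    print(row)
```

At 100 MHz with the regulated supply this gives 3.73% of the period, while the DC-DC converter case gives 19.9%. Note that the <5% target in this paper is stated for the 100 MHz main frequency.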

3000 point Monte-Carlo simulations were performed to analyze the effects of process variations (though it was later found that the statistical models are problematic). Although the DLL has higher variation-induced jitter ($\sigma_{DLL}=365ps$) than a conventional static CMOS inverter chain ($\sigma_{inv}=214ps$), it has fewer delay outliers (1 vs. 8), where an outlier is defined as a pulse width $<20\%$ or $>80\%$ of the clock period. This suggests that the DLL is still the superior circuit for pulse generation, as it is the outliers that are detrimental to pulse generation.
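The outlier criterion above is simple to state in code. The following is a minimal sketch; the sample pulse widths are hypothetical values made up for illustration and are not data from the Monte-Carlo runs.

```python
def count_outliers(pulse_widths, period, low=0.20, high=0.80):
    """Count pulse widths below 20% or above 80% of the clock period."""
    return sum(1 for w in pulse_widths if w < low * period or w > high * period)


# Hypothetical pulse widths in ns for a 10 ns clock (not data from the paper)
samples_ns = [5.0, 4.6, 5.3, 1.8, 8.4, 5.1, 2.1, 7.9]
print(count_outliers(samples_ns, period=10.0))  # 2
```

Here only the 1.8 ns and 8.4 ns samples fall outside the 2 ns to 8 ns window, so the count is 2.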

A cross examination of the jitter in the DLL leads to the conclusion that the three main contributors carry about the same weight in the total jitter. Therefore, future optimizations targeting any one source could yield a significant improvement.

## IV. CONCLUSIONS

Table IV compares our DLL with the reference guideline and with our design specifications. The DLL meets the design specifications (process variation is not accounted for due to problematic statistical models) and exhibits the advantages of easy digital integration and ultra low power with acceptable jitter.

TABLE IV. COMPARISON OF DLL WITH REFERENCE AND DESIGN SPECIFICATIONS

|           | P2P jitter     | Main f | Power | False lock prevention | Portability |
|-----------|----------------|--------|-------|-----------------------|-------------|
| Spec      | $<5\% T_{clk}$ | 100MHz | <50uW | Yes                   | Yes         |
| [3]       | 30 ps          | 100MHz | 300uW | N/A                   | Yes         |
| This work | 373ps          | 100MHz | 15uW  | Yes                   | Yes         |
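The comparison in Table IV can be checked with a few lines of arithmetic. The sketch below uses only values from the table and the 10 ns period at 100 MHz; the dictionary keys and variable names are illustrative.

```python
T_CLK_PS = 10_000  # clock period at the 100 MHz main frequency, in ps

spec = {"jitter_ps": T_CLK_PS * 5 / 100, "power_uw": 50.0}  # <5% of T_clk, <50 uW
this_work = {"jitter_ps": 373.0, "power_uw": 15.0}
reference_power_uw = 300.0  # reference design listed in Table IV

meets_jitter = this_work["jitter_ps"] < spec["jitter_ps"]
meets_power = this_work["power_uw"] < spec["power_uw"]
power_reduction = reference_power_uw / this_work["power_uw"]

print(meets_jitter, meets_power, power_reduction)  # True True 20.0
```

The 373 ps figure sits under the 500 ps budget, and the reported 15 uW is a 20x reduction relative to the 300 uW reference.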

## V. REFERENCES

- [1] E. Song, S.-W. Lee, J.-W. Lee, J. Park, and S.-I. Chae, "A reset-free anti-harmonic delay-locked loop using a cycle period detector," IEEE Journal of Solid-State Circuits, vol. 39, pp. 2055-2061, Nov. 2004.
- [2] J.-S. Wang, Y-M. Wang, C.-H. Chen, and Y.-C. Liu, "An ultra-low-power fast-lock-in small-jitter all-digital DLL," 2005 IEEE International Solid-State Circuits Conference, pp. 422-424, Feb. 2005.
- [3] B.W. Garlepp, K. S. Donnelly, J. Kim, P.S. Chau, J.L. Zerbe, C. Huang, C.V. Tran, C.L. Portmann, Y.-F. Chan, T.H. Lee, and M. A. Horowitz, "A portable digital DLL for high-speed CMOS interface circuits," IEEE Journal of Solid-State Circuits, vol. 34, pp. 632-644, May 1999.
- [4] B.H. Calhoun, A. Wang, and A.P. Chandrakasan, "Modeling and sizing for minimum energy operation in subthreshold circuits," IEEE Journal of Solid-State Circuits, vol. 40, no. 9, pp. 1778-1786, Sept. 2005.
- [5] M. Wieckowski, Y.M. Park, C. Tokunaga, D.W. Kim, Z. Foo, D. Sylvester, and D. Blaauw, "Timing yield enhancement through soft edge flip-flop based design," IEEE Custom Integrated Circuits Conference, pp. 543-546, 2008.
- [6] Y. Koo, J.-Y. Park, J. Park, and W. Kim, "A 4-400 MHz jitter-suppressed delay-locked loop with frequency division method," 1999 International Conference on VLSI and CAD, pp. 339-341, Oct. 1999.
- [7] H.-H. Chang, C.-H. Sun, and S.-I. Liu, "A low-jitter and precise multiphase delay-locked loop using shifted averaging VCDL," 2003 IEEE International Solid-State Circuits Conference, pp. 434-435, Feb. 2003.
- [8] IEEE 802.15.3 Standard Protocol Working Group
- [9] G.Z. Yang, "Body Sensor Networks," Springer 2006