# Short Papers

# An Analytical Delay Model for RLC Interconnects

Andrew B. Kahng and Sudhakar Muddu

Abstract: Elmore delay has been widely used to estimate interconnect delays in the performance-driven synthesis and layout of very-large-scale-integration (VLSI) routing topologies. For typical *RLC* interconnections, however, Elmore delay can deviate significantly from SPICE-computed delay, since it is independent of the inductance of the interconnect and of the rise time of the input signal. Here, we develop an analytical delay model based on first and second moments to incorporate inductance effects into the delay estimate for interconnection lines under step input. Delay estimates using our analytical model are within 15% of SPICE-computed delay across a wide range of interconnect parameter values. We also extend our delay model for estimation of source-sink delays in arbitrary interconnect trees. We observe significant improvement in the accuracy of delay estimates for interconnect trees when compared to the Elmore model, yet our estimates are as easy to compute as Elmore delay. Evaluation of our analytical models is several orders of magnitude faster than simulation using SPICE. We also illustrate the application of our model in controlling response undershoot/overshoot and reducing interconnect delay through constraints on the moments.

#### I. INTRODUCTION

Accurate calculation of propagation delay in very-large-scale-integration (VLSI) interconnects is critical to the design of high-speed systems. With the evolution of VLSI technology, transmission-line effects now play an important role in determining interconnect delays and system performance. Various techniques have been proposed for the delay analysis of interconnects. These techniques are based on either simulation or (closed-form) analytical formulas. Simulation tools such as SPICE give the most accurate insight into arbitrary interconnect structures but are computationally expensive. Transient simulation of lossy interconnects based on convolution techniques is presented in [10] and [15]. Faster techniques based on moment computations are proposed in [13], [14], [19], and [22]. Since these methods are too expensive to be used during iterative layout optimization, the Elmore delay [2] approximation (which represents the first moment of the transfer function) is the most widely used delay model in the performance-driven design of clock distribution and Steiner global routing topologies. However, Elmore delay cannot accurately estimate the delay for RLC interconnect lines, i.e., the representation for interconnects whose inductive impedance cannot be neglected [4]; this inaccuracy is harmful to current performance-driven routing methods, which try to optimize interconnect segment lengths and widths (as well as driver and buffer sizes). Previous moment-based approaches (e.g., [10], [11], and [13]) compute the delay estimates only from a simulated response but not from an analytical formula.

To see the effect of inductive impedance on the response, consider a two-port model for an interconnect driven by a step input with finite

Manuscript received September 20, 1994; revised December 28, 1995. This work was supported by the National Science Foundation under Grant MIP-9257982. This manuscript was recommended by Associate Editor C.-K. Cheng.

A. B. Kahng is with the Computer Science Department, University of California, Los Angeles, CA 90095-1596 USA (e-mail: abk@cs.ucla.edu). S. Muddu is with Silicon Graphics, Inc., Mountain View, CA 94039 USA (e-mail: muddu@mti.sgi.com).

Publisher Item Identifier S 0278-0070(97)09370-6.

source impedance. Fig. 1 compares the RC and RLC line responses computed by SPICE3e: 90% threshold delay is 288 ps for the RLC model but is 358 ps for the RC model. Elmore delay, which does not depend on line inductance, will yield the same delay estimate of 386 ps for both the RC and the RLC cases. More generally, the Elmore delay formula gives good estimates if the interconnect lines are RC or overdamped but gives overestimates for RLC or underdamped interconnects.

This paper gives new analytical delay models for distributed RLC interconnects under step input to incorporate inductance effects into the delay estimate. Though we consider step input in deriving the delay models, a similar approach can be applied to develop delay models under ramp input. The proposed delay model is based on both the first and second moments of the interconnect transfer function. To experimentally validate our analysis and delay formula, we model VLSI interconnect lines having various combinations of source and load parameters and obtain delay estimates from SPICE, Elmore delay, and the proposed analytical delay model. The delay estimate using SPICE is extracted from a computed response at the desired node, whereas the other two models are analytical (closed-form) expressions. Over our range of test cases, Elmore delay estimates can deviate by as much as 50% from the SPICE-computed delays, while our analytical delay model estimates are within 15% of SPICE delays. We also extend our delay model to estimate source-sink delays in arbitrary interconnect trees. For the small tree topology considered, delay estimates using our analytical model are within 15% of SPICE-computed delays. Elmore delay estimates vary by as much as 100% from the SPICE-computed delays. Since our analytical model has the same time complexity as the Elmore model, we believe that it can be useful in present-day performance-driven routing methodologies.

Our paper is organized as follows. In Section II, we discuss previous analytical delay models for distributed interconnect lines. Section III presents a new analytical delay model for a distributed RLC line. In Section IV, we extend our delay model for interconnection trees. Last, Section V explains minimization of delay by allowing small ringing.

#### II. PREVIOUS ANALYTICAL DELAY MODELS

The transfer function of an RLC interconnect line with source and load impedance (Fig. 2) can be obtained using the ABCD parameters [1], as shown in (1), where $\theta = \sqrt{(r+sl)sc}$ is the propagation constant and $Z_0 = \sqrt{(R+sL)/(sC)}$ is the characteristic impedance. Here r = R/h, l = L/h, and c = C/h are the resistance, inductance, and capacitance per unit length, respectively, and h is the length of the line. To compute the RLC line response from the transfer function, the method of Padé approximation has been used by, e.g., [11] and [12]. The output transfer function is expanded into a Maclaurin series of s around s = 0, and the series is truncated to the desired order.<sup>1</sup> In general, analytical computation of the exact voltage response is very tedious, and the result is usually in the form of an infinite series.

Efficient delay estimates for RC lines are typically derived by considering a single interconnect line with resistive source and

<sup>1</sup>The work of [10] used a recursive convolution-based approach and expanded the admittance and the propagation coefficient term around  $s = \infty$ .



Fig. 1. Comparison of SPICE3e responses at the end of an interconnect line driven by a step input and terminated with a capacitive load using both RC and RLC two-port models. The 90% threshold delay for the RLC model is 288 ps, and for the RC model the delay is 358 ps. The driver resistance is 10.0  $\Omega$  and the load capacitance at the end of the line is 2.0 pF. The interconnect line parameters are  $R = 0.075 \ \Omega/\mu m$ ,  $L = 0.123 \ pH/\mu m$ ,  $C = 8.8 \ fF/\mu m$ ; the length of the line is 400  $\mu m$ .



Fig. 2. Two-port model of a distributed RLC line with source impedance  $Z_S$  and load impedance  $Z_T$ .

capacitive load impedances; delay formulas for an interconnect tree entail recursive application of the formula for a single line. The analytical Elmore delay [2] estimate, Sakurai's heuristic delay formula [17], [18], and single-pole delay estimates of [3] have been widely used.

• Elmore delay is defined to be the first moment of the system impulse response, i.e., the coefficient of s or the first moment in the system transfer function H(s). Applying this definition to H(s) in (1) and considering a source resistance  $R_S$  and a capacitive load  $C_T$ , the Elmore delay for a distributed RC or RLC line model is

$$T_{\rm ED} = R_S(C + C_T) + R\left(\frac{C}{2} + C_T\right). \tag{2}$$

By considering only one pole in the transfer function, i.e., truncating the denominator polynomial after the first-moment term, the single-pole response can be obtained as in [3]. The magnitude of the single pole is equal to the inverse of the Elmore delay $T_{\rm ED}$. Hence, the delay at arbitrary thresholds of the single-pole response can be directly related to Elmore delay (Elmore delay actually corresponds to the 63.2% threshold voltage of the single-pole response). For example, the delay at the 90% threshold voltage is

$$T_{0.9} = 2.3 \cdot T_{\rm ED} = 1.15 RC + 2.3 (R_S(C + C_T) + RC_T). \tag{3}$$

• Sakurai [17] also gave response and delay calculations for the distributed RC line. He calculates the time-domain response from the transfer function using the Heaviside expansion over poles of the transfer function. He then approximates the response using a single pole and observes the variation of delay with respect to source and load parameters; a 90% threshold delay estimate is *heuristically* obtained as

$$T_{0.9} = 1.02RC + 2.3(R_S(C + C_T) + RC_T)$$

Note that Sakurai's heuristic delay formula is almost identical to the Elmore-based estimate (3). In this paper, to compute the 90% threshold delay according to the Elmore model, we apply (3). Since these single-pole delay estimates cannot accurately estimate delay for RLC interconnects, Zhou *et al.* [22] proposed a two-pole approximation for the transfer function to compute the *response* at the load for RLC interconnection trees. However, the response computation does not provide any analytical expression for delay; it is also too time-consuming to be used in iterative optimization of layout. Recently, [9] proposed to improve the Elmore delay model by using higher order moments; this work gave a heuristic net delay model equal to the sum of the first moment $(M_1)$ and its standard deviation.<sup>2</sup>
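As a quick numerical check of the single-pole estimates above, the following sketch evaluates (2), (3), and Sakurai's heuristic for the line of Fig. 1. The function names are illustrative and do not come from the paper; the total line resistance and capacitance are assumed to be the per-unit-length values of the Fig. 1 caption multiplied by the 400 μm length.

```python
import math

def elmore_delay(r_s, c_t, r, c):
    """First moment T_ED of eq. (2); r and c are TOTAL line resistance and capacitance."""
    return r_s * (c + c_t) + r * (c / 2.0 + c_t)

def single_pole_delay(t_ed, v_th=0.9):
    """Time at which the single-pole response 1 - exp(-t / T_ED) reaches v_th."""
    return -math.log(1.0 - v_th) * t_ed

def sakurai_delay_90(r_s, c_t, r, c):
    """Sakurai's heuristic 90% threshold delay as quoted in the text."""
    return 1.02 * r * c + 2.3 * (r_s * (c + c_t) + r * c_t)

# Parameters from the Fig. 1 caption (SI units).
h = 400.0                 # line length in micrometers
R = 0.075 * h             # total line resistance, ohms
C = 8.8e-15 * h           # total line capacitance, farads
R_S, C_T = 10.0, 2.0e-12  # driver resistance and load capacitance

t_ed = elmore_delay(R_S, C_T, R, C)
t90_single_pole = single_pole_delay(t_ed)        # uses ln(10) rather than the rounded 2.3
t90_sakurai = sakurai_delay_90(R_S, C_T, R, C)
```

With these inputs the first moment is 168 ps, and 2.3 times that value gives the 386 ps Elmore estimate quoted for Fig. 1. Note that neither estimate depends on the line inductance.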

<sup>2</sup>Standard deviation is equal to  $\sigma = \sqrt{|M_1^2 - M_2|}$ . In the early drafts of our paper [6], we also considered exactly the same model; however, it turns out that this model is not accurate for various source and load parameters, as discussed in detail by [6]. The full version of our paper [6] studies various combinations of first and second moments, of which the analytical model described here performs best.

$$H(s) = \frac{V_2(s)}{V_0(s)} = \frac{1}{\left[\cosh(\theta h) + \frac{Z_S}{Z_0}\sinh(\theta h)\right] + \frac{1}{Z_T}[Z_0\sinh(\theta h) + Z_S\cosh(\theta h)]} \tag{1}$$



Fig. 3. Two-port model of a distributed RLC line with resistive and inductive source impedance and capacitive load impedance.

#### III. A NEW ANALYTICAL DELAY MODEL

We now develop a simple closed-form delay estimate, based on first and second moments, which considers the effect of inductance. To our knowledge, this is the first analytical delay model that handles arbitrary threshold voltages and inductance effects for a distributed line. We give experimental confirmation via 90% threshold delay estimates,<sup>3</sup> which we compare against SPICE output.<sup>4</sup>

We model an arbitrary interconnect line as follows: 1) the source is modeled as a resistive and inductive impedance  $(Z_S = R_S + sL_S)$ and 2) the load at the end of the interconnect line is modeled using capacitive impedance. Thus, the transfer function for the interconnect line of Fig. 3 is

$$H(s) = \frac{1}{\cosh(\theta h) \left(1 + \frac{Z_S}{Z_T}\right) + \sinh(\theta h) \left(\frac{Z_S}{Z_0} + \frac{Z_0}{Z_T}\right)}$$

where  $Z_T = 1/(sC_T)$ ,  $Z_S = R_S + sL_S$ ,  $Z_0 = \sqrt{(R + sL)/(sC)}$ , and  $\theta h = \sqrt{(R + sL)sC}$ . We truncate this transfer function by expanding the hyperbolic functions around s = 0; expansion around  $s = \infty$  is not necessary since we consider only the first few coefficients of the transfer function. Expanding cosh and sinh as infinite series and collecting terms up to the coefficient of  $s^2$  in the denominator, we obtain the truncated transfer function

$$H(s)\approx \frac{1}{1+sb_1+s^2b_2}$$

with coefficients

$$b_{1} = R_{S}C + R_{S}C_{T} + \frac{RC}{2} + RC_{T}$$

$$b_{2} = \frac{R_{S}RC^{2}}{6} + \frac{R_{S}RCC_{T}}{2} + \frac{(RC)^{2}}{24} + \frac{R^{2}CC_{T}}{6} + L_{S}C + L_{S}C_{T} + \frac{LC}{2} + LC_{T}. \tag{4}$$

Note that the first and second moments of the transfer function can be obtained from the coefficients  $b_1$  and  $b_2$ , i.e.,  $M_1 = b_1$  and  $M_2 = b_1^2 - b_2$ . We use the coefficient notation  $b_1, b_2$  and the moment notation  $M_1, M_2$  interchangeably according to the simplicity of the expression. Depending on the sign of  $b_1^2 - 4b_2$ , the poles of the transfer function can be either real or complex. We separately derive our delay model from the two-pole response for each of these cases.
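The coefficient formulas in (4) and the real/complex test on $b_1^2 - 4b_2$ can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the two numeric cases assume total line values obtained by multiplying the per-unit-length parameters of Tables I and III by the 100 μm length.

```python
def line_coefficients(r_s, l_s, c_t, r, l, c):
    """Return (b1, b2) of eq. (4); r, l, c are TOTAL line values in SI units."""
    b1 = r_s * c + r_s * c_t + r * c / 2.0 + r * c_t
    b2 = (r_s * r * c**2 / 6.0 + r_s * r * c * c_t / 2.0
          + (r * c)**2 / 24.0 + r**2 * c * c_t / 6.0
          + l_s * c + l_s * c_t + l * c / 2.0 + l * c_t)
    return b1, b2

def moments(b1, b2):
    """First and second moments, M1 = b1 and M2 = b1^2 - b2."""
    return b1, b1**2 - b2

def pole_type(b1, b2):
    """Classify the poles of 1 / (1 + b1 s + b2 s^2) by the sign of b1^2 - 4 b2."""
    disc = b1**2 - 4.0 * b2
    if disc > 0.0:
        return "real"
    if disc < 0.0:
        return "complex"
    return "double"

# Line of Tables I and III: 100 um at 0.015 ohm/um, 0.246 pH/um, 0.176 fF/um.
R, L, C = 1.5, 24.6e-12, 17.6e-15

# First row of Table I: R_S = 50 ohm, L_S = 2.46 pH, C_T = 0.176 pF.
b1_a, b2_a = line_coefficients(50.0, 2.46e-12, 0.176e-12, R, L, C)

# First row of Table III: R_S = 10 ohm, L_S = 0.0246 pH, C_T = 0.0176 pF.
b1_b, b2_b = line_coefficients(10.0, 0.0246e-12, 0.0176e-12, R, L, C)
```

For the first case the discriminant is positive (overdamped), and for the second it is negative (underdamped), which agrees with the tables these rows are taken from.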

#### A. Real Poles

The two-pole methodology [6], [22] yields the following response for the case of real poles:

$$v(t) = V_0 \left( 1 - \frac{s_2}{s_2 - s_1} e^{s_1 t} + \frac{s_1}{s_2 - s_1} e^{s_2 t} \right)$$

<sup>3</sup>Our analytical model extends to any delay threshold; we simply give the derivation for the 90% delay threshold.

<sup>4</sup>SPICE simulation results are obtained using SPICE3 and the built-in lossy transmission line (LTRA) model, which is based on convolution techniques [15].

where

$$s_{1,2} = \frac{2}{-M_1 \pm \sqrt{4M_2 - 3M_1^2}} = \frac{-b_1 \pm \sqrt{b_1^2 - 4b_2}}{2b_2}.$$

The condition for the poles to be real is  $(4M_2 - 3M_1^2) = (b_1^2 - 4b_2) \ge 0$ . Since

$$s_2 - s_1 = -\frac{\sqrt{b_1^2 - 4b_2}}{b_2}$$

is negative, the coefficients  $s_2/(s_2 - s_1)$  and  $s_1/(s_2 - s_1)$  are positive. Also, since the magnitude  $|s_2|$  is greater than  $|s_1|$ , the second term in the time-domain response decreases rapidly compared to the first term. Hence, the two-pole response can be approximated (lower-bounded) as

$$v(t) \approx V_0 \left( 1 - \frac{s_2}{s_2 - s_1} e^{s_1 t} \right).$$

Since the voltage is lower-bounded, the delay obtained is an upper bound on the actual delay. The delay  $\tau_r$  (the subscript indicates the case of real poles) at threshold voltage  $v_{\rm th}$  can be obtained via

$$\begin{split} s_{1}\tau_{r} &= \ln\left(\frac{(s_{2} - s_{1})(1 - v_{\text{th}})}{s_{2}}\right) \\ &= -\ln\left(\frac{1}{2(1 - v_{\text{th}})}\left[1 + \frac{b_{1}}{\sqrt{b_{1}^{2} - 4b_{2}}}\right]\right). \end{split}$$

Letting

$$K_r = \ln\left(\frac{1}{2(1-v_{\rm th})} \left[1 + \frac{b_1}{\sqrt{b_1^2 - 4b_2}}\right]\right)$$

we have

$$\tau_r = \frac{K_r}{|s_1|} = K_r \frac{M_1 + \sqrt{4M_2 - 3M_1^2}}{2} = K_r \frac{2b_2}{b_1 - \sqrt{b_1^2 - 4b_2}}$$

i.e.,  $K_r$  is a function of the coefficients  $b_1$  and  $b_2$ . For the wide range of source, load, and interconnect parameter values considered in our simulations (see Tables I and II), we find that  $K_r$  is actually almost a constant: the plot on the left side of Fig. 4 shows the linear regression used to find the value  $K_r = 2.36$ , which gives a very strong fit between SPICE delay values and  $1/(|s_1|)$ . The variation of  $K_r$  with the quantity  $X = (b_2)/(b_1^2)$  is further discussed in [6]. Thus, we use

$$\tau_r = 2.36 \cdot \frac{2b_2}{b_1 - \sqrt{b_1^2 - 4b_2}} = 2.36 \cdot \frac{\left(M_1 + \sqrt{4M_2 - 3M_1^2}\right)}{2}. \tag{5}$$

The resulting delay estimates are compared against those of various other methods in Tables I and II. We see that our analytical delay model gives estimates close to those obtained from SPICE, but as expected, Elmore delay also gives good estimates for this case where the interconnect response is overdamped.
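The real-pole estimate (5) and the closed-form expression for $K_r$ can be compared numerically with a short sketch. The function names are hypothetical; the values of $b_1$ and $b_2$ below are not taken from the paper but were obtained by evaluating (4) for the first row of Table I ($R_S = 50\ \Omega$, $L_S = 2.46$ pH, $C_T = 0.176$ pF), assuming total line values equal to the per-unit-length parameters times 100 μm.

```python
import math

K_R_FIT = 2.36  # regression constant reported in the text for the 90% threshold

def k_r_exact(b1, b2, v_th=0.9):
    """Closed-form K_r from the dominant-pole approximation of the real-pole response."""
    disc = b1**2 - 4.0 * b2
    if disc <= 0.0:
        raise ValueError("poles are not real and distinct")
    return math.log((1.0 + b1 / math.sqrt(disc)) / (2.0 * (1.0 - v_th)))

def real_pole_delay(b1, b2, k_r=K_R_FIT):
    """Eq. (5): tau_r = K_r / |s1| = K_r * (b1 + sqrt(b1^2 - 4 b2)) / 2."""
    disc = b1**2 - 4.0 * b2
    if disc <= 0.0:
        raise ValueError("poles are not real and distinct")
    return k_r * (b1 + math.sqrt(disc)) / 2.0

# Coefficients from eq. (4) for the first row of Table I (seconds, seconds^2).
b1, b2 = 9.9572e-12, 5.14356e-24

tau_fit = real_pole_delay(b1, b2)    # delay with the fitted constant 2.36
k_closed_form = k_r_exact(b1, b2)    # K_r evaluated from its defining expression
```

For these inputs the fitted-constant estimate is about 22.2 ps, in line with the 22.21 ps listed for the new model in the first row of Table I, and the closed-form $K_r$ evaluates to roughly 2.36.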



Fig. 4. The plot on the left shows the strong linear fit between SPICE delay and  $1/(|s_1|)$  for real poles with  $K_r = 2.36$ . The plot on the right shows the strong linear fit between SPICE delay and  $1/\beta$  for complex poles with  $K_c = 1.66$ .

#### B. Complex Poles

The condition for complex poles is  $(4M_2-3M_1^2) = (b_1^2-4b_2) \le 0$ . The time-domain response for complex poles is given by

$$v(t) = V_0 \left( 1 - \frac{\sqrt{\alpha^2 + \beta^2}}{\beta} e^{-\alpha t} \cdot \sin(\beta t + \rho) \right)$$

where $\alpha = \frac{M_1}{2(M_1^2 - M_2)}$, $\beta = \frac{\sqrt{3M_1^2 - 4M_2}}{2(M_1^2 - M_2)}$, and $\rho = \tan^{-1}(\frac{\beta}{\alpha})$. Using the above equation and threshold voltage $v_{\rm th}$, we get

$$e^{-\alpha t} \cdot \sin(\beta \cdot t + \rho) = \frac{1 - v_{\rm th}}{\sqrt{1 + \left(\frac{\alpha}{\beta}\right)^2}}. \tag{6}$$

The delay at a given threshold voltage can be computed by solving for time in (6) recursively. One way to solve the recursive equation (6) is to approximate the time variable in the exponential term by the Elmore delay, i.e., to substitute $T_{\rm ED}$ for time t. Expanding *sine* as a Taylor

TABLE I

90% Threshold Voltage Delay Estimates for Combinations of Source and Load Parameters for Which the Poles of the Response Are Real (i.e., Overdamped Response). The Interconnect Line Parameters Are $R = 0.015\ \Omega/\mu\text{m}$, $L = 0.246\ \text{pH}/\mu\text{m}$, and $C = 0.176\ \text{fF}/\mu\text{m}$, and the Length of the Interconnect Is 100 $\mu\text{m}$

| Source $R_S$ (Ω) | Source $L_S$ (pH) | Load $C_T$ (pF) | SPICE delay from response (ps) | Elmore analytical model (ps) | New analytical model (ps) |
|------|------|-------|---------|--------|---------|
| 50   | 2.46 | 0.176 | 22.33   | 22.93  | 22.21   |
| 100  | 2.46 | 0.176 | 45.30   | 45.20  | 45.70   |
| 500  | 2.46 | 0.176 | 224.50  | 223.50 | 228.95  |
| 1000 | 2.46 | 0.176 | 446.20  | 446.4  | 457.46  |
| 25   | 2.46 | 1.76  | 107.10  | 108.40 | 108.65  |
| 50   | 2.46 | 1.76  | 210.10  | 210.80 | 214.74  |
| 100  | 2.46 | 1.76  | 415.20  | 415.40 | 425.10  |
| 500  | 2.46 | 1.76  | 2052.60 | 2053.0 | 2103.68 |
| 1000 | 2.46 | 1.76  | 4099.50 | 4100.0 | 4101.30 |

TABLE II The Length of the Interconnect Line in These Experiments Is Always $h = 2000\ \mu\text{m}$. The Delay Estimates Refer to 50% Threshold Voltage

| Interconnect parameters $r$, $l$, $c$ (per μm) | Driver resistance $R_S$ (Ω) | Load capacitance $C_T$ (pF) | SPICE (ps) | Elmore delay $0.693b_1$ (ps) | New model delay (ps) |
|--------------------------------|------|------|-----|-----|-----|
| 0.0015 Ω, 0.246 pH, 0.176 fF   | 100  | 0.01 | 83  | 25  | 83  |
| "                              | 500  | 0.01 | 178 | 126 | 178 |
| "                              | 1000 | 0.01 | 302 | 251 | 302 |
| "                              | 100  | 0.1  | 90  | 32  | 90  |
| "                              | 500  | 0.1  | 209 | 157 | 209 |
| "                              | 1000 | 0.1  | 364 | 314 | 365 |
| "                              | 100  | 1    | 150 | 96  | 149 |
| "                              | 500  | 1    | 522 | 471 | 522 |
| "                              | 1000 | 1    | 989 | 939 | 990 |

series and considering only the first term yields

$$\begin{split} e^{-\alpha \cdot T_{\rm ED}} \cdot \sin(\beta \cdot \tau_c + \rho) &\approx e^{-\alpha \cdot T_{\rm ED}} \cdot (\beta \cdot \tau_c + \rho) \\ &= \frac{1 - v_{\rm th}}{\sqrt{1 + \left(\frac{\alpha}{\beta}\right)^2}}. \end{split}$$

Therefore

$$\tau_c = \frac{K_c}{\beta}$$

where

$$K_c = \left(\frac{(1 - v_{\rm th})e^{\alpha \cdot T_{\rm ED}}}{\sqrt{1 + (\frac{\alpha}{\beta})^2}} - \rho\right).$$

Substituting for  $\beta$  and using  $M_1 = b_1$  and  $M_2 = b_1^2 - b_2$ , our delay estimate is given by

$$\tau_c = \frac{K_c}{\beta} = K_c \cdot \frac{2b_2}{\sqrt{4b_2 - b_1^2}}.$$

Even though  $K_c$  is a function of  $b_1$  and  $b_2$ , for a wide range of interconnect, source, and load parameters, it too is almost a constant. We determined the constant value  $K_c = 1.66$  again by finding a good fit between SPICE delay values and  $1/(\beta)$ , as shown on the

TABLE III

90% Threshold Voltage Delay Estimates for Combinations of Source and Load Parameters for Which the Poles of the Response Are Complex (i.e., Underdamped Configurations). The Interconnect Line Parameters Are $R = 0.015\ \Omega/\mu\text{m}$, $L = 0.246\ \text{pH}/\mu\text{m}$, and $C = 0.176\ \text{fF}/\mu\text{m}$, and the Length of the Interconnect Is 100 $\mu\text{m}$. The Percentage Error of Each Delay Model with Respect to SPICE Is Also Given

| Source $R_S$ (Ω) | Source $L_S$ (pH) | Load $C_T$ (pF) | SPICE delay from response (ps) | Elmore analytical model, ps (% error) | New analytical model, ps (% error) |
|-----|--------|--------|------|-------------|-----------|
| 10  | 0.0246 | 0.0176 | 1.22 | 0.90 (26%)  | 1.30 (6%) |
| 15  | 0.0246 | 0.0176 | 1.33 | 1.31 (2%)   | 1.38 (4%) |
| 20  | 0.0246 | 0.0176 | 1.47 | 1.71 (16%)  | 1.51 (3%) |
| 25  | 0.0246 | 0.0176 | 1.60 | 2.12 (33%)  | 1.64 (3%) |
| 10  | 0.0246 | 0.176  | 4.50 | 5.12 (14%)  | 4.25 (6%) |
| 15  | 0.0246 | 0.176  | 5.85 | 7.32 (25%)  | 5.31 (9%) |
| 20* | 0.0246 | 0.176  | 7.90 | 9.55 (21%)  | 8.60 (7%) |
| 10  | 2.46   | 0.0176 | 1.31 | 0.90 (31%)  | 1.40 (%)  |
| 15  | 2.46   | 0.0176 | 1.40 | 1.31 (7%)   | 1.49 (7%) |
| 20  | 2.46   | 0.0176 | 1.55 | 1.71 (10%)  | 1.59 (2%) |
| 25  | 2.46   | 0.0176 | 1.63 | 2.12 (30%)  | 1.69 (4%) |
| 10  | 2.46   | 0.176  | 4.65 | 5.10 (10%)  | 4.30 (8%) |
| 15  | 2.46   | 0.176  | 5.85 | 7.33 (25%)  | 5.30 (9%) |
| 20  | 2.46   | 0.176  | 7.98 | 9.55 (19%)  | 8.70 (9%) |
| 10  | 24.6   | 0.0176 | 1.80 | 0.90 (50%)  | 1.96 (9%) |
| 15  | 24.6   | 0.0176 | 1.89 | 1.31 (31%)  | 2.06 (9%) |
| 20  | 24.6   | 0.0176 | 2.00 | 1.71 (15%)  | 2.15 (7%) |
| 25  | 24.6   | 0.0176 | 2.19 | 2.11 (4%)   | 2.21 (1%) |
| 10  | 24.6   | 0.176  | 5.65 | 5.10 (10%)  | 5.44 (4%) |
| 15  | 24.6   | 0.176  | 6.50 | 7.33 (13%)  | 5.95 (8%) |
| 20  | 24.6   | 0.176  | 7.66 | 9.55 (25%)  | 6.97 (9%) |
| 25  | 24.6   | 0.176  | 9.47 | 11.78 (24%) | 9.26 (2%) |

right side of Fig. 4. Therefore, the 90% threshold delay estimate for complex poles is

$$\tau_c = 1.66 \cdot \frac{2b_2}{\sqrt{4b_2 - b_1^2}} = 1.66 \cdot \frac{2\left(M_1^2 - M_2\right)}{\sqrt{3M_1^2 - 4M_2}}. \tag{7}$$

Table III shows delay values for various combinations of source, load, and interconnect parameters assuming the value of $K_c$ obtained by this regression analysis. The delay estimates using our analytical model are within 10% of SPICE-computed delay estimates, while Elmore delay estimates vary by as much as 50% from SPICE-computed delays. Hence, for the case of complex poles (i.e., underdamped response), the Elmore model is no longer acceptably accurate. Last, we consider the special case in which poles are equal, i.e., a double-pole configuration.
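To illustrate how the fitted constant in (7) relates to the underlying two-pole response, the sketch below solves the threshold equation (6) numerically by bisection and compares the result with the closed-form estimate. This is an illustration under stated assumptions, not the procedure used in the paper: the function names are hypothetical, bisection is swapped in for the approximation described in the text, and the values of $b_1$ and $b_2$ were obtained by evaluating (4) for $R_S = 10\ \Omega$, $L_S = 0.0246$ pH, $C_T = 0.0176$ pF with the 100 μm line of Table III.

```python
import math

K_C_FIT = 1.66  # regression constant reported in the text for the 90% threshold

def complex_pole_delay(b1, b2, k_c=K_C_FIT):
    """Eq. (7): tau_c = K_c / beta = K_c * 2 b2 / sqrt(4 b2 - b1^2)."""
    disc = 4.0 * b2 - b1**2
    if disc <= 0.0:
        raise ValueError("poles are not complex")
    return k_c * 2.0 * b2 / math.sqrt(disc)

def two_pole_crossing(b1, b2, v_th=0.9, iters=200):
    """First time the normalized two-pole step response reaches v_th (bisection)."""
    alpha = b1 / (2.0 * b2)
    beta = math.sqrt(4.0 * b2 - b1**2) / (2.0 * b2)
    rho = math.atan2(beta, alpha)
    amp = math.sqrt(alpha**2 + beta**2) / beta

    def v(t):
        return 1.0 - amp * math.exp(-alpha * t) * math.sin(beta * t + rho)

    # v(0) = 0 and v((pi - rho) / beta) = 1; v is increasing on this interval
    # because dv/dt is proportional to exp(-alpha t) * sin(beta t).
    lo, hi = 0.0, (math.pi - rho) / beta
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if v(mid) < v_th:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Coefficients from eq. (4) for the assumed parameters (seconds, seconds^2).
b1, b2 = 3.916e-13, 6.53549e-25

tau_fit = complex_pole_delay(b1, b2)       # closed form with K_c = 1.66
tau_two_pole = two_pole_crossing(b1, b2)   # numerical 90% crossing of the two-pole response
```

For this parameter set the two values agree to within about one percent, which suggests why a constant $K_c$ can stand in for the threshold-dependent expression over a range of underdamped configurations.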

#### C. Double Poles

The condition for a double pole is  $(b_1^2 - 4b_2) = 0$ . The double-pole response is

$$\begin{split} V(s) &= \frac{V_0}{s} \frac{1}{1 + b_1 s + b_2 s^2} = \frac{V_0}{s} \frac{1}{b_2 (s - s_1)^2} \\ &= V_0 \bigg( \frac{1}{s} - \frac{1}{s - s_1} - \frac{2}{b_1} \frac{1}{(s - s_1)^2} \bigg) \end{split}$$

where  $s_1 = -\frac{b_1}{2b_2}$ , and the time-domain response is given by  $v(t) = V_0(1 - e^{ts_1} - \frac{2t}{b_1}e^{ts_1})$ . The delay at 90% threshold is

$$\tau_{0.9} = \frac{2b_2}{b_1} \ln\left(10\left(1 + \frac{2T_{0.9}}{b_1}\right)\right) = K_d \frac{2b_2}{b_1} = K_d \frac{b_1}{2}$$

TABLE IV The Lengths of Various Interconnects in the Tree of Fig. 5

| Interconnect | Length (μm) |
|--------------|-------------|
| I1           | 50          |
| I2           | 100         |
| I3           | 50          |
| I4           | 200         |
| I5           | 100         |
| I6           | 50          |
| I7           | 100         |
| I8           | 200         |

which gives a recursive equation for  $K_d$ , i.e.,

$$K_d = \ln\left(10\left(1 + \frac{2T_{0.9}}{b_1}\right)\right) = \ln(10(1+K_d))$$

from which  $K_d \approx 3.9$ . Thus, in the case of a double pole, the 90% threshold delay is estimated as

$$\tau_{0.9} = K_d \cdot \frac{b_1}{2} = 1.95b_1 \tag{8}$$

which is independent of the inductance value and different from the Elmore delay expression.
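The recursive equation for $K_d$ can be solved by simple fixed-point iteration, as in the sketch below. The function names are hypothetical. The generalization to a threshold other than 90% is not stated in the text; it is an assumption obtained by setting the double-pole response $1 - e^{-K}(1 + K)$, with $K = 2t/b_1$, equal to $v_{\rm th}$.

```python
import math

def solve_k_d(v_th=0.9, tol=1e-12, max_iter=200):
    """Solve K_d = ln((1 + K_d) / (1 - v_th)); for v_th = 0.9 this is ln(10 (1 + K_d))."""
    k = 1.0
    for _ in range(max_iter):
        k_next = math.log((1.0 + k) / (1.0 - v_th))
        if abs(k_next - k) < tol:
            return k_next
        k = k_next
    raise RuntimeError("fixed-point iteration did not converge")

def double_pole_delay(b1, v_th=0.9):
    """Eq. (8): threshold delay K_d * b1 / 2 for the double-pole case."""
    return solve_k_d(v_th) * b1 / 2.0

k_d_90 = solve_k_d(0.9)
```

The iteration converges quickly because the derivative of the right-hand side, $1/(1 + K_d)$, is well below one near the solution. The result rounds to the value $K_d \approx 3.9$ used in (8).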

In practice, the double-pole case should be applied when the magnitude of  $b_1^2 - 4b_2$  is within some threshold. We have experimentally studied a range of interconnect topologies and different driver/load parameters. We observed that for both the real and complex pole cases [(5) and (7)], the value of  $b_1^2 - 4b_2$  should be of the same order as the value of  $b_1$ . Thus, we have added a threshold criterion, such that if  $b_1^2 - 4b_2$  is more than an order of magnitude smaller than  $b_1$ , we apply (8). [However, we find that the value of  $b_1^2 - 4b_2$  is almost never close to zero (i.e., is clearly greater or less than zero), and (8) is almost never invoked.]

#### IV. INTERCONNECTION TREES

Last, we describe the extension of our analytical model to estimate delays in arbitrary interconnect trees. An RLC network is called an RLC tree if it does not contain a closed path of resistors and inductors, i.e., all resistors and inductors are floating with respect to ground and all capacitors are connected to ground. Consider an RLC interconnect tree with root (or source) S and set of sinks (or leaves) $L = \{L_1, L_2, \dots, L_n\}$. The unique path from root S to the sink node *i* is denoted by p(i) and is referred to as the *main path*. The edges/nodes not on the main path are referred to as the off-path edges/nodes. We model each edge on the main path of the tree using a lumped RLC segment, e.g., an L, T, or $\Pi$ model. We replace the off-path subtree rooted at node v with the total subtree capacitance at node v. (Fig. 6 shows an example of a main path where each branch in the tree is replaced by RLC segments and the off-path subtrees are replaced by their respective subtree capacitances.) Hence, at any node v, the total capacitance is given by

$$C'_{v} = \begin{cases} C_{v} & \text{if there is no off-path subtree at node } v \\ C_{v} + C_{T(v)} & \text{if node } v \text{ has off-path subtree } T(v) \end{cases}$$

where  $C_v$  is the capacitance at the node and  $C_{T(v)}$  is the off-path subtree capacitance at node v. The *k*th coefficient  $b_k$  of the transfer function for the general *RLC* circuit of Fig. 6 can be expressed using the following recursive equation [5]:

$$b_k^{N+1} = R_N \sum_{j=1}^N C'_j \cdot b_{k-1}^j + L_N \sum_{j=1}^N C'_j \cdot b_{k-2}^j + b_k^N \tag{9}$$



Fig. 5. A simple interconnection tree consisting of distributed RLC lines. The lengths of the various interconnects are given in Table IV.



Fig. 6. Representation of the main path in the tree, where each distributed line is modeled using RLC segments.

where $b_k^N$ refers to the coefficient of $s^k$ in the transfer function between the given node and node 1. Note that $b_0^j = 1$ and $b_{-1}^j = 0$ for all j, and $b_k^1 = 0$ for all $k \ge 1$. Using the above recursive equation, the expressions for the first and second coefficients of the transfer function can be derived as

$$b_{1}^{N+1} = R_{N} \sum_{j=1}^{N} C_{j}' + b_{1}^{N} = \sum_{i=1}^{N} R_{i} \sum_{j=1}^{i} C_{j}'$$

$$b_{2}^{N+1} = R_{N} \sum_{j=1}^{N} C_{j}' b_{1}^{j} + L_{N} \sum_{j=1}^{N} C_{j}' + b_{2}^{N}$$

$$= \sum_{j=2}^{N} C_{j}' \sum_{l=j}^{N} R_{l} \sum_{i=1}^{j-1} C_{i}' \sum_{d=i}^{j-1} R_{d} + \sum_{j=1}^{N} C_{j}' \sum_{l=j}^{N} L_{l}. \tag{10}$$

For any given source and sink pair, the coefficients  $b_1$  and  $b_2$  can be computed in linear time by traversing the main path and using the above recursive equation. Using the analytical delay model developed in the previous section, we can obtain an analytical delay estimate for *RLC* interconnect trees using the first and second coefficients. Thus, the 90% threshold delay at a given sink *i*, depending on the value of  $(4M_2 - 3M_1^2)$ , is
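The linear-time traversal can be sketched with two running sums, one for $\sum C'_j$ and one for $\sum C'_j b_1^j$. This is a minimal illustration of recursion (9) for $k = 1, 2$, not the authors' implementation: the function name is hypothetical, and the list ordering (index 0 is node 1, the end of the main path at which the response is observed, with later entries moving toward the source) is an assumption based on the node numbering in the text.

```python
def path_coefficients(R, L, C):
    """Return (b1, b2) at node N+1 for a main path described by segment lists.

    R[j-1] and L[j-1] are the resistance and inductance of the segment between
    node j and node j+1; C[j-1] is the total capacitance C'_j at node j, with
    any off-path subtree capacitance already added in.
    """
    b1 = 0.0        # b_1^1 = 0
    b2 = 0.0        # b_2^1 = 0
    sum_c = 0.0     # running sum of C'_j
    sum_c_b1 = 0.0  # running sum of C'_j * b_1^j
    for r, l, c in zip(R, L, C):
        sum_c += c
        sum_c_b1 += c * b1                      # b1 still holds b_1^j for the current node j
        b2 = r * sum_c_b1 + l * sum_c + b2      # eq. (9) with k = 2, using b_0^j = 1
        b1 = r * sum_c + b1                     # eq. (9) with k = 1
    return b1, b2

# Two-segment example in arbitrary consistent units.
b1_demo, b2_demo = path_coefficients(R=[1.0, 2.0], L=[0.5, 0.25], C=[3.0, 4.0])
```

For the two-segment example the result can be checked by hand against the ladder network: $b_1 = R_2(C_1 + C_2) + R_1 C_1$ and $b_2 = R_2 C_2 R_1 C_1 + L_2(C_1 + C_2) + L_1 C_1$.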

$$T_{ND}(i) = \begin{cases} K_r \cdot \dfrac{M_1 + \sqrt{4M_2 - 3M_1^2}}{2} & \text{for real poles} \\ K_c \cdot \dfrac{2(M_1^2 - M_2)}{\sqrt{3M_1^2 - 4M_2}} & \text{for complex poles} \\ K_d \cdot \dfrac{M_1}{2} & \text{for double poles} \end{cases} \tag{11}$$

where the first and second moments are expressed as $M_1 = b_1$ and $M_2 = b_1^2 - b_2$. The coefficients of the transfer function are obtained from (10). By contrast, the Elmore delay at the sink is equal to the first moment, or the first coefficient $b_1$ of the transfer function of the source-sink main path [16]. The 90% threshold delay using the first moment is simply

$$T_{\rm ED}(i) = 2.3 \cdot M_1 \tag{12}$$

which we emphasize can be inaccurate despite its wide use, since it ignores inductance of the interconnect line.
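The three-way selection in (11), together with the first-moment estimate (12), can be sketched as a single function of the moments. The function names and the relative tolerance `rel_tol` used to detect the double-pole case are hypothetical; the text describes its threshold criterion only loosely, so the tolerance here is an assumption and not the authors' rule.

```python
import math

# 90% threshold constants quoted in the text.
K_R, K_C, K_D = 2.36, 1.66, 3.9

def new_model_delay_90(m1, m2, rel_tol=1e-9):
    """Eq. (11): 90% delay from the first and second moments M1 and M2."""
    d = 4.0 * m2 - 3.0 * m1**2          # equals b1^2 - 4 b2
    if abs(d) <= rel_tol * m1**2:       # treated as a double pole (assumed criterion)
        return K_D * m1 / 2.0
    if d > 0.0:                         # real poles
        return K_R * (m1 + math.sqrt(d)) / 2.0
    return K_C * 2.0 * (m1**2 - m2) / math.sqrt(-d)   # complex poles

def elmore_delay_90(m1):
    """Eq. (12): 90% delay from the first moment only."""
    return 2.3 * m1
```

As a usage example in arbitrary units, the moments $(M_1, M_2) = (3, 7)$ give real poles, $(2, 2)$ gives complex poles, and $(2, 3)$ gives a double pole. Only the new model distinguishes these cases; the Elmore estimate depends on $M_1$ alone.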

We evaluate the effect of our analytical model by considering a simple interconnection tree shown in Fig. 5. We consider the sink node N4 for delay estimation. Each edge on the main path between the root and node N4 is replaced by a two-L segment model.<sup>5</sup> We then apply the above-described recursive coefficient (or moment) computation for the resultant RLC circuit of the main path. The 90% threshold delays according to both the Elmore model and our new analytical model are computed using (11) and (12). We also compute the delay at the given sink node using SPICE3e, where each edge of the tree is modeled using the LTRA model (with SPICE, we first compute the response at the sink node and then obtain the delay for 90% threshold voltage). Table V presents delay estimates for a range of interconnect parameters, driver resistance values, and sink load capacitance values. The Elmore delay varies by as much as 35% from the SPICE-computed delay. However, our new model is within 15% of the SPICE delay for all examples. Another advantage of our model is due to simulation complexity. Our delay estimates require

<sup>&</sup>lt;sup>5</sup>Our model is not limited to traditional segment models, and indeed we believe the accuracy of our results would improve if we were to use nonuniform segment models [5], [21] designed to perfectly match the low-order moments of the distributed RLC line.

TABLE V 90% Threshold Delay Values for a Wide Range of Interconnect Parameters at Node 4 of the Tree in Fig. 5. We Compare SPICE LTRA and the Elmore Model Against Our Analytical Delay Model

| Interconnect parameters (per μm) | Driver res. (Ω) | Load cap. (pF) | SPICE delay (ps) | Elmore delay (ps) | New model delay (ps) |
|-------------------------------------|--------|------|-------|--------|-----------|
| $R = 0.015 \ \Omega$                |        |      |       |        |           |
| $C = 0.176 \ fF$                    | 10     | 0.02 | 5.7   | 6.6    | 5.0       |
| $L = 0.246 \ pH$                    |        |      |       |        |           |
| $R = 0.0015 \ \Omega$               |        |      |       |        |           |
| $C = 0.176 \ fF$                    | 10     | 0.2  | 37    | 26     | 31        |
| $L = 2.46 \ pH$                     |        |      |       |        |           |
| $R = 0.015 \ \Omega$                |        |      |       |        |           |
| $C = 0.176 \ fF$                    | 10     | 0.2  | 39    | 29     | 32        |
| $L = 2.46 \ pH$                     |        |      |       |        |           |
| $R = 0.0015 \ \Omega$               |        |      |       |        |           |
| $C = 0.176 \ fF$                    | 10     | 2.0  | 179   | 238    | 205       |
| $L = 2.46 \ pH$                     |        |      |       |        |           |
| $R = 0.0015 \ \Omega$               |        |      |       |        |           |
| $C = 0.176 \ fF$                    | 10     | 2.0  | 231   | 238    | 232       |
| $L = 0.246 \ pH$                    |        |      |       |        |           |
| $R = 0.015 \ \Omega$                |        |      |       |        |           |
| $C = 0.176 \ fF$                    | 10     | 2.0  | 199   | 270    | 230       |
| $L = 2.46 \ pH$                     |        |      |       |        |           |
| n = 0.015 M<br>C = 0.176 fF         | 100    | 2.0  | 9410  | 9261   | 9267      |
| $U = 0.170 \ JT$<br>$I = 2.46 \ mH$ | 100    | 2.0  | 2.119 | 2001   | 2001      |
| $D = 2.40 \ p \Pi$                  |        | l    |       |        |           |
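
The relative deviations discussed in the text can be recomputed directly from Table V. The sketch below does so for the rows whose delay values are legible in this copy; the row transcription and the helper name `percent_error` are ours, not part of the original paper.

```python
# (SPICE, Elmore, new model) 90% threshold delays in ps, transcribed from the
# legible rows of Table V. The transcription is an assumption of this sketch.
rows = [
    (5.7, 6.6, 5.0),
    (37.0, 26.0, 31.0),
    (39.0, 29.0, 32.0),
    (179.0, 238.0, 205.0),
    (231.0, 238.0, 232.0),
    (199.0, 270.0, 230.0),
]


def percent_error(estimate, reference):
    """Signed deviation of `estimate` from `reference`, in percent."""
    return 100.0 * (estimate - reference) / reference


for spice, elmore, new_model in rows:
    print(f"SPICE {spice:6.1f} ps: Elmore {percent_error(elmore, spice):+6.1f}%, "
          f"new model {percent_error(new_model, spice):+6.1f}%")

# Largest absolute deviation of each model from the SPICE reference.
worst_elmore = max(abs(percent_error(elmore, spice)) for spice, elmore, _ in rows)
worst_new = max(abs(percent_error(new_model, spice)) for spice, _, new_model in rows)
print(f"largest |error|: Elmore {worst_elmore:.1f}%, new model {worst_new:.1f}%")
```

Errors are taken relative to the SPICE value, matching the way the text quotes deviations "from the SPICE-computed delay".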

for undershoots. The first undershoot occurs at time  $T_1 = 2\pi/\beta$ , and the value of the undershoot is

$$\delta v = V_0 e^{-\alpha T_1} \sqrt{1 + \left(\frac{\alpha}{\beta}\right)^2}\, \sin(\beta T_1 + \rho) = V_0 e^{-\alpha T_1}.$$
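
The last equality follows from the definition $\rho = \tan^{-1}(\beta/\alpha)$ used in the ringing response. Spelling out the intermediate step, with $\alpha, \beta > 0$ and $\beta T_1 = 2\pi$:

$$\sin(\beta T_1 + \rho) = \sin\rho = \frac{\beta}{\sqrt{\alpha^2 + \beta^2}}, \qquad \sqrt{1 + \left(\frac{\alpha}{\beta}\right)^2}\,\sin\rho = \frac{\sqrt{\alpha^2 + \beta^2}}{\beta} \cdot \frac{\beta}{\sqrt{\alpha^2 + \beta^2}} = 1.$$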

The constraint for a given percentage undershoot  $v_{\rm us}$  can be obtained as

$$\frac{\alpha}{\beta} = \frac{1}{2\pi} \left|\ln\left(\frac{v_{\rm us}}{V_0}\right)\right|.$$

For example, with 5% undershoot, we have  $v_{\rm us} = 0.05V_0$  and  $\alpha/\beta = 0.48$ . We can express  $\alpha$  and  $\beta$  in terms of coefficients of the transfer function, i.e.,  $\frac{\alpha}{\beta} = \frac{b_1}{\sqrt{4b_2 - b_1^2}}$ . Therefore

$$b_1^2 = \left[\frac{4\left(\frac{\alpha}{\beta}\right)^2}{\left(\frac{\alpha}{\beta}\right)^2 + 1}\right]b_2.$$

With 5% undershoot, the above equation reduces to  $b_1^2 = 0.74b_2$ , and a 90% threshold delay estimate for this case can be obtained (see [6]) as

$$T_{0.9} = 1.66 \frac{2b_2}{\sqrt{4b_2 - b_1^2}} = 2.13b_1.$$

Similarly, for 5% overshoot, the relation between the coefficients is  $b_1^2 = 1.91b_2$ , and a corresponding delay estimate is  $T_{0.9} = 1.20b_1$ . As expected, the delay increases for a strong undershoot requirement, and in general, the delay increases if ringing in the response is suppressed [20]. The above constraint between  $\alpha$  and  $\beta$  to reduce the undershoot in the response could be applied with the delay model in (7) to perform delay-driven routing tree synthesis.
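
The numerical constants in this section can be checked with a few lines of code. The sketch below is a minimal illustration, not the authors' implementation: it assumes the $n$th extremum lies at $\beta t = n\pi$ with relative magnitude $e^{-\alpha n \pi/\beta}$, and it eliminates $\alpha/\beta$ from $\alpha/\beta = b_1/\sqrt{4b_2 - b_1^2}$ to get $b_1^2/b_2 = 4(\alpha/\beta)^2/(1 + (\alpha/\beta)^2)$. The function names are hypothetical.

```python
import math


def alpha_over_beta(fraction, n):
    """Ratio alpha/beta for which the n-th extremum (at beta*t = n*pi) deviates
    from the final value by `fraction` of V0, i.e. exp(-alpha*n*pi/beta) = fraction."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return abs(math.log(fraction)) / (n * math.pi)


def b1_squared_over_b2(ratio):
    """Solve alpha/beta = b1 / sqrt(4*b2 - b1**2) for b1**2 / b2."""
    return 4.0 * ratio ** 2 / (1.0 + ratio ** 2)


# n = 2 is the first undershoot, n = 1 the first overshoot.
undershoot_ratio = alpha_over_beta(0.05, n=2)
overshoot_ratio = alpha_over_beta(0.05, n=1)

print(f"5% undershoot: alpha/beta = {undershoot_ratio:.3f}, "
      f"b1^2/b2 = {b1_squared_over_b2(undershoot_ratio):.3f}")
print(f"5% overshoot:  alpha/beta = {overshoot_ratio:.3f}, "
      f"b1^2/b2 = {b1_squared_over_b2(overshoot_ratio):.3f}")
```

The results agree with the coefficients 0.74 and 1.91 quoted above to within rounding. The delay constants (2.13 and 1.20) come from the threshold delay model in [6] and are not reproduced by this sketch.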

## VI. CONCLUSION

Fast delay estimation methods, as opposed to simulation techniques, are needed for incremental performance-driven layout synthesis. Elmore delay-based estimation methods, although efficient, cannot accurately estimate the delay for RLC interconnect lines. We have obtained an analytical delay model, based on first and second moments of *RLC* interconnection lines, that considers the effect of inductance. The resulting delay estimates are significantly more accurate than Elmore delay. We have also extended our delay model to estimate source-sink delays in arbitrary interconnect trees. Even for the small tree topology considered, we observe significant improvement in the accuracy of our delay estimates compared to the Elmore model. Since our model has the same time complexity as the Elmore model, we believe it can be valuable in modern iterative layout synthesis methodologies. Although we consider a step input in deriving the delay models, a similar approach can be applied to develop delay models under ramp input. We have also discussed a delay minimization approach that uses controlled small ringing in the response waveform.

#### REFERENCES

- [1] L. N. Dworsky, *Modern Transmission Line Theory and Applications*. New York: Wiley, 1979.
- [2] W. C. Elmore, "The transient response of damped linear networks with particular regard to wideband amplifiers," J. Appl. Phys., vol. 19, pp. 55–63, Jan. 1948.
- [3] M. A. Horowitz, "Timing models for MOS circuits," Ph.D. dissertation, Stanford University, Stanford, CA, Jan. 1984.
- [4] C. C. Huang and L. L. Wu, "Signal degradation through module pins in VLSI packaging," *IBM J. Res. Develop.*, vol. 31, no. 4, pp. 489–498, July 1987.
- [5] A. B. Kahng and S. Muddu, "Two-pole analysis of interconnection trees," in Proc. IEEE MCMC Conf., Jan. 1995, pp. 105–110.

three orders of magnitude less computation than SPICE, since they have the same time complexity as the Elmore delay estimate.

## V. CONSTRAINT ON MOMENTS FOR CONTROL OF UNDERSHOOT/OVERSHOOT

In this section, we illustrate how our simple threshold delay model can yield simple analytical constraints for interconnect synthesis. Specifically, we address the question of finding interconnect and driver parameters for optimum delay with controlled ringing. Consider a simple *RLC* line driven by a gate, with $Z_s$ being the driver impedance and $C_L$ being the load capacitance at the end of the line. The characteristic impedance of the line is given by

$$Z_0 = \sqrt{\frac{R+sL}{sC}}.$$

Ideally, the driver and line parameters are adjusted such that $Z_s$ matches $Z_0$ and the voltage response at the end of the line is critically damped. However, if the driver impedance $Z_s$ is slightly smaller than the characteristic impedance of the line, the voltage response will have a small amount of ringing; this can be advantageous in that the threshold delay will decrease [20]. The problem with ringing is that it can cause false switching if the voltage response drops back below the threshold; hence, the advantages of ringing can be exploited only if the maximum oscillation (overshoot or undershoot) is bounded such that false switching does not occur. We now develop an analytical equation that achieves this control in terms of coefficients of the transfer function. Additional context for our discussion may be found in [6].
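
To give a feel for the magnitude of $Z_0$ that a driver would need to match, the sketch below evaluates $Z_0 = \sqrt{(R + sL)/(sC)}$ at $s = j2\pi f$. It is only an illustration: the per-micron $R$, $L$, $C$ values are taken from one row of Table V, while the sample frequencies and the function name are assumptions of ours.

```python
import cmath
import math


def characteristic_impedance(r, l, c, freq_hz):
    """Z0 = sqrt((R + sL) / (sC)) at s = j*2*pi*f, for per-unit-length R, L, C."""
    s = 1j * 2.0 * math.pi * freq_hz
    return cmath.sqrt((r + s * l) / (s * c))


# Per-micron line parameters from Table V (ohm/um, H/um, F/um).
R, L, C = 0.015, 0.246e-12, 0.176e-15

# High-frequency (lossless) limit, where R is negligible next to omega*L.
lossless_limit = math.sqrt(L / C)

# Illustrative frequencies only; they are not taken from the paper.
for f in (1e9, 1e10, 1e12):
    z0 = characteristic_impedance(R, L, C, f)
    print(f"f = {f:.0e} Hz: |Z0| = {abs(z0):.1f} ohm "
          f"(lossless limit {lossless_limit:.1f} ohm)")
```

At low frequencies the resistive term dominates and $|Z_0|$ exceeds $\sqrt{L/C}$; as the frequency rises, $|Z_0|$ approaches that limit.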

The voltage response for ringing is given by

$$v_{\text{out}}(t) = V_0 \left[ 1 - \frac{\sqrt{\alpha^2 + \beta^2}}{\beta} e^{-\alpha t} \sin(\beta t + \rho) \right]$$

where  $\rho = \tan^{-1}(\frac{\beta}{\alpha})$ . To find the peaks of overshoot and undershoot in the response, we set the derivative  $v'_{out}(t)$  to zero, yielding  $\beta t = n\pi$ , with  $n = 1, 3, 5, \ldots$  for overshoots and  $n = 2, 4, 6, \ldots$ 

- [6] \_\_\_\_\_, "An analytical delay model for *RLC* interconnects," in *Proc. IEEE International Symp. Circuits and Systems*, May 1996, vol. IV, pp. 237–240; see also A. B. Kahng and S. Muddu, "Accurate analytical delay models for VLSI interconnections," University of California, Los Angeles, UCLA CS Dept. TR-950034, Sept. 1995.
- [7] A. B. Kahng, K. Masuko, and S. Muddu, "Delay models for interconnects under nonmonotone and monotone response," University of California, Los Angeles, UCLA CS Dept. TR-960040, Nov. 1996.
- [8] A. B. Kahng, K. Masuko, and S. Muddu, "Analytical delay models for VLSI interconnects under ramp input," in *Proc. IEEE/ACM Int. Conf. Computer-Aided Design*, Nov. 1996, pp. 30–36.
- [9] B. Krauter, R. Gupta, J. Willis, and L. T. Pileggi, "Transmission line synthesis," in *Proc. 32nd ACM/IEEE Design Automation Conf.*, June 1995, pp. 358–363.
- [10] S. Lin and E. S. Kuh, "Transient simulation of lossy interconnect," in Proc. 29th ACM/IEEE Design Automation Conf., June 1992, pp. 81–86.
- [11] S. P. McCormick and J. Allen, "Waveform moment methods for improved interconnection analysis," in *Proc. 27th ACM/IEEE Design Automation Conf.*, June 1990, pp. 406–412.
- [12] L. T. Pillage and R. A. Rohrer, "Asymptotic waveform evaluation for timing analysis," *IEEE Trans. Computer-Aided Design*, vol. 9, pp. 352–366, Apr. 1990.
- [13] V. Raghavan, J. E. Bracken, and R. A. Rohrer, "AWESpice: A general tool for the accurate and efficient simulation of interconnect problems," in *Proc. 29th ACM/IEEE Design Automation Conf.*, June 1992, pp. 87–92.
- [14] C. L. Ratzlaff, N. Gopal, and L. T. Pillage, "RICE: Rapid interconnect circuit evaluator," in *Proc. 28th ACM/IEEE Design Automation Conf.*, June 1991, pp. 555–560.
- [15] J. S. Roychowdhury and D. O. Pederson, "Efficient transient simulation of lossy interconnect," in *Proc. 28th ACM/IEEE Design Automation Conf.*, June 1991, pp. 740–745.
- [16] J. Rubinstein, P. Penfield, and M. A. Horowitz, "Signal delay in RC tree networks," *IEEE Trans. Computer-Aided Design*, vol. 2, pp. 202–211, July 1983.
- [17] T. Sakurai, "Approximation of wiring delay in MOSFET LSI," *IEEE J. Solid-State Circuits*, vol. 18, pp. 418–426, Aug. 1983.
- [18] \_\_\_\_\_, "Closed-form expressions for interconnection delay, coupling, and crosstalk in VLSI's," *IEEE Trans. Electron Devices*, vol. 40, pp. 118–124, Jan. 1993.
- [19] M. Sriram and S. M. Kang, "Fast approximation of the transient response of lossy transmission line trees," in *Proc. ACM/IEEE Design Automation Conf.*, June 1993, pp. 691–696.
- [20] Y. Yang and R. Brews, "Overshoot control for two coupled *RLC* interconnect," *IEEE Trans. Comp., Packag., Manufact. Technol.*, vol. 17, no. 3, pp. 418–425, Aug. 1994.
- [21] Q. Yu and E. S. Kuh, "Exact moment matching model of transmission lines and application to interconnect delay estimation," *IEEE Trans. VLSI Syst.*, vol. 3, pp. 311–322, June 1995.
- [22] D. Zhou, S. Su, F. Tsui, D. S. Gao, and J. S. Cong, "A simplified synthesis of transmission lines with a tree structure," *Int. J. Analog Integrated Circuits Signal Process.*, vol. 5, pp. 19–30, Jan. 1994.

# Synthesis of Asynchronous Circuits for Stuck-At and Robust Path Delay Fault Testability

Steven M. Nowick, Niraj K. Jha, and Fu-Chiung Cheng

Abstract—In this paper, we present methods for synthesizing multilevel asynchronous circuits to be both hazard free and completely testable. Making an asynchronous two-level circuit hazard free usually requires the introduction of either redundant or nonprime cubes or both. This adversely affects the circuit's testability. However, using extra inputs, which is seldom necessary, and a synthesis-for-testability method, we convert the two-level circuit into a multilevel circuit that is completely testable. To avoid the addition of extra inputs as much as possible, we introduce new exact minimization algorithms for hazard-free two-level logic where we first minimize the number of redundant cubes and then minimize the number of nonprime cubes. We target both the stuck-at and robust path delay fault models using similar methods. However, the area overhead for the latter may be slightly higher than for the former.

#### I. INTRODUCTION

Achieving complete testability of asynchronous circuits has long been recognized to be a difficult problem since these circuits must be hazard free [15]. Hazard-free synthesis methods frequently introduce redundant or nonprime product terms, resulting in circuits that are not fully testable. Thus, ensuring hazard-free behavior and at the same time achieving complete testability seem to be contradictory requirements. Our aim in this paper, however, is to show that hazard-free, completely testable, asynchronous multilevel circuits can be easily synthesized, in some rare cases requiring some extra control inputs.

To ensure high reliability of a circuit, one must test both its logical and temporal behavior for correctness. Physical defects may increase the propagation delays along different paths, giving rise to *delay faults* [19]. Delay faults can be categorized according to two models: *gate delay faults* and *path delay faults*. The former models excessive delay limited to just one gate, whereas the latter models excessive delays along a whole path from an input to an output. Therefore, the path delay fault model is more comprehensive; however, it may require more time for test generation because the number of paths is usually much larger than the number of gates.

Delay faults are generally tested by two-pattern tests. For path delay faults, these tests launch a  $0 \rightarrow 1$  or a  $1 \rightarrow 0$  transition at the input of the path to see if the desired transition reaches the output of the path within the specified time. A two-pattern test is called *robust* if arbitrary delays elsewhere in the circuit cannot invalidate it [19]. A robust test can be further categorized into a *hazard-free* or *nonhazard-free* test. For a hazard-free robust test, no hazards can occur on the tested path irrespective of the delay values elsewhere in the circuit. This is the most stringent fault model. Hazard-free robust path delay fault testability of a circuit also implies testability under other fault models, such as stuck open [17]. Since it is known that robust testability of general circuits is usually quite

Manuscript received February 8, 1996. This work was supported in part by AT&T under a Special Purpose Grant Award, by the National Science Foundation under Grant MIP-9308810, and by an Alfred P. Sloan Research Fellowship. This paper was recommended by Associate Editor S. Reddy.

Publisher Item Identifier S 0278-0070(97)09369-X.

S. M. Nowick and F.-C. Cheng are with the Department of Computer Science, Columbia University, New York, NY 10027 USA.

N. K. Jha is with the Department of Electrical Engineering, Princeton University, Princeton, NJ 08544 USA.