In a perfect world, fabrication of silicon ICs would be a
perfectly predictable process. Not only would every chip be
absolutely identical, but there would be no variations from wafer to
wafer, or lot to lot. In such a paradise, all chips would meet their
predicted design parameters. They would all run at the designers'
intended speed, no faster and no slower. All would meet their timing
specifications. There would be no clock skew, no IR-drop surprises,
and happiest of all, no need whatsoever for pessimistic design
approaches.
But we don't live in that perfect world. Trains and planes don't
run on time. New cars almost never get the mileage claimed by their
makers. And silicon fabrication processes vary, sometimes wildly,
and in ways that are maddeningly unpredictable. Circuits can vary
from predicted physical values in a number of ways, ultimately
affecting the transistors themselves, the wires that interconnect
them, or both.
Designers have faced the variability of fabrication processes
since day one and, by various means, have managed to work around it,
primarily through static timing analysis. But a new generation
of static timing analysis is upon us, one that uses statistical
techniques to overcome the issues inherent in traditional static
techniques. In this report, we'll look at where static analysis has
been and where it must go to cope with the complexities of nanometer
silicon technologies.
CORNERING COMPLEXITY
Traditional static timing analysis (STA) is, and has been, the method that
virtually every digital design team uses to achieve timing signoff.
In STA, you must have a timing model for every cell in your design
(or at least the signal paths you care about). The analyzer uses
those timing models to calculate delays for all cells in a given
path. Those delays, in turn, are summed to determine total path
delays.
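As a rough illustration of the summation described above, here is a minimal Python sketch. The cell names, delay values, and setup time are made-up assumptions for the example, not data from any real library or tool.

```python
# Hypothetical cell delays in nanoseconds, standing in for what a
# timing model would report for each cell (values are invented).
CELL_DELAY_NS = {"ff_clk_to_q": 0.12, "nand2": 0.05, "inv": 0.03, "wire": 0.04}

def path_delay(cells, delays=CELL_DELAY_NS):
    """Sum the delay of every cell along one signal path."""
    return sum(delays[c] for c in cells)

def slack(cells, clock_period_ns, setup_ns=0.08):
    """Time left over in the clock period; positive slack meets timing."""
    return clock_period_ns - setup_ns - path_delay(cells)

path = ["ff_clk_to_q", "nand2", "inv", "nand2", "wire"]
total = path_delay(path)
margin = slack(path, clock_period_ns=2.0)  # a 2.0-ns period is 500 MHz

print(round(total, 2))   # 0.29
print(round(margin, 2))  # 1.63
```

A real analyzer looks up delays from tables indexed by input slew and output load rather than using one fixed number per cell, but the bookkeeping along a path is the same.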
Process variability comes into play here. As process geometries
shrink, variability in silicon or, more precisely, the ability to
account for it becomes the priority in maintaining the designers'
intended performance.
"If you look at SiO2 (silicon-dioxide) thicknesses,
for example, we're talking about 14 atoms or so in today's high-end
processes," says Leon Stok, director of the electronic design
business for IBM's Systems and Technology group. "If you're off by
one or two atoms, you're suddenly off by 10% or 20%. Before, this
wasn't an issue. We think we're seeing the limits of some of the
physical phenomena we tend to deal with."
You may intend for your design to run at, say, 500 MHz. But with
the various process variability factors involved, even if we assume
that all of the chips are functional, not all of them will run at
your target speed. Some may run at 400 MHz, some at 450 MHz, and
some even at 550 MHz.
This is why "corner-based" analysis has been the mainstay for
many years. The essence of corner-based analysis is to determine the
best and worst cases for various parameters, such as ambient
temperature, supply voltage, and many others. Each combination of
these extreme values is referred to as a "corner."
While corner-based analysis continues to be indispensable now and
into the foreseeable future, it does have several disadvantages. For
one thing, it's slow. At nanometer geometries, the number of corners
is exploding. At larger geometries, designers could get away with
analyzing worst-case parameters for just a handful of corners.
Today, designers find themselves analyzing 64 or more corners over a
full range of process variation. That translates into a huge runtime
burden.
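To see how quickly the corner count grows, consider this small Python sketch. The six parameters and their extreme values are illustrative assumptions. The point is only that every added two-valued parameter doubles the number of full timing runs.

```python
from itertools import product

# Illustrative assumption: six parameters, each taken at a low and a
# high extreme. Names and values are hypothetical.
PARAMS = {
    "process_nmos": ("slow", "fast"),
    "process_pmos": ("slow", "fast"),
    "voltage": (0.9, 1.1),
    "temperature_c": (-40, 125),
    "metal_r": ("min", "max"),
    "metal_c": ("min", "max"),
}

def enumerate_corners(params):
    """Return every combination of extremes as a dict, one per corner."""
    names = list(params)
    return [dict(zip(names, combo)) for combo in product(*params.values())]

corners = enumerate_corners(PARAMS)
print(len(corners))  # 2**6 = 64
```

Each entry in `corners` would need its own complete timing run, which is where the runtime burden comes from.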
And that's just the inter-die, or die-to-die, variation. There's
also on-chip variation (OCV) to consider. "OCV effectively adds some
pessimism to the design through the analysis to cover a variation
that could happen between, say, the clock routes between devices
that are spread around the chip," says Robert Jones, senior director
of Magma Design Automation's Silicon Signoff Unit.
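One common way to model OCV in a setup check is to apply derating factors: the launch path is made slightly late and the capture clock slightly early. The following Python sketch assumes that scheme. The 5% derates and the delay values are invented for illustration and aren't taken from any vendor's tool.

```python
def setup_slack_with_ocv(launch_clk, data, capture_clk, period, setup,
                         late=1.0, early=1.0):
    """Setup slack in ns, with optional OCV derating factors.

    late  > 1 slows the launch clock and data path (worst case).
    early < 1 speeds up the capture clock path (worst case).
    """
    arrival = (launch_clk + data) * late
    required = period + capture_clk * early - setup
    return required - arrival

# Hypothetical path: 0.50-ns clock branches, 1.20-ns data path, 500 MHz.
nominal = setup_slack_with_ocv(0.50, 1.20, 0.50, 2.0, 0.08)
derated = setup_slack_with_ocv(0.50, 1.20, 0.50, 2.0, 0.08,
                               late=1.05, early=0.95)

print(round(nominal, 3))  # 0.72
print(round(derated, 3))  # 0.61
```

The difference between the two results is the pessimism that the derates add to cover on-chip variation.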
Consequently, between inter-die and intra-die variations,
corner-based analysis is quickly becoming a millstone around
designers' necks. Yes, it's slow and cumbersome. But perhaps even
worse, it compels design from a pessimistic standpoint.
When designers are forced to consider all of the worst-case
corners they've analyzed, they suddenly find out their analysis
predicts that some of their 500-MHz chips may only run at 350 MHz.
Thus, to optimize the yield that will run at 500 MHz, they'll
compensate by overdesigning.
"Corner-based design is perceived as leaving a lot of quality on
the table," says Andrew Kahng, co-founder and CTO of Blaze DFM.
"People are very worried about the return on investment (ROI) of the
next generation of silicon technology. As guardbanding increases,
obviously you're harvesting less of the potential ROI of that
process improvement. At some point, if this isn't better managed,
the ROI will just not be there."
LEVERAGING STATISTICS
Enter statistical static timing analysis (SSTA). While traditional
static timing analysis can supply a worst-case number for delays,
it can't provide a sense of the distribution of performance versus
yield (Fig. 1). Rather than simply determining best- and worst-case
corners and attempting to arrive at a single value for delays,
statistical timing analyzers propagate probability distributions.
Among the inputs to SSTA tools are distributions of parameters,
such as transistor channel lengths. The distribution of values
represents how channel lengths can actually vary based on silicon
data.
Because they consider probability distributions, SSTA tools
accept information about variation and then simultaneously
consider the different probabilities of single transistors being
at different points in that variation space. "It can do an
analysis. It gives you more information about the likelihood of
meeting timing, essentially your parametric yield," says Bill
Mullen, group director of R&D at Synopsys.
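The notion of parametric yield can be made concrete with a small Monte Carlo experiment. This Python sketch assumes a Gaussian channel-length distribution and a linear delay model. Both, along with every number in it, are simplifying assumptions for illustration rather than silicon data.

```python
import random

def sample_path_delay_ns(rng, nominal_ns=1.9, sens_ns_per_nm=0.05,
                         l_nominal_nm=65.0, l_sigma_nm=2.0):
    """Draw one channel length and map it to a path delay.

    Assumes delay grows linearly with channel length (hypothetical model).
    """
    l_nm = rng.gauss(l_nominal_nm, l_sigma_nm)
    return nominal_ns + sens_ns_per_nm * (l_nm - l_nominal_nm)

def parametric_yield(target_mhz, n=20000, seed=1):
    """Fraction of sampled chips whose path delay fits the clock period."""
    rng = random.Random(seed)
    period_ns = 1000.0 / target_mhz
    passing = sum(sample_path_delay_ns(rng) <= period_ns for _ in range(n))
    return passing / n

for mhz in (450, 500, 550):
    print(mhz, "MHz:", round(parametric_yield(mhz), 3))
```

Under these assumptions nearly every chip meets 450 MHz, most meet 500 MHz, and only a minority reach 550 MHz. That spread is the kind of performance-versus-yield picture a single worst-case number can't convey.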
In addition to device variation, there's also interconnect
variation. "SSTA tools can take information about the individual
wires and relate that to the variation in the parameters for the
metal at the different metal layers," says Mullen.
The goal of SSTA is to reduce the sensitivity to variability in
global attributes, such as temperature and voltage. Analysis is
performed on each type of variability to arrive at a probability
density function, or PDF (Fig. 1, again). This function represents a statistical look at
how the device will operate across variations in the underlying
parameter.
Ultimately, an SSTA tool combines these individual PDFs with
those for all of the underlying parameters to achieve an overall
distribution for a given node in the circuit (Fig. 2).
"Statistical timing is nothing but adding probability
distributions and taking the maximum of them to find the new
arrival point of a signal at a gate," says Mustafa Celik, CEO of
Extreme Design Automation. "This is one way of doing SSTA. Another
is instead of propagating distributions, you can propagate
parametrized representations of the arrivals and delays."
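The "add and take the maximum" step can be sketched in a few lines of Python if we assume every arrival and delay is Gaussian. The sum of independent Gaussians is exact. The maximum is approximated here by matching its first two moments, using the well-known Clark formulas. Whether any commercial tool works this way is not stated in this report, and the arrival and delay values are invented.

```python
import math

def phi(x):
    """Standard normal probability density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def add(a, b):
    """Sum of two independent Gaussians, each given as (mean, sigma)."""
    return (a[0] + b[0], math.hypot(a[1], b[1]))

def clark_max(a, b, rho=0.0):
    """Gaussian approximation of max(a, b) by moment matching (Clark)."""
    (m1, s1), (m2, s2) = a, b
    theta = math.sqrt(max(s1 * s1 + s2 * s2 - 2.0 * rho * s1 * s2, 0.0))
    if theta == 0.0:  # the two inputs differ only by a constant offset
        return a if m1 >= m2 else b
    alpha = (m1 - m2) / theta
    mean = m1 * Phi(alpha) + m2 * Phi(-alpha) + theta * phi(alpha)
    second = ((m1 * m1 + s1 * s1) * Phi(alpha)
              + (m2 * m2 + s2 * s2) * Phi(-alpha)
              + (m1 + m2) * theta * phi(alpha))
    return (mean, math.sqrt(max(second - mean * mean, 0.0)))

# Hypothetical arrivals (ns) at a two-input gate, plus the gate's own delay.
arrival_a = (1.00, 0.05)
arrival_b = (1.10, 0.08)
gate = (0.20, 0.02)
out = add(clark_max(arrival_a, arrival_b), gate)
print(out)
```

The `rho` argument stands in for correlation between the two arrivals. The parametrized approach mentioned in the quote carries each arrival as a function of the underlying process parameters instead, so correlation falls out of the shared terms.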
SSTA: WHO AND WHY?
Now that we've defined SSTA, the next questions are who uses it
today and why. There's no doubt that SSTA is a leading-edge
technology. There are certainly designs at 90 nm that can benefit
from the application of SSTA. But many industry experts feel that
SSTA won't see widespread adoption until the 65-nm node is
prevalent, or even until 45 nm gets out of R&D and into
circulation.
"You need SSTA less at different silicon geometries," says Eric
Filseth, VP of product marketing for digital IC design at Cadence.
"At 130 nm, most designs don't vary enough to get huge value out
of statistical methods. Our belief is that you'll probably need it
at 45 nm. It's clear that people can do 65-nm chips without SSTA."
Regardless of the node at which SSTA sees broad adoption, usage
models for it are beginning to take shape. One of the gating
factors toward adoption is availability of process parameters. As
a result, statistical methods have seen their earliest usage from
integrated device manufacturers (IDMs) like IBM and Intel. In such
cases, a single part might dominate an entire fab line. An Intel
or IBM knows that it can sell any microprocessor it makes at some
price. Therefore, it uses bin sorting of parts by speed, and SSTA
is applied in an attempt to slide the distribution of speeds as
much toward the high side as possible.
A fabless semiconductor house might also make use of SSTA, but
it would do so for different reasons. Intel can bin-sort its
Pentium chips, but a fabless house doesn't necessarily have that
luxury. For many fabless houses, either the chip runs at rated
speed with the proper amount of power consumption or it doesn't.
In the latter case, it's deemed a failure and can't meet the
application's needs. But the fabless house still wants to maximize
the number of sellable chips per wafer.
"That's not necessarily the same as pushing the target
frequency as fast as possible," says Blaze DFM's Andrew Kahng,
"because you might have leakage power-limited yield loss. So
statistical design still applies even to those who do not bin or
bin crudely. For example, if a graphics company has a chip that
can't be sold into the mobile space, maybe it can still be sold
into the desktop space. So graphics-chip companies, as well as
processor companies, have that flexibility."
Clearly, the IDMs have a distinct advantage in applying
statistical methods to timing closure. "For example, an IDM has
control of the process and private access to the foundry," says
Kahng. "So the path that the statistics, statistical device
models, model-to-silicon correlation studies, etc., must go
through is at least internal."
For the fabless world, SSTA's adoption will depend on the
availability of process data and tools with the ability to consume
it.
"Most major foundries have long begun forming strategic
partnerships that will have statistics traveling back and forth
before too long," says Kahng. "One thing is that the tools need to
be able to consume the statistics before there's any point to
releasing them."
The transition to SSTA has begun, but it will most likely take
the form of an evolution. Most see traditional STA and SSTA as
complementary.
"People will continue using, wherever possible, deterministic
techniques," says Ravij Maheshwary, senior director of marketing
for signoff and power products in Synopsys' Implementation Group.
While it doesn't currently offer statistical capabilities,
Synopsys intends to evolve its existing timing closure tools—PrimeTime,
PrimeTime SI, and Star-RCXT—in a statistical direction.
"There will be a transitional period where we'll use the
deterministic STA and we'll use statistical analysis to handle the
sensitivity checking," says Magma's Robert Jones. "So we can begin
to eliminate some of that sensitivity and get to designs that are
more reliable, passing silicon yield on every wafer run."
TOOLS BEGIN APPEARING
Suppose you were interested in exploring adoption of statistical
static timing analysis in your flow. Where can you get it? As of
this writing, only one vendor, Extreme Design Automation, markets
a commercially available standalone SSTA tool. One RTL-to-GDSII
tool vendor, Magma Design Automation, offers it in the context of
its implementation flow. And one IDM, IBM Corp., provides access
to SSTA technology through its design-services operations.
Extreme Design Automation refers to its technology as
"variation-aware IC design." According to Extreme's Celik, the
company wants to fill the "design-to-manufacturing gap" with its
XT statistical timing signoff tool (Fig. 3).
Initially, Extreme sees XT as overlapping or coexisting with
Synopsys' PrimeTime.
"In a way, statistical timing will check whether PrimeTime's
analysis is correct or not," says Celik. "It will check whether
corners are valid or find others that PrimeTime may miss. Or, it
can check whether the margins and derating factors used in
PrimeTime are safe or pessimistic."
Extreme's XT is a block-based tool. It also can handle
path-based analysis. The tool can account for correlation in
variations due to reconvergent paths. Perhaps most importantly, XT
includes a patented sensitivity-analysis technology that
calculates the sensitivities of delays, arrivals, slacks, design
slack, and parametric yield with respect to design parameters
(cell sizes, wire sizes, and wire spacing), system parameters (VDD
and temperature), and process parameters.
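To give a feel for what a sensitivity of slack "with respect to" a parameter means, here is a generic finite-difference sketch in Python. It is not Extreme's patented method, whose details aren't described here. The delay model and all parameter names are hypothetical.

```python
def path_delay_ns(p):
    """Hypothetical delay model, for illustration only.

    Delay falls with supply voltage and cell size and rises with
    temperature.
    """
    temp_factor = 1.0 + 0.001 * (p["temp_c"] - 25.0)
    return 1.5 * temp_factor / (p["vdd"] * p["cell_size"] ** 0.5)

def slack_ns(p, period_ns=2.0):
    """Slack against a 2.0-ns (500-MHz) clock period."""
    return period_ns - path_delay_ns(p)

def sensitivities(f, p, rel_step=1e-4):
    """Central finite-difference derivative of f for each parameter."""
    out = {}
    for name, value in p.items():
        h = abs(value) * rel_step or rel_step  # fall back if value is 0
        hi = dict(p, **{name: value + h})
        lo = dict(p, **{name: value - h})
        out[name] = (f(hi) - f(lo)) / (2.0 * h)
    return out

nominal = {"vdd": 1.0, "temp_c": 25.0, "cell_size": 1.0}
sens = sensitivities(slack_ns, nominal)
for name, value in sens.items():
    print(name, round(value, 4))
```

In this toy model, slack improves with higher VDD or a larger cell and degrades slightly with temperature. Ranking parameters by such numbers is what lets a tool decide which cells are worth resizing.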
Using the results of sensitivity analysis, XT performs
optimizations and engineering change orders (ECOs), such as
resizing of cells. The physical information embodied in the ECOs
is fed back into an incremental place-and-route tool. There, the
ECOs are implemented as post-layout optimizations that improve
parametric yield (Fig. 4).
XT also includes a library-characterization module. "Because we
need delay information for delay tables, we characterize and put
that information in a modified .lib format," says Celik.
Once the library is characterized, it can be reused for a
given process node or technology. The libraries are additionally
parametrized, so if a given process matures or drifts in terms of
its characteristics, the library never requires recharacterization.
Information on the process change can be fed to the timer to
compensate.
Celik claims XT's extractor as the industry's first
statistical, or variation-aware, extractor. The pattern-based
extractor parametrizes R and C values with respect to process
parameters. Parametrized extraction makes it straightforward for
the tool to handle manufacturing effects like density-based
thickness variations or spacing-based variations. According to
Extreme, the extractor's mean error is less than 1%, with a sigma
of less than 2%.
Finally, Extreme's XT is built for capacity and runtime. It can
analyze 5 million instances overnight on a single 32-bit CPU. It
also can perform full-chip Monte Carlo simulation to calibrate the
statistical timer. That simulation can be distributed to a farm of
Linux machines.
Extreme's XT tool is the only current option for those
interested in a standalone SSTA tool. However, Magma's customers
have another option—the company's Quartz SSTA, which adds
variation analysis and optimization to the Magma IC-implementation
flow. Quartz SSTA is a path-based tool that offers high accuracy
for complex circuit topologies. The tool performs block-based
optimization, though, bringing an incremental-analysis capability
to the table.
For path-based analysis, Quartz SSTA takes advantage of
sophisticated filtering algorithms that allow it to locate the
most sensitive devices. This boils down to a kind of criticality
analysis, which can be very useful in determining the paths that
will most seriously impact overall circuit delays.
"We've defined a criticality factor that helps us comprehend
both the magnitude (the height of the distribution), where it is
relative to the slack value, as well as the sigma, or the standard
deviation or width of those distribution curves," says Magma's
Robert Jones.
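Magma's exact criticality factor isn't spelled out here, but the idea of weighing both the position and the width of a slack distribution can be illustrated with one plausible stand-in: the probability that a path's slack is negative, assuming the slack is Gaussian. The path names and numbers in this Python sketch are invented.

```python
import math

def prob_negative_slack(mean_ns, sigma_ns):
    """P(slack < 0) for a Gaussian slack with the given mean and sigma."""
    if sigma_ns == 0.0:
        return 1.0 if mean_ns < 0.0 else 0.0
    z = (0.0 - mean_ns) / sigma_ns
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical paths: (mean slack in ns, sigma in ns).
paths = {
    "p1": (0.10, 0.02),
    "p2": (0.15, 0.10),
    "p3": (0.05, 0.02),
}

ranked = sorted(paths, key=lambda n: prob_negative_slack(*paths[n]),
                reverse=True)
print(ranked)  # ['p2', 'p3', 'p1']
```

Note that p2 has the largest mean slack yet ranks as the most critical because its distribution is so wide. A deterministic analysis that looked only at nominal slack would rank it last.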
The tool also brings flexibility in terms of library
characterization. Users can employ their own pre-characterized
libraries and develop derating factors that allow them to begin
analysis without slogging through heavy-duty characterization.