Posts Tagged ‘FinFet’

Is 2D Scaling Dead? - Other Considerations

Sunday, July 11th, 2010

(Part 4 in the series Which Direction for EDA? 2D, 3D, or 360?)

In the last 2 posts in this series, I examined the lithography and transistor design issues that will need to be solved in order to save 2D scaling as we know it. In this post I will look at several other considerations.

For the moment, let’s assume that we are able to address the lithography and transistor design issues that I’ve identified in the previous posts. TSMC recently announced it will take delivery of an EUV lithography machine, so let’s assume they are successful in making the move to the 13.5 nm wavelength. IBM, TSMC, and Intel are already using multi-gate FETs in their most advanced process development, and the ITRS predicts they will be standard for the 32nm node, so let’s assume that will work out as well. If so, are we home free?

Not so fast!

There are still numerous technical challenges and one big economic one. First the technical:

Process variability refers to the fact that circuit performance can vary based upon variability in the wafer processing. For instance, let’s say we are printing 2 overlapping rectangles on a die. Due to normal process variability, those rectangles can vary from the ideal in size (smaller or larger), can be shifted (north, south, east, west), or can be offset from each other. Thicknesses of processing layers have variability as well. The amount of doping can vary. Physical steps such as CMP (Chemical Mechanical Polishing) can introduce variability. These variabilities tend to be fixed amounts, so at large process nodes they don’t make much difference. But as geometries shrink, they become significant. If we just take the old approach of choosing a 3-sigma range to define best case and worst case processing corners, the performance at the smaller, more variable nodes may not be much greater than at the larger, less variable nodes.

This process variability introduces performance variability, and not always in predictable ways. For instance, if two related parameters vary equally based on oxide thickness, and all we care about is the ratio of these parameters, then the variation may cancel out. But if they vary in opposite directions, the effect may be worsened. Careful design and layout can make process variations cancel out with little net effect, but this takes enormous effort and planning, and even then you cannot account for all variation. Rather, we just have to live with the fact that process variation could cause ±20%, 30%, or even 50% performance variation.
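
Here’s a toy Monte Carlo sketch in Python (with a made-up 5% variation figure, purely to illustrate the point): when two matched parameters track the same oxide-thickness shift, their ratio barely moves, but when they vary independently the ratio spreads.

```python
# Toy Monte Carlo illustrating correlated vs. independent parameter variation.
# The 5% sigma is an assumption for illustration, not real process data.
import random
import statistics

N = 10_000
SIGMA = 0.05   # assumed 5% (1-sigma) parameter variation

ratios_tracking, ratios_independent = [], []
for _ in range(N):
    shared = random.gauss(0.0, SIGMA)        # common oxide-thickness shift
    # Both parameters see the same shift -> their ratio is unchanged.
    a = 1.0 * (1 + shared)
    b = 2.0 * (1 + shared)
    ratios_tracking.append(a / b)
    # Each parameter varies on its own -> the ratio spreads.
    a = 1.0 * (1 + random.gauss(0.0, SIGMA))
    b = 2.0 * (1 + random.gauss(0.0, SIGMA))
    ratios_independent.append(a / b)

print("ratio sigma, tracking:   ", statistics.stdev(ratios_tracking))
print("ratio sigma, independent:", statistics.stdev(ratios_independent))
```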

There are some methods to account for this variation in digital designs, the most mainstream being statistical static timing analysis (SSTA). SSTA recognizes that process variation results in a distribution curve of circuit performance. Instead of drawing hard 3-sigma limits on that curve to define processing “corners”, as is done with traditional STA, SSTA lets designers understand how yield trades off against performance. For instance, if the designer wants to stick with 3-sigma bounds to achieve 90% yield, then he may need to accept 500 MHz performance. However, if he wants to be more aggressive on timing, he may be able to achieve 600 MHz by accepting a lower 75% yield for parts that fall within a smaller 2-sigma range. SSTA helps designers make these choices.
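
To make the tradeoff concrete, here’s a rough sketch of that yield-versus-performance calculation, assuming (purely for illustration) that Fmax across process variation is normally distributed with a 600 MHz mean and 40 MHz sigma; the numbers are mine, not output from any real SSTA run.

```python
# Rough sketch of an SSTA-style yield/performance tradeoff with assumed numbers.
from statistics import NormalDist

fmax = NormalDist(mu=600.0, sigma=40.0)   # MHz, assumed Fmax distribution

def yield_at(f_target_mhz: float) -> float:
    """Fraction of parts whose Fmax meets or exceeds the target frequency."""
    return 1.0 - fmax.cdf(f_target_mhz)

def speed_at_yield(target_yield: float) -> float:
    """Fastest frequency target that still hits the yield goal."""
    return fmax.inv_cdf(1.0 - target_yield)

for f in (500, 550, 600, 650):
    print(f"{f} MHz target -> {yield_at(f):5.1%} parametric yield")
print(f"90% yield supports about {speed_at_yield(0.90):.0f} MHz")
```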

But SSTA is not a silver bullet. Process variability can affect hold times to the point where violations are very difficult to fix. Analog and mixed-signal circuits are much more susceptible to process variability, since there are many more performance parameters that designers care about. Companies like Solido are trying to attack this specific process variability issue, but the cost in time and analysis (e.g. Monte Carlo simulation) is large. And process variability can just plain break a chip altogether. This will only get worse as dimensions shrink.
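
For the analog case, a Monte Carlo yield estimate boils down to something like the sketch below. The offset and gain specs are invented, and a real flow samples SPICE simulations rather than closed-form models, which is exactly why it gets so expensive.

```python
# Bare-bones Monte Carlo spec-yield estimate with invented specs and models.
import random

N = 20_000
passing = 0
for _ in range(N):
    offset_mv = random.gauss(0.0, 2.0)     # assumed 2 mV (1-sigma) input offset
    gain_db   = random.gauss(60.0, 1.5)    # assumed 60 dB nominal gain
    if abs(offset_mv) <= 5.0 and gain_db >= 57.0:   # hypothetical spec limits
        passing += 1

print(f"estimated yield: {passing / N:.1%} over {N} Monte Carlo samples")
```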

Yield is the first cousin to process variability. As discussed in the preceding section, there is a direct tradeoff between performance and yield due to process variability. And as process complexity increases and design margins shrink, yield surely will suffer. There’s a real question whether we’ll be able to yield the larger chips that we’ll be able to design.
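
The classic Poisson defect-yield model, Y = exp(-A x D0), shows why bigger die are harder to yield; the defect density below is an assumed number chosen only to make the trend visible.

```python
# Poisson defect-yield model with an assumed defect density, for illustration.
import math

D0 = 0.25   # defects per cm^2 (assumed)

for area_cm2 in (0.5, 1.0, 2.0, 4.0):
    y = math.exp(-area_cm2 * D0)
    print(f"die area {area_cm2:.1f} cm^2 -> predicted yield {y:.1%}")
```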

Crosstalk and signal integrity issues are exacerbated at smaller nodes and are more difficult to address. According to a physical design manager I spoke with recently, the problem is that edge rates are faster and wires are closer together, so crosstalk-induced delay is greater. Fixing these issues involves spreading wires or using a lower routing utilization, which defeats some of the benefit of the smaller node. And that is if you can even identify the aggressor nets, of which there may be several. It’s not uncommon for days to weeks to be spent fixing these issues at 45nm, so how long will it take at 22nm or lower?
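
A back-of-the-envelope way to see the effect: a victim net’s effective load depends on how its coupling capacitance gets “Millered” by the aggressor’s switching direction. All the element values below are invented for illustration.

```python
# Crosstalk-induced delay sketch using a one-pole RC model and Miller factors.
# All element values are invented for illustration.
R_DRIVER = 1_000.0      # ohms, assumed victim driver resistance
C_GROUND = 50e-15       # F, assumed victim capacitance to ground
C_COUPLE = 30e-15       # F, assumed coupling capacitance to one aggressor

def victim_delay(miller_factor: float) -> float:
    """One-pole RC delay estimate; 0.69*R*C is the usual 50% crossing point."""
    c_eff = C_GROUND + miller_factor * C_COUPLE
    return 0.69 * R_DRIVER * c_eff

print(f"aggressor quiet (k=1):                 {victim_delay(1.0)*1e12:5.1f} ps")
print(f"aggressor switching opposite (k=2):    {victim_delay(2.0)*1e12:5.1f} ps")
print(f"aggressor switching with victim (k=0): {victim_delay(0.0)*1e12:5.1f} ps")
```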

Process variability and signal integrity are just 2 of the more prominent technical issues we’re hitting. In fact, pretty much everything gets more difficult. Consider clock tree synthesis for a large chip needing low skew and complex power gating. Or verifying such a large design (which merits its own series of posts). What about EDA tool capacity? And how are we going to manage the hundreds of people and hundreds of thousands of files associated with an effort like this? And let’s not forget the embedded software that runs on the embedded processors on these chips. A chip at these lower nodes will be a full system and will require a new approach. Are we ready?

And believe it or not, we’re even limited by the speed of light! A 10 Gbps SerDes lane runs at 100ps per bit, or the time it takes light to travel 3cm, a little over an inch. Even if we can process at faster and faster speeds on chip, can we communicate this data between chips at this rate, or does Einstein say “slow down”?
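
The arithmetic is easy to check:

```python
# Quick check of the bit-time and distance figures above.
C = 299_792_458          # m/s, speed of light in vacuum

bit_time_s = 1 / 10e9    # 10 Gbps -> 100 ps per bit
distance_m = C * bit_time_s

print(f"bit time: {bit_time_s*1e12:.0f} ps")
print(f"light travels {distance_m*100:.1f} cm (~{distance_m/0.0254:.2f} in) per bit")
```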

Enough of the technical issues; let’s talk economics.

Cost is, and always has been, the biggest non-technical threat to 2D scaling. Gordon Moore considered his observation to be primarily economic, not technological. In the end, it’s not about how much we can build, but how much we can afford to build. There are several aspects of cost, so let’s look at each.

Cost of fabrication is the most often quoted and best understood. Although precise predictions vary, it’s clear that all the breakthroughs required in lithography, transistor design, and other areas will not come cheaply. Nor will the facilities and manufacturing equipment necessary to implement these breakthroughs. $5B is not an unreasonable estimate to construct and equip a 22nm fab. When it costs $5B to ante up just to get into the game, we’re going to see some semiconductor companies fold their hands. We’re already seeing consolidation and collaboration in semiconductor fabrication (e.g. Common Platform, GlobalFoundries) and this will increase. Bernard Meyerson even spoke of a concept he called radical collaboration, in which competitors collaborate on and share the cost of the expensive basic science and R&D required to develop these new foundries and processes. We’re going to need to be creative.

Cost of design is also becoming a challenge. Larger chips mean larger chip design projects. Although I’ve not seen any hard data to back this up, I’ve seen $100M mentioned as the cost to develop a current state-of-the-art SoC. Assuming most of that cost is labor, that’s equivalent to over 200 engineer-years! What will this be in 5 years? Obviously, a small startup cannot raise this much money to place a single bet on the roulette wheel, and larger companies will only be willing to place the safest bets with this type of investment. They will have to be high-margin, high-volume applications, and how many of those applications will exist?
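
Spelling out that engineer-year arithmetic (the fully loaded cost per engineer-year is my assumption; plug in your own number and the headcount scales accordingly):

```python
# Back-of-the-envelope engineer-year estimate for a $100M SoC project.
PROJECT_COST = 100e6            # dollars, the $100M SoC figure from the post
COST_PER_ENGINEER_YEAR = 500e3  # dollars, assumed fully loaded cost

engineer_years = PROJECT_COST / COST_PER_ENGINEER_YEAR
print(f"{engineer_years:.0f} engineer-years")   # 200 at these assumptions
```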

In the end, this all boils down to financial risk. Will semiconductor manufacturers be willing to bet that they can generate enough revenue to cover the cost of a $5B+ fab? Will semiconductor companies be willing to bet that they can generate enough revenue to cover the cost of a $100M+ SoC? For that matter, will there be many applications that generate $100M in revenue altogether? For more and more players, the answer will be “no”.

Despite all these increasing chip costs, it is important to take a step up and consider the costs at the system level. Although it may be true that a 32nm 100M gate chip is more expensive than a 90nm 10M gate chip, the total system costs are certainly reduced due to the higher level of integration. Maybe 5 chips become 1 chip with higher performance and lower power. That reduces packaging and product design costs. Perhaps other peripherals can now be incorporated that were previously separate. This will of course depend on each individual application; however, the point is that we should not stay myopically focused on the chip when we are ultimately designing systems. System performance is the new metric, not chip performance.

In the next blog post in this series, I’ll finish up the discussion on 2D scaling by looking at the alternatives and by making some predictions.

harry the ASIC guy

Is 2D Scaling Dead? Looking at Transistor Design

Wednesday, June 23rd, 2010

(Part 3 in the series Which Direction for EDA? 2D, 3D, or 360?)

Replica of the First Transistor

In the last blog post, I started to examine the question “is 2D scaling really dead or just mostly dead?” I looked at the most challenging issue for 2D scaling, lithography. But even if we can somehow draw the device patterns on the wafer at smaller and smaller geometries, that does not necessarily mean that the circuits will deliver the performance (speed, area, power) improvements that Moore’s Law has delivered in the past. Indeed, as transistors get smaller (gate length and width), their gate oxide also gets thinner. There are limits to the improvements we can gain in power and speed. We’ll talk about those next.

Transistor Design

First, consider what has made 2D scaling effective to date. The move to smaller geometries has allowed us to produce transistors that have shorter channels, operate at lower supply voltages, and switch less current. The shorter channel results in lower gate capacitance and higher drive, which means faster devices. And the lower supply voltage and lower current result in lower dynamic power. All good.

At the same time, these shorter channels have higher sub-threshold and source-drain leakage currents, and the thinner gate oxide results in greater gate leakage. At the start of Moore’s Law, leakage was small, so exponential increases were not a big deal. But at current and future geometries, leakage power is on par with dynamic power and will soon exceed it. And we care more today about static power, due to the proliferation of portable devices that spend most of their time in standby mode.
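
A rough sketch of why sub-threshold leakage blows up as thresholds come down: it grows roughly as exp(-Vth / (n x kT/q)). The nominal current and slope factor below are illustrative assumptions only.

```python
# Sub-threshold leakage trend vs. threshold voltage, with assumed constants.
import math

KT_Q = 0.0259      # thermal voltage at ~300 K, volts
N_FACTOR = 1.5     # assumed sub-threshold slope factor
I0 = 1e-6          # A, assumed current scale at Vth = 0 (illustrative only)

def i_off(vth: float) -> float:
    return I0 * math.exp(-vth / (N_FACTOR * KT_Q))

for vth in (0.5, 0.4, 0.3, 0.2):
    print(f"Vth = {vth:.1f} V -> leakage ~ {i_off(vth):.2e} A per device")
```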

The reduction in dynamic power is also reaching a limit. Most of the dynamic power reduction of the last decade was due to voltage scaling. For instance, scaling from 3.3V to 1.0V reduces power by 10x alone. But reductions below 0.8V are problematic due to the inherent drop across a transistor and device threshold voltages. Noise margins are eroding fast, and that will cause new problems.
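
The voltage-scaling math, spelled out: dynamic power goes as CV²f, so at fixed capacitance and frequency the savings track V². The C and f values below are placeholders; only the ratios matter.

```python
# Dynamic power scaling with supply voltage, P = C * V^2 * f.
C = 1e-9     # F, placeholder switched capacitance
F = 100e6    # Hz, placeholder switching frequency

def dynamic_power(vdd: float) -> float:
    return C * vdd**2 * F

p_33, p_10, p_08 = (dynamic_power(v) for v in (3.3, 1.0, 0.8))
print(f"3.3 V -> 1.0 V saves {p_33 / p_10:.1f}x")   # ~10.9x
print(f"1.0 V -> 0.8 V saves {p_10 / p_08:.2f}x")   # only ~1.56x more
```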

Still, as with lithography, we haven’t thrown in the towel yet.

Strained Silicon is a technique that has been in use since the 90nm and 65nm nodes. It involves stretching apart the silicon atoms to improve electron mobility, yielding devices that are up to 35% faster at lower power consumption.

Hi-k dielectrics (k being the dielectric constant of the gate oxide) can reduce leakage current. The silicon dioxide is replaced with a material with a larger dielectric constant, such as hafnium dioxide, thereby reducing leakage for an equivalent capacitance. This technique is often paired with another modification: replacing the polysilicon gate with a lower-resistance metal gate, which increases speed. Together, the use of hi-k dielectrics with metal gates is often referred to by the acronym HKMG and is common at 45nm and beyond.
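
A quick sketch of the equivalent-oxide-thickness (EOT) idea: a physically thicker hi-k film can match the capacitance of a much thinner SiO2 layer, and that extra physical thickness is what suppresses tunneling leakage. The k value below is a typical textbook figure for an HfO2-based stack, not a foundry number.

```python
# Equivalent oxide thickness: the SiO2 thickness with the same C per area.
K_SIO2 = 3.9
K_HIK = 25.0          # assumed relative permittivity for an HfO2-based film

def eot_nm(physical_thickness_nm: float, k: float) -> float:
    return physical_thickness_nm * K_SIO2 / k

t_hik = 3.0           # nm of hi-k film, assumed
print(f"{t_hik:.1f} nm of hi-k behaves like {eot_nm(t_hik, K_HIK):.2f} nm of SiO2")
```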

A set of techniques commonly referred to as FinFET or Multi-gate FET (MuGFET) breaks the gate of a single transistor into several gates in a single device. How? Basically by flipping the transistor on its side. The net effect is a reduction in effective channel width and device threshold with the same leakage current; i.e. faster devices with lower dynamic power and the same leakage power. But this technique is not a simple “tweak”; it’s a fundamental change in the way we build devices. To quote Bernard Meyerson of IBM, “to go away from a planar device and deal with a non-planar one introduces unimaginable complexities.” Don’t expect this to be easy or cheap.

Multigate FET - Trigate

A more mainstream technology that has been around for a while, Silicon-on-Insulator (SOI), is also an attractive option for very high-performance ICs such as those found in game consoles. In SOI ICs, a thick layer of an insulator (usually silicon dioxide) lies below the devices instead of silicon as in normal bulk CMOS. This reduces device capacitance and results in a speed-power improvement of 2x-4x, although with more expensive processing and a slightly more complex design process. You can find a ton of good information at the SOI Consortium website.

In summary, we are running into a brick wall for transistor design. Although there are new design techniques that can get us over the wall, none of them are easy and all of them are expensive. And the new materials used in these processes create new kinds of defects, hence reducing yield. With some work, the techniques above may get us to 16nm or maybe a little bit further. Beyond that, they’re talking about graphene transistors and carbon nanotubes, pretty far out stuff.

In my next post, I’ll look at some of the other considerations regarding 2D scaling, not the least of which is the extraordinary cost.

harry the ASIC guy