vol 3 issue 2

Farewell to flatland

3 June 2008
By Chris Edwards

3D chips

For 40 years, chipmakers have done what they can to reduce the area of their electronic circuits. But they are getting to the limits of what they can do. It seems that the only way is up.

Moore's law, as we know it, is living on borrowed time. Technologies that have made it possible to pack more than a billion transistors onto one centimetre-square piece of silicon are reaching the point where they cannot scale much further.

The memory that makes up the cache in today's microprocessors is one example. Some believe that the static random-access memory (SRAM) cannot shrink any further. The problem is that the margins are now so slim that small changes in transistor behaviour can stop a proportion of the SRAM cells from working. Most solutions to the variability incur overhead: they make the cell bigger. One option is to go to a more reliable eight-transistor design, but that will be a third bigger than today's cells. That is not good news when you are trying to reduce the cell's size with each process generation.

To deal with the problem at the 45nm node, companies such as Intel and TSMC made the cell wider. Historically, SRAM cells were more or less square. Now they are very fat and squat. It turns out that, if you can make the transistors wide, albeit very short, they work more reliably. But reshaping the cell only gets you so far and the size of the cell is important.

SRAM makes up about half the area of a high-end microprocessor. And, in some cases, it accounts for a lot more. Stop shrinking the SRAM and you pretty much stop shrinking the processor. Does it mean Moore's Law is getting close to its end?

There is a direction the industry could take.

"I think Moore's Law was a little vague in its original formulation," says Ian Phillips, principal staff engineer at ARM. The alternative to making everything take up less area is to start building up - move the electronic circuit into the third dimension. "It is likely that we will see 3D integration and other types of integration in the future," he adds.

"It is an evolution that is going to happen," agrees Mike Shapiro, chief engineer for 3D technologies at IBM and publicity chair for the International Interconnect Technology Conference (IITC).

The chips are... up?

In one area, 3D is already here and shipping in high volume. If you have a flash-based MP3 player with more than a couple of gigabytes of storage in it, the chances are that it uses memory chips arranged in a stack inside.

"Today, there is essentially 3D on the market - with wire-bonded stacks," says Shapiro.

The move to stacking complete chips on top of each other started in the cellphone market. Pressure to reduce the amount of printed-circuit board (PCB) space led memory makers to stop supplying each memory device individually and to put them on top of each other instead. They could combine different types of memory in the same plastic package and sell it as one unit. One common offering was to put non-volatile flash together with dynamic random access memory (DRAM). The flash would hold the software code and user data while the phone was switched off. When it booted, much of the software would be copied into the DRAM and run from there because the volatile memory offered faster accesses.

Since then, memory makers have stacked more and more flash devices on top of each other to cope with the demand for media storage. The Apple iPhone, for example, uses four chips stacked on top of each other to be able to provide space for 8GB of music and video. The technique has proved so successful that the iPhone uses quite a few different stacked packages. Even the baseband processor that handles the phone calls sits in a stack.

One factor that has led to the rapid take-up of stacked chips in phones and media players is that it is relatively simple to do. The equipment used to stack the chips and link them together is based on the same kit that has been putting silicon dice into plastic packages for the last 40 years.

Another advantage is that there is no need to redesign the chips themselves to cope with stacking. Manufacturers use the same parts on their own or in stacks. This has provided the flash makers with much more flexibility than they had in the past to supply a broad range of memory densities for different products. The only clue you get as to whether a flash part is a stack, unless you check, is the price. If an 8GB part is about twice the price of a 4GB device, the chances are that it is a stack. If that price drops suddenly, it marks a shift to a single-die product due, in most cases, to the manufacturer's move to the next process node.

"I think the trend of using wirebond stacks will continue for devices with lower I/O counts," Shapiro predicts.

The reason for wirebond stacks being restricted to devices that do not need many I/O connections between chips is that you only have the perimeter of the chip available for making the connections. There is no way to get inside the sandwich using conventional techniques. Luckily, memory devices do not need large I/O counts. Processors and complex system-on-chip (SoC) devices, on the other hand, do. They need to be able to hook up to a variety of different types of memory, other processors and all the analogue I/O chips. Pretty soon, you run out of space on the perimeter for all those connections.

What you can do is put the processor at the bottom of the stack so that you can cover its entire surface with I/O pads: this is the essence of the flip-chip package. These pads connect to the PCB. You can then stack other devices on top, such as memories, using wirebonds, just as long as they do not need wide buses. Memory technologies such as Rambus' XDR allow you to obtain high data rates with narrow buses. But, ultimately, for stacked chips to proceed much further, you need another way of making connections through the stack.

Through-silicon via

The through-silicon via effectively turns the die into one layer of a PCB. By forming contacts through the silicon die on which the transistors and other circuits sit, you can provide direct connections to chips sitting underneath. Through-silicon vias are not new: power transistors have been making use of the technique for years to provide much better isolation than is possible when you put all of the contacts on the same surface. However, the holes created for those devices are much bigger than those planned for interconnecting complex digital devices. Even so, it is a technique that companies are beginning to employ.

One example is a family of image sensors that Toshiba launched in the autumn. As with stacked memories, a big target market is the cellphone. For the new modules, Toshiba stacked the image sensor on top of a processing chip to produce a smaller camera module. To form the contacts between the two chips, the company drilled vias through the image sensor.

Because you can distribute through-silicon vias across the entire surface of the die, you can have very wide buses running between chips in a stack. "You are looking at higher bandwidth connections between chips," says Shapiro.
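
A rough count shows why connections spread across the die surface allow much wider buses than connections confined to the perimeter. The sketch below is illustrative only: the die size and the pitch are placeholder assumptions, not figures from any manufacturer, and the function names are invented for this example.

```python
def perimeter_connections(die_side_um: float, pitch_um: float) -> int:
    """Connection sites in a single row around the edge of a square die."""
    per_side = int(die_side_um // pitch_um)
    if per_side < 2:
        # Zero or one site fits along a side, so there are no corners to share.
        return per_side * per_side
    # Each corner site belongs to two sides, so remove the four duplicates.
    return 4 * per_side - 4


def area_connections(die_side_um: float, pitch_um: float) -> int:
    """Connection sites if the whole surface of a square die can be used."""
    per_side = int(die_side_um // pitch_um)
    return per_side * per_side


# Hypothetical 10mm x 10mm die with an assumed 100um pitch for both cases.
DIE_SIDE_UM = 10_000
PITCH_UM = 100
print("perimeter only:", perimeter_connections(DIE_SIDE_UM, PITCH_UM))
print("whole surface: ", area_connections(DIE_SIDE_UM, PITCH_UM))
```

With these placeholder numbers the perimeter offers 396 sites against 10,000 across the surface. The perimeter count grows linearly with die size while the surface count grows with its square, which is the point Shapiro is making about bandwidth.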

The big problem with making through-silicon vias is not creating the holes, but filling them with conducting material (see box, ‘Digging holes’). Issues with the filling processes place a minimum size on the holes themselves. The holes will interfere with the layout and reduce chip density. In deep-submicron processes, the hole for a through-silicon via is potentially as big as a bond pad for a conventionally wire-bonded chip. But it's in the middle of the die.

According to Scott Pozder, a researcher with Freescale Semiconductor, the minimum practical diameter for a through-silicon via for bonded wafers is around 1µm, and likely to be somewhat larger, with a pitch of around 3µm. These vias are way larger than those that link transistors together on the surface of a regular planar die. On a 65nm process, they are no bigger than 100nm on the densest layers, although they do get bigger as you move up the metal stack.

And manufacturers have the issue of deciding whether to stack at the wafer or the chip level. The wafer option looks simpler and cheaper at first glance. But you run the risk of putting good chips on top of duds. Every wafer will contain, with any luck, only a small proportion of failed chips. Stacking wafers on top of each other means you multiply the chances of building stacks that do not work. Even at 90 per cent yield per wafer, a four-deep stack will mean you end up with a third of your stacks being duds.
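
The arithmetic behind that figure is simple compounding. The sketch below assumes that failures on each wafer are independent and that every layer has the same yield; both are simplifications, and the function name is invented for this example.

```python
def stack_yield(wafer_yield: float, layers: int) -> float:
    """Fraction of stacks in which the chip on every layer is good,
    assuming failures on each wafer are independent of the others."""
    if not 0.0 <= wafer_yield <= 1.0:
        raise ValueError("wafer_yield must be between 0 and 1")
    if layers < 1:
        raise ValueError("layers must be at least 1")
    return wafer_yield ** layers


good = stack_yield(0.90, 4)
print(f"good stacks: {good:.1%}, duds: {1 - good:.1%}")
# prints "good stacks: 65.6%, duds: 34.4%"
```

A stack only works if all four chips in it work, so the per-wafer yield is multiplied by itself once per layer, which is how 90 per cent per wafer turns into roughly a third of stacks failing.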

Hidden costs

Making the stacks at the chip level means you can first test all of your chips and then stack them. But it is potentially more time-consuming and, therefore, expensive because you have many more manufacturing operations to perform.

One option is to have the vias shaped so that they assist in a self-assembly process. Potentially, you can modify the surface of each chip so that they only lock together in the right orientation and in the right place.

To improve density, if you can make the layers ultrathin, you can bring the width of the via down to the size you see in the metal stack of a regular planar chip. This was the approach taken by Anna Topol's group at IBM a couple of years ago. The company has developed a number of sophisticated techniques that make it possible to build layers on separate wafers, then take the micron-thick films and place them on top of each other to form a complete 3D chip (see boxout, ‘Captured on film’).

Topol claimed at the International Electron Device Meeting in the winter of 2005 that they had made 3D circuits with vias as small as 140nm across. With these structures, it was possible to place sub-circuits on top of each other.

Cost is a significant factor in this type of process. As layer transfer is in its infancy, it is hard to gauge how expensive it could be. But, as it calls for two wafers to begin with, it seems likely that the materials cost would be at least double that of a conventional planar structure. That cost would be partially offset by the density improvement of stacking sub-circuits on top of each other.

There is an alternative, Samsung has found. Instead of taking a ready-made silicon substrate in the form of a wafer and going through the process of thinning it down, placing it on a separate ‘handle’ wafer, attaching it to another wafer and then forming the connections, Samsung engineers decided to grow the silicon surface in place.

The first steps are just like any other silicon process. In the S3 process, the engineers form a regular array of transistors and an initial layer of interconnect to wire them together. They polish the surface flat, as with any other modern semiconductor process. But then, instead of putting on another layer of metal interconnect, they grow a layer of silicon several micrometres thick. They then form a second layer of transistors, cut holes in the surface to wire up the lower layer and then continue putting on the metal layers that will form the complete array of circuits.

Samsung has built both experimental SRAM and flash memory devices using S3. Although it is potentially cheaper than any of the other techniques, growing layers of transistors in situ is not entirely straightforward. You have the problem that the processes used to form transistors involve high temperatures - interconnect layers suffer badly at those temperatures.

The high temperatures are used to anneal the silicon surface after it has been damaged by the processes needed to bury dopant atoms deep beneath the surface of the silicon. One option is to use lasers to cook only parts of the second silicon substrate so that the layers underneath are not affected. But this is more expensive than today's method of just putting the wafer in an oven and cooking it.

The other option, favoured by Samsung, is to use lower-temperature processes to form the transistors. There is a potential downside in that lower temperatures may not lead to optimally performing transistors. However, bulk memories do not generally need high-performance transistors.

In the short term, anyone planning to go into the third dimension for anything other than memories will have to contend with a design gap. "Eventually, we will go to 3D LSI [large-scale integration]. But there are no sufficient tools to simulate for 3D LSI," says Oh-Hyun Kwon, president of Samsung's system large-scale integration (LSI) division.

Although 3D integration still has its problems, the time is coming when the only way the semiconductor industry can go is up.

Expectations

A new direction

Samsung is keen on 3D integration. As the world's biggest memory maker, it has good reason to invest in the area. Memory offers fertile ground for experimentation: memory chips are less likely to suffer from the side-effects of stacking active silicon layers on top of each other, and there is already a market for simple 3D memory structures.

Oh-Hyun Kwon, president of Samsung's system large-scale integration (LSI) division, says: "If 3D is possible, we can put heterogeneous materials on a single chip. With 3D fusion, we can have lab-on-chip and sensor-on-chip products. If we can do that there will be another wave of the semiconductor industry."

At ISSCC in 2002, Hwang Chang-gyu, president of the electronics division, claimed that flash memories would surpass Moore's Law: doubling density every 12 months. Every year since, the president has turned up at a press conference brandishing chips and wafers to illustrate his eponymous law.

The issue is that, because flash production lines are following, more or less, the same evolutionary path as other memories and logic processes, growth in core circuit density is falling behind. Transistor scaling can only provide a density boost of around 25 per cent per year. Samsung's engineers need to find another 75 per cent.
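
The 75 per cent figure treats the two contributions as additive. If the gains are instead assumed to multiply, which is an assumption of this sketch rather than something stated here, the shortfall against a doubling every 12 months comes out slightly smaller. The function name is invented for this example.

```python
def extra_gain_needed(target_factor: float, scaling_factor: float) -> float:
    """Multiplicative density gain still required from other techniques
    once transistor scaling has made its contribution."""
    if scaling_factor <= 0:
        raise ValueError("scaling_factor must be positive")
    return target_factor / scaling_factor


# Target: density doubles each year (2.0x). Scaling alone: about 25% (1.25x).
gap = extra_gain_needed(target_factor=2.0, scaling_factor=1.25)
print(f"additional gain needed: {gap:.2f}x")
# prints "additional gain needed: 1.60x"
```

Either way the conclusion is the same: most of each year's density increase has to come from somewhere other than shrinking the transistors.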

Some of the increase in memory density has come from the ability to store more bits in each memory cell. Commercial NAND flash devices can now store 2bit per cell and companies are working on 16-level cells that can hold 4bit of data.
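
The link between levels and bits is a power of two: each extra bit doubles the number of distinct charge levels a cell must hold and the read circuitry must tell apart. A minimal sketch, with a function name invented for this example:

```python
def levels_needed(bits_per_cell: int) -> int:
    """Distinct charge levels a cell must distinguish to store this many bits."""
    if bits_per_cell < 1:
        raise ValueError("bits_per_cell must be at least 1")
    return 2 ** bits_per_cell


for bits in (1, 2, 4):
    print(f"{bits}bit per cell -> {levels_needed(bits)} levels")
```

Going from 2bit to 4bit per cell doubles the capacity but quadruples the number of levels, from four to 16.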

Going into the third dimension offers a further boost. At a recent ISSCC, the company took the wraps off an experimental 4Gb flash memory made using its S3 technology on a 56nm process that has double the memory density of the planar devices now in mass production.

Slow process

Digging holes

How do you punch a hole through a silicon wafer? One option is a laser drill. It's fast and it's clean. Unfortunately, if you have a lot of holes to drill, the process slows down dramatically. Some companies have tried to speed up the operation by splitting the laser light into many beams that can cut through the wafer in parallel. This is a technique that Matsushita used to punch holes in the sapphire wafers used for an experimental high-voltage power transistor (E&T, Vol 3 #1). Sapphire is a tough, practically inert substance that resists chemical attack. Silicon is another matter, which is why many teams favour chemical etching to produce through-silicon vias.

Reactive-ion etching, or dry etching, has proved startlingly successful. It can bore deep holes through a silicon wafer with very little spreading at the top of the hole. Unfortunately, it is not so easy to fill the hole up with a conductor. When the 130nm process was introduced, via failure caused by a lack of metal in the contact hole was one of the most likely sources of chip failure. The response was to double-up on vias wherever possible - using statistics to fix a manufacturing problem on the basis that two neighbouring vias were unlikely to fail.

Although dry etching can produce holes with extremely high aspect ratios, problems with filling those holes with conducting materials mean that the wafer has to be thinned.

Pozder's team concluded that a silicon wafer could be thinned to around 20µm. A 20:1 aspect ratio is well within the reach of today's etching and filling technologies. Potentially, silicon-on-insulator (SOI) technology could go further, perhaps thinning the wafer down to a delicate 5µm or so. But SOI is, in itself, an expensive technology that relies on layer transfer and wafer bonding techniques. A 3D chip made from SOI layers would not be a cheap option.
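
The 20:1 figure follows from a 20µm wafer and the 1µm minimum via diameter quoted in the main article. The sketch below works through that ratio and, purely as an illustration, what the same ratio would imply for a 5µm SOI layer. It assumes the via runs straight through the full thickness of the thinned wafer, and the function names are invented for this example.

```python
def aspect_ratio(wafer_thickness_um: float, via_diameter_um: float) -> float:
    """Depth-to-width ratio of a via that passes through the whole wafer."""
    if via_diameter_um <= 0:
        raise ValueError("via_diameter_um must be positive")
    return wafer_thickness_um / via_diameter_um


def min_via_diameter(wafer_thickness_um: float, max_aspect_ratio: float) -> float:
    """Narrowest via that stays within a given aspect-ratio limit."""
    if max_aspect_ratio <= 0:
        raise ValueError("max_aspect_ratio must be positive")
    return wafer_thickness_um / max_aspect_ratio


print(aspect_ratio(20, 1))        # 20um wafer, 1um via
print(min_via_diameter(5, 20))    # 5um layer at the same 20:1 limit
```

At the same 20:1 limit, a 5µm layer would in principle admit vias a quarter of a micron across, which is why thinner layers are attractive for density despite their cost.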

Speed and capacity - do you have the best of both worlds

Captured on film - IBM's approach to building 3D chips through layer transfer involved putting the two types of transistor needed for complementary metal-oxide-semiconductor (CMOS) circuits on top of each other. The approach, naturally, more or less halves the area taken up by the circuit. It could also help with the different processing techniques needed for the two types of transistor.

Anna Topol's group at IBM did not attempt to optimise the individual processes for the n-channel and p-channel layers. But, in other work, IBM worked on a technique where the silicon substrate was modified for the n-channel and p-channel areas individually. It turns out that n-channel transistors benefit from being aligned along one face of a silicon crystal. P-channel transistors get better hole mobility from a different orientation.

Similarly, the metal gates that Intel reintroduced on its 45nm process call for different materials for the n- and p-channel transistors (E&T, Vol 3 #1).

These different crystal orientations and metals could be more easily accommodated by building the transistors separately and then sliding one layer onto the other at a much later stage in the process.

Hot stuff - There is a big problem with heat that affects all 3D chips - dissipating the heat from fast-switching transistors. There is a danger that a high-speed 3D chip will simply cook itself to death.

This is a problem that concerns Shekhar Borkar, director of microprocessor research at Intel. The coming generations of semiconductor manufacturing processes will give Intel the ability to deploy thousands of processors on one die - without having to resort to 3D structures. The problem is feeding that beast with data.

Borkar said he reckons multicore processors of the future will have complex cache hierarchies that will see large quantities of memory pulled into the same package as the main processor chip. "Buses will have to be wider and wider to deliver the necessary bandwidth. Systems will demand hundreds of gigabytes of data per second of memory bandwidth," he claimed.

Even with future low-power transceivers, such bandwidths would demand 25W of power from the I/O pins alone, he says. That is not power the chip can afford as simply running half the cores at full pelt would probably push the chip beyond its practical maximum of around 150W.

One option, says Borkar, is to bring the processor and memory chips closer together by stacking them and running vias through them or putting them side-by-side and running direct connections from edge to edge. The reduction in wire inductance would bring a substantial power saving. Through-silicon vias would provide the necessary bandwidth.

But any design has to take account of the heat generated by the processor. The memory chips could not sit on top of the processor underneath the heat sink because they would have to carry almost all of the heat generated by the processor on its way to the heat sink. And DRAMs do not cope well with heat.

The only option that Borkar says he has found so far is to have the memory sit underneath the processor and drill vias through the DRAM so that connections can run from the system to the processor. This would reduce the amount of space the DRAM has for the memory array, but there is no obvious alternative for dealing with the heat.
