Engineer at work on a Tektronix system used to validate DDR4 designs for Cadence

DDR4 memory bus standard: layout at speed

A new memory bus standard allows faster data rates, but designers will need to make trade-offs.

When it comes to raw data throughput, the memory bus standard DDR4 represents a major jump in performance over its predecessor. From DDR3 to DDR4, the data rate rises from 2133Mbit/s to 3200Mbit/s, with an extension allowing rates up to 4266Mbit/s. At the same time, the supply voltage drops from 1.5V to 1.2V, and the picture is complicated by a change in termination that effectively trades power consumption against signal quality. If the designer chooses high-resistance termination, which will reduce power consumption, there is a greater chance that the receiver will encounter problems.
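To put those figures in perspective, the time available to sample each bit (the unit interval) is simply the reciprocal of the data rate. A quick sketch using the rates quoted above:

```python
# Sketch of how the sampling window shrinks as DDR data rates rise.
# The unit interval (UI) is the time available to capture one bit.

def unit_interval_ps(data_rate_mbps: float) -> float:
    """Return the unit interval in picoseconds for a data rate in Mbit/s."""
    # 1e12 ps per second / (rate * 1e6 bit/s) = 1e6 / rate
    return 1e6 / data_rate_mbps

for name, rate in [("DDR3-2133", 2133), ("DDR4-3200", 3200), ("DDR4 ext.", 4266)]:
    print(f"{name}: {unit_interval_ps(rate):.1f} ps per bit")
```

At 3200Mbit/s the window is just 312.5ps, and every picosecond of skew or jitter the layout introduces eats directly into it.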

Hemant Shah, senior group director of the custom IC and PCB group at Cadence Design Systems, says: “With standards such as DDR3 and DDR4, designing the memory interface is a challenge because of the set of constraints that need to be followed. Signals need to arrive within a very small window of time, which makes it hard to meet all those constraints.”

Nikola Kontic, business development manager at Zuken UK, adds: “Timing needs to be matched but you also have to match impedances. You have to think about your stack-up and impedance control to take account of the layers you are routeing on.”

However, the Jedec committee responsible for the standard did not simply turn up the clock speed and turn down the voltage to reach DDR4 and expect the PCB designer to do the rest – it has made changes over time that should allow boards to be put together cost-effectively, says Ben Jordan, senior product marketing manager at Altium.

Earlier versions of DDR used a tree topology from the controller to the memory chips that turned out to be difficult to implement in practice. Recent versions of DDR moved to the ‘fly by’ topology, where memories are placed along a bidirectional bus and chips write data to that bus at specific intervals – as their time slot ‘flies by’ – set by the memory controller. Although this deals with the signal-integrity issues of the tree topology, consistent timing is essential: the scheme tolerates comparatively little skew across the signals on the bus.

The controller can compensate for some of the skew between signals by delaying its clock signals, based on a calibration phase performed at startup. Similar calibration helps determine the voltage thresholds needed to distinguish zeroes from ones. A more capable controller can decode data more reliably with relatively weak termination, which tends to save power.
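The calibration step can be pictured as a delay line with a finite tap resolution: the controller programs the nearest tap for each lane, and the layout must absorb whatever residual error remains. The model below is my own toy illustration (the tap size and measured skews are invented, not from any Jedec specification):

```python
# Toy model of startup skew calibration: the controller measures per-lane
# skew and programs the nearest available delay tap; the residual error
# is what the PCB layout and timing margin must absorb.

TAP_PS = 10.0  # assumed delay-line resolution in picoseconds (hypothetical)

def calibrate(skews_ps):
    """Return (tap setting, residual skew in ps) for each lane."""
    results = []
    for skew in skews_ps:
        taps = round(skew / TAP_PS)       # nearest programmable delay
        residual = skew - taps * TAP_PS   # left over after compensation
        results.append((taps, residual))
    return results

measured = [23.0, -41.0, 7.0, 55.0]       # hypothetical per-lane skews (ps)
for lane, (taps, res) in enumerate(calibrate(measured)):
    print(f"lane {lane}: {taps:+d} taps, residual {res:+.1f} ps")
```

A finer tap resolution shrinks the residuals, which is exactly why, as noted below, a controller with fine-grain tuning relaxes the layout tolerances.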

“The effectiveness is really controller dependent. A cheap one will mean tight layout tolerances. A good controller with very fine-grain tuning will probably result in an easier PCB design,” notes Nitin Bhagwath, technical marketing engineer at Mentor Graphics.

DDR4 signals can be affected by transitions across layers of a PCB through vias, particularly by the short stubs formed by the extension of the via plating through to the other side of the PCB when the signal is carried on through an inner layer. Such stubs can be difficult to avoid when routeing into the dense pattern of solder bumps on the bottom of a ball-grid array (BGA).
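One standard back-of-envelope check for a via stub is the quarter-wave approximation: the stub acts as a resonator that notches the channel at roughly f = c / (4·L·√εr). The numbers below are my own illustrative assumptions; the notch for short stubs sits well above the DDR4 fundamental, but a stub on a thick board still degrades edges through reflections and added capacitance:

```python
# Back-of-envelope estimate of a via stub's first resonant (notch)
# frequency using the quarter-wave approximation: f = c / (4 * L * sqrt(Er)).

C_MM_PER_S = 3.0e11   # speed of light in mm/s
ER_EFF = 4.0          # assumed effective dielectric constant of FR-4

def stub_notch_ghz(stub_len_mm: float) -> float:
    """Approximate first notch frequency of a via stub, in GHz."""
    return C_MM_PER_S / (4 * stub_len_mm * ER_EFF ** 0.5) / 1e9

for length in (1.5, 3.0, 4.5):   # hypothetical stub lengths on thick boards
    print(f"{length} mm stub -> notch near {stub_notch_ghz(length):.1f} GHz")
```

The longer the stub, the lower the notch, which is why drilling out or avoiding stubs matters most on thick, high-layer-count boards.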

Kontic says: “Some prefer or err on the side of routeing on the outer layers and avoiding layer changes because they can cause issues with signal integrity, through capacitance and losses in the vias.”

However, routeing only on the outer layers might make the board impossible to route. Drilling out the stub or using microvias removes the stub effect but tends to make the PCB more expensive.

“People are predicting that by 2016, DDR4 will be the cheapest memory of all of them. We can’t save money on the chips and then have an expensive circuit board: that’s not going to fly,” says Jordan.

In server, industrial and telecom designs, programmable logic is likely to play a large role, and this provides another way in which the silicon can be tuned for the PCB design. Devices such as FPGAs make it possible to alter which I/O connections the on-chip memory transceivers use.

Kontic says: “FPGA engineers get more feedback from PCB design these days. They have some flexibility in being able to swap pins to another bank so they can assist the PCB designer to achieve better length control and get shorter lengths.”

A number of today’s advanced PCB tools provide constraint-management and DDR layout assistance ‘wizard’ tools that automate many of the tedious tasks needed to match trace lengths, adding serpentine routes and jogs to balance the flight times of signals within the DDR byte lanes. Jordan points out that it’s important to take account of the full interconnect length, including passive components such as the resistor termination, but that designers should avoid over-constraining the design.
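The arithmetic behind those wizards is straightforward: convert each routed length to a flight time, find the slowest net, and add serpentine only where the mismatch exceeds the tolerance. A minimal sketch, with an assumed FR-4 stripline propagation delay and invented lengths:

```python
# Sketch of flight-time matching: convert routed lengths to delays and
# report how much serpentine each net needs to match the slowest one.

PS_PER_MM = 6.7   # assumed stripline propagation delay in FR-4, ps/mm

def serpentine_needed(lengths_mm, tol_ps=10.0):
    """Return extra length (mm) to add per net to stay within tolerance."""
    delays = [length * PS_PER_MM for length in lengths_mm]
    target = max(delays)
    extras = []
    for delay in delays:
        gap = target - delay
        extras.append(0.0 if gap <= tol_ps else gap / PS_PER_MM)
    return extras

byte_lane = [51.2, 48.9, 52.0, 50.1]   # hypothetical routed lengths (mm)
for net, extra in zip(byte_lane, serpentine_needed(byte_lane)):
    print(f"{net:.1f} mm -> add {extra:.2f} mm of serpentine")
```

Loosening `tol_ps` is exactly the point Jordan makes below: matching to a realistic window, not to picoseconds, avoids wasted serpentine and board area.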

“Many people overkill on the first design. They use too many layers, perhaps exotic ones, and spend too long because they are trying to get a delay mismatch down to picoseconds when it may only need to be within a few nanoseconds,” Jordan adds.

Bhagwath says: “Experience is usually great when what you are doing next is very similar to what you did before. When you don’t know what’s coming up, that’s where simulation comes in.”

Simulation for performance and signal integrity allows more accurate design trade-offs and supports optimising the silicon alongside the board design, potentially cutting system cost, says Heiko Dudek, field engineering group director at Cadence: “You can explore ideas with form factor and different PCB stack-ups and play with different trade-offs.”

With access to the silicon and chip package design, it is possible to go further, by playing with different configurations of the DDR4 physical-layer controller layout and shape, says Dudek, as that can help minimise parasitic effects and compensate for the use of cheaper packages, such as those based on lead frames rather than flip-chip technology, which may be needed to compete on price.

Bhagwath adds: “A key force that’s acting is time. You are trying to balance time with whether it’s going to work or not. You don’t want the hardware to come back with 50 bugs. That’s where simulation comes in. It mitigates that problem. But one of the key things about simulation is that it’s got to be done quickly, so it’s important for it to be integrated. If it takes you a month to get it set up, it’s not going to help with time.”
