Monday, June 04, 2007

AMD's Griffin Flies to the Fore

In 2003, AMD bet the company on a server microprocessor, the K8. This turned out to be a wise move, as servers were a key weak point in Intel's product line-up. The K8 even worked fairly well as a desktop MPU, where the performance could justify the relatively large thermal envelope. Unfortunately, the K8 did nothing to challenge Intel's dominance of the notebook market. In fact, certain technical decisions which benefited the K8 in servers and desktops hindered its adoption in notebooks. AMD notebooks have met with modest success in retail channels, through stores such as Best Buy or Circuit City. Higher-end corporate notebooks are mostly Intel Inside due to Intel's power and platform advantages, although Toshiba recently announced that it would use AMD processors in some mid-range business notebooks.

At the Microprocessor Forum, AMD Fellow Maurice Steinman presented Griffin, an MPU specifically adapted for the notebook market, due out in 2008. Griffin is in many ways a compromise between the improvements required to address the notebook market and AMD's available resources (which are mostly being invested in Barcelona). One of the first decisions was to use the older K8 core, largely unchanged, with 2x1MB L2 caches. The microcode has been updated (which adds virtualization support and other enhancements), but the microarchitecture is relatively untouched, with none of the innovations from Barcelona. Not all of the improvements in Barcelona are suitable for a low power MPU anyway; while improving branch prediction is always a win, it's not clear that a shared L3 cache is particularly useful for notebooks (especially from a die size perspective). Instead, the improvements for Griffin are largely focused on lower level circuit optimizations and the northbridge, which are both very high leverage points.

From the circuit side, the biggest change is in the power and clock distribution. The K8 had a single PLL and a single voltage plane for the cores and northbridge. As a result, the cores and memory controller had to be placed into a sleep state in tandem; if either one was active, the other one was as well. Many integrated graphics solutions rely on an external frame buffer that is kept in system memory and accessed at least every 1/60th of a second (and likely much more often depending on the screen refresh algorithm). Hence, the older K8 cores could not transition into sleep states, because they would be periodically woken up by the frame buffer and memory controller activity.

Like the K8, Griffin has one PLL for the cores and the northbridge, which runs at the maximum frequency. However, each core has a digital frequency synthesizer, which runs off the PLL and produces a local clock using digital dividers and pulse dropping. This local clock can change without relocking the PLL – saving time in frequency transitions. Each core will have 8 operating frequencies, and the two cores can run independently of one another. To complement this improvement, Griffin also has 3 major voltage planes (in addition to three for analog and I/O): one for each core and one for the northbridge, so that the voltages can be modified in conjunction with the frequency for the best power savings. The downside of the additional voltage planes is that the system will require additional voltage regulation modules on each board; while this increases expenses for manufacturers, it is an improvement that is certainly worth the cost.
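To make the divider-plus-pulse-dropping idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not AMD's actual circuit: the function names, the 2000 MHz PLL figure, and the evenly spread drop pattern are all hypothetical. The point is only that a local clock can be derived from a fixed PLL by dividing it and then suppressing some pulses, so the PLL never has to relock.

```python
from fractions import Fraction


def pulse_drop_mask(keep: int, out_of: int) -> list[bool]:
    """Return a repeating pattern of `out_of` slots that passes `keep` pulses.

    True means the pulse is passed, False means it is dropped. The kept
    pulses are spread evenly with a simple accumulator (an assumption made
    for illustration; the source does not describe the real pattern).
    """
    if not 0 < keep <= out_of:
        raise ValueError("need 0 < keep <= out_of")
    mask = []
    acc = 0
    for _ in range(out_of):
        acc += keep
        if acc >= out_of:
            acc -= out_of
            mask.append(True)   # pass this pulse
        else:
            mask.append(False)  # drop this pulse
    return mask


def effective_mhz(pll_mhz: int, divider: int, keep: int, out_of: int) -> Fraction:
    """Average local clock rate after dividing the PLL and dropping pulses."""
    return Fraction(pll_mhz, divider) * Fraction(keep, out_of)


if __name__ == "__main__":
    # Hypothetical numbers: a 2000 MHz PLL, divided by 2, keeping 3 of 4 pulses.
    print(pulse_drop_mask(3, 4))
    print(effective_mhz(2000, 2, 3, 4))
```

Changing `divider`, `keep`, or `out_of` changes the local clock immediately, which is why this kind of scheme can switch frequencies faster than retuning a PLL.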

The key point of these enhancements is that the power states for the cores are now independent of the northbridge for all integrated graphics configurations. Unlike the K8, notebook chipsets for Griffin with integrated graphics will not need external frame buffers to achieve power savings in the processor cores. This will reduce the overall cost and power consumption of the platform and addresses a problem with the previous generation.

A Northbridge too Far

The northbridge was another area that AMD's architects chose to focus on to save power. The memory controller now runs at memory frequencies rather than core clock rates, which generally lowers power without a large impact on performance. Like Barcelona, Griffin features two independent 64-bit DDR2 memory controllers, complete with a page access predictor, which has been slightly modified to err on the side of power efficiency. The memory controller also includes a prefetcher and write buffer, both of which have again been tuned specifically for mobile workloads. One substantial change is that the drive strength for the DDR2 I/O pins has been reduced, which saves power, although certain configurations, such as 4 SO-DIMM slots, are no longer supported. This is a very reasonable trade-off, as most notebooks simply don’t need the same configurations that a desktop or server would require.

Griffin uses HyperTransport 3.0 to communicate with the chipset and the rest of the world. The additional bandwidth that HyperTransport 3.0 offers will improve performance for integrated graphics, especially on Windows Vista/DirectX 10 and other stressful workloads. HT3 also introduces some new power saving features that are controlled by the hardware. The HT link itself is 16 bits wide in each direction; however, the link can change width depending on the bandwidth that is actually needed. For example, when a user is working on an application with low I/O (disk and graphics) bandwidth, like a spreadsheet or word processor, each HT3 link could narrow itself to 1, 2, 4 or 8 bits wide (and each direction is managed independently). The HT3 link can even be disconnected and certain components powered down when not in use, for further power saving. The hardware also offers several options that trade wake-up latency against power. For instance, the clock recovery circuitry and delay locked loops on the receiver can be turned off, which increases wake-up latency but saves more power. Conversely, training sequences can be sent periodically across a ‘sleeping’ link to reduce wake-up latency.
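The width selection described above can be sketched as a simple policy: pick the narrowest link width that still covers the bandwidth currently needed, separately for each direction. The Python below is only an illustration. The function names are hypothetical, the 5.2 Gbit/s per-lane rate is an assumed figure for a top-speed HT3 link, and the real hardware policy is not described in the presentation.

```python
# Link widths mentioned in the text, in bits, narrowest first.
HT_WIDTHS = (1, 2, 4, 8, 16)


def pick_width(demand_gbit_s: float, gbit_s_per_lane: float = 5.2) -> int:
    """Return the narrowest width whose raw bandwidth covers the demand.

    `gbit_s_per_lane` is an assumption for illustration, not a Griffin spec.
    If even the full link cannot cover the demand, the widest width is used.
    """
    for width in HT_WIDTHS:
        if width * gbit_s_per_lane >= demand_gbit_s:
            return width
    return HT_WIDTHS[-1]


def pick_link_widths(tx_gbit_s: float, rx_gbit_s: float) -> tuple[int, int]:
    """Each direction is sized independently, as the text describes."""
    return pick_width(tx_gbit_s), pick_width(rx_gbit_s)


if __name__ == "__main__":
    # Hypothetical load: light traffic out to the chipset, heavier traffic back.
    print(pick_link_widths(3.0, 30.0))
```

A real controller would presumably add hysteresis so the link does not flap between widths on bursty traffic; that detail is left out here to keep the sketch short.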

Thermal management was another area where Griffin improved upon the K8. Each core integrates two thermal sensors, and there is an embedded thermal controller in the MPU itself. When OEM-defined limits are exceeded, the embedded thermal controller can reduce the processor's frequency and voltage. The embedded thermal controller can also interface with an external monitor for the memory system. The embedded controller can then throttle memory to keep energy consumption below pre-specified levels for the next 128 cycles. The previous generation K8 had a single analog sensor and no integrated controller, which meant that the management capabilities varied from system to system.
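As a rough illustration of what such an embedded controller might do, the sketch below steps a core to a slower operating point when its hottest sensor exceeds the OEM limit and steps it back up once it has cooled. Everything specific here is an assumption: the function name, the numbering of operating points from 0 (fastest) to 7 (slowest), and the 5 degree hysteresis band are hypothetical and are not taken from AMD's presentation.

```python
def next_operating_point(
    current: int,
    sensor_temps_c: list[float],
    limit_c: float,
    hysteresis_c: float = 5.0,  # assumed margin to avoid oscillating
    slowest: int = 7,           # assumed: 8 operating points, 0 is fastest
) -> int:
    """Return the operating point the core should move to next.

    The hottest of the core's sensors decides: above the limit the core
    steps one point slower, comfortably below it the core steps one point
    faster, and inside the hysteresis band it stays where it is.
    """
    hottest = max(sensor_temps_c)
    if hottest > limit_c:
        return min(current + 1, slowest)
    if hottest < limit_c - hysteresis_c:
        return max(current - 1, 0)
    return current


if __name__ == "__main__":
    # Hypothetical readings from the two sensors on one core, 95 C limit.
    print(next_operating_point(0, [96.0, 90.0], limit_c=95.0))
```

Because the limit is a parameter, the same logic behaves consistently across systems once the OEM picks a value, which is the advantage the text attributes to having the controller on the MPU.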


Figure 2 – AMD Puma Platform

AMD’s mobile platform, known as Puma, also includes chipsets, both from AMD itself and from partner companies such as NVIDIA or VIA. AMD’s offering in this area uses the RS780M northbridge and the SB700 southbridge. The former connects to the MPU with HT3 and adds PCI-E Gen2 for discrete graphics and other peripherals. The northbridge also contains two display controllers, one for HDMI with HDCP and the other for older displays (TV out is also included). One neat feature the RS780M offers is switching between integrated and discrete graphics on the fly. High-end notebooks can use discrete graphics when plugged into AC power for gaming performance, but switch over to integrated graphics for better power consumption when using the battery. The southbridge offers support for NAND flash, serial and parallel ATA, HD audio and legacy PCI. It appears that it will be up to OEMs to integrate third-party Wi-Fi, and possibly WiMAX or 3G chipsets.

Conclusion

The enhancements in Griffin and Puma will make AMD’s microprocessors substantially more attractive for mobile applications. Most of the issues with the previous generation have been fixed, which should produce much more consistent power consumption across different AMD-based notebooks. Whether Griffin will be competitive with Penryn is somewhat uncertain. Certainly from a performance standpoint, Griffin will lag behind Intel’s offerings. However, the situation for power and battery life, which is probably more important, is much less clear. In some areas, AMD’s power management is more sophisticated than what Intel currently offers, particularly the separate voltage planes for each core. However, the K8 microarchitecture is not as highly tuned for performance/watt as Intel’s competing microarchitecture. Although Griffin cannot be expected to dominate the mobile world, it is clearly a good first step along the road for AMD, and first silicon has already booted Windows.


