
Tuesday, April 27, 2010

AMD's Six-Core Phenom II X6 1090T & 1055T Reviewed

by Anand Lal Shimpi on 4/27/2010 12:26:00 AM

Posted in CPUs, AMD, Phenom II X6

AMD's Phenom II X6 is here based on the brand new Thuban core. Boasting Turbo Core support and nearly 1 billion transistors, AMD is willing to sell you six of its finest cores for under $300. The price drops to under $200 if you're willing to deal with a 2.8GHz clock speed.
AMD is continuing its strategy of selling you more cores than Intel at the same price. With the Phenom II X6, AMD is going after Intel's Lynnfield CPUs - primarily the Core i5 750 and Core i7 860. In our tests we found that the Phenom II X6 excels (as expected) at heavily threaded applications, while lightly threaded apps or mixed workloads generally favor Intel's quad-core chips. It's the expected outcome we've been seeing for the past few months here: if you need lots of threads below $300, go AMD; otherwise, go Intel.
Read on to get the full story on AMD's Phenom II X6.


Page 1

A very smart man once told me that absolute performance doesn’t matter, it’s performance at a given price point that makes a product successful. While AMD hasn’t held the absolute performance crown for several years now, that doesn’t mean the company’s products haven’t been successful.
During the days of the original Phenom, AMD started the trend of offering more cores than Intel at a given price point. Intel had the Core 2 Duo, AMD responded with the triple core Phenom X3. As AMD’s products got more competitive, the more-for-less approach didn’t change. Today AMD will sell you three or four cores for the price of two from Intel.
In some situations, this works to AMD’s benefit. The Athlon II X3 and X4 deliver better performance in highly threaded applications than the Intel alternatives. While Intel has better performance per clock, you can’t argue with more cores/threads for applications that can use them.
When Intel announced its first 6-core desktop processor, the Core i7 980X at $999, we knew a cheaper AMD alternative was coming. Today we get that alternative: the Phenom II X6, based on AMD’s new Thuban core.
It’s still a 45nm chip but thanks to architecture and process tweaks, the new Phenom II X6 still fits in the same power envelope as last year’s Phenom II X4 processors: 125W.
CPU Specification Comparison

| Processor | Clock Speed | Max Turbo | L2 Cache | L3 Cache | TDP | Price |
| --- | --- | --- | --- | --- | --- | --- |
| AMD Phenom II X6 1090T | 3.2GHz | 3.6GHz | 3MB | 6MB | 125W | $285 |
| AMD Phenom II X6 1055T | 2.8GHz | 3.3GHz | 3MB | 6MB | 125W | $199 |
| AMD Phenom II X4 965 BE | 3.4GHz | N/A | 2MB | 6MB | 125W/140W | $185 |
| AMD Phenom II X4 955 BE | 3.2GHz | N/A | 2MB | 6MB | 125W | $165 |
| AMD Phenom II X4 945 | 3.0GHz | N/A | 2MB | 6MB | 95W | $155 |
| AMD Phenom II X4 925 | 2.8GHz | N/A | 2MB | 6MB | 95W | $145 |

You also don’t give up much clock speed. The fastest Phenom II X6 runs at 3.2GHz, just 200MHz shy of the fastest X4.
When Intel added two cores to Nehalem it also increased the L3 cache of the chip by 50%. The Phenom II X6 does no such thing. The 6 cores have to share the same 6MB L3 cache as the quad-core version.

The Phenom II X6 die. Monolithic, hexa-core
There’s also the issue of memory bandwidth. Intel’s Core i7 980X is paired with a triple channel DDR3 memory controller, more than enough for four cores under normal use and enough for a six core beast. In order to maintain backwards compatibility, the Phenom II X6 is still limited to the same dual channel memory controller as its quad-core predecessor.
CPU Specification Comparison

| CPU | Codename | Manufacturing Process | Cores | Transistor Count | Die Size |
| --- | --- | --- | --- | --- | --- |
| AMD Phenom II X6 1090T | Thuban | 45nm | 6 | 904M | 346mm2 |
| AMD Phenom II X4 965 | Deneb | 45nm | 4 | 758M | 258mm2 |
| Intel Core i7 980X | Gulftown | 32nm | 6 | 1.17B | 240mm2 |
| Intel Core i7 975 | Bloomfield | 45nm | 4 | 731M | 263mm2 |
| Intel Core i7 870 | Lynnfield | 45nm | 4 | 774M | 296mm2 |
| Intel Core i5 670 | Clarkdale | 32nm | 2 | 384M | 81mm2 |

The limitations are nitpicks in the grand scheme of things. While the 980X retails for $999, AMD’s most expensive 6-core processor will only set you back $285 and you can use them in all existing AM2+ and AM3 motherboards with a BIOS update. You're getting nearly 1 billion transistors for $200 - $300. Like I said earlier, it’s not about absolute performance, but performance at a given price point.
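To put rough numbers on that, here is a minimal Python sketch that divides the launch prices quoted above by core count. Price per core is a crude yardstick, assumed here purely for illustration (it says nothing about how fast each core is), and the `chips` table and `price_per_core` helper are hypothetical names rather than anything from AMD or Intel.

```python
# Launch prices (USD) and core counts as quoted in this article.
chips = {
    "Phenom II X6 1090T": (285, 6),
    "Phenom II X6 1055T": (199, 6),
    "Core i7 980X": (999, 6),
}


def price_per_core(price_usd, cores):
    """Return the price paid per physical core, in dollars."""
    return price_usd / cores


for name, (price, cores) in chips.items():
    print(f"{name}: ${price_per_core(price, cores):.2f} per core")
```

By this measure the 980X costs roughly three and a half times as much per core as the 1090T, which is the gap the rest of the review tries to weigh against actual performance.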
AMD 2010 Roadmap

| CPU | Clock Speed | Max Turbo (<= 3 cores) | L3 Cache | TDP | Release |
| --- | --- | --- | --- | --- | --- |
| AMD Phenom II X6 1090T | 3.2GHz | 3.6GHz | 6MB | 125W | Q2 |
| AMD Phenom II X6 1075T | 3.0GHz | 3.5GHz | 6MB | 125W | Q3 |
| AMD Phenom II X6 1055T | 2.8GHz | 3.3GHz | 6MB | 125W/95W | Q2 |
| AMD Phenom II X6 1035T | 2.6GHz | 3.1GHz | 6MB | 95W | Q2 |
| AMD Phenom II X4 960T | 3.0GHz | 3.4GHz | 6MB | 95W | Q2 |

We'll soon see more flavors of the Phenom II X6 as well as a quad-core derivative with 2 of these cores disabled. As a result, motherboard manufacturers are already talking about Phenom II X4 to X6 unlocking tools.
The new Phenom II X6 processors are aimed squarely at Intel’s 45nm Lynnfield CPUs. With both lines built on a 45nm process, AMD simply offers you more cores for roughly the same price. Instead of a quad-core Core i7 860, AMD will sell you a six-core 1090T. Oh, and the T stands for AMD’s Turbo Core technology.


Page 2

AMD’s Turbo: It Works

In the Pentium 4 days Intel quickly discovered that there was a ceiling in terms of how much heat you could realistically dissipate in a standard desktop PC without resorting to more exotic cooling methods. Prior to the Pentium 4, desktop PCs saw generally rising TDPs for both CPUs and GPUs with little regard to maximum power consumption. It wasn’t until we started hitting physical limits of power consumption and heat dissipation that Intel (and AMD) imposed some limits.
High end desktop CPUs now spend their days bumping up against 125 - 140W limits, while mainstream CPUs are down at 65W. Mobile CPUs are generally below 35W. These TDP limits become a problem as you scale up clock speed or core count.
In homogeneous multicore CPUs you’ve got a number of identical processor cores that together have to share the maximum TDP of the processor. If a single hypothetical 4GHz processor core hits 125W, then to fit two of them into the same TDP you have to run the cores at a lower clock speed, say 3.6GHz. Want a quad-core version? Drop the clock speed again. Six cores? Now you’re probably down to 3.2GHz.
[Figure: Single Core, Dual Core, Quad Core and Hex Core configurations]
This is fine if all of your applications are multithreaded and can use all available cores, but life is rarely so perfect. Instead you’ve got a mix of applications and workloads that’ll use anywhere from one to six cores. Browsing the web may only task one or two cores, gaming might use two or four and encoding a video can use all six. If you opt for a six core processor you get great encoding performance, but worse gaming and web browsing performance. Go for a dual core chip and you’ll run the simple things quickly, but suffer in encoding and gaming performance. There’s no winning.
With Nehalem, Intel introduced power gate transistors. Stick one of these in front of a supply voltage line to a core, turn it off and the entire core shuts off. In the past AMD and Intel only put gates in front of the clock signal going to a core (or blocks of a core). This would make sure the core remained inactive, but it could still leak power, a problem that got worse with smaller transistor geometries. These power gate transistors, however, addressed both active and leakage power: an idle core could be almost completely shut off.
If you can take a single core out of the TDP equation, then with some extra logic (around 1M transistors on Nehalem) you can increase the frequency of the remaining cores until you run into TDP or other physical limitations. This is how Intel’s Turbo Boost technology works. Depending on how many cores are active and the amount of power they’re consuming a CPU with Intel’s Turbo Boost can run at up to some predefined frequency above its stock speed.
With Thuban, AMD introduces its own alternative called Turbo Core. The original Phenom processor had the ability to adjust the clock speed of each individual core. AMD disabled this functionality with the Phenom II to avoid some performance problems we ran into, but it’s back with Thuban.
If half (or more) of the CPU cores on a Thuban die are idle, Turbo Core does the following:
1) Decreases the clock speed of the idle cores down to as low as 800MHz.
2) Increases the voltage of all of the cores.
3) Increases the clock speed of the active cores up to 500MHz above their default clock speed.
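The steps above can be sketched as a small Python function. This is a simplified model under stated assumptions, not AMD's actual control logic: it ignores the voltage bump in step 2, assumes active cores always reach the full turbo clock, and assumes every core sits at the default clock when turbo is not engaged. The function name and its parameters are hypothetical.

```python
def turbo_core_clocks(busy, base_mhz, turbo_mhz, idle_mhz=800):
    """Return a per-core clock list (MHz) for a simplified Turbo Core model.

    busy      -- list of booleans, True if that core is doing work
    base_mhz  -- default clock of the part (3200 for the 1090T)
    turbo_mhz -- maximum turbo clock (3600 for the 1090T)
    idle_mhz  -- floor for idle cores; the article cites 800MHz
    """
    idle = busy.count(False)

    # Turbo only engages when half (or more) of the cores are idle.
    if idle * 2 < len(busy):
        # Assumption: without turbo, every core stays at the default clock.
        return [base_mhz] * len(busy)

    # Idle cores drop to the floor, active cores get the turbo clock.
    # Assumption: the voltage increase (step 2) is not modeled here.
    return [turbo_mhz if is_busy else idle_mhz for is_busy in busy]
```

For a 1090T with two busy cores, `turbo_core_clocks([True, True, False, False, False, False], 3200, 3600)` returns `[3600, 3600, 800, 800, 800, 800]`. With four busy cores the condition fails and all six stay at 3200.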
The end result is the same as Intel’s Turbo Boost from a performance standpoint. Lightly threaded apps see a performance increase. Even heavily threaded workloads might have periods of time that are bound by the performance of a single thread - they benefit from AMD’s Turbo Core as well. In practice, Turbo Core appears to work. While I rarely saw the Phenom II X6 1090T hit 3.6GHz, I would see the occasional jump to 3.4GHz. As you can tell from the screenshot above, there's very little consistency between the cores and their operating frequencies - they all run as fast or as slow as they possibly can, it seems.
AMD's Turbo Core Benefit

| AMD Phenom II X6 1090T | Turbo Core Disabled | Turbo Core Enabled | Performance Increase |
| --- | --- | --- | --- |
| x264-HD 3.03 1st Pass | 71.4 fps | 74.5 fps | 4.3% |
| x264-HD 3.03 2nd Pass | 29.4 fps | 30.3 fps | 3.1% |
| Left 4 Dead | 117.3 fps | 127.2 fps | 8.4% |
| 7-zip Compression Test | 3069 KB/s | 3197 KB/s | 4.2% |

Turbo Core generally increased performance between 2 and 10% in our standard suite of tests. Given that the max clock speed increase on a Phenom II X6 1090T is 12.5%, that’s not a bad range of performance improvement. Intel’s CPUs stand to gain a bit more (and use less power) from turbo thanks to the fact that Lynnfield, Clarkdale, et al. will physically shut off idle cores rather than just underclock them.
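For anyone who wants to check the arithmetic, the following Python sketch recomputes the gains from the Turbo Core table above and compares each one with the 12.5% maximum clock increase (3.2GHz to 3.6GHz). The `pct_gain` helper and the `results` dictionary are illustrative names. Treating the clock increase as the ceiling for the performance gain is a simplifying assumption.

```python
def pct_gain(before, after):
    """Percentage increase going from `before` to `after`."""
    return (after - before) / before * 100.0


# (Turbo Core disabled, Turbo Core enabled) as reported in the table above.
results = {
    "x264-HD 3.03 1st Pass (fps)": (71.4, 74.5),
    "x264-HD 3.03 2nd Pass (fps)": (29.4, 30.3),
    "Left 4 Dead (fps)": (117.3, 127.2),
    "7-zip Compression Test (KB/s)": (3069, 3197),
}

# Maximum clock increase on the 1090T: 3.2GHz default, 3.6GHz turbo.
clock_gain = pct_gain(3.2, 3.6)

for name, (off, on) in results.items():
    gain = pct_gain(off, on)
    share = gain / clock_gain  # fraction of the theoretical ceiling realized
    print(f"{name}: +{gain:.1f}% ({share:.0%} of the max clock gain)")
```

Left 4 Dead, the most lightly threaded test of the four, captures the largest share of the available clock headroom, which fits the idea that Turbo Core helps most when few cores are busy.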
I have noticed a few situations where performance in a benchmark was unexpectedly low with Turbo Core enabled. This could be an artifact of independent core clocking similar to what we saw in the Phenom days; however, I saw no consistent issues in my time with the chip thus far.


Page 3

The Performance Summary

At $199 and $285 the obvious comparison points are Intel’s Core i5 750 and Core i7 860. We’ll dive into the complete performance tests in a bit, but if you’re looking for some quick analysis here’s what we’ve got.
Single threaded performance is squarely a Lynnfield advantage. Intel’s quad-cores can turbo up more and Intel does have the advantage of higher IPC.
Phenom II X6 vs. Intel's Lynnfield Processors

| Processor | Cinebench R10 (Single Threaded) | Cinebench R10 (Multithreaded) | 3dsmax r9 | x264 HD - 2nd Pass | Left 4 Dead |
| --- | --- | --- | --- | --- | --- |
| AMD Phenom II X6 1090T | 3951 | 18526 | 13.7 | 28.5 fps | 127.2 fps |
| AMD Phenom II X6 1055T | 3547 | 16268 | 12.7 | 25.1 fps | 111.5 fps |
| Intel Core i7 860 | 4490 | 16598 | 15.0 | 26.8 fps | 131.0 fps |
| Intel Core i5 750 | 4238 | 14142 | 13.4 | 21.0 fps | 130.0 fps |

Highly threaded encoding and 3D rendering performance are obviously right at home on the Phenom II X6. The 6MB L3 cache and lower IPC do appear to hamper the Phenom II X6 in a couple of tests but for the most part if you need threads, the X6 is the way to go.
Applications in between generally favor Intel’s quad-cores over the Phenom II X6. This includes CPU-bound games.
None of this should be terribly surprising as it’s largely the same conclusion we came to with the Athlon II X3 and X4. If you run specific heavily threaded applications, you can’t beat the offer AMD is giving you. It’s the lighter or mixed use workloads that tend to favor Intel’s offerings at the same price points.
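One way to see the cores-versus-IPC tradeoff in the summary table is to divide each chip's multithreaded Cinebench R10 score by its single-threaded score. The Python sketch below does that with the numbers reported above. The `mt_speedup` helper is an illustrative name, and the ratio is only a rough indicator: single-threaded runs benefit from turbo modes, so it understates how well each chip scales per core.

```python
# (single-threaded score, multithreaded score, physical cores)
# Scores are the Cinebench R10 results from the summary table.
scores = {
    "Phenom II X6 1090T": (3951, 18526, 6),
    "Phenom II X6 1055T": (3547, 16268, 6),
    "Core i7 860": (4490, 16598, 4),
    "Core i5 750": (4238, 14142, 4),
}


def mt_speedup(single, multi):
    """Multithreaded score as a multiple of the single-threaded score."""
    return multi / single


for name, (single, multi, cores) in scores.items():
    speedup = mt_speedup(single, multi)
    print(f"{name}: {speedup:.2f}x from {cores} cores")
```

The Intel chips start from higher single-threaded scores, but the two extra cores give both Phenom II X6 parts a larger multiplier. That is enough for the 1090T to lead the multithreaded column outright.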


Page 4

AMD’s 890FX Chipset

The Phenom II X6 will work in all existing Socket-AM2+ and AM3 motherboards that can 1) support the 125W TDP of the processors, and 2) have BIOS support (apparently over 160 boards at launch). Despite this impressive showing of backwards compatibility, we also get a new chipset today for those of you looking to build a new system instead of upgrade.
The 890FX is a mildly updated version of AMD’s 790FX chipset, mostly adding AMD’s SB850 South Bridge with 6Gbps SATA support. The number of PCIe 2.0 lanes and other major features remains unchanged.

| Feature | AMD 890FX | AMD 890GX | AMD 790FX |
| --- | --- | --- | --- |
| CPU | AMD Socket-AM3 | AMD Socket-AM3 | AMD Socket-AM3/AM2+ |
| Manufacturing Process | 65nm | 55nm | 65nm |
| PCI Express | 44 PCIe 2.0 lanes | 24 PCIe 2.0 lanes | 44 PCIe 2.0 lanes |
| Graphics | N/A | Radeon HD 4290 (DirectX 10.1) | N/A |
| South Bridge | SB850 | SB850 | SB750 |
| USB | 14 USB 2.0 ports | 14 USB 2.0 ports | 12 USB 2.0 ports |
| SATA | 6 SATA 6Gbps ports | 6 SATA 6Gbps ports | 6 SATA 3Gbps ports |
| IOMMU | 1.2 | N/A | N/A |
| Max TDP | 19.6W | 25W | 19.6W |

You get IOMMU support (an advantage over 790FX) and despite the chipset being built on TSMC's 65nm process, it pulls less power than the 890GX as it lacks any integrated graphics.

The Test

To keep the review length manageable we're presenting a subset of our results here. For all benchmark results and even more comparisons be sure to use our performance comparison tool: Bench.
Motherboard: ASUS P7H57D-V EVO (Intel H57)
    Intel DP55KG (Intel P55)
    Intel DX58SO (Intel X58)
    Intel DX48BT2 (Intel X48)
    Gigabyte GA-MA790FX-UD5P (AMD 790FX)
    MSI 890FXA-GD70 (AMD 890FX)
Chipset Drivers: Intel 9.1.1.1015 (Intel)
    AMD Catalyst 8.12
Hard Disk: Intel X25-M SSD (80GB)
Memory: Corsair DDR3-1333 4 x 1GB (7-7-7-20)
    Corsair DDR3-1333 2 x 2GB (7-7-7-20)
Video Card: eVGA GeForce GTX 280 (Vista 64)
    ATI Radeon HD 5870 (Windows 7)
Video Drivers: ATI Catalyst 9.12 (Windows 7)
    NVIDIA ForceWare 180.43 (Vista64)
    NVIDIA ForceWare 178.24 (Vista32)
Desktop Resolution: 1920 x 1200
OS: Windows Vista Ultimate 32-bit (for SYSMark)
    Windows Vista Ultimate 64-bit
    Windows 7 x64


Page 5

SYSMark 2007 Performance

Our journey starts with SYSMark 2007, the only all-encompassing performance suite in our review today. The idea here is simple: one benchmark to indicate the overall performance of your machine.
SYSMark really taxes two cores most of the time, giving the edge to Lynnfield and its aggressive turbo modes. Lightly threaded or mixed workloads won't do so well on the Phenom II X6.
SYSMark 2007 - Overall

Adobe Photoshop CS4 Performance

To measure performance under Photoshop CS4 we turn to the Retouch Artists’ Speed Test. The test does basic photo editing; there are a couple of color space conversions, many layer creations, color curve adjustment, image and canvas size adjustment, unsharp mask, and finally a gaussian blur performed on the entire image.
The whole process is timed and thanks to the use of Intel's X25-M SSD as our test bed hard drive, performance is far more predictable than back when we used to test on mechanical disks.
Time is reported in seconds and the lower numbers mean better performance. The test is multithreaded and can hit all four cores in a quad-core machine.
Adobe Photoshop CS4 - Retouch Artists Speed Test
Performance here is good, but even Photoshop doesn't make consistent enough use of all six cores to really give the Phenom II X6 the edge it needs here. It's faster than the Phenom II X4, but not faster than the Core i5 750.


Page 6

DivX 6.8.5 with Xmpeg 5.0.3

Our DivX test is the same DivX / Xmpeg 5.0.3 test we've run for the past few years now. The 1080p source file is encoded using the unconstrained DivX profile, quality/performance is set to balanced at 5, and enhanced multithreading is enabled.
Thanks to AMD's Turbo Core the Phenom II X6 is pretty close here, but still not able to topple Intel's Core i5 and i7.
DivX 6.8.5 w/ Xmpeg 5.0.3 - MPEG-2 to DivX Transcode

x264 HD Video Encoding Performance

Graysky's x264 HD test uses x264 to encode a 4Mbps 720p MPEG-2 source. The focus here is on quality rather than speed, thus the benchmark uses a 2-pass encode and reports the average frame rate in each pass.
And we finally see the Phenom II X6 flex its muscle. Even the 1055T is faster than the Core i7 860:
x264 HD Encode Benchmark - 720p MPEG-2 to x264 Transcode
In the actual encoding pass the 1055T falls behind the 860 but it's still a good 19% faster than the Core i5 750.
x264 HD Encode Benchmark - 720p MPEG-2 to x264 Transcode


Page 7

3dsmax 9 - SPECapc 3dsmax CPU Rendering Test

Today's desktop processors are more than fast enough to do professional level 3D rendering at home. To look at performance under 3dsmax we ran the SPECapc 3dsmax 8 benchmark (only the CPU rendering tests) under 3dsmax 9 SP1. The results reported are the rendering composite scores.
Not all heavily threaded workloads will show the Phenom II X6 in a good light. Here Intel maintains the advantage:
3dsmax 9 - SPECapc 3dsmax 8 CPU Test

Cinebench R10

Created by the Cinema 4D folks we have Cinebench, a popular 3D rendering benchmark that gives us both single and multi-threaded 3D rendering results.
Single threaded performance is obviously an Intel advantage, but crank up the thread count and there's no match for the Phenom II X6. As we pointed out earlier, if you've got a lot of CPU intensive threads there's no replacement for more cores.
Cinebench R10 - Single Threaded Benchmark
Cinebench R10 - Multi Threaded Benchmark

POV-Ray 3.7 beta 23 Ray Tracing Performance

POV-Ray is a popular, open-source raytracing application that also doubles as a great tool to measure CPU floating point performance.
I ran the SMP benchmark in beta 23 of POV-Ray 3.7. The numbers reported are the final score in pixels per second.
Once again, the Phenom II X6 does very well here.
POV-Ray 3.7 beta 23 - SMP Test


Page 8

WinRAR - Archive Creation

Our WinRAR test simply takes 300MB of files and compresses them into a single RAR archive using the application's default settings. We're not doing anything exotic here, just looking at the impact of CPU performance on creating an archive.
The i7 860 wins against the 1090T, but the lack of Hyper-Threading keeps the 750 behind the Phenom II X6 1055T.
WinRAR 3.8 Compression - 300MB Archive

7-Zip Performance

7-Zip Benchmark - 32MB Dictionary
7-Zip 300MB 7z Archive - Max Compression


Page 9

Gaming Performance

None of the games here can take advantage of more than 4 cores. The Phenom II X6 ends up performing no different than a Phenom II X4. Thankfully due to Turbo Core you rarely see a drop in performance compared to the Phenom II X4 965.
If you want the better gaming chip, you want Lynnfield.
Fallout 3 - 1680 x 1050 - Medium Quality
Left 4 Dead - 1680 x 1050 - Max Settings (No AA/AF/Vsync)
Crysis Warhead - 1680 x 1050 - Mainstream Quality (Physics on Enthusiast) - assault bench
Batman: Arkham Asylum
Dragon Age Origins
Dawn of War II


Page 10

Power Consumption

Most impressive is AMD's ability to run six 45nm cores at the same power consumption as four 45nm cores. The Phenom II architecture in general does reasonably well at idle, but without power gating AMD can't compete with Intel's idle power levels.
Under load Intel also has the clear advantage.
Idle Power Consumption


Page 11

Overclocking

The Phenom II X6 1090T is a Black Edition part, meaning it has a fully unlocked clock multiplier. With very little effort our 3.2GHz sample was up and running at 3.80GHz without any additional cooling beyond the stock heatsink/fan.
With a little extra effort, 3.9GHz should be possible, but the fact that we can even run at 3.8GHz with six 45nm cores is very impressive. Update: You asked, and we pushed harder. Our 1090T sample can hit 4GHz at 1.45V and even reach 4.1GHz but not with great stability. The even more important takeaway is that AMD's 64-bit/4GHz limits appear to be gone with Thuban.
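To put those clocks in perspective, this short Python sketch expresses each overclock mentioned above as a percentage over the 1090T's 3.2GHz stock speed. The `headroom_pct` helper is an illustrative name, and the figures describe our one sample only, not a guarantee for retail chips.

```python
STOCK_GHZ = 3.2  # default clock of the Phenom II X6 1090T


def headroom_pct(stock_ghz, oc_ghz):
    """Overclock expressed as a percentage above the stock clock."""
    return (oc_ghz - stock_ghz) / stock_ghz * 100.0


# Clocks discussed above: stock cooling, "a little extra effort",
# 1.45V, and the not-quite-stable maximum.
for target_ghz in (3.8, 3.9, 4.0, 4.1):
    gain = headroom_pct(STOCK_GHZ, target_ghz)
    print(f"{target_ghz:.1f}GHz: +{gain:.1f}% over stock")
```

Even the easy 3.8GHz result is a larger clock increase than the 12.5% maximum Turbo Core offers on this chip, and it applies to all six cores at once.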


Page 12

Final Words

Today's conclusion is no different than what we've been saying about AMD's CPU lineup for several months now. If you're running applications that are well threaded and you're looking to improve performance in them, AMD generally offers you better performance for the same money as Intel. It all boils down to AMD selling you more cores than Intel at the same price point.
Applications like video encoding and offline 3D rendering show the real strengths of the Phenom II X6. And thanks to Turbo Core, you don't give up any performance in less threaded applications compared to a Phenom II X4. The 1090T can easily trump the Core i7 860 and the 1055T can do even better against the Core i5 750.
You start running into problems when you look at lightly threaded applications or mixed workloads that aren't always stressing all six cores. In these situations Intel's quad-core Lynnfield processors (Core i5 700 series and Core i7 800 series) are better buys. They give you better performance in these light or mixed workload scenarios, not to mention lower overall power consumption.
The better way to look at it is to ask yourself what sort of machine you're building. If you're building a task specific box that will mostly run heavily threaded applications, AMD will sell you nearly a billion transistors for under $300 and you can't go wrong. If it's a more general purpose machine that you're assembling, Lynnfield seems like a better option.
