For the last few months, Microsoft has had the next-generation console PR space all to itself. Nintendo had nothing in the pipe and Sony was staying quiet about the PlayStation 5, which gave Microsoft plenty of room to talk about the capabilities of its Xbox Series X. Sony is finally opening up about the capabilities of its console and how some of the features will work.
The PlayStation 5 will feature an eight-core, 16-thread CPU based on AMD’s Zen 2 CPU architecture. There are 36 GPU compute units running at up to 2.23GHz. That’s rather high for an AMD GPU and could indicate some custom engineering for Sony. Onboard memory is 16GB of GDDR6 with a 256-bit memory interface and 448GB/s of memory bandwidth. Internal storage is a custom 825GB SSD, with an NVMe slot provided for expandable storage. There will be built-in support for USB external hard drives and the system will ship with a UHD Blu-ray drive.
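As a quick sanity check on those memory figures, the short sketch below works backward from the quoted 256-bit interface and 448GB/s to the per-pin data rate they imply. The function names are my own, and the only assumption is the usual relationship that bandwidth equals bus width times per-pin rate, divided by eight bits per byte.

```python
# Back-of-the-envelope check on the quoted memory figures.
# Assumption: bandwidth (GB/s) = bus width (bits) * per-pin rate (Gbps) / 8.
# Function names are illustrative, not from Sony or AMD.

def per_pin_rate_gbps(bandwidth_gb_s: float, bus_width_bits: int) -> float:
    """Per-pin data rate implied by a total bandwidth and a bus width."""
    return bandwidth_gb_s * 8 / bus_width_bits


def bandwidth_gb_s(bus_width_bits: int, per_pin_gbps: float) -> float:
    """Total bandwidth for a given bus width and per-pin data rate."""
    return bus_width_bits * per_pin_gbps / 8


if __name__ == "__main__":
    rate = per_pin_rate_gbps(448, 256)
    print(f"Implied per-pin rate: {rate:.1f} Gbps")
    print(f"Round trip: {bandwidth_gb_s(256, rate):.0f} GB/s")
```

The two quoted numbers are consistent with each other if the GDDR6 runs at roughly 14Gbps per pin.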
Of Boost Clocks and Nimble GPUs
There are two particularly interesting facets of what Sony unveiled today. First is how Sony is using boost. When Microsoft announced the Xbox Series X specs, it made it very clear that the CPU and GPU were clock-locked at 3.6GHz with SMT enabled and 1.825GHz, respectively. Sony, in contrast, is emphasizing how the PS5 will boost — and it doesn’t work the same as a PC CPU. PC turbo clocks are unique to the configuration of any given system and can vary depending on chip quality and system cooling. Sony still expects the PlayStation 5 to play games identically across every console. Eurogamer states: “According to Sony, all PS5 consoles process the same workloads with the same performance level in any environment, no matter what the ambient temperature may be.” (Emphasis original).
“Rather than look at the actual temperature of the silicon die, we look at the activities that the GPU and CPU are performing and set the frequencies on that basis – which makes everything deterministic and repeatable,” Cerny explained in his presentation. “While we’re at it, we also use AMD’s SmartShift technology and send any unused power from the CPU to the GPU so it can squeeze out a few more pixels.”
Cerny argues that a smaller GPU at a high clock can be more efficient than a larger GPU at a lower clock. During Sony’s presentation, he said that a hypothetical 36 CU GPU at 1GHz would outperform a 48 CU part at 750MHz, even though both solutions offer the same 4.6TFLOPS of performance.
“Performance is noticeably different, because ‘teraflops’ is defined as the computational capability of the vector ALU. That’s just one part of the GPU, there are a lot of other units – and those other units all run faster when the GPU frequency is higher. At 33 per cent higher frequency, rasterisation goes 33 per cent faster, processing the command buffer goes that much faster, the L1 and L2 caches have that much higher bandwidth, and so on,” said Cerny.
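Cerny's arithmetic is easy to reproduce. The sketch below is my own illustration, not anything Sony published: it assumes 64 shader lanes per compute unit and two floating-point operations per lane per clock (one fused multiply-add), which is the usual way peak teraflops are counted for AMD GPUs. The constant and function names are hypothetical.

```python
# Peak-teraflops arithmetic behind the 36 CU vs. 48 CU comparison.
# Assumptions (not stated in Sony's presentation as quoted here):
#   - 64 shader lanes per compute unit
#   - 2 FLOPs per lane per clock (one fused multiply-add)
LANES_PER_CU = 64
FLOPS_PER_LANE_PER_CLOCK = 2


def teraflops(compute_units: int, clock_ghz: float) -> float:
    """Peak vector-ALU throughput in TFLOPS."""
    gflops = compute_units * LANES_PER_CU * FLOPS_PER_LANE_PER_CLOCK * clock_ghz
    return gflops / 1000


def clock_advantage(fast_ghz: float, slow_ghz: float) -> float:
    """Fractional speedup for units that scale with clock alone."""
    return fast_ghz / slow_ghz - 1


if __name__ == "__main__":
    print(f"36 CUs @ 1.00GHz: {teraflops(36, 1.0):.2f} TFLOPS")
    print(f"48 CUs @ 0.75GHz: {teraflops(48, 0.75):.2f} TFLOPS")
    print(f"Clock advantage:  {clock_advantage(1.0, 0.75):.0%}")
    print(f"36 CUs @ 2.23GHz: {teraflops(36, 2.23):.2f} TFLOPS")
```

Both hypothetical parts land on the same peak figure, while anything that scales with frequency alone (the rasteriser, command processor, and cache bandwidth in Cerny's list) runs about a third faster on the narrower chip.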
The first part of this is absolutely true. Teraflops is a terrible way to compare GPU performance. The second part is a little more complicated. Clock speed has been our primary means of boosting CPU and GPU performance since both were invented, but GPUs operate on what are sometimes called “embarrassingly parallel” workloads that respond well to additional GPU width.
According to Cerny, the advantages of higher clock speed outweigh these factors.
“About the only downside is that system memory is 33 per cent further away in terms of cycles, but the large number of benefits more than counterbalance that. As a friend of mine says, a rising tide lifts all boats. Also, it’s easier to fully use 36 CUs in parallel than it is to fully use 48 CUs – when triangles are small, it’s much harder to fill all those CUs with useful work.”
I’m very curious to see if this proves true, because it cuts against how GPU performance typically scales with power consumption. A GPU’s power consumption begins to rise sharply as it approaches maximum design clock. The faster you want to run, the more voltage you need. The more voltage you need, the hotter your chip is going to run and the more power it’s going to consume. Eventually, these become major limiting factors. For years, both AMD and Nvidia have built faster GPUs by increasing GPU core count, even when they had to cut core clock to bring a card in at a given TDP.
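To put rough numbers on that, here is a minimal sketch using the textbook relation that dynamic power scales with voltage squared times frequency. The big assumption, which is mine and purely illustrative, is that voltage has to rise in direct proportion to clock near the top of the curve. Real voltage/frequency curves are chip-specific, and this ignores leakage entirely.

```python
# Illustrative only: how power grows as clocks climb.
# Textbook relation: dynamic power is proportional to V^2 * f.
# Hypothetical assumption: voltage must rise linearly with frequency.
# Real silicon follows a chip-specific V/f curve, and leakage is ignored.

def relative_dynamic_power(freq_ratio: float, voltage_ratio: float) -> float:
    """Power relative to baseline for given frequency and voltage ratios."""
    return voltage_ratio ** 2 * freq_ratio


def relative_power_linear_voltage(freq_ratio: float) -> float:
    """Same, under the assumption that voltage scales 1:1 with frequency."""
    return relative_dynamic_power(freq_ratio, voltage_ratio=freq_ratio)


if __name__ == "__main__":
    for pct in (10, 20, 33):
        ratio = 1 + pct / 100
        power = relative_power_linear_voltage(ratio)
        print(f"+{pct}% clock -> about {power:.2f}x dynamic power")
```

Under that assumption power grows with the cube of the clock, which is why adding width has usually been the cheaper way to add performance within a fixed TDP.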
Flexible, Super Fast Storage
The PlayStation 5 will offer up to 5.5GB/s of raw storage bandwidth or 8-9GB/s of compressed storage bandwidth. Sony is talking up this capability a great deal, emphasizing the idea of instant load times and previously impossible gameplay, but we haven’t seen very many demos yet of how this tech works in practice.
Don’t get me wrong. The sustained storage performance of the PlayStation 5 is only about 1GB/s lower than the Athlon 64’s dual-channel memory bandwidth when paired with DDR-400, circa 2004 – 2005. Obviously PCIe 4.0 access latencies are going to be higher than DRAM latency, but that’s still a remarkable achievement and I don’t doubt the company can do great things with it. Remade versions of old games with long transitions and load times removed might be very popular next generation. The odd storage capacity (825GB) is due to Sony’s decision to use a 12-channel interface. Games now have the ability to flag certain data blocks with up to six priority levels, meaning developers have fine-grained control over which data is loaded. The entire storage pool is connected by an x4 PCIe 4.0 interface, which is why sustained bandwidth is so high.
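For anyone who wants to check the comparison, the sketch below runs the numbers. It assumes a 64-bit bus per DDR channel and, for PCIe 4.0, 16GT/s per lane with 128b/130b encoding; those are the standard figures for each interface rather than anything Sony stated, and the function names are my own. Protocol overhead beyond line encoding is ignored.

```python
# Rough bandwidth arithmetic for the storage comparison.
# Assumptions: 64-bit DDR channels; PCIe 4.0 at 16 GT/s per lane with
# 128b/130b encoding. Packet/protocol overhead is ignored.

def ddr_bandwidth_gb_s(channels: int, mt_per_s: int, bus_bits: int = 64) -> float:
    """Peak DRAM bandwidth in GB/s for a given channel count and transfer rate."""
    return channels * mt_per_s * bus_bits / 8 / 1000


def pcie4_bandwidth_gb_s(lanes: int) -> float:
    """Peak one-direction PCIe 4.0 bandwidth in GB/s after line encoding."""
    return lanes * 16 * (128 / 130) / 8


if __name__ == "__main__":
    ps5_raw = 5.5  # GB/s, Sony's quoted raw figure
    athlon = ddr_bandwidth_gb_s(channels=2, mt_per_s=400)
    link = pcie4_bandwidth_gb_s(lanes=4)
    print(f"Dual-channel DDR-400: {athlon:.1f} GB/s")
    print(f"Gap to PS5 SSD:       {athlon - ps5_raw:.1f} GB/s")
    print(f"PCIe 4.0 x4 ceiling:  {link:.2f} GB/s")
```

The x4 link's ceiling sits comfortably above the 5.5GB/s raw figure, so the interface itself leaves headroom for the drive.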
Overall, the PS5 looks like an interesting system, even if it doesn’t pack as much firepower as the Xbox Series X. Having the most powerful console isn’t always a guarantee of success, and Sony is coming into this generation with a huge install base advantage.