The GPU Feature Set For the Next Generation of Games

While the 2020 Game Developers Conference has been postponed, that thankfully doesn’t mean everything gaming-related for this spring has been postponed as well. As the saying goes, the show must go on, and this week we’ve seen Microsoft, Sony, and others go ahead and make some big announcements about their gaming consoles and other projects. Not to be left out of the fray, there is PC-related news coming out of the show as well.

Leading the PC pack this week is Microsoft (again), with the announcement of DirectX 12 Ultimate. Designed as a new, standardized DirectX 12 feature set that encapsulates the latest in GPU technology, DirectX 12 Ultimate is intended to serve as a common baseline for both PC game development and Xbox Series X development. This includes not only wrapping up features like ray tracing and variable rate shading into a single package, but then branding that package so that developers and the public at large can more easily tell whether games are using these cutting-edge features, and whether their hardware supports them. And, of course, this allows Microsoft to maximize the synergy between PC gaming and their forthcoming console, giving developers a single feature set to base their projects around while promoting the fact that the latest Xbox will support the very latest GPU features.

To be sure, what’s being announced today isn’t a new API – even the features being discussed today technically aren’t new – but rather it’s a newly defined feature set that wraps up several features that Microsoft and its partners have been working on over the past few years. This includes DirectX Raytracing, Variable Rate Shading, Mesh Shaders and Sampler Feedback. Most of these features have been available in some form for a time now as separate features within DirectX 12, but the creation of DirectX 12 Ultimate marks their official promotion from in-development or early adopter status to being ready for the masses at large.

For Microsoft, the importance of DirectX 12 Ultimate is twofold. First, DirectX has accumulated a lot of new features since its last feature set, feature level 12_1, was defined over half a decade ago. So DirectX has been overdue to package up features introduced by things like NVIDIA’s Turing GPU architecture as well as AMD’s forthcoming RDNA2 architecture. The end result of that process is the new feature level 12_2, or as it’s being branded, DirectX 12 Ultimate.

Second, this is the first console launch for Microsoft where DirectX 12 has been ready and available at launch. While Microsoft has always attempted to take advantage of the synergy between PC and console, the fact that DirectX 12 was finalized after the Xbox One family launch meant that the past generation has been somewhat misaligned. So similar to the current-generation consoles being the true kick-off point for DirectX 11 becoming the baseline for video game development, Microsoft is aiming to do the same for the next-gen consoles and DirectX 12; and they’re trying to do it in a more organized fashion than ever before.

Of course, it doesn’t hurt that this also lets Microsoft talk up the Xbox Series X as being on the same level as (current) PC GPUs. The long lifecycle of consoles means that by mid-generation they’re outpaced by PC GPUs in terms of features, and while the two products are not perfect substitutes for each other in an economic sense, it becomes one less thing that game consoles have going for themselves, and one more advantage for the PC. So by giving the new feature level a public brand, Microsoft can clearly communicate that they are on the cutting edge of GPU technology, period; there is no PC GPU that can surpass them in terms of features.

And clarity is an important goal here not just for marketing reasons, but customer relations in general. The features being bundled under the DirectX 12 Ultimate banner are significant, and mesh shaders in particular stand to allow developers to completely upend the traditional rendering pipeline. So when developers start using these features as a true baseline in future games – and to be sure, that’s likely to be some time off – then the requirements need to be clearly communicated to PC gamers. It won’t be enough for a video card to just support DirectX 12, it will need to support (at a minimum) this new feature set to meet that baseline. But even in the present, where games will continue to work on multiple generations of GPUs for some time to come, the DirectX 12 Ultimate branding is useful for clearly explaining what kind of hardware it will take to access the new features that forthcoming games will be using.

The Features of DirectX 12 Ultimate (aka DX12 feature level 12_2)

Diving into the new feature level itself, as I previously mentioned, the new feature set is designed to encapsulate new GPU features introduced in the last few years. This means tying together existing features like ray tracing and mesh shaders into a new feature level, so that they can be more readily targeted by developers.

All told – and much to the glee of NVIDIA – DirectX 12 Ultimate’s feature set ends up looking a whole heck of a lot like their Turing architecture’s graphics feature set. Ray tracing, mesh shading, and variable rate shading were all introduced for the first time on Turing, and this represents the current cutting edge for GPU graphics functionality. Consequently, it’s no mistake that this new feature level, which Microsoft is internally calling 12_2, follows the Turing blueprint so closely. Feature levels are a collaboration between Microsoft and all of the GPU vendors, with feature levels representing a common set of features that everyone can agree to support.

Ultimately, this collaboration and timing means that there is already current-generation hardware out there that meets the requirements for 12_2 with NVIDIA’s GeForce RTX 20 series (Turing) products. And while AMD and Intel are a bit farther behind the curve, they’ll get there as well. In fact in a lot of ways AMD’s forthcoming RDNA2 architecture, which has been at the heart of this week’s console announcements, will serve as the counterbalance to Turing as far as 12_2 goes. This is a feature set that crosses PCs and consoles, and while NVIDIA may dominate the PC space, what AMD is doing with RDNA2 is defining an entire generation of consoles for years to come.

DirectX 12 Feature Levels

|  | 12_2 (DX12 Ult.) | 12_1 | 12_0 |
|---|---|---|---|
| GPU Architectures (Introduced as of) | NVIDIA: Turing; AMD: RDNA2; Intel: Xe? | NVIDIA: Maxwell 2; AMD: Vega; Intel: Gen9 | NVIDIA: Maxwell 2; AMD: Hawaii; Intel: Gen9 |
| Ray Tracing (DXR 1.1) | Yes | No | No |
| Variable Rate Shading (Tier 2) | Yes | No | No |
| Mesh Shaders | Yes | No | No |
| Sampler Feedback | Yes | No | No |
| Conservative Rasterization | Yes | Yes | No |
| Raster Order Views | Yes | Yes | No |
| Tiled Resources (Tier 2) | Yes | Yes | Yes |
| Bindless Resources (Tier 2) | Yes | Yes | Yes |
| Typed UAV Load | Yes | Yes | Yes |

Overall, there are four big features that Microsoft and partners are focusing on for 12_2, at least publicly. These are ray tracing, variable rate shading, mesh shaders, and sampler feedback. Some of these features, particularly ray tracing, have been available in DirectX 12 for a while, and all of them have been previously announced by Microsoft as they’ve worked with developers to refine them. None the less, even ray tracing is getting some important feature updates to coincide with 12_2, so as a whole the new feature level brings a lot of new toys to the table for game developers to play with.
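To make the cumulative nature of feature levels concrete, here is a minimal Python sketch that treats the table above as data and works out the highest level a GPU could claim. The feature names are taken from the table, but the dictionary layout and function names are purely illustrative; this is not how the DirectX API itself reports feature levels.

```python
# Features newly required at each level, copied from the table in this article.
# The data layout and function names are illustrative, not a DirectX API.
FEATURES_ADDED_AT = {
    "12_0": {"tiled_resources_tier2", "bindless_resources_tier2", "typed_uav_load"},
    "12_1": {"conservative_rasterization", "raster_order_views"},
    "12_2": {"ray_tracing_dxr_1_1", "variable_rate_shading_tier2",
             "mesh_shaders", "sampler_feedback"},
}
ORDER = ["12_0", "12_1", "12_2"]


def required_features(level):
    """Levels are cumulative: 12_2 includes everything from 12_1 and 12_0."""
    needed = set()
    for name in ORDER[: ORDER.index(level) + 1]:
        needed |= FEATURES_ADDED_AT[name]
    return needed


def highest_feature_level(gpu_features):
    """Return the highest level whose full requirement set is covered, or None."""
    best = None
    for level in ORDER:
        if required_features(level) <= set(gpu_features):
            best = level
        else:
            break
    return best
```

The point of the sketch is that a single missing feature, such as sampler feedback, is enough to hold a GPU at 12_1 even if it supports everything else in the 12_2 column.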

As for gamers, the Windows introduction of 12_2 is set to occur in the next couple of months, when Microsoft ships their next big Windows 10 feature update, Windows 10 version 2004 (also known as 20H1). And while games using the new feature level will be slow to trickle out (like any new feature level launch), it does mean that gamers will need to stick to the latest Windows to use it. Holding back a version or two (as some of us are wont to do) means no DX12U for you.

Raytracing with DXR 1.1

Kicking off the feature level 12_2 family is raytracing support. The raytracing component of DX12 was first introduced by Microsoft back in 2018, and it has been available for developer use for a while now. I’m not going to recap raytracing in detail here – we’ve written about it a few times now – but at a high level it’s going to play an important part in future games. Essentially simulating the way that real light is projected and interacts with the world, raytracing is designed to take over in areas where the current rasterization rendering paradigm has been stretched to its limits. Developers have been able to do a ton of amazing things for lighting with incredibly clever hacks on rasterization, but there are some areas where the quality or performance of actually casting light (rays) simply cannot be beat. And this is where hardware raytracing fits in.


Ray Tracing Diagram (Henrik / CC BY-SA 4.0)

Though it was officially a complete and shipping standard, the original 1.0 standard was none the less a bit experimental. It was designed around the hardware available at the time (Turing), and no one was entirely sure what developers would do with raytracing. So for raytracing’s inclusion into a full DirectX feature level, the raytracing API itself is seeing some enhancements.

The new DXR 1.1 standard extends 1.0 in several ways, to incorporate new features that developers have asked for in the last couple of years. And because it consists only of new software functionality, it works on existing Turing hardware as well. So in practice, DXR 1.1 is going to supplant DXR 1.0 going forward, and no one outside of developers should be any the wiser.

The big additions for DXR 1.1 are primarily focused on making it easier or more efficient for developers to use raytracing in their games. Heading up this list is the ability to spawn raytracing tasks on the GPU itself, without requiring the host CPU to do it. This is something the GPU can already do in other situations – particularly having compute kernels spawn other compute kernels via dynamic parallelism – and now the same concept is being extended to raytracing. Microsoft sees this feature as being helpful for scenarios where the GPU would want to prepare raytracing work and then immediately spawn it anyhow, such as shaders using raytracing as a means of culling. To be sure, this is doable even under DXR 1.0, but the fact that the CPU would have to invoke it makes it less efficient.

The other major addition here is what Microsoft is calling inline raytracing. Perhaps best conceptualized as a stripped-down version of raytracing for simple tasks, inline raytracing exposes raytracing to more stages of the GPU rendering process, thereby allowing developers to take more direct control over raytracing and potentially use it in more places. Of particular note here, inline raytracing can be invoked in shader stages that can’t invoke regular raytracing, such as compute shaders, giving developers more flexibility. Overall, inline raytracing comes with less overhead for simple tasks, making it a better fit for that scenario, while traditional raytracing (and its better scheduling mechanisms) is going to be superior for complex tasks.
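As a rough illustration of the kind of "simple task" inline raytracing is aimed at, consider a shadow visibility query: the caller casts one ray toward a light and only needs a yes-or-no answer. The sketch below is a CPU-side analogy in Python, not HLSL, and it uses no DirectX calls; the sphere-only scene, the function names, and the small self-intersection offset are all assumptions made for the example.

```python
import math


def ray_sphere_hit(origin, direction, center, radius, t_max):
    """True if a normalized ray hits the sphere at a distance in (1e-4, t_max)."""
    oc = [o - c for o, c in zip(origin, center)]
    b = sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return False  # the ray's line misses the sphere entirely
    root = math.sqrt(disc)
    t_near, t_far = -b - root, -b + root
    return (1e-4 < t_near < t_max) or (1e-4 < t_far < t_max)


def is_shadowed(point, light_pos, spheres):
    """Inline-style query: the caller walks the candidates itself and
    stops at the first blocker, with no separate hit or miss routines."""
    to_light = [l - p for l, p in zip(light_pos, point)]
    dist = math.sqrt(sum(x * x for x in to_light))
    direction = [x / dist for x in to_light]
    for center, radius in spheres:
        if ray_sphere_hit(point, direction, center, radius, dist):
            return True
    return False
```

The design point being mirrored here is that the calling code drives the query and consumes the result in place, which is what keeps overhead low for simple cases; anything needing many different material behaviors per hit is where the traditional, scheduled approach would be expected to pay off.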

Variable Rate Shading

The second feature getting bundled into 12_2 is variable rate shading. Another Turing architecture launch feature, VRS began making its way into DirectX last year.

At a high level, variable rate shading allows for the rate of shading operations to be varied within a frame. Rather than running pixel and other shaders at a 1:1 rate with individual pixels, the shading rate can be dialed up or dialed down to focus on improved quality or reducing the rendering workload in certain areas. Developer use is primarily going to be focused on the latter of the two, with developers using it to cut back on the amount of shading done in areas of a screen where that level of detail is unnecessary – or at least unlikely to be noticed.

Variable rate shading already has two tiers, and feature level 12_2 will be incorporating the second, more powerful tier of that feature. Tier 2 allows for the shading rate to be varied within a draw call, allowing for a relatively fine-grained approach to where the shading rate is adjusted. This can be done on a per-primitive basis, or by defining general areas in a frame where the rate should be adjusted (screenspace).
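The screenspace approach can be pictured as a coarse grid laid over the frame, with one shading rate per tile. The Python sketch below simply counts how many pixel shader invocations such a grid would imply. The 16-pixel tile size in the usage note and the function name are assumptions for illustration; real hardware defines its own tile sizes and supported rates.

```python
def shader_invocations(rate_map, tile_size):
    """Count pixel-shader invocations for a frame described by a per-tile rate map.

    Each entry (rx, ry) means one invocation covers an rx-by-ry block of pixels,
    so (1, 1) is full rate and (2, 2) shades once per four pixels.
    """
    total = 0
    for row in rate_map:
        for rx, ry in row:
            # Ceiling division so partially covered blocks still get shaded.
            blocks_x = -(-tile_size // rx)
            blocks_y = -(-tile_size // ry)
            total += blocks_x * blocks_y
    return total
```

For example, a 64x64 pixel region split into sixteen 16-pixel tiles needs 4,096 invocations at full rate. Keeping the four center tiles at 1x1 and dropping the twelve outer tiles to 2x2 brings that down to 1,792.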

Variable rate shading has already been optionally used in a few games to date, particularly Wolfenstein II, but its current use isn’t as widespread as raytracing.

And while the primary use for variable rate shading is going to be on improving performance by selectively reducing the shading resolution – especially for the 4K resolution gaming Microsoft wants to do on the Xbox Series X – the feature is also set to play a part in VR headsets. Variable rate shading is the core rendering technology behind making foveated rendering possible, which itself promises significant efficiency gains for VR headsets. By only rendering the center of the user’s vision at full resolution (if not higher, for improved clarity), the amount of work required to render a VR frame is significantly reduced. This can help bring down the costs of VR by requiring less powerful hardware, or it can be used to free up performance for even better-looking games.
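A foveated setup can be sketched as a rate map chosen by distance from the point the user is looking at. The Python below is only an illustration of that idea: the two radii, the specific 1x1, 2x2 and 4x4 rates, and the function name are assumptions for the example, not values from Microsoft or any headset vendor.

```python
import math


def foveated_rate_map(tiles_x, tiles_y, gaze_tile, inner_radius, outer_radius):
    """Pick a coarser shading rate the farther a tile sits from the gaze point.

    Distances are measured in tiles. Tiles within inner_radius stay at full
    rate, tiles within outer_radius drop to 2x2, and everything else to 4x4.
    """
    gx, gy = gaze_tile
    rows = []
    for y in range(tiles_y):
        row = []
        for x in range(tiles_x):
            d = math.hypot(x - gx, y - gy)
            if d <= inner_radius:
                row.append((1, 1))
            elif d <= outer_radius:
                row.append((2, 2))
            else:
                row.append((4, 4))
        rows.append(row)
    return rows
```

With eye tracking, the gaze tile would be updated every frame, so the full-rate region follows the user's eye while the periphery stays cheap.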

Mesh Shaders: The Next Generation Geometry Pipeline

The third feature on the 12_2 list is mesh shaders. And truth be told, nothing I’m going to write here is going to quite do them justice.

At a very high level, mesh shading is the basis for next-generation geometry pipelines. The current geometry pipeline paradigm has essentially had new stages bolted on to it at multiple points over the last twenty years, with features like geometry shaders and tessellation tacked on to extend the pipeline. But the core concept of this pipeline is still based on traditional, pre-pixel shader rasterization methods, and this brings with it unnecessary complexity and inefficiency.

Thus, hardware and software developers alike want to throw out the current geometry pipeline in favor of something new, and that new thing is mesh shaders.

Mesh shaders are perhaps best thought of as compute shaders for geometry. This is significant because modern compute shaders are incredibly powerful by virtue of not only their parallelism, but their flexibility in terms of how data is processed and routed. Basically, instead of making developers follow a rigid pipeline to set up their geometry, mesh shaders let developers take near complete control to do it as they see fit.

Mesh shaders can also optionally be used with amplification shaders. I won’t go into those too much, but the basic principle there is to help set up data for the mesh shaders. Microsoft notes that they’re particularly useful for culling, though that isn’t their only use.
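To give a flavor of the culling use case, the Python sketch below tests small clusters of triangles (commonly called meshlets, a term this article doesn't otherwise use) against a set of planes and keeps only the ones that might be visible. It is a CPU-side analogy of what a culling stage ahead of the mesh shader might decide, under the assumption that each cluster carries a bounding sphere; the function names and plane convention are illustrative.

```python
def sphere_outside_plane(center, radius, plane):
    """plane = (nx, ny, nz, d), unit normal pointing into the visible half-space.

    The sphere is entirely outside when its signed distance is below -radius.
    """
    nx, ny, nz, d = plane
    signed = nx * center[0] + ny * center[1] + nz * center[2] + d
    return signed < -radius


def visible_meshlets(meshlets, frustum_planes):
    """Return indices of meshlets whose bounding sphere is not fully outside any plane.

    meshlets is a list of (center, radius) bounding spheres. Only the survivors
    would be handed on for actual geometry processing.
    """
    survivors = []
    for i, (center, radius) in enumerate(meshlets):
        if not any(sphere_outside_plane(center, radius, p) for p in frustum_planes):
            survivors.append(i)
    return survivors
```

The test is deliberately conservative: a cluster that straddles a plane is kept, since part of it may still be on screen.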

Ultimately, the goal of mesh shaders is to significantly improve the efficiency of the geometry pipeline, and thereby give developers the performance headroom to use ever more detailed geometry. This is accomplished by removing overhead at several levels, as well as making it practical to do very early geometry culling, stopping geometry before it would hit the vertex shader. Mesh shaders will also allow for index buffer compression, with an eye towards mitigating the memory bandwidth cost of using very complex geometry.
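One way to picture where index savings can come from is to split a mesh into small clusters, each with its own short vertex list, so that triangles refer to vertices by small local indices rather than full-width global ones. The Python sketch below shows that idea only; the article does not describe Microsoft's actual scheme, and the greedy splitting strategy, the 64-vertex default, and the function name are assumptions for the example.

```python
def build_meshlets(triangles, max_vertices=64):
    """Greedily split a triangle list into clusters with compact local indices.

    triangles is a list of 3-tuples of global vertex ids. Each returned meshlet
    is (vertex_ids, local_triangles), where local_triangles index into
    vertex_ids. Assumes max_vertices is at least 3.
    """
    meshlets = []
    vertex_ids, local_of, local_tris = [], {}, []
    for tri in triangles:
        # Vertices of this triangle that the current meshlet does not hold yet.
        new = [v for v in dict.fromkeys(tri) if v not in local_of]
        if len(vertex_ids) + len(new) > max_vertices and local_tris:
            # Current meshlet is full: close it and start a fresh one.
            meshlets.append((vertex_ids, local_tris))
            vertex_ids, local_of, local_tris = [], {}, []
            new = list(dict.fromkeys(tri))
        for v in new:
            local_of[v] = len(vertex_ids)
            vertex_ids.append(v)
        local_tris.append(tuple(local_of[v] for v in tri))
    if local_tris:
        meshlets.append((vertex_ids, local_tris))
    return meshlets
```

Because every local index is smaller than max_vertices, a limit of 256 or less lets each one fit in a single byte, even when the global vertex ids would need 32 bits.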

The catch to all of this, as is often the case, is how quickly developers can adopt it. Mesh shaders throw out a tried and true geometry pipeline for something new entirely, which means developers will need to become accustomed to it. It’s a big change from a game development standpoint, and consequently it’s very much a “baseline” feature. So mesh shading is something developers can really only do when they rebuild their engines for the next generation of consoles, where they no longer need to support pre-12_2 hardware.

Sampler Feedback

The final marquee feature for DirectX 12 Ultimate/feature level 12_2 is sampler feedback. This is a very new feature that has only recently been exposed, and has received very little publicity so far; though like everything else here, the hardware capabilities first showed up in Turing.

Previously demoed by NVIDIA as texture-space shading, sampler feedback is a broader feature with a few different uses. At a very high level, the idea behind sampler feedback is to allow game engines to track how the texture samplers are being (or will be) used – thus, the samplers give feedback to the engine – allowing the engine to make more intelligent decisions about how the samplers are used and what resources are kept in VRAM.

The principal use case for this, Microsoft envisions, will be in improving texture streaming. By using sampler feedback, game engines can determine what texture tiles are actually going to be needed, and thus load only the necessary tiles. This keeps overall VRAM pressure down, ultimately allowing developers to use higher quality textures overall by losing less VRAM to unneeded tiles. Fittingly for the Xbox Series X, this is especially handy when your games are stored on a high speed SSD, as it means the necessary tiles can be pulled in from storage incredibly quickly (almost in a just-in-time fashion), instead of having to stage them in RAM or take measures to mitigate the long access time of an HDD.
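The streaming idea can be sketched in a few lines of Python: record which tiles of a texture were actually touched by sampling, then load only the touched tiles that are not already resident. This is a conceptual model only. The square texture, the single mip level, and the function names are simplifying assumptions, and nothing here corresponds to actual DirectX calls.

```python
def record_feedback(samples, texture_size, tile_size):
    """Return the set of (tile_x, tile_y) touched by normalized UV samples in [0, 1]."""
    tiles_per_side = texture_size // tile_size
    touched = set()
    for u, v in samples:
        # Clamp so that a coordinate of exactly 1.0 maps to the last tile.
        tx = min(int(u * tiles_per_side), tiles_per_side - 1)
        ty = min(int(v * tiles_per_side), tiles_per_side - 1)
        touched.add((tx, ty))
    return touched


def tiles_to_load(touched, resident):
    """Tiles that were sampled but are not yet in memory."""
    return touched - resident
```

In this model, a 1024-pixel texture with 256-pixel tiles has 16 tiles in total. If a frame's samples only land in two of them, the other 14 never need to occupy VRAM.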

Meanwhile texture-space shading is the other major use for this feature. Another efficiency technique, texture-space shading allows for the shading of an object to take place without actually rasterizing it. Microsoft’s example here involves lighting – where an object has its lighting calculated once instead of repeatedly as a rasterized object would require. Ultimately, the central idea behind this feature is to be able to cache and reuse shading results, freeing up GPU resources for other, more important tasks.
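The cache-and-reuse idea at the heart of this can be shown with a small Python sketch: shading is computed per texel the first time it is needed, and later requests for the same texel reuse the stored result. The class name and the stand-in shading function are hypothetical, and a real implementation would also have to decide when cached results go stale, which this sketch ignores.

```python
class TexelShadingCache:
    """Shade each texel at most once and reuse the result afterwards."""

    def __init__(self, shade_fn):
        self.shade_fn = shade_fn  # the expensive per-texel shading routine
        self.cache = {}
        self.evaluations = 0      # how many times shade_fn actually ran

    def shade(self, texel):
        if texel not in self.cache:
            self.cache[texel] = self.shade_fn(texel)
            self.evaluations += 1
        return self.cache[texel]
```

If two consecutive frames request mostly the same texels, the second frame pays only for the texels that are new, which is where the freed-up GPU time comes from.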
