Since the DirectX 12 API's debut, speculation about performance gains across PC hardware has been rampant online. But will it actually deliver on those promises?
The answer depends on your graphics card and CPU: modern GPUs may see significant performance gains with DX12, while hardware dating back ten years or more may not see much of an upgrade.
Improved Performance
DirectX 12 is a low-level API that lets developers control the GPU more directly than higher-level APIs such as DirectX 11 and OpenGL, which can significantly improve game performance on modern PC hardware by using CPU and GPU resources more efficiently.
Not every game benefits from DirectX 12 or sees higher frame rates when using it; results depend on how well a title has been optimized for DX 12 and on whether your graphics card can handle it efficiently. That said, most titles that support DirectX 12 perform better under it than under DX 11, so switching over is worth exploring if you own a current graphics card.
In some titles, DirectX 12 can reduce CPU overhead by as much as 50% and improve GPU performance by around 20%. It does this by making better use of the hardware, letting work from multiple pipeline stages execute concurrently, which keeps the GPU busy and removes some of the bottlenecks found in DX 11.
DX 12 also supports advanced rendering techniques such as ray tracing, mesh shaders, variable rate shading and sampler feedback, which improve visual fidelity but require more powerful GPUs to exploit fully.
DX 12's main drawback is its exclusivity: the API is only supported natively on Windows. That said, most of today's best graphics cards support it, from NVIDIA and AMD alike. NVIDIA historically enjoyed much stronger driver support than AMD, but with DX 12 now widespread across gaming hardware the competition between the two has become much closer.
Asynchronous Compute
DirectX 12 introduces asynchronous compute, which gives games direct access to the GPU's separate engines for rendering commands, compute workloads such as machine learning, and copy or decompression work. This is a marked departure from earlier APIs, where the driver decided how work was scheduled across queues; under DX 12 the game itself submits to multiple queues and controls how they overlap and synchronize.
Asynchronous compute lets smaller compute tasks be slotted in alongside 3D rendering work, filling GPU cycles that would otherwise sit idle, which raises overall throughput and frame rates.
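To make this concrete, here is a minimal sketch (not drawn from any particular engine) of creating a graphics queue alongside a dedicated compute queue and ordering work between them with a fence; the device pointer, the fence value and the idea of an async particle simulation are all assumptions for illustration.

```cpp
// Minimal sketch: a direct (graphics) queue plus a separate compute queue,
// synchronized with a fence. Assumes `device` is an initialized ID3D12Device*.
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

void CreateQueues(ID3D12Device* device,
                  ComPtr<ID3D12CommandQueue>& graphicsQueue,
                  ComPtr<ID3D12CommandQueue>& computeQueue,
                  ComPtr<ID3D12Fence>& fence)
{
    // The "direct" queue accepts graphics, compute and copy commands.
    D3D12_COMMAND_QUEUE_DESC graphicsDesc = {};
    graphicsDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
    device->CreateCommandQueue(&graphicsDesc, IID_PPV_ARGS(&graphicsQueue));

    // A dedicated compute queue lets compute work run alongside rendering.
    D3D12_COMMAND_QUEUE_DESC computeDesc = {};
    computeDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
    device->CreateCommandQueue(&computeDesc, IID_PPV_ARGS(&computeQueue));

    // A fence orders work between the two queues on the GPU timeline.
    device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));
}

// Example ordering: the graphics queue waits until the compute queue has
// signalled that an async task (e.g. a particle simulation) has finished.
void SyncComputeToGraphics(ID3D12CommandQueue* computeQueue,
                           ID3D12CommandQueue* graphicsQueue,
                           ID3D12Fence* fence, UINT64 value)
{
    computeQueue->Signal(fence, value);   // compute work complete up to `value`
    graphicsQueue->Wait(fence, value);    // graphics stalls until that point
}
```

The key point is that the application, not the driver, decides which work overlaps and where the queues must wait on each other.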
AMD has championed this hardware feature since it debuted in its GPUs two years ago. NVIDIA has shown less enthusiasm: its GeForce driver exposes async compute to games, but it emerged this week that the Maxwell architecture does not actually support it in hardware, which has sparked considerable debate.
As the Time Spy results above show, enabling asynchronous compute improves scores significantly compared with the same tests run without it, but NVIDIA cards do not benefit from it in the same way AMD cards do.
Supporting asynchronous compute requires the GPU to service multiple command queues concurrently; however, sources close to NVIDIA have reportedly confirmed that the current generation of Maxwell GPUs cannot do this, as the necessary functionality is not implemented through their HyperQ architecture.
NVIDIA has publicly stated that future GPUs, specifically the Turing architecture, will support asynchronous compute, so if you are hoping to benefit from it on an NVIDIA card, now is a good time to consider upgrading.
Variable Rate Shading
The more detailed a scene, the harder the GPU must work to render it, and on weaker hardware that strain can push games below playable frame rates. Variable rate shading (VRS), a recent addition to GPU design exposed through DirectX 12, offers a way to relieve that pressure without a noticeable loss of image quality.
VRS works by lowering the rate at which pixel shaders are invoked in parts of a scene, so a group of neighbouring pixels shares one shader invocation instead of each pixel being shaded separately. Where MSAA spends multiple samples per pixel to smooth aliasing along edges, VRS moves in the opposite direction: it reduces the number of shader invocations and therefore the GPU time spent executing them.
It does this by dividing the screen into tiles and weighting each by importance: full-rate shading is applied only where fine detail matters, while less essential areas are shaded at a coarser rate, with nearby pixels sharing the shaded result so overall image fidelity holds up.
Depending on hardware support, developers can specify the shading rate in three ways: per draw call, per primitive, or through a screen-space shading-rate image covering the whole render pass, and the three sources can be combined for best results. A negative texture mipmap LOD bias can help reduce blurriness in areas shaded at a coarser rate. The sketch after this paragraph shows what the per-draw and image-based paths look like in code.
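As an illustration only, the following sketch sets a coarse per-draw rate and, where a shading-rate image is available (Tier 2 hardware), combines it with that rate; the command list and shading-rate resource are assumed to be created elsewhere, and the 2x2 rate and MAX combiners are arbitrary choices.

```cpp
// Minimal sketch: per-draw VRS plus an optional screen-space shading-rate image.
// Assumes `cmdList5` is an ID3D12GraphicsCommandList5* on VRS-capable hardware
// and `shadingRateImage` is a correctly created shading-rate resource (or null).
#include <d3d12.h>

void DrawWithVrs(ID3D12GraphicsCommandList5* cmdList5,
                 ID3D12Resource* shadingRateImage /* nullptr on Tier 1 */)
{
    // Combiners decide how the per-draw, per-primitive and image rates merge;
    // D3D12_SHADING_RATE_COMBINER_MAX keeps the coarsest of the inputs.
    D3D12_SHADING_RATE_COMBINER combiners[2] = {
        D3D12_SHADING_RATE_COMBINER_MAX,   // draw rate vs. primitive rate
        D3D12_SHADING_RATE_COMBINER_MAX    // result vs. shading-rate image
    };

    // Per-draw rate: shade one value per 2x2 pixel block for subsequent draws.
    cmdList5->RSSetShadingRate(D3D12_SHADING_RATE_2X2, combiners);

    // Tier 2 only: a screen-space image whose texels pick the rate per tile.
    if (shadingRateImage)
        cmdList5->RSSetShadingRateImage(shadingRateImage);

    // ... issue draw calls here; they are shaded at the combined rate ...
}
```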
3DMark includes a feature test that shows how variable rate shading affects performance and visual fidelity. To run it, you need a device running Windows 10 version 1903 or later that supports DirectX 12 Tier 1 VRS and reports the D3D12_FEATURE_DATA_D3D12_OPTIONS6::AdditionalShadingRatesSupported capability; querying that capability is sketched below.
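A minimal sketch of querying those capabilities through CheckFeatureSupport might look like this; the device pointer is assumed to come from the application's normal startup code, and the printed strings are purely illustrative.

```cpp
// Minimal sketch: query VRS tier and the additional-shading-rates capability.
#include <d3d12.h>
#include <cstdio>

void ReportVrsSupport(ID3D12Device* device)
{
    D3D12_FEATURE_DATA_D3D12_OPTIONS6 options6 = {};
    if (SUCCEEDED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS6,
                                              &options6, sizeof(options6))))
    {
        // Tier 1 = per-draw rates only; Tier 2 adds per-primitive rates and
        // screen-space shading-rate images.
        std::printf("VRS tier: %d\n", (int)options6.VariableShadingRateTier);

        // Required by the 3DMark VRS feature test: the extra coarse rates
        // (2x4, 4x2, 4x4) beyond the base set.
        std::printf("Additional shading rates: %s\n",
                    options6.AdditionalShadingRatesSupported ? "yes" : "no");
    }
}
```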
Sampler Feedback
Sampler feedback answers a question every game with streamed textures faces: when a texture is sampled, which texels, and at which mip levels, were actually needed? With sampler feedback, the GPU records which texel regions were touched during rendering, so the engine can load only the parts of a texture that are actually visible and skip redundant, expensive shading work, reducing memory usage and improving performance.
Microsoft provides an in-depth blog post describing sampler feedback and how it operates, with code examples for both the CPU and GPU sides, and has also published a DirectX 12 specification entry for the feature.
Feedback maps are written by the GPU in an opaque layout and are transcoded to and from a readable format using resolve modes such as D3D12_RESOLVE_MODE_ENCODE_SAMPLER_FEEDBACK. The feature is available in DirectX 12 on Windows 10 PCs running a recent release (version 2004 or later) with hardware that reports sampler feedback support; a simple support check is sketched below.
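A minimal support check might look like the following sketch, assuming a device created against a Windows SDK new enough to define D3D12_FEATURE_DATA_D3D12_OPTIONS7.

```cpp
// Minimal sketch: does this device support sampler feedback at all?
#include <d3d12.h>

bool SupportsSamplerFeedback(ID3D12Device* device)
{
    D3D12_FEATURE_DATA_D3D12_OPTIONS7 options7 = {};
    if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS7,
                                           &options7, sizeof(options7))))
        return false;

    // TIER_0_9 covers the common streaming cases (MinMip feedback);
    // TIER_1_0 is full support. Anything above NOT_SUPPORTED is usable.
    return options7.SamplerFeedbackTier != D3D12_SAMPLER_FEEDBACK_TIER_NOT_SUPPORTED;
}
```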
Taking advantage of DirectX 12 more broadly means creating a device at an appropriate feature level and recording work into command lists. Microsoft says this lets developers reduce driver overhead by recording lists of commands on parallel threads and sending them to the GPU in batches, saving rendering time, decreasing latency and increasing frames per second; the sketch after this paragraph shows the basic pattern.
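The sketch below records two command lists and submits them as a single batch; in a real engine each list would be filled on its own worker thread with its own allocator, and pipeline state, resource barriers and frame synchronization (all omitted here) would be needed around it.

```cpp
// Minimal sketch: record two command lists and execute them in one batch.
// Assumes `device` and `queue` are already created elsewhere.
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

void RecordAndSubmit(ID3D12Device* device, ID3D12CommandQueue* queue)
{
    ComPtr<ID3D12CommandAllocator>    alloc[2];
    ComPtr<ID3D12GraphicsCommandList> list[2];

    for (int i = 0; i < 2; ++i)
    {
        // One allocator per recording thread; allocators are not thread-safe.
        device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_DIRECT,
                                       IID_PPV_ARGS(&alloc[i]));
        device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT,
                                  alloc[i].Get(), nullptr,
                                  IID_PPV_ARGS(&list[i]));

        // ... each thread records its share of draw/dispatch commands here ...

        list[i]->Close();   // a list must be closed before execution
    }

    // Submit both lists to the GPU as a single batch.
    ID3D12CommandList* batch[2] = { list[0].Get(), list[1].Get() };
    queue->ExecuteCommandLists(2, batch);
}
```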
As with any new feature, Microsoft releases documentation and examples alongside it. The 3DMark feature tests, which include a sampler feedback implementation, are an easy way to experiment with the technology against existing software.
Microsoft says this functionality gives games on Series X hardware a significant boost: higher texture fidelity and asset quality, less VRAM stutter and pop-in, and more headroom for gameplay, with the console's 10GB of GPU-optimal RAM effectively behaving like 16GB. The reclaimed memory also lets applications stream in large assets quickly while reducing SSD storage needs, and Microsoft is working closely with Nvidia and Qualcomm to bring the feature to upcoming GPU hardware.