DLSS – Deep Learning Super Sampling
Super Sampling, also known as Super Sampling Anti-Aliasing (SSAA), is a spatial anti-aliasing method, i.e. a method to remove aliasing (jagged, pixelated edges, also known as “jaggies”) from video, rendered images, or the output of any other software that produces computer graphics.
Aliasing is rarely a problem at higher resolutions, but users whose hardware cannot run at those resolutions still have to deal with it. It occurs in computer-generated graphics because, unlike the real world, which is made up of continuous colours and materials, a computer screen is divided into a finite number of equal-sized squares known as pixels, each of which displays a single colour. When an edge in the image is neither vertical nor horizontal, the pixels along it form a staircase pattern. These jagged edges are technically known as aliasing and are commonly referred to as jaggies. They are less noticeable at higher resolutions because there are enough pixels to make the steps too small to see.
Super Sampling is done in four steps:
- Start from the low-resolution image, which is full of jaggies.
- The same scene is rendered at a higher resolution.
- In the higher-resolution image, colour samples are taken from the extra pixels (pixels that were not present at the lower resolution).
- The higher-resolution image is shrunk back down to the original resolution, and each pixel is given a new colour: the average of the colours sampled from the corresponding pixels of the higher-resolution image.
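The four steps above can be sketched in a few lines of Python. This is only a minimal illustration, not how a GPU implements it: the `shade`, `render` and `downsample` functions are hypothetical names made up for this example, the "scene" is just a diagonal black-and-white edge, and the image is a grayscale list of rows.

```python
def shade(u, v):
    """Hypothetical scene: white (1.0) on and below the diagonal, black (0.0) above.

    u and v are continuous coordinates in the range 0..1.
    """
    return 1.0 if u <= v else 0.0


def render(size):
    """Sample the scene once per pixel, at each pixel centre."""
    return [
        [shade((x + 0.5) / size, (y + 0.5) / size) for x in range(size)]
        for y in range(size)
    ]


def downsample(high_res, factor):
    """Average each factor x factor block of pixels into a single pixel.

    Both image dimensions are assumed to be exact multiples of factor.
    """
    height = len(high_res) // factor
    width = len(high_res[0]) // factor
    low_res = []
    for y in range(height):
        row = []
        for x in range(width):
            block = [
                high_res[y * factor + dy][x * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) / len(block))
        low_res.append(row)
    return low_res


aliased = render(4)                  # step 1: rendered directly at 4x4
high_res = render(8)                 # step 2: the same scene at 8x8
smoothed = downsample(high_res, 2)   # steps 3 and 4: average 2x2 blocks

for row in smoothed:
    print(row)
```

In `aliased` every pixel is either 0.0 or 1.0, which is what produces the staircase. In `smoothed` the pixels that the edge passes through get an in-between value, so the edge looks softer at the same 4x4 resolution.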
Super Sampling is quite hardware-intensive: it requires a large amount of video card memory and memory bandwidth, because the buffer it uses is often large. There are several ways to choose the sample positions for Super Sampling; some of the most commonly used are grid, jittered, Poisson disc, random and rotated grid.
What is Deep Learning Super Sampling?
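To make the difference between these sampling patterns concrete, here is a rough Python sketch that generates sample positions inside a single pixel (coordinates from 0 to 1). All function names are hypothetical, the Poisson disc version uses simple "dart throwing", and wrapping rotated samples back into the pixel is a shortcut chosen for this sketch; real renderers may do these differently.

```python
import math
import random


def grid(n):
    """n x n samples at the centres of a regular sub-pixel grid."""
    return [((i + 0.5) / n, (j + 0.5) / n) for j in range(n) for i in range(n)]


def random_samples(count, rng):
    """Independent uniform samples anywhere inside the pixel."""
    return [(rng.random(), rng.random()) for _ in range(count)]


def jittered(n, rng):
    """One uniformly random sample inside each cell of an n x n grid."""
    return [
        ((i + rng.random()) / n, (j + rng.random()) / n)
        for j in range(n)
        for i in range(n)
    ]


def rotated_grid(n, angle_degrees=26.6):
    """Regular grid rotated about the pixel centre.

    Samples that leave the pixel are wrapped back in with modulo 1
    (a simplification of this sketch; it is not needed for n = 2).
    """
    c = math.cos(math.radians(angle_degrees))
    s = math.sin(math.radians(angle_degrees))
    samples = []
    for x, y in grid(n):
        dx, dy = x - 0.5, y - 0.5
        samples.append(((0.5 + c * dx - s * dy) % 1.0,
                        (0.5 + s * dx + c * dy) % 1.0))
    return samples


def poisson_disc(count, min_dist, rng, max_tries=10_000):
    """Dart throwing: keep a random sample only if it is at least
    min_dist away from every sample kept so far.

    May return fewer than count samples if min_dist is too large.
    """
    samples = []
    tries = 0
    while len(samples) < count and tries < max_tries:
        tries += 1
        candidate = (rng.random(), rng.random())
        if all(math.dist(candidate, kept) >= min_dist for kept in samples):
            samples.append(candidate)
    return samples


rng = random.Random(0)  # fixed seed so the example is repeatable
patterns = {
    "grid": grid(2),
    "random": random_samples(4, rng),
    "jittered": jittered(2, rng),
    "rotated grid": rotated_grid(2),
    "poisson disc": poisson_disc(4, 0.3, rng),
}
for name, samples in patterns.items():
    print(name, [(round(x, 3), round(y, 3)) for x, y in samples])
```

With the plain 2x2 grid, the four samples share only two distinct x values and two distinct y values. After rotation, all four x values (and all four y values) differ, which is why a rotated grid tends to handle nearly horizontal and nearly vertical edges better with the same number of samples.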
Deep Learning Super Sampling, or DLSS, is a technology developed by Nvidia that uses deep learning to produce an image that looks like a higher-resolution version of a lower-resolution input image. It was advertised as a key feature of Nvidia's RTX graphics card line-up in 2018. At the time of writing, it is available only on the RTX 20 series GPUs. At launch, the results were not much better than ordinary resolution upscaling, and the algorithm had to be trained specifically for each game that used it.
In 2020, Nvidia shipped an improved version of DLSS, named DLSS 2.0, with driver 445.75. It was available for a few existing games at the time and was to be supported by upcoming games. Nvidia said that it again uses machine learning but no longer has to be trained specifically for each game. Benchmarks on the video game Control tend to show that, for example, at a 4K output resolution, the image produced with the “Quality” DLSS preset (upscaled from a 1706×960 pixel input resolution) has the same quality as native 4K rendering at about double the performance.
However, DLSS 2.0 does not work well with other anti-aliasing techniques such as MSAA or TSAA, and performance can suffer badly if these are applied on top of DLSS.
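A quick bit of arithmetic shows where the performance gain comes from. The snippet below takes the input resolution quoted above at face value and assumes that "4K" means 3840×2160; both the `pixel_count` helper and that assumption are mine, not Nvidia's.

```python
def pixel_count(width, height):
    """Number of pixels the GPU has to shade for one frame."""
    return width * height


output_pixels = pixel_count(3840, 2160)   # assumed meaning of "4K"
rendered_pixels = pixel_count(1706, 960)  # input resolution quoted in the text

ratio = output_pixels / rendered_pixels
print(f"native 4K shades about {ratio:.1f}x as many pixels per frame")
```

Under these assumptions the game shades roughly a fifth of the pixels it would at native resolution, and the neural network fills in the rest. The upscaling step itself also costs time, so the speed-up in frames per second is smaller than the raw pixel ratio.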
The two versions of DLSS work as follows:
- DLSS 1.0:
Nvidia states that DLSS 1.0 worked by generating a “perfect frame” for each lower-resolution frame using the traditional Super Sampling technique. The neural network was then trained on these resulting frames. The model was also trained to recognize aliased inputs in the initial result.
- DLSS 2.0:
The neural network was trained on ideal high-resolution images of games, rendered on a high-end computer, together with their lower-resolution counterparts. The result is stored in the video card driver.
The neural network stored in the driver compares the actual low-resolution image with the reference images and produces a full high-resolution result image. The trained network takes as input the low-resolution image together with the low-resolution motion vectors from the game engine. The motion vectors tell the network in which direction each object in the frame is moving, which helps it estimate the upcoming frame.
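To get a feel for what motion vectors offer, here is a toy Python sketch that estimates the next frame by moving each pixel along its motion vector. This is not the DLSS network or Nvidia's method, only an illustration of the idea. The `predict_next_frame` function, the whole-pixel `(dx, dy)` vectors and the `hole_value` used for vacated pixels are all assumptions made for this example.

```python
def predict_next_frame(frame, motion, hole_value=0.0):
    """Estimate the next frame by moving every pixel along its motion vector.

    frame      : list of rows of intensities.
    motion     : same shape as frame; each entry is an integer (dx, dy)
                 displacement in pixels.
    hole_value : colour used where nothing lands (content moved away).
    Pixels pushed outside the frame are dropped.
    """
    height, width = len(frame), len(frame[0])
    predicted = [[None] * width for _ in range(height)]

    # Place static pixels first and moving pixels second,
    # so that moving content ends up on top.
    for moving in (False, True):
        for y in range(height):
            for x in range(width):
                dx, dy = motion[y][x]
                if ((dx, dy) != (0, 0)) != moving:
                    continue
                nx, ny = x + dx, y + dy
                if 0 <= nx < width and 0 <= ny < height:
                    predicted[ny][nx] = frame[y][x]

    return [[hole_value if p is None else p for p in row] for row in predicted]


# A single bright pixel moving two pixels to the right per frame.
frame = [[0.0, 9.0, 0.0, 0.0, 0.0]]
motion = [[(0, 0), (2, 0), (0, 0), (0, 0), (0, 0)]]

predicted = predict_next_frame(frame, motion)
print(predicted)  # [[0.0, 0.0, 0.0, 9.0, 0.0]]
```

The bright pixel ends up two positions to the right, and the spot it left behind is filled with `hole_value` because nothing in the current frame says what was hidden behind it. Gaps like this are one reason a simple shift is not enough on its own.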