Wednesday, September 14, 2016

Reflections and Roughness

This post continues my work on cube map reflections from where I left off in an earlier post on this topic. I had it working pretty well at the time. However, I was never able to get 100 reflective objects in the scene in real time because I didn't have enough GPU memory on my 2GB card. I now have a GeForce GTX 1070 with 8GB of video memory, which should allow me to add as many as 300 reflective objects.

Another problem that I had with the earlier reflection framework was the lack of surface roughness support. Every object was a perfect mirror reflector. I did some experiments with mipmap biasing to try to get a proper rough surface (such as brushed metal), but I never got it working at a reasonable performance and quality point. I think I've finally solved this one, as I'll explain below.

3DWorld uses a Phong specular exponent (shininess factor) lighting model because of its simplicity. Physically based rendering incorporates more complex and accurate lighting models, which often include a factor for surface roughness. I'm converting shininess to surface roughness by mapping the specular exponent to a texture filter/mipmap level, which determines which power-of-two sampling window to use to compute each blurred output texel. I use an equation I found online for the conversion:
filter_level = log2(texture_size*sqrt(3)) - 0.5*log2(shininess + 1.0)
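As a rough sketch, the conversion above can be written as a small function. The function name, the texture_size parameter name, and the clamp at zero are my assumptions for illustration; the post only gives the equation itself.

```python
import math

def shininess_to_filter_level(shininess, texture_size):
    """Map a Phong specular exponent to a fractional blur filter level.

    texture_size is the cube map face width in texels (assumed meaning).
    """
    level = (math.log2(texture_size * math.sqrt(3.0))
             - 0.5 * math.log2(shininess + 1.0))
    # Assumption: clamp at 0 so very shiny surfaces map to a plain mirror.
    return max(level, 0.0)

if __name__ == "__main__":
    # Hypothetical 512x512 cube map faces, a few sample exponents.
    for s in (0.0, 10.0, 100.0, 1000.0):
        print(s, round(shininess_to_filter_level(s, 512), 2))
```

A higher shininess gives a lower filter level and therefore a sharper reflection.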

The problem with using lower-resolution mipmap levels to down-sample and blur the reflection texture is the poor quality of the filtering. Mipmaps use a recursive 2x2 pixel box filter, which produces blocky artifacts in the reflection, as seen in the following screenshot. Here the filter_level is equal to 5, which means that each pixel is an average of 2^5 x 2^5 = 32x32 source texels. Click on the image to zoom in, and look closely at the reflection of the smiley in the closest sphere.
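To make the "average of 32x32 source texels" claim concrete, here is a minimal CPU-side sketch of a recursive 2x2 box filter. It assumes a single square, power-of-two, single-channel image stored as a list of rows, which is a simplification of a real cube map face; the function names are made up for this example.

```python
def downsample_2x2(img):
    """Build the next mipmap level by averaging each 2x2 block of texels."""
    h, w = len(img), len(img[0])
    return [[(img[2*y][2*x] + img[2*y][2*x + 1] +
              img[2*y + 1][2*x] + img[2*y + 1][2*x + 1]) / 4.0
             for x in range(w // 2)]
            for y in range(h // 2)]

def mip_level(img, level):
    """Apply the 2x2 box filter 'level' times."""
    for _ in range(level):
        img = downsample_2x2(img)
    return img

if __name__ == "__main__":
    size = 32
    base = [[float(x + y * size) for x in range(size)] for y in range(size)]
    top = mip_level(base, 5)  # 32x32 -> 1x1
    mean = sum(map(sum, base)) / (size * size)
    print(top[0][0], mean)    # the single level 5 texel is the 32x32 mean
```

Because every source texel in the block gets the same weight and neighboring blocks don't overlap, the result has hard block edges, which is where the blocky look comes from.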

Rough reflection using mipmap level 5 (32x32 pixel box filter) with blocky artifacts.

The reflection would look much better with a higher order filter, such as a bi-cubic filter. Unfortunately, there is no GPU texture hardware support for higher order filtering. Only linear filtering is available. Adding bi-cubic texture filtering is possible through shaders, but is complex and would make the rendering time increase significantly.

An alternative approach is to do the filtering directly in the fragment shader when rendering the reflective surface, by performing many texture samples within a window. This is more of a brute force approach. Each sample is offset to access a square area around the target pixel. I use an NxN tap Gaussian weighted blur filter, where:
N = 2^(filter_level+1) - 1
A non-blurred perfect mirror reflection with filter_level=0 has a single sample, computed as N = 2^(0+1)-1 = 1. [Technically, a single filter sample still linearly interpolates between 4 adjacent texels using the hardware interpolation unit.] A filter_level=5 Gaussian kernel has N = 2^(5+1)-1 = 63 samples in each dimension, for 63 x 63 = 3969 samples total. That's a lot of texture samples! It really kills performance, dropping the framerate from 220 FPS to only 19 FPS as shown in the screenshot below. Note the framerate in the lower left corner of the image. But the results look great!
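Here is a small sketch of the kernel size formula and a normalized NxN Gaussian weight table. The real filter runs in the fragment shader; this Python version only illustrates the math. The choice of sigma = N/6 is my assumption, since the post doesn't say how the Gaussian width is picked.

```python
import math

def kernel_size(filter_level):
    """N = 2^(filter_level+1) - 1 taps per dimension."""
    return 2 ** (filter_level + 1) - 1

def gaussian_weights(n, sigma=None):
    """Return an n x n table of Gaussian weights that sums to 1."""
    if sigma is None:
        sigma = max(n / 6.0, 1e-6)  # assumed width: about 3 sigma per side
    half = n // 2
    w1d = [math.exp(-0.5 * ((i - half) / sigma) ** 2) for i in range(n)]
    total = sum(w1d) ** 2           # the 2D kernel is separable
    return [[a * b / total for b in w1d] for a in w1d]

if __name__ == "__main__":
    for level in (0, 3, 5):
        n = kernel_size(level)
        print(level, n, n * n)      # filter level, taps per side, total taps
```

Each weight multiplies one offset texture sample, so the total tap count grows as N^2, roughly 4x per filter level.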

Rough reflection using a 63x63 shader texture filter kernel taking 3969 texture samples and running at only 19 FPS.

The takeaway is that mipmaps are fast but produce poor visual results, and shader texture filtering is slow but produces good visual results. So what do we do? I chose to combine the two approaches: select a middle mipmap level, and filter it using a small kernel. This has a fraction of the texture lookups/runtime cost, but produces results that are almost as high quality as the full filtering approach. For a filter_level of 5, I split this into a mipmap_filter_level of 2 and a shader_filter_level of 3. The mipmap filtering is applied first, using mipmap level 2, where each texel averages 2^2 x 2^2 = 4x4 source texels. Then the shader filtering is applied with a kernel size N = 2^(3+1)-1 = 15. The total number of texture samples is 15x15 = 225, which is about 17.6x fewer texture accesses than the 3969 needed by the full 63x63 kernel. This gets the frame rate back up to around 220 FPS.
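The split can be sketched as follows. The post only gives the one example (5 = 2 + 3), so the rule used here, capping the shader level at 3 and pushing the remainder into the mipmap level, is an assumption, and the function names are hypothetical.

```python
def kernel_size(filter_level):
    """N = 2^(filter_level+1) - 1 taps per dimension."""
    return 2 ** (filter_level + 1) - 1

def split_filter_level(filter_level, max_shader_level=3):
    """Split a filter level into (mipmap_filter_level, shader_filter_level).

    Assumed rule: the shader handles at most max_shader_level levels,
    and the mipmap handles whatever is left over.
    """
    shader_level = min(filter_level, max_shader_level)
    mipmap_level = filter_level - shader_level
    return mipmap_level, shader_level

def sample_count(shader_level):
    """Total texture samples for an NxN shader kernel."""
    n = kernel_size(shader_level)
    return n * n

if __name__ == "__main__":
    mip, shader = split_filter_level(5)
    full, hybrid = sample_count(5), sample_count(shader)
    print(mip, shader, full, hybrid, round(full / hybrid, 1))
```

With this rule the per-pixel cost stops growing once the filter level passes the cap, since any extra blur comes from picking a smaller mipmap level.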

I'm not sure exactly why it's as fast as a 1x1 filter. The texture reads from the level 2 mipmap data are likely faster due to better GPU cache coherency between the threads. That would make sense if the filtering were limited by texture memory bandwidth. I assume the frame rate is limited by something else for this scene + view, maybe by the CPU or other shader code.

Here is what the final image looks like. It's almost identical in quality to the 63x63 filter kernel image above. The amount of blur is slightly different due to the inexactness of the filter_level math (it's integer, not floating-point, so there are rounding issues). Other than that, the results are perfectly acceptable. Also, this image uses different blur values for the other spheres to the right, so concentrate on the closest sphere on the left for comparison with the previous two images.

Rough reflection using a combination of mipmap level 2 and a 15x15 texture filter kernel taking 225 texture samples.

Here is a view of 8 metal spheres of varying roughness, from matte (fully diffuse lighting) on the left to mirror reflective (fully specular lighting) on the right. Each sphere is one filter_level different from the one next to it; the specular shininess factor increases by 2x from left to right.

Reflective metal spheres of varying roughness with roughest on the left and mirror smooth on the right.

This screenshot shows a closer view of the rough sphere on the left, with the filter_level/specular exponent biased a bit differently to get a clearer reflection. There are no significant filtering artifacts even at this extreme blurring level.

Smiley reflection in rough metal sphere showing high quality blur.

I'm pretty happy with these results, and the solution is relatively simple. The next step is to make the materials editable by the user and to make the reflective shapes dynamic so that they can be moved around the level. In fact, I've already done this, but I'll have to show it in a later post.
