Sunday, August 10, 2008

Deferred rendering

I remember back to the intro data structures and algorithms class I took in college. The thing the professor kept trying to hammer home was not how a red-black tree works; it was that data structures and algorithms have strengths and weaknesses. This point affects everything we do in graphics. The majority of the things we implement have already been done before, whether by other game developers, in offline graphics years ago, or by academics. There are very few places where we come up with something brand new. Most of the inventions are small variations on existing techniques. So, given this fact, the most important skill we can have is the ability to learn all the available options and the strengths and weaknesses of each, and to apply the one best suited for the current job. And at every chance we get, add our own little tweaks and flavor to make it better than what has come before.

Forward Rendering

Forward rendering means that for every light-to-surface interaction, the surface is drawn with that lighting information. Each of these light-surface interactions is drawn additively to the screen. There are many problems with this method. Since each surface needs to be drawn again for every light that hits it, there are many redundant draws, triangles, and material pixel operations such as texture fetches or adding in detail maps. Despite these disadvantages it is still very popular; for instance, it is the way the Doom 3 and Unreal Engine 3 engines work.
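To make the cost structure concrete, here is a toy CPU-side sketch of the forward loop, with surfaces and lights reduced to single scalars and the names (`shade`, `forward_render`) made up for illustration:

```python
def shade(surface_albedo, light_intensity):
    # Stand-in for a full per-pixel material/light evaluation.
    return surface_albedo * light_intensity

def forward_render(surfaces, lights):
    framebuffer = 0.0
    draw_calls = 0
    for surface in surfaces:
        for light in lights:            # one full draw per light-surface pair
            framebuffer += shade(surface, light)  # additive blend
            draw_calls += 1             # redundant geometry and material work
    return framebuffer, draw_calls
```

The point is the draw count: every surface pays its vertex and material cost once per light that touches it.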

Deferred Rendering
Deferred rendering was invented to solve these problems. Traditional deferred rendering draws all needed surface and material attributes into a deep framebuffer called the G-buffer. For each visible light, the light geometry can then be drawn to the screen with a shader that reads from the G-buffer and adds that light's contribution to the color buffer. No direct interaction between the surface and the light is needed. This is only possible because anything that shader would need is put in the G-buffer. Each surface only needs to fill in the G-buffer with its attributes once, meaning no redundant draws, triangles, or material pixel cost.

Deferred rendering is not without its disadvantages either. The G-buffer can take up quite a bit of space, so packing the attributes into the smallest space possible is the goal. Since the G-buffer is always quite fat, the attribute drawing pass is almost always ROP bound. In my opinion the worst problem is that special materials that do something nonstandard can't work. Exactly what custom materials can do is defined by what is in the G-buffer, and usually only the common attributes are packed, for space reasons.
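The packing pressure means attributes get quantized hard. A minimal sketch of the usual trick of squeezing a unit normal into an RGBA8 channel set, mapping [-1, 1] to a byte per component (function names are mine, and real engines often use fancier encodings):

```python
def pack_unorm8(x):
    # Map a [-1, 1] component to one 8-bit G-buffer channel.
    return max(0, min(255, int(round((x * 0.5 + 0.5) * 255.0))))

def unpack_unorm8(b):
    return (b / 255.0) * 2.0 - 1.0

def pack_normal(n):
    return tuple(pack_unorm8(c) for c in n)

def unpack_normal(p):
    return tuple(unpack_unorm8(b) for b in p)
```

The round trip loses a little under half a percent per component, which is the price paid for keeping the G-buffer thin.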

Guerrilla's deferred system for Killzone 2 is explained in this great presentation.

In their system the attributes stored in the G buffer are:
  • RGBA8 for color
  • standard depth/stencil buffer (can be used to derive world position)
  • normal
  • XY motion vectors
  • spec exponent
  • spec intensity
  • diffuse color
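
Just to put a number on how fat this gets: assuming the attributes above pack into four RGBA8 render targets plus a D24S8 depth/stencil buffer (an illustrative layout on my part, not necessarily Guerrilla's exact arrangement), the per-frame footprint at 720p works out like this:

```python
def gbuffer_bytes(width, height, num_rgba8_targets=4, depth_stencil_bytes=4):
    # 4 bytes per RGBA8 target per pixel, plus the depth/stencil surface.
    bytes_per_pixel = num_rgba8_targets * 4 + depth_stencil_bytes
    return width * height * bytes_per_pixel

megabytes = gbuffer_bytes(1280, 720) / (1024 * 1024)  # roughly 17.6 MB
```

Around 17-18 MB of render targets that every opaque pixel must write, which is why the attribute pass ends up ROP bound.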

There are a few things immediately obvious that this can't do. There is no floating point color buffer, so real HDR is not possible, nor are things like gamma-correct lighting that require higher precision color. Because only a spec intensity is stored, only grayscale specularity is possible. Since the game looks pretty gray this is likely not a problem for them, but it is for other people.

What isn't obvious is that the lighting equation has to be the same across all materials. It is likely Phong based. This rules out cool things like hair shaders, anisotropic brushed metal, fake subsurface scattering, Fresnel, roughness, fuzz, cloth shaders, etc.

So, how about some alternative methods? A few have been popping up over the past year.

Light Indexed Deferred Rendering

This builds a screen buffer of the indexes of the lights that interact with each pixel. The base implementation ignores depth, so a light may not actually hit the frontmost surface at that pixel, though it can be extended to do so. The advantages are support for a large number of lights with only one pass of surface drawing. It also solves the problem of custom materials, because the lighting happens when the surface is drawn; no attributes of either the light or the surface have to be picked to store out. It also solves the ROP problem because no real deep framebuffer is ever drawn.
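The two passes look roughly like this as a CPU sketch (pixels and lights as simple dicts, and `affects` standing in for the light-volume rasterization; all names here are illustrative):

```python
def light_indexed_render(pixels, lights, affects):
    # Pass 1: rasterize light volumes to build a per-pixel list of light indexes.
    index_buffer = [[i for i, light in enumerate(lights) if affects(p, light)]
                    for p in pixels]
    # Pass 2: draw each surface once; its own shader looks up the indexes
    # for its pixel and evaluates whatever lighting model it likes per light.
    color = []
    for p, indexes in zip(pixels, index_buffer):
        color.append(sum(p['albedo'] * lights[i]['intensity'] for i in indexes))
    return color
```

Because pass 2 runs the surface's own shader, each material keeps full control of its lighting equation, which is exactly what the fat G-buffer approach gives up.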

There are a number of problems with it though. First off, DX9 does not support dynamically indexed uniform access, so to pass in indexable light data you have to pack all the light data into textures. This is a major pain and can be a performance problem, depending on how often the data is updated and how many textures are required to hold it; to pass everything, multiple floating point textures may be needed.

Another problem is that lights are not applied one at a time. This means the calculated shadows for a light need to stay around until all lights are applied. So for 4 shadowing lights on screen you will likely need a 4-channel screen texture to composite them, because stencil won't work and the number of shadow maps will likely be high. If you have 5, you'll need another screen texture. Other methods apply one light at a time, so there is no need to keep the shadows from multiple lights around at once. Your "G-buffer" is now sized by the number of light indexes per pixel plus the number of shadows on screen. You can change this to the number of shadows per pixel if you read the light index buffer when writing out the shadows.
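One common way to keep the per-pixel index storage cheap is to pack up to four 8-bit light indexes into a single 32-bit RGBA8 texel, one index per channel. A sketch of that packing (assuming at most 4 lights per pixel and at most 256 lights total; the exact scheme varies between implementations):

```python
def pack_light_indices(indices):
    # Pack up to four 8-bit light indexes into one 32-bit RGBA8-style texel.
    assert len(indices) <= 4 and all(0 <= i < 256 for i in indices)
    texel = 0
    for slot, index in enumerate(indices):
        texel |= index << (8 * slot)    # one index per 8-bit channel
    return texel

def unpack_light_indices(texel, count):
    return [(texel >> (8 * slot)) & 0xFF for slot in range(count)]
```

This caps the lights per pixel at four per index texture; more lights per pixel means another screen-sized buffer, which is the growth problem described above.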

Next Time
I didn't think this post was going to be so massive so I've decided to split it up and post this part now. I will go through the other deferred rendering alternatives as well as another option I haven't heard people talking about that I particularly like in my next post.


James said...

I am pretty sure that Unreal Engine 3 uses a deferred renderer.

Koka said...

It doesn't; Unreal Engine 4 does, though.