I just stumbled on this blog the other day that I haven't seen linked in the graphics blog circle so I thought I'd make a point of mentioning it. Chris Evans, a technical artist from Crytek, has a blog. He has just left Crytek to go to ILM and I hope he keeps up the blog as it's full of art and technical topics. His main site also has some good stuff like cryTools, Crytek's suite of max scripts.
I didn't get a chance to play Resistance 2 yet but for a good rendering breakdown of the game check out Timothy Farrar's post. Timothy is a Senior Systems Programmer I work with at Human Head so you can trust him ;). From a part of the game I did see I think the object shadows work similarly to the first cascade of cascaded shadow maps, in that there is one shadow map that is fit to the view frustum within a short range and fades out past that range. I haven't seen the rest of the game to know whether shadows can come from more than one direction, which would make this not work.
Thursday, November 20, 2008
Sunday, November 16, 2008
Smooth transitions
The T-rex night attack scene from Jurassic Park was a major milestone for CG. It showed a realistic and convincing CG character in a movie. As the T-rex comes after a girl Dr. Grant tells her, "Don't move. If we don't move he won't see us". This is something graphics programmers should remember because it holds true for humans as well. Our eyes are very good at seeing sharp changes but not very good at seeing smooth changes. If there is a way to smooth a hard transition you can get away with a lot more than if you didn't. I think this is the prime thing missing in most LOD implementations. The LOD change gets pushed far enough into the distance that you can't tell when it pops from one LOD level to the next. If the pop were replaced with a smooth transition the LOD distance could be pushed significantly closer and still not be noticeable.
Fracture
I only played the demo but I really liked their cascaded shadow map implementation. It looks like there are just 2 levels to it. After that it goes to no shadows. What is really nice about it is there's a smooth transition between the levels. As you walk out to an object it will fade from no shadow to low res shadow to high res shadow. So many cascaded shadow maps in games look strange or jarring because there is a line on the ground where the shadows go from one res to another. This line moves with your view direction and movement.
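To make the fade concrete, here is a minimal sketch of how a two-level cascade with smooth transitions could be weighted by view distance. This is not Fracture's actual implementation, which I don't know. The names cascadeWeights, highEnd, lowEnd and fadeWidth are all made up for illustration, and the smoothstep fade curve is an assumption.

```cpp
#include <algorithm>
#include <cassert>

// Hermite smoothstep: 0 below edge0, 1 above edge1, smooth in between.
inline float smoothStep(float edge0, float edge1, float x) {
    assert(edge1 > edge0);
    float t = std::min(std::max((x - edge0) / (edge1 - edge0), 0.0f), 1.0f);
    return t * t * (3.0f - 2.0f * t);
}

// Blend weights for the three states. They always sum to 1.
struct CascadeWeights {
    float highRes;
    float lowRes;
    float unshadowed;
};

// highEnd / lowEnd: hypothetical view distances where each cascade ends.
// fadeWidth: hypothetical length of the cross-fade leading up to each boundary.
inline CascadeWeights cascadeWeights(float viewDist, float highEnd,
                                     float lowEnd, float fadeWidth) {
    float toLow  = smoothStep(highEnd - fadeWidth, highEnd, viewDist);
    float toNone = smoothStep(lowEnd - fadeWidth, lowEnd, viewDist);
    return { 1.0f - toLow, toLow * (1.0f - toNone), toLow * toNone };
}

// shadowHigh / shadowLow: light visibility (1 = fully lit) from each cascade.
inline float blendedVisibility(const CascadeWeights& w,
                               float shadowHigh, float shadowLow) {
    return w.highRes * shadowHigh + w.lowRes * shadowLow + w.unshadowed * 1.0f;
}
```

Because the weights change continuously with distance there is never a hard line on the ground. The cost is sampling both cascades inside the fade band.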
Gears of War 2
In my opinion any serious game graphics programmer or artist is obligated to play Gears of War 2. It is the bar now for graphics on a console. For being blown away by visuals it gives Crysis a run for its money too. As far as tech that seems absurd but it's just the combination of art and tech with enough things I'd never seen before that this takes the prize for me. It isn't this through and through so it really takes playing the whole game to get what I mean.
As far as tech I was a bit surprised when Tim Sweeney showed at GDC the new things they added to the Unreal engine for the next Gears of War. I was surprised because it wasn't very much. In the time between UT2k4 and GoW they built a whole new engine. Sure it was an evolution but it was a large one. The renderer was rewritten, they created a whole set of high end tools and changed the internal framework drastically. For GoW to GoW2 they added SSAO, hordes of distant guys, water, changed character lighting, and destructible geometry (which wasn't really in the game). This doesn't sound like very much considering they have 18 engine programmers listed in the credits.
The change that impressed me the most is fading in textures when they stream in. Some UE3 games have gotten some flak for streaming images popping in. Instead of not pushing the memory as much they added fading in of mip levels. For smooth transitions this is brilliant! I'm guessing most will never know textures aren't streamed in on time because they will never see the pop again. I also noticed they may have pushed the texture streamer further because they didn't have to worry about subtle pops for new mip levels. I couldn't tell if this happens when an image downsizes because I never noticed any image downsize.
Fading new mip levels once they stream in is something I've wanted to do but I just don't know how they are doing it. If anyone knows or is doing something similar themselves I'd love to hear how it works. The problem I see is there are min and max mip levels settable for a texture sampler on a 360. These unfortunately are dwords, so they can't take the fractional values a fade needs. LOD bias is settable but this happens before the clamp to min and max. The only way I could see this working is if they calculate the gradient themselves in the shader and clamp it to the fade value as they lerp from the old clamp to the new full mip level. This seems to me like it would create a shader explosion if this needs to be turned on and off for every texture for every shader. The alternative is always manually calculating the texture gradient for all UVs used and then clamping it individually for each texture which I believe would be quite a bit slower.
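The manual clamp I'm speculating about can be sketched on the CPU like this. It is only a guess at the math, not what Epic does. The function names are invented, the mip selection is the usual isotropic approximation from UV derivatives measured in texels, and mip 0 is assumed to be the finest level. In a real shader the result would feed an explicit-LOD sample such as HLSL's tex2Dlod.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };

// Isotropic mip selection from screen-space UV derivatives given in texels:
// half the log2 of the larger squared gradient length, never below mip 0.
inline float mipFromGradients(Vec2 ddx, Vec2 ddy) {
    float lenSqX = ddx.x * ddx.x + ddx.y * ddx.y;
    float lenSqY = ddy.x * ddy.x + ddy.y * ddy.y;
    return std::max(0.0f, 0.5f * std::log2(std::max(lenSqX, lenSqY)));
}

// Fractional clamp that slides from the finest mip that was resident before
// streaming (oldFinestMip) to the finest mip resident now (newFinestMip).
// fade runs from 0 (just streamed in) to 1 (fully faded in).
inline float fadedMipClamp(float oldFinestMip, float newFinestMip, float fade) {
    assert(fade >= 0.0f && fade <= 1.0f);
    return oldFinestMip + (newFinestMip - oldFinestMip) * fade;
}

// The mip level to sample explicitly: never finer than the fading clamp.
inline float fadedSampleMip(float hardwareMip, float oldFinestMip,
                            float newFinestMip, float fade) {
    return std::max(hardwareMip, fadedMipClamp(oldFinestMip, newFinestMip, fade));
}
```

With trilinear filtering a fractional level blends the two neighbouring mips, which is what turns the pop into a fade.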
Next up is the screen space ambient occlusion (SSAO). This helped with their low res lightmaps and showed off their very high poly environments. Personally I think it was over done in many cases but I guess overall it was an improvement. I was surprised by the implementation. They are using frame recirculation to reduce the per frame cost. You can tell because obscuring objects will wipe the AO away for a moment before it grows back in. It seems to be at a pretty high res, possibly screen res. Previous results are found either using the velocity buffer or just the depth buffer and camera transformation matrix. Using this position they can sample from the previous calculated results without smearing things as you look around.
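The depth buffer plus camera matrix route can be sketched as follows. This is an assumption-laden illustration rather than the game's code. The matrices are row-major and multiply column vectors, UV (0,0) is taken to be the top-left of the screen, and depth is assumed to already be in the same range as clip-space z. All names are hypothetical.

```cpp
#include <array>
#include <cassert>

using Mat4 = std::array<std::array<float, 4>, 4>;  // row-major, p' = M * p
struct Vec4 { float x, y, z, w; };

inline Mat4 identity() {
    Mat4 m{};
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0f;
    return m;
}

inline Vec4 mul(const Mat4& m, const Vec4& v) {
    return { m[0][0] * v.x + m[0][1] * v.y + m[0][2] * v.z + m[0][3] * v.w,
             m[1][0] * v.x + m[1][1] * v.y + m[1][2] * v.z + m[1][3] * v.w,
             m[2][0] * v.x + m[2][1] * v.y + m[2][2] * v.z + m[2][3] * v.w,
             m[3][0] * v.x + m[3][1] * v.y + m[3][2] * v.z + m[3][3] * v.w };
}

struct ReprojectedUV { float u, v; bool valid; };

// Finds where the surface seen at (u, v, depth) this frame was on screen last
// frame, so last frame's AO can be fetched from there.
inline ReprojectedUV reprojectToPreviousFrame(float u, float v, float depth,
                                              const Mat4& invCurViewProj,
                                              const Mat4& prevViewProj) {
    // Screen UV -> normalized device coordinates (y flipped).
    Vec4 ndc{ u * 2.0f - 1.0f, 1.0f - v * 2.0f, depth, 1.0f };
    Vec4 world = mul(invCurViewProj, ndc);
    assert(world.w != 0.0f);
    world = { world.x / world.w, world.y / world.w, world.z / world.w, 1.0f };

    Vec4 prevClip = mul(prevViewProj, world);
    if (prevClip.w <= 0.0f) return { 0.0f, 0.0f, false };  // behind old camera

    float pu = (prevClip.x / prevClip.w) * 0.5f + 0.5f;
    float pv = 0.5f - (prevClip.y / prevClip.w) * 0.5f;
    bool onScreen = pu >= 0.0f && pu <= 1.0f && pv >= 0.0f && pv <= 1.0f;
    return { pu, pv, onScreen };
}
```

When the result is invalid there is no history to reuse, so the AO has to be rebuilt from scratch for that pixel. A real implementation would presumably also reject history when the old depth doesn't match, which would explain the AO wiping away behind obscuring objects.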
Much of the visual splendor comes from clever material effects. They have scrolling distortion maps to distort the UVs. There are pulsing parallax maps that look so good for a moment I thought the whole area was deforming geometry. There was inner glow from using an additive reverse fresnel like the cave ant lions in Half Life ep2. There was a window pane with rain drops coming down it that I had to study for like 2 mins to figure out what was going on. My guess, 1 droplet map, 2 identical drop stream maps independently masked by scrolling textures. The final normal map was used to distort and look up into an environment map. Their artists really went to town with some of this stuff.
The lighting is still mostly directional lightmaps. Shadows are character based modulate shadows, this time higher res than before. It seems they are only on characters this time leaving the SSAO to handle the rest of the dynamic objects.
The Latest Games
So, I've gotten some complaints that I haven't updated this blog in a while. Partly this is due to the abundance of games that have come out lately and partly this is due to my mind share being in work specific graphics algorithms. Since I cannot talk about what I do for work I decided a better area for blog topics is other people's games. The majority of my graphics research comes from reading about and studying games in whatever form I can get it. I'll go through some of the things I've found lately.
Dead Space
One of the stand out graphical features to me for Dead Space is dynamic shadows. All lights and shadows are dynamic. The shadows are projected or cube shadow maps depending on the light. It looks like bilinear filtering which makes the shadows look very pixelated and nasty at times. They really take advantage of the shadows being dynamic though by having objects and lights move around whenever they can. Swinging and vibrating light positions are abundant. For large lights the resolution really suffered and the player's shadow could reduce to blocks.
Coronas and similar effects were used a lot and looked great. With bloom being all the rage to portray bright objects it's a nice slap in the face that the old standbys can sometimes get a lot better results than the new fancy system. Similar tricks were used to get light beams through dusty corridors by using sheets that fade out as you get near them. This masks their flat shape well. It seems the artists went to town with this effect making different light beam textures for different situations. Foggy particle systems were also used that faded out when you get near them.
David Blizard, Dead Space's lighting designer, claimed their frame budget for building the deferred lighting buffer was 7.5ms, 4ms for building the shadow buffers, and 2ms for post processing including bloom and an antialiasing pass which means they are not using multisampling. For ambient light there is baked ambient occlusion for the world that modulates an ambient color coming from the lighting.
Mirror's Edge
Although it uses the Unreal engine it looks distinctly not like any other Unreal game or for that matter any other game out. This is due to its "graphic design" looking art style. Tech wise this relies heavily on global illumination. For this they replaced the normal lightmap generation tool from Unreal with Beast. From what I've seen in the first hour or so the only dynamic shadows are from characters and they are modulate ones. All environment lighting is baked into the lightmap with Beast. It looks like the lightmaps are at a higher resolution than I've seen used before. My guess is they sacrificed a larger portion of their texture memory and added streaming support for the lightmaps. It has auto exposure which I haven't seen from an Unreal game before. All of the reflections but one planar one were cube maps which caught me a little by surprise because DICE had talked about doing research into getting real reflections working. Overall I was impressed. You could tell there was a very tight art / tech vision for the game.
Fallout 3
The world in Fallout 3 is stunning. Tech side I don't see much improved from Oblivion. Maybe most of the changes are under the hood to get more things running faster but I don't see much different, just a few odds and ends. It does seem like the artists have really grown and they have come up with good ways for portraying the things they needed to.
A few things stood out to me. First is the light beams. This is a common new technique of using a shader that fades out the edges of a cone like mesh and some other stuff to portray a light cone. The first place I saw it was Gears of War but it was used heavily and well in Fallout 3.
Second was the grass that was done with many grass clump models with variation. The textures had a darkened core to simulate shadowing and with AA on they used alpha to coverage and looked great. So many games use just one grass sprite to fill in grass. No matter how dense you can get it it will not look right. Grass needs variation in size, color, texture, density and shape. Great job, it's some of the best grass I've seen.
The last thing that stood out to me was their crumble decals (don't know what else to call them). To portray crumbling damaged stone and concrete, alpha tested normal mapped decals were placed on the edges of models. It looks like the same instanced model was placed around with different placement of these crumble decals. It both added an apparent uniqueness to the models and broke up their straight edges. There were quite a few times I was fooled into thinking some perfectly straight edge was tessellated when it was just effective use of these decals. Also of note is that these decals will fade out in the distance often before anything else LODs. There was likely a system to handle these decals specially.
Overall, the art was nice but the tech was underwhelming. They really should have shadows by this point. There are no world shadows of any sort. With this addition I think the game would look twice as good as it does.
Saturday, August 30, 2008
How Pixar Fosters Collective Creativity
I just read this article by Ed Catmull on the business principles that have driven Pixar to their great success. Link. The first reason you should be interested is that Mr. Catmull is one of the pioneers of computer graphics. The second is Pixar, the company he created and still runs, is one of the most consistently successful companies around that creates artistic products. The type of work Pixar does is very close to the work we do in games and there is plenty to learn from his experiences.
Wednesday, August 13, 2008
Global Illumination
Talking about precalculated lighting reminded me of this awesome paper I just read from Pixar Point Based Color Bleeding. They got a 10x speed up on GI over raytracing. I did a bunch of research on this topic and it's funny I was just one small insight away from what they are doing. Not doing it this way required me to have to go down a completely different path. Sometimes it's small things that change success to failure.
Tuesday, August 12, 2008
Deferred rendering 2
I'll start off by saying check out the new papers from SIGGRAPH posted here. I was really surprised with the one on Starcraft II. Blizzard in the past has stayed behind the curve purposely to keep their requirements low and audience large. It seems this time they have kept the low range while expanding more into the high end. I was also surprised due to the nature of the visuals in the game. It's part adventure game? Count me in. It's looking great. It also has an interesting deferred rendering architecture which leads me to my next thing.
Deferred rendering part II. Perhaps I should have just waited and made one monster post but now you'll just have to live with it.
Light Pre-Pass
post
This was recently proposed by Wolfgang Engel. The main idea is to split material rendering into two parts. First part is writing out depth and normal to a small G-buffer. It's possible this can even all fit in one render target. With this information you can get all that is important from the lights which is N dot L and R dot V or N dot H whichever you want. The buffer is as follows:
LightColor.r * N.L * Att
LightColor.g * N.L * Att
LightColor.b * N.L * Att
R.V^n * N.L * Att
With this information standard forward rendering can be done just once. This comprises the second part of the material rendering.
He explains that R.V^n can be derived later by dividing out the N.L * Att but I don't understand any reason to do this. This also means a divide by the color that is just wrong. There's also the mysterious exponent that must be a global or something meaning no surface changeable exponent.
There are really a number of issues here. Specular doesn't have any color at all, not even from the lights. If you instead store R.V in the fourth channel and try to apply the power and multiply by LightColor * N.L * Att in the forward pass the multiplications have been shuffled with additions and it doesn't work out. There is no specular color or exponent and it is dependent on everything being the Phong lighting equation. It has solved the deep framebuffer problem but it is a lot more restrictive than traditional deferred rendering. All in all it's nice for a demo but not for production.
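A tiny numeric sketch shows the shuffling problem. The struct and function names are made up, and light color is reduced to a single intensity to keep it short. exactSpecular sums the full specular term per light, while shuffledSpecular mimics accumulating R.V and Intensity * N.L * Att in separate channels and only combining them in the forward pass.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Per-light terms at one pixel (hypothetical, monochrome for brevity).
struct LightTerm {
    float intensity;  // light color reduced to one channel
    float nDotLAtt;   // N.L * Att
    float rDotV;      // R.V, already clamped to [0, 1]
};

// What the lighting equation asks for: sum over lights of
// intensity * N.L * Att * (R.V)^n.
inline float exactSpecular(const std::vector<LightTerm>& lights, float exponent) {
    assert(exponent > 0.0f);
    float sum = 0.0f;
    for (const LightTerm& l : lights)
        sum += l.intensity * l.nDotLAtt * std::pow(l.rDotV, exponent);
    return sum;
}

// What you get if R.V and intensity * N.L * Att are accumulated additively in
// separate channels and the power and multiply are applied afterwards.
inline float shuffledSpecular(const std::vector<LightTerm>& lights, float exponent) {
    assert(exponent > 0.0f);
    float sumLight = 0.0f;
    float sumRDotV = 0.0f;
    for (const LightTerm& l : lights) {
        sumLight += l.intensity * l.nDotLAtt;
        sumRDotV += l.rDotV;
    }
    return sumLight * std::pow(sumRDotV, exponent);
}
```

With one light the two agree. With two identical lights (intensity 1, N.L * Att of 0.5, R.V of 0.5, exponent 2) the exact result is 0.25 while the shuffled one is 1.0, so the error grows as soon as lights overlap.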
Naughty Dog's Pre-Lighting
presentation
I have to admit when I sat through this talk I didn't really understand why they were doing what they were doing. It seemed overly complicated to me. After reading the slides afterwards the brilliance started to show through. The slides are pretty confusing so I will at least explain what I think they mean. Insomniac has since adopted this method as well but I can't seem to find that presentation. The idea is very similar to the Light Pre-Pass method. It is likely what you would get if you take Light Pre-Pass to its logical conclusion.
Surface rendering is split in 2 parts. First pass it renders out depth, normal and specular exponent. Second, the lights are drawn additively into two HDR buffers, diffuse and specular. The material's specular exponent has been saved out so this can all be done correctly. These two buffers can then be used in the second surface pass as the accumulated lighting, and material attributes such as diffuse color and spec color can be applied. They apply some extra trickery that complicates the slides: light drawing is combined in quads so a single pixel on screen never gets drawn during light drawing more than once.
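Here is how I read the data flow for a single pixel, as a rough sketch. It is my interpretation of the slides and not Naughty Dog's code. Every name is invented, vectors are assumed to be normalized, and I use plain Phong with the specular term also scaled by N.L.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
inline Vec3 add(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
inline Vec3 mul(Vec3 a, Vec3 b) { return { a.x * b.x, a.y * b.y, a.z * b.z }; }
inline Vec3 scale(Vec3 a, float s) { return { a.x * s, a.y * s, a.z * s }; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Written by the first surface pass (depth omitted here).
struct SurfaceSample { Vec3 normal; float specExponent; };
// One light as seen from this pixel.
struct LightSample { Vec3 color; Vec3 dirToLight; float attenuation; };
// The two HDR accumulation buffers, for one pixel.
struct LightBuffers { Vec3 diffuse; Vec3 specular; };

// Light pass: add one light's contribution to both buffers.
inline void accumulateLight(LightBuffers& buffers, const SurfaceSample& surface,
                            const LightSample& light, Vec3 dirToViewer) {
    assert(surface.specExponent > 0.0f);
    float nDotL = std::max(0.0f, dot(surface.normal, light.dirToLight));
    // Phong reflection vector: R = 2 (N.L) N - L.
    Vec3 r = add(scale(surface.normal, 2.0f * nDotL), scale(light.dirToLight, -1.0f));
    float rDotV = std::max(0.0f, dot(r, dirToViewer));
    Vec3 incoming = scale(light.color, nDotL * light.attenuation);
    buffers.diffuse = add(buffers.diffuse, incoming);
    buffers.specular = add(buffers.specular,
                           scale(incoming, std::pow(rDotV, surface.specExponent)));
}

// Second surface pass: apply the material to the accumulated lighting.
inline Vec3 shadePixel(const LightBuffers& buffers, Vec3 diffuseColor, Vec3 specColor) {
    return add(mul(diffuseColor, buffers.diffuse), mul(specColor, buffers.specular));
}
```

Since the exponent is applied per light before accumulation, the sum is correct. Colored specular comes for free because the specular buffer keeps the light color.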
This is completely usable in a production environment as proven by Uncharted having shipped and looking gorgeous. Lights can be handled one at a time (even though they don't) so multiple shadows pose no problems. The size of the framebuffer is smaller. HDR obviously works fine.
It doesn't solve all the problems though. Most are small and without testing it myself I can't say whether they are significant or not. The one nagging problem of being stuck with Phong lighting still remains. This time it's just a different part of Phong that has been exposed and is rigid in the system.
Light Pass Combined Forward Rendering
I am going to propose another alternative that I haven't really seen talked about. The idea is similar to light indexed deferred rendering. The idea there was forward rendering style but with all the lights that hit a pixel rendered in one pass. This can be handled far more simply if the light parameters are merely passed in when drawing the surface and more than one light is applied at a time. This is nothing new. Crysis can apply up to 4 lights at a time. What I haven't seen discussed is what to do when a light only hits part of a surface. Light indexed rendering handles this on a per pixel basis so it is a non issue. If the lights are "indexed" per surface then there can be many more lights that have to affect every pixel than are needed.
We can solve this problem in another way other than screen space. For instance, splitting the world geometry at the bounds of static lights will get you pixel perfect light coverage for any mesh you wish to split. The surfaces with the worst problems are the largest, being hit with the most lights. These are almost always large walls, floors and ceilings. Splitting this type of geometry is not typically very expensive and is rarely instanced. For objects that don't fall in this category they are typically instanced, relatively contained meshes that do not have very smooth transitions with other geometry. I suggest keeping only a fixed number of real affecting lights to render these surfaces by combining any less significant lights into a spherical harmonic. For more details see Tom Forsyth's post on it. In my experience the light count hasn't posed an issue.
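As a rough sketch of the fixed light count idea: sort the lights affecting an object by how much they matter, keep the top few as real lights, and project the rest into a low order spherical harmonic. Everything here is an assumption for illustration. Intensity at the object stands in for significance, lights are monochrome, only the first 4 SH coefficients are used, and evaluating the SH for shading is left out.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// A light as seen from one object: unit direction toward the light and its
// intensity at the object (hypothetical significance metric).
struct Light { float dirX, dirY, dirZ; float intensity; };

// First 4 real SH coefficients, ordered (Y00, Y1-1, Y10, Y11).
using SH4 = std::array<float, 4>;

struct LightSplit {
    std::vector<Light> direct;  // rendered as real per-pixel lights
    SH4 ambient;                // everything else, folded together
};

inline LightSplit splitLights(std::vector<Light> lights, std::size_t maxDirect) {
    std::sort(lights.begin(), lights.end(),
              [](const Light& a, const Light& b) { return a.intensity > b.intensity; });
    LightSplit out{};  // empty list, zeroed SH
    for (std::size_t i = 0; i < lights.size(); ++i) {
        const Light& l = lights[i];
        assert(l.intensity >= 0.0f);
        if (i < maxDirect) {
            out.direct.push_back(l);
        } else {
            // Project a directional light onto the SH basis functions.
            out.ambient[0] += 0.282095f * l.intensity;
            out.ambient[1] += 0.488603f * l.dirY * l.intensity;
            out.ambient[2] += 0.488603f * l.dirZ * l.intensity;
            out.ambient[3] += 0.488603f * l.dirX * l.intensity;
        }
    }
    return out;
}
```

The shader cost then depends on maxDirect rather than on how many lights touch the object, and adding another minor light only changes 4 constants.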
The one remaining issue is shadows. Because all lights for a surface are applied at once shadows can't be done a light at a time. This is the same issue as light indexed rendering and the solution will be the same as well. All shadows have to be calculated and stored, likely as a screen space buffer. The obvious choice is 4 shadowing lights using 4 components of a RGBA8 render target. This is the same solution Crytek is using. That doesn't mean only 4 shadowing lights are allowed on screen at a time. There is nothing stopping you from rendering a surface again after you've completed everything using those 4 lights.
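The RGBA8 layout can be sketched as one visibility value per shadowing light per channel. This is a CPU-side illustration with hypothetical names. On the GPU the render target write and texture read do the quantization for you, and which light lands in which channel is up to the engine.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

// Pack four light visibilities (0 = fully shadowed, 1 = fully lit) into one
// RGBA8 texel, light 0 in the low byte (R) through light 3 in the high byte (A).
inline std::uint32_t packShadowMask(const std::array<float, 4>& visibility) {
    std::uint32_t packed = 0;
    for (int i = 0; i < 4; ++i) {
        float v = std::min(std::max(visibility[i], 0.0f), 1.0f);
        std::uint32_t byte = static_cast<std::uint32_t>(std::lround(v * 255.0f));
        packed |= byte << (8 * i);
    }
    return packed;
}

// Visibility of one shadowing light, as the forward pass would read it.
inline float lightVisibility(std::uint32_t packed, int lightSlot) {
    assert(lightSlot >= 0 && lightSlot < 4);
    return static_cast<float>((packed >> (8 * lightSlot)) & 0xFFu) / 255.0f;
}

inline std::array<float, 4> unpackShadowMask(std::uint32_t packed) {
    return { { lightVisibility(packed, 0), lightVisibility(packed, 1),
               lightVisibility(packed, 2), lightVisibility(packed, 3) } };
}
```

In the single forward pass each of the 4 lights multiplies its contribution by its own channel. A surface hit by more shadowing lights would be drawn again with the buffer refilled for the next 4.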
Given the limit of 4 shadowing lights this turns into a forward rendering architecture that is only one pass. It gets rid of all the redundant work from draws, tris, and material setup. It also gives you all the power of a forward renderer such as changing the light equation to be whatever you want it to. It doesn't rely in any way on screen space buffers for doing the lighting besides the shadow buffer. This means no additional memory and 360 edram headaches.
There are plenty of problems with this. Splitting meshes only works with static lights. In all of the games I've referenced so far this poses no problems. Most environmental lighting does not move (at least the bounds), nor does the scenery to a large extent. Splitting a mesh adds more triangles, vertices, and draw calls than before. In the cases where you split this it is typically not a major issue.
You do not get one of the cool things from deferred rendering and that is independence from the number of lights. In the Starcraft II paper that came out today they had a scene with over 50 lights in it including every bulb on a string of Xmas lights. This is not a major issue for a standard deferred renderer but it is for pass combined forward rendering. It is really cool to be able to do that but in my opinion it is not very important. The impact on the scene from those Xmas lights actually casting light is minimal and there are likely other ways of doing it besides tiny dynamic lights.
Summary
That is my round up of dynamic lighting architectures. I left out any kind of precalculated lighting such as lightmaps, environment maps or Carmack's baked lighting into a unique virtual texture as it's pretty much just a different topic.
Deferred rendering part II. Perhaps I should have just waited and made one monster post but now you'll just have to live with it.
Light Pre-Pass
post
This was recently proposed by Wolfgang Engel. The main idea is to split material rendering into two parts. First part is writing out depth and normal to a small G-buffer. It's possible this can even all fit in one render target. With this information you can get all that is important from the lights which is N dot L and R dot V or N dot H whichever you want. The buffer is as follows:
LightColor.r * N.L * Att
LightColor.g * N.L * Att
LightColor.b * N.L * Att
R.V^n * N.L * Att
With this information standard forward rendering can be done just once. This comprises the second part of the material rendering.
He explains that R.V^n can be derived later by dividing out the N.L * Att but I don't understand any reason to do this. This also means a divide by the color that is just wrong. There's also the mysterious exponent that must be a global or something meaning no surface changeable exponent.
There are really a number of issues here. Specular doesn't have any color at all, not even from the lights. If you instead store R.V in the forth channel and try to apply the power and multiply by LightColor * N.L * Att in the forward pass the multiplications have been shuffled with additions and it doesn't work out. There is no specular color or exponent and it is dependent on everything being the phong lighting equation. It has solved the deep framebuffer problem but it is a lot more restrictive than traditional deferred rendering. All in all it's nice for a demo but not for production.
Naughty Dog's Pre-Lighting
presentation
I have to admit when I sat through this talk I didn't really understand why they were doing what they were doing. It seemed overly complicated to me. After reading the slides afterwards the brilliance started to show through. The slides are pretty confusing so I will at least explain what I think they mean. Insomniac has since adopted this method as well but I can't seem to find that presentation. The idea is very similar to the Light Pre-Pass method. It is likely what you would get if you take Light Pre-Pass to its logical conclusion.
Surface rendering is split into 2 parts. The first pass renders out depth, normal and specular exponent. Then the lights are drawn additively into two HDR buffers, diffuse and specular. The material's specular exponent has been saved out so this can all be done correctly. These two buffers can then be used in the second surface pass as the accumulated lighting, and material attributes such as diffuse color and spec color can be applied. They apply some extra trickery that complicates the slides: light drawing is combined in quads so a single pixel on screen never gets drawn more than once during light drawing.
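As I read it, the per-pixel flow looks roughly like the Python sketch below. This is my interpretation only: the function names, the dict layout for a light and the exact specular term (light color times attenuation times N.H^n) are assumptions, not something taken from the slides.

```python
def accumulate_lights(lights, spec_exponent):
    """Light pass: add every light into separate diffuse and specular sums.

    spec_exponent is the value the first surface pass wrote for this pixel,
    so it can differ per material.
    """
    diffuse = [0.0, 0.0, 0.0]
    specular = [0.0, 0.0, 0.0]
    for light in lights:
        d = light["n_dot_l"] * light["att"]
        s = (light["n_dot_h"] ** spec_exponent) * light["att"]
        for c in range(3):
            diffuse[c] += light["color"][c] * d
            specular[c] += light["color"][c] * s
    return tuple(diffuse), tuple(specular)


def second_surface_pass(diffuse_light, specular_light, diffuse_color, spec_color):
    """Second surface pass: apply material colors to the accumulated lighting."""
    return tuple(
        diffuse_color[c] * diffuse_light[c] + spec_color[c] * specular_light[c]
        for c in range(3)
    )


if __name__ == "__main__":
    lights = [{"color": (1.0, 0.5, 0.25), "n_dot_l": 1.0, "n_dot_h": 1.0, "att": 1.0}]
    diff, spec = accumulate_lights(lights, spec_exponent=16)
    print(second_surface_pass(diff, spec, (0.5, 0.5, 0.5), (1.0, 1.0, 1.0)))
```

Because specular is accumulated in its own RGB buffer, it keeps the light color and can be tinted by a material spec color. The single shared channel in Light Pre-Pass can't do that.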
This is completely usable in a production environment as proven by Uncharted having shipped and looking gorgeous. Lights can be handled one at a time (even though they don't) so multiple shadows pose no problems. The size of the framebuffer is smaller. HDR obviously works fine.
It doesn't solve all the problems though. Most are small and without testing it myself I can't say whether they are significant or not. The one nagging problem of being stuck with Phong lighting still remains. This time it's just a different part of Phong that has been exposed and is rigid in the system.
Light Pass Combined Forward Rendering
I am going to propose another alternative that I haven't really seen talked about. The idea is similar to Light Indexed Deferred. The idea there was forward-style rendering but with all the lights that hit a pixel applied in one pass. This can be handled far more simply if the light parameters are merely passed in when drawing the surface and more than one light is applied at a time. This is nothing new. Crysis can apply up to 4 lights at a time. What I haven't seen discussed is what to do when a light only hits part of a surface. Light indexed rendering handles this on a per pixel basis so it is a non issue. If the lights are "indexed" per surface then there can be many more lights applied to every pixel than are needed.
We can solve this problem in another way other than screen space. For instance, splitting the world geometry at the bounds of static lights will get you pixel perfect light coverage for any mesh you wish to split. The surfaces with the worst problems are the largest, being hit with the most lights. These are almost always large walls, floors and ceilings. Splitting this type of geometry is not typically very expensive and is rarely instanced. For objects that don't fall in this category they are typically instanced, relatively contained meshes that do not have very smooth transitions with other geometry. I suggest keeping only a fixed number of real affecting lights to render these surfaces by combining any less significant lights into a spherical harmonic. For more details see Tom Forsyth's post on it. In my experience the light count hasn't posed an issue.
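Here is a rough Python sketch of the "keep a few real lights, fold the rest into a spherical harmonic" idea. Everything in it is an assumption made for illustration: lights are monochrome dicts with a unit 'direction' towards the light and an 'intensity' as received at the object, significance is ranked by that intensity alone, and only the first two SH bands (4 coefficients) are used. It is not taken from Tom Forsyth's post.

```python
import math

Y0 = 0.282095                # SH basis constant for band 0
Y1 = 0.488603                # SH basis constant for band 1
A0 = math.pi                 # cosine-lobe convolution factor, band 0
A1 = 2.0 * math.pi / 3.0     # cosine-lobe convolution factor, band 1


def sh_basis(d):
    """First four real SH basis functions for a unit direction d = (x, y, z)."""
    x, y, z = d
    return (Y0, Y1 * y, Y1 * z, Y1 * x)


def split_lights(lights, max_real):
    """Keep the max_real strongest lights as real lights, return the rest separately."""
    ranked = sorted(lights, key=lambda l: l["intensity"], reverse=True)
    return ranked[:max_real], ranked[max_real:]


def project_to_sh(lights):
    """Fold the leftover lights into 4 SH coefficients."""
    coeffs = [0.0, 0.0, 0.0, 0.0]
    for light in lights:
        basis = sh_basis(light["direction"])
        for i in range(4):
            coeffs[i] += light["intensity"] * basis[i]
    return coeffs


def eval_sh_diffuse(coeffs, normal):
    """Approximate diffuse lighting from the SH for a surface normal."""
    b = sh_basis(normal)
    return A0 * coeffs[0] * b[0] + A1 * (
        coeffs[1] * b[1] + coeffs[2] * b[2] + coeffs[3] * b[3]
    )


if __name__ == "__main__":
    lights = [
        {"direction": (1.0, 0.0, 0.0), "intensity": 3.0},
        {"direction": (0.0, 0.0, 1.0), "intensity": 1.0},
        {"direction": (0.0, 1.0, 0.0), "intensity": 2.0},
    ]
    real, minor = split_lights(lights, max_real=2)
    sh = project_to_sh(minor)
    print(len(real), eval_sh_diffuse(sh, (0.0, 0.0, 1.0)))
```

With only two bands a light hitting the surface head on evaluates to about 0.75 of its intensity instead of 1.0, so this is a coarse approximation. That is the trade being made for the less significant lights.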
The one remaining issue is shadows. Because all lights for a surface are applied at once, shadows can't be done a light at a time. This is the same issue as light indexed rendering and the solution will be the same as well. All shadows have to be calculated and stored, likely as a screen space buffer. The obvious choice is 4 shadowing lights using the 4 components of an RGBA8 render target. This is the same solution Crytek is using. That doesn't mean only 4 shadowing lights are allowed on screen at a time. There is nothing stopping you from rendering a surface again after you've completed everything using those 4 lights.
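A small Python sketch of the shadow buffer idea, assuming one 8-bit channel per shadowing light and simple batching into groups of 4. The function names and the scalar light contributions are made up for illustration.

```python
def batch_shadowing_lights(light_ids, channels=4):
    """Group shadowing lights so each group fits one RGBA8 shadow buffer."""
    return [light_ids[i:i + channels] for i in range(0, len(light_ids), channels)]


def pack_shadow_mask(shadow_terms):
    """Quantize up to 4 shadow terms (0 = fully shadowed, 1 = lit) to RGBA8.

    Unused channels are filled with 1.0, meaning unshadowed.
    """
    padded = list(shadow_terms) + [1.0] * (4 - len(shadow_terms))
    return tuple(int(max(0.0, min(1.0, s)) * 255 + 0.5) for s in padded)


def shade_batch(mask_rgba8, light_contributions):
    """Apply one batch of lights in a single pass, each scaled by its shadow channel."""
    total = 0.0
    for channel, contribution in zip(mask_rgba8, light_contributions):
        total += contribution * (channel / 255.0)
    return total


if __name__ == "__main__":
    print(batch_shadowing_lights([1, 2, 3, 4, 5, 6]))
    mask = pack_shadow_mask([1.0, 0.0, 0.5])
    print(mask)
    print(shade_batch((255, 0, 255, 255), [1.0, 1.0, 0.5, 0.0]))
```

Six shadowing lights end up as two batches here, which matches the idea of rendering the surface again for the lights past the first 4.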
Given the limit of 4 shadowing lights this turns into a forward rendering architecture that is only one pass. It gets rid of all the redundant work from draws, tris, and material setup. It also gives you all the power of a forward renderer such as changing the light equation to be whatever you want it to be. It doesn't rely in any way on screen space buffers for doing the lighting besides the shadow buffer. This means no additional memory and no 360 EDRAM headaches.
There are plenty of problems with this. Splitting meshes only works with static lights. In all of the games I've referenced so far this poses no problems. Most environmental lighting does not move (at least the bounds), nor does the scenery to a large extent. Splitting a mesh adds more triangles, vertices, and draw calls than before. In the cases where you split this it is typically not a major issue.
You do not get one of the cool things from deferred rendering and that is independence from the number of lights. In the Starcraft II paper that came out today they had a scene with over 50 lights in it including every bulb on a string of Xmas lights. This is not a major issue for a standard deferred renderer but it is for pass combined forward rendering. It is really cool to be able to do that but in my opinion it is not very important. The impact on the scene from those Xmas lights actually casting light is minimal and there are likely other ways of doing it besides tiny dynamic lights.
Summary
That is my round up of dynamic lighting architectures. I left out any kind of precalculated lighting such as lightmaps, environment maps or Carmack's baked lighting into a unique virtual texture as it's pretty much just a different topic.
Sunday, August 10, 2008
Deferred rendering
I remember back to the intro data structures and algorithms class I took in college. The thing the professor kept trying to hammer home was not how a red-black tree works. It was that data structures and algorithms have strengths and weaknesses. This point affects everything we do in graphics. The majority of things we implement have already been done before, either by other game developers, offline graphics years ago, academics, etc. There are very few places where we come up with something brand new. Most of the inventions are even small variations on existing techniques. So, given this fact, the most important skill we can have is the ability to learn all the available options and the strengths and weaknesses of each, and apply the one best for the current job. And at every chance we get, add our own little tweaks and flavor to make it better than what has come before.
Forward Rendering
Forward rendering means for every light to surface interaction the surface is drawn with that lighting information. Every one of these light surface interactions is drawn additively to the screen. There are many problems with this method. Since each surface needs to be drawn again for every light that hits it there are many redundant draws, triangles, and material pixel operations such as texture fetches or adding in detail maps. With all these disadvantages it is still very popular. For instance it is the way Doom 3 and Unreal 3 engines work.
Deferred Rendering
Deferred rendering was invented to solve these problems. Traditional deferred rendering is drawing all needed surface and material attributes to a deep framebuffer called the G-buffer. For each light seen the light geometry can be drawn to the screen with a shader that reads from the G buffer and can add up the light interaction to the color buffer. No direct interaction with the surface and the light is needed. This is only possible because anything that shader would need is put in the G buffer. Any surfaces to be drawn only need to fill in the G buffer with their attributes once meaning no redundant draws, triangles, or material pixel cost.
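The difference in redundant work can be put into a back-of-the-envelope count. This is a toy Python model with made-up numbers, assuming forward rendering draws a surface once per light that hits it and deferred draws each surface once plus one piece of light geometry per visible light.

```python
def forward_draws(lights_per_surface):
    """Forward: every light to surface interaction is its own surface draw."""
    return sum(lights_per_surface)


def deferred_draws(lights_per_surface, visible_lights):
    """Deferred: (surface draws into the G-buffer, light geometry draws)."""
    return len(lights_per_surface), visible_lights


if __name__ == "__main__":
    lights_per_surface = [3, 1, 4, 2]   # hypothetical scene with 4 surfaces
    print(forward_draws(lights_per_surface))
    print(deferred_draws(lights_per_surface, visible_lights=5))
```

In this made-up scene forward rendering needs 10 surface draws, each paying the full triangle and material cost. Deferred needs 4 surface draws plus 5 light draws that only read the G-buffer.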
Deferred rendering is not without its disadvantages either. The G buffer can take up quite a bit of space so trying to pack the attributes to the smallest space possible is the goal. Since the G buffer is always quite fat the attribute drawing pass is almost always ROP bound. In my opinion the worst problem is special materials that do something non standard can't work. Exactly what custom materials can do is defined by what is in the G buffer. Usually only the common attributes are packed for space reasons.
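To get a feel for how fat a G-buffer gets, here is the arithmetic as a small Python sketch. The layout used in the example (four 32-bit color targets plus a 32-bit depth/stencil target at 1280x720) is a hypothetical one picked for illustration, not any particular game's.

```python
def gbuffer_bytes(width, height, bytes_per_pixel_per_target):
    """Total G-buffer size: every target stores its bytes for every pixel."""
    return width * height * sum(bytes_per_pixel_per_target)


if __name__ == "__main__":
    # Hypothetical layout: 4 RGBA8 targets + 1 depth/stencil, 4 bytes each.
    size = gbuffer_bytes(1280, 720, [4, 4, 4, 4, 4])
    print(size, size / 2**20)
```

That comes out to 18,432,000 bytes, a bit over 17.5 MiB, all of which has to be written during the attribute pass.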
Guerrilla's deferred system for Killzone 2 is explained in this great presentation.
In their system the attributes stored in the G buffer are:
- RGBA8 for color
- standard depth/stencil buffer (can be used to derive world position)
- normal
- XY motion vectors
- spec exponent
- spec intensity
- diffuse color
There are a few things immediately obvious that this can't do. There is no floating point color buffer so real HDR is not possible, nor are things like gamma correct lighting that require higher precision color. Because only spec intensity is stored, only grayscale specularity is possible. Since the game looks pretty gray this is likely not a problem for them but it is for other people.
What isn't obvious is the lighting equation has to be the same across all materials. It is likely Phong based. This rules out cool things like hair shaders, anisotropic brushed metal, fake subsurface scattering, fresnel, roughness, fuzz, cloth shaders, etc.
So, how about some alternative methods? A few have been popping up over the past year.
Light Indexed Deferred Rendering
paper
This builds a screen buffer of the indexes of the lights that interact with each pixel. The base implementation ignores depth so the light may not hit the front most surface at that pixel. It can be extended to do so. The advantages are a large number of lights and only one pass of surface drawing. It also solves the problem of custom materials because the lighting happens when the surface is drawn. No attributes of either the light or the surface have to be picked to store out. It also solves the ROP problem because no real deep framebuffer is ever drawn.
There are a number of problems with it though. First off, the light data has to be indexable and DX9 does not support dynamic indexed uniform access. This means all the light data needs to be passed in with textures. This is a major pain in the ass and can be a performance problem depending on how often it is updated and how many textures are required to pass the data. To pass everything multiple floating point textures may be needed. Another problem is that applying a light does not happen one at a time. This means the calculated shadows for a light need to stay around until all lights are applied. So for 4 shadowing lights on screen, you will likely need a 4 channel screen texture to composite them because stencil won't work and the number of shadow maps will likely be high. Now if you have 5 you'll need another screen texture. Other methods apply one light at a time so there is no need to save the shadows from multiple lights at once. Your "G-buffer" is now based on # of light indexes per pixel + # of shadows on screen. You can change this to # of shadows per pixel if you read the light index buffer when writing out the shadows.
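A very simplified Python sketch of the light index idea, assuming up to 4 8-bit light indexes per pixel stored in one RGBA8 value with index 0 reserved for "no light". The packing scheme, the function names and the scalar light table are my own simplifications and not the layout from the paper.

```python
MAX_LIGHTS_PER_PIXEL = 4
NO_LIGHT = 0  # index 0 is reserved and contributes nothing


def pack_light_indices(indices):
    """Store the indexes of the lights touching a pixel in one RGBA8 value."""
    if len(indices) > MAX_LIGHTS_PER_PIXEL:
        raise ValueError("too many lights overlap this pixel for one buffer")
    if any(not (1 <= i <= 255) for i in indices):
        raise ValueError("light indexes must fit in 8 bits and be non-zero")
    padding = [NO_LIGHT] * (MAX_LIGHTS_PER_PIXEL - len(indices))
    return tuple(indices) + tuple(padding)


def shade_pixel(packed_indices, light_table):
    """Surface pass: look up each indexed light and sum its contribution.

    light_table stands in for the texture the light data is passed in with.
    """
    total = 0.0
    for index in packed_indices:
        if index != NO_LIGHT:
            total += light_table[index]
    return total


if __name__ == "__main__":
    light_table = [0.0, 0.1, 0.2, 0.3]   # entry 0 is the unused null light
    packed = pack_light_indices([1, 3])
    print(packed, shade_pixel(packed, light_table))
```

The surface shader still does all of its own lighting math, which is why custom materials keep working. The cost is the table lookup per light and the hard cap on overlapping lights per pixel.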
Next Time
I didn't think this post was going to be so massive so I've decided to split it up and post this part now. I will go through the other deferred rendering alternatives as well as another option I haven't heard people talking about that I particularly like in my next post.
Friday, August 8, 2008
First post
I finally decided to start a blog on graphics stuff. I've always wanted to get more out there in the graphics / game dev community but I hardly ever post on forums. I talk plenty with my fellow colleagues but I never get to communicate outside my tiny sphere so I'm changing that all now. Hopefully I won't be just talking to myself. Classic first post with nothing to say other than exclaiming that I'm saying something. Next post I will be talking about deferred vs forward rendering and some options available beyond the standard implementations but that I will have to leave for tomorrow because it is getting late.