Friday, November 6, 2009


Epic released UDK today. It is the next step in their plan to completely dominate the game engine licensing market place. They've been doing a pretty good job of that so far. To all the other companies trying to compete I'd suggest following in their footsteps. The reason I believe they are so successful is because they do everything they can to get the engine out there and in peoples hands. You can't be scared of people looking at or stealing your stuff. If devs can't look at it they certainly won't plunk down major money to buy it. Epic has been powering their freight train of engine licensing with visibility of their product. With UDK now they have a brand new group of students and indie developers who are going to be familiar with unreal engine as well as make the barrier to evaluate the engine for commercial purposes practically nothing. You get to see all the tools and a major portion of documentation without even having to talk to someone at Epic. You can see engine updates whenever they release a new build. If you want to know more they will send you more than enough stuff to make up your mind in the form of an evaluation version. Compare that to most of the other engine providers. Many will never let you see any code and require all sorts of paper work to see private viewings of just an engine demonstration. As expected they are also not getting very many licensees. Stop being so paranoid and let people see your stuff.

On to the tech stuff. I haven't spent a ton of time looking at it yet but I've noticed some new things already. Of coarse there's the obvious stuff like Lightmass that they've talked all about. There's docs on all that which is great. I'm going to talk about what they aren't talking about. First they changed the way they encode their lightmaps. Previously it was 3 DXT1's that stored the 3 colors of incoming light in the 3 HL2 basis directions. They still use the same concept now but instead only have 2 DXT1's. The first stores the incoming light intensity in each of the 3 directions as RGB. The second is the average color of all 3 directions. The loss in quality may be small as the error is mainly bumps picking up different colors of light. The memory is 2/3's of what it was so it seems like a smart optimization.

What impressed me though was their signed distance field baked shadows. Previously they stored baked shadow masks as G8 or 8 bit luminance format textures. This was used simply as a mask to the light. Now they are computing a signed distance field like this paper. It's still stored in a G8 format texture so there's no difference in storage. The big difference comes from the sharpness and quality from a relatively low res texture. The same smooth lines that are useful for vector graphics for Valve's use also work well for shadows. After I read this paper when it came out I thought the exact same thing. This is perfect for baked shadow textures! I implemented it then but I was trying to have another value in the texture to specify how far the occluder was so I could control the width of the soft edge. The signed distance field could be in the green channel of a DXT1 and the softness could be in red. If that wasn't good enough softness in green and distance field in alpha of a DXT5. The prototype was scraped after it didn't really give me what I needed and was being replaced with a different shadow system anyways. I am really impressed with Epic's results though. For a fairly large map they get good quality sun shadows from only 4 1024x1024 G8 textures. That's only 4 mb of shadow data. I bet it would still work well from DXT1's too with the data in the green channel. That would bring it down to 2 mb. The only down side is it is unable to represent soft shadows unless someone can get my encoding softness as an extra value idea to work which would be cool. Apparently I should have given that experiment more consideration.

Thursday, October 1, 2009

Uncharted 2

We are hiring!
First up, Human Head Studios, where I work, is hiring for basically every department. We are currently staffing up and are looking for talented people. I can't tell you what we are working on but I can say it's awesome and we are doing some very interesting and innovative things on many fronts including technology. The tech department where I am, is developing our own cutting edge tech that I intend to be competitive with all the games I talk about on this blog. I really wish I could be more specific but alas, I cannot. Not yet at least. I'm trying to whet your appetite without getting in trouble if you haven't noticed.

Of most interest to the common visitor of this blog, we are looking for tech programmers. Of most interest to me we are looking for a graphics guy that I will work with on aforementioned secret, awesome, graphics tech. If you understand all of what I talk about here and live and breathe graphics you may be the perfect fit. For submission details see here. Mention my name and you might filter up in the pile. Being a reader of my blog has to count for something right?

Uncharted 2
On to the main event. The Uncharted 2 multiplayer demo came out yesterday and I suggest you check it out. The first game has been a benchmark for graphics on the ps3 and I'm sure when number two comes out it will set the bar again. Although all you can see at the moment are some of the mp maps it is very pretty and much can be discerned from dissecting it.

There are many options in the demo that offer an ability to study things. You can start games with just yourself. There is a machinima mode that I haven't fully figured out yet but it allows you to fool with more than in a normal game. You can also replay a play through using the cinema mode. That allows you to pause, single step forward, fly around and even change post processing and lighting a bit.

Right out of the gate I noticed their nice DOF blur. It's one of the best I've seen. It looks like there is both a small blur and a large blur that are lerped between to simulate a continuous blur radius. It looks better than gaussian but I could be imagining things. It is definitely done in HDR as bright things dominate in the blur.

Speaking of blur they now have object motion blur. In the first game there was motion blur when the camera swung around quickly but it only happened in the distance. It was at low resolution and pretty blurry. I didn't really care for it as any time I moved the camera quickly to look at something it blurred and my eyes lost focus on what I was trying to look at. This type of motion blur is still there but doesn't seem to bother me as much. It's likely less blurry than before but I can't say for sure. It is done in HDR though. In addition to the camera motion blur objects have there own geometry that draws to blur them. This is the same thing Tomb Raider Underworld did for character motion blur. U2 uses it for objects as well as characters like the bus that drives by in one of the videos. The blur trails behind and can look weird in certain situations when paused but while playing the game these oddities aren't noticeable and it looks pretty nice.

Glare or bloom seemed to be map specific. In the snow map the sky was bright enough to bloom everywhere. In other maps light sources that where bright enough to go almost white but were obviously saturated when blurred didn't bloom at all. The control to change the sun intensity could be jacked up to the point of being completely blown out but with no hint of bloom. On that note, they are correctly tone mapping and not clamping off bright colors, something so many games are getting wrong. Please, people. Use a tone curve that doesn't just clamp off bright colors, as in not linear. Bloom all you want but without a nice response curve your brights will look terrible.

In regards to lighting the only dynamic shadow in their maps was the sun. I don't know whether that is the same in single player or whether it's mp only. There were other dynamic light sources but they didn't cast dynamic shadows. Strangely they did have precomputed shadows that looked like it was the light being baked into a lightmap. I'm not sure what was happening here as the machinima playground map showed evidence of the the artifact from storing precomputed lighting at the verts which is what they did in U1. It's possible that some objects had lightmaps and others were vertex lit. Another possibility is the lightmap looking shadows were not in the baked lighting but were more like light masks. It's hard to say from what I saw. There were also other lights that seemed to be completely baked and not influence characters.

I don't remember from U1 but in this demo the ambient character lighting doesn't seem to change much or be influenced by local features. There seemed to be a warm light direction always there in the night time city map that should be a very cool ambient practically everywhere. I'm guessing the ambient lighting is artist set up and isn't based on sampling the environment.

They are still using a lot of vertex color blending of textures. I think this is done in the shader where it lerps between 2 sets of textures based on a vertex value and a heightmap texture. The heightmap corresponds with one or both of the texture sets and the vertex value is like an alpha test. This hides the typical smooth vertex colored blending and matches the material that is fading. Think of brick and mortar to get the idea. Brick is one material, concrete the other. The heightmap is based off of the brick so as it transitions the mortar starts to dominate and cover the brick, masking the gradient of the vertex values until it's all concrete. In U1 it was very much like an alphatest because it was a hard transition. For U2 these transitions can be smooth. The vertex value is likely a separate vertex stream so that different variations can be used with a single model instance. A material set up in this fashion can be used in a variety of places and with a simple bit of vertex coloring the geometry seems to be uniquely textured.

Here's just a list of some other miscellaneous things I noticed:

SSAO darkens ambient term. Nice addition, helps ground the characters and lessens the pressure on detailed baked lighting. It's much more subtly done than Gears which tended to look very constrasty.

Shadowmap sampling pattern changed to be a bit smoother. Not a big change. There is major shadow acne on characters at times which is not cool.

Particle systems with higher fill expense fade out when you get close to them. There were some nice clouds of dust in a few places around the maps that are gone when you get up to them.

Last shadowmap cascade (3rd?) fades out to no shadow in the distance. This is mostly not noticeable but it can result in the inside of buildings when outdoors turning bright in the distance.

They still have low res translucent drawing although I don't think it's frame rate dependent anymore. I saw low and normal res translucent things at the same time. The odd thing was the low res wasn't bilinearly filtered but nearest. I have no idea why they would do that.

The snow particles stretch with camera movement to fake motion blur. Cool trick.

I can't wait to play the full game. I loved the first and this is looking to be even better. As I said before this will be the new bar so you should definitely check it out for your self.

Tuesday, April 28, 2009

RGBM color encoding

There has been some talk about using LogLUV encoding to store HDR colors in 32 bits by packing it all into a RGBA_8888 target. I won't go into the details as a better explanation than I could give is here. This encoding has the benefit over standard floating point buffers of reducing ROP bandwidth and storage space at the expense of some shader instructions for both encoding and decoding. It has been used in shipped games. Heavenly Sword used it to achieve 4xaa with HDR on a PS3. Uncharted used it for the 2xaa output from their material pass. The code for encoding and decoding is as follows (copied from previous link):
// M matrix, for encoding
const static float3x3 M = float3x3(
0.2209, 0.3390, 0.4184,
0.1138, 0.6780, 0.7319,
0.0102, 0.1130, 0.2969);

// Inverse M matrix, for decoding
const static float3x3 InverseM = float3x3(
6.0014, -2.7008, -1.7996,
-1.3320, 3.1029, -5.7721,
0.3008, -1.0882, 5.6268);

float4 LogLuvEncode(in float3 vRGB) {
float4 vResult;
float3 Xp_Y_XYZp = mul(vRGB, M);
Xp_Y_XYZp = max(Xp_Y_XYZp, float3(1e-6, 1e-6, 1e-6));
vResult.xy = Xp_Y_XYZp.xy / Xp_Y_XYZp.z;
float Le = 2 * log2(Xp_Y_XYZp.y) + 127;
vResult.w = frac(Le);
vResult.z = (Le - (floor(vResult.w*255.0f))/255.0f)/255.0f;
return vResult;

float3 LogLuvDecode(in float4 vLogLuv) {
float Le = vLogLuv.z * 255 + vLogLuv.w;
float3 Xp_Y_XYZp;
Xp_Y_XYZp.y = exp2((Le - 127) / 2);
Xp_Y_XYZp.z = Xp_Y_XYZp.y / vLogLuv.y;
Xp_Y_XYZp.x = vLogLuv.x * Xp_Y_XYZp.z;
float3 vRGB = mul(Xp_Y_XYZp, InverseM);
return max(vRGB, 0);
There is a different encoding which I prefer. I don't know whether anyone else is using this but I imagine someone is. The idea is to use RGB and a multiplier in alpha. This is often used as a HDR format for textures. I prefer it over RGBE for HDR textures. Here's some info related even though 2 DXT5's is a bit overkill for most applications. Here's some slides from Lost Planet on storing lightmaps as DXT5's. Both LogLUV and these texture encodings are about storing the luminance information separately with a higher precision. This is a standard color compression thing which becomes even more powerful when dealing with HDR data. What at first doesn't make sense is if RGBM is stored in a RGBA_8888 there is no increase in precision by placing luminance in the alpha over having it stored with RGB. The thing is luminance isn't only in alpha. What is essentially stored in alpha is a range value. The remainder of the luminance is stored with the chrominance in rgb. The code is really simple to do this encoding:
float4 RGBMEncode( float3 color ) {
float4 rgbm;
color *= 1.0 / 6.0;
rgbm.a = saturate( max( max( color.r, color.g ), max( color.b, 1e-6 ) ) );
rgbm.a = ceil( rgbm.a * 255.0 ) / 255.0;
rgbm.rgb = color / rgbm.a;
return rgbm;

float3 RGBMDecode( float4 rgbm ) {
return 6.0 * rgbm.rgb * rgbm.a;
I should also note that it is best to convert the colors from linear to gamma space before encoding. If you plan to use them again in linear a simple additional sqrt and square will work fine for encoding and decoding respectively. The constant 6 gives a range in linear space of 51.5. Sure it's no 1.84e19 of LogLUV but honestly did you really need that? 51.5 should be plenty so long as exposure has already been factored in. This constant can be changed to fit your tastes. Those 3 max's can be replaced with a max4 on the 360 if the compiler is smart enough. I haven't looked to see if it does this. Also the epsilon value to prevent dividing by zero I haven't found necessary in practice. The hardware must output black in the event of denormals which is the same as handling it correctly. I haven't tried it on a large range of hardware so beware if you remove it.

There are some major advantages of RGBM over LogLUV. First off is the cost of encoding and decoding. There is no need for matrix multiplies or logs and exp. Especially of note is how cheap the decoding is. It behaves very well in filtering so you can still use the 4 samples in 1, bilinear trick for downsizing. This isn't technically correct but the difference is negligible.

As far as quality I can't see any banding even in dark stress test cases on a fancy monitor after I've turned all the lights off. It also unsurprisingly handles very bright and saturated colors with the same level of quality. I found no discernible differences in my testing versus LogLUV. I don't have any sort of data on what amount of error it has or whether it covers whatever color space. What I can tell you is that it handles my HDR needs perfectly.

Storing your colors encoded means you cannot do any blending into the buffer. This rules out multi-pass additive lighting and transparency. You will have to use another buffer for transparent things such as particles. This is also a good time to try a downsized buffer since you need a separate one anyways. Now a transparency buffer can store additive, alpha blended and multiply type transparency but only grayscale multiplies since they are going into the alpha channel. Multiply decals can be very useful in adding surface variation while still having tiling textures underneath. These often use color to tint the underlying surface and need to be at full res.

Now for the cool part. Because what is stored in RGB is basically still a color, you can apply multiply blending straight into a buffer stored as RGBM. Multiplying will never increase the range required to store the colors so this is a non destructive operation. In practice I have seen no perceivable precision problems crop up due to this. It is also mathematically correct so there are no worries as to whether it will get weird results.

Wednesday, February 25, 2009

Killzone 2

I got a bogus DMCA notice on this post. Google took it down and now I'm putting it back up. I just finished Killzone 2 and it really is graphically impressive. If you are reading this blog then you are interested in graphics which means you owe it to yourself to play this game. The other levels in the game I think are actually more impressive than the one in the demo. The level in the demo was pretty geometrically simple. Lots of boxy bsp brush looking shapes. The later levels are a lot more complex. In particular the sand level was very pretty.

Level Construction
There didn't seem to be much high poly mesh rendered to a normal map looking stuff. Most everything was made from texture tiles and heightmap generated normalmaps. Most of the textures are fairly desaturated to the point of being likely grayscale with most of the color coming from the lighting and post processing. This is something we did quite a bit in Prey and is something we are trying to change. You may notice the post changing when you walk through some door ways. The most likely candidates are doors from inside to outside.

Their biggest triumph I think is in the fx and atmospherics. There is a ridiculous number of particles. The explosions are some of the best I've seen in a game. There is a lot of dust from bullet impacts, foot falls, wind, explosions. There's smoke coming from explosions, world fires, rocket trails. Each bullet impact also causes a spray of trailed sparks that collide with the world and bounce. Particles are not the only thing contributing. There are also a lot of tricks with sheets and moving textures. For the dust blowing in the wind effect there is a distinct shell above the ground with a scrolling texture plus lots of particles. The common trick with sheets is fading them out when they get edge on and when you get close to them. Add soft z clipping and a flat sheet can look very volumetric. There is also a lot of light shafts done with sheets. One of these situations you can see in the demo. All of this results in a huge amount of overdraw. It has already been pointed out that they are using downsized drawing. This looks to be 1/4 the screen dimensions (1/16 the fill). This is bilinearly upsampled to apply it to the screen opposed to using the msaa trick and drawing straight in. Having the filtering makes it look quite a bit better. It looks like it averages about 10% of the GPU frame time. That would mean they didn't need to sacrifice much to get these kind of effects.


All the shadows are dynamic shadow maps. Sunlight is cascaded shadow maps with each level at 512x512. Omni lights use cube shadow maps. They are drawing the back faces to the shadow map to reduce aliasing. Some of the shadow maps can be pretty low resolution. This isn't as bad as Dead Space because they have really nice filtering. This is likely because the rsx has shadow map bilinear pcf for free. I can't tell exactly what the sample pattern is but it looks to alternate. They have stated there is up to 12 samples per pixel. There is a really large number of lights casting dynamic shadows at a time. Even muzzle flashes cast shadows. Lightning flashes cast shadows. At a distance the shadows fade out but the distance is pretty far. To be expected their triangle counts were evenly split between screen rendering and shadow map rendering at about 200k-400k. They should be able to get away with a lot more than that amount of tris.


I think this is the first game to really milk deferred lighting for what its worth. There are a ton of lights. The good guys have like 3 small lights on each one of them. That doesn't include muzzle flashes. The bad guys are defined by the red glowing eyes. These have a small light attached to them so the glowing eyes actually light the guys and anything they are close to. In the factory level you can see 230 lights on screen at once. I'm curious if all of these are drawn or if a good fraction is faded out. If there aren't any faded that is insane. 200 draw calls just in lights and that doesn't count stencil drawing that can happen before. Their draw counts seem to always be below 1000 so this is not likely the case.

Post processing

A fair amount of their screen post processing is done on SPU's. As far as I know this is a first. The DOF has a variable blur size. This is most easily visible when going back and forth to the menu. There is motion blur on everything but the blur distance is clamped to very small.

Environment maps are used on many surfaces. They are mostly crisp to show sharp reflections. I didn't see any situation where they were locational. They are instead tied to the material.

Another neat effect was the water from the little streams. This wasn't actually clipping with the ground or another piece of geometry at all. It is merely in the ground material and it masked to where it should be. The plane moves up and down by changing what range of a heightmap to mask to.

Their profiler says they are spending up to 30% of an SPU on scene portals. I assumed this meant area / portal visibility. In the demo this made sense. After playing it all it no longer makes sense. There are many areas in the game that are just not portalable. I'm not sure what that could mean anymore. They could use it as a component of visibility and the other component is not on the SPUs. In that case I am curious what they used for visibility.

The texture memory amount stayed constant. This must mean that they are not doing any texture streaming.

They have the player character cast shadows but you can not see his model. I found this to be kind of strange especially when you can see the shadows at the feet z fighting with the ground but no feet that would have conveniently hid the problem. It's expensive to get the camera in the head thing to work really well so I understand why they didn't wish to do it but personally I would have gone with both or nothing concerning the players shadow. BTW, why is the player like a foot and a half shorter than everyone else?

For more killzone info:
Deferred lighting
Profiling numbers

It isn't quite to the level of the original prerendered footage but honestly who expected it to be? It is a damn good effort from the folks at Guerrilla. I look forward to their presentation at GDC next week. This is the first year since I've been doing this professionally that I am not going to GDC. I'll have to try and get what I can from the powerpoints and audio recordings. You are all posting your slides right? Wink, wink.

Saturday, January 10, 2009

Virtual Geometry Images

Geometry images are one of those ideas so simple you ask yourself "Why didn't I think of this?" I'll admit it isn't the topic of much discussion concerning the "more geometry" problem for the next generation. They work great for compression but they don't inherently solve any of the other problems. Multi-chart geometry images have a complicated zipping procedure that is also invalid if a part is rendered at a different resolution.

A year ago when I was researching a solution to "more geometry" on DirectX 9 level hardware I came across this paper that was in line with the direction I was thinking. The idea is an extension to virtual textures by having another layer with the textures that is a geometry image. For every texture page that is brought in there is a geometry image page with it. By decomposing the scene into a seemless texture atlas you are also doing a Reyes like split operation. The splitting is a preprocess and the dice is real time. The paper also explains an elegant seem solution.

My plan on how to get this running really fast was to use instancing. With virtual textures every page is the same size. This simplifies many things. The way detail is controlled is similar to a quad tree. The same size pages just cover less of the surface and there are more of them. If we mirror this with geometry images every time we wish to use this patch of geometry it will be a fixed size grid of quads. This works perfectly with instancing if the actual position data is fetched from a texture like geometry images imply. The geometry you are instancing then is grid of quads with the vertex data being only texture coordinates from 0 to 1. The per instance data is passed in with a stream and the appropriate frequency divider. This passes data such as patch world space position, patch texture position and scale, edge tessellation amount, etc.

If patch tessellation is tied to the texture resolution this provides the benefit that no page table needs to be maintained for the textures. This does mean that there may be a high amount of tessellation in a flat area merely because texture resolution was required. Textures and geometry can be at a different resolution but still be tied such as the texture is 2x the size as the geometry image. This doesn't affect the system really.

If the performance is there to have the two at the same resolution a new trick becomes available. Vertex density will match pixel density so all pixel work can be pushed to the vertex shader. This gets around the quad problem with tiny triangles. If you aren't familiar with this, all pixel processing on modern GPU's gets grouped into 2x2 quads. Unused pixels in the quad get processed anyways and thrown out. This means if you have many pixel size triangles your pixel performance will approach 1/4 the speed. If the processing is done in the vertex shader instead this problem goes away. At this point the pipeline is looking similar to Reyes.

If this is not a possibility for performance reasons, and it's likely not, the geometry patches and the texture can be untied. This allows the geometry to tessellate in detailed areas and not in flat areas. The texture page table will need to come back though which is unfortunate.

Geometry images were first designed for compression so disk space should be a pretty easy problem. One issue though is edge pixels. Between each page the edge pixels need to be exact otherwise there will be cracks. This can be handled by losslessly compressing just the edge and using normal lossy image compression for the interiors. As the patches mip down they will be using shared data from disk so this shouldn't be an issue. It should be stored uncompressed in memory thought or the crack problem will return.

Unfortunately vertex texture fetch performance, at least on current console hardware, is very slow. There is a high amount of latency. Triangles are not processed in parallel either. With DirectX 11 tessellators it sounds like they will be processed in parallel. I do not know whether vertex texture fetch will be up to the speed of a pixel texture fetch. I would sure hope so. I need to read specs for both the API and this new hardware before I can postulate on how exactly this scheme can be done with tessellators instead of instanced patches but I think it will work nicely. I also have to give the disclaimer that I have not implemented this. The performance and details of the implementation are not yet known because I haven't done it.

To compare this scheme with the others it has some advantages. Given that it is still triangle rasterization dynamic objects are not a problem. To make this work with animated meshes it will probably need bone indexes and weights stored in a texture along with the position. This can be contained to an animation only geometry pool. It doesn't have the advantage subd meshes have that you can animate just the control points. This advantage may not work that well anyways because you need a fine grained cage to get good animation control which increases patch number, draw count, and tessellation of the lowest detail LOD (the cage itself).

It's ability to LOD is better than subd meshes but not as good as voxels. The reason for this is the charts a model has to be split up into are usually quite a bit bigger than the patches of a subd mesh. This really depends on how intricate the mesh is though. It scales the same subd meshes do but just with a different multiplier. Things like terrain will work very well. Things like foliage work terribly.

Tools side, anything can be converted into this format. Writing the tool unfortunately looks very complicated. This primarily lies with the texture parametrization required to build the seemless texture atlas. After UV's are calculated the rest should be pretty straight forward.

I do like this format better than subd meshes with displacement maps but it's still not ideal. Tiny triangles start to lose the benefits of rasterization. There will be overdraw and triangles missing the center of pixels. More important I think is that it doesn't handle all geometry well, so it doesn't give the advantage of telling your artists they can make any model and place it however they want and it will have no impact on the performance of the game. Once they start making trees or fences you might as well go back to how they used to work because this scheme will run even slower than the old way. The same can be said for subd meshes btw.

To sum it up I think it's a pretty cool system but it's not perfect and doesn't solve all the problems.

Thursday, January 8, 2009

More Geometry

There has been a lot of talk lately about the next generation of hardware and how we are going to render things. The primary topic seems to be "How are we going to render a lot more geometry than we can now?" There are two approaches that are getting a lot of attention. The first is subdivision surfaces with displacement maps and the second is voxel ray casting.

Here are some others that are getting a bit of attention.

Ray casting against triangles
Intel's ray tracer
Cuda ray tracer

Point splatting
Far Voxels

Progressive meshes
Progressive Buffers
View dependent progressive mesh on the GPU

Progressive Buffers is one of my favorite papers. It's one that I keep coming back to time and time again.


Who knows exactly what is going on in Otoy. It almost seems like they are being deliberately confusing. It's for a game engine, it's for movie CG, it's lightstage (which I thought was a non-commercial product), it's a server side renderer, it's a web 3d viewer / streaming video.

What I have gathered it generates an unstructured point cloud on the gpu and creates a point hierarchy. It uses this for ray tracing not just eye rays but shadows and reflections. The reflections are massively cached. It's not clear how. I can't figure out how this is working with full animations like they have in their videos. Either that would require regenerating the points, which makes ray casting into it kind of pointless, or it has to deal with holes. Whatever they are doing the results are very impressive.

Subdivision surfaces with displacement maps
This has a lot of powerful people behind it. Both nVidia and AMD are behind it along with DirectX 11 API support through the hull shader, tessellator, and domain shader. I'm not a big fan of this. First off, it's only really useful in data amplification not data reduction. For example our studio and many others are now using the subd cage for the in game models for our characters. That means the the lowest tessellation level the subd surface can get to, the subd cage, is the same poly count as our current near LOD character meshes. It makes subdivision surfaces not useful at all in LODing moderate distances. This can be reduced some but likely not by enough to solve the problem. It looks really complicated to implement and rife with problems. It requires artists input to create models that work well with it. The data imported from the modelling package can be specific to that package. It's hard to beat the generality of a triangle soup.

The plus side is there is no issue with multiple moving meshes or deforming. They are fully animatable. To allow good animation control the tessellation of the cage may need to be higher. In theory every piece of geometry in your current engine could be replaced with subd models with displacement maps and the rest of your pipeline would work exactly the same.

Check out some work in this direction from Michael Bunnell of Fantasy Lab:
GPU Gems 2 chapter
Fantasy Lab

Voxel ray casting
This has been made popular by John Carmack who has described this as his planned approach for idTech6. Considering he's always at the head of the pack this should give it pretty strong backing. John refers to his implementation as a sparse voxel octree (SVO). The idea is to extend his current virtual texturing mip blocks to 3D with a "mip" hierarchy of voxels that will be stored as an octree. The way this is even remotely reasonable is that you only need to store the important data, no data in empty space. This is very different from most scientific and medical applications that require the whole data block. This structure is great for compression. Geometry compression now turns into image compression which is a well studied problem and effective. LODing works from the whole screen bing a single voxel to subpixel detail. To render it every screen pixel casts a ray into the octree until it hits a voxel the size of a pixel or smaller. This means that both rendering is reduced by LODing and memory is reduced. If you don't need the voxel data you don't need it in memory.

I like this approach because it gives one elegant way of handling textures, geometry, streaming, compression and LODing all in one system. There are no demands on the input data. Anything can be converted into voxels. Due to a streaming structure very similar to virtual texturing it allows unique geometry and unique texturing. This means there are no restrictions on the artists. There is little way an artist can impact performance or memory with assets they create. That puts the art and visuals in the artists hands and the technical decisions in the engineers hands.

There are some problems. Ray tracing has always been slow. Ray casting is a search where rasterization is binning. Binning is almost always faster than searching. In a highly parallel environment the synchronization required in binning may tip the scales in searching's favor. As triangles shrink the number of multiple triangles in one bin or bin misses also hurts the speed advantage. It has now been demonstrated to be fast enough to render on current hardware at 60fps. This should be enough proof to let this concern slide a bit. It does mean it will only be able to be rendered once. It's unlikely there will be power left to render a shadow map or ray trace to the light for that matter. My guess is John's intent is to have the lighting fully baked into the texture like how idTech5 works currently. This also cuts down on the required memory as only one color is required per voxel.

Memory is another possible problem. Jon Olick's demo of one character using a SVO required ~1gb of video memory which was not enough to completely hide paging. His plans to decrease this size was entropy encoding which means each child's data is based on its parents data. As far as I'm aware this is only going to work if you use a kd-tree restart traversal which is slower than the other alternatives. Otherwise he would need to evaluate the voxel data for the whole stack once he wishes to draw the pixel.

The most important problem is it doesn't work with dynamic meshes. The scheme I believe John Carmack is planning on using is a static world with baked lighting with all dynamic objects and characters using traditional triangle meshes. I expect this to work out well. The performance of this type of situation has been shown to be there so it's not too risky to pursue this direction. There is something about it that bugs me. You release the constraints of the environmental artists but leave the other artists with the same problems. If you handle texturing inherently with voxels does that mean he needs to keep around his virtual texturing for everything else? Treating the world and the dynamic objects in it separately has been required in the past with lightmaps and vertex lighting. To bring back this Hanna Barbara effect in a even more complicated way leaves a bad taste in my mouth. I'm really looking for a uniform solution for geometry and textures.

For more detailed information see Jon Olick's Siggraph presentation. He is a former employee of id. Also check out the interview that started this all off.

The brick based voxel implementations seem like a better solution to me than having the tree uniform. This means the leaves are a different type than the nodes. They consist of a fixed size brick of voxels. Being in a brick format has many advantages. It allows free filtering and hardware volume texture compression through DXT formats.

Check out these for brick approaches:
GPU ray casting of voxels