Tuesday, April 28, 2009

RGBM color encoding

There has been some talk about using LogLUV encoding to store HDR colors in 32 bits by packing it all into a RGBA_8888 target. I won't go into the details as a better explanation than I could give is here. This encoding has the benefit over standard floating point buffers of reducing ROP bandwidth and storage space at the expense of some shader instructions for both encoding and decoding. It has been used in shipped games. Heavenly Sword used it to achieve 4xaa with HDR on a PS3. Uncharted used it for the 2xaa output from their material pass. The code for encoding and decoding is as follows (copied from previous link):
// M matrix, for encoding
const static float3x3 M = float3x3(
0.2209, 0.3390, 0.4184,
0.1138, 0.6780, 0.7319,
0.0102, 0.1130, 0.2969);

// Inverse M matrix, for decoding
const static float3x3 InverseM = float3x3(
6.0014, -2.7008, -1.7996,
-1.3320, 3.1029, -5.7721,
0.3008, -1.0882, 5.6268);

float4 LogLuvEncode(in float3 vRGB) {
float4 vResult;
float3 Xp_Y_XYZp = mul(vRGB, M);
Xp_Y_XYZp = max(Xp_Y_XYZp, float3(1e-6, 1e-6, 1e-6));
vResult.xy = Xp_Y_XYZp.xy / Xp_Y_XYZp.z;
float Le = 2 * log2(Xp_Y_XYZp.y) + 127;
vResult.w = frac(Le);
vResult.z = (Le - (floor(vResult.w*255.0f))/255.0f)/255.0f;
return vResult;

float3 LogLuvDecode(in float4 vLogLuv) {
float Le = vLogLuv.z * 255 + vLogLuv.w;
float3 Xp_Y_XYZp;
Xp_Y_XYZp.y = exp2((Le - 127) / 2);
Xp_Y_XYZp.z = Xp_Y_XYZp.y / vLogLuv.y;
Xp_Y_XYZp.x = vLogLuv.x * Xp_Y_XYZp.z;
float3 vRGB = mul(Xp_Y_XYZp, InverseM);
return max(vRGB, 0);
There is a different encoding which I prefer. I don't know whether anyone else is using this but I imagine someone is. The idea is to use RGB and a multiplier in alpha. This is often used as a HDR format for textures. I prefer it over RGBE for HDR textures. Here's some info related even though 2 DXT5's is a bit overkill for most applications. Here's some slides from Lost Planet on storing lightmaps as DXT5's. Both LogLUV and these texture encodings are about storing the luminance information separately with a higher precision. This is a standard color compression thing which becomes even more powerful when dealing with HDR data. What at first doesn't make sense is if RGBM is stored in a RGBA_8888 there is no increase in precision by placing luminance in the alpha over having it stored with RGB. The thing is luminance isn't only in alpha. What is essentially stored in alpha is a range value. The remainder of the luminance is stored with the chrominance in rgb. The code is really simple to do this encoding:
float4 RGBMEncode( float3 color ) {
float4 rgbm;
color *= 1.0 / 6.0;
rgbm.a = saturate( max( max( color.r, color.g ), max( color.b, 1e-6 ) ) );
rgbm.a = ceil( rgbm.a * 255.0 ) / 255.0;
rgbm.rgb = color / rgbm.a;
return rgbm;

float3 RGBMDecode( float4 rgbm ) {
return 6.0 * rgbm.rgb * rgbm.a;
I should also note that it is best to convert the colors from linear to gamma space before encoding. If you plan to use them again in linear a simple additional sqrt and square will work fine for encoding and decoding respectively. The constant 6 gives a range in linear space of 51.5. Sure it's no 1.84e19 of LogLUV but honestly did you really need that? 51.5 should be plenty so long as exposure has already been factored in. This constant can be changed to fit your tastes. Those 3 max's can be replaced with a max4 on the 360 if the compiler is smart enough. I haven't looked to see if it does this. Also the epsilon value to prevent dividing by zero I haven't found necessary in practice. The hardware must output black in the event of denormals which is the same as handling it correctly. I haven't tried it on a large range of hardware so beware if you remove it.

There are some major advantages of RGBM over LogLUV. First off is the cost of encoding and decoding. There is no need for matrix multiplies or logs and exp. Especially of note is how cheap the decoding is. It behaves very well in filtering so you can still use the 4 samples in 1, bilinear trick for downsizing. This isn't technically correct but the difference is negligible.

As far as quality I can't see any banding even in dark stress test cases on a fancy monitor after I've turned all the lights off. It also unsurprisingly handles very bright and saturated colors with the same level of quality. I found no discernible differences in my testing versus LogLUV. I don't have any sort of data on what amount of error it has or whether it covers whatever color space. What I can tell you is that it handles my HDR needs perfectly.

Storing your colors encoded means you cannot do any blending into the buffer. This rules out multi-pass additive lighting and transparency. You will have to use another buffer for transparent things such as particles. This is also a good time to try a downsized buffer since you need a separate one anyways. Now a transparency buffer can store additive, alpha blended and multiply type transparency but only grayscale multiplies since they are going into the alpha channel. Multiply decals can be very useful in adding surface variation while still having tiling textures underneath. These often use color to tint the underlying surface and need to be at full res.

Now for the cool part. Because what is stored in RGB is basically still a color, you can apply multiply blending straight into a buffer stored as RGBM. Multiplying will never increase the range required to store the colors so this is a non destructive operation. In practice I have seen no perceivable precision problems crop up due to this. It is also mathematically correct so there are no worries as to whether it will get weird results.