[13 Dec 2012]

# Improvements for shadow mapping in OpenGL and GLSL

This tutorial shows how to improve the results of the shadow mapping method: it lists the disadvantages of basic shadow mapping and possible solutions to those problems. The following improvements are considered: Variance Shadow Mapping (VSM), Exponential Shadow Maps (ESM), Percentage Closer Filtering (PCF), Stratified Poisson Sampling, Rotated Poisson Sampling and others. You can find more info about shadow mapping and shadow mapping for point light sources in the previous tutorials.
Shadow mapping is easy to implement, but it requires a lot of time and tweaking to achieve good quality, and its parameters should be tuned per scene. Shadow mapping suffers from the following problems:
• Z-fighting - artifacts on lit surfaces that arise from errors when depths from the shadow map are compared with the depth of the current fragment. In most cases this problem is resolved with an additional offset applied to all depths in the shadow map.
• Aliasing - visible blocky aliasing on the borders between light and shadow when the shadow map is too small. It appears because multiple fragments visible through the camera are mapped to the same texel of the shadow map. The problem can be mitigated by choosing an optimal size for the shadow map and an optimal light-space frustum. Another aliasing problem appears in places where light rays are nearly parallel to the surface.
• Perspective aliasing - stronger aliasing near the far end of the perspective projection frustum. It appears because perspective projection has a higher visible-area-to-texel ratio near the far clipping plane.
• Lost detail - details are lost when the shadow map is large but shadow-casting objects are small in screen space (only a few fragments).
• Banding - discrete intensity steps on the borders between light and shadow instead of a smooth transition. This is an issue of Percentage Closer Filtering and related methods.
• Peter Panning - the additional depth offset in the shadow map (used to fix z-fighting) is too large and becomes visible. Such shifted shadows detach from their casters and create an effect of "flying objects".

The following sections show how to solve these problems and how to improve the quality of shadow mapping.
## Surfaces that aren't oriented to the light are in shadow
One of the simplest optimizations for shadow mapping is automatic shadowing of fragments whose normals point away from the light. You can perform this check with the dot product of the normal (N) and the direction to the light source (L). If the value is less than zero, the fragment is in shadow. This optimization creates better shadows on surfaces parallel to the light rays and improves performance. Normals should be valid and well defined for this optimization to work correctly. It can decrease the quality of shadows during percentage closer filtering, and it partially removes Peter Panning artifacts. The following code snippet shows how to implement the orientation check:
```glsl
// check the orientation against 0, or against a small threshold value;
// you can also create a smooth transition from light to shadow
if(dot(N, L) < 0)
{
   shadowFactor = 0;
}
else
{
   // ...
}
```
The problem is to determine the optimal offset for each depth in the shadow map. If the offset is too small, z-fighting will still be present. If the offset is too large, Peter Panning becomes noticeable. The offset should depend on the precision of the shadow map and on the slope of the surface relative to the direction to the light source.
The following code snippet shows how to add a constant offset to the distance from the fragment to the light source:
```glsl
float distToLight = distance(fragmentWorldPosition, lightPosition) + epsilon;
```
This offset doesn't take the slope of the surface into account. Let's consider the following image. The figure on the left shows distances in the shadow map without any additional offset; this shadow map will produce z-fighting. The figure on the right shows the initial depths and the depths after the additional offset. As you can see, the offset for each depth is different. If the surface is perpendicular to the light rays, the offset should be minimal (as for the leftmost texel). If the surface is parallel to the light rays, use the maximum offset (as for the rightmost texel). You can determine the amount of additional offset from the normal at the fragment and the direction to the light source. Check the following code:
```glsl
// dot product returns the cosine between N and L in [-1, 1] range;
// map the value to [0, 1], invert it and use it as an offset modifier
float offsetMod = 1.0 - clamp(dot(N, L), 0, 1);
float offset = minOffset + maxSlopeOffset * offsetMod;

// another method to calculate the offset;
// gives very large offsets for surfaces parallel to the light rays
float offsetMod2 = tan(acos(dot(N, L)));
float offset2 = minOffset + clamp(offsetMod2, 0, maxSlopeOffset);
```
OpenGL can automatically compute and add an offset to the values that are stored in the depth buffer. You can set up the offset with the glPolygonOffset function. Two parameters are available: a multiplier for an offset that depends on the slope of the surface, and the number of smallest possible offset units to add (the unit depends on the format of the shadow map):
```cpp
glEnable(GL_POLYGON_OFFSET_FILL);
glPolygonOffset(1.0f, 1.0f);
```
OpenGL has an option that allows the fragment shader to sample the shadow map with linear filtering. Linear filtering normally returns a value interpolated from the four texels closest to the sampling location instead of the value of the closest texel. But shadow maps have special built-in logic: the linear filter doesn't interpolate between four depths. Instead, it automatically compares the current distance from the light to the fragment with the four values in the shadow map. Each passed check adds 0.25 to the result, and a failed check doesn't modify it. So sampling can return five possible values: 0.0, 0.25, 0.5, 0.75, 1.0.
If none of the checks pass, 0 is returned; if all checks pass, 1 is returned. So this value can be interpreted as a shadowing factor. This is not linear filtering of depths but linear filtering of depth comparisons. To enable automatic comparisons you have to use a sampler2DShadow sampler type in the fragment shader. The following code snippet shows how to initialize a texture for automatic depth comparisons with linear filtering:
```cpp
// bind shadow map
glBindTexture(GL_TEXTURE_2D, shadowMap);
// set linear filter
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
// set automatic comparison mode
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_MODE, GL_COMPARE_REF_TO_TEXTURE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_FUNC, GL_LEQUAL);
```
A shadowing factor with five possible values makes the transition from light to shadow smoother and decreases visible aliasing. Also, it may be much quicker to take a single sample with linear filtering than to take four separate samples and compare each of them manually with the current distance from the fragment to the light source.
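To make the five-step behaviour concrete, here is a small CPU-side sketch in Python (purely illustrative, not part of the shader code). It follows this tutorial's model, in which each of the four nearest texels contributes 0.25 per passed GL_LEQUAL comparison; real hardware additionally weights the four comparison results by the bilinear sample position.

```python
def pcf_2x2(texels, reference_depth):
    """Mimic GL_LINEAR + GL_COMPARE_REF_TO_TEXTURE with GL_LEQUAL:
    each of the four nearest texels contributes 0.25 when the
    reference depth passes the comparison."""
    return sum(0.25 for depth in texels if reference_depth <= depth)

# fragment at depth 0.5 against a 2x2 neighbourhood of stored depths
print(pcf_2x2([0.6, 0.6, 0.4, 0.4], 0.5))  # 0.5: two of four checks pass
print(pcf_2x2([0.6, 0.6, 0.6, 0.6], 0.5))  # 1.0: fully lit
```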
## Percentage closer filtering
Percentage closer filtering (PCF) is the simplest method to create soft shadows with shadow mapping. Instead of taking one sample from the shadow map, you take multiple samples around the location of the fragment. The shadowing factor is calculated as the ratio of passed depth comparisons to the total number of comparisons. Moreover, PCF can use linear filtering to get the results of 4 comparisons at once. So if PCF takes 4 samples, then in combination with linear filtering it performs 16 depth checks, and gives the same number of intensity steps in the transition between light and shadow. The samples should be placed so that all checks are done with unique texels of the shadow map; that is, linear filtering shouldn't sample the same texels twice or more.
```glsl
// number of samples
const int numSamplingPositions = 4;

// offsets for rectangular PCF sampling
vec2 kernel[4] = vec2[]
(
   vec2(1.0, 1.0), vec2(-1.0, 1.0), vec2(-1.0, -1.0), vec2(1.0, -1.0)
);

// performs sampling and accumulates shadowing factor
void sample(in vec3 coords, in vec2 offset,
            inout float factor, inout float numSamplesUsed)
{
   factor += texture(
         u_textureShadowMap,
         vec3(coords.xy + offset, coords.z)
      );
   numSamplesUsed += 1;
}

void main()
{
   // ...
   // sample each PCF texel around current fragment
   float shadowFactor = 0.0;
   float numSamplesUsed = 0.0;
   float PCFRadius = 1;
   for(int i = 0; i < numSamplingPositions; i++)
   {
      sample(projectedCoords,
            kernel[i] * shadowMapStep * PCFRadius,
            shadowFactor,
            numSamplesUsed
         );
   }
   // normalize shadowing factor
   shadowFactor = shadowFactor/numSamplesUsed;
}
```
## Poisson sampling kernel
A Poisson disk of offsets allows PCF to sample with more irregular offsets than the standard PCF grid. Offsets in a Poisson disk are uniformly distributed over the unit circle, and the distance between any two samples is never less than a defined minimum. This reduces banding a bit and allows fewer samples. To generate a Poisson disk of offsets you can use the Poisson Disk Generator app; you won't get a valid Poisson disk with simple random generation. Compared with a regular PCF grid, sampling with a Poisson kernel gives considerably better quality with a smaller number of samples. It reduces banding and preserves soft transitions between light and shadow. The following images show results of Poisson sampling and an example of a Poisson sampling kernel.
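Both properties of a Poisson disk, offsets inside the unit circle and a guaranteed minimum distance between any two of them, are easy to verify. The Python sketch below (illustrative; the offsets are the kernel values used later in this tutorial, and the 0.3 threshold is an assumed minimum distance for this particular kernel):

```python
import itertools
import math

# Poisson-disk offsets (the 9-sample kernel used later in this tutorial)
kernel = [
    (0.95581, -0.18159), (0.50147, -0.35807), (0.69607, 0.35559),
    (-0.0036825, -0.59150), (0.15930, 0.089750), (-0.65031, 0.058189),
    (0.11915, 0.78449), (-0.34296, 0.51575), (-0.60380, -0.41527),
]

def min_pairwise_distance(points):
    """Smallest distance between any two offsets; for a valid Poisson
    disk it stays above the generator's chosen minimum distance."""
    return min(math.dist(a, b) for a, b in itertools.combinations(points, 2))

def max_radius(points):
    """All offsets must lie inside the unit circle."""
    return max(math.hypot(x, y) for x, y in points)

print(min_pairwise_distance(kernel) > 0.3)  # True: no two samples too close
print(max_radius(kernel) <= 1.0)            # True: all inside the unit circle
```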
## Rotated Poisson kernel
If you use a constant grid of sampling positions for percentage closer filtering, you will have to deal with insufficient softness of the transition, banding and performance costs. Good soft shadows without banding require a lot of samples, and a huge number of samples is always the main performance bottleneck. Banding can be visible even with a large number of samples because the sample offsets are the same for all fragments, and they create patterns of intensity steps along the boundaries of shadows.
You could randomly generate a sampling kernel for each fragment, but as noted previously, random non-uniform samples may hit the same texels of the shadow map multiple times. You would have to generate a Poisson disk per fragment, and all that math would decrease performance of the fragment shader even more than a large number of samples. Instead, you can define one Poisson kernel in the fragment shader and randomly rotate it in 2D for each fragment.
This introduces another problem: generating a random angle in the shader. Shaders usually don't have built-in generation of random numbers, so you have to implement this functionality manually. To generate the same random number for a given fragment every frame, the pseudorandom generator should depend on the position of the fragment in the scene; otherwise the generator would produce different numbers each frame and the final image would flicker. You can write the pseudorandom number generator as some high frequency function in the shader, or provide a texture with pregenerated random numbers and sample a different location for each fragment (using the position of the fragment in the scene as texture coordinates). The following code snippet shows an example of a pseudorandom number generator:
```glsl
// generates a pseudorandom number in [0, 1]
// seed - world space position of a fragment
// freq - modifier for the seed; the bigger it is, the faster
// the pseudorandom numbers change with the world space position
float random(in vec3 seed, in float freq)
{
   // project the seed on a random constant vector
   float dt = dot(floor(seed * freq), vec3(53.1215, 21.1352, 9.1322));
   // return only the fractional part
   return fract(sin(dt) * 2105.2354);
}
```
As the Poisson disk has to be rotated by random angles, you can store cos(theta) and sin(theta) in a texture and use these values directly in the 2D rotation. The following code snippet shows an implementation of rotated Poisson kernel sampling with the pseudorandom number generator. The image to the right shows results of Poisson sampling with an increased kernel radius but the same number of samples. As you can see, random rotation of the sampling kernel almost totally removes banding and allows a larger kernel radius, but with a low number of samples this method creates visible noise. The noise becomes less visible when shadows are combined with diffuse maps, lighting and other effects.
```glsl
// default Poisson disk (offsets)
const int numSamplingPositions = 9;
uniform vec2 kernel[9] = vec2[]
(
   vec2(0.95581, -0.18159), vec2(0.50147, -0.35807), vec2(0.69607, 0.35559),
   vec2(-0.0036825, -0.59150), vec2(0.15930, 0.089750), vec2(-0.65031, 0.058189),
   vec2(0.11915, 0.78449), vec2(-0.34296, 0.51575), vec2(-0.60380, -0.41527)
);

// returns a random angle in [0, 2*pi)
float randomAngle(in vec3 seed, in float freq)
{
   return random(seed, freq) * 6.283185;
}

void main()
{
   // ...
   // generate a random rotation angle for each fragment
   float angle = randomAngle(o_worldPos, 15);
   float s = sin(angle);
   float c = cos(angle);
   float PCFRadius = 20;
   for(int i = 0; i < numSamplingPositions; i++)
   {
      // rotate the offset
      vec2 rotatedOffset = vec2(kernel[i].x * c + kernel[i].y * s,
         kernel[i].x * -s + kernel[i].y * c);
      sample(projectedCoords, rotatedOffset * shadowMapStep * PCFRadius,
         shadowFactor, numSamplesUsed);
   }
   // normalize the shadowing factor
   shadowFactor = shadowFactor/numSamplesUsed;
}
```
## Stratified Poisson kernel
Instead of randomly rotating the Poisson disk in the shader, you can define more offsets in the Poisson kernel and, for each fragment, use a randomly selected subset of them. This leads to results very similar to the rotated Poisson kernel modification: it decreases banding artifacts and increases the possible radius of the transition. But this modification requires more samples than rotated Poisson sampling in order to keep noise down, as it may sample the same locations multiple times.
```glsl
// Poisson disk for Stratified Poisson sampling
const int numSamplingPositions = 9;
uniform vec2 kernel[9] = vec2[]
(
   vec2(0.95581, -0.18159), vec2(0.50147, -0.35807), vec2(0.69607, 0.35559),
   vec2(-0.003682, -0.5915), vec2(0.1593, 0.08975), vec2(-0.6503, 0.05818),
   vec2(0.11915, 0.78449), vec2(-0.34296, 0.51575), vec2(-0.6038, -0.41527)
);

// returns a pseudorandom number in [0, maxInt) range
int randomInt(in vec3 seed, in float freq, in int maxInt)
{
   return int(float(maxInt) * random(seed, freq)) % maxInt;
}

void main()
{
   // ...
   // four random samples
   float radius = 5;
   for(int i = 1; i <= 4; i++)
   {
      // random index that depends on the position of the fragment
      int randomIndex = randomInt(o_worldPos * i, 100, numSamplingPositions);
      // get the offset through the random index
      sample(projectedCoords, kernel[randomIndex] * shadowMapStep * radius,
            shadowFactor, numSamplesUsed);
   }
   shadowFactor = shadowFactor/numSamplesUsed;
}
```
## Early bailing
There is an optimization that increases speed and reduces the number of required samples from the shadow map. It's applicable to scenes where large parts are fully lit or fully shadowed, and which don't contain high frequency changes between light and shadow (unlike, for example, shadows from trees with a lot of branches). The first step of this modification takes a small number of samples at distant locations in the shadow map. If all checks pass, assume the fragment is lit and stop sampling. Similarly, if all checks fail, assume the fragment is in shadow and stop sampling.
Otherwise you have to take an arbitrary number of additional samples from the shadow map to calculate a smooth transition between light and shadow. If the distance between the samples in the early checks is too large, small lit areas may become shadowed and small areas of shadow may become lit. You can see these strange artifacts in the image.
```glsl
// Poisson disk with offsets for early bailing
const int numSamplingPositions = 13;
uniform vec2 kernel[13] = vec2[]
(
   vec2(-0.9328896, -0.03145855), // left check offset
   vec2(0.8162807, -0.05964844), // right check offset
   vec2(-0.184551, 0.9722522), // top check offset
   vec2(0.04031969, -0.8589798), // bottom check offset
   vec2(-0.54316, 0.21186), vec2(-0.039245, -0.34345), vec2(0.076953, 0.40667),
   vec2(-0.66378, -0.54068), vec2(-0.54130, 0.66730), vec2(0.69301, 0.46990),
   vec2(0.37228, 0.038106), vec2(0.28597, 0.80228), vec2(0.44801, -0.43844)
);

void main()
{
   // ...
   float PCFRadius = 5;
   // take the samples required for early bailing
   for(int i = 0; i < 4; i++)
   {
      sample(projectedCoords, kernel[i] * shadowMapStep * PCFRadius,
         shadowFactor, numSamplesUsed);
   }
   // early bailing check: neither fully lit nor fully shadowed
   if(shadowFactor > 0.1 && shadowFactor < 3.9)
   {
      // additional sampling to get a smooth transition
      float angle = randomAngle(o_worldPos, 15);
      float s = sin(angle);
      float c = cos(angle);
      // don't take the early bailing offsets again
      for(int i = 4; i < numSamplingPositions; i++)
      {
         vec2 rotatedOffset = vec2(kernel[i].x * c + kernel[i].y * s,
            kernel[i].x * -s + kernel[i].y * c);
         sample(projectedCoords, rotatedOffset * shadowMapStep * PCFRadius,
            shadowFactor, numSamplesUsed);
      }
   }
   // normalize the shadowing factor
   shadowFactor = shadowFactor/numSamplesUsed;
}
```
## Variance shadow mapping

The main problem of the PCF method and its modifications is that they require multiple samples from the shadow map, and comparisons with the current fragment's depth, in order to smooth the transitions between light and shadow. It would be good to blur the shadow map, use automatic linear filtering, take only one sample and get a smooth transition from light to shadow. But standard shadow mapping can't handle this, as sampling from the shadow map followed by a comparison isn't a linear operation. The shadows would be wrong and there would be no smooth transition, as only one sample is taken and the comparison has only two possible results (fully lit or fully shadowed).
The Variance Shadow Mapping (VSM) method calculates soft shadows and requires only one sample from the shadow map. It modifies the logic of shadow mapping: instead of a simple depth comparison, VSM uses the average depth and the average square of depths around the texel. These values are then used in Chebyshev's inequality, whose input parameters are the difference between the depth of the current fragment and the average depth, and the standard deviation of depths around the texel. Chebyshev's inequality creates soft shadows with a probabilistic approach. The following steps are required to implement Variance Shadow Mapping.
The first rendering pass is similar to the first pass of standard shadow mapping. Calculate the normalized distance from the fragment to the light source in [0, 1] range. Additionally, you have to save the square of that depth, so you need a render target with two channels. The following code snippet shows a shader that saves the two required depth values:
```glsl
float worldSpaceDistance = distance(u_lightPosition, o_worldSpacePosition.xyz);
float dist = (worldSpaceDistance - u_nearFarPlane.x) /
      (u_nearFarPlane.y - u_nearFarPlane.x) + u_depthOffset;
resultingColor.r = clamp(dist, 0, 1);
resultingColor.g = resultingColor.r * resultingColor.r;
```
Now you have to blur the shadow map that contains the linear depths and squares of the linear depths. You can achieve the blur with a two pass Gauss filter with a 5x5 sampling kernel (one horizontal pass and one vertical). After blurring you will get the average depth and the average square of depth around each texel. You can also enable linear filtering and anisotropic filtering to increase the blur and the quality of the soft shadows.
The render target with local average depths is passed to the last rendering pass of VSM. The vertex shader of the last pass is similar to the vertex shader of the last pass of simple shadow mapping. The fragment shader of the last pass of VSM similarly calculates texture coordinates to access the shadow map.
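The separable blur above can be sketched on the CPU. The tutorial doesn't specify exact filter weights, so the following Python sketch assumes normalized binomial weights for the 5-tap kernel and clamp-to-edge behaviour at the borders; it is an illustration of the two-pass idea, not shader code.

```python
# 1D 5-tap Gaussian weights (normalized binomial approximation - an assumption,
# the tutorial only says "5x5 Gauss filter" without giving exact weights)
W = [1/16, 4/16, 6/16, 4/16, 1/16]

def blur_1d(row):
    """One separable pass; edges are clamped like GL_CLAMP_TO_EDGE."""
    n = len(row)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(W):
            j = min(max(i + k - 2, 0), n - 1)  # clamp the sample index
            acc += w * row[j]
        out.append(acc)
    return out

def blur_2d(img):
    """Horizontal pass followed by a vertical pass = full 5x5 Gaussian blur."""
    rows = [blur_1d(r) for r in img]
    cols = [blur_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

# blurring both the depth channel and the squared-depth channel this way
# yields the local averages M1 and M2 that VSM needs
```

Two 1D passes cost 5 + 5 taps per texel instead of 25 for a direct 5x5 filter, which is why the blur is done separably.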
Then you have to calculate the standard deviation and apply Chebyshev's inequality. Take a sample from the shadow map; let the average depth be M1 and the average square of depths be M2. Then the square of the standard deviation σ is equal to:
$\sigma^{2} = M_{2} - M_{1}^{2}$
From Chebyshev's inequality the shadowing factor can be calculated as:
$I = \begin{cases} 1 & d < M_{1} \\ \dfrac{\sigma^{2}}{\sigma^{2} + (d - M_{1})^{2}} & d \geq M_{1} \end{cases}$
The intensity of lighting is at its maximum when the difference between the current depth and the average depth is minimal. The standard deviation also weights the significance of that difference: where the standard deviation is large, the intensity of lighting is increased even for large differences.
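The inequality's behaviour is easy to check numerically. The following Python sketch of the per-fragment computation is illustrative (the function and parameter names are hypothetical; min_variance plays the role of u_minVariance in the shader):

```python
def vsm_light_intensity(d, M1, M2, min_variance=1e-4):
    """Chebyshev upper bound used by VSM: probability that the fragment
    at normalized depth d is lit, given the blurred moments M1 and M2."""
    if d < M1:
        return 1.0
    # variance, clamped from below to avoid division by ~zero
    sigma2 = max(M2 - M1 * M1, min_variance)
    return sigma2 / (sigma2 + (d - M1) ** 2)

# fully lit when the fragment is closer than the average occluder depth
print(vsm_light_intensity(0.30, 0.40, 0.17))             # 1.0
# deep in shadow: small variance, fragment far behind the average depth
print(round(vsm_light_intensity(0.80, 0.40, 0.161), 3))  # 0.006
```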
But Variance Shadow Mapping has its own disadvantages, among them light bleeding: some surfaces become lit when they should be in shadow. This artifact may appear in places with high frequency changes between light and shadow. Light bleeding can be decreased by limiting the final intensity calculated through Chebyshev's inequality. You can also automatically shadow fragments whose normals point away from the light rays.
The following fragment shader shows how to implement the last pass of Variance Shadow Mapping:
```glsl
#version 330

in vec4 o_shadowCoord;
in vec3 o_normal;
in vec3 o_worldPos;

layout(location = 0) uniform sampler2D u_textureShadowMap;
uniform vec3 u_lightPosition;
uniform float u_minVariance;
uniform float u_lightBleedingLimit;
uniform vec2 u_nearFarPlane;

layout(location = 0) out vec4 resultingColor;

////////////////////////////////////////////////////////
float reduceLightBleeding(float p_max, float amount)
{
   return clamp((p_max - amount) / (1.0 - amount), 0.0, 1.0);
}

void main(void)
{
   /////////////////////////////////////////////////////////
   // current distance to light
   vec3 fromLightToFragment = u_lightPosition - o_worldPos.xyz;
   float worldSpaceDistance = length(fromLightToFragment);
   float currentDistanceToLight = clamp((worldSpaceDistance - u_nearFarPlane.x)
      / (u_nearFarPlane.y - u_nearFarPlane.x), 0, 1);

   /////////////////////////////////////////////////////////
   // get blurred distance and blurred squared distance to light
   vec3 projectedCoords = o_shadowCoord.xyz / o_shadowCoord.w;
   vec2 depths = texture(u_textureShadowMap, projectedCoords.xy).xy;
   float M1 = depths.x;
   float M2 = depths.y;
   float M12 = M1 * M1;

   float lightIntensity = 1.0;
   if(currentDistanceToLight >= M1)
   {
      // variance (squared standard deviation)
      float sigma2 = M2 - M12;
      // clamp the variance when it is smaller than epsilon
      if(sigma2 < u_minVariance)
      {
         sigma2 = u_minVariance;
      }
      // Chebyshev's inequality - upper bound on the
      // probability that the fragment is lit
      float intensity = sigma2 / (sigma2 + pow(currentDistanceToLight - M1, 2));
      // reduce light bleeding
      lightIntensity = reduceLightBleeding(intensity, u_lightBleedingLimit);
   }

   /////////////////////////////////////////////////////////
   resultingColor.rgb = vec3(lightIntensity);
   resultingColor.a = 1.0;
}
```
Variance Shadow Mapping uses two channels of the shadow map and as a result takes twice as much memory, and light bleeding might be too severe. Instead of Variance Shadow Mapping you can use Exponential Shadow Maps (ESM). The Exponential Shadow Maps method uses only one channel of the shadow map and decreases the light bleeding effect.
## Exponential shadow maps

The Exponential Shadow Maps method is based on the fact that the difference between the distance from the current fragment to the light source (d) and the depth stored in the shadow map (z) has to be greater than or equal to zero. When the fragment is lit, the difference (in the ideal case!) is 0, and when the fragment is in shadow it is greater than zero. It may become negative due to floating point errors or due to manual manipulations of the input values (like blur).
$d - z \geq 0$
The shadowing factor can be approximated with an exponential function of this difference. The value of such a function decreases very quickly toward zero as the difference d − z grows, and when the depths are equal, exp(z − d) = exp(0) = 1:
$e^{z} e^{-d} = e^{z-d}$
To adjust the softness of the transition between light and shadow you can use a constant c. The greater the value, the sharper the shadows. For 32-bit shadow maps you can use c = 80 and lower values, and for 16-bit maps, c = 30 and lower. With too low values the shadows become too bright near the beginning of the transition.
$e^{cz} e^{ -cd} = e^{-c(d-z)}$
As you can see from the following image, the function returns 1.0 for fragments where the difference between the depths is zero. When the difference d − z is negative, the function returns a value greater than 1.0; in that case use 1.0 as the shadowing factor. If the difference d − z is positive, the function returns a shadowing factor between 1.0 and 0.0 and creates a smooth transition.
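The behaviour described above can be sketched numerically. The following Python snippet is illustrative (the function name is hypothetical); it uses the identity exp(c·z)·exp(−c·d) = exp(−c·(d − z)), which is exactly what the shader computes with exp(c·z) pre-stored in the shadow map.

```python
import math

def esm_shadow_factor(z, d, c=80.0):
    """ESM shadowing factor for stored depth z and fragment depth d.
    exp(c*z) * exp(-c*d) == exp(-c*(d - z)); values above 1 (d < z,
    a negative difference) are clamped to fully lit."""
    return min(math.exp(-c * (d - z)), 1.0)

print(esm_shadow_factor(0.5, 0.5))             # 1.0 (equal depths: fully lit)
print(esm_shadow_factor(0.5, 0.4))             # 1.0 (negative d - z, clamped)
print(round(esm_shadow_factor(0.5, 0.53), 4))  # 0.0907 (soft transition)
```

Note how quickly the factor falls: with c = 80 a depth difference of only 0.03 already darkens the fragment to about 9% intensity, which is why lower c values give softer but brighter shadows.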
Now let's look at how to implement Exponential Shadow Maps. In the first pass, store exp(c * linearDepth) instead of the linear depth. Note that depth values should be in [0, 1] range:
```glsl
resDepth = exp(c * depth);
```
As with VSM, you can blur the ESM shadow map with a Gauss filter. The stronger the blur, the softer the shadows, but a large blur leads to artifacts.
In the last rendering pass of ESM, calculate the shadowing factor as exp(c*z) * exp(-c*d). The following code snippet shows an implementation of the Exponential Shadow Maps shader:
```glsl
#version 330

in vec4 o_shadowCoord;
in vec3 o_normal;
in vec3 o_worldPos;

layout(location = 0) uniform sampler2D u_textureShadowMap;
uniform vec3 u_lightPosition;
uniform float u_depthMultiplier;
uniform vec2 u_nearFarPlane;
uniform float u_epsilon;

layout(location = 0) out vec4 resultingColor;

/////////////////////////////////////////////////
void main(void)
{
   /////////////////////////////////////////////////
   // current distance to light
   vec3 fromLightToFragment = u_lightPosition - o_worldPos.xyz;
   float worldSpaceDistance = length(fromLightToFragment);
   float currentDistanceToLight = clamp((worldSpaceDistance - u_nearFarPlane.x)
         / (u_nearFarPlane.y - u_nearFarPlane.x), 0, 1);

   /////////////////////////////////////////////////
   // get blurred exp of depth
   vec3 projectedCoords = o_shadowCoord.xyz / o_shadowCoord.w;
   float depthCExpBlured = texture(u_textureShadowMap, projectedCoords.xy).r;

   // current exp of depth
   float depthCExpActual = exp(-(u_depthMultiplier * currentDistanceToLight));
   float expFactor = depthCExpBlured * depthCExpActual;

   // threshold classification for high frequency artifacts
   if(expFactor > 1.0 + u_epsilon)
   {
      expFactor = 1.0;
   }

   /////////////////////////////////////////////////
   resultingColor.rgb = vec3(expFactor);
   resultingColor.a = 1.0;
}
```
ESM may improperly handle scenes with high frequency details. In that case, instead of one sample you can take multiple samples and average them, similarly to PCF. To determine which fragments require additional sampling, you can check whether the calculated shadowing factor is greater than 1.0 + epsilon.
Exponential Shadow Maps has its own disadvantages. First, ESM is a hack: it gives nice shadows and good performance, but it has no physical basis. Because of this, light bleeding is present, although it is much lower than in VSM. Also, shadows can be too bright near objects and darker at farther distances, while it should be the opposite. Nevertheless, ESM with correct parameters is a fine and quick method to get real-time shadows.

Sun and Black Cat - Igor Dykhta © 2007-2014