Real-Time Rendering: 7.9 Irregular Z-Buffer Shadows

Shadow-map approaches of various sorts are popular for several reasons. Their costs are predictable and scale well to increasing scene sizes, at worst linear with the number of primitives. They map nicely onto the GPU, as they rely on rasterization to regularly sample the light’s view of the world. However, due to this discrete sampling, problems arise because the locations the eye sees do not map one-to-one with those the light sees. Various aliasing problems arise when the light samples a surface less frequently than the eye. Even when sampling rates are comparable, there are biasing problems because the surface is sampled in locations slightly different than those the eye sees.

Shadow volumes provide an exact, analytical solution, as the light’s interactions with surfaces result in sets of triangles defining whether any given location is lit or in shadow. The unpredictable cost of the algorithm when implemented on the GPU is a serious drawback. The improvements explored in recent years [1648] are tantalizing, but have not yet had an “existence proof” of being adopted in commercial applications.

Another analytical shadow-testing method may have potential in the longer term: ray tracing. Described in detail in Section 11.2.2, the basic idea is simple enough, especially for shadowing. A ray is shot from the receiver location to the light. If any object is found that blocks the ray, the receiver is in shadow. Much of a fast ray tracer’s code is dedicated to generating and using hierarchical data structures to minimize the number of object tests needed per ray. Building and updating these structures each frame for a dynamic scene is a decades-old topic and a continuing area of research.
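
The shadow-ray idea can be sketched in a few lines. The example below is a minimal illustration, not a fast ray tracer: it tests the receiver-to-light segment against every triangle by brute force using the Möller-Trumbore intersection test, with no hierarchical data structure. The names `segment_hits_triangle` and `in_shadow` are hypothetical and chosen only for this sketch.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[Vec3, Vec3, Vec3]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def segment_hits_triangle(origin: Vec3, target: Vec3, tri: Triangle,
                          eps: float = 1e-9) -> bool:
    """Moller-Trumbore test restricted to the open segment origin -> target."""
    v0, v1, v2 = tri
    direction = sub(target, origin)  # unnormalized: t in (0, 1) spans the segment
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return False  # segment is parallel to the triangle's plane
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return False
    q = cross(s, e1)
    v = dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return False
    t = dot(e2, q) * inv
    return eps < t < 1.0 - eps  # hit must lie strictly between receiver and light


def in_shadow(receiver: Vec3, light: Vec3, occluders: Sequence[Triangle]) -> bool:
    """Brute force: the receiver is shadowed if any triangle blocks the segment."""
    return any(segment_hits_triangle(receiver, light, tri) for tri in occluders)
```

The linear loop in `in_shadow` is exactly the per-ray cost that hierarchical structures are meant to reduce.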

Another approach is to use the GPU’s rasterization hardware to view the scene, but instead of just z-depths, additional information is stored about the edges of the occluders in each grid cell of the light [1003, 1607]. For example, imagine storing at each shadow-map texel a list of triangles that overlap the grid cell. Such a list can be generated by conservative rasterization, in which a triangle generates a fragment if any part of the triangle overlaps a pixel, not just the pixel’s center (Section 23.1.2). One problem with such schemes is that the amount of data per texel normally needs to be limited, which in turn can lead to inaccuracies in determining the status of every receiver location. Given modern linked-list principles for GPUs [1943], it is certainly possible to store more data per pixel. However, aside from physical memory limits, a problem with storing a variable amount of data in a list per texel is that GPU processing can become extremely inefficient, as a single warp can have a few fragment threads that need to retrieve and process many items, while the rest of the threads are idle, having no work to do. Structuring a shader to avoid thread divergence due to dynamic “if” statements and loops is critical for performance.
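
To make the per-texel triangle list concrete, here is a small CPU-side sketch. It assumes triangles are already projected into the light's view and normalized to the unit square, and it uses a 2D separating-axis test as the conservative overlap criterion. The function names and the brute-force loop over all texels are assumptions made for illustration; a GPU would produce these fragments with its rasterizer instead.

```python
from typing import Dict, List, Sequence, Tuple

Point2 = Tuple[float, float]
Tri2 = Tuple[Point2, Point2, Point2]


def triangle_overlaps_cell(tri: Tri2, x0: float, y0: float,
                           x1: float, y1: float) -> bool:
    """Conservative test: True if any part of the triangle touches the cell."""
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    # Separating axes from the cell: the x and y axes (a bounding-box check).
    if max(xs) < x0 or min(xs) > x1 or max(ys) < y0 or min(ys) > y1:
        return False
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    # Separating axes from the triangle: one normal per edge.
    for i in range(3):
        ax, ay = tri[i]
        bx, by = tri[(i + 1) % 3]
        nx, ny = by - ay, ax - bx
        tri_proj = [nx * px + ny * py for px, py in tri]
        box_proj = [nx * cx + ny * cy for cx, cy in corners]
        if max(tri_proj) < min(box_proj) or max(box_proj) < min(tri_proj):
            return False
    return True


def build_triangle_lists(triangles: Sequence[Tri2],
                         res: int) -> Dict[Tuple[int, int], List[int]]:
    """Per texel of a res x res grid over [0, 1]^2, list overlapping triangles."""
    lists: Dict[Tuple[int, int], List[int]] = {
        (tx, ty): [] for ty in range(res) for tx in range(res)
    }
    cell = 1.0 / res
    for index, tri in enumerate(triangles):
        for (tx, ty), entries in lists.items():
            if triangle_overlaps_cell(tri, tx * cell, ty * cell,
                                      (tx + 1) * cell, (ty + 1) * cell):
                entries.append(index)
    return lists
```

The resulting lists generally differ in length from texel to texel, which is the source of the idle-thread problem described above.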

An alternative to storing triangles or other data in the shadow map and testing receiver locations against them is to flip the problem, storing receiver locations and then testing triangles against each. This concept of saving the receiver locations, first explored by Johnson et al. [839] and Aila and Laine [14], is called the irregular z-buffer (IZB). The name is slightly misleading, in that the buffer itself has a normal, regular shape for a shadow map. Rather, the buffer’s contents are irregular, as each shadow-map texel will have one or more receiver locations stored in it, or possibly none at all. See Figure 7.30.

Figure 7.30. Irregular z-buffer. In the upper left, the view from the eye generates a set of dots at the pixel centers. Two triangles forming a cube face are shown. In the upper right, these dots are shown from the light’s view. In the lower left, a shadow-map grid is imposed. For each texel a list of all dots inside its grid cell is generated. In the lower right, shadow testing is performed for the red triangle by conservatively rasterizing it. At each texel touched, shown in light red, all dots in its list are tested against the triangle for visibility by the light. (Underlying raster images courtesy of Timo Aila and Samuli Laine [14].)

Using the method presented by Sintorn et al. [1645] and Wyman et al. [1930, 1932], a multi-pass algorithm creates the IZB and tests its contents for visibility from the light. First, the scene is rendered from the eye, to find the z-depths of the surfaces seen from the eye. These points are transformed to the light’s view of the scene, and tight bounds are formed from this set for the light’s frustum. The points are then deposited in the light’s IZB, each placed into a list at its corresponding texel. Note that some lists may be empty, a volume of space that the light views but that has no surfaces seen by the eye. Occluders are conservatively rasterized to the light’s IZB to determine whether any points are hidden, and so in shadow. Conservative rasterization ensures that, even if a triangle does not cover the center of a light texel, it will be tested against points it may overlap nonetheless.
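
The binning step of this algorithm can be sketched as follows. The example assumes the eye-visible points have already been transformed into the light's view as `(x, y, depth)` tuples; the function name `build_izb` and the dictionary-of-lists layout are assumptions for illustration, not the data structure used in the cited work.

```python
from typing import Dict, List, Sequence, Tuple

LightPoint = Tuple[float, float, float]  # (x, y, depth) in the light's view


def build_izb(points: Sequence[LightPoint],
              res: int) -> Dict[Tuple[int, int], List[int]]:
    """Deposit each point's index into the list of the texel that contains it.

    The res x res grid is fitted tightly to the bounds of the points, so no
    resolution is spent on regions the eye cannot see. Texels that receive no
    points simply have no entry in the returned dictionary.
    """
    if not points:
        return {}
    min_x = min(p[0] for p in points)
    max_x = max(p[0] for p in points)
    min_y = min(p[1] for p in points)
    max_y = max(p[1] for p in points)
    span_x = max(max_x - min_x, 1e-12)  # avoid dividing by zero
    span_y = max(max_y - min_y, 1e-12)

    izb: Dict[Tuple[int, int], List[int]] = {}
    for index, (x, y, _depth) in enumerate(points):
        # Points on the maximum bound are clamped into the last texel.
        tx = min(int((x - min_x) / span_x * res), res - 1)
        ty = min(int((y - min_y) / span_y * res), res - 1)
        izb.setdefault((tx, ty), []).append(index)
    return izb
```

Storing indices rather than the points themselves keeps the lists compact and lets the shadow result be written back to the original eye-view pixel.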

Visibility testing occurs in the pixel shader. The test itself can be visualized as a form of ray tracing. A ray is generated from an image point’s location to the light. If a point is inside the triangle and more distant than the triangle’s plane, it is hidden. Once all occluders are rasterized, the light-visibility results are used to shade the surface. This testing is also called frustum tracing, as the triangle can be thought of as defining a view frustum that checks points for inclusion in its volume.
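
A minimal sketch of the per-point test follows. It assumes an orthographic light projection (a directional light), so that a triangle's depth varies linearly across the light's image plane and barycentric interpolation gives the depth of the triangle's plane at the point; a perspective light would need a ray-plane intersection instead. The name `point_hidden` and the `eps` tolerance are assumptions for this example.

```python
from typing import Tuple

LightPoint = Tuple[float, float, float]  # (x, y, depth) in the light's view
LightTri = Tuple[LightPoint, LightPoint, LightPoint]


def point_hidden(point: LightPoint, tri: LightTri, eps: float = 1e-6) -> bool:
    """True if the point lies inside the triangle (as seen from the light)
    and is farther from the light than the triangle's plane."""
    px, py, pz = point
    (ax, ay, az), (bx, by, bz), (cx, cy, cz) = tri

    # Twice the signed area of the triangle in the light's image plane.
    area = (bx - ax) * (cy - ay) - (cx - ax) * (by - ay)
    if abs(area) < 1e-12:
        return False  # edge-on or degenerate triangle covers no area

    # Barycentric weights of the point with respect to vertices b and c.
    w_b = ((px - ax) * (cy - ay) - (cx - ax) * (py - ay)) / area
    w_c = ((bx - ax) * (py - ay) - (px - ax) * (by - ay)) / area
    w_a = 1.0 - w_b - w_c
    if w_a < 0.0 or w_b < 0.0 or w_c < 0.0:
        return False  # outside the triangle's frustum

    # Depth of the triangle's plane at the point's (x, y) location.
    tri_depth = w_a * az + w_b * bz + w_c * cz
    return pz > tri_depth + eps
```

In the full algorithm this test would run for every point in the list of each texel that a conservatively rasterized occluder touches.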

Careful coding is critical in making this approach work well with the GPU. Wyman et al. [1930, 1932] note that their final version was two orders of magnitude faster than the initial prototypes. Part of this performance increase was straightforward algorithm improvements, such as culling image points where the surface normal was facing away from the light (and so always unlit) and avoiding having fragments generated for empty texels. Other performance gains were from improving data structures for the GPU, and from minimizing thread divergence by working to have short, similar-length lists of points in each texel. Figure 7.30 shows a low-resolution shadow map with long lists for illustrative purposes. The ideal is one image point per list. A higher resolution gives shorter lists, but also increases the number of fragments generated by occluders for evaluation.
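
Two of these ideas are easy to illustrate on the CPU. The sketch below shows a back-facing cull and a way to measure how the longest texel list shrinks as the shadow-map resolution grows. It assumes point positions are already normalized to the unit square in the light's view; the helper names are hypothetical and not taken from the cited papers.

```python
from collections import Counter
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def facing_light(normal: Vec3, to_light: Vec3) -> bool:
    """Points whose normal faces away from the light are always unlit,
    so they can be skipped before they are inserted into the IZB."""
    return sum(n * l for n, l in zip(normal, to_light)) > 0.0


def longest_list(points_uv: Sequence[Tuple[float, float]], res: int) -> int:
    """Length of the longest per-texel list for a res x res grid over [0, 1]^2."""
    counts = Counter(
        (min(int(u * res), res - 1), min(int(v * res), res - 1))
        for u, v in points_uv
    )
    return max(counts.values(), default=0)
```

For example, the four points `(0.1, 0.1)`, `(0.2, 0.2)`, `(0.3, 0.3)`, and `(0.6, 0.6)` give a longest list of 4 at resolution 1, 3 at resolution 2, and 2 at resolution 4. The trade-off is that each occluder then covers more texels and so generates more fragments.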

As can be seen in the lower left image in Figure 7.30, the density of visible points on the ground plane is considerably higher on the left side than the right, due to the perspective effect. Using cascaded shadow maps helps lower list sizes in these areas by focusing more light-map resolution closer to the eye.

This approach avoids the sampling and bias issues of other approaches and provides perfectly sharp shadows. For aesthetic and perceptual reasons, soft shadows are often desired, but can have bias problems with nearby occluders, such as Peter Panning. Story and Wyman [1711, 1712] explore hybrid shadow techniques. The core idea is to use the occluder distance to blend IZB and PCSS shadows, using the hard shadow result when the occluder is close and soft when more distant. See Figure 7.31. Shadow quality is often most important for nearby objects, so IZB costs can be reduced by using this technique on only a selected subset. This solution has successfully been used in video games. This chapter started with such an image, shown in Figure 7.2 on page 224.
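
The blending idea can be sketched as a simple interpolation. The text does not give the blend function, so the linear ramp and the `near` and `far` thresholds below are assumptions made purely for illustration; visibility values are taken to be 0 for fully shadowed and 1 for fully lit.

```python
def blend_shadow(hard_visibility: float, soft_visibility: float,
                 occluder_distance: float,
                 near: float = 0.1, far: float = 1.0) -> float:
    """Blend a hard (IZB) and a soft (PCSS) visibility value by occluder distance.

    Below `near` the hard result is used unchanged, above `far` the soft
    result is used unchanged, and in between the two are mixed linearly.
    """
    t = (occluder_distance - near) / (far - near)
    t = max(0.0, min(1.0, t))  # clamp the blend weight to [0, 1]
    return (1.0 - t) * hard_visibility + t * soft_visibility
```

With a close occluder the weight clamps to zero and the sharp IZB result is returned, which sidesteps the soft-shadow bias problems near contact points.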

Figure 7.31. On the left, PCF gives uniformly softened shadows for all objects. In the middle, PCSS softens the shadow with distance to the occluder, but the tree branch shadow overlapping the left corner of the crate creates artifacts. On the right, sharp shadows from IZB blended with soft from PCSS give an improved result [1711]. (Images from “Tom Clancy’s The Division,” courtesy of Ubisoft.)

Original source: https://blog.csdn.net/m0_37609239/article/details/125496483