Forward mapping maps the volume data directly onto the image plane. For each voxel, the renderer projects the point onto the image plane and then adds its contribution to the accumulating image. Backward mapping, commonly called ray casting, instead maps the image plane into the data. For each pixel in the final image, the renderer casts a ray from the pixel into the volume data and samples the data along that ray until either the ray exits the volume or the accumulated opacity is high enough for the pixel to be treated as opaque. In backward mapping, input values rarely fall exactly along a ray. Consequently, an approximation of the volume data is generated using an interpolation scheme. More sophisticated methods may relieve some of this replication but will not eliminate it.
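The termination rule for backward mapping can be illustrated with a minimal sketch of front-to-back compositing along a single ray. The function name `composite_ray`, the `(intensity, opacity)` sample format, and the `opacity_limit` threshold are assumptions made for illustration; the abstract does not specify a compositing formula.

```python
def composite_ray(samples, opacity_limit=0.95):
    """Front-to-back compositing of (intensity, opacity) samples along one ray.

    Hypothetical helper: returns (intensity, opacity, samples_used).
    """
    color = 0.0
    alpha = 0.0
    used = 0
    for intensity, opacity in samples:
        # Weight each sample by the transparency remaining in front of it.
        remaining = 1.0 - alpha
        color += remaining * opacity * intensity
        alpha += remaining * opacity
        used += 1
        # Stop early once the ray is effectively opaque.
        if alpha >= opacity_limit:
            break
    # Falling out of the loop corresponds to the ray exiting the volume.
    return color, alpha, used
```

Samples behind an opaque one contribute nothing, so stopping early saves work without changing the pixel beyond the chosen threshold.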
Texture mapping was introduced in [Catmull 74] as a method of adding visual richness to a computer-generated image without adding geometry. The process involves a mapping from the space of the object to be textured to texture space.
Bilinear/trilinear interpolation is usually used to sample data in 2D/3D textures. To accelerate texture mapping, traditional graphics hardware stores a complete copy of all texture images at each computation node so that all nodes can operate in parallel.
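As a minimal sketch of trilinear sampling in a 3D texture, the following assumes the volume is a nested list indexed as `volume[z][y][x]` and that coordinates outside the grid are clamped to its edge. The function names and the clamping policy are illustrative assumptions, not details taken from the hardware described here.

```python
def _cell(coord, size):
    """Clamp coord to the grid; return (lower index, upper index, fraction)."""
    coord = min(max(coord, 0.0), size - 1.0)
    lo = min(int(coord), max(size - 2, 0))
    hi = min(lo + 1, size - 1)
    return lo, hi, coord - lo


def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return a + (b - a) * t


def trilinear(volume, x, y, z):
    """Sample volume[z][y][x] at a fractional position."""
    x0, x1, fx = _cell(x, len(volume[0][0]))
    y0, y1, fy = _cell(y, len(volume[0]))
    z0, z1, fz = _cell(z, len(volume))
    # Interpolate along x on the four edges of the enclosing cell,
    c00 = lerp(volume[z0][y0][x0], volume[z0][y0][x1], fx)
    c01 = lerp(volume[z0][y1][x0], volume[z0][y1][x1], fx)
    c10 = lerp(volume[z1][y0][x0], volume[z1][y0][x1], fx)
    c11 = lerp(volume[z1][y1][x0], volume[z1][y1][x1], fx)
    # then along y, and finally along z.
    return lerp(lerp(c00, c01, fy), lerp(c10, c11, fy), fz)
```

Bilinear interpolation is the same construction with the final step along z omitted.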
A new way of rendering volume data using texture mapping is introduced in this presentation. Texture mapping, which maps from object space to texture space, is used to perform the backward mapping of volume rendering, which maps from image space to volume space. An innovative parallelization scheme subdivides the entire volume into subvolumes and so avoids the need to store the whole volume at each computation node.
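One possible way to subdivide a volume among computation nodes is to give each node a contiguous slab of slices. The sketch below is only an assumption about how such a split could look; the abstract does not describe the actual subdivision, and `split_depth` is a hypothetical helper.

```python
def split_depth(depth, nodes):
    """Split slice indices 0..depth-1 into `nodes` contiguous (start, stop) ranges.

    Hypothetical scheme: slab sizes differ by at most one slice.
    """
    if nodes < 1:
        raise ValueError("need at least one node")
    base, extra = divmod(depth, nodes)
    ranges = []
    start = 0
    for node in range(nodes):
        # The first `extra` nodes take one additional slice each.
        stop = start + base + (1 if node < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges
```

For example, `split_depth(64, 20)` assigns four slices to each of the first four nodes and three slices to each of the remaining sixteen. A real implementation would presumably also need to handle interpolated samples that straddle a slab boundary, which this sketch ignores.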
To handle volume data, the parallel computations are performed on the Denali graphics accelerator manufactured by Kubota. Denali is a general-purpose graphics accelerator with capabilities and functionality otherwise found only in very high-end graphics systems. By using the innovative parallelization scheme, we have extended it into an imaging workstation capable of versatile volume rendering.
The Denali architecture has parallelism at two levels. A Denali system can have up to 6 Transformation and Rasterization Modules (TRMs) which, being general purpose processors (AMD 29050s), can perform arbitrarily complex computations.
In addition, a system can have up to 20 Frame Buffer Modules (FBMs), each of which can perform a predefined set of per-pixel operations, such as image/volume resampling and pixel blending. Image/volume sampling operations can be performed in the FBMs, and large parallel speedups (up to 20x) are possible. Some image operations that cannot be performed on the FBMs can be done in parallel on the TRMs, with a resulting parallel speedup of up to 6x. Complex volumetric rendering operations (consisting of multiple suboperations) may be carried out using both the FBMs and the TRMs to best advantage.
In summary, the traditional 3D texture mapping of 3D graphics is applied to backward projection in volume rendering. A very high degree of parallelism (up to 20-fold) is achieved by storing a different subvolume in each of the FBMs. This differs significantly from traditional approaches, which store the entire volume data set at each computational node. Overcoming this limitation permits volume sizes up to 256x256x253 (or 512x512x64) to be rendered in hardware. Progressive refinement schemes can be used to reduce the sampling rate so that interactive rendering speeds can be realized. Currently, maximum intensity projection, multiplanar reformatting, volume resampling, ray sum, and surface rendering are implemented in Denali.
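To illustrate why storing a different subvolume at each node can still yield a correct image, consider maximum intensity projection: the maximum over the whole volume equals the maximum of the per-subvolume maxima, so partial images can be merged per pixel. The sketch below assumes an axis-aligned projection along z and a nested-list volume indexed as `volume[z][y][x]`; it is not the Denali implementation, and the function names are hypothetical.

```python
def mip(volume):
    """Maximum intensity projection along z: volume[z][y][x] -> image[y][x]."""
    height, width = len(volume[0]), len(volume[0][0])
    return [[max(plane[y][x] for plane in volume) for x in range(width)]
            for y in range(height)]


def combine(images):
    """Merge per-subvolume projections by taking the per-pixel maximum."""
    return [[max(pixels) for pixels in zip(*rows)] for rows in zip(*images)]


# A 4x2x2 volume split into two slabs, as if held by two separate nodes.
volume = [
    [[1, 5], [2, 0]],
    [[4, 3], [9, 1]],
    [[0, 7], [2, 8]],
    [[6, 2], [3, 3]],
]
partial_images = [mip(volume[:2]), mip(volume[2:])]
merged = combine(partial_images)
```

Here `merged` matches `mip(volume)`, the projection of the undivided volume. A ray sum can be merged in the same way by adding the partial images instead of taking their maximum.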