EP1114400A1 - Apparatus and method for real-time volume processing and universal three-dimensional rendering - Google Patents

Apparatus and method for real-time volume processing and universal three-dimensional rendering

Info

Publication number
EP1114400A1
EP1114400A1 EP99934066A EP99934066A EP1114400A1 EP 1114400 A1 EP1114400 A1 EP 1114400A1 EP 99934066 A EP99934066 A EP 99934066A EP 99934066 A EP99934066 A EP 99934066A EP 1114400 A1 EP1114400 A1 EP 1114400A1
Authority
EP
European Patent Office
Prior art keywords
volume
volume dataset
rendering
image
slice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP99934066A
Other languages
German (de)
English (en)
Other versions
EP1114400A4 (fr
Inventor
Arie E. Kaufman
Ingmar Bitter
Baoquan Chen
Frank Dachille
Kevin Kreeger
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Research Foundation of State University of New York
Original Assignee
Research Foundation of State University of New York
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Research Foundation of State University of New York
Priority to EP07120006A (EP1890267A3)
Publication of EP1114400A1
Publication of EP1114400A4
Current legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/005 General purpose rendering architectures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/08 Volume rendering

Definitions

  • the present invention relates generally to three-dimensional (3D) graphics and volume visualization, and more particularly relates to an apparatus and method for real time volume processing and universal three-dimensional rendering.
  • Volumetric data, which consists of information relating to three-dimensional phenomena, is one species of complex information that can benefit from improved image rendering techniques.
  • the process of presenting volumetric data, from a given viewpoint, is commonly referred to as volume rendering.
  • Volume visualization is a vital technology in the interpretation of the great amounts of volumetric data generated by acquisition devices (e.g., biomedical scanners), by supercomputer simulations, or by synthesizing geometric models using volume graphics techniques.
  • Of particular importance for the manipulation and display of volumetric objects are the interactive change of projection and rendering parameters, real-time display rates, and, in many cases, the possibility of viewing changes of a dynamic dataset over time, called four-dimensional (4D) visualization (i.e., spatial-temporal), as in the emerging integrated acquisition-visualization systems.
  • 4D four-dimensional
  • a volumetric dataset is commonly represented as a 3D grid of volume elements (voxels), often stored as a full 3D raster (i.e., volume buffer) of voxels.
  • Volume rendering is one of the most common techniques for visualizing the 3D scalar field of a continuous object or phenomenon represented by voxels at the grid points of the volume dataset, and can be accomplished using two primary methods: object-order methods and image-order methods.
  • In the object-order approach, the contribution of each voxel to the screen pixels is calculated, and the combined contribution yields the final image.
  • In the image-order approach, sight rays are cast from screen pixels through the volume dataset, and the contributions of voxels along these sight rays are used to evaluate the corresponding pixel values.
  • Cube-4, an architecture developed by Dr. Arie Kaufman, Ingmar Bitter and Dr. Hanspeter Pfister, some of whom are also named inventors in the present application, is a special purpose scalable volume rendering architecture based on slice-parallel ray-casting.
  • Cube-4 is capable of delivering true real-time ray-casting of high resolution datasets (e.g., 1024³ 16-bit voxels at a 30 Hertz frame rate).
  • Cube-4 cannot deliver such real-time performance for perspective projections.
  • the use of perspective projections either increases the rendering time or decreases the projected image quality.
  • prior architectures do not provide the ability to combine volumes and geometries into a single image.
  • Figure 1 shows a conventional volume visualization system 1. The volume data is stored on a disk 2 and loaded into memory 4 before rendering. A Central Processing Unit (CPU) 6 then computes the volume rendered image from the data residing in memory 4. The final image is written to a frame buffer 8, which is typically embedded on a graphics card, for display on a monitor 9 or similar display device.
  • CPU Central Processing Unit
  • the present invention is intended to provide a method and apparatus which significantly enhances the capabilities of known methods and apparatus to the extent that it can be considered a new generation of imaging data processing.
  • An apparatus, in accordance with the present invention, for real-time volume processing and universal three-dimensional (3D) rendering includes one or more three-dimensional (3D) memory units; at least a first pixel bus; one or more rendering pipelines; one or more geometry busses; and a control unit.
  • the apparatus is responsive to viewing and processing parameters which define a viewpoint, and the apparatus generates a 3D volume projection image from the viewpoint.
  • the projected image includes a plurality of pixels.
  • the 3D memory units store a plurality of discrete voxels, each of the voxels having a location and voxel data associated therewith.
  • the voxels together form a volume dataset, and the viewing and processing parameters define at least one face of the volume dataset as the base plane of the volume dataset as well as first and last processing slices of the volume dataset.
  • the control unit initially designates the first processing slice as a current slice of sample points, and controls sweeping through subsequent slices of the volume dataset as current slices until the last processing slice is reached.
  • Each of the plurality of rendering pipelines is vertically coupled to both a corresponding one of the plurality of 3D memory units and the at least first pixel bus, and each of the rendering pipelines has global horizontal communication preferably with at most its two nearest neighbors.
  • the rendering pipelines receive voxel data from the corresponding 3D memory units and generate a two-dimensional (2D) base plane image aligned with a face of the volume dataset.
  • the geometry I/O bus provides global horizontal communication between the plurality of rendering pipelines and a geometry engine, and the geometry I/O bus enables the rendering of geometric and volumetric objects together in a single image.
  • the apparatus and methods of the present invention surpass existing 3D volume visualization architectures and methods, not only in terms of enhanced performance, image rendering quality, flexibility and simplicity, but in terms of the ability to combine both volumes and surfaces (particularly translucent) in a single image.
  • the present invention provides flexible, high quality, true real-time volume rendering from arbitrary viewing directions, control of rendering and projection parameters, and mechanisms for visualizing internal and surface structures of high-resolution datasets. It further supports a variety of volume rendering enhancements, including accurate perspective projection, multi-resolution volumes, multiple overlapping volumes, clipping, improved gradient calculation, depth cuing, haze, super-sampling, anisotropic datasets and rendering of large volumes.
  • the present invention is more than a mere volume rendering machine; it is a high-performance interpolation engine, and as such, it provides hardware support for high-resolution volume rendering and acceleration of discrete imagery operations that are highly dependent on interpolation, including 2D and 3D texture mapping (with mip-mapping) and image-based rendering.
  • the apparatus and methods of the present invention, coupled with a geometry engine, combine volumetric and geometric approaches, allowing users to efficiently model and render complex scenes containing traditional geometric primitives (e.g., polygonal facets), images and volumes together in a single image (defined as universal 3D rendering).
  • the apparatus of the present invention additionally provides enhanced system flexibility by including various global and local feedback connections, which add the ability to reconfigure the pipeline stages to perform advanced imagery operations, such as image warping and multi-resolution volume processing. Furthermore, the present invention accomplishes these objectives in a cost-effective manner.
  • Figure 1 is a block diagram of a conventional volume visualization system.
  • Figure 2 is a conceptual block diagram illustrating a universal three-dimensional rendering system formed in accordance with one embodiment of the present invention.
  • Figure 3 is a simplified block diagram of the Cube-5 unit of Figure 2 illustrating a preferred implementation of the present invention.
  • Figure 4 is a functional block diagram depicting an overview of the universal three-dimensional rendering architecture formed in accordance with one embodiment of the present invention.
  • Figure 5 is a functional block diagram illustrating a unit rendering pipeline formed in accordance with one embodiment of the present invention.
  • Figure 6A is a graphical representation showing how 32 bits of texel data are stored for a 2x2 neighborhood in a miniblock of 16-bit voxels, in accordance with a preferred method of the present invention.
  • Figure 6B depicts a tabular comparison of voxel storage and texel storage for the example of Figure 6A.
  • Figure 7 illustrates special parallel preserving scanlines in the source and target images in accordance with a preferred forward image warping method of the present invention.
  • Figure 8 is a graphical representation illustrating a method for determining the parallel preserving scanline direction in accordance with the preferred forward image warping method of the present invention.
  • Figure 9 is a two-dimensional graphical representation of an example illustrating pixel read templates in the source image for performing scanline processing, in accordance with the preferred forward image warping method of the present invention.
  • Figure 10 is a two-dimensional graphical representation of the example of Figure 9 illustrating a bilinear interpolation of samples for performing scanline processing, in accordance with the preferred forward image warping method of the present invention.
  • Figure 11 is a two-dimensional graphical representation of a linear interpolation on samples to obtain pixel values for performing target pixel correction, in accordance with a preferred method of the present invention.
  • Figure 12 is a graphical representation illustrating the calculation of an anisotropic filter footprint for performing antialiasing in accordance with a preferred forward image warping method of the present invention.
  • Figure 13 is a graphical representation illustrating the splatting of source pixels onto the target samples, in accordance with the preferred forward image warping method of the present invention.
  • Figure 14 depicts an example of a y-slice shear for performing three-dimensional rotation by two-dimensional slice shear decomposition, in accordance with one method of the present invention.
  • Figure 15 depicts an example of an x-beam shear for performing three-dimensional rotation by two-dimensional beam shear decomposition, in accordance with another method of the present invention.
  • Figure 16 depicts an example of an x-slice-y-beam shear for performing three-dimensional rotation by two-dimensional slice-beam shear decomposition, in accordance with still another method of the present invention.
  • Figure 17 depicts an example of a three-dimensional x-beam shear for performing three-dimensional rotation by three-dimensional beam shear decomposition, in accordance with yet another method of the present invention.
  • Figure 18A illustrates a conventional undersampling method for performing perspective projections.
  • Figure 18B illustrates a conventional oversampling method for performing perspective projections.
  • Figure 19 illustrates an adaptive perspective ray-casting method for performing perspective volumetric projections, in accordance with a preferred form of the present invention, wherein a view frustum is divided into regions based on exponentially increasing distance from a viewpoint.
  • Figure 20A is a graphical representation illustrating the splitting/merging of rays at exponential region boundaries, in accordance with the preferred adaptive perspective ray-casting method of the present invention.
  • Figure 20B is a graphical representation illustrating the effective filter weights for ray segments A, B and C of the adaptive perspective ray-casting method example of Figure 20A.
  • Figure 21 illustrates an example of the weights for a two-dimensional filter of size ±2 samples, in accordance with a preferred form of the present invention.
  • Figure 22 is a graphical representation illustrating an example of the adaptive perspective ray-casting method of the present invention, wherein a 7³ volume is three voxel units from the viewpoint.
  • Figure 23 is a pseudo-code representation of a preferred method for performing Exponential-Regions Perspective back-to-front projection of a volume, in accordance with one form of the present invention.
  • Figure 24 illustrates an example of the Exponential-Regions Perspective ray casting method of the present invention across two regions.
  • Figure 25 depicts an example of the preferred weights for performing a 3³ symmetric approximation of the x-component of a Sobel gradient filter, in accordance with one embodiment of the present invention.
  • Figure 26 is a graphical representation illustrating a method for mixing geometric objects and volumes in a single image in accordance with one form of the present invention.
  • Figure 27 is a graphical representation of a method for clipping triangles to thin slab boundaries in accordance with one form of the present invention.
  • Figure 28 is a graphical representation of a method for bucket sorting translucent polygons in accordance with a preferred form of the present invention.
  • Figure 29 is a graphical representation of a method, in accordance with one form of the present invention, for creating sheared viewing geometry by pre-warping the polygon footprints.
  • Figure 30 is a graphical representation of a Cube-5 pipeline, formed in accordance with one form of the present invention, illustrating an SRAM composite buffer included therein.
  • Figure 31 is a graphical representation of a conventional graphics accelerator, conceptually illustrating the interfaces between the texture memory, frame buffer and geometry pipeline.
  • Figure 32 is a graphical representation illustrating one embodiment of the present invention employing a dual-use DRAM frame buffer connecting a geometry pipeline with the Cube-5 volume rendering pipeline of the present invention.
  • Figure 33 is a block diagram illustrating memory interfaces for each Cube-5 pipeline including a coxel FIFO queue, in accordance with one form of the present invention.
  • Figure 34 is a graphical representation of an RGBα coxel layout onto eight DRAM chips formed in accordance with a preferred embodiment of the present invention.
  • Figure 35 is a partial block diagram representation of an embedded DRAM chip implementation of run-length encoding (RLE) frame buffer hardware, formed in accordance with one form of the present invention.
  • RLE run-length encoding
  • Figure 36 is a pseudo-code representation showing processing occurring in the RLE hardware of Figure 35, in accordance with one form of the present invention.
  • Figure 37 is a graphical representation of a preferred embodiment of the present invention illustrating a RLE frame buffer connecting a geometry pipeline to an SRAM compositing buffer included in the Cube-5 pipeline.
  • Figure 38 illustrates a density profile of an oriented box filter taken along a line from the center of a solid primitive outward, perpendicular to the surface, in accordance with one form of the present invention.
  • Figure 39 illustrates a density profile of an oriented box filter taken along a line perpendicular to a triangle surface primitive, in accordance with another form of the present invention.
  • Figure 40 depicts a two-dimensional illustration of seven voxelization regions for a triangle primitive, in accordance with a preferred embodiment of the present invention.
  • Figure 41 is a pseudo-code representation of a method for computing the distance from a plane, in accordance with one form of the present invention.
  • Figure 42 is a block diagram representation illustrating an overview of a hardware voxelization pipeline, formed in accordance with one embodiment of the present invention.
  • Figure 43 is a block diagram depicting a distance unit which incrementally computes the distance of a current voxel from a desired plane, in accordance with one form of the present invention.
  • Figure 44 is a top view graphical representation illustrating a preferred method for performing image-based rendering in accordance with one form of the present invention.
  • the apparatus and methods of the present invention are capable of processing data and supporting real-time visualization of high resolution voxel-based data sets.
  • the present invention is a universal three-dimensional (3D) rendering system delivering enhanced volume rendering in addition to the integration of imagery (e.g., volumes, textures and images) with geometry (e.g., polygons).
  • imagery e.g., volumes, textures and images
  • geometry e.g., polygons.
  • the apparatus and methods are designed for use as a voxel-based system as described in the issued patents and pending applications of Dr.
  • FIG. 2 illustrates a conceptual view of a universal 3D rendering system 10 formed in accordance with one embodiment of the present invention.
  • Applications 12 which display collections of renderable objects are preferably split by an Applications Program Interface (API) 14 into appropriate imagery and geometry representations. These representations are subsequently processed by an imagery unit 16 and a geometry unit 18, respectively, which are illustrated generally as functional blocks.
  • API Applications Program Interface
  • the imagery unit 16 preferably includes a plurality of imagery pipelines and the geometry unit 18 preferably includes a plurality of geometry pipelines (not shown) for rendering the imagery and geometry representations, respectively.
  • the rendered outputs of the imagery unit 16 and the geometry unit 18 are subsequently combined in a blending unit 20 to generate a single baseplane image.
  • This baseplane image may preferably be transformed by a warp unit 22 to a final projection plane for display.
  • Figure 3 illustrates one implementation of the Cube-5 volume visualization system of the present invention.
  • the system preferably includes one or more three-dimensional memory units 24, with each 3D memory unit 24 vertically coupled to an input bus 26 and a corresponding Cube-5 chip 28.
  • a plurality of Cube-5 chips 28 are shown connected to a frame buffer pixel bus 34.
  • the system 10 of the present invention preferably interfaces to at least one conventional geometry engine 30 and a host computer 32, both operatively coupled between the input bus 26 and the frame buffer pixel bus 34 for communicating with the Cube-5 apparatus of the present invention.
  • the apparatus of the present invention 10 includes a plurality of 3D memory units 24 which are preferably connected to an imagery input bus 26, providing global horizontal communication between the 3D memory units 24.
  • the volume dataset is commonly represented as a regular grid of volume elements, or voxels, often stored as a full 3D raster (i.e., volume buffer). This volume dataset is preferably distributed across the 3D memory units 24. With a skewed distribution, the present invention allows conflict-free access to complete beams (i.e., rows) of voxels parallel to any of the major axes, thereby reducing the memory-processor bandwidth bottleneck.
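The conflict-free beam access described above can be illustrated with a minimal sketch. It assumes a linear skewing rule, `(x + y + z) mod n`, which the text does not spell out; the function names and the unit count `n` are hypothetical and chosen only for illustration.

```python
def memory_unit(x: int, y: int, z: int, n: int) -> int:
    """Assumed linear skewing: index of the memory unit storing voxel (x, y, z)."""
    return (x + y + z) % n


def beam_units(start, axis: int, n: int):
    """Memory units touched by a beam of n consecutive voxels along one major axis."""
    units = []
    for step in range(n):
        voxel = list(start)
        voxel[axis] += step  # walk along the chosen axis (0 = x, 1 = y, 2 = z)
        units.append(memory_unit(voxel[0], voxel[1], voxel[2], n))
    return units


def is_conflict_free(start, axis: int, n: int) -> bool:
    """A beam is conflict-free when its n voxels lie in n different memory units."""
    return len(set(beam_units(start, axis, n))) == n
```

Under this assumed rule, every beam of `n` voxels parallel to any major axis touches each of the `n` memory units exactly once, so all units can be read in parallel regardless of the beam direction.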
  • each 3D memory unit 24 is preferably connected to a dedicated real-time input 36.
  • By providing a dedicated connection to a real-time input source, the memory-processor bandwidth bottleneck is further reduced.
  • the universal 3D rendering system 10 of the present invention further includes a plurality of rendering pipelines, shown as functional blocks of Cube-5 units 38 in
  • Each rendering pipeline 38 is connected to a corresponding 3D memory unit 24 and preferably has horizontal communication with at least its two nearest neighbors.
  • the Cube-5 units 38 read from their dedicated 3D memory units 24 and produce a two-dimensional (2D) baseplane image.
  • This baseplane image, which contains a plurality of composited pixels generated by the Cube-5 units 38, is preferably distributed across a plurality of two-dimensional (2D) memory units 40.
  • Each of the plurality of 2D memory units 40 is preferably connected to both a corresponding Cube-5 pipeline unit 38 and a baseplane pixel bus 42 which provides global horizontal communication between 2D memory units 40.
  • the present invention includes a plurality of warp units 44 connected to the baseplane pixel bus 42.
  • the warp units 44 assemble and transform (i.e., warp) the baseplane image stored in the plurality of 2D memory units 40 onto a user-defined image plane.
  • Although the present invention contemplates using a single warp unit 44 (e.g., in order to reduce the costs or overhead of the hardware), the use of a plurality of warp units 44 is desirable to accelerate image transformations.
  • each of the warp units 44 is preferably connected to a frame buffer pixel bus 34 which provides global horizontal communication between warp units 44. Reading the source pixels over the baseplane pixel bus 42 and writing the final image pixels over the frame buffer pixel bus 34 preferably happen concurrently in order to allow greater system throughput. Although not a preferred architecture, the present invention also contemplates sequential reading and writing by the warp units 44. In this manner, only one pixel bus may be required, assuming the one pixel bus offers sufficient bandwidth for real-time image transfer for a full screen image.
  • the present invention preferably includes a geometry input bus 46 and a geometry output bus 48, although it is contemplated to combine the two busses into a single geometry input/output bus of sufficient bandwidth for real-time imaging.
  • the geometry input and output busses 46 and 48 are preferably connected to the inputs and outputs of the Cube-5 units 38 respectively and provide for the unique coupling of at least one geometry pipeline or engine (not shown) to the present system 10.
  • the architecture of the present invention, coupled with a geometry engine via the geometry busses 46 and 48, supports the integration of imagery, such as volumes and textures, with geometries, such as polygons and surfaces. This mixing of geometric data with volumetric objects is a powerful feature which is unique to the present invention.
  • each rendering pipeline 52 preferably includes four types of processing units: a trilinear interpolation unit (TriLin) 54, a gradient estimation unit (Gradient) 56, a shading unit (Shader) 58 and a compositing unit (Compos) 60.
  • TriLin trilinear interpolation unit
  • Gradient gradient estimation unit
  • Shader shading unit
  • Compos compositing unit
  • the volume dataset is stored as a regular grid of voxels distributed across the 3D memory units 24 in a skewed fashion, with each Cube-5 unit 38 connected to a corresponding 3D memory unit 24 (see Figure 4). Voxels of the same skewed beam are preferably fetched and processed in parallel, distributed across all Cube-5 units 38. Consecutive slices of the volume dataset parallel to a predefined baseplane (i.e., parallel to a face of the volume dataset which is most perpendicular to a predefined view direction) are preferably traversed in scanline order.
  • an address generation and control unit 62 preferably generates the addresses for access into the 3D memory unit 24.
  • the address generation and control unit 62 additionally designates a first processing slice as the current processing slice and controls sweeping through subsequent slices of the volume dataset until the final slice has been processed.
  • the trilinear interpolation unit 54 computes a new slice of interpolated sample values between two processing slices. It is contemplated by the present invention that the trilinear interpolation function may alternatively be performed as a sequence of linear or bilinear interpolations.
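As a minimal sketch of the alternative mentioned above, trilinear interpolation can be written as seven linear interpolations: four along x, two along y and one along z. The `corners[z][y][x]` layout and the function names are assumptions made for this illustration, not details taken from the Cube-5 hardware.

```python
def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b with weight t in [0, 1]."""
    return a + (b - a) * t


def trilinear(corners, fx: float, fy: float, fz: float) -> float:
    """Interpolate inside one voxel cell.

    corners[z][y][x] holds the eight corner values (indices 0 or 1);
    fx, fy, fz are the fractional sample offsets inside the cell.
    """
    # Four linear interpolations along x.
    c00 = lerp(corners[0][0][0], corners[0][0][1], fx)
    c01 = lerp(corners[0][1][0], corners[0][1][1], fx)
    c10 = lerp(corners[1][0][0], corners[1][0][1], fx)
    c11 = lerp(corners[1][1][0], corners[1][1][1], fx)
    # Two along y (together with the step above, one bilinear interpolation per slice).
    c0 = lerp(c00, c01, fy)
    c1 = lerp(c10, c11, fy)
    # One along z, blending the two processing slices.
    return lerp(c0, c1, fz)
```

The last step blends results from two neighboring slices, which matches the description of a new sample slice being computed between two processing slices.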
  • the gradient estimation unit 56 preferably computes central difference gradients using volume data from multiple slices of the volume dataset. Utilizing the central difference gradients generated by the gradient estimation unit 56, sample points of the current processing slice are subsequently shaded by the shading unit 58.
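A central difference gradient can be sketched as follows. The `volume[z][y][x]` indexing, the unit grid spacing and the function name are assumptions for illustration; the sketch handles interior grid points only.

```python
def central_difference(volume, x: int, y: int, z: int):
    """Central difference gradient at interior grid point (x, y, z).

    volume[z][y][x] is a nested list of scalar voxel values; grid spacing
    is assumed to be one voxel unit along each axis.
    """
    gx = (volume[z][y][x + 1] - volume[z][y][x - 1]) / 2.0
    gy = (volume[z][y + 1][x] - volume[z][y - 1][x]) / 2.0
    # The z component reads the slices before and after the current one,
    # which is why data from multiple slices must be available.
    gz = (volume[z + 1][y][x] - volume[z - 1][y][x]) / 2.0
    return gx, gy, gz
```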
  • the shading unit 58 preferably uses the samples and gradients as indices into one or more look-up tables (LUTs), preferably residing in each shading unit 58, which store material color and intensity information.
  • LUTs look-up tables
  • the material color table is dataset-type dependent, while the color intensity table is based on a local illumination model, as known by those skilled in the art. In simple terms, the multiplication of color and intensity yields a pixel color for each sample which is used in the compositing unit 60 to composite such color with the previously accumulated pixels along each sight ray.
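The shading and compositing steps described above can be sketched in a few lines. The text does not specify the compositing order or how opacity is obtained, so the front-to-back "over" accumulation and the per-sample `alpha` value below are assumptions, and the function names are hypothetical.

```python
def shade(color, intensity: float):
    """Multiply a material color (r, g, b) by an illumination intensity."""
    return tuple(c * intensity for c in color)


def composite_front_to_back(samples):
    """Accumulate shaded samples along one sight ray.

    samples is an iterable of (rgb, alpha) pairs ordered from the sample
    nearest the viewer to the farthest (assumed order).
    """
    acc_rgb = [0.0, 0.0, 0.0]
    acc_alpha = 0.0
    for rgb, alpha in samples:
        # Only the light not yet blocked by nearer samples contributes.
        weight = (1.0 - acc_alpha) * alpha
        for i in range(3):
            acc_rgb[i] += weight * rgb[i]
        acc_alpha += weight
    return tuple(acc_rgb), acc_alpha
```

For example, a half-transparent red sample at half intensity in front of an opaque green sample yields `((0.25, 0.5, 0.0), 1.0)`.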
  • data for computing the next sample along a continuous sight ray may reside on a neighboring Cube-5 unit 38.
  • the nearest-neighbor connections between Cube-5 units 38 are preferably used to transfer the necessary data to the appropriate Cube-5 unit 38, which will continue to process that particular sight ray.
  • the composited pixels i.e., baseplane pixels
  • the baseplane pixels, which form the baseplane image, are subsequently read from the 2D memory units 40, via the baseplane pixel bus 42, and assembled by the warp units 44.
  • the warp units 44 additionally transform the baseplane image to the final projection plane image.
  • the delay of data required for the trilinear interpolation unit 54 and gradient estimation unit 56 is preferably achieved by inserting one or more first-in-first-out (FIFO) units 64 into the pipeline data path prior to being processed by the trilinear interpolation 54 and the gradient estimation 56 units.
  • the FIFO unit(s) 64 may be implemented as, for example, random access memory (RAM), preferably embedded on the Cube-5 chip.
  • RAM random access memory
  • a compositing buffer (Compos Buffer) 74 operatively coupled to a bilinear interpolation unit (BiLin) 72 essentially provides a one-slice FIFO.
  • the bilinear interpolation unit 72 preferably interpolates to obtain values between voxels as needed for texture mapping.
  • BiLin 72 preferably uses only weights of 0.0 or 1.0, which selects one of the corner voxels of the volume dataset (determined by Select x and Select y). In this mode it acts simply as a multiplexer, moving the ray data when a ray crosses pipelines.
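The multiplexer behavior noted above follows directly from the bilinear formula: when both weights are restricted to 0.0 or 1.0, the result is exactly one of the four corner values. A minimal sketch, with hypothetical function and parameter names:

```python
def bilinear(c00: float, c10: float, c01: float, c11: float,
             wx: float, wy: float) -> float:
    """Bilinear interpolation; cXY is the corner value at x = X, y = Y."""
    top = c00 + (c10 - c00) * wx
    bottom = c01 + (c11 - c01) * wx
    return top + (bottom - top) * wy


def select_corner(c00: float, c10: float, c01: float, c11: float,
                  select_x: int, select_y: int) -> float:
    """With weights limited to 0 or 1, the interpolator degenerates to a 4-to-1 mux."""
    return bilinear(c00, c10, c01, c11, float(select_x), float(select_y))
```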
  • the Cube-5 architecture preferably supports re-ordering of the pipeline stages and a number of multipass rendering and processing operations, which require feedback connections between various stages of the Cube-5 rendering pipelines 52 and the 3D memory units 24. For example, correct rendering of overlapping volumetric objects preferably requires at least two passes through the Cube-5 pipeline
  • a multiple volumes feedback path 66 is preferably provided, operatively connecting the output of the compositing unit 60 to the corresponding 3D memory unit 24, which allows the re-sampled volumes to be written back into the 3D memory unit 24 after re-sampling, classification and shading.
  • the final rendering pass works on RGBα volumes.
  • each Cube-5 rendering pipeline 52 preferably includes an image-based rendering feedback path 68 connected between the warp unit 44 and the 3D memory unit 24.
  • the image-based rendering feedback line 68 preferably provides a feedback path for writing the intermediate warped images to the 3D memory unit 24. This may be particularly useful for accelerating certain image-based rendering operations requiring multiple warp passes.
  • the architecture of the present invention further contemplates feedback connections between the 3D memory unit 24 and various other Cube-5 rendering pipeline stages, or between the individual pipeline stages themselves. Image rendering speed may be substantially increased by including feedback paths which provide direct and immediate access to the computational results of individual pipeline stages, without having to wait for the results to traverse through the entire Cube-5 rendering pipeline 52.
  • the Cube-5 system includes connections which bypass selective stages of the rendering pipeline, that, for example, may not be required for certain imaging operations. By bypassing these unused pipeline stages, such imaging operations can be accelerated. As illustrated in
  • a texture map bypass 70 is preferably included in each Cube-5 rendering pipeline 52.
  • This texture map bypass connection 70 substantially speeds up mip-mapping, for instance, which consists of storing multiple levels-of-detail (LOD) of the image to be processed, by bypassing the shading unit 58 and compositing unit 60 and directly presenting the results from the trilinear interpolation unit 54 and gradient estimation unit 56 to the bilinear interpolation unit 72.
  • the architecture of the present invention can preferably be considered not only as an array of pipelines for performing volume rendering, but as a collection of hardware resources which can be selectively configured to perform a variety of imaging operations. For example, when the Cube-5 system of the present invention is performing volume rendering, essentially all of the hardware resources are required, while texture mapping generally requires only memory, some buffering and the interpolation units.
  • the Cube-5 architecture preferably interfaces with at least one conventional geometry engine 76 to support mixing of geometric data and volumetric objects in a single image. This is preferably accomplished by providing at least one geometry bus, as discussed above, to interface with the geometry engine 76.
  • the Cube-5 architecture of the present invention is adapted to reuse pipeline components (e.g., interpolation unit, etc.), wherever possible, to accelerate a variety of rendering algorithms using multiple configurations, in particular, rendering scenes of multiple volumetric and polygonal objects, texture mapping, and image-based rendering.
  • reusing pipeline components reduces hardware costs.
  • the Cube-5 architecture also supports various unique methods and algorithms for enhancing volume rendering and acceleration of other imaging operations. Some of these methods and algorithms will be discussed individually in greater detail below.
  • volume datasets are stored in blocks, thereby taking advantage of spatial locality.
  • linear blocking e.g., Voxelator API
  • hierarchical blocks are used which are preferably stored in a distributed arrangement, skewed across multiple 3D memory units. For example, using current Mitsubishi Electric 16-bit, 125 megahertz synchronous dynamic random access memory (SDRAM) to implement the 3D memory, each block can contain 8³ 16-bit voxels requiring 1024 bytes or two SDRAM pages.
  • Each block is preferably organized as a collection of 2³-voxel miniblocks residing in the same 3D memory unit.
  • the banks inside the SDRAM can preferably be accessed in a pipelined fashion such that the current burst transfer essentially completely hides the setup of the subsequent burst transfer. If the view-dependent processing order of the voxels in a miniblock does not coincide with their storage order, then the eight miniblock voxels are preferably reordered on the Cube-5 chip.
  • Hierarchical blocking allows random access to miniblocks at essentially full burst mode speed, essentially full (100%) bandwidth utilization, view-independent data storage and balanced workload.
  • Blocking not only optimizes the memory interface, but has an additional advantage of reducing the inter-chip communication bandwidth (i.e., between Cube-5 hardware units), since only the voxels on the block perimeters need to be exchanged between neighboring chips processing neighboring blocks. While processing a b³-voxel block in O(b³) time, only the O(b²) voxels on the block boundary need to be communicated between chips processing neighboring blocks, where b is the size of a block edge and each block has b×b×b (i.e., b³) voxels. Therefore, inter-chip communication needs O(1/b) less bandwidth than with a non-blocking solution.
  • the size of the block edge b can be in the range of about 4 ≤ b ≤ 64, although a block edge size of eight (8) is preferred.
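  • The block-size and bandwidth figures above can be checked with a short calculation. The following is only an illustrative sketch: the function names `block_bytes` and `boundary_fraction` are hypothetical, and it assumes that the "perimeter" of a block means every voxel lying on at least one block face.

```python
def block_bytes(b: int, bytes_per_voxel: int = 2) -> int:
    """Storage needed for one b x b x b block of voxels."""
    return b ** 3 * bytes_per_voxel


def boundary_fraction(b: int) -> float:
    """Fraction of the voxels of a b x b x b block that lie on its surface."""
    total = b ** 3
    interior = max(b - 2, 0) ** 3  # voxels that touch no block face
    return (total - interior) / total


if __name__ == "__main__":
    print("bytes for an 8^3 block of 16-bit voxels:", block_bytes(8))
    for b in (4, 8, 16, 32, 64):
        # The surface share shrinks roughly like 6/b as the block edge grows.
        print(b, round(boundary_fraction(b), 3))
```

  • Under these assumptions, the share of voxels that must be exchanged falls as the block edge grows, which is the O(1/b) behaviour described above, while larger blocks cost more memory per block.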
  • Block look-up tables are preferably utilized to store the pointers to all blocks comprising the current volume. This approach provides an easy method to restrict the active volume while zooming into a selected region of interest of a large volume. It also allows rendering of arbitrarily shaped sub-volumes (at block-sized granularity). Additionally, scenes containing many small volumes can be rendered very efficiently, as all volumes can reside anywhere among the 3D memory units, and only the look-up tables must be reloaded for each volume, rather than the 3D memory units.
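  • A block look-up table of the kind described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the Cube-5 addressing scheme: the skewing rule `(bx + by + bz) % num_units`, the slot numbering, the miniblock ordering and all function names are hypothetical and chosen only for illustration.

```python
B = 8  # block edge in voxels (the preferred value in the text)


def build_block_lut(dims_in_blocks, num_units):
    """Map block coordinates to (memory unit, slot in that unit)."""
    nx, ny, nz = dims_in_blocks
    next_slot = [0] * num_units
    lut = {}
    for bz in range(nz):
        for by in range(ny):
            for bx in range(nx):
                unit = (bx + by + bz) % num_units  # assumed skewing rule
                lut[(bx, by, bz)] = (unit, next_slot[unit])
                next_slot[unit] += 1
    return lut


def locate(lut, x, y, z):
    """Resolve a voxel to (unit, slot, miniblock index, index in miniblock)."""
    unit, slot = lut[(x // B, y // B, z // B)]
    lx, ly, lz = x % B, y % B, z % B
    per_edge = B // 2  # 2^3-voxel miniblocks per block edge
    miniblock = (lx // 2) + per_edge * ((ly // 2) + per_edge * (lz // 2))
    within = (lx % 2) + 2 * ((ly % 2) + 2 * (lz % 2))
    return unit, slot, miniblock, within


if __name__ == "__main__":
    lut = build_block_lut((2, 2, 2), num_units=4)
    print(locate(lut, 9, 0, 0))
```

  • Restricting the active volume to a region of interest then amounts to building a table that holds only the blocks of that region; the voxel data itself does not move.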
  • One method of performing perspective projection and/or Level-of-Detail (LOD) relies on two-fold super-sampling in the x and y directions. Accordingly, a four-times (4×) replication of the interpolation units for trilinear interpolation, as well as the gradient estimation units for gradient computation, is preferably employed. As a result, the datapath between the SDRAM and the Cube-5 pipelines is essentially unchanged. However, the bandwidth between Cube-5 pipelines is quadrupled, as is the on-chip throughput and buffers, primarily because each sample of the normal mode is replaced by up to four samples (i.e., 2× in the x direction and 2× in the y direction).
  • Handling anisotropic datasets and super-sampling preferably requires a modification of opacity α.
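  • The excerpt above does not state how the opacity is modified. One correction widely used in volume rendering, shown here purely as an assumption and not as the formula of the present invention, rescales the opacity by the ratio of the actual sample spacing to the reference spacing so that the accumulated attenuation is unchanged. The function name `corrected_opacity` and the table size are hypothetical.

```python
def corrected_opacity(alpha: float, step_ratio: float) -> float:
    """Opacity for a sample spacing of step_ratio times the reference spacing."""
    return 1.0 - (1.0 - alpha) ** step_ratio


def build_opacity_lut(step_ratio: float, size: int = 256):
    """Precompute the correction for quantised opacities 0..size-1."""
    return [corrected_opacity(i / (size - 1), step_ratio) for i in range(size)]


if __name__ == "__main__":
    a = corrected_opacity(0.36, 0.5)       # two-fold super-sampling
    combined = 1.0 - (1.0 - a) * (1.0 - a)  # composite the two half-steps
    print(a, combined)
```

  • With this rule, two half-distance samples composited together reproduce the opacity of the single original sample, which is the property a super-sampling correction needs.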
  • the perspective rendering of volumetric data with close to uniform sampling of the underlying volume dataset requires re-scaling of the compositing buffer 74 with filtering between levels.
  • Level-of-detail (LOD) perspective rendering requires re-alignment of the compositing buffer 74 between levels.
  • a hardware warp unit is generally necessary to obtain final full screen images in real time (i.e., a 30 Hertz frame rate).
  • the baseplane image generated by the compositing units 60 of the Cube-5 rendering pipelines 52, is preferably buffered in the 2D memory units 40.
  • each pixel of the baseplane image is preferably accessed only once.
  • another FIFO unit sized to hold at least one scanline, is required to store the previous scanline samples.
  • the interpolation weights for each grid pixel are preferably pre-calculated on a host machine.
  • the Z-buffer image is preferably written to the compositing buffer
  • the compositing unit 60 must perform a z-comparison prior to blending each new sample.
  • the geometry engine 76 preferably utilizes the geometry input bus (reference number 46 in Figure 4) of the present invention to insert each slab of RGB values into the data stream so that each slab is interleaved with the volumetric data slices.
  • Figure 6 shows, by way of example, how 32 bits of texel data are preferably stored for a 2×2 neighborhood in a miniblock of 16-bit voxels in the 3D memory unit, in accordance with the present invention. Therefore, a four-texel neighborhood of 32-bit texels is preferably read during each memory burst read.
  • the Cube-5 system preferably performs, on average, 2.25 data burst reads to access the appropriate texel neighborhood, since some texture coordinates may lie between stored miniblocks.
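  • The 2.25 average can be reproduced with a simple expected-value calculation. The sketch below is an illustration under assumptions that the excerpt does not spell out: texels are stored in aligned 2×2 tiles, one tile is fetched per burst, and the 2×2 bilinear footprint is equally likely to start on either texel of a tile along each axis. The function name is hypothetical.

```python
from fractions import Fraction


def expected_burst_reads() -> Fraction:
    """Average number of 2x2 tiles touched by a 2x2 bilinear footprint."""
    p_cross = Fraction(1, 2)  # chance the footprint straddles a tile edge, per axis
    expected = Fraction(0)
    for cross_x in (0, 1):
        for cross_y in (0, 1):
            p = (p_cross if cross_x else 1 - p_cross) * \
                (p_cross if cross_y else 1 - p_cross)
            tiles = (1 + cross_x) * (1 + cross_y)  # 1, 2 or 4 tiles touched
            expected += p * tiles
    return expected


if __name__ == "__main__":
    print(float(expected_burst_reads()))
```

  • One quarter of the footprints fall inside a single tile, one half straddle two tiles, and one quarter straddle four, giving 0.25·1 + 0.5·2 + 0.25·4 burst reads on average.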
  • one way to implement image-based rendering in hardware is to utilize the memory control unit 78, preferably included in each Cube-5 pipeline 52, to read the appropriate source pixels based on the contributing region for each pipeline.
  • the interpolation units (e.g., 54 and 72)
  • the warp unit 44 may be utilized to perform this function.
  • the source pixels contributing to the current view are read and assembled into the 2D memory units 40, preferably through a connection line 41, followed by the warp transformation.
  • four assembled source images are processed in four consecutive warp passes.
  • the image-based rendering feedback line 68 provides feedback for writing the intermediate warped images to the 3D memory 24.
  • the 3D memory units 24 provide local storage for a large database of images.
  • the apparatus of the present invention described herein above may considerably accelerate conventional volume processing methods, beyond the universal rendering already described.
  • the Cube-5 apparatus of the present invention may be used in conjunction with a number of unique algorithms adapted for enhancing the performance of and/or providing enhanced features for real-time volume processing, therefore making the overall Cube-5 system superior to existing volume rendering architectures, such as Cube-4.
  • Some of these unique algorithms, including those for performing image warping, three-dimensional transformations, perspective projections, handling large volumes, high quality rendering, clipping, depth cueing, super-sampling and anisotropic datasets, are discussed in detail below.
  • Image warping is preferably the final stage of the Cube-5 volume rendering pipeline.
  • image warping primarily relates to the geometric transformation between two images, namely, a source image and a target image.
  • the geometric transformation defines the relationship between source pixels and target pixels. Efficiency and high quality are equally critical issues in such applications.
  • the warp unit preferably performs the image transformation function. Consequently, applications employing a warp unit benefit from the image warping method of the present invention.
  • image warping methods are generally classified as either forward warping or backward warping.
  • the source pixels are processed in scanline order and the results are projected onto the target image.
  • the target pixels in raster order are inversely mapped to the source image and sampled accordingly.
  • Most known prior art warping algorithms employ backward warping.
  • affine transformations i.e., translation, rotation, scaling, shearing, etc.
  • a perspective transformation is considered to be more expensive and challenging.
  • an expensive division is needed when calculating the sample location in the baseplane image for a pixel in the projection plane.
  • Conventional perspective warping is typically at least three-fold slower than parallel warping, when implemented by a CPU. Accordingly, some prior art approaches have decomposed the perspective transformation into several simpler transformations requiring multiple passes.
  • One primary problem inherent in multipass transformation algorithms is that the combination of two one-dimensional (1D) filtering operations is not as flexible as true two-dimensional (2D) filtering.
  • conventional multi-pass approaches introduce additional filtering operations which degrade image quality.
  • the present invention preferably employs a unique single-pass forward warping method which can be implemented with substantially the same efficiency as affine transformations. Costly divisions, which were traditionally performed for every pixel, are reduced to only twice per scanline according to the present invention. Thus, by reducing the number of division operations, the present invention provides an alternative perspective warping method which is superior to known prior art methods, at least, for example, in terms of speed and the efficient hardware implementation. A preferred method for perspective warping, in accordance with the present invention, will now be discussed.
  • the present invention uses a scanline approach to perform perspective warping. Rather than scanning in normal raster scanline order, however, the algorithm of the present invention is processed in a special scanline direction in the source image. As illustrated in Figures 7 and 8, this special scanline direction 92 (Figure 8) preferably has the property that parallel scanlines 84 in the source image 80 appear as parallel scanlines 86 in the target image 82, and that equi-distant sample points 88 along a source scanline 84 remain as equi-distant sample points 90 in the target scanline 86.
  • Some advantages of this unique approach include a reduced complexity of perspective-correct image warping (i.e., by eliminating the division per pixel and replacing it with two divisions per scanline), accurate antialiasing by incorporating anisotropic filtering, correction of flaws in Gouraud shading caused by bilinear interpolation and optimization of the memory bandwidth by reading each source pixel exactly once.
  • the source image 80 is preferably placed on a three-dimensional (3D) surface and the target image 82 is placed on a screen.
  • a sight ray (or rays) 94 is cast from a viewpoint (or eye point) 96 to 3D space and intersected with the screen 82 and 3D surface 80.
  • the intersection points are the sample points 98.
  • This parallel-preserving (PP) scanline direction exists and is unique for a given perspective transformation. It is to be appreciated that for parallel projections, any scan direction preserves this parallelism on both images, and thus a raster scanline direction may be preferably used due to its simplicity.
  • parallel-preserving (PP) scanlines 84 and 86 are shown in both the source 80 and target 82 images respectively. Once the parallelism property is achieved, pixel access becomes regular, and spatial coherency can be utilized in both images. Additionally, the PP scanline enables the application of a pure incremental algorithm without division to each scanline for calculating the projection of source samples 88. One division is still needed, however, for each of the two endpoints of every scanline due to the non-linear projection.
  • sample points 90 on the target scanline 86 may not necessarily coincide with the target pixels 91.
  • the sample points 90 can be aligned on the x grid lines 89 of the target image 82, thus the sample points 90 are only off the y grid lines 87 (they are equi-distant along the scanline).
  • placing the sample value in the nearest-neighbor target pixel is a reasonable approximation, as a half pixel is the maximum error.
  • the present invention may perform pixel correction and effective antialiasing, to be described herein below.
  • the forward warping algorithm of the present invention is preferably performed in two stages: (1) calculating the special parallel-preserving (PP) scanline direction, and (2) forward mapping the source image to the target image along the special PP scanlines, incrementally within each scanline.
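  • The two stages can be illustrated with a small sketch. It is not the patented derivation, whose equations are not reproduced in this excerpt; it only relies on the general fact that a perspective mapping becomes affine along any line on which its denominator is constant. The 3×3 forward matrix `M` (source to target, column-vector convention), the example coefficients and all function names are assumptions made for illustration.

```python
def project(M, u, v):
    """Map source (u, v) to target (x, y); includes the perspective division."""
    w = M[2][0] * u + M[2][1] * v + M[2][2]
    x = (M[0][0] * u + M[0][1] * v + M[0][2]) / w
    y = (M[1][0] * u + M[1][1] * v + M[1][2]) / w
    return x, y


def pp_slope(M):
    """Stage 1: source-image slope along which the denominator w is constant."""
    return -M[2][0] / M[2][1]  # assumes M[2][1] != 0


def forward_scanline(M, intercept, u_start, u_end, count):
    """Stage 2: map count >= 2 equi-distant samples of one PP scanline.

    Only the two endpoints need a perspective division; the samples in
    between are obtained by constant increments.
    """
    k = pp_slope(M)
    x0, y0 = project(M, u_start, k * u_start + intercept)
    x1, y1 = project(M, u_end, k * u_end + intercept)
    dx = (x1 - x0) / (count - 1)
    dy = (y1 - y0) / (count - 1)
    return [(x0 + i * dx, y0 + i * dy) for i in range(count)]


if __name__ == "__main__":
    M = [[1.0, 0.2, 3.0], [0.1, 1.0, 2.0], [0.001, 0.002, 1.0]]
    for point in forward_scanline(M, 4.0, 0.0, 10.0, 11):
        print(point)
```

  • Because the denominator depends only on the scanline intercept, scanlines with different intercepts map to target lines of the same slope, and equal steps in the source remain equal steps in the target.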
  • the parallel-preserving (PP) scanline is the intersection line between the three-dimensional (3D) planar surface and the screen (i.e., target image).
  • the PP scanline must be calculated based on a 2D matrix.
  • a perspective transformation can be presented as
  • (u, v) is the coordinate of the source pixel
  • (x, y) is the coordinate of the target pixel
  • M is the perspective transformation matrix. The (u, v) coordinate can be expressed in terms of (x, y) as
  • slope k denotes a line direction
  • B denotes a line intercept.
  • two parallel lines are preferably defined having identical slope k and intercepts B of 0 and 1, represented by point pairs of (0, 0), (1, k) and (0, 1), (1, k + 1), respectively.
  • the coordinates of these points in the source image are then calculated. Since perspective transformation preserves straight lines, these two lines will remain as straight lines in the source image and their slopes can be calculated from two point pairs.
  • an equation in k is preferably obtained. Solving this equation for k results in
  • the second stage of the preferred forward warping method of the present invention involves scanline processing and is illustrated in Figures 9 and 10 by way of example.
  • the preferred algorithm sweeps the scanlines 84 (e.g., scanlines S1 - S4) through the source image 80.
  • the scanlines 84 have the slope k'.
  • the samples 88 along each scanline 84 are preferably incrementally calculated.
  • the projection of the endpoints from the target image onto the source image is calculated.
  • increments are calculated in both the x and the v directions.
  • pixels that have been previously read are preferably buffered so that common pixels are read from the buffer rather than from the source image itself.
  • pixels are preferably read in a fixed pattern, called the pixel read template 100, calculated based on the Bresenham line algorithm (as appreciated by those skilled in the art).
  • the binary digits shown at the bottom of Figure 7 represent one way of encoding the read template 100.
  • this code indicates the increase in the positive v direction; a "0" represents no increase and a "1" denotes an increase by one unit, while u is always increased by one unit.
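  • The binary read-template encoding described above can be generated with an integer Bresenham-style error accumulator. The sketch below is an assumption-laden illustration rather than the patented procedure: the slope is given as a hypothetical rational `dv/du` with 0 ≤ dv ≤ du, so that u is the primary axis, and the function name is invented for this example.

```python
def read_template(dv: int, du: int, length: int):
    """Return a list of 0/1 codes, one per unit step along the u axis.

    A 1 means v also increases by one unit at that step; a 0 means it
    does not. Assumes 0 <= dv <= du.
    """
    code = []
    err = 0
    for _ in range(length):
        err += 2 * dv
        if err >= du:        # the line has drifted at least half a pixel
            code.append(1)
            err -= 2 * du
        else:
            code.append(0)
    return code


if __name__ == "__main__":
    # Slope 1/4: v rises by one unit every four steps in u.
    print("".join(str(bit) for bit in read_template(1, 4, 16)))
```

  • Since the scanlines are parallel, one such template can be computed once and reused for every scanline of the projection.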
  • the u axis may preferably be referred to as the primary processing axis. It is preferred that the template 100 always starts from the left-most pixel and moves in the vertical direction
  • the buffer size is preferably four scanlines.
  • In Figure 10A there is illustrated the addressing of samples in the buffer. Whenever the template code value is 1, the sample decreases by one unit in the v direction.
  • the thick zigzag line 104 represents the output scanline in the buffer.
  • Figure 10B illustrates a preferred procedure for bilinearly interpolating one of the samples, s, in this region.
  • the contents of the buffer are preferably updated based on the scanline position.
  • templates 1, 2, 3 and 4 are preferably in the buffer when processing scanline S .
  • the buffer preferably remains the same.
  • template 5 is preferably read into the buffer and template 1 is discarded.
  • template 6 preferably replaces template 2, and so on.
  • one of the features of the unique forward image warping method of the present invention is the correction of flaws in Gouraud shading.
  • Gouraud shading is a popular intensity interpolation algorithm used to shade the surfaces of geometric objects. Given color only at the vertices, Gouraud shading bilinearly interpolates the intensities for the entire rasterization of a geometry in a raster scanline order.
  • the flaws of the Gouraud shading approach are known in the art and have been the subject of such articles as, for example, Digital Image Warping, by
  • the image warping method of the present invention corrects the perspective distortion in Gouraud shading.
  • the perspective distortion is present because the linear interpolation along a raster in screen space is generally non-linear when transformed into geometrical coordinates.
  • Using the special scan direction of the present invention, linearity is preserved by the mapping.
  • interpolation is linear in both image and geometrical space, thereby fixing the distortion of Gouraud shading. It is to be appreciated that interpolation along the edges is still non-linear, and therefore the scanline endpoints must be transformed into geometrical space for correct interpolation.
  • the forward mapping algorithm of the present invention preferably generates a target image that is essentially indistinguishable from an image generated using traditional methods.
  • the method of the present invention can preferably calculate the pixel value at exact grid points.
  • a simple target pixel correction scheme may preferably be introduced to perform this correction.
  • a linear interpolation of the two samples immediately above and below each pixel is preferably performed. Performing this linear interpolation simply as a second pass may increase the cost, since the samples must be read over again. Instead, as each sample is generated, a preferred method of the present invention spreads the contribution of each sample to the corresponding upper and lower pixels with no intermediate buffering. As illustrated by the example of Figure 11, samples 112 located on the thicker inclined scanline 108 contribute to the shaded pixels neighboring them (lighter shading above the scanline, darker shading below the scanline). The arrows indicate that each sample 112 preferably contributes to two pixels. It is preferred that a pixel not be written out until both contributions are collected. Thus, a one scanline buffer is preferably included for storing the intermediate pixel values.
  • a pixel write pattern is preferably pre-calculated.
  • the pixel write template 110 is preferably calculated by truncating the y coordinate value of samples along a scanline.
  • the template 110 is preferably encoded as a series of integer y steps and fractional distances dy from the true scanline 86.
  • the weights used for the final linear interpolation are dy and 1 - dy for the upper and lower pixels, respectively. Since all scanlines are preferably one unit apart in the vertical direction (i.e., y direction), the template is calculated only once per projection.
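  • The pixel write template and the spreading of each sample onto its two vertical neighbors can be sketched as follows. This is a simplified illustration, assuming that samples sit exactly on the x grid lines, that y increases toward the "upper" pixel, and that scanlines are exactly one unit apart in y; the function names and the list-of-rows image layout are hypothetical.

```python
import math


def write_template(k: float, count: int):
    """Per x step: (integer y step since the previous sample, fractional dy)."""
    template = []
    previous_row = 0
    for x in range(count):
        y = k * x
        row = math.floor(y)           # truncate the y coordinate
        template.append((row - previous_row, y - row))
        previous_row = row
    return template


def spread(image, samples, k: float, intercept: float):
    """Add one scanline of samples into image[row][x] with weights 1-dy and dy."""
    for x, value in enumerate(samples):
        y = k * x + intercept
        row = math.floor(y)
        dy = y - row
        for r, weight in ((row, 1.0 - dy), (row + 1, dy)):
            if 0 <= r < len(image):
                image[r][x] += weight * value


if __name__ == "__main__":
    image = [[0.0] * 4 for _ in range(4)]
    for intercept in (0.0, 1.0, 2.0):      # scanlines one unit apart in y
        spread(image, [1.0] * 4, 0.25, intercept)
    print(write_template(0.25, 5))
    print(image[1])
```

  • Each interior pixel receives dy from the scanline below it and 1 - dy from the scanline above it, so the two contributions always sum to a full weight of one.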
  • the forward image warping method of the present invention can further improve on image quality by antialiasing.
  • an appropriate resampling filter may preferably be used to avoid aliasing on the upper scanlines.
  • Isotropic filtering results in clearly incorrect and blurry images.
  • the need for anisotropic filters has been addressed in such articles as Survey of Texture Mapping, by P. S. Heckbert, IEEE Computer Graphics and Applications, 6(11):56-67, November 1986, and more recently in Texram: Smart Memory for Texturing, by A. Schilling, et al., IEEE Computer Graphics and Applications, 16(3):32-41, May 1996.
  • each filter is defined by its footprint and profile. Taking a target sample as a circle, its projection in the source image is its footprint. As illustrated in Figure 12, this footprint 114 should generally be neither circular (i.e., isotropic) nor square-shaped (i.e., as in mip-mapping), but conic in shape.
  • the profile of the filter decides the weights of the contributing pixels within the footprint. Although a sinc filter is optimal, a Gaussian filter is easier to implement and is preferred because of its finite footprint and good low-pass characteristics.
  • the perspective warping algorithm of the present invention offers more accuracy in calculating the anisotropic footprint, producing higher image quality at a lower cost.
  • the Jacobian must be calculated. Using the image warping method of the present invention, however, calculation of the Jacobian may be eliminated.
  • the Jacobian J for the generalized transformation is a non-linear function of x and y,
  • the Jacobian is used to determine the footprint of each pixel in the source image and is necessary for anisotropic filtering.
  • the differences between screen pixels in xy raster space are projected into the source image by computing the directional derivatives in the [1, 0] and [0, 1] directions. These derivatives in source image space are called r1 and r2, and are defined as
  • These vectors, r1 and r2, define the bounding box of an ellipse that approximates the footprint 114.
  • these vectors 116 and 118 are calculated for every pixel, when needed, for conventional methods of anisotropic filtering (e.g., elliptical weighted average (EWA), footprint assembly). This requires one more division per pixel for calculating C.
  • Since the Jacobian is a linear approximation of the non-linear mapping, it is more accurate, and therefore preferable, to compute the footprint by taking the distances to neighboring samples in source image space. Since the projections of neighboring samples are already computed, this method of the present invention requires no additional division.
  • the parallel-preserving (PP) scan direction provides for greater coherency and no division to compute the Jacobian.
  • the footprint is preferably defined by rf and rf.
  • the directional derivative rf in direction [1, k] along the PP scanline is
  • rf varies linearly along the scanline since it is a function of x, and thus it can be incremented along the scanline.
  • the special scan direction makes it possible to compute the source image coordinates and pixel footprints simply and efficiently.
  • correct anisotropic filtering can be performed using a standard method known by those skilled in the art, such as, for example, Greene and Heckbert's elliptical weighted average (EWA) or Schilling et al.'s footprint assembly.
  • Schilling et al.'s footprint assembly are described, for example, in the text Creating Raster Omnimax Images from Multiple Perspective Views Using the Elliptical Weighted
  • the present invention provides a forward mapping technique in which all source pixels are read once in pixel read template order and subsequently splatted onto the target image with a filter kernel.
  • each source pixel 124 has a Δx 120 and a Δy 122 relative to each of its nearest-neighbor target samples 126.
  • the Δx can be preferably computed incrementally since all samples along a scanline are equi-distant.
  • the special scan direction essentially guarantees that the Δy is constant along each scanline.
  • the actual distances can be estimated preferably by adding a small correction which may be stored in the pixel read template 130 and is preferably uniform among scanlines.
  • the filter kernel is preferably pre-computed once and stored in a lookup table (LUT). Subsequently, the contribution of each source pixel 124 is preferably indexed by its Δx and Δy into the lookup table (LUT) for the four (or more) nearest-neighbor target samples 126.
  • the number of target samples 126 depends upon the footprint of the filter used, and it may preferably vary from four to 16 samples.
  • each source pixel 124 is preferably read exactly once from memory, then four (or more) times modulated by a lookup table entry and accumulated in the target pixel. In this manner, the final pixel value is the weighted average of the nearby source pixels 124. This weighted average requires a division by the sum of the filter weights to normalize each final pixel intensity.
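  • The splatting step with a pre-computed kernel table and a final normalization can be sketched as below. This is a minimal sketch under assumptions, not the hardware datapath: source pixels are taken as already projected into target coordinates, the kernel is a Gaussian with a hypothetical width `sigma`, the table resolution `LUT_SIZE` is arbitrary, and only the four nearest target pixels receive a contribution.

```python
import math

LUT_SIZE = 16  # hypothetical table resolution per axis


def build_kernel_lut(sigma: float = 0.5):
    """Gaussian weights indexed by quantised |dx| and |dy| in [0, 1]."""
    step = 1.0 / (LUT_SIZE - 1)
    return [[math.exp(-((i * step) ** 2 + (j * step) ** 2) / (2.0 * sigma * sigma))
             for j in range(LUT_SIZE)]
            for i in range(LUT_SIZE)]


def splat(samples, width, height, lut):
    """samples: (x, y, value) triples in target space; returns image[row][col]."""
    acc = [[0.0] * width for _ in range(height)]
    wsum = [[0.0] * width for _ in range(height)]
    for x, y, value in samples:            # each source pixel is visited once
        x0, y0 = math.floor(x), math.floor(y)
        for ty in (y0, y0 + 1):
            for tx in (x0, x0 + 1):
                if 0 <= tx < width and 0 <= ty < height:
                    i = round(abs(x - tx) * (LUT_SIZE - 1))
                    j = round(abs(y - ty) * (LUT_SIZE - 1))
                    weight = lut[i][j]
                    acc[ty][tx] += weight * value
                    wsum[ty][tx] += weight
    # Normalise by the sum of filter weights collected at each target pixel.
    return [[acc[r][c] / wsum[r][c] if wsum[r][c] > 0.0 else 0.0
             for c in range(width)]
            for r in range(height)]


if __name__ == "__main__":
    lut = build_kernel_lut()
    print(splat([(0.3, 0.4, 5.0), (1.2, 0.9, 5.0)], 3, 2, lut))
```

  • Dividing by the accumulated weight is what makes a uniformly colored source reproduce the same color in the target regardless of how many source pixels land near each target pixel.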
  • 3D volume transformation plays a key role in volume rendering, volume modeling and registration of multiple volumes.
  • rotation generally consumes the most computation time and is considered the most complicated. Accordingly, in providing a universal 3D rendering architecture in accordance with the present invention, several unique methods for performing arbitrary 3D volume rotation are presented, as described in detail herein below.
  • the universal 3D rendering hardware of the present invention may be used without the 3D volume rotation methods described herein, these methods, or algorithms, are preferably implemented in conjunction with the apparatus of the present invention to provide enhanced speed and features and are adapted to most efficiently utilize the apparatus of the present invention.
  • a beam in a volume may be defined as a row of voxels along one major coordinate axis (e.g., an x-beam is a row of voxels in the x direction).
  • a slice of a volume is a plane of voxels which is perpendicular to a major axis (e.g., an x-slice is defined as a plane perpendicular to the x axis).
  • the present invention further provides novel methods for performing arbitrary 3D rotation, essentially by decomposing the 3D rotations into sequences of different types of shear transformations.
  • a 3D rotation matrix can be expressed as the concatenation of three major axis rotations, Rx(φ), Ry(θ), Rz(α), where
  • a method for performing two-dimensional (2D) slice shear rotation preferably involves a decomposition of the 3D rotation into a sequence of 2D slice shears.
  • a volume slice i.e., a plane of voxels along a major projection axis and parallel to any two axes
  • a slice may be arbitrarily taken along any major projection axis.
  • Figure 14 illustrates a y-slice shear.
  • a 2D y-slice shear is preferably expressed as:
  • a 2D y-slice shear may preferably be written as S(xz, y, (a, b)), interpreted as a shear along the y axis by an amount a 132 in the x-direction and an amount b 134 in the z-direction. Although both a and b are preferably constants, it is further contemplated that a and b can represent functions as well.
  • a 2D x-slice shear, S(yz, x, (c, d)), and a 2D z-slice shear, S(xy, z, (e, f)), are similarly defined. With reference to Figure 14, the volume represented by the solid lines 136 is the shear result of the volume defined by the dotted lines 138.
  • consecutive shears along the same axis produce a conforming shear.
  • a conforming shear For example:
  • shear products may be restricted to products of different shears: S(yz, x, (c, d)), S(xz, y, (a, b)) and S(xy, z, (e, f)).
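  • The y-slice shear and the observation that consecutive shears along the same axis collapse into a single conforming shear can be illustrated with plain 3×3 matrices. The sketch assumes a column-vector convention in which S(xz, y, (a, b)) maps (x, y, z) to (x + a·y, y, z + b·y); the convention used in the patent's own matrices may differ, and the helper names are hypothetical.

```python
def y_slice_shear(a: float, b: float):
    """S(xz, y, (a, b)): x' = x + a*y, y' = y, z' = z + b*y."""
    return [[1.0, a, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, b, 1.0]]


def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(M, p):
    """Apply a 3x3 matrix to a point (x, y, z)."""
    return tuple(sum(M[i][j] * p[j] for j in range(3)) for i in range(3))


if __name__ == "__main__":
    combined = matmul(y_slice_shear(0.5, 0.25), y_slice_shear(0.25, 0.5))
    print(combined)                       # same as y_slice_shear(0.75, 0.75)
    print(apply(combined, (1.0, 2.0, 3.0)))
```

  • Because two y-slice shears merely add their coefficients, repeating the same shear type back to back gains nothing, which is why the sequences are restricted to products of different shears.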
  • the product matrix of these three shear matrices will still not be in the general form due to a constant 1 in the present matrix.
  • another shear matrix is preferably concatenated, where this final shear is the same slice shear as the first one. This results in the following six permutations of shear sequences:
  • the product matrix of the consecutive shear matrices is preferably computed and set equal to the underlying 3D rotation matrix.
  • the first shear sequence (i.e., S(xz, y, (a, b)) S(xy, z, (e, f)) S(yz, x, (c, d)) S(xz, y, (g, h)))
  • the shear matrices for the remaining five slice shear sequences given above may be obtained.
  • the slice shear sequence with the solution given above has the simplest expression and is preferably termed the dominant sequence.
  • a beam shear may be defined as a beam that is merely shifted in its major direction without any change of the other two coordinates.
  • a 2D x-beam shear is preferably expressed as:
  • a 2D x-beam shear may preferably be written as S(x, yz, (c, d)), interpreted as a shear along the x axis by an amount c in the y-direction and an amount d in the z-direction.
  • a 2D y-beam shear, S(y, xz, (a, b)), and a 2D z-beam shear, S(z, xy, (e, f)), are similarly defined.
  • Figure 15 illustrates an x-beam shear, wherein the volume represented by the dotted lines 146 is sheared to the volume position represented by the solid lines 144.
  • a two-dimensional (2D) beam shear is advantageous over a 2D slice shear, and is therefore preferred, since a beam is shifted without changing the other two coordinates.
  • the resampling for each pass of the 2D beam shear approach is simpler, as only a linear interpolation is required.
  • a 2D slice shear approach requires a bilinear interpolation which is more complex.
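  • The reason a beam shear needs only linear interpolation is that every x-beam is shifted along its own axis by a single offset. The sketch below illustrates this under assumptions: the volume is a nested list indexed as `volume[z][y][x]`, the offset of the beam at (y, z) is taken to be c·y + d·z, samples shifted in from outside the beam are treated as zero, and the function names are hypothetical.

```python
import math


def shift_beam(beam, offset: float):
    """Resample a 1D beam shifted by a fractional offset (linear interpolation)."""
    n = len(beam)
    out = [0.0] * n
    for i in range(n):
        src = i - offset                  # position to fetch from the input beam
        i0 = math.floor(src)
        t = src - i0
        left = beam[i0] if 0 <= i0 < n else 0.0
        right = beam[i0 + 1] if 0 <= i0 + 1 < n else 0.0
        out[i] = (1.0 - t) * left + t * right
    return out


def x_beam_shear(volume, c: float, d: float):
    """Shift every x-beam of volume[z][y][x] by c*y + d*z along x."""
    return [[shift_beam(volume[z][y], c * y + d * z)
             for y in range(len(volume[z]))]
            for z in range(len(volume))]


if __name__ == "__main__":
    volume = [[[0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0]]]
    print(x_beam_shear(volume, 1.0, 0.0))
    print(shift_beam([0.0, 2.0, 4.0, 6.0], 0.5))
```

  • Each output voxel reads at most two neighbors from the same beam, whereas a slice shear moves a voxel in two directions at once and therefore needs four neighbors.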
  • shear products may be restricted to products of different shears: S(x, yz, (c, d)), S(y, xz, (a, b)), S(z, xy, (e, f)).
  • the product matrix of these three shear matrices will still not be in the general form due to a constant 1 in the matrix.
  • another shear matrix is preferably concatenated, where this final shear is the same beam shear as the first one. This results in the following six permutations of shear sequences:
  • a 2D beam-slice shear may preferably be defined as a beam that is shifted within a plane.
  • a 2D x-beam-y-slice shear is preferably expressed as:
  • a 2D x-beam-y-slice shear may preferably be written as S((x, yz, (a, g)), (z, y, b)), interpreted as a shear along the x axis by an amount a in the y-direction and an amount g in the z-direction, combined with a shear along the z axis by an amount b in the y-direction, where a, g and b are preferably constants.
  • a beam-slice shear is a combination of a beam shear and a slice shear.
  • Figure 16 illustrates an x-beam-y-slice shear, S((x, yz, (a, g)), (z, y, b)), wherein the volume represented by the dotted lines 156 is sheared to the volume position represented by the solid lines 154.
  • shear products may be restricted to products of different shears: y-beam-x-slice shear S((y, xz, (c, h)), (z, x, d)), x-beam-y-slice shear S((x, yz, (a, g)), (z, y, b)), and y-beam shear S(y, xz, (I, ⁇ ).
  • As in the case of the slice shear and beam shear approaches, it is to be appreciated that there are also six permutations of beam-slice shear sequences.
  • the product matrix of the consecutive shear matrices is preferably computed and set equal to the underlying 3D rotation matrix.
  • shear matrices for the remaining five shear sequences may be obtained in a similar manner.
  • Figure 17 illustrates a fourth method for performing an arbitrary three-dimensional (3D) rotation using 3D beam shear decompositions, according to the present invention.
  • the first pair and the last pair of 2D shears can be merged since there is a common beam in each pair.
  • x beam is a common beam of the y-slice and z-slice shears of the first pair. Therefore, the number of shears can be reduced to two by introducing a new definition of a 3D beam shear.
  • Figure 17 illustrates a 3D x-beam shear, which is equal to the concatenation of two consecutive 2D slice shears S(xz, y, (a, b)) S(xy, z, (e, f)).
  • a 3D z-beam shear represented as S(yz, x, (c, d)) S(xz, y, (a, b))
  • a 3D y-beam shear represented as S(yz, x, (c, d)) S(xy, z, (e, f)).
  • Every 3D beam shear preferably involves only one major beam.
  • the marked x beam 158 (dark shaded beam) is preferably translated to a new 3D location following the arrows.
  • the lighter shaded beam 158' indicates the intermediate position if the shear decomposition is interpreted as two consecutive 2D slice shears.
  • the three 3D beam shears may preferably be denoted as SH ,SH and
  • SH 2D S(yz,x, (c, d)) -S(xz,y, (g, h))
  • an arbitrary 3D rotation preferably involves only two major beam transformations, whereas conventional decomposition approaches require three (e.g., Hanrahan's decomposition).
  • the first pass involves only x beams and the second pass involves only z beams.
  • all voxels of a beam preferably have the same offsets.
  • since there are N^2 beams in an N^3 volume, there are only N^2 different offset values. Accordingly, the offset values for the N^2 beams can be stored at the end of the first pass, while storing the voxels to their nearest neighbor integral positions.
  • a preferred method of the present invention achieves one pass resampling of a volume.
  • the method of the present invention involves precomputing a sampled volume and then using only zero-order (i.e., nearest neighbor) interpolation in each shear pass, thereby distinguishing from known prior art methods which require global communication (e.g., Wittenbrink and Somani's permutation warping).
  • Given an original volume (source volume) and the desired rotated volume (target volume), the method of the present invention preferably first builds up a one-to-one correspondence between a source voxel and a target voxel. This one-to-one mapping is guaranteed by the multi-pass shear decomposition of the present invention because each shear is a one-to-one transformation using zero-order interpolation. The concatenation of a sequence of one-to-one mappings remains one-to-one.
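  • The one-to-one property described above can be checked numerically. The following is a minimal sketch, not the patented implementation: it assumes that a slice shear moves every voxel of a slice by the same rounded integer offset, and the mapping of the coefficients (a, b) and (e, f) onto axes is an illustrative assumption. The function names are hypothetical.

```python
from itertools import product

def slice_shear_y(voxel, a, b):
    """Nearest-neighbor y-slice shear: each y slice shifts in x and z (assumed convention)."""
    x, y, z = voxel
    return (x + round(a * y), y, z + round(b * y))

def slice_shear_z(voxel, e, f):
    """Nearest-neighbor z-slice shear: each z slice shifts in x and y (assumed convention)."""
    x, y, z = voxel
    return (x + round(e * z), y + round(f * z), z)

def beam_shear_x(voxel, a, b, e, f):
    """Concatenation of the two slice shears, applied to one voxel."""
    return slice_shear_z(slice_shear_y(voxel, a, b), e, f)

def is_one_to_one(n, a, b, e, f):
    """True if no two voxels of an n^3 volume land on the same target position."""
    source = list(product(range(n), repeat=3))
    targets = {beam_shear_x(v, a, b, e, f) for v in source}
    return len(targets) == len(source)

def beam_offsets(n, y, z, a, b, e, f):
    """Set of (dx, dy, dz) offsets applied to the voxels of the x beam at (y, z)."""
    offsets = set()
    for x in range(n):
        tx, ty, tz = beam_shear_x((x, y, z), a, b, e, f)
        offsets.add((tx - x, ty - y, tz - z))
    return offsets
```

    Because each pass adds an integer offset that does not depend on the coordinate being shifted along the beam, each pass is invertible on the integer lattice, and so is their concatenation. The second helper illustrates that all voxels of a beam share one offset.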
  • the method of the present invention preferably calculates for each source voxel its corresponding target voxel and stores it in the source voxel position. During this procedure, no global communication is required; the resampling is performed by interpolation on the local voxels.
  • the sampling position of each target voxel is preferably computed using a backward transformation of rotation.
  • the method of the present invention preferably shuffles the voxels to their destinations in the target volume. Intuitively, this would involve global communication. However, global communication is expensive to perform in a parallel implementation. Therefore, the method according to the present invention preferably uses multiple shears with a nearest neighbor placement scheme to achieve this voxel shuffling. Since shear is a regular, non-conflict transformation, each pass can be performed more efficiently than if global communication were utilized. Using the 3D beam shear decomposition method of the present invention described herein, only a minimum of two passes of regular local communication are necessary to achieve virtually the same effect as global communication.
  • the method of the present invention preferably calculates the destination coordinates using the same order as that of two consecutive 2D slice shears, but communication is preferably only performed once.
  • the y and z coordinates of each beam are preferably calculated as
  • Perspective projections present inherent challenges, particularly when performing ray casting.
  • for parallel projections, sight rays that are cast through a volume dataset maintain a constant sampling rate on the underlying volume data. It is straightforward to set this sampling rate to create an output image of the required quality.
  • for perspective projections, however, the rays do not maintain such a continuous and uniform sampling rate. Instead, the rays diverge as they traverse the volume from front to back. This creates an uneven sampling of the underlying volume, as shown in Figures 18A and 18B.
  • As shown in Figures 18A and 18B, conventional ray casting algorithms generally handle ray divergence from perspective projections by one of two methods.
  • the first method is undersampling (Figure 18A), in which rays 160 are cast, from a predefined viewpoint 162, so that the sampling rate at the front of the volume 164 is appropriate for the desired image quality.
  • the underlying volume dataset is undersampled. This may result in severe aliasing by creating "holes" in the rear of the volume 166 where regions of voxels remain unsampled.
  • the second method is oversampling (Figure 18B), in which rays
  • the rays 160 can be cast with a sampling rate between undersampling and oversampling. This results in a tradeoff between the image quality of oversampling and the rendering speed of undersampling. Many prior imaging architectures do not even attempt to perform perspective projections.
  • a ray-splitting method applies the concept of adaptive super-sampling in order to maintain a uniform ray density.
  • a ray is split into two child rays when neighboring rays diverge beyond some predetermined threshold.
  • a method was proposed which divides the viewing frustum into regions based on distance from the viewpoint, such that the ray density in each region is near the underlying volume resolution. Afterwards, such method projects each region onto sub-images and composites them into the frame buffer using texture mapping hardware.
  • the technique casts continuous rays through a region, then at specified boundaries, splits them into a new set of continuous rays. This, however, creates a potential undesired discontinuity between regions.
  • a method for performing perspective projections of uniform regular datasets termed ER-Perspective (exponential regions perspective)
  • ER-Perspective preferably adaptively samples the underlying volume, whereby the above-described problems, inherent in conventional volume rendering systems and methods, are essentially eliminated.
  • the ER-Perspective algorithm combines the desirable properties of both undersampling and oversampling, providing extremely good anti-aliasing properties associated with oversampling methods, while providing runtimes on the order of undersampling methods.
  • this algorithm preferably creates at least one sample for every visible voxel in the volume dataset.
  • ER-Perspective gains a runtime advantage over previous work by utilizing slice-order voxel access, while maintaining equal or better image quality in comparison to known perspective projection methods.
  • Figure 19 is a 2D top view illustration of the ER-Perspective algorithm, in accordance with the present invention.
  • the ER-Perspective algorithm preferably works by dividing a view frustum 168 into a plurality of regions based on exponentially increasing distances along a major projection axis (e.g., z-axis) from a predefined viewpoint 172.
  • continuous sight rays 174 are cast from the viewpoint 172 from back-to-front (or front-to-back) through the volume dataset and the rays 174 are merged (or split) once they become too close (or too far) from each other.
  • the ER-Perspective algorithm preferably uses region boundaries 170, which define the exponential regions, to mark the locations where the sight rays 174 are preferably merged.
  • Figure 20A more clearly illustrates the merging of sight rays at region boundaries 170 for contribution to baseplane pixel B, in particular.
  • an odd number of rays 174 are preferably merged such that the resulting ray 174' is essentially an exact continuation of the previous center ray, thus eliminating potential discontinuities present at the region boundaries 170.
  • This is one important advantage of the method of the present invention over known prior approaches.
  • this algorithm can be qualified by characterizing the filtering achieved when adaptively sampling the volume.
  • An example of a preferred filtering scheme is shown in Figure 20B.
  • a Bartlett window (i.e., linear interpolation, triangle filter)
  • Cascading efficient local Bartlett windows at each region boundary 170 is essentially the equivalent of resampling the rays 174 with a single large Bartlett filter for each baseplane pixel (see Figure 20A).
  • a graphical representation of the preferred filter weights 175 is shown for contribution to the baseplane pixels (e.g., pixels A, B,
  • the base sampling rate of the algorithm can be set to a predefined value according to a desired image quality.
  • the base sampling rate is the minimum ray density compared to the underlying volume resolution.
  • a sampling rate of at least one ray per voxel is preferred.
  • the algorithm has the advantage of keeping the ray density between one and two times the base sampling rate. This guarantees that no voxels are missed in the rear of the volume dataset and places an upper bound on the total amount of work performed at two times (2x) supersampling.
  • the volume dataset is projected onto a baseplane of the volume which is most perpendicular to the view direction.
  • the baseplane image is then warped onto the final image plane in a conventional manner (e.g., in the same manner as in shear-warp or the prior Cube-4 architecture).
  • the ER-Perspective method of the present invention is ideally suited for implementation on the Cube-5 architecture described above. Specifically, this algorithm preferably only requires nearest neighbor communication between processing elements. While processing a row of voxels on a one-dimensional array of processing elements, the algorithm only requires processing elements to communicate with their immediate left and right neighbors.
  • the Cube-5 rendering pipelines similarly support nearest neighbor communication.
  • the ER-Perspective algorithm of the present invention preferably employs slice-order processing along one of the three major axes. Consequently, the regions in the ER-perspective algorithm are defined as slabs of slices along a major projection axis.
  • the volume dataset is projected along slices perpendicular to the z-axis. So as not to limit the methods of the present invention to projections along the z-axis only, it is to be appreciated that the coordinate system may be flipped and the geometry rotated.
  • the algorithm proceeds, as illustrated in Figure 7, by first determining the distance along the z-axis from the viewpoint 86 to the front of the volume dataset 80 (e_z). Subsequently, a first region 92 is created to consist of as many z-slices as this distance. Each successive region after the first region 92 is preferably twice as deep as the one before it.
  • When combined with high quality supersampling, the first region is exactly as large as needed to have one ray per voxel at the end of the region when shooting one ray per pixel of the final image.
  • supersampling higher than 2 might be needed in the front of the volume to render high quality close up views.
  • the first region 176 is preferably three voxel units thick, the next region is six voxel units thick, and so on.
  • the i-th region (counting the first region as i = 0) is preferably e_z · 2^i slices thick, where e_z is the distance from the viewpoint 172 to the front of the volume (see Figure 19).
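  • The region layout described above can be sketched in a few lines. This is an illustrative sketch only: it assumes e_z is a positive whole number of voxel units, that slices are indexed from the front of the volume, and that the first region is counted as i = 0. The function name is hypothetical.

```python
def exponential_regions(e_z, num_slices):
    """Return (first_slice, end_slice) pairs; region i is e_z * 2**i slices thick.

    The last region may extend past the rear of the volume, because the rear
    slice does not necessarily coincide with a region boundary.
    """
    regions = []
    start = 0
    i = 0
    while start < num_slices:
        thickness = e_z * 2 ** i      # each region is twice as deep as the previous one
        regions.append((start, start + thickness))
        start += thickness
        i += 1
    return regions
```

    For a viewpoint three voxel units in front of a volume that is seven slices deep, `exponential_regions(3, 7)` yields a region of three slices followed by a region of six slices, the second of which encloses the rear of the volume.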
  • the efficiency of the ray casting algorithm is maximized by providing a mechanism for keeping the ray density between one and two times the underlying volume resolution in each dimension. It also creates a regular topology so that the filtering of the data can be controlled as perspective rays are cast.
  • a mechanism must preferably be provided to adjust the ray density when crossing a region boundary. Since each ray preferably starts on a voxel coordinate at the rear of a region, at the front of the region every second ray in each dimension will preferably coincide directly with a voxel coordinate. The remaining rays preferably intersect the region boundary halfway between two voxel positions.
  • a two-dimensional (2D) Bartlett filter (also known as tent or triangle filter)
  • Figure 22 shows a 2D example of how sight rays 186 travel through a 7^3 volume 192 when the viewpoint 196 is three voxel units in front of the volume (i.e., from the baseplane 198). Notice that the sampling rate remains between 7 and 14 per slice, and that it increases as the rays 186 travel through the regions from back to front.
  • the number of ray density resampling stages for an N^3 volume is limited by log_2 N, since that is the maximum number of regions in an N^3 volume.
  • the last resampling step shown on the baseplane 198 is preferably performed when the final image warp takes place.
  • the rear of the volume dataset 182 does not necessarily always coincide with a region boundary 184.
  • Since it is preferred that the rays 186 be on exact voxel coordinates 188 at all of the region boundaries 184, the rays 186 preferably originate on the grid coordinates 190 at the rear of the last region enclosing the volume dataset 192 (shaded area). Therefore, the voxel coordinates and the ray sample locations 194 may not be congruent at the rear of the volume 182. This not only provides the mentioned boundary conditions, but aids with temporal anti-aliasing when the viewpoint 196 is moved in smaller than voxel unit distances, because the rays 186 will continue to originate from the same positions relative to the voxels.
  • Figure 23 depicts a preferred method for performing ER-Perspective back-to-front projection of a volume, in accordance with one form of the present invention, although other embodiments of the ER-Perspective method are contemplated.
  • the distance from the eye or viewpoint to the baseplane is preferably determined (in voxel units).
  • using this viewpoint position, exponential region boundaries are created.
  • enough regions are preferably established to completely encompass the volume dataset.
  • the algorithm loops through each region from the back to the front, computing normal ray casting, but in a slice-order fashion, and stores the partially computed rays in a compositing buffer. Between regions (i.e., at the region boundaries), ray density re-sampling of the compositing buffer is preferably performed, as described previously.
  • the baseplane image is then warped onto the final image plane for display.
  • Since the ER-Perspective method of the present invention preferably uses regular boundaries for the filtering operations and exact ray placement within the boundaries, it is easier to compute the effective filter achieved by the cascading of local Bartlett filters. This is an important advantage of the ER-Perspective algorithm of the present invention. Additionally, the boundaries and filter of the present invention have preferably been chosen to overcome the poor image quality usually associated with conventional successive filtering of discrete data.
  • the rays 200 that are cast through a region are one voxel unit apart at the rear of the region. However, when the rays reach a region boundary 202 they are preferably filtered using local Bartlett filters.
  • the Bartlett filters (simplified to one dimension) contain the following weights for a kernel of size 2n+1, normalized so that the output has the same scalar range as the input:
  • the present invention preferably employs a two-dimensional Bartlett filter by convolving two one-dimensional Bartlett filters in the two principal directions.
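  • A Bartlett kernel of size 2n+1 and its two-dimensional counterpart can be sketched as follows. This is a minimal sketch that assumes the standard triangle window with zero-weight end points, normalized so that the weights sum to one; the function names are hypothetical, and exact fractions are used only to keep the weights readable.

```python
from fractions import Fraction

def bartlett_1d(n):
    """Triangle weights for a kernel of size 2n+1, normalized to sum to 1."""
    raw = [n - abs(k) for k in range(-n, n + 1)]    # 0, 1, ..., n, ..., 1, 0
    total = sum(raw)                                # equals n * n
    return [Fraction(w, total) for w in raw]

def bartlett_2d(n):
    """Separable 2D kernel: the outer product of two 1D Bartlett kernels."""
    row = bartlett_1d(n)
    return [[wy * wx for wx in row] for wy in row]
```

    Under this assumption, `bartlett_1d(2)` gives 0, 1/4, 1/2, 1/4, 0, so the two outermost samples of a five-sample kernel carry zero weight, and `bartlett_2d(2)` is a 5×5 kernel whose entries also sum to one.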
  • the ER-Perspective algorithm preferably always resamples the rays to have half of the original density.
  • Using a filter of size ±2 rays (n = 2) creates a filter kernel of 5×5, or just the following five weights for one dimension:
  • n and r have been omitted since they have a weight of zero in the final filter for pixel A.
  • the resampled partial rays n, o, p, q and r are preferably cast through region 1 where they are again filtered by a local Bartlett filter.
  • the normalized contribution of n, o,p, q and r to pixel A will be:
  • a similar analysis can be used to demonstrate the performance of the algorithm and the result of successive applications of the bilinear interpolation.
  • Each sample of a slice preferably contributes the same amount to the final image as any other sample in the same region (assuming all other operations on samples, such as color mapping and compositing, are equal). For example, sample e contributes to pixel A with an effective weight of 1/4 after the cascading of the local Bartlett filters. Likewise, sample i contributes to pixel B with an effective weight of 1/4. Sample f contributes to pixel A with a weight of 3/16 and to pixel B with a weight of 1/16 for a total of 1/4. This can be repeated for samples g and h.
  • the samples to the left of sample e and to the right of sample i partially contribute to pixels left of pixel A and right of pixel B, respectively, such that the sum of their contributions to the final image is also 1/4.
  • every sample that is in this region has the same weight.
  • the weight is 1/4 because this region is the second region in the volume.
  • in the first region, every sample preferably has a weight of 1/2. This can be verified by noting that there are two rays per final image pixel in this region. There are four rays per final image pixel in the second region, etc. Consequently, the weight which determines the contribution of each sample towards the final image is the ratio of image pixels to samples in this slice.
  • the total amount of computation may be analyzed by calculating the amount of work performed on each slice. Assuming that the work done on each sample is the same, the count of the number of samples processed can be used as a comparison of the workloads. For example, in the oversampling method (see Figure 18B), the number of samples on the rear slice of a volume which ends exactly on a region boundary is N^2. On the front slice, the sample count depends on the geometry of the viewpoint. In particular, using similar triangles and defining e_z as the distance from the viewpoint to the front of the volume, the number of samples taken is
  • ER-Perspective algorithm of the present invention processes the following number of samples:
  • the amount of work performed by the ER-Perspective algorithm of the present invention is bounded by Ω(N^2) and O(4N^2) per slice.
  • a conventional oversampling approach provides a lower bound on the image quality yet the runtime of the algorithm may become much greater than that of the ER-Perspective method of the present invention.
  • a conventional undersampling method provides an upper bound on the runtime for rendering, but the image quality may become much worse than the ER-Perspective approach.
  • In Figure 23, a preferred back-to-front ER-Perspective ray-casting algorithm, in accordance with the present invention, is illustrated.
  • the algorithm of Figure 23 is shown as a pseudo-code representation and assumes a z-major axis projection.
  • the ER-Perspective algorithm of the present invention does not suffer from the traditional pitfalls when performing perspective projections on uniform regular grids. This unique approach runs faster than oversampling methods and produces better quality images than undersampling methods. Employing a Bartlett filter for ray merging provides an image quality improvement over a conventional box filter.
  • the ER-Perspective algorithm is qualified by characterizing the effective filtering on the input data.
  • a method for rendering a large volume, wherein the volume dataset exceeds the physical single-pass capacity of the Cube-5 apparatus of the present invention.
  • the preferred method subdivides the volume dataset into a plurality of cuboid bricks.
  • Traversing the bricks in a predefined order preferably enables initialization of the compositing buffer of the Cube-5 apparatus with a baseplane image of a previous brick before rendering it, whereby ray path and compositing are logically extended throughout the entire volume.
  • Information regarding the boundary between bricks is preferably re-read to ensure correct sampling. Using this approach, the maximum volume size is limited only by the available intermediate baseplane storage.
  • images of equivalent quality may preferably be rendered using a level-of-detail (LOD) tree, which may be generated, for example, by combining voxels of increasing neighborhood size in a pre-processing step.
  • when perspectively rendering a single large volume utilizing LOD, preferably only a small portion of the volume, substantially close to the viewpoint, must be read in its highest detail.
  • the more distant portions of the volume, with respect to the viewpoint may then be rendered from lower resolution versions of the data.
  • the frame rate and/or dataset size is preferably increased.
  • since each region in the perspective algorithm of the present invention (previously described) will now be at a different LOD, there is no longer a need to filter the rays between regions, but merely to redistribute them.
  • the level-of-detail (LOD) method of the present invention may also be used for rendering scenes comprised of multiple objects at differing distances from the viewpoint.
  • a starting LOD is preferably selected that delivers a baseplane image of about the same size as the screen space image, thereby relating rendering time to image resolution and not to object size (i.e., scale independence).
  • the unique LOD method will be described herein in a front-to-back rendering context.
  • When rendering front-to-back, it is preferable to start with a slab of the most detailed representation of the volume to be rendered.
  • the thickness of the volume slab is chosen so that projected voxel distances in front and back of the slab differ by a factor of two, similar to perspective projections according to the present invention, as previously described herein.
  • the current compositing buffer image is preferably scaled by a factor of 0.5 in the warp unit. This initializes the compositing buffer for the rendering of the next slab of half the resolution.
  • only one slab of each LOD actually flows through the rendering pipelines; thus, for large volumes, only those slabs must be paged into the on-board 3D memory.
  • the apparatus of the present invention can also be employed to speed up off-line computations, such as generation of level-of-detail (LOD) and filtering of datasets.
  • the trilinear interpolation unit (TriLin) of the present invention preferably sets all its weights to 0.5. Once new samples become available, they are preferably subsampled and compacted into a new volume, which is the next coarser LOD.
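  • The LOD generation step described above can be sketched in software. With all trilinear weights set to 0.5, each interpolated sample is the mean of the eight voxels of a cell, and keeping every second sample in each direction compacts the result into a volume of half the resolution. This is a minimal sketch assuming a cubic volume with an even edge length stored as nested lists indexed [z][y][x]; the function name is hypothetical.

```python
def next_coarser_lod(volume):
    """Average each 2x2x2 voxel block (trilinear weights of 0.5) and subsample by two."""
    n = len(volume)                  # assumes a cubic volume with even edge length
    half = n // 2
    coarse = [[[0.0] * half for _ in range(half)] for _ in range(half)]
    for z in range(half):
        for y in range(half):
            for x in range(half):
                total = 0.0
                for dz in (0, 1):
                    for dy in (0, 1):
                        for dx in (0, 1):
                            total += volume[2 * z + dz][2 * y + dy][2 * x + dx]
                coarse[z][y][x] = total / 8.0   # mean of the eight cell corners
    return coarse
```

    Applying the function repeatedly to its own output would produce successively coarser levels, in the manner of a mip-map pyramid.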
  • the trilinear interpolation unit again uses only 0.5 weights; this time, however, data is fed back to the beginning of the rendering pipeline without compaction. Each additional pass creates a new filtered volume with a filter kernel having one more voxel extent in every major axis direction.
  • the apparatus and methods of the present invention preferably provide the flexibility to utilize a full hardware implementation, multi-pass algorithms, and/or a combination of the two, depending on the desired tradeoffs.
  • the full hardware implementations and multi-pass methods preferably provide more accurate computations in two primary functional areas: filtering and interpolation.
  • the Cube-4 architecture, a predecessor of the present invention (Cube-5), utilizes a central difference gradient filter with only two sample points to estimate each of the x, y and z gradients at a particular location.
  • a larger 3D filter can deliver a more accurate gradient estimate, such as a Sobel filter (which is a 3^3 filter with weights derived from the inverse of the Manhattan distance from the center point).
  • a straightforward hardware implementation of a 3^3 filter requires 27 multipliers and 26 adders.
  • FIG. 25 illustrates a symmetric approximation of the x-component of the Sobel gradient filter.
  • the weights are preferably applied to the nearest neighbors before summation.
  • the weights presented in Figure 25 will effectively produce the 3^3 symmetric approximation of the Sobel gradient filter (right side of Figure 25). Changing the x-weights to {1 w 1} will produce an approximation of a Gaussian filter instead.
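  • The idea of building a 3^3 filter from per-axis weights can be illustrated with a separable kernel. This is a sketch under stated assumptions, since the actual weights of Figure 25 are not reproduced here: the x-weights for the gradient are assumed to be {-1 0 1}, the smoothing weights along the other two axes are assumed to be {1 w 1}, and w = 2 is an arbitrary illustrative value. The function name is hypothetical.

```python
def separable_kernel(wx, wy, wz):
    """3x3x3 kernel, indexed [z][y][x], as the outer product of three 1D weight triples."""
    return [[[cz * cy * cx for cx in wx] for cy in wy] for cz in wz]

w = 2  # assumed center weight, for illustration only

# x-component of a Sobel-like gradient: difference along x, smoothing along y and z.
sobel_x = separable_kernel((-1, 0, 1), (1, w, 1), (1, w, 1))

# Replacing the x-weights by {1 w 1} turns the kernel into a Gaussian-like smoother.
gauss_like = separable_kernel((1, w, 1), (1, w, 1), (1, w, 1))
```

    The gradient kernel sums to zero, so a constant region yields a zero gradient, while the smoothing kernel has only positive weights with its maximum at the center.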
  • the present invention contemplates higher quality rendering modes in which no additional hardware is needed, but in which the frame rate is lowered.
  • One such example is to achieve larger neighborhood contributions to the gradient estimation by utilizing level-of-detail (LOD) information.
  • when the central difference gradient is computed on data of the next coarser LOD, it is effectively the equivalent of employing a 6×4^2 filter, with 6 being the extent in the direction of the current gradient component. Since the apparatus of the present invention (i.e., Cube-5 architecture) is able to hold mip-mapped LOD representations of the data, this filter is preferably achieved with essentially no increase in hardware, beyond the simple central difference solution.
  • Another higher quality multi-pass rendering mode provided by the present invention is an approximation of tri-cubic interpolation, which has beneficial applications in the medical field as well as other fields.
  • This mode enables more accurate resampling and iso-position calculation.
  • the present invention preferably decomposes a piecewise 4^3-voxel filter into a series of linear interpolations and extrapolations which is symmetric in every dimension, thereby allowing efficient reuse of intermediate results.
  • a preferred embodiment of the present invention supports clipping by arbitrary planes.
  • the distance from each plane may preferably be incrementally computed using only registers and one adder per plane.
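  • The incremental distance computation can be sketched as follows. This is a minimal sketch assuming the plane is given by coefficients (a, b, c, d) of a·x + b·y + c·z + d and that voxels are visited in unit steps along x; the function name and parameter layout are hypothetical.

```python
def plane_distances_along_row(plane, start, count):
    """Signed plane distances for `count` voxels, stepping +1 in x from `start`."""
    a, b, c, d = plane
    x, y, z = start
    dist = a * x + b * y + c * z + d    # full evaluation once, at the start of the row
    distances = []
    for _ in range(count):
        distances.append(dist)
        dist += a                       # one addition per voxel step, per plane
    return distances
```

    A sample can then be invalidated by testing the sign of its distance (clipping plane) or by comparing its magnitude against an offset (thick oblique slice).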
  • the apparatus of the present invention preferably supports extracting an arbitrarily thick slice from the dataset for oblique multi-planar reformatting (MPR) by invalidating all samples lying outside a predetermined offset.
  • Axis-aligned cutting planes are preferably implemented by restricting the volume traversal to the cuboid of interest.
  • the present invention contemplates restricting this traversal to exclude a simple cuboid from the volume (e.g., visualizing all but one octant of a volume).
  • the present invention further contemplates depth cueing, which modulates the color of objects to simulate, for example, atmospheric attenuation of light through a translucent medium.
  • This phenomenon is termed fog or haze when the medium also contributes some color (e.g., white or gray).
  • normally clear regions are preferably replaced with a semi-transparent color (e.g., black for depth cueing, white for fog) by modifying the transfer function.
  • Each final pixel is preferably further attenuated to account for the distance from the viewpoint to the surface of the volume, preferably implemented as a part of warping.
  • the apparatus of the present invention additionally supports rendering of super-sampled images with a preferred default super-sampling rate of two in the x and y directions, although other sampling rates are contemplated. To improve image quality further, the sampling rate along each ray can also be increased. Neither approach requires re-reading voxels from the 3D memory.
  • the apparatus of the present invention preferably changes the volume traversal order so that voxels already residing in the buffers will be read out repeatedly. Each time they are reused, new weights are preferably utilized in the trilinear interpolation units (TriLin) of the present invention to reflect the new resampling position.
  • central difference gradients are computed between neighbors one distance unit apart to ensure sufficient precision. These gradients are preferably computed by taking the difference first and interpolating afterwards or, alternatively, by interpolating first and then taking the difference between neighbors k positions apart (assuming k times oversampling), and preferably not immediate neighbors.
  • a classification stage must consider the new intersample distances when computing a new α' value. Therefore, during super-sampling, the volume will preferably be traversed in an interleaved pattern within each slice. This essentially ensures that a translucent material (gel) keeps its accumulated opacity (RGBα value) independent of the sampling rate.
  • Anisotropic datasets have different distances between samples along different axes.
  • the gradient computation and the final two-dimensional (2D) image wa ⁇ preferably require axis-dependent scaling factors.
  • the direction in which the sight rays are being cast through the volume dataset preferably requires adjustment to account for the implicit volume scaling, which occurs when storing anisotropic data in an isotropic grid.
  • the α' value is preferably adjusted according to the direction-dependent distance d which a sight ray travels through a voxel cell.
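  • One common way to adjust opacity for a changed sample distance is the exponential correction α' = 1 − (1 − α)^d. The text does not state which formula is used, so the sketch below assumes this widely used form, with d expressed relative to the sample distance for which α was defined. The function names are hypothetical.

```python
import math

def corrected_alpha(alpha, d):
    """Opacity for a step of relative length d (assumed correction, d = 1 is the reference)."""
    return 1.0 - (1.0 - alpha) ** d

def accumulate(alphas):
    """Accumulated opacity of a sequence of samples along one ray."""
    acc = 0.0
    for a in alphas:
        acc += (1.0 - acc) * a
    return acc

# Two half-length steps through the same material should accumulate
# (up to rounding) the same opacity as one full-length step.
alpha = 0.6
half_step = corrected_alpha(alpha, 0.5)
two_half_steps = accumulate([half_step, half_step])
```

    Under this assumption the accumulated opacity of a translucent material does not depend on how finely the ray is sampled, which is the property the classification stage is meant to preserve.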
  • the present invention further provides several unique methods for universal three-dimensional (3D) rendering, including mixing polygons and volumes, voxelization of polygons, rendering multiple overlapping volumes, performing texture mapping and accelerating image-based rendering. These methods are described in greater detail herein below.
  • An important aspect of the present invention is its unique ability to correctly mix geometric objects (i.e., polygons) and volumes in a single image.
  • the apparatus of the present invention (i.e., Cube-5) preferably leverages conventional geometry hardware to render opaque and translucent polygons together with the Cube-5 volume rendering pipeline.
  • all opaque polygons are first projected onto a Z-buffer coincident with a predefined baseplane and having sufficient resolution to match the volume sample distance.
  • a determination is preferably made as to which slices of the volume are in front of the polygons for each pixel of the baseplane image.
  • the compositing buffer is then preferably pre-loaded (i.e., initialized) with this projected RGBαZ (i.e., Z-buffer) image, representing the color and depth image of the polygons.
  • the volume is rendered with z-comparison enabled in the compositing buffer.
  • the depth values of the opaque polygons are checked to keep volume samples which are hidden by opaque polygons from contributing to the final image.
  • the opaque polygons occlude the volume behind, and the volume in front correctly composites over the polygons.
  • the compositing buffer is pre-loaded with the z-buffer image {C_z, Z_z}, in accordance with the preferred method of the present invention, where C_z represents the value of the geometry sample and Z_z represents the depth of the geometry sample from a predetermined viewpoint.
  • the resulting output pixel in the compositing buffer, C_out, will preferably be equal to the geometry sample value, C_z, when the volume sample is behind the geometry (i.e., when the depth of the sample, Z_s, is greater than the geometry depth, Z_z).
  • the samples are preferably composited using the Porter-Duff over operator, as appreciated by those skilled in the art.
  • Porter-Duff α compositing rules are described, for example, in the text Compositing Digital Images, by T. Porter and T. Duff, Computer Graphics (SIGGRAPH 84), vol. 18, no. 3, pp.
  • the resulting output pixel in the compositing buffer, C_out, will preferably be equal to the volume sample value, C_s, over the geometry sample value, C_z, when the volume sample is in front of the geometry (i.e., when the depth of the volume sample, Z_s, is less than the geometry depth, Z_z).
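  • The depth-tested compositing rule described above can be sketched for a single pixel. This is an illustrative sketch only: colors are assumed to be premultiplied (r, g, b, α) tuples, larger depth values are assumed to be farther from the viewpoint, and a volume sample at exactly the geometry depth is treated as being in front, which the text does not specify. The function names are hypothetical.

```python
def over(front, back):
    """Porter-Duff over operator for premultiplied (r, g, b, a) tuples."""
    front_alpha = front[3]
    return tuple(f + (1.0 - front_alpha) * b for f, b in zip(front, back))

def composite_sample(c_s, z_s, c_z, z_z):
    """Combine a volume sample (c_s, z_s) with the pre-loaded geometry (c_z, z_z)."""
    if z_s > z_z:
        return c_z              # volume sample is hidden behind the geometry
    return over(c_s, c_z)       # volume sample composites over the geometry
```

    For example, a half-transparent green sample in front of an opaque red polygon yields an equal mix of both, while the same sample behind the polygon leaves the polygon color unchanged.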
  • Translucent polygons pose a more complicated situation, since all fragments (both translucent polygon pixels and volume samples) must be drawn in topologically depth-sorted order. This is required because compositing translucent fragments with the over operator is not commutative. Therefore, polygons must be re-depth-sorted whenever the scene or viewing geometry changes. Additionally, the sorting must be topologically correct, including the handling of depth cycles.
  • the present invention adapts polygon rendering to slice order ray casting, and synchronizes the overall rendering process on a volume slice-by-slice basis, rather than a polygon-by-polygon or pixel-by-pixel basis.
  • the Cube-5 apparatus preferably utilizes the geometry pipeline and conventional graphics hardware to render geometric objects in thin slabs that are interleaved or dove-tailed between slices of volume samples 212, as illustrated in Figure 26.
  • each slice of the volume is preferably sampled in planes perpendicular to the volume storage axes.
  • the planes are drawn in depth order (e.g., using near and far clipping planes) from farthest from the eye or viewpoint 214 to nearest to the eye. Therefore, to mix translucent polygons with volumetric data, thin slabs of the polygons 210 are preferably rendered and composited in between the slices of volume samples 212. It is to be appreciated that the slabs 210 represent all of the translucent objects which lie between two consecutive slices of the volume sample planes.
  • the boundaries of the slabs are preferably defined such that the union of all rendered slabs 210 neither misses nor duplicates any region (e.g., ( ], ( ], ..., ( ], as shown in Figure 26).
  • the data from the volume slices and the translucent polygonal slabs 210 are dove-tailed together in an alternating fashion. In this manner, the correct depth ordering of all contributing entities is preserved and use of the over operator to composite them creates correct colors in the final image pixels.
  • the opaque polygons are drawn first with Z-buffering.
  • the translucent polygons which lie behind the volume extent are preferably drawn over the opaque polygons using any conventional translucent polygon rendering algorithm (e.g., the painter's algorithm).
  • translucent polygons which lie in front of the volume are preferably drawn after the mixing portion of the algorithm.
  • Polygons which lie depth-wise within the volume boundary, but to the top/bottom/side of the volume, are preferably drawn in slice order as if the volume slices were planes that extend to infinity cutting the translucent polygons.
  • OpenGL may be used to directly render the thin slabs of translucent polygonal objects.
  • the polygons are preferably shaded using the Gouraud shading model included in OpenGL.
  • a naive approach would be to render the complete set of translucent polygons for every slab and set the hither and yon clipping planes to cut the current thin slab of data.
  • for an n^3 volume, there could be up to n thin slabs that must be rendered.
  • the present invention contemplates an alternative approach which would involve clipping the polygons to the slab boundaries and only rendering the portions of the polygons within each slab. This would substantially reduce the processing load on the polygon pipeline. However, it would require the application to clip every polygon against the two planes of each thin slab which contains that polygon.
  • the present invention may take advantage of the fact that the two clipping planes 216, 218 are parallel to keep only the portions of the polygons which lie between the planes. While this creates fewer polygons than clipping against each plane separately, it still can increase the triangle count dramatically.
  • the first case occurs when a triangle 220 intersects the thin slab, but no vertices are within the slab boundaries 216, 218. When this occurs, one vertex must be on one side of the slab and the other two vertices on the other side of the slab, thus creating a trapezoid which is decomposed into two triangles.
  • a triangle 222 intersects the slab such that the remaining two vertices lie on the same side of the current slab, creating only one triangle.
  • a triangle 224 intersects the slab such that the remaining two vertices lie on opposite sides of the current slab.
  • a bucket sorting method is applied to the translucent polygons.
  • the present invention preferably creates a bucket for each thin slab between two volume sample planes. All of the translucent polygons in a scene are preferably traversed and each of the polygons is placed in a bucket for each of the slabs it intersects. For example, as shown in Figure 28, triangle T1 is placed in all six buckets since it spans all six slabs S1-S6. Triangle T2 is placed in buckets corresponding to slabs S2 and S3, and likewise for the remaining triangles.
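The bucketing step can be sketched as below. This is an illustrative sketch only: the function name, the dictionary-of-vertices input, and the uniform slab spacing along Z are assumptions, and each triangle is conservatively binned by the Z extent of its vertices.

```python
import math


def bucket_triangles(triangles, z_near, slab_thickness, num_slabs):
    """Place each triangle name in the bucket of every slab its Z extent overlaps.

    triangles: {name: [(x, y, z), (x, y, z), (x, y, z)]}
    Slab i is assumed to cover depths from z_near + i * slab_thickness
    up to z_near + (i + 1) * slab_thickness.
    """
    buckets = [[] for _ in range(num_slabs)]
    for name, verts in triangles.items():
        z_min = min(v[2] for v in verts)
        z_max = max(v[2] for v in verts)
        first = max(0, math.floor((z_min - z_near) / slab_thickness))
        last = min(num_slabs - 1, math.floor((z_max - z_near) / slab_thickness))
        # Triangles wholly outside the slab range produce an empty range here.
        for slab in range(first, last + 1):
            buckets[slab].append(name)
    return buckets
```

With six unit-thick slabs, a triangle spanning depths 0.2 to 5.8 lands in all six buckets, and one spanning 1.2 to 2.7 lands only in the second and third, mirroring the T1 and T2 example above.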
  • An alternative to bucketing is to create an active triangle list similar to the active edge list utilized in scan converting polygons.
  • the triangles may be placed in the active list at the first slice they intersect and removed from the list when they no longer intersect any slices.
  • a data structure is preferably pre-computed which indicates the slice at which each triangle is first encountered. This preprocessing is essentially the same as for bucketing, with the exception that bucketing does not have to check for triangle removal for each slice.
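A rough sketch of the active triangle list follows, assuming the first and last slab index of each triangle have already been pre-computed. The function name and the `spans` structure are hypothetical.

```python
def slabs_with_active_list(spans, num_slabs):
    """Return, for each slab, the triangles active in it.

    spans: {name: (first_slab, last_slab)} with inclusive slab indices,
    assumed pre-computed and within range(num_slabs).
    """
    # Pre-computed structure: which triangles are first encountered at each slab.
    starts = [[] for _ in range(num_slabs)]
    for name, (first, _last) in spans.items():
        starts[first].append(name)

    active, per_slab = [], []
    for slab in range(num_slabs):
        active.extend(starts[slab])          # insert newly encountered triangles
        per_slab.append(list(active))        # these would be rendered for this slab
        # Removal check: drop triangles that do not reach the next slab.
        active = [n for n in active if spans[n][1] > slab]
    return per_slab
```

The per-slab removal check in the loop is the extra work that, per the text, bucketing avoids.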
  • One advantage of the method of the present invention is that for applications which choose to trade off image quality in order to maintain a predetermined frame rate, the number of polygons drawn decreases as the number of slices drawn for the volume decreases. This occurs because the interslice size increases as the number of volume slices decreases.
  • the rendering rate achieved is substantially proportional to the number of polygons drawn and the number of volume samples drawn (which is proportional to the number of volume slices drawn). The image quality degradation resulting from this tradeoff affects only the volume data, similar to taking fewer samples in any volume rendering algorithm.
  • the polygons could be drawn in regular processing order with the over operator. While this method may produce the incorrect color, the amount of color error is limited because the polygons are still sorted by bucketing them into thin slabs.
  • Another method for handling two or more translucent polygons is to draw thin slabs of translucent polygons between two volume sample slices as on-the-fly voxelization.
  • in conventional voxelization methods, when a surface is 3D scan converted into a 3D volume grid, the resolution of the grid is commonly chosen such that the size of a single voxel represents the smallest area that can be discerned by the human eye when it is rendered.
  • the polygons are drawn to screen resolution.
  • in the Z dimension, it is assumed that the volume is being rendered with enough slices such that each volume sample also represents the smallest area that can be discerned by the human eye. Therefore, each pixel bounded by two volume slices in the Z dimension also represents this small area.
  • a method performed in accordance with one embodiment of the present invention may be viewed as computing on-the-fly voxelization by utilizing 3D graphics hardware.
  • Voxelization methods combine polygons into a single voxel by using one of two preferred methods.
  • the first method is to take the max of each color channel.
  • the second method is to take the weighted-max as
  • C_p1 is the color of a first polygon (polygon 1)
  • D_p1 is the density of polygon 1
  • C_v is the color assigned to the voxel.
  • Many OpenGL implementations allow max blending with glBlendEquationEXT(GL_MAX_EXT). Assuming that the density is equal to the alpha value (e.g., linear ramp transfer function for volume rendering), then the colors may preferably be weighted by their alpha values before blending by using glBlendFunc(GL_SRC_ALPHA, GL_ONE).
  • the third method of drawing two or more translucent polygons to the same pixel within one thin slab may also be considered the most accurate approach.
  • depth sorting such as BSP tree
  • proper ordering of all translucent polygons within each slab is maintained.
  • Depth cycles are preferably handled by the BSP algorithm by splitting polygons which span a plane used in the partitioning, and eventually one of the polygons in the cycle is used as the partitioning plane.
  • an important feature of the present Cube-5 invention is the unique ability to couple at least one geometry pipeline or engine to the Cube-5 system.
  • two preferred methods of connecting one or more geometry pipelines to the claimed Cube-5 system on PC-class machines are provided, as described herein below. Both methods allow the unique mixing of opaque and/or translucent polygons with volumetric data.
  • the opaque polygons are preferably rendered such that, after projection through the volume dataset, warping creates the correct footprint on the final image.
  • the Z-depth values are preferably aligned along the processing axis, so that a volume slice index may be used for the Z-depth check.
  • a preferred method begins by determining a major viewing axis for the current viewing direction. As illustrated in Figure 29, a transformation is preferably applied to the geometry 228 so that the major viewing axis 230 is along, for example, the Z-axis. Next, the view or eye point 232 is moved to be along this direction, preferably by rotating the vector between the look-at point 234 and the eye point 232 by a predefined angle α around the X-axis and an angle β around the Y-axis. Preferably, α and β are always in a range between -45 and +45 degrees, otherwise a different baseplane would be chosen.
  • a Z-slice shear transformation along X and Y (also known as an "X and Y according to Z" shear) is preferably subsequently applied to the viewing matrix as follows: 1 0 tan α 0
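For orientation, a shear of X and Y according to Z has the general homogeneous form shown below. This is a sketch of the standard matrix shape rather than a reproduction of the source matrix: s_x and s_y are placeholder names for the two shear factors, which the text indicates are tangents of the viewing angles.

```latex
S =
\begin{bmatrix}
1 & 0 & s_x & 0 \\
0 & 1 & s_y & 0 \\
0 & 0 & 1   & 0 \\
0 & 0 & 0   & 1
\end{bmatrix},
\qquad
S \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
=
\begin{bmatrix} x + s_x z \\ y + s_y z \\ z \\ 1 \end{bmatrix}
```

Each point is displaced in X and Y in proportion to its Z coordinate while Z itself is unchanged, which is why depth along the processing axis is preserved.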
  • the polygon footprints are "prewarped" so that the warping operation at the end of Cube-5 rendering creates correct polygons in the final image.
  • the Z-depths computed are preferably proportional to the distances along the processing axis. It is possible (e.g., if all opaque geometry fits within the volume extents) to set the hither and yon clipping planes to the edges of the volume and, if the precision of the depth buffer is the same, the depths computed are exactly the volume slice indexes for depth checking. Otherwise, a simple scaling must be applied when the computed depths are utilized by the volume rendering system. Light positions should be considered when using this method, however, as the shearing may not move the lights to the correct location.
  • the thin slices of translucent polygons preferably align geometrically with their 3D positions in space.
  • the eye point is first aligned as previously described.
  • the Cube-5 volume rendering pipeline 236 of the present invention preferably utilizes a tightly coupled on-chip SRAM buffer 238, termed a compositing buffer, to hold the partially composited rays as a volume is processed in slice order.
  • This architecture exploits the regular processing sequence inherent in slice order rendering. Specifically, each slice of the volume 240 is preferably processed in the same order as the previous, left-most voxel to right-most voxel of each row, and bottom-most row to top-most row of each slice (possibly with some skewing).
  • the SRAM compositing buffer 238 becomes a simple FIFO queue having a length equal to the size of a slice.
  • the SRAM queue is preferably 32 bits wide to hold 8-bit fixed point RGBα values (called coxels).
  • Each pipeline 236 preferably reads a coxel from the front of the queue and writes a coxel to the rear of the queue for each clock cycle.
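The FIFO behaviour can be sketched in software as follows. This is a simplified model under stated assumptions: one coxel per sample, no skewing, and a caller-supplied `blend` function standing in for the compositing operator. The function name is hypothetical.

```python
from collections import deque


def composite_slice(fifo, slice_samples, blend):
    """Process one slice: pop a coxel from the front, push the result to the rear.

    fifo: deque holding one coxel per sample position of a slice.
    slice_samples: samples of the current slice in the fixed processing order.
    blend: function(sample, coxel) -> new coxel.
    """
    if len(fifo) != len(slice_samples):
        raise ValueError("FIFO length must equal the slice size")
    for sample in slice_samples:
        coxel = fifo.popleft()              # read from the front of the queue
        fifo.append(blend(sample, coxel))   # write to the rear of the queue
    return fifo
```

Because every slice is traversed in the same order, the queue is realigned with the first sample position after each full slice, so no addressing logic is needed.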
  • conventional PC-class geometry pipelines 242 utilize an external DRAM frame buffer 244, which stores the RGBα color values and Z-depth values for each pixel.
  • This frame buffer 244 must support random access, since polygon rendering does not enjoy the regular access ordering inherent in slice-order volume rendering. Normal polygon rendering produces triangles that cover, on average, between 10 and 50 pixels on the screen. Therefore, the DRAM memory is organized to maximize access to areas of the screen of this size.
  • volume slices 246 perpendicular to the screen are texture mapped through the volume.
  • the per-vertex geometry calculations for the volume slices 246 are easily achievable with graphics hardware of any level.
  • the requirement to support random access to both the texture memory 248 and frame buffer 244 limits the performance of this approach to the fill rate achievable with a DRAM frame buffer.
  • Very high end surface graphics systems typically utilize massive parallelism in the fragment processing section 250 of the polygon pipeline. This, coupled with a highly distributed frame buffer, allows increased fill rate performance.
  • in Figure 32, there is shown one embodiment for connecting a geometry pipeline 242 to the Cube-5 volume rendering system 252, according to the present invention.
  • the SRAM compositing buffer is preferably removed from inside the Cube-5 pipeline 252 and replaced with an external DRAM frame buffer 254.
  • the memory in the frame buffer of the present invention is preferably organized so that it is specifically optimized for volume rendering.
  • the frame buffer 254 is also preferably accessible from a 3D graphics pipeline 242 to allow mixing of polygonal data 256 with volumes.
  • the dual use frame buffer 254 preferably connects the two pipelines 242, 252.
  • the geometry pipeline 242 first renders all opaque polygons with Z-depth.
  • the volume slices, stored in volume memory 258, and thin slabs of translucent polygons are then rendered in an alternating (e.g., dovetailing) fashion - volume slices by the Cube-5 pipeline 252 and translucent polygons by the graphics pipeline 242 (opaque polygons may also be rendered with the same dovetailing algorithm, but with increased demand on the graphics pipeline).
  • Z-depth checking is preferably utilized to ensure correct hidden object removal, and blending is set in both pipelines to correctly composite the samples and fragments.
  • the geometry engine 242 preferably performs the final baseplane warp required by the Cube-5 system of the present invention.
  • the design of the DRAM buffer 254 is critical to achieve, for example, the 503 million samples per second required for 30 Hz rendering of 256^3 volume datasets.
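The sample rate quoted above can be checked with a short calculation. The sketch assumes one sample per voxel per frame, and the function name is hypothetical.

```python
def samples_per_second(n, frame_rate_hz):
    """Samples needed per second to render an n^3 volume at the given frame rate."""
    return n ** 3 * frame_rate_hz


total = samples_per_second(256, 30)   # 503_316_480, i.e. about 503 million
per_pipeline = total / 4              # share of each pipeline if four are used
```

Divided across four pipelines, this comes to roughly 125.8 million samples per second each.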
  • the volume rendering system of the present invention is preferably comprised of multiple Cube-5 pipelines.
  • a coxel compound consisting of RGBα
  • the new coxel is then placed at the rear of the FIFO.
  • the structure of a coxel is changed to contain 32 bits of color (8 bits for each of R, G, B and α) and 32 bits of Z-depth information (24-bit depth plus 8-bit stencil). This configuration is required to handle Z-depth checking in the compositing stage. Assuming that opaque polygon rendering is completed before any volume rendering begins, the 32 bits of Z-depth/stencil information is read, but not re-written. Therefore, for every clock cycle, each Cube-5 pipeline needs to read 8 bytes of coxel data and write back 4 bytes.
  • the rendering pipeline of the present invention utilizes memory chips with a word size of 16 bits. Using this configuration, four words must be read by each pipeline every cycle and two words must be written. To do this would require six 16-bit memory interfaces per pipeline.
  • DDR SDRAMs double data rate
  • the present invention can utilize two 16-bit memory interfaces for reading 64 bits of data per clock and one 16-bit memory interface for writing 32 bits per clock, for a total of three 16-bit memory interfaces per pipeline.
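The interface counts follow from the per-clock traffic (8 bytes read, 4 bytes written) and the 16-bit word size. A small sketch, with a hypothetical helper name, compares single and double data rate memory:

```python
def interfaces_needed(bytes_per_clock, interface_bits=16, transfers_per_clock=1):
    """Number of memory interfaces needed to move bytes_per_clock every clock."""
    bits = bytes_per_clock * 8
    per_interface = interface_bits * transfers_per_clock
    return -(-bits // per_interface)   # ceiling division


# Single data rate: 4 read + 2 write = 6 interfaces per pipeline.
sdr_total = interfaces_needed(8) + interfaces_needed(4)
# Double data rate (two transfers per clock): 2 read + 1 write = 3 interfaces.
ddr_total = (interfaces_needed(8, transfers_per_clock=2)
             + interfaces_needed(4, transfers_per_clock=2))
```

Doubling the transfers per clock halves the interface count, which is the saving the text attributes to DDR SDRAM.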
  • the present invention preferably reads from one set of frame buffer chips (e.g., set A) 260 and writes to another set
  • the Cube-5 system contemplates reading from set A 260 and writing to set B 262 for a complete slice of the volume, and then swapping for the next slice. With this approach, however, each frame buffer chip set would have to be large enough to hold the complete frame buffer. Furthermore, the polygon engine would have to be instructed as to which set is the current set. Therefore, in a preferred embodiment, the present invention alternates reading and writing between set A 260 and set B 262 within a frame and buffers the processed coxels from the read set until it becomes the write set.
  • each burst actually lasts four clock cycles and reads/writes four coxels (i.e., eight words) with 16-bit DDR DRAM chips.
  • the Cube-5 system preferably cycles through all 4 banks to keep the memory bandwidth saturated before writing the new RGBα values back. For this reason, there is preferably a 16-coxel FIFO queue 264 (four coxels for each of four banks) that the newly composited RGBα portions of the coxels are stored in.
  • each pipeline contains one read interface 266 to the Z-depth/stencil portion 268 of the frame buffer and two read/write interfaces 270 and 272 to set A 260 and set B 262, respectively, of the RGBα portion of the frame buffer.
  • each of the four pipelines processes 125 million voxels per second. Therefore, a 133 MHz clock is utilized for the chip and the SDRAM.
  • the mapping of the frame buffer pixels onto the memory chips is critical to the performance.
  • Figure 34 shows a preferred layout of the RGBα portion of the coxels in the frame buffer.
  • a group of pixels which reside in set A 276 followed by a group of pixels which reside in set B 278, repeated across the entire scanline 274.
  • the length of each set is 64 pixels due to the fact that each set must contain pixels which are read from four different banks inside each chip, and each bank consists of four RGBα values from four parallel chips/pipelines.
  • the pixel data in the frame buffer is interleaved across eight chips. In fine detail, it is really interleaved across only four chips. This provides an interface which reads
  • the frame buffer sub-system is capable of writing
  • This preferred connection approach keeps both the graphics pipeline and the volume rendering pipeline working at all times and merges the data in the SRAM compositing buffer inside the Cube-5 chip.
  • the volume rendering pipeline is compositing the current volume slice with the previous thin slab of polygon data over the compositing buffer and the graphics pipeline is rendering the next thin slab of translucent polygons.
  • the method described herein still utilizes the unique approach of dovetailing volume slices and thin slabs of translucent polygonal data, as previously described herein above.
  • in a first step, all opaque polygons are projected onto a Z-buffer coincident with the baseplane (e.g., the volume face most parallel to the screen).
  • the projected RGBαZ image is loaded into the compositing buffer of the volume rendering pipeline.
  • the volume is rendered with a Z-comparison enabled in the compositing stage.
  • the thin slabs of translucent polygons are preferably rendered by the geometry pipeline, and their corresponding RGBα data is sent to the volume pipeline of the present invention to be blended into the SRAM compositing buffer within the volume pipeline.
  • the compositing stage of the volume rendering accelerator is modified to composite two layers (one volume and one translucent polygon) per step, thus not delaying the volume rendering process.
  • This requires the addition of some extra logic.
  • the straightforward formula for performing a double composition of a volume sample v over a translucent pixel fragment p over the old coxel c would require four additions and four multiplies in five stages:
  • $C_c = C_v \alpha_v + [C_p \alpha_p + C_c (1 - \alpha_p)](1 - \alpha_v)$
  • $C_c = (C_c + (C_p - C_c)\alpha_p) + [C_v - (C_c + (C_p - C_c)\alpha_p)]\alpha_v$
  • the hardware designer would choose the option more desirable for a given implementation (i.e., less logic and more stages, or fewer stages and more logic).
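The two formulations of the double composition can be compared numerically. The sketch below works on a single color channel and uses hypothetical function names. `c_c`, `c_p`, `c_v` are the old coxel, polygon fragment and volume sample colors, and `a_p`, `a_v` are the fragment and sample opacities.

```python
def double_over_direct(c_c, c_p, a_p, c_v, a_v):
    """Volume sample over polygon fragment over old coxel, written out directly
    (four multiplies)."""
    return c_v * a_v + (c_p * a_p + c_c * (1.0 - a_p)) * (1.0 - a_v)


def double_over_lerp(c_c, c_p, a_p, c_v, a_v):
    """Same result as two nested linear interpolations (two multiplies)."""
    inner = c_c + (c_p - c_c) * a_p       # polygon fragment over old coxel
    return inner + (c_v - inner) * a_v    # volume sample over that result
```

For `c_c = 0.2`, `c_p = 0.8`, `a_p = 0.5`, `c_v = 1.0`, `a_v = 0.25`, both forms give 0.625. The interpolation form trades multipliers for a longer dependency chain, which is the logic-versus-stages choice left to the hardware designer.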
  • the present invention preferably uses run-length encoding (RLE) of the blank pixels.
  • each scanline is encoded separately and a "run-of-zeros" is encoded as four zeros (RGBα) followed by the length of the run. Since typically only a small percentage of the polygons in a scene are translucent, the translucent polygon slabs will be relatively sparse. Run-length encoding just the blank pixels in these thin slabs results in over 99% reduction in the required bandwidth.
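The encoding can be sketched as follows. This is an illustrative model, not the hardware format: pixels are 4-tuples (r, g, b, a), a run-of-zeros is represented as the 5-tuple (0, 0, 0, 0, length), a pixel counts as blank only when all four channels are zero, and the function names are hypothetical.

```python
BLANK = (0, 0, 0, 0)


def rle_encode_scanline(pixels):
    """Encode one scanline, run-length encoding only the blank pixels."""
    flits, run = [], 0
    for px in pixels:
        if px == BLANK:
            run += 1
            continue
        if run:
            flits.append(BLANK + (run,))   # four zeros followed by the run length
            run = 0
        flits.append(px)                   # non-blank pixels are stored as-is
    if run:
        flits.append(BLANK + (run,))
    return flits


def rle_decode_scanline(flits):
    """Expand an encoded scanline back into a list of pixels."""
    pixels = []
    for flit in flits:
        if len(flit) == 5:
            pixels.extend([BLANK] * flit[4])
        else:
            pixels.append(flit)
    return pixels
```

A scanline with a single translucent fragment therefore collapses to at most three entries regardless of its width, which is where the bandwidth saving on sparse slabs comes from.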
  • the method of the present invention utilizes RLE on 2D images of sparse translucent polygons to save on bandwidth.
  • Using this preferred method requires adding hardware to the Cube-5 system of the present invention. Specifically, additional hardware may be included in the volume rendering pipeline that can decode the RLE input stream and create RGBα fragments. However, since these fragments are utilized by the volume pipeline in a regular order, it is preferable to decode the input stream using a double buffer to synchronize the two pipelines. Every clock cycle, a value is output from the decoding hardware. If the volume rendering machine has multiple pipelines (as most current designs do) the decoding hardware is preferably replicated for each pipeline so that they can keep up with pixel demand.
  • RLE hardware at the originating end connected to the geometry pipeline may encode the data in real-time before sending it to the volume pipeline.
  • a separate frame buffer is preferably employed which stores the data directly in RLE format. Since the thin slabs of translucent data are very sparse, more time is spent clearing and reading than is spent rasterizing.
  • An RLE buffer, while generally not optimal for rasterization, is well suited for both clearing and reading the data. For example, to clear an RLE frame buffer requires merely storing a single run of zeros (in five bytes) for each scanline, instead of writing an entire 256^2 frame buffer.
  • the RLE frame buffer is preferably implemented using the emerging technology of embedded DRAM and connecting it in parallel to the normal frame buffer. This differs from conventional encoding algorithms which typically assume that the data was given in physical order. Triangle rasterization, however, does not guarantee any ordering of the fragments. Therefore, the apparatus of the present invention must be able to randomly insert an RGBα value into an RLE scanline of data.
  • Figure 35 illustrates a diagram of an RLE insert, formed in accordance with the present invention.
  • the encoded scanline is copied from one buffer to another, inserting the new RGBα value.
  • a single flit (i.e., either an RGBα pixel or run-of-zeros)
  • the entire scanline is preferably processed flit by flit.
  • an input buffer ("in Buffer") 280 holds the current encoded scanline
  • an output buffer ("out Buffer") 282 holds the newly encoded scanline with the new RGBα fragment inserted.
  • the choice of what to insert at each cycle is preferably performed by a 5-byte multiplexor 284.
  • the apparatus of the present invention preferably includes pointers, namely "inPtr" 286 and "outPtr" 288, which point to the current flit of both the in buffer 280 and out buffer 282, respectively.
  • the logic on the right side of Figure 35 calculates how much has been processed ("Total") 290 and two of the control points ("ctrl_1" and "ctrl_3").
  • the other mux control point ("ctrl_2") is calculated by 'OR'-ing together all of the RGBα values (the flag for run-of-zeros).
  • "xPos" is defined as the x position of the fragment.
  • a lookup table is implemented of where the current buffer is located in memory for each y value.
  • the buffer can be moved while inserting new pixels and the table is simply updated.
  • This preferred method is illustrated in the RLE_AddFragment pseudo-code routine of Figure 36.
  • the RLE_AddPixelToScanline function demonstrates the processing that occurs in the hardware embodiment of the present invention shown in Figure 35.
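Since the pseudo-code of Figure 36 is not reproduced here, the following is only a rough software analogue of the flit-by-flit insert, not the routine of the figure. It assumes pixels are 4-tuples (r, g, b, a), a run-of-zeros is the 5-tuple (0, 0, 0, 0, length), the inserted fragment is non-blank, and a fragment landing on an existing pixel simply replaces it. The function name `rle_insert` is hypothetical.

```python
def rle_insert(in_buffer, x_pos, rgba):
    """Copy an encoded scanline to a new buffer, inserting rgba at pixel x_pos."""
    out_buffer = []
    total = 0                                   # pixels processed so far
    for flit in in_buffer:
        is_run = len(flit) == 5
        length = flit[4] if is_run else 1
        if not (total <= x_pos < total + length):
            out_buffer.append(flit)             # fragment is elsewhere: plain copy
        elif not is_run:
            out_buffer.append(rgba)             # overwrite an existing pixel
        else:
            before = x_pos - total              # split the run around the fragment
            after = length - before - 1
            if before:
                out_buffer.append((0, 0, 0, 0, before))
            out_buffer.append(rgba)
            if after:
                out_buffer.append((0, 0, 0, 0, after))
        total += length
    return out_buffer
```

Inserting into the middle of a run is the only case that grows the scanline, by at most two flits, which is why the scanline is copied between buffers rather than patched in place.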
  • the present invention takes advantage of the extremely high bandwidth available when processing occurs on the memory chip.
  • the processing is simple enough to be implemented in the DRAM manufacturing process. For example, for a 1280 x 1024 frame buffer, the maximum amount of memory required is 50 Mbits. This fits onto eDRAM dies with room for over 3 million gates for the encoding hardware.
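One way to arrive at the 50 Mbit figure is to assume a worst case in which every pixel occupies its own five-byte flit. That assumption is not stated in the text; it is used here only because it reproduces the quoted number. The function name is hypothetical.

```python
def worst_case_rle_bits(width, height, bytes_per_flit=5):
    """Upper bound on RLE frame buffer size, assuming one flit per pixel."""
    return width * height * bytes_per_flit * 8


bits = worst_case_rle_bits(1280, 1024)   # 52_428_800 bits
mbits = bits / 2 ** 20                   # 50.0, taking 1 Mbit as 2**20 bits
```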
  • Figure 37 is a preferred block diagram illustrating how a polygon pipeline 242 and volume pipeline 252 are connected through the RLE frame buffer 292, which is preferably double-buffered to allow rendering during transmission of data.
  • the auxiliary frame buffer is preferably connected at the same place as the existing one by simply duplicating the fragments, thus not affecting the remainder of the geometry pipeline 242.
  • the volume pipeline 252 also preferably utilizes double buffering to allow receiving of data while blending the previous slab. It is to be appreciated that, using the system of the present invention, volume rendering does not conflict with polygon rendering. Since the volume pipeline 252 always accesses its memory in a repeatable ordered fashion, it sustains a sample fill rate into the frame buffer sufficient to achieve 30 Hz volume rendering.
  • the system of the present invention utilizes the graphics pipeline 242 to render the opaque polygons before rendering the volume, stored in volume memory 258. This can normally be accomplished concurrently with the rendering of the volume for the previous frame. Even if the polygon engine must render translucent polygons mixed in with the volume, there is usually enough time to render the opaque polygons before the volume finishes due to the small number of translucent polygons in normal scenes.
  • a method is provided to incrementally voxelize triangles into a volumetric dataset with pre-filtering, thereby generating an accurate multivalued voxelization.
  • Multivalued voxelization allows direct volume rendering with intermixed geometry, accurate multiresolution representations, and efficient antialiasing.
  • Prior voxelization methods either computed only a binary voxelization or inefficiently computed a multivalued voxelization.
  • the method, in accordance with the present invention, preferably develops incremental equations to quickly determine which filter function to compute for each voxel value. This preferred method, which is described in greater detail herein below, requires eight additions per voxel of the triangle bounding box.
  • the present invention preferably employs pre-filtering, in which scalar-valued voxels are used to represent the percentage of spatial occupancy of a voxel, an extension of the two-dimensional line anti-aliasing method conventionally known ("Filtering Edges for Grayscale Displays," by S. Gupta and R. F.
  • the optimal volume sampling filter for central difference gradient estimation is a one-dimensional oriented box filter perpendicular to the surface.
  • the method of the present invention preferably utilizes this filter which is a simple linear function of the distance from the triangle.
  • edge functions are linear expressions that maintain a distance from an edge by efficient incremental arithmetic.
  • the methods of the present invention extend this concept into three dimensions and apply antialiasing during the scan conversion of volumetric triangles.
  • the general idea of the triangle voxelization method of the present invention is to voxelize a triangle by scanning a bounding box of the triangle in raster order. For each voxel in the bounding box, a filter equation is preferably evaluated and the result is stored in memory. The value of the equation is a linear function of the distance from the triangle. The result is preferably stored using a fuzzy algebraic union operator, namely, the max operator.
  • in Figure 38, there is shown a density profile of an oriented box filter along a line 294 from the center of a solid primitive 296 outward, perpendicular to the surface 298.
  • the width of the filter is defined as W.
  • the inclusion of a voxel in the fuzzy set varies between zero and one, inclusive, determined by the value of the oriented box filter.
  • the surface 298 of the primitive 296 is assumed to lie on the 0.5 density isosurface. Therefore, when voxelizing a solid primitive 296, as in Figure 38, the density profile varies from one inside the primitive to zero outside the primitive, and varies smoothly at the edge.
  • the density is preferably one on the surface and drops off linearly to zero at distance W from the surface.
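The two density profiles can be sketched as simple functions of distance. The surface profile follows the description above directly. The solid profile assumes the linear ramp of width W is centered on the surface, so that the surface sits on the 0.5 isosurface; that centering is an assumption of this sketch. Both function names are hypothetical.

```python
def surface_density(distance, width):
    """Density of a thin surface: 1 on the surface, falling linearly to 0 at width."""
    return max(0.0, 1.0 - abs(distance) / width)


def solid_density(signed_distance, width):
    """Density of a solid: 1 inside, 0 outside, 0.5 on the surface.

    signed_distance is negative inside the primitive and positive outside.
    """
    return min(1.0, max(0.0, 0.5 - signed_distance / width))
```

For a filter width of 2, a voxel one unit away from a surface receives density 0.5, and a voxel lying exactly on the boundary of a solid also receives 0.5.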
  • although the present invention similarly contemplates the voxelization of solids, the voxelization of surfaces will be described herein.
  • the optimum value for filter width W is voxel units (see, e.g., Object Voxelization by Filtering, by M. Sramek and A. Kaufman, 1998 Volume Visualization Symposium, pp. 111-118, IEEE, Oct. 1998).
  • the normal is preferably estimated by computing the central difference gradient at the 0.5 isosurface. Because the overall width of the central difference filter is at most 2√3 units, a correct gradient is found on the 0.5 density isosurface.
  • the thickness of the triangle 300 may be defined as T. Normally, T can be zero, unless thick surfaces are desired.
  • the first step of the present method preferably determines a tight bound for the triangle 300, then inflates it in all directions by S voxel units and rounds outward to the nearest voxels.
  • C_1, C_2 and C_3 may be divided into seven regions (e.g., R1 through R7) which must be treated separately.
  • each candidate voxel is tested for inclusion within the seven regions, then filtered with a different equation for each region.
  • the value of the oriented box filter is simply proportional to the distance from the plane of the triangle.
  • the value of the filter is preferably proportional to the distance from the edge of the triangle.
  • the value of the filter is preferably proportional to the distance from the corner of the triangle.
  • the regions R1-R7 are preferably distinguished by their distance from seven planes.
  • the first plane a is preferably coplanar with the triangle and its normal vector a points outward from the page.
  • the next three planes b, c, and d preferably have normal vectors b, c, and d respectively and pass through the corner vertices C_1, C_2, and C_3 of the triangle, respectively.
  • the final three planes e, f, and g are preferably perpendicular to the triangle and parallel to the edges; their respective normal vectors, e, f, and g, lie in the plane of the triangle and point inward so that a positive distance from all three planes defines region R1.
  • All of the plane coefficients are normalized so that the length of the normal is one, except for normal vectors b, c, and d which are normalized so that their length is equal to the inverse of their respective edge lengths. In that manner, the computed distance from the plane varies from zero to one along the valid length of the edge.
  • the distance of any point from the surface can be computed using the plane equation coefficients:
  • $\mathrm{Dist}' = \mathrm{Dist} + \mathbf{r} \cdot [A, B, C]$
  • the Y-step is more complicated than the X-step because it not only steps one unit in the Y direction, but it also steps back multiple units in the X direction.
  • the Z-step combines stepping back in both the X and Y directions and stepping forward one unit in the Z direction. This simple pre-processing step ensures efficient stepping throughout the entire volume. If numerical approximation issues arise, then it is possible to store the distance value at the start of each inner loop and restore it at the end, thereby minimizing numerical creep due to roundoff error in the inner loops.
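The incremental stepping can be sketched for a single plane as follows. This is a minimal illustration under assumptions: unit voxel spacing, a raster scan with X innermost, and a plane given by coefficients (A, B, C, D) so that the distance is A*x + B*y + C*z + D. The function name is hypothetical.

```python
def scan_distances(plane, origin, dims):
    """Distances from a plane for every voxel of a bounding box, in raster order,
    using one addition per voxel instead of re-evaluating the plane equation."""
    a, b, c, d = plane
    x0, y0, z0 = origin
    nx, ny, nz = dims

    # Pre-computed steps: the Y-step also undoes nx X-steps, and the
    # Z-step also undoes ny Y-steps.
    x_step = a
    y_step = b - nx * a
    z_step = c - ny * b

    dist = a * x0 + b * y0 + c * z0 + d   # full evaluation only at the start
    out = []
    for _z in range(nz):
        for _y in range(ny):
            for _x in range(nx):
                out.append(dist)
                dist += x_step
            dist += y_step
        dist += z_step
    return out
```

With integer coefficients the result matches direct evaluation exactly; with floating point coefficients small differences can accumulate, which is the roundoff creep that restoring the distance at the start of each inner loop is meant to limit.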
  • the density value of a voxel is preferably computed with the box filter oriented perpendicular to plane a. Given a distance DistA from plane a, the density value V is computed using:
  • the density is preferably computed using the distance from planes a and b:
  • region R3 uses planes a and c
  • region R4 uses planes a and d
  • Region R5 uses the Pythagorean distance from the corner point C1.
  • regions R6 and R7 use corner points C2 and C3, respectively.
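  • The region-dependent distance selection described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented hardware method: the helper names, the way the in-plane edge distance and the along-edge parameter are derived, and the clamping of the density to [0, 1] are assumptions, and the sketch does not reproduce the plane lettering of Figure 40. The density expression V = 1 - (Dist - T/2)/W is the one used in the LUT discussion, with W and T read here as filter width and surface thickness (also an assumption).

```python
import math

def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def triangle_distance(p, c1, c2, c3):
    """Distance from point p to triangle (c1, c2, c3), selected by region."""
    corners = (c1, c2, c3)
    n = cross(sub(c2, c1), sub(c3, c1))
    n_len = math.sqrt(dot(n, n))
    n = tuple(v / n_len for v in n)
    dist_plane = dot(sub(p, c1), n)            # signed distance from the triangle plane
    edge_dist, along = [], []
    for i in range(3):
        start, end = corners[i], corners[(i + 1) % 3]
        e = sub(end, start)
        e_len2 = dot(e, e)
        inward = cross(n, e)                   # in-plane, points inward, length |e|
        edge_dist.append(dot(sub(p, start), inward) / math.sqrt(e_len2))
        along.append(dot(sub(p, start), e) / e_len2)   # runs 0..1 along the edge
    if all(d >= 0 for d in edge_dist):         # face region
        return abs(dist_plane)
    for i in range(3):                         # edge regions: two plane distances
        if edge_dist[i] < 0 and 0 <= along[i] <= 1:
            return math.hypot(dist_plane, edge_dist[i])
    # Corner regions: Pythagorean distance from the nearest corner point.
    # (Hardware would pick one corner from comparator results; min() is a shortcut.)
    return min(math.sqrt(dot(sub(p, c), sub(p, c))) for c in corners)

def filter_density(dist, width, thickness):
    """Oriented box filter value, clamped to [0, 1] (clamping is an assumption)."""
    return min(1.0, max(0.0, 1.0 - (dist - thickness / 2.0) / width))
```

  • For a triangle with corners (0,0,0), (4,0,0), (0,4,0), the point (1,1,2) falls in the face region, (2,-3,0) in an edge region, and (-3,-4,0) in a corner region.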
  • the oriented box filter guarantees accurate filtering of the edges for any polyhedron, provided the union of the voxelized surfaces is correctly computed.
  • the union operator can be defined over multivalued density values V(x) with $V_{A \cup B}(x) = \max(V_A(x), V_B(x))$.
  • Other Boolean operators are available.
  • the max operator preserves the correct oriented box filter value at shared edges, and is therefore preferred.
  • the efficiency of the algorithm of the present invention may be further increased by limiting the amount of unnecessary computation because the bounding box often contains a higher percentage of voxels unaffected by the triangle than affected by it.
  • the bounding box can be made tighter by recursively subdividing the triangle when edge lengths exceed a predetermined constant.
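  • A recursive subdivision of this kind might look like the following Python sketch. The text only states that the triangle is subdivided recursively when edge lengths exceed a constant; the four-way midpoint split and the function name `subdivide` are assumptions chosen for illustration.

```python
import math

def subdivide(tri, max_edge):
    """Split a triangle until no edge is longer than max_edge.

    tri is a tuple of three 3D points. The midpoint split into four
    sub-triangles is an assumed scheme, not taken from the source.
    """
    a, b, c = tri
    longest = max(math.dist(a, b), math.dist(b, c), math.dist(c, a))
    if longest <= max_edge:
        return [tri]

    def mid(p, q):
        return tuple((u + v) / 2 for u, v in zip(p, q))

    ab, bc, ca = mid(a, b), mid(b, c), mid(c, a)
    pieces = []
    # Three corner triangles plus the central one; vertex order is preserved.
    for t in ((a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)):
        pieces.extend(subdivide(t, max_edge))
    return pieces
```

  • Each resulting piece would then be voxelized with its own, smaller bounding box, so fewer unaffected voxels are visited.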
  • the polygons are preferably voxelized into the target volume and rendered in a single pass. If the polygons move with respect to the volume, then voxelization should occur into a copy of the original volume so as not to corrupt the data.
  • the multivalued voxelized polygon voxels may be tagged to distinguish them from volume data. In this manner, polygons can be colored and shaded separately from other data.
  • the preferred triangle voxelization algorithm described above is efficiently implemented in the distributed pipelines of the Cube-5 volume rendering system of the present invention.
  • This algorithm adds just a small amount of hardware to the existing pipelines and performs accurate multivalued voxelization at interactive rates.
  • One important advantage of the claimed Cube-5 volume rendering algorithm is that the volume data is accessed coherently in a deterministic order. This feature allows orderly scanning of a bounding box for this algorithm.
  • the system of the present invention may preferably include separate pipelines for volume rendering and voxelization. If voxelization can occur in a separate pass, then these volume rendering and voxelization pipelines may be combined, with the voxelization pipeline re-using most of the hardware from the volume rendering pipeline.
  • the setup for each triangle preferably occurs on the host system, in a similar manner as setup is performed on the host system for 2D rasterization.
  • the distances from the seven planes are preferably computed.
  • Seven simple distance units are allocated with four registers for each of the seven planes.
  • one register holds the current distance from the plane and the other three registers hold the increments for the X-, Y-, and Z-steps.
  • Figure 43 shows a distance computation unit 310 for one of the seven planes, formed in accordance with a preferred embodiment of the present invention.
  • This distance computation unit 310 may be included as part of the distance calculation stage 302 of the pipeline (see Figure 42).
  • the other six units can be essentially identical in design, but hold different values.
  • the pipeline preferably steps in either the X, Y, or Z direction (i.e., performs an X-Step 312, Y-Step 314, or Z-Step 316), thereby updating the current distance according to the direction of movement.
  • the hardware for looping through the volume is already present in the volume rendering pipeline and is therefore re-used here to scan the bounding box of the triangle.
  • the resulting values preferably flow down the pipeline.
  • the next pipeline stage 304 then preferably determines in which region the current voxel resides.
  • In the region selection stage 304, only seven comparators are needed to determine the outcome of the truth table, due to the mutual exclusion of some cases. For instance, in Figure 40, from the negative (lower) side of plane b, it is not necessary to test the distances from plane f or g, depending on the value of the distance from plane e.
  • the next pipeline stage 306 computes the filter function.
  • this stage 306 of the pipeline is preferably only activated if the current voxel is within 5 voxel units of the triangle. Otherwise, the current voxel is essentially unaffected by the triangle. Different regions require different calculations, ranging from a simple linear expression to a complex Pythagorean distance evaluation. Since hardware ideally must handle all cases equally well, it is preferred that such hardware be able to perform a square root approximation by means of a limited resolution look up table (LUT). However, the range of inputs and outputs is small, and therefore the size of the required LUT will be small. Furthermore, the Cube-5 hardware of the present invention has several LUTs available for volume rendering which can be re-used for voxelization. Instead of providing three separate units to compute the expression
  • $V = 1 - (\sqrt{\mathrm{Dist}^2} - T/2)/W$, it is more efficient to roll all the calculations into one LUT.
  • the input is Dist 2 , defined over [0,12]
  • the output is the density value in the range [0,1].
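  • Folding the square root, the subtraction, and the division into a single table can be sketched as follows. The input range [0, 12] and output range [0, 1] come from the text; the table size of 256 entries, the nearest-entry indexing, the clamping, and the function names are assumptions for illustration.

```python
import math

DIST2_MAX = 12.0  # input Dist^2 is defined over [0, 12]

def build_density_lut(width, thickness, entries=256):
    """Tabulate V = 1 - (sqrt(Dist^2) - T/2) / W, clamped to [0, 1]."""
    lut = []
    for i in range(entries):
        dist2 = DIST2_MAX * i / (entries - 1)
        v = 1.0 - (math.sqrt(dist2) - thickness / 2.0) / width
        lut.append(min(1.0, max(0.0, v)))
    return lut

def lookup_density(lut, dist2):
    """Return the density for a squared distance using the nearest table entry."""
    dist2 = min(max(dist2, 0.0), DIST2_MAX)
    index = round(dist2 / DIST2_MAX * (len(lut) - 1))
    return lut[index]
```

  • With this arrangement the per-voxel work after the squared distance is known is a single table read; accuracy is limited by the table resolution.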
  • the most complex calculation is the corner distance computation of regions R5, R6, and R7 which, in a preferred embodiment, requires five adders and three multipliers, in addition to the square root LUT previously mentioned.
  • the line distance computations in regions R2, R3, and R4 are simpler, requiring only one adder, two multipliers and the square root LUT.
  • Region R1 requires a single multiply to obtain the distance squared, which is the required input to the LUT.
  • the final stage 308 of the pipeline preferably computes the max operation using the current voxel value and the computed density estimate.
  • the max operator is simply a comparator attached to a multiplexor such that the greater of the two values is written back to memory. Since most voxels in the bounding box are not close enough to the triangle to be affected by it, memory bandwidth will be saved by only reading the necessary voxels. Further bandwidth savings may be achieved by only writing back to memory those voxels that change the current voxel value.
  • the voxel is preferably fetched as soon as possible in the pipeline and the results queued until the memory is received.
  • the final stage 308 is write-back to memory, which can be buffered without worry of dependencies.
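  • The max-based write-back can be sketched in Python as follows. The dictionary used as voxel memory, the default density of 0.0 for untouched voxels, and the function name are assumptions; the sketch only illustrates the comparator-plus-multiplexor behavior and the saving from skipping writes that would not change the stored value.

```python
def composite_max(volume, updates):
    """Merge density estimates into a volume with the max operator.

    volume  : dict mapping voxel coordinates to stored density
    updates : iterable of (voxel, density) pairs from the filter stage
    Returns the number of memory writes actually performed.
    """
    writes = 0
    for voxel, density in updates:
        current = volume.get(voxel, 0.0)   # fetch the current voxel value
        if density > current:              # comparator selects the larger value
            volume[voxel] = density        # write back only when the value changes
            writes += 1
    return writes
```

  • A voxel on an edge shared by two triangles receives the same filter value from both, so the second triangle causes no write and the stored value stays correct.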
  • the present invention thus far has been described outside the context of skewing, which complicates the traversal.
  • the present invention contemplates building skewing into the Y- and Z-step distance update values. Skewing also adds more complexities to the Cube-5 hardware of the present invention. Specifically, when a left-most voxel moves one unit in the Y direction, placing it outside of the bounding box, the pipeline actually takes p - 1 steps in the X direction to keep the voxel within the bounding box. Similarly, when the left-most voxel moves one step in the Z direction, it also moves one step in the negative X direction, which is handled in the same way as before.
  • the apparatus of the present invention is preferably adapted to perform skewing by adding fourteen (14) more registers and corresponding logic to determine when the pipeline is currently processing the left-most voxel.
  • Pre-filtering, which may be performed in combination with the voxelization methods of the present invention, can be used to optimally generate a series of volumes of different resolutions. This technique is useful for rendering images of different sizes; the size of the volume is preferably chosen to correspond to the size of the final image. In this manner, aliasing is avoided at all image resolutions and no unnecessary work is performed rendering parts of a scene not visible at the image scale.
  • Pre-filtering can additionally be used to model motion blur. For example, as an object sweeps past a camera, it sweeps out a complex volume during the time the shutter is open, causing motion blur. To accurately render motion blur, conventional rendering techniques render multiple images and blend them into a single image.
  • the present invention performs the sweeping operation once, during voxelization, so that motion blur can be rendered in the same time as regular volume rendering.
  • This method works well, particularly for certain cases where the motion is constant (e.g., the same direction and/or rotation). For example, consider a helicopter blade which spins at a constant speed during flight. To voxelize the blade spinning at the rate of 5 Hz for an animation frame rate of 30 Hz, the blade sweeps out an arc of $\frac{5}{30}(2\pi)$ each frame.
  • the density value is much lower and the blade appears more transparent than in the center, where it sweeps out a smaller volume and appears more solid.
  • the volume rendering transfer function may be set so that the lower density values appear less opaque and higher density values appear more opaque.
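  • The arithmetic of the helicopter blade example can be checked with a short Python sketch. The arc per frame follows directly from the rotation rate and frame rate given in the text. The coverage estimate is only a rough assumed model (blade width divided by swept arc length) added to illustrate why the density falls off toward the tip; it is not taken from the source.

```python
import math

def arc_per_frame(rotation_hz, frame_hz):
    """Angle in radians swept during one animation frame."""
    return (rotation_hz / frame_hz) * 2.0 * math.pi

def sweep_coverage(blade_width, radius, arc):
    """Assumed model: fraction of the swept arc occupied by the blade at a radius."""
    if radius <= 0.0:
        return 1.0
    return min(1.0, blade_width / (radius * arc))

arc = arc_per_frame(5.0, 30.0)               # one sixth of a revolution
near_hub = sweep_coverage(0.2, 0.1, arc)     # short arc, blade covers all of it
near_tip = sweep_coverage(0.2, 2.0, arc)     # long arc, blade covers a small part
```

  • Under this model the value near the hub saturates at 1 while the value near the tip is well below 1, which matches the described appearance of a solid center and a more transparent tip.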
  • In a preferred method performed in accordance with one form of the present invention, multiple objects are combined into one object for a final rendering pass to create the resulting image.
  • the colors from each object are preferably modulated together at each sample location along a projected sight ray. Therefore, it is preferred that each object be classified and shaded prior to being combined, followed by color modulation. If, alternatively, voxel data were combined first, a new transfer function would be required for each possible combination. This latter approach is therefore not preferred.
  • a preferred method for mixing multiple overlapping volumes resamples all but the first object in the z-dimension of the first object so that slices of each object become interlaced.
  • This includes a classification, a shading and a transformation which aligns all objects.
  • Object transformations include translation and scaling, preferably performed by the apparatus of the present invention using nearest neighbor connections, and rotation, which is preferably performed using the rotation methods of the present invention previously described herein above.
  • the present invention contemplates optimizations such as high-level scene graph compilation that can preferably be employed. For instance, static objects are preferably combined once and stored for subsequent rendering, while non-static objects are re-combined each time they are moved with respect to the other objects.
  • Texture mapping is a widely used technique to simulate high-quality image effects, such as surface details, and even lighting and shadows.
  • texture mapping involves mapping a two-dimensional (2D) image onto a three- dimensional (3D) surface. Texture mapping occurs while geometric objects are rasterized onto the screen.
  • the (x, y) pixel coordinates are preferably mapped into (u, v) texture coordinates and an RGBα value is returned as the color value to use for that pixel on the screen.
  • the mapping from (x, y) to (u, v) coordinates preferably involves simple matrix multiplication, as appreciated by those skilled in the art.
  • the look-up into the image of the (u, v) coordinate to return an RGBα value is complex.
  • the very large scale integration (VLSI) hardware requirements for the texture lookup commonly consume large portions of today's graphics boards, at a significant cost. This is primarily due to the fact that (u, v) coordinates rarely map directly to a discrete image coordinate, called a texel. Therefore, the neighboring RGBα values are preferably linearly interpolated to produce the RGBα value at the exact (u, v) coordinate.
  • Mip-Mapping basically consists of storing multiple levels-of-detail (LOD) of an image. Then, when an (x, y) pixel is mapped to a (u, v) texel, the appropriate Mip-Map level texels are chosen so that the pixel is smaller than the texels.
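  • The interpolated texel look-up and the choice of a Mip-Map level can be sketched in Python as follows. The texture layout (rows of RGBα tuples), coordinates expressed in texel units, edge clamping, and the level rule (smallest level whose texels are at least as large as the pixel footprint, with texel size doubling per level) are assumptions made for the example.

```python
import math

def sample_bilinear(texture, u, v):
    """Bilinearly interpolate an RGBA value; texture[row][col] is a 4-tuple."""
    height, width = len(texture), len(texture[0])
    u = min(max(u, 0.0), width - 1.0)       # clamp to the image (assumed policy)
    v = min(max(v, 0.0), height - 1.0)
    x0, y0 = int(math.floor(u)), int(math.floor(v))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = u - x0, v - y0

    def lerp(p, q, t):
        return tuple(a + (b - a) * t for a, b in zip(p, q))

    top = lerp(texture[y0][x0], texture[y0][x1], fx)
    bottom = lerp(texture[y1][x0], texture[y1][x1], fx)
    return lerp(top, bottom, fy)

def mip_level(footprint):
    """Smallest level whose texel size (2**level base texels) covers the footprint."""
    if footprint <= 1.0:
        return 0
    return math.ceil(math.log2(footprint))
```

  • Sampling the center of a 2x2 texture returns the average of its four texels, which is the case where no single stored texel matches the requested coordinate.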
  • Texture mapping hardware from conventional graphics pipelines has been used to accelerate volume rendering and has been the subject of such texts as RealityEngine Graphics, by K. Akeley, Computer Graphics (SIGGRAPH 93), 27:109-116, Aug. 1993, and Accelerated Volume Rendering and Tomographic Reconstruction Using Texture Mapping Hardware, by B. Cabral, N. Cam and J. Foran, Symposium on Volume Visualization, pp. 91-98, Oct. 1994.
  • This conventional approach neither achieves the cost-performance nor supports the various functionalities (e.g., shading) of the present invention.
  • texture mapping is unscalable without data replication, often employs two-dimensional (2D) rather than three-dimensional (3D) interpolation, downloads datasets slowly, and/or does not support real-time four-dimensional (4D) input.
  • the Cube-5 apparatus is combined with a conventional geometry engine via the geometry input/output bus 46, 48 (see Figure 4).
  • the rendering pipeline(s) of the present invention are utilized to perform the texture look-up function, while the geometry engine is used for mapping (x, y) pixel coordinates to (u, v) texture coordinates.
  • the responsibility of the geometry engine is essentially to rasterize triangles, while the apparatus of the present invention preferably provides the high performance interpolation engine for texture mapping.
  • texel data is preferably loaded into 3D memory included within the Cube-5 unit(s).
  • Figures 6A and 6B illustrate an example of how 32 bits of texel data for a 2x2 neighborhood are preferably arranged in a $2^3$ subcube of 16-bit voxels.
  • image-based rendering methods render complex scenes from arbitrary viewpoints based on a finite set of images of that scene.
  • Two similar image-based rendering methods known by those skilled in the art, which use four-dimensional (4D) interpolation without requiring the depth information of the source images, are light field rendering and Lumigraph.
  • the high-performance interpolation engine of the present invention may be used to accelerate these two techniques.
  • Figure 44 shows that in light field rendering, the scene is modeled by uv 322 and st 320 planes. Every uv grid point preferably defines a viewpoint and has an associated st image. For every pixel of the projection plane 324, a sight ray 326 is preferably cast into the uv plane 322. The four st images corresponding to the uv grid points surrounding the intersection of the sight ray with the uv plane contribute to that ray. The contributions are preferably calculated by casting a sight ray into each st image through its uv grid point. These rays hit between st image pixels and, therefore, a bi-linear interpolation must be performed for each st image.
  • Performing lookups for each projection plane ray usually causes random access into the st images. Therefore, in accordance with a preferred method of the present invention, st images are accessed in object order, which is more appropriately adapted for use with the apparatus of the present invention since the Cube-5 apparatus allows reading of each st image pixel only once.
  • For each quadrilateral 328 in the uv plane (e.g., abcd), its projections on the four st planes preferably determine which four tile regions 330 contribute to the final image. All st tile regions 330 are then preferably assembled into four images and are perspectively projected onto the projection plane 324. The final image is subsequently formed by bilinear interpolation among the four projected images. Interpolation weights are preferably determined by the intersection between the original ray and the uv plane 322.
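  • The two-level interpolation described above (bilinear within each st image, then bilinear among the four surrounding uv grid points) can be sketched in Python as follows. This is a per-ray illustration only and does not model the object-order tile traversal: the data layout `st_images[vi][ui][ti][si]`, scalar pixel values, non-negative coordinates in grid units, and the use of one shared (s, t) position for all four images are simplifying assumptions.

```python
def bilinear(grid, x, y):
    """Bilinearly interpolate grid[row][col] at fractional column x and row y."""
    x0, y0 = int(x), int(y)
    x1 = min(x0 + 1, len(grid[0]) - 1)
    y1 = min(y0 + 1, len(grid) - 1)
    fx, fy = x - x0, y - y0
    top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
    bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def light_field_sample(st_images, u, v, s, t):
    """4D interpolation of a light field at ray coordinates (u, v, s, t)."""
    u0, v0 = int(u), int(v)
    u1 = min(u0 + 1, len(st_images[0]) - 1)
    v1 = min(v0 + 1, len(st_images) - 1)
    # Step 1: one bilinear look-up in each of the four surrounding st images.
    corner = {(ui, vi): bilinear(st_images[vi][ui], s, t)
              for ui in (u0, u1) for vi in (v0, v1)}
    # Step 2: blend the four results with weights from the uv intersection.
    fu, fv = u - u0, v - v0
    low = corner[(u0, v0)] * (1 - fu) + corner[(u1, v0)] * fu
    high = corner[(u0, v1)] * (1 - fu) + corner[(u1, v1)] * fu
    return low * (1 - fv) + high * fv
```

  • Because every stage is linear, a light field whose values vary linearly in u, v, s, and t is reproduced exactly, which gives a simple check of the weights.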

Abstract

The invention relates to an apparatus and a method for real-time volume processing and universal three-dimensional rendering (10). The apparatus comprises a plurality of three-dimensional memory units; at least one pixel bus for providing global horizontal communication; a plurality of rendering pipelines; at least one geometry bus; and a control unit. The rendering pipelines each preferably include hardware for interpolation, shading, FIFO buffering, communication, and look-up tables. The apparatus of the invention can be coupled to a geometry pipeline (18) for blending surfaces, images, and volumes into a single image. A method for performing volumetric ray casting of a 3D volume comprises computing a distance along a major projection axis from a predefined viewpoint; dividing the volume into a plurality of consecutive regions having exponentially increasing boundaries; casting a plurality of rays through the volume from the viewpoint; merging or splitting rays at region boundaries; and repeating the ray casting, merging, and splitting steps until the entire volume has been processed. The apparatus and methods of the invention achieve true real-time performance in a single rendering pass, including texture mapping and image-based rendering.
EP99934066A 1998-07-16 1999-07-16 Apparatus and method for real-time volume processing and universal three-dimensional rendering Withdrawn EP1114400A4 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP07120006A EP1890267A3 (fr) 1998-07-16 1999-07-16 Apparatus and method for real-time volume processing and universal 3D rendering

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US9297798P 1998-07-16 1998-07-16
US92977P 1998-07-16
PCT/US1999/016038 WO2000004505A1 (fr) 1998-07-16 1999-07-16 Apparatus and method for real-time volume processing and universal three-dimensional rendering

Related Child Applications (1)

Application Number Title Priority Date Filing Date
EP07120006A Division EP1890267A3 (fr) 1998-07-16 1999-07-16 Apparatus and method for real-time volume processing and universal 3D rendering

Publications (2)

Publication Number Publication Date
EP1114400A1 true EP1114400A1 (fr) 2001-07-11
EP1114400A4 EP1114400A4 (fr) 2006-06-14

Family

ID=22236075

Family Applications (1)

Application Number Title Priority Date Filing Date
EP99934066A Withdrawn EP1114400A4 (fr) 1998-07-16 1999-07-16 Apparatus and method for real-time volume processing and universal three-dimensional rendering

Country Status (5)

Country Link
EP (1) EP1114400A4 (fr)
JP (1) JP2002520748A (fr)
AU (1) AU757621B2 (fr)
CA (1) CA2337530C (fr)
WO (1) WO2000004505A1 (fr)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09512937A (ja) * 1994-09-06 1997-12-22 ザ リサーチ ファウンデーション オブ ステイト ユニヴァーシティ オブ ニューヨーク Apparatus and method for real-time volume visualization
US5724493A (en) * 1994-12-13 1998-03-03 Nippon Telegraph & Telephone Corporation Method and apparatus for extracting 3D information of feature points
US5877779A (en) * 1995-07-06 1999-03-02 Sun Microsystems, Inc. Method and apparatus for efficient rendering of three-dimensional scenes
JPH09237354A (ja) * 1996-02-29 1997-09-09 Chokosoku Network Computer Gijutsu Kenkyusho:Kk Three-dimensional shape data transfer and display method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PFISTER H ET AL INSTITUTE OF ELECTRICAL AND ELECTRONICS ENGINEERS ASSOCIATION FOR COMPUTING MACHINERY: "CUBE-4 - A SCALABLE ARCHITECTURE FOR REAL-TIME VOLUME RENDERING" PROCEEDINGS OF THE 1996 SYMPOSIUM ON VOLUME VISUALIZATION. SAN FRANCISCO, OCT. 28 - 29, 1996, PROCEEDINGS OF THE SYMPOSIUM ON VOLUME VISUALIZATION, NEW YORK, IEEE/ACM, US, 28 October 1996 (1996-10-28), pages 47-54, XP000724429 ISBN: 0-89791-865-7 *
See also references of WO0004505A1 *

Also Published As

Publication number Publication date
AU4998299A (en) 2000-02-07
EP1114400A4 (fr) 2006-06-14
CA2337530A1 (fr) 2000-01-27
CA2337530C (fr) 2007-11-20
AU757621B2 (en) 2003-02-27
JP2002520748A (ja) 2002-07-09
WO2000004505A1 (fr) 2000-01-27

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20010104

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

A4 Supplementary search report drawn up and despatched

Effective date: 20060424

17Q First examination report despatched

Effective date: 20070911

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20080605