USRE42638E1 - Resample and composite engine for real-time volume rendering - Google Patents

Resample and composite engine for real-time volume rendering

Publication number
USRE42638E1
Authority
US
Grant status
Grant
Patent type
Prior art keywords
voxel
block
voxels
system
rendering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
US11305902
Inventor
Harvey Ray
Deborah Silver
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Rutgers State University of New Jersey
Original Assignee
Rutgers State University of New Jersey
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Grant date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/08 Volume rendering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20 Image signal generators
    • H04N13/275 Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00 Indexing scheme for image data processing or generation, in general
    • G06T2200/28 Indexing scheme for image data processing or generation, in general involving image processing hardware
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20 Image signal generators
    • H04N13/257 Colour aspects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20 Image signal generators
    • H04N13/286 Image signal generators having separate monoscopic and stereoscopic modes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/30 Image reproducers
    • H04N13/388 Volumetric displays, i.e. systems where the image is built up from picture elements distributed through a volume
    • H04N13/395 Volumetric displays, i.e. systems where the image is built up from picture elements distributed through a volume with depth sampling, i.e. the volume being constructed from a stack or sequence of 2D image planes

Abstract

The present invention is a digital electronic system for rendering a volume image in real time. The system accelerates the processing of voxels through early ray termination and space leaping techniques in the projection guided ray casting of the voxels. Predictable and regular voxel access from high-speed internal memory further accelerates the volume rendering. Through the acceleration techniques and devices of the present invention, real-time rendering of parallel and perspective views, including those for stereoscopic viewing, is achieved.

Description

FIELD OF THE INVENTION

The present invention is a system for providing three-dimensional computer graphics. More particularly, the present invention is a system that accelerates the processing of volume data for real-time ray casting of a three-dimensional image and a method thereof.

BACKGROUND OF THE INVENTION

Volume rendering projects a volume dataset onto a two-dimensional (2D) image plane or frame-buffer. Volume rendering can be used to view and analyze three-dimensional (3D) data from various disciplines, such as biomedicine, geo-physics, computational fluid dynamics, finite element models and computerized chemistry. Volume rendering is also useful in the application of 3D graphics, such as Virtual Reality (VR), Computer Aided Design (CAD), computer games, computer graphics special effects and the like. The various applications, however, may use a variety of terms, such as 3D datasets, 3D images, volume images, stacks of 2D images and the like, to describe volume datasets.

As schematically depicted in FIG. 1, a volume dataset is typically organized as a 3D array of samples which are often referred to as volume elements or voxels. The volume dataset can vary in size, for example from 128³ to 1024³ samples, and may also be non-symmetric, e.g., 512×512×128. The samples or voxels can also vary in size. For example, a voxel can be any useful number of bits, for instance 8 bits, 16 bits, 24 bits, 32 bits or larger, and the like.

The volume dataset can be thought of as planes of voxels or slices. Each slice is composed of rows or columns of voxels or beams. As depicted in FIG. 1, the voxels are uniform in size and regularly spaced on a rectilinear grid. Volume datasets can also be classified into non-rectilinear grids, for example curvilinear grids. These other types of grids can be mapped onto regular grids.

Voxels may also represent various physical characteristics, such as density, temperature, velocity, pressure and color. Measurements, such as area and volume, can be extracted from the volume datasets. A volume dataset may often contain more than a hundred million voxels thereby requiring a large amount of storage. Because of the vast amount of information contained in a dataset, interactive volume rendering or real-time volume rendering defined below requires a large amount of memory bandwidth and computational throughput. These requirements often exceed the performance provided by typical modern workstations and personal computers.

Volume rendering techniques include direct and indirect volume rendering. Direct volume rendering projects the entire dataset onto an image-plane or frame buffer. Indirect volume rendering extracts surfaces from the dataset in an intermediate step, and these projected surfaces are approximated by triangles and rendered using the conventional graphics hardware. Indirect volume rendering, however, only allows a viewer to observe a limited number of values in the dataset (typically 1-2), as compared to all of the data values contained therein for direct volume rendering.

Direct volume rendering that is implemented in software, however, is typically very slow because of the vast amount of data to be processed. Moreover, real-time direct (interactive) volume rendering (RTDVR) involves rendering the entire dataset at over 10 Hz, although 30 Hz or higher is desirable. Recently, RTDVR architectures have become available for the personal computer, such as VolumePro, which is commercially available from RTVIZ, a subsidiary of Mitsubishi Electronic Research Laboratory. VIZARD II and VG-Engine are two other RTDVR accelerators that are anticipated to be commercially available. These accelerators may lower the cost of interactive RTDVR and increase performance over previous non-custom solutions. Moreover, they are designed for use in personal computers. Previous solutions for real-time volume rendering used multi-processor, massively parallel computers or texture mapping hardware. These solutions are typically expensive and not widely available due to, for instance, the requirement for parallel computers. Alternatively, these solutions generate lower quality images by using texture-mapping techniques.

Although accelerators have increased the availability and performance of volume rendering, a truly general-purpose RTDVR accelerator has yet to emerge. Current accelerators generally support parallel projections and have little or no support for perspective projections and stereoscopic rendering. These different projections are illustrated in FIG. 2. Stereoscopic rendering is a special case where two images, generally two perspective images, are generated to approximate the view from each eye of an observer. Stereoscopic rendering typically doubles the amount of data to be processed to render a stereoscopic image. Moreover, current accelerators also require high memory bandwidths that can often exceed 1 Gbyte per second for a 256³ dataset.
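
As a rough check on the bandwidth figure above, the following sketch computes the raw voxel traffic when every voxel of a cubic dataset is read once per frame. The 2-byte voxel size and the 30 Hz frame rate are assumptions chosen for illustration, and the function name is hypothetical.

```python
def volume_bandwidth_bytes_per_s(dim, bytes_per_voxel, frames_per_s):
    """Bytes per second needed to read every voxel of a dim^3 volume once per frame."""
    return dim ** 3 * bytes_per_voxel * frames_per_s


# Assumed values: 256^3 dataset, 16-bit voxels, 30 Hz refresh.
rate = volume_bandwidth_bytes_per_s(256, 2, 30)
print(f"{rate / 1e9:.2f} GB/s")
```

Under these assumptions the traffic is just over one billion bytes per second, which is in line with the figure quoted for a 256³ dataset.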

Furthermore, these current accelerators are typically either image-order or object-order architectures. An image-order architecture is characterized by a regular stepping through image space and the object-order architecture is characterized by a regular stepping through object space. Image-order ray casting architectures may support algorithmic speed-ups, such as space leaping and early ray termination, and perspective projections. Object-order architectures tend to provide more hardware acceleration and increased scalability. Object-order architectures, however, have not generally provided algorithmic acceleration. The trade-off between these various limitations is typically either (i) good parallel rendering performance and no support for perspective projections or (ii) good algorithmic acceleration and little hardware acceleration and vice versa.

The voxel-to-pipeline topologies of typical image-order and object-order accelerators are shown schematically in FIGS. 3 and 4, respectively. Image-order architectures must access several voxels from a volume memory per processor. This typically causes a bottleneck in achievable hardware acceleration and thereby limits the number of useful processors. For example, as illustrated in FIG. 3, a typical image-order architecture has an 8-to-1 bottleneck for each image-order pipeline. Although algorithmic acceleration for the reconstruction, classification, shading and the composition of the voxels can often increase performance, such an increase in performance is often outweighed by the voxel bottleneck in the memory system, thereby limiting the overall acceleration.

As depicted in FIG. 4, object-order pipelines generally require only one voxel access per processor thereby providing greater hardware acceleration due to the lack of a voxel or a memory bottleneck. Object-order reconstruction of the dataset, however, makes it difficult, if not impossible, to implement algorithmic acceleration or support perspective projections.

Neither image-order nor object-order architectures are general-purpose techniques because of their limitations. For example, image-order architectures only deliver interactive performance for certain types of datasets by relying heavily on algorithmic acceleration. Performance can be extremely sensitive to viewing parameters (and dataset characteristics) potentially causing large fluctuations in performance. On the other hand, object-order architectures yield more consistent performance but typically do not support perspective projections. As a result, these architectures cannot be used for applications that require stereoscopic rendering, virtual reality, computer graphics, computer games and fly-throughs.

Thus, there is a need for a device capable of general-purpose volume rendering performance that supports interactive rendering for both parallel and perspective projections. Furthermore, there is a need for a general-purpose device that supports interactive rendering for stereoscopic displays.

SUMMARY OF THE INVENTION

The present invention is a general-purpose device that supports interactive rendering for parallel and perspective projections and stereoscopic rendering thereof. The general-purpose device is further characterized as a digital electronic system for real-time volume rendering of a 3D volume dataset. A new hybrid ray casting is used to volume render a real-time image from external memory. Volume rendering includes reconstruction, classification, shading and composition of subvolumes or voxels of a volume dataset representing the 3D image. Early ray termination and space leaping accelerate the processing of the voxels by dynamically reducing the number of voxels necessary to render the image. Furthermore, the underlying hardware of the present invention processes the remaining voxels in an efficient manner. This allows for real-time volume imaging for stereoscopic displays.

The hardware architecture of the present invention supports projection-guided ray casting, early ray termination and space leaping for improved memory usage. The hardware architecture further accelerates the volume rendering due, in part, to regular and predictable memory accessing, fully pipelined processing and space leaping and buffering of voxels to eliminate voxel-refetch.

The incorporation of the projection guided ray casting, including early ray termination and space leaping, and the hardware architecture permit rendering of the image where the rendering is not the critical time-consuming operation. In other words, the present invention can render many volumes in a faster time period than the entire volumes can be read from external memory.

Another aspect of the present invention includes a method for volume rendering an image where there is no substantial refetching of data from external memory. Perspective projections, under certain circumstances, may require a minimal, but non-limiting, refetching of some data. The method includes early ray termination and space leaping accelerations and the processing of voxels in a predictable manner in hardware to volume render an image in real-time.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 is a schematic depiction of a volume dataset for rendering an image.

FIG. 2 is an illustration of different projections useful for rendering an image.

FIG. 3 is a schematic illustration showing voxel-to-pipeline topology or processor of an image-order accelerator.

FIG. 4 is a schematic illustration showing voxel-to-pipeline topology or processor of an object-order accelerator.

FIG. 5 is a schematic illustration of the projection-guided ray casting of the present invention.

FIG. 6 is a conceptual illustration of the projection-guided ray casting of the present invention.

FIG. 7 is a schematic illustration of a frame buffer initialization of the present invention.

FIG. 8 is a schematic overview of the hardware architecture of the present invention.

FIG. 9 is a schematic depiction of data flow for the processors of the hardware architecture of FIG. 8.

FIG. 10 is a schematic depiction of data flow for the controller of the hardware architecture of FIG. 8.

DETAILED DESCRIPTION OF THE INVENTION

The system of the present invention is a digital electronic system, including hardware architecture, for real-time volume rendering of a 3D volume dataset. The system of the present invention maximizes processing efficiency while retaining flexibility of ray casting by selecting image-forming voxels, such as non-transparent and non-occluded voxels, for further processing and minimizing the processing requirements or rejecting non-image-forming voxels, such as transparent or occluded voxels.

Desirably, the system of the present invention (1) sustains burst memory accesses to every voxel, (2) constantly accesses voxels from the memory system, (3) does not fetch voxels from the memory system more than once and (4) allows for early-ray termination and space leaping. Sustaining burst memory accesses to every voxel is accomplished, in part, by having each set of voxels being accessed in a regular manner based on the desired virtual viewing position. The number of voxels in the set is dictated by the minimum burst length required to hide the latency of the dynamic random access memory (DRAM) device. The constant access of voxels requires, in part, that the set of voxels be processed in a predictable order so that the correct voxels can be prefetched from memory. This allows fully pipelined rendering and eliminates delays or idle cycles in the hardware architecture. The elimination of refetching is achieved, in part, by having each voxel's contribution to the image-plane being determined when the voxel is accessed, thereby allowing the voxel to be discarded once it is processed. The last condition requires, in part, that rays be launched independently of each other.

The system of the present invention may be included into a personal computer or similar device. Such a device will also typically contain a screen for viewing the rendered graphic image, and typically contains memory.

As described in further detail herein, the present invention includes projection guided ray casting and hardware architecture for rendering real-time images. The projection guided ray casting further includes early ray termination and space leaping, which are discussed below in further detail.

Projection Guided Ray Casting (PGRC)

The hybrid ray casting of the present invention is described as Projection Guided Ray Casting (PGRC) because it successfully merges the benefits of the object- and image-order processing using hardware acceleration and sample processing acceleration. The memory bandwidth and computational throughput required for interactive volume rendering are reduced, making it possible to render a dataset faster than the entire dataset can be read from memory.

In traditional ray casting, rays are cast through each pixel on the image-plane. Samples inside of the volumetric dataset are reconstructed and rendered at evenly spaced intervals along each ray. Image-plane traversal is typically scanline-by-scanline, which gives rise to random memory access of the volume dataset and multiple voxel refetches which typically thrash the volume memory resulting in poor hardware efficiency due to idle memory cycles. Although the overall efficiency of traditional ray casting may possibly be enhanced by algorithmic acceleration, the low hardware acceleration efficiency typically causes the rendering performance to be slower than the reading of the dataset from memory. These aspects of traditional ray casting typically limit its performance.

A schematic and a conceptual illustration of PGRC are shown in FIGS. 5 and 6, respectively. PGRC uses forward projections to enhance the memory performance of the ray casting. The dataset 30 is partitioned into hundreds or thousands of sub-volumes referred to as voxel access blocks 32. Ray casting is applied to rays that penetrate these voxel access blocks 32 when the voxel access blocks are accessed from memory. Since these voxel access blocks 32 are small, they project to a small portion of the image-plane. Only the small groups of rays that penetrate each voxel access block 32 are rendered. The PGRC iterates over each voxel access block 32 with a front-to-back processing thereof until the entire dataset 30 is processed. In PGRC virtually all voxel re-fetch is eliminated.

Forward projections that are used during PGRC may also be used during scan-conversion in traditional 3D polygon-based acceleration. Scan-conversion hardware is an integral part of personal computers and workstations. Using a view transformation matrix that maps from object-space to image-space, each vertex can be projected onto the image-plane. The polygon is filled with a color and/or texture (texture-mapping). In PGRC, these conventional scan-conversion computations along with a front-to-back processing of voxel access blocks 32 are used, in part, to eliminate memory thrashing in the ray-casting algorithm.

Referring to FIG. 5, at step 10 a view transformation matrix is computed based on the desired view or perspective. A frame buffer is initialized with the entry-point of each ray into the dataset 30. At step 12, a cubic set of voxels or the voxel access blocks 32 are selected and processed in front-to-back order. Voxel access blocks 32 are a “b×b×b” array of voxels as shown in FIG. 6. At step 14, the eight voxels on the corners of a voxel access block 32 are each projected onto the image plane 34 using the view transformation matrix or forward projectors 36, as depicted in FIG. 6, forming a 2D footprint 40 in image-space. At step 16, the pixel access blocks 42, which contain the forward projected image, are bounded to complete the creation of the 2D footprint 40 in image space.

At step 18, rays of backward projectors 38 are then cast through each pixel that lies on or within this 2D footprint 40. At step 20, the segment along each ray that penetrates the voxel access block 32 is computed. Upon exiting the voxel access block 32, the rays are written into a frame buffer 35. The new state (color, opacity, and position) of these rays is stored at step 22 as a pixel inside of the frame buffer 35. The above steps are repeated for each voxel access block 32 in front-to-back order until every voxel access block has been processed.

As depicted in FIG. 7, the initial intersection of each ray 48 with the dataset 30 is stored into frame buffer 35 along with its X, Y and Z increment vector. The opacity and color values are initialized to zero for the entire frame buffer.

Voxel access blocks are processed from front-to-back order to allow early-ray termination. Since front-to-back ordering depends on a particular view position and view direction, which are known prior to rendering, the next voxel access block is prefetched allowing fully pipelined operation in hardware. The direction of projection can be determined from the viewing parameters. It is a vector pointing from the center of projection towards a viewer. The eight corner voxels of each voxel access block 32 are projected onto the image-plane 34. The resulting vertices are mapped into image-space using a view transformation matrix.
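
Because the ordering depends only on the viewing parameters, it can be computed before any voxel is fetched. The sketch below shows one common way to obtain a front-to-back block order for a parallel projection: each axis is traversed in the direction of the corresponding component of the view direction. This is an illustrative assumption, not necessarily the ordering scheme used by the invention; the function name and the convention that `view_dir` points from the viewer into the volume are hypothetical.

```python
from itertools import product


def front_to_back_blocks(blocks_per_axis, view_dir):
    """Return block indices (i, j, k) in a front-to-back order for a parallel projection.

    view_dir is assumed to point from the viewer into the volume.
    """
    def axis_order(count, component):
        # Walk the axis in the same direction the rays travel along it.
        return range(count) if component >= 0 else range(count - 1, -1, -1)

    nx, ny, nz = blocks_per_axis
    return list(product(axis_order(nx, view_dir[0]),
                        axis_order(ny, view_dir[1]),
                        axis_order(nz, view_dir[2])))


order = front_to_back_blocks((2, 2, 1), (-1.0, 1.0, 0.0))
# The block to prefetch while order[n] is being rendered is simply order[n + 1].
```

Since the whole list is known up front, the next block can be requested from memory while the current one is still being processed.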

The eight projected vertices form a convex region in image-space, which is then filled using well-known scan-line algorithms. The filling process determines the pixels (i.e., rays) that lie within the 2D footprint 40 of the voxel access block. As a result, only the exact rays that are needed are cast.

As discussed above, ray casting is applied to each ray from the true 2D footprint of the voxel access block. In practice, however, clipping regions are projected onto the image-plane instead of the voxel access block boundaries. Clipping regions are a function of the front-to-back ordering and type of projection. Clipping regions represent portions of a projected voxel access block near a projected ray and these clipping regions are processed for image rendering. Clipping regions are both translated and enlarged so that the clipping region coincides with data in the internal buffers. The clipping regions are enlarged by one to handle reconstruction computations, such as interpolation and gradient computations, in the proximity of an intra-block space.

Each pixel in the frame-buffer contains the state of the ray that penetrates it. Using an increment vector and the sample location of the ray, a segment of the ray is rendered until it exits the voxel block's clipping region. For perspective projections, the clipping region closest to the viewer is accessed first.
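
The per-ray stepping described above can be sketched as follows. This is a minimal illustration that assumes an axis-aligned clipping region given by its minimum and maximum corners; it only collects sample positions, whereas the actual system would also reconstruct, classify, shade and composite each sample. The function name and arguments are hypothetical.

```python
def march_segment(position, increment, region_min, region_max):
    """Step a ray through an axis-aligned clipping region.

    Returns the sample positions visited inside the region and the position
    at which the ray left it (the state to write back to the frame buffer).
    """
    if not any(increment):
        raise ValueError("increment vector must be non-zero")
    samples = []
    current = tuple(position)
    while all(lo <= c < hi for c, lo, hi in zip(current, region_min, region_max)):
        samples.append(current)
        current = tuple(c + d for c, d in zip(current, increment))
    return samples, current


samples, exit_position = march_segment((0.0, 0.0, 0.0), (0.0, 0.0, 1.0),
                                       (0, 0, 0), (4, 4, 4))
```

The returned exit position is what lets a later voxel access block resume the same ray without refetching earlier voxels.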

Early Ray Termination and Space Leaping

The PGRC algorithm directly supports early-ray termination and space leaping. Both of these are “algorithmic acceleration” techniques because they reduce the amount of computation for rendering an image. Conceptually, early-ray termination selects non-occluded voxels for further processing and rejects occluded voxels from further processing. The dataset is not tested over all samples the viewing parameter dictates supersample. Because of the fully pipelined design, voxel access block memory accesses are overlapped with the processing of another voxel access block; therefore, there is no performance benefit in completing a voxel-block early unless the voxel-block is supersampled. During supersampling, however, the memory system is delayed for a length of time proportional to the sample-to-voxel ratio. Early-ray termination reduces or eliminates these delays.

Using early-ray termination, every voxel access block inside of the dataset is accessed only once. Therefore, the peak performance is equal to the rate at which the entire dataset can be read from memory. Since one goal of the present invention is to render the dataset faster than it can be read from memory, a more aggressive data processing acceleration technique is used that allows the skipping of the memory access to entire voxel access blocks.
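
A minimal sketch of early-ray termination is given below. It assumes the standard front-to-back "over" compositing rule and a single grey-level color channel; the text does not specify the compositing formula, and the 0.95 opacity threshold is an arbitrary illustrative value.

```python
def composite_front_to_back(samples, opacity_threshold=0.95):
    """Composite (color, opacity) samples front to back with early termination.

    Returns the accumulated color, accumulated opacity and the number of
    samples actually processed.
    """
    color, alpha, used = 0.0, 0.0, 0
    for sample_color, sample_alpha in samples:
        if alpha >= opacity_threshold:
            break  # ray is effectively opaque; remaining samples are occluded
        weight = (1.0 - alpha) * sample_alpha
        color += weight * sample_color
        alpha += weight
        used += 1
    return color, alpha, used


result = composite_front_to_back([(1.0, 1.0), (0.5, 0.5), (0.2, 0.3)])
```

In this example the first sample is fully opaque, so the two samples behind it are never processed.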

Space leaping can provide substantial acceleration for many datasets especially medical datasets where the regions of interest are typically near the center of the volume and there is a lot of empty space. Space leaping skips, or leaps over, transparent regions and requires either explicit or implicit distance information. The dataset is preprocessed and the distance to the next non-transparent sample along a ray is stored for each voxel inside the dataset. Encoding a distance at each voxel requires added memory and preprocessing overhead. In the present invention the additional memory requirements are minimized or reduced. Distances are encoded for a group of voxels thereby reducing the overall leaping distance which lowers memory requirements while only slightly reducing the acceleration achievable through space leaping.

Using implicit distance information, regions inside of the dataset are flagged transparent or non-transparent. When a ray advances to a transparent region, the ray can be quickly incremented through the region, taking into consideration the orientation and size of the region. This method has advantages over explicitly storing distances. For example, this method uses much less memory, for instance a single bit per region. Moreover, preprocessing involves simply comparing each voxel inside of the region to a user-defined threshold and this can be computed on-the-fly. Desirably, implicit distance information is used to leap over empty regions.

The volume data is first rendered as described above. As the dataset is rendered, each voxel contained in a voxel access block is compared against a user-defined transparency threshold. If every voxel is below the threshold, then the voxel access block is flagged empty in a small binary lookup-table. This table is called an empty voxel access block table. After the first image is rendered, the table can be applied to subsequent images until the dataset or user-defined transparency threshold is altered. Desirably, the empty voxel access block table is checked before accessing a voxel-block from the volume memory. In order for a voxel access block to be skipped, the voxel access block and its 26 neighbors must be transparent. The 26 neighbors are required to be transparent because of the way voxels are buffered and the clipping regions are translated. If the entire neighborhood of voxels is empty, any ray in the clipping region can be incremented by a dimension, b, of the voxel access block, regardless of the direction of the increment vector. Thus, perspective projections are supported by the present invention. Furthermore, the time to process a voxel access block is reduced. One benefit of this acceleration is that the overhead of computing the empty voxel access block table is completely hidden by useful work.
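
The table construction and the 26-neighbor test can be sketched as follows. The sketch assumes the volume is a nested list indexed as `volume[z][y][x]` with dimensions divisible by b, and it treats neighbors that fall outside the dataset as empty; both are illustrative assumptions, as are the function names.

```python
from itertools import product


def build_empty_block_table(volume, b, threshold):
    """Flag each b x b x b block as empty when every voxel is below the threshold."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    table = {}
    for bk, bj, bi in product(range(nz // b), range(ny // b), range(nx // b)):
        table[(bi, bj, bk)] = all(
            volume[bk * b + dz][bj * b + dy][bi * b + dx] < threshold
            for dz, dy, dx in product(range(b), repeat=3)
        )
    return table


def can_skip(table, block):
    """A block is skipped only if it and all 26 neighbors are flagged empty.

    Blocks outside the dataset are treated as empty (an assumption).
    """
    bi, bj, bk = block
    return all(
        table.get((bi + di, bj + dj, bk + dk), True)
        for di, dj, dk in product((-1, 0, 1), repeat=3)
    )


volume = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
volume[3][3][3] = 1.0  # a single non-transparent voxel
table = build_empty_block_table(volume, b=2, threshold=0.5)
```

One bit per block is enough for the table, so even a large dataset needs only a small lookup structure.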

Hardware Architecture

The hardware architecture of the present invention is called Resample and Composite Engine (RACE) and is a hardware engine for, among other things, accelerating real-time volume rendering of a graphic image by having image-forming voxels available for processing without having to refetch a substantial number of voxels from external memory, such as the memory contained within a personal computer. An overview of the hardware architecture is described below, followed by a description of the data flow for the processors and the controllers of the present invention.

An overview of this hardware architecture is shown schematically in FIG. 8. The hardware architecture 50 contains a control unit 52 and a plurality (p+1) of processors 54. Each processor 54 contains a rendering unit 56, a volume memory 58 and a pixel memory 60.

The control unit 52 implements, among other things, object-order projection to control memory accesses to the voxel and pixel memories. The rendering units 56 implement the image-order ray casting, voxel-buffering and clipping. The control unit 52 provides synchronization for each processor 54 and generates memory addresses for all transactions on both the voxel memory bus 62 and the pixel memory bus 64. The volume memory 58 stores the data volume. The pixel memory 60 stores the color and the current state of each ray during the ray casting process.

The RACE architecture partitions the dataset into thousands of subvolumes or voxel blocks. In multiprocessor RACE configurations, each subvolume is equally divided among each processor 54. As the voxels are streamed into the processors 54 from the volume memory 58, they are quickly distributed among processors using local communication. Each processor 54 has a dedicated connection to the volume memory 58. Voxels from other processors are distributed using local neighbor-to-neighbor communication in a circular fashion.

With a “p+1” number of processors 54 in the system, after p+1 clock cycles, each processor 54 contains a local copy of the voxel-block. This allows fast random interpolation from high-speed internal SRAM memories. This is important for supersampling and for discrete ray-tracing architectures. Central differences at grid-points are computed on this fixed stream of voxels and stored into a gradient buffer. Alternately, voxels can be stored in a quad-ported SRAM allowing gradients to be computed directly from adjacent samples. This alternate method, however, requires more memory addresses to be generated. The size of the buffer-memory is proportional to the resolution of the voxel-block. Because each voxel gets forwarded to other processors, memory partitioning is not critical and low-order interleaving to distribute the volume may be used. Interleaving allows accesses for each memory module to share a single memory address. Voxel-blocks that have at least 8(p+1) voxels can be stored in contiguous memory locations or interleaved groups of eight voxels between internal memory banks to guarantee peak DRAM memory performance.
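
The central-difference gradient mentioned above can be sketched as follows for a single interior grid point. The block is assumed to be a nested list indexed as `block[z][y][x]` with unit grid spacing; the function name is hypothetical.

```python
def central_difference_gradient(block, x, y, z):
    """Central-difference gradient at an interior grid point of block[z][y][x]."""
    nz, ny, nx = len(block), len(block[0]), len(block[0][0])
    if not (0 < x < nx - 1 and 0 < y < ny - 1 and 0 < z < nz - 1):
        raise IndexError("central differences need a neighbor on both sides")
    gx = (block[z][y][x + 1] - block[z][y][x - 1]) / 2.0
    gy = (block[z][y + 1][x] - block[z][y - 1][x]) / 2.0
    gz = (block[z + 1][y][x] - block[z - 1][y][x]) / 2.0
    return gx, gy, gz


# Linear test field f = x + 2y + 3z on a 3 x 3 x 3 block.
block = [[[x + 2 * y + 3 * z for x in range(3)] for y in range(3)] for z in range(3)]
gradient = central_difference_gradient(block, 1, 1, 1)
```

The need for a neighbor on each side is one reason the clipping regions are enlarged relative to the voxel access block.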

The rendered image is written into the pixel-memory 60. Each pixel stores the color, opacity, position and increment vector for a ray that penetrates it. The depth of each pixel in the frame-buffer is approximately twice the depth of pixels used in modern polygon-based accelerators. Modern 3D polygon-based accelerators store color, alpha, z-buffer, and stencil information per pixel using anywhere from 6-8 bytes of data. In the context of volume rendering, doubling the depth of the frame-buffer is reasonable because memory capacity is dominated by the volume buffer. As an example, frame-buffer capacity is typically 4 MB to 16 MB whereas 3D datasets often require 32 MB to 1 GB of storage capacity. The current trend in medical and scientific visualization is higher resolution datasets that consistently require over 128 Mbytes of memory storage. In the present invention each pixel memory also responds to a single memory address using low-order image interleaving. The frame buffer is partitioned equally among processors. The least significant bits of the pixel position dictate which processor owns the pixel. Low-order interleaving enhances load balancing between processors because of spatial coherence.
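
Low-order interleaving of the frame buffer can be illustrated with the sketch below. It assumes the number of processors is a power of two and that only the x coordinate selects the owner; the text does not say which coordinate bits are used, so this is one possible arrangement rather than the actual mapping.

```python
def pixel_owner(x, num_processors):
    """Processor that owns pixel column x under low-order interleaving."""
    if num_processors <= 0 or num_processors & (num_processors - 1):
        raise ValueError("num_processors is assumed to be a power of two")
    # The least significant bits of the position select the processor.
    return x & (num_processors - 1)


owners = [pixel_owner(x, 4) for x in range(8)]
```

Adjacent pixels land on different processors, so a small footprint in image space is spread across all of them.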

Before rendering starts, the RACE frame buffer is initialized with a color, opacity and the ray's entry position into the volume dataset or at the front-clipping plane. For perspective projections, the increment vector per ray is stored into the frame buffer. A slope-per-ray is only stored for perspective projections. For parallel projections, a register inside of the processor stores the increment vector and is updated once per projection. During shading, 3D accelerators interpolate values across the face of polygons. Typically, a color intensity (Gouraud shading) or a normal (Phong shading) is interpolated. To initialize the frame buffer, the color components of a voxel are assigned to be the actual position of the voxel for use in the Gouraud shading model. For parallel projections, the three visible faces can be then rendered as polygons to initialize the frame-buffer. For perspective projections, the view position is subtracted from each position and normalized to determine the increment vector. Since these calculations are 2D and performed once per projection, they will not cause a bottleneck in the 3D volume rendering performance.

The controller 52 generates addresses for the volume memory 58 and pixel-memory 60. Addresses for the volume memory are determined by the front-to-back ordering of the voxel access blocks and this ordering is based on user-defined viewing parameters. The controller 52 stores the empty voxel access block table that allows skipping of transparent or undesired subvolumes. Before issuing a memory access for a voxel access block, the controller 52 first checks the empty voxel access block table to determine if the block and its 26 neighboring voxel access blocks are transparent. If so, the controller 52 advances to the next voxel access block in front-to-back order and repeats. If the voxel access block or any of its 26 neighbors are not empty, the controller 52 generates the appropriate memory addresses for the DRAM memory.
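The empty-block test above can be sketched as follows. This is an illustrative model, not the controller's actual logic: the table layout (a nested list of booleans indexed `[x][y][z]`) and the treatment of blocks outside the volume as empty are assumptions.

```python
from itertools import product


def can_skip_block(empty, bx, by, bz):
    """Return True if block (bx, by, bz) and all 26 neighbors are empty.

    `empty` is a hypothetical 3D table of booleans indexed [x][y][z].
    Assumption: positions outside the volume count as empty.
    """
    nx, ny, nz = len(empty), len(empty[0]), len(empty[0][0])
    # The 27 offsets cover the block itself plus its 26 neighbors.
    for dx, dy, dz in product((-1, 0, 1), repeat=3):
        x, y, z = bx + dx, by + dy, bz + dz
        inside = 0 <= x < nx and 0 <= y < ny and 0 <= z < nz
        if inside and not empty[x][y][z]:
            return False  # a memory access must be generated
    return True


# Usage: a 4 x 4 x 4 table that is empty except for one corner block.
table = [[[True] * 4 for _ in range(4)] for _ in range(4)]
table[3][3][3] = False
```

Checking the neighbors as well as the block itself matters because samples near a block face are reconstructed from voxels in the adjacent blocks.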

For each voxel access block, the controller 52 computes a corresponding clipping region based on the front-to-back ordering. The 2D footprint of each clipping-region is determined using the view transformation matrix. The view transformation matrix is applied to each corner of the clipping-region. A bounding box in image-space is computed based on minimum or maximum coordinates thereof or, alternatively, scanconversion can be used to compute a footprint. The footprint is rounded to pixel-block boundaries. The controller 52 issues a memory address for each pixel-block inside of the footprint. The frame buffer responds by delivering an array of pixels. These pixel-tiles can be stored in contiguous memory locations on a DRAM page or interleaved between memory banks such that they can be accessed at the peak speed of the memory system.
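The bounding-box variant of the footprint computation can be sketched as below. The 2 × 4 matrix layout (one row each for u and v, as in a parallel projection), the function names, and the tile size are assumptions introduced for the example.

```python
import math
from itertools import product


def footprint_bbox(view, lo, hi, tile):
    """Project the 8 corners of a clipping region and return a bounding box
    (u0, v0, u1, v1) rounded outward to pixel-block (tile) boundaries.

    `view` is a hypothetical 2 x 4 matrix: rows give u and v as affine
    functions of (x, y, z). `lo` and `hi` are opposite corners in object space.
    """
    us, vs = [], []
    for x, y, z in product(*zip(lo, hi)):  # the 8 corners of the region
        us.append(view[0][0] * x + view[0][1] * y + view[0][2] * z + view[0][3])
        vs.append(view[1][0] * x + view[1][1] * y + view[1][2] * z + view[1][3])
    u0 = math.floor(min(us) / tile) * tile
    v0 = math.floor(min(vs) / tile) * tile
    u1 = math.ceil(max(us) / tile) * tile
    v1 = math.ceil(max(vs) / tile) * tile
    return u0, v0, u1, v1


def pixel_block_addresses(bbox, tile):
    """List the origin of every pixel-block inside the footprint."""
    u0, v0, u1, v1 = bbox
    return [(u, v) for v in range(v0, v1, tile) for u in range(u0, u1, tile)]
```

For a view that maps x to u and y to v, a region spanning (3, 5, 0) to (11, 13, 8) with 4-pixel tiles rounds to the box (0, 4, 12, 16), which contains nine pixel-blocks.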

The processors 54 perform the image-order ray-casting algorithm, voxel-buffering, and clipping to the local clipping region and global view-frustum. Each voxel from the processor's dedicated pixel memory 60 is streamed into internal buffers. Voxels 64 from other volume memory modules are streamed in from the right neighbor. The processor 54 also forwards voxels 64 to its left neighbor. The entire sub-volume is distributed to each processor 54 in a circular fashion using neighbor-to-neighbor communication. Therefore, each processor 54 receives “p” voxels per clock-cycle, i.e., one from its dedicated memory system and “p−1” from its right-neighbors. Conceptually, this is the same as connecting all memory modules to every processor; however, to limit the fan-out on the memory bus, voxels are forwarded to neighboring processors. This increases the pin-out of the application-specific integrated circuit (ASIC).
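The circular, neighbor-to-neighbor distribution can be illustrated with a small simulation. It is a sketch under simplifying assumptions: each processor contributes a single voxel, and one forwarding hop happens per step; the function name is hypothetical.

```python
def ring_distribute(local_voxels):
    """Simulate neighbor-to-neighbor forwarding around a ring.

    local_voxels[i] is the voxel read from processor i's own memory.
    Each step, processor i receives what its right neighbor (i + 1) held
    and passes its own held voxel to its left neighbor (i - 1).
    Returns, per processor, the voxels in the order they arrived.
    """
    n = len(local_voxels)
    received = [[v] for v in local_voxels]   # own voxel arrives first
    in_flight = list(local_voxels)           # voxel each processor forwards next
    for _ in range(n - 1):
        # Every processor takes the voxel held by its right neighbor.
        in_flight = [in_flight[(i + 1) % n] for i in range(n)]
        for i in range(n):
            received[i].append(in_flight[i])
    return received
```

After n − 1 hops every processor has seen every voxel, which is why the scheme behaves like a full connection between all memory modules and all processors while each chip only talks to two neighbors.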

Each of the “p” voxels is written to appropriate internal slice or voxel block buffers inside the rendering unit. Voxels are buffered to eliminate duplicate accesses to the volume memory, and this allows for reconstruction near the gaps between voxel blocks. Two slices of voxels are buffered for interpolation and gradient computation in each of the advancing directions. The first slice is necessary to interpolate samples that lie in between adjacent subvolumes. The second slice is needed to interpolate samples on the advancing faces of the previous block. Also, a slice of central difference gradients is buffered. The volume-slice buffers will dominate on-chip storage.

Processor Data Flow

FIG. 9 is a schematic illustration of the data flow for the processors 54. Each processor 54 receives a stream of pixels (rays) 70 from the frame-buffer and queues them in an input queue 72. Each ray 70 entering the input queue 72 is stamped with a tag (pixel-block address) and offset (relative position inside of the pixel-block). Each 2D footprint is delimited by a start-of-footprint (SOF) and end-of-footprint (EOF) flag so that the processor 54 can match clipping-regions to rays (pixels). In addition, a space-leap (SL) flag is used to determine if the ray can skip over the clipping region without rendering. These stamps originate from the controller 52.

Rays read from the input queue 72 are loaded into a new ray register 74. The following fields in the ray register 74 are checked: EOF/SOF flags, opacity threshold, SL flag, and position. EOF/SOF flags are used to synchronize (or switch) clip-regions. The opacity threshold is used to prevent the rendering of occluded samples, i.e., early ray termination. Conversely, the SL flags prevent the rendering of transparent samples. The ray's position is examined to see if it lies within the active clip-region.

Rays that are not opaque, clipped, or skipped are sent to the accept queue 76 to be rendered; all other rays take a second path (the clip path). Along the clip-path, if the SL flag is set and the ray-position was not clipped, then the position is incremented (space-leaped) through the clip region. Then, these rays are written to the appropriate line inside of the pixel-cache.
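The routing decision between the accept path and the clip path can be sketched as follows. The `Ray` fields, the half-open clip-region test, and the threshold value of 0.95 are assumptions chosen for the example; the text does not give a numeric opacity threshold.

```python
from dataclasses import dataclass


@dataclass
class Ray:
    position: tuple      # current (x, y, z) sample position
    opacity: float       # accumulated opacity so far
    space_leap: bool     # SL flag stamped by the controller


def inside(position, lo, hi):
    """True if position lies in the half-open box [lo, hi) on every axis."""
    return all(l <= p < h for p, l, h in zip(position, lo, hi))


def route(ray, clip_lo, clip_hi, opacity_threshold=0.95):
    """Return "accept" for rays that must be rendered, else "clip".

    opacity_threshold is a hypothetical value for early ray termination.
    """
    if ray.opacity >= opacity_threshold:
        return "clip"    # occluded: early ray termination
    if not inside(ray.position, clip_lo, clip_hi):
        return "clip"    # outside the active clip-region
    if ray.space_leap:
        return "clip"    # transparent block: leap instead of render
    return "accept"
```

Only rays that pass all three checks consume a slot in the ray-casting pipeline; the rest take the cheaper second path.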

After exiting the accept queue 76, least significant bits from the x-, y-, and z-ray positions are used to address the voxel and gradient buffers. The fractional components are used as weights for the trilinear interpolations. The color, opacity, position and increment vector proceed through the ray-casting pipeline. A ray interleaving unit 78 interleaves rays from the accept queue 76 onto the inputs of image-order ray caster 77. Ray interleaving is used to eliminate data hazards due to possible feedback in the composition calculation. The ray interleave unit 78 ensures that two consecutive (or adjacent) samples along the same ray are at the output of the shader stage and the output of the composition stage, respectively. This guarantees that two samples along the same ray are blended together.
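The use of integer position bits as a buffer address and fractional bits as interpolation weights can be sketched with a standard trilinear interpolation. The nested-list buffer indexed `[x][y][z]` and the function name are assumptions; the hardware reads the eight voxels in parallel from interleaved SRAM rather than in a loop.

```python
def trilinear(block, x, y, z):
    """Trilinearly interpolate a sample at (x, y, z) inside a voxel buffer.

    `block` is a hypothetical nested list indexed [x][y][z]. The integer
    parts of the position address the buffer; the fractional parts are
    the interpolation weights.
    """
    i, j, k = int(x), int(y), int(z)
    fx, fy, fz = x - i, y - j, z - k
    value = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                weight = ((fx if dx else 1.0 - fx)
                          * (fy if dy else 1.0 - fy)
                          * (fz if dz else 1.0 - fz))
                value += weight * block[i + dx][j + dy][k + dz]
    return value


# Usage: a 2 x 2 x 2 buffer holding the linear field f = x + 2y + 4z,
# which trilinear interpolation reproduces exactly.
cube = [[[x + 2 * y + 4 * z for z in (0, 1)] for y in (0, 1)] for x in (0, 1)]
```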

The rendered ray is added into the pixel-cache 82. No cache misses are possible on this path because each ray that is added to the accept queue 76 gets a reserved cache-line. Otherwise, it is not loaded into the accept queue 76 until a cache-line becomes available. Each write-access to a cache-line increments a counter for the corresponding cache-line, so that it can be determined when the cache-line (i.e., pixel-tile) is complete and ready to be written to the frame-buffer.

Once complete, the entire cache line is serially added to an output queue 83. Then, the valid bit and write counter for the cache-line are cleared. Whenever the output queue 83 is not empty, the processor 54 sends a write-pending flag to the controller. When the pixel-bus becomes inactive, the controller issues a write acknowledge causing the pixel-block to be streamed from the output queue 83 onto the pixel-bus. In a multiple processor configuration, the controller must receive a pending flag from each processor before releasing the pixel-bus. For most of this analysis, the terms pixel and ray are completely interchangeable since only one ray penetrates a given pixel.

The voxel buffer logic is responsible for generating central difference gradients and storing voxels at the correct locations in the internal static-RAMs (SRAM). There are four types of buffer memories: voxel-block, block-slice, beam-slice and volume-slice. One set of buffers stores voxels and another set stores central differences at on-grid positions. Central differences are computed as the voxel-block is streamed into the processor. When accessing the buffers for interpolation, gradient buffers and voxel-block buffers respond to a single memory address. Each buffer is an eight-way interleaved SRAM to provide the necessary voxel values to reconstruct the sample value and each component of the gradient in parallel.

Two voxel slices and one gradient slice are buffered in each advancing x, y, and z direction. These buffers are double-buffered to allow access to a previous slice and to update the next slice for subsequent voxel-blocks. Front-to-back ordering proceeds beam-by-beam then slice-by-slice. As a result, these slices will dominate on-chip storage requirements. In general, architectures that seek to eliminate voxel-refetch must buffer slices unless smaller reconstruction kernels are used for samples near a slice boundary.

To reduce memory, the slice of gradients can be eliminated by buffering a third slice of voxels and re-computing central differences for this particular slice. Desirably, the slice of gradients is buffered to simplify computation.

Various methods can be used to remove or reduce the size of the volume-slice buffer, including, but not limited to, storing the volume-slice memory in off-chip memory or pixel memory, rendering the dataset in sections and prebuffering. When the volume-slice memory is stored in the frame-buffer having a wide connection, the volume-slice buffer could be completely eliminated. In the RACE architecture, the pixel interface is wider than the voxel interface (e.g., 16 bytes). Therefore, these slices can be quickly loaded from the pixel memory. Each processor accesses the volume-slice from their dedicated pixel-memory.

To reduce the size of the volume-slice buffers, the dataset can be rendered in sections. The volume-slice buffers are inversely proportional to the number of sections used. Voxels residing on/near a boundary of a section are re-fetched from the volume memory slightly lowering performance. Any face of a voxel-block can potentially lie on the boundary of a section. As a result, the memory accesses to any of the six faces may cross DRAM-page boundaries due to our low-order interleaving scheme. Alternately, the voxel-block can be organized such that boundary block-slices can be retrieved conflict-free from any direction using a skewed memory organization.

Auxiliary voxel-buffers (beam-, block- and volume-slice) may be eliminated by accessing a voxel-block and boundary voxels from neighboring voxel-blocks each time the block is accessed. This method is a prebuffering method because the dataset can be reorganized during a quick preprocessing stage which combines each voxel-block with a surrounding shell of voxels inside of the memory (increasing memory capacity). This creates self-contained blocks that have all of the necessary information to reconstruct samples that lie in a (b+1)×(b+1)×(b+1) subvolume; however, the buffers must be (b+3)×(b+3)×(b+3) in size. Therefore, this method will lower performance by introducing some duplicate memory access to the volume memory, especially for small-blocks. It has the advantage of simplifying internal buffering logic and reducing the number of separately addressable buffers from four to one for the interpolation and gradient memories. These buffers are internally eight-way interleaved.
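To show why the prebuffering penalty is largest for small blocks, the following sketch compares the (b+3)×(b+3)×(b+3) buffer size quoted above with the b×b×b voxels of the block itself. Treating this ratio as the storage and fetch overhead is an assumption made for illustration, and the function name is hypothetical.

```python
def prebuffer_overhead(b: int) -> float:
    """Ratio of voxels in a self-contained (b+3)^3 block to the b^3 block.

    Assumption: the extra shell of voxels is fetched and stored with every
    block, so this ratio approximates the duplicate-access overhead.
    """
    return (b + 3) ** 3 / b ** 3


# Overhead for the block resolutions used in the simulation section.
overheads = {b: round(prebuffer_overhead(b), 3) for b in (4, 8, 16, 32)}
```

Under this assumption a 4-voxel block carries more than five times its own voxel count, while a 32-voxel block carries only about 31% extra.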

Moreover, because of the block processing utilized by the RACE architecture, higher-order gradient filters can be used without incurring a performance penalty. Gradient encoding or lookup-table based gradients can also be incorporated into the architecture. The logic that converts the stream of voxels into central differences at on-grid locations can be replaced by lookup-tables containing gradient components.

After the gradient and interpolation computations, the interpolation value is used to index the classification tables for the red, green, blue and opacity transfer functions. Optionally, the gradient magnitude may be used to modulate the opacity function. This highlights surface boundaries and increases the transparency in homogeneous regions of the dataset. The gradient magnitude computation requires a computationally expensive square root operator. It can be approximated using the norm of the gradient vector or using iterative numerical methods.

The pixel cache serves several purposes, including retiring two rays every clock cycle (i.e., one skipped or clipped and one rendered), synchronizing the pixel-blocks with the controller, and completing out-of-order pixel-blocks.

Each ray entering into the RACE pipeline takes one of two paths: accept path (path #1, for rendering) or the algorithmically skipped/clipped path (path #2, little/no processing). Path #1 processes ray segments that are not algorithmically eliminated and lie inside of the clipping-region; therefore, they must be rendered. Each of these rays is loaded into the accept queue 76.

Along the first path, all rays are rendered using the conventional ray-casting algorithm until they exit the clipping-region. Once they exit, rays are written to the current cache-line or the next sequential cache-line, i.e., pixel cache. No cache misses occur along this path because a cache-line is reserved before the ray enters path #1 and the cache-line is not discarded until all rays from the cache-line have been processed.

Path #2 handles two cases: the segment of the ray is algorithmically eliminated (skipped/occluded) or the ray's current xyz position is outside of the voxel-block's clipping region. Along Path #2, the Clip-and-Add Unit 80 increments the ray's position if the SL flag is set and the ray is inside of the current (space-leapable) clip-region. This adder increments the ray position by a distance of b in the ray's primary direction, which quickly advances the ray through an empty voxel-block. With a single increment, the ray-position advances to another ray-position that is exactly one voxel-block further along the ray in the major viewing direction. Also, by limiting the norm to be a power of two, each component of the increment vector is scaled using a shift-register.
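The single-increment space leap can be sketched with fixed-point arithmetic. The 8-bit fractional format, the convention that the major component of the increment vector equals exactly one voxel, and the requirement that b be a power of two are all assumptions made so that the scaling reduces to a shift, as the paragraph describes.

```python
FRAC_BITS = 8                 # hypothetical fixed-point format: 8 fractional bits
ONE = 1 << FRAC_BITS          # fixed-point representation of 1.0 voxel


def space_leap(position, increment, log2_b):
    """Advance a ray by one voxel-block (b = 2**log2_b voxels) in one step.

    `position` and `increment` are fixed-point integer triples. Assumption:
    the major component of `increment` has magnitude ONE, so shifting every
    component left by log2_b moves the ray exactly b voxels along the major
    axis and proportionally along the other two.
    """
    return tuple(p + (d << log2_b) for p, d in zip(position, increment))


# Usage: major axis is x; the ray drifts +0.25 in y and -0.125 in z per voxel.
start = (0, 0, 0)
step = (ONE, ONE // 4, -(ONE // 8))
leaped = space_leap(start, step, 3)   # b = 8
```

Here the x component advances by 8.0 voxels (2048 in fixed point) with one add per component and no multiplier.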

After exiting the clip-and-add circuitry 80, rays are written to the pixel cache 82. If a cache-hit occurs on the current cache-line, the ray is written at the appropriate address in the cache line. The current cache-line is indicated by a pointer to the cache. This cache utilizes three pointers: two write pointers, one for Path #1 (render) and one for Path #2 (skip/clip), and a single read pointer. Data is read from the cache using the read pointer and loaded into the output queue 83. Each pointer increments sequentially through the cache.

The pixel cache 82 is direct mapped to a pointer that indexes the cache and not the pixel address. As a result, only one tag compare is necessary regardless of the size of the cache. No tag comparison is necessary for the read-port of the cache. The read port cycles through each cache-line, waiting for the write counter to expire before advancing.

If a cache-miss occurs on path #2, the clip pointer is incremented by one to the next cache-line. Cache misses can only occur for the first pixel inside of a pixel-block. If the next cache-line is marked valid, then the clip logic halts all registers between the Input Queue along the clip-path until the line becomes invalid. Once the line becomes available, the line is marked valid and the ray's tag is stored on the cache-line. Then, the ray's color, position and increment vector are written into the cache. Cache-lines are marked invalid after the full number of write operations have occurred to a single cache-line and the entire cache line has been transferred into the output queue 83. The pixel-block is not retired until the cache-line is indexed by the read pointer. Each ray on the cache-line is then transferred into the output queue 83.

In multiprocessor implementations, the pixel-blocks are evenly partitioned among the processors. The size of the cache-line and the termination write-count are inversely proportional to the number of RACE processors. A benefit of this dual-path approach is that two rays can complete in a single clock cycle. Furthermore, it allows the majority of the pixels that lie outside of the true-footprint but within the bounding-box to be clipped without causing additional stalls in the image-order ray casting pipeline.

Because sequential pointers index the cache, pixels from the same pixel-block but residing in different processors are written to the same relative cache-line in the corresponding processor. The sequential read pointer guarantees that pixel-blocks are retired in the same order that they are reserved. This provides synchronization with the controller. As a result, the controller can resynchronize the pixel-blocks among multiple processors before they are written over the pixel-bus. The controller simply waits for each processor to generate a write pending signal. After a cache-line is transferred to the output queue 83, the read pointer is incremented to the next cache-line in a circular fashion.

If the output queue 83 is not empty, a flag is sent to the controller to indicate a write pending status. If the queue is full, a critical write-pending status flag is sent to the controller. Once the controller receives at least a write pending status from each processor and the pixel-bus is inactive, it sends a write acknowledge signal to each processor. In turn, the output queue 83 responds by placing pixels serially onto the pixel-bus in a first-in-first-out (FIFO) sequence.

Controller Data Flow

A dataflow for the RACE controller 52 is illustrated in FIG. 10. Front-to-back ordering generates a sequence of voxel-blocks to be accessed from the DRAM memory. These voxel-blocks can be accessed from memory using one or more volume memory addresses based on the size of the voxel-block, b, the DRAM page-size, and DRAM burst size needed to hide latency. The controller 52 is responsible for setting up both read and write memory transfers to the pixel-memory. As the controller issues memory addresses to the frame-buffer, it records the history of the previous h memory addresses in a queue called the history queue 90. The maximum number of pixel-blocks that can be processed (or issued) at a given time is limited by the minimum of the history queue size and the number of pixel-blocks that can be stored in the internal buffers (queues and caches) inside of the RACE processor.

When the history table 92 becomes full, the controller 52 stops processing the footprint until a pixel-block is retired. The history queue 90 generates the correct write address when it is time to retire a pixel-block. The history table 92 prevents the accessing of pixels that are already rendered and is a random access copy of the pixel-block address. Each pixel-block entry in the table has a valid/invalid flag. Before any pixel-block is issued to the pixel-memory controller, the pixel-block address is checked to see if it is already being processed. If so, the RACE controller halts the pixel-block access until a pixel-block is retired. Note that this mechanism can potentially be used to re-issue the pixel-block internally inside of the RACE processor enhancing performance. When the controller acknowledges a write request, one pixel-block entry is simultaneously retired from the history queue 90 and history table 92.
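The interplay of the history queue and history table can be modeled with a FIFO plus a membership set. This is a behavioral sketch only; the class and method names are hypothetical, and a Python set stands in for the table's per-entry valid flags.

```python
from collections import deque


class PixelBlockHistory:
    """Behavioral model of the history queue (ordered) and table (random access)."""

    def __init__(self, capacity: int):
        self.capacity = capacity   # h, the number of in-flight pixel-blocks
        self.queue = deque()       # issue order, used to form write addresses
        self.table = set()         # addresses currently being processed

    def try_issue(self, address) -> bool:
        """Issue a pixel-block read unless the history is full or the
        block is already in flight. Returns False when the controller
        must halt until a pixel-block is retired."""
        if len(self.queue) >= self.capacity or address in self.table:
            return False
        self.queue.append(address)
        self.table.add(address)
        return True

    def retire(self):
        """Retire the oldest pixel-block and return its write address."""
        address = self.queue.popleft()
        self.table.discard(address)
        return address
```

Retiring from the head of the queue is what keeps write addresses in the same order as the reads, while the table lookup prevents a block from being fetched again before its updated pixels have been written back.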

The front-to-back generator is a simple three-digit counter that counts voxel-blocks. Voxel blocks are counted beam-by-beam then slice-by-slice until each block in the data volume has been visited.
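The three-digit counter can be sketched as a nested loop over block indices. Which axis is the fastest digit and how the counting direction is derived from the viewing parameters are not spelled out in this passage, so the `flip` argument and the x-fastest ordering below are assumptions.

```python
def front_to_back(nx, ny, nz, flip=(False, False, False)):
    """Yield voxel-block indices beam-by-beam, then slice-by-slice.

    Assumptions: x is the fastest digit (a beam), then y, then z (a slice);
    flip[i] reverses the counting direction on axis i so that blocks nearer
    the viewer are visited first.
    """
    xs = range(nx - 1, -1, -1) if flip[0] else range(nx)
    ys = range(ny - 1, -1, -1) if flip[1] else range(ny)
    zs = range(nz - 1, -1, -1) if flip[2] else range(nz)
    for z in zs:              # slowest digit: slices
        for y in ys:          # middle digit: beams within a slice
            for x in xs:      # fastest digit: blocks within a beam
                yield (x, y, z)
```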

If a block is clipped, the block is discarded. As a result, the block does not consume any throughput on the voxel-bus or pixel-bus. If the block is not clipped, the 3D empty block table is checked to determine whether or not the current voxel-block and its 26 neighbors are transparent. If so, the block is flagged as empty. For synchronization purposes, the block is loaded in the volume memory access queue 94 and a DRAM memory access is not generated. Instead, the block's clipping region is forwarded to each processor and it is used to clip space-leaped rays. The empty block is also loaded into the footprint queue 96. Once the block reaches the head of the footprint queue 96, its clipping region is projected onto the image plane.

If the voxel-block is not tagged empty, it is issued to the volume memory controller 98 once it leaves the volume memory access queue 94. The controller waits until the previous voxel-block access is complete before issuing the next voxel-block.

As blocks exit the footprint queue 96, they are mapped from object-space (xyz) to image-space (uv) using the view transformation matrix. Once the u and v coordinates are computed for each corner of the voxel-block, the footprint of the voxel-block is computed in image-space. In conventional graphics accelerators, a precise scanline algorithm is used to compute the footprint (i.e., projected area) of primitives in image-space. Alternately, the RACE controller uses a simple bounding box approximation of the 2D footprint, thereby eliminating the need for scan-conversion hardware. Since each ray must be clipped against the current 3D voxel-block, the true 2D footprint is determined inside the processor. By proceeding center outwards, the controller quickly generates a workload for the RACE rendering pipelines by placing rays with longer paths into the queue first. This leads to less sensitivity to fluctuations on the pixel-bus and fewer wasted clock cycles in the pipeline.

The controller checks handshaking signals from the processor to determine whether or not each processor is ready to receive a pixel-block. This signal indicates the near-full state of the input queue 72. If any processor is not ready, the controller halts the projection unit until each processor is ready. In addition, the history table 92 is checked to determine if the pixel-block is currently in-use by the RACE processors. The history table 92 records all of the pixel-blocks inside of the history queue 90. The history queue 90 keeps the correct ordering of pixel-blocks that are being rendered and provides necessary synchronization for write operations on the pixel-bus. Once each processor indicates a write-pending status, the controller issues a write acknowledge signal when the pixel-bus becomes available. The write request signal indicates that data resides in a processor's output queue 83. Each processor responds by placing pixels onto the pixel-bus. The combination of the history queue 90 and pixel cache 82 provides synchronization for write operations. The sequential read pointer that is used to index the pixel cache 82 guarantees that the pixel-blocks are retired in the same order they are read. Memory addresses from the history queue 90 are used to generate the write address for each pixel write operation. When an address is removed from the history queue 90, the entry is also cleared inside of the history table 92.

The controller 52 is also responsible for generating memory addresses for the frame buffer and the volume memory. Furthermore, the controller 52 keeps each engine operating in a fully pipelined manner.

The following example is provided to further illustrate the architectures and methods of the present invention for real-time volume rendering of images. The example is illustrative only and is not intended to limit the scope of the invention in any way.

EXAMPLES

Example 1

The resample and composite engine architecture was simulated in software using a C++ clock cycle simulator. The simulator conservatively assumed that the pixel memory bus operated at the same rate as the voxel memory bus and that the entire dataset lies within the view volume. In practice, embedded DRAM technology can be used for the relatively small pixel memory to enhance performance. Voxel-block sizes were varied between 64 (4³) and 32768 (32³) voxels. Pixel-tiles were sized to accommodate 16 pixels per processor. For example, if 4 processors are simulated, a pixel-tile containing 64 pixels is used. This allowed the Resample And Composite Engine to hide the memory latency when accessing the pixel-memory.

Each processor was configured as follows: the Input Queue could store up to 128 rays, the Accept Queue could store up to 16 rays, the Pixel Cache could store 128 rays, and the Output Queue could store up to 128 rays. The auxiliary on-chip storage required less than 10K Byte of memory. Voxel buffers were double buffered and required either 256, 2K, 16K or 64K bytes of memory based on the block resolution, b. The internal slice-buffers dominated the on-chip storage and required 448K Bytes for a 256³ dataset.

The Resample And Composite Engine controller required less than 16 K Byte of on-chip storage for the Opaque Voxel Block (OVB) table, Transparent Voxel Block (TVB) table and internal buffers. An 8-entry pixel-address buffer was used to record the pixel-tiles that were being rendered by the resample and composite engine processors. This prevented the reading of stale data from the frame-buffer. The performance of the resample and composite engine architecture was simulated for six different datasets. The datasets were rendered using a plausible classification mapping. For example, CT datasets were rendered with a mapping of soft tissue to a semi-transparent value and bone to an opaque value. For each dataset, 26 (orthogonal, side and diagonal) view positions were used to estimate average rendering performance. The performance was then compared with the Data Access Rate (DAR), which is the peak rate at which the entire dataset can be read from the memory system. These results are presented in the Table 1 below for a single resample and composite engine processor operating at 100 MHz. In this configuration, the resample and composite engine architecture used only 200 MByte/second of volume memory throughput.

From this table, the performance of the resample and composite engine architecture consistently outperformed the DAR rate for 8³ to 32³ voxel-blocks when the dataset was larger than 128³. In particular, 8³ to 16³ voxel-blocks delivered nearly a 75% increase in performance over the DAR rate with peak performance exceeding 200% (i.e., 3.0 memory efficiency). For small voxel-blocks, the number of pixels per footprint can be greater than the number of voxels inside the voxel-block; therefore, the pixel bus can cause a bottleneck in performance.

A faster pixel interface allowed substantial gains in performance for small voxel-blocks (4³ to 8³) whose performance was limited by the pixel throughput. Because embedded DRAMs enable increased pixel memory throughput by a factor of 4 or more, this is a promising result. Each ray (or pixel) read from the frame buffer was also written; therefore, the read and write throughputs were identical. Small voxel-blocks consumed less than the full bandwidth of the volume memory bus because of algorithmically skipped blocks. This feature is exploited in shared memory accelerators, such as accelerated graphics port (AGP), when the dataset is rendered directly from main memory.

The pixel-bus was not limiting performance for larger voxel blocks. Furthermore, the sharing of pixel interfaces between two or more resample and composite engines can be potentially realized with only a small penalty in performance.

The memory efficiency of the resample and composite engine architecture generally increased with an increase in dataset resolution. Comparing the relative memory efficiency of a low resolution 64³ dataset and a higher resolution 256³ dataset revealed more than a 100% increase for 8³ voxel-blocks, as described in Table 1. This is because large datasets tended to have correspondingly larger regions of non-image forming voxels. As a result, the average performance of a resample and composite engine architecture configured with 8³ to 16³ voxel-blocks is expected to exceed the DAR rate by a factor of 3 as dataset resolutions approach 512³. Colossal datasets will offer even more potential for acceleration benefits resulting from the present invention.

TABLE 1
Simulation Results for a Single Pipeline Operating at 100 MHz (all values are frame rates in Hz)

| Voxel-block | 64³ Synthetic (high-opacity) | 128³ MRI-head (high-opacity) | 256 × 256 × 128 CT-head (bone high-opacity, tissue semitransparent) | 256 × 256 × 128 CT-engine (semi-transparent) | 256³ MRI-head (high-opacity) | 256³ CT-head (bone high-opacity, tissue semitransparent) |
|---|---|---|---|---|---|---|
| Data Access Rate | 381.47 | 47.68 | 11.92 | 11.92 | 5.96 | 5.96 |
| 4³ | 243.44 ± 106.70 | 44.34 ± 18.70 | 10.01 ± 4.89 | 7.50 ± 2.71 | 7.32 ± 3.14 | 3.39 ± 1.54 |
| 8³ | 403.08 ± 59.95 | 84.28 ± 16.71 | 19.27 ± 4.07 | 17.46 ± 2.69 | 13.82 ± 2.73 | 8.81 ± 1.62 |
| 16³ | 381.23 ± 0.28 | 66.20 ± 1.17 | 15.78 ± 0.55 | 16.40 ± 0.31 | 10.39 ± 0.34 | 9.33 ± 0.26 |
| 32³ | 381.46 ± 0.00 | 47.67 ± 0.02 | 12.81 ± 0.10 | 12.11 ± 0.10 | 6.41 ± 0.04 | 7.93 ± 0.04 |
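The Data Access Rate row of Table 1 can be reproduced with a short calculation. The sketch assumes one voxel is read per clock cycle at 100 MHz, which is consistent with the stated 200 MByte/second throughput for 16-bit voxels; the function names are hypothetical.

```python
def data_access_rate(dims, clock_hz=100e6, voxels_per_clock=1):
    """Frames per second if the whole dataset is read once per frame.

    Assumption: the memory system delivers `voxels_per_clock` voxels on
    every clock cycle (one 16-bit voxel per cycle at 100 MHz here).
    """
    nx, ny, nz = dims
    return clock_hz * voxels_per_clock / (nx * ny * nz)


def memory_efficiency(rendered_hz, dims):
    """Rendered frame rate relative to the Data Access Rate."""
    return rendered_hz / data_access_rate(dims)


# The four dataset resolutions that appear in Table 1.
dar = {d: round(data_access_rate(d), 2)
       for d in [(64, 64, 64), (128, 128, 128), (256, 256, 128), (256, 256, 256)]}
```

Using the 8³ voxel-block averages from the table, the memory efficiency rises from about 1.06 on the 64³ dataset to about 2.32 on the 256³ MRI-head dataset, in line with the statement that efficiency more than doubles with resolution.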

A 256³ MRI dataset with multiple resample and composite engine processors for parallel and perspective projections was also simulated. As expected, perspective projections delivered less performance due to a slight increase in the amount of voxel refetch. By using 8³ to 16³ voxel-blocks, 20 Hz (15 Hz) performance was obtained for a 256³ × 16-bit dataset using only 400 MByte/second (i.e., two 100 MHz processors) of volume memory throughput and two resample and composite engines for parallel (perspective) projections. Extrapolating these results to a 512³ dataset, the resample and composite engine architecture requires only 3.2 GByte/second of volume memory throughput for similar frame rates. Larger algorithmic speedups are expected when the dataset resolution is increased. As a result, the resample and composite engine allows next generation size datasets to be rendered interactively using similar volume memory throughput that other solutions currently use to render smaller datasets. For example, texture mapping engines offer less than 10 Hz for 256³ datasets using more than 3.2 GByte/second of volume memory throughput. The VG-engine and VIZARD II approaches will require approximately 2 GByte/second bandwidth for similar performance on a smaller dataset. In the RACE architecture, 16³ voxel-blocks offer the best combination of scalability and performance when the pixel-bus and voxel-bus operate at the same clock frequency.
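The extrapolation from a 256³ to a 512³ dataset follows from the eightfold growth in voxel count. The sketch below assumes throughput scales linearly with the number of voxels at a fixed frame rate, ignoring the larger algorithmic speedups the text expects at higher resolutions; the function name is hypothetical.

```python
def scaled_throughput(base_mbyte_per_s, base_resolution, target_resolution):
    """Volume memory throughput needed for the same frame rate at a new
    cubic resolution.

    Assumption: throughput grows in proportion to the voxel count, i.e.
    with the cube of the resolution ratio.
    """
    return base_mbyte_per_s * (target_resolution / base_resolution) ** 3


# 400 MByte/second at 256^3 scales to 3200 MByte/second (3.2 GByte/second) at 512^3.
required = scaled_throughput(400, 256, 512)
```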

Various changes to the foregoing described and shown methods and corresponding structures would now be evident to those skilled in the art. The matter set forth in the foregoing description and accompanying figures is therefore offered by way of illustration only and not as a limitation. Accordingly, the particularly disclosed scope of the invention is set forth in the following claims.

Claims (91)

1. A digital electronic system for real-time volume rendering of a 3D volume dataset comprising:
a data-processing accelerator for reducing a number of voxels for rendering an image in real-time by selecting image-forming voxels that are non-transparent and non-occluded from a projection and by rejecting non-image-forming voxels that are transparent or occluded from the projection, wherein the voxels are a volume dataset of the image to be rendered contained in memory external to the system;
a control unit for forward projecting the 3D volume dataset at regularly spaced voxel positions to determine number of rays to be casted wherein said 3D volume dataset is divided into a plurality of voxel access blocks having a cubic array of voxel;
a processor for ray casting the rays of the image-forming voxels in a front-to-back order to form 2D representation of image planes;
a hardware engine for accelerating the real-time volume rendering by having the image-forming voxels available for processing without having to refetch a substantial number of the voxels from the external memory;
wherein the real-time image is rendered from the image-planes formed from the selected voxels.
2. The system of claim 1 wherein the projection is a parallel projection.
3. The system of claim 1 wherein the projection is a perspective projection.
4. The system of claim 1 wherein the projection is a stereoscopic projection.
5. The system of claim 1 wherein the ray casting includes early-ray termination and space leaping for selecting the image-forming voxels, wherein the image-forming voxels are non-occluded voxels and early-ray termination substantially avoids oversampling of the occluded voxels.
6. The system of claim 1 wherein ray casting includes space leaping for selecting the image-forming voxels, wherein the image-forming voxels are non-transparent voxels and space leaping substantially avoids overprocessing of transparent voxels.
7. The system of claim 1 wherein the hardware engine further comprises volume memory for storing a local copy of a small subset of the data volume defining the voxels, a rendering unit for implementing the ray casting of the stored data volume and pixel memory for storing output ray data from the rendering unit from which the real time image is to be rendered.
8. The system of claim 1 wherein the hardware engine includes at least two processors and a controller synchronizes the processors.
9. The system of claim 8 wherein the data volume of neighboring voxels is distributed between the at least two processors.
10. The system of claim 9 wherein data volume from one processor is distributed in a circular fashion to the other processor for interpolating image-cast rays.
11. The system of claim 10 wherein the volume memory is a high-speed internal static or dynamic random access memory and each processor has a dedicated connection to the high-speed internal static or dynamic random access memory.
12. The system of claim 11 wherein the image can be rendered from the hardware engine faster than all of the voxels in the volume dataset can be read from the external memory.
13. The system of claim 1 further comprising a personal computer containing the external memory.
14. The system of claim 1 further comprising a screen for viewing the rendered real-time image.
15. A method for rendering a real-time image comprising:
retrieving a volume dataset from external memory;
subdividing the volume dataset into a plurality of voxel access blocks, wherein said voxel access blocks are a cubic array of voxels;
storing the voxel access blocks in high-speed internal memory;
forward projecting the voxels located at the corners of the block to determine number of rays to be casted, wherein said corner voxels correspond to a position of said block;
ray casting the rays in a front-to-back order to form a two-dimensional representation therefrom;
reducing a number of the voxels for rendering an image in real-time by selecting non-transparent voxels and non-occluded voxels and by rejecting transparent voxels or occluded voxels wherein the voxels are the volume dataset of the image to be rendered contained in said external memory;
processing the selected voxels to form pixels in a plurality of processors having interleaved memories for processing and distributing the voxels thereamong without having to refetch the voxels from the external memory; and
rendering a real-time image therefrom.
16. The method of claim 15 wherein the step of reducing the number of voxels further includes early-ray termination for selecting the non-occluded voxels to substantially avoid oversampling of occluded rays.
17. The method of claim 16 wherein the step of reducing the number of voxels further includes space-leaping to substantially avoid the overprocessing of the transparent voxels.
18. The method of claim 16 further including processing the pixels and the voxels in high-speed internal random access memory to render the image therefrom faster than the step of retrieving the volume data set from the external memory.
19. A method for rendering a real-time image comprising:
retrieving a volume dataset from external memory;
forward projecting the volume dataset at regularly spaced voxel positions to compute number of rays/pixels to be casted, wherein the dataset is divided into plurality of voxel access blocks having cubic array of voxels;
ray casting the rays/pixels in front-to-back order visiting all voxel access blocks except for transparent or occluded blocks without having to refetch the voxels from the external memory to form a 2D representation of image planes, wherein said image planes is a calculation of color, opacity and position of the rays/pixels.
20. A method for rendering a real-time image comprising:
retrieving a volume dataset from external memory;
subdividing the volume dataset into a plurality of voxel access blocks;
storing the voxel access blocks in high-speed internal memory;
forward projecting the voxels located at the corners of the block to determine number of rays to be casted, wherein said corner voxels correspond to a position of said block;
ray casting the rays in a front-to-back order to form a two-dimensional representation therefrom;
reducing a number of the voxels for rendering an image in real-time by selecting non-transparent voxels and non-occluded voxels and by rejecting transparent voxels or occluded voxels wherein the voxels are the volume dataset of the image to be rendered contained in said external memory;
processing the selected voxels to form pixels in a plurality of processors having interleaved memories for processing and distributing the voxels thereamong without having to refetch the voxels from the external memory; and
rendering a real-time image therefrom.
21. A system for rendering a volume dataset, wherein the volume dataset includes a plurality of voxel blocks, wherein each of said voxel blocks includes two or more voxels, the system comprising:
one or more rendering units;
a first memory configured to store said plurality of voxel blocks;
a control unit, wherein, for each of said plurality of voxel blocks, said control unit is configured to:
identify, by performing a forward projection, a portion of a frame buffer corresponding to the voxel block;
determine whether the voxel block is selected for transfer from said first memory to said one or more rendering units, wherein said determination is based upon whether said voxel block is transparent and whether said voxel block is occluded relative to a current viewing position; and
transfer the voxel block from the first memory to said one or more rendering units in response to said determination indicating that the voxel block is selected for transfer;
wherein, for each voxel block, said one or more rendering units are configured to process, in front-to-back order, a set of rays passing through the corresponding portion of the frame buffer, and wherein said one or more rendering units are configured to terminate processing of rays determined to be occluded.
22. The system of claim 21, wherein the control unit is configured to perform said identification according to a front to back ordering of the voxel blocks.
23. The system of claim 21, wherein said performing the forward projection is based on a parallel projection, a perspective projection, or a stereoscopic projection.
24. The system of claim 21, wherein a first of the one or more rendering units is configured to determine whether a ray is occluded by comparing an opacity value of the ray to an opacity threshold.
25. The system of claim 21, wherein a first of the one or more rendering units is configured to perform space leaping on at least one of the rays of the set of rays in response to an indication that a current one of the voxel blocks and voxel blocks neighboring the current voxel block are transparent.
26. The system of claim 21, wherein the first memory comprises one or more volume memories coupled respectively to the one or more rendering units, wherein the plurality of voxels are partitioned among the one or more volume memories.
27. The system of claim 26, wherein each of the voxel blocks is partitioned among the one or more volume memories.
28. The system of claim 27, wherein each of the one or more rendering units is configured for circular distribution of voxels among the one or more rendering units.
29. The system of claim 21, wherein the frame buffer is partitioned among one or more pixel memories coupled respectively to the one or more rendering units.
30. The system of claim 29, wherein the control unit is further configured to transfer blocks of rays between the frame buffer and the one or more rendering units.
31. The system of claim 30, wherein the rays of each block of rays is distributed among the one or more pixel memories so that each of the one or more rendering units processes a corresponding portion of the rays in each block of rays.
32. The system of claim 21, wherein the one or more rendering units are configured to interpolate samples along the rays of said set of rays based on voxels of the transferred voxel block.
33. The system of claim 21, wherein a first of the one or more rendering units is configured to compute gradients from voxels of the transferred voxel block.
34. The system of claim 21 further comprising a personal computer containing the first memory.
35. The system of claim 21 further comprising a screen for viewing an image stored in the frame buffer.
36. The system of claim 21, where the frame buffer represents a rendered image of the volume dataset.
37. The system of claim 21, wherein, for each of the voxel blocks, the control unit is configured to issue blocks of rays to the one or more rendering units starting from a center of said portion of the frame buffer.
38. The system of claim 21, wherein a first of the one or more rendering units includes a ray caster unit, wherein the ray caster unit is configured to operate on rays by performing calculations including one or more of the following types of calculations: reconstruction, classification, shading, composition.
39. The system of claim 38, wherein the ray caster unit is configured to perform composition calculations, and wherein the first rendering unit further includes a ray interleave unit configured to interleave rays of said set of rays in order to prevent feedback in said composition calculations performed in the ray caster unit.
40. The system of claim 21, wherein the volume dataset is a computed tomography (CT) dataset or a magnetic resonance imaging (MRI) dataset.
41. The system of claim 21, wherein the volume dataset represents geophysical information.
42. The system of claim 21, wherein the volume dataset describes one or more properties of a fluid or of a chemical system.
43. The system of claim 21, wherein the system is a 3D graphics system.
44. The system of claim 21, wherein the system is a computer aided design (CAD) system.
45. The system of claim 21, wherein said determination includes determining that the voxel block is not selected for transfer based on information indicating that the voxel block is occluded relative to the current viewing position.
46. The system of claim 21, wherein said determination includes determining that the voxel block is selected for transfer based on information indicating that the voxel block is not occluded relative to the current viewing position and information indicating that the voxel block is not transparent.
47. The system of claim 21, wherein said determination includes determining that the voxel block is selected for transfer based on information indicating that said voxel block is transparent, information indicating that the voxel block is not occluded relative to a current viewing position, and information indicating that neighboring voxel blocks of said voxel block are transparent.
48. A system for rendering a volume dataset, wherein the volume dataset includes a plurality of voxel blocks, wherein each of said voxel blocks includes two or more voxels, the system comprising:
one or more rendering means for performing rendering computations;
a first means for storing said plurality of voxel blocks;
a control means for:
identifying, by performing a forward projection, a portion of a frame buffer corresponding to each of the voxel blocks;
determining whether the voxel block is selected for transfer from said first means to said one or more rendering means, wherein said determination is based upon whether said voxel block is transparent and whether said voxel block is occluded relative to a current viewing position; and
transferring the voxel block from the first means to said one or more rendering means in response to said determination indicating that the voxel block is selected for transfer;
wherein said one or more rendering means comprise means for:
processing, in a front-to-back order, a set of rays passing through the portion of the frame buffer, and
terminating the processing of rays determined to be occluded.
49. The system of claim 48, wherein a first of said one or more rendering means includes a first buffer for buffering two slices of voxels.
50. The system of claim 49, wherein the first rendering means includes a second buffer for buffering one slice of gradient data.
51. The system of claim 48, where the frame buffer is configured to store data representing a two-dimensional array of pixels, wherein each pixel defines a corresponding ray relative to the viewing position, wherein the stored data for each pixel includes a color, an opacity and a position.
52. The system of claim 51, wherein the stored data for each pixel also includes an increment vector.
53. The system of claim 48, wherein said determination includes determining that the voxel block is not selected for transfer based on information indicating that the voxel block is occluded relative to the current viewing position.
54. The system of claim 48, wherein said determination includes determining that the voxel block is selected for transfer based on information indicating that the voxel block is not occluded relative to the current viewing position and information indicating that the voxel block is not transparent.
55. The system of claim 48, wherein said determination includes determining that the voxel block is selected for transfer based on: information indicating that said voxel block is transparent, information indicating that the voxel block is not occluded relative to a current viewing position, and information indicating that neighboring voxel blocks of said voxel block are transparent.
56. A method for rendering a volume dataset, wherein the volume dataset includes a plurality of voxel blocks, wherein each of said voxel blocks includes two or more voxels, the method comprising:
a computer system storing the plurality of voxels in a first memory;
for each of the voxel blocks:
the computer system identifying, by performing a forward projection, a portion of a frame buffer corresponding to the voxel block;
the computer system determining whether the voxel block is selected for retrieval from said first memory, wherein said determining is based upon whether said voxel block is transparent and whether said voxel block is occluded relative to a current viewing position; and
the computer system retrieving the voxel block from the first memory in response to said determination indicating that the voxel block is selected for retrieval;
processing, in front-to-back order, a set of rays passing through the corresponding portion of the frame buffer; and
the computer system terminating processing of rays determined to be occluded.
57. The method of claim 56, wherein each of the voxel blocks is retrieved from the first memory at most once per frame.
58. The method of claim 56, wherein said identifying the portion of a frame buffer corresponding to each of said voxel blocks is performed according to a front-to-back ordering of the voxel blocks.
59. The method of claim 56 further comprising:
displaying an image from the frame buffer.
60. The method of claim 56 further comprising:
determining that a ray is occluded by comparing an opacity value of the ray to an opacity threshold.
61. The method of claim 56 further comprising:
performing space leaping on at least one of the rays of said set of rays in response to a determination that the voxel block and a plurality of neighboring voxel blocks are transparent.
62. The method of claim 56, wherein said determining includes determining that the voxel block is not selected for retrieval based on information indicating that the voxel block is occluded relative to the current viewing position.
63. The method of claim 56, wherein said determining includes determining that the voxel block is selected for retrieval based on information indicating that the voxel block is not occluded relative to the current viewing position and information indicating that the voxel block is not transparent.
64. The method of claim 56, wherein said determining includes determining that the voxel block is selected for retrieval based on information indicating that said voxel block is transparent, information indicating that the voxel block is not occluded relative to a current viewing position, and information indicating that neighboring voxel blocks of said voxel block are transparent.
65. A volume rendering controller configured to:
access stored information to determine whether a block of voxels is selected for retrieval from a memory, wherein said stored information includes at least information specifying whether said block is transparent and information specifying whether said block is occluded relative to a current viewing position;
determine, by performing a forward projection, a portion of a frame buffer corresponding to the block;
output a clipping region of the block;
control a transfer of the block from the memory onto a first bus in response to a determination that the block is selected for retrieval.
66. The volume rendering controller of claim 65 further configured to:
control a transfer of pixel tiles in the corresponding portion of the frame buffer onto a second bus.
67. The volume rendering controller of claim 65 further configured to:
generate a space-leap flag for the block based on an examination of said information, wherein the space-leap flag indicates whether space-leaping is to be performed on one or more rays associated with said portion of the frame buffer; and
output the space leaping flag for the block.
68. The volume rendering controller of claim 65, wherein the volume rendering controller is further configured to determine that the block is not selected for retrieval based on the information indicating that the block is occluded relative to the current viewing position.
69. The volume rendering controller of claim 65, wherein the volume rendering controller is further configured to determine that the block is selected for retrieval based on the information indicating that the block is not occluded relative to the current viewing position and the information indicating that the block is not transparent.
70. The volume rendering controller of claim 65, wherein the volume rendering controller is further configured to determine that the block is selected for retrieval based on: the information indicating that said block is transparent, the information indicating that the block is not occluded relative to a current viewing position, and additional information indicating that blocks of voxels neighboring said block are transparent.
71. A method comprising:
accessing stored information to determine whether a block of voxels is selected for retrieval from a memory, wherein said stored information includes at least information specifying whether said block is transparent and information specifying whether said block is occluded relative to a current viewing position;
determining, by performing a forward projection, a portion of a frame buffer corresponding to the block;
outputting a clipping region of the block;
controlling a transfer of the block from the memory onto a first bus in response to a determination that the block is selected for retrieval.
72. The method of claim 71 further comprising:
controlling a transfer of pixel tiles in the corresponding portion of the frame buffer onto a second bus.
73. The method of claim 71 further comprising:
generating a space-leap flag for the block based on an examination of said information, wherein the space-leap flag indicates whether space-leaping is to be performed on one or more rays associated with said portion of the frame buffer; and
outputting the space leaping flag for the block.
74. The method of claim 71 further comprising:
determining that the block is not selected for retrieval based on the information indicating that the block is occluded relative to the current viewing position.
75. The method of claim 71 further comprising:
determining that the block is selected for retrieval based on the information indicating that the block is not occluded relative to the current viewing position and the information indicating that the block is not transparent.
76. The method of claim 71 further comprising:
determining that the block is selected for retrieval based on: the information indicating that said block is transparent, the information indicating that the block is not occluded relative to a current viewing position, and additional information indicating that blocks of voxels neighboring said block are transparent.
77. A medical imaging system for rendering a volume dataset, wherein the volume dataset includes a plurality of voxel blocks, wherein each of said voxel blocks includes two or more voxels, the system comprising:
one or more rendering units;
a first memory configured to store said plurality of voxel blocks;
a control unit, wherein, for each of said plurality of voxel blocks, said control unit is configured to:
identify, by performing a forward projection, a portion of a frame buffer corresponding to the voxel block;
determine whether the voxel block is selected for transfer from said first memory to the one or more rendering units, wherein said determination is based upon whether said voxel block is transparent and whether said voxel block is occluded relative to a current viewing position; and
transfer the voxel block from the first memory to said one or more rendering units in response to said determination indicating that the voxel block is selected for transfer;
wherein, for each voxel block, said one or more rendering units are configured to process, in front-to-back order, a set of rays passing through the corresponding portion of the frame buffer, and wherein the one or more rendering units are configured to terminate processing of rays determined to be occluded.
78. The medical imaging system of claim 77, wherein the volume dataset is a medical information dataset.
79. The medical imaging system of claim 77, wherein said determination includes determining that the voxel block is not selected for transfer based on information indicating that the voxel block is occluded relative to the current viewing position.
80. The medical imaging system of claim 77, wherein said determination includes determining that the voxel block is selected for transfer based on information indicating that the voxel block is not occluded relative to the current viewing position and information indicating that the voxel block is not transparent.
81. The medical imaging system of claim 77, wherein said determination includes determining that the voxel block is selected for transfer based on: information indicating that said voxel block is transparent, information indicating that the voxel block is not occluded relative to a current viewing position, and information indicating that neighboring voxel blocks of said voxel block are transparent.
82. A system for rendering a volume dataset, wherein the volume dataset includes a plurality of voxel blocks, wherein each of said voxel blocks includes an array of voxels, the system comprising:
a plurality of rendering units;
a first memory configured to store said plurality of voxel blocks;
a control unit, wherein, for each of said plurality of voxel blocks, said control unit is configured to:
identify, by performing a forward projection, a portion of a frame buffer corresponding to the voxel block;
determine whether the voxel block is selected for transfer from said first memory to at least one of the plurality of rendering units, wherein said determination is based upon information regarding whether said voxel block is transparent and information regarding whether said voxel block is occluded relative to a current viewing position; and
transfer the voxel block from the first memory to said at least one rendering unit in response to said determination indicating that the voxel block is selected for transfer;
wherein, for each voxel block, said at least one rendering unit is configured to process, in front-to-back order, a set of rays passing through the corresponding portion of the frame buffer, and wherein said at least one rendering unit is configured to perform early ray termination on rays determined to be occluded.
83. The system of claim 82, wherein the control unit is configured to perform said identification of the portion of the frame buffer corresponding to each of said voxel blocks according to a front-to-back ordering of the voxel blocks.
84. The system of claim 82, wherein the at least one rendering unit is configured to determine that a ray is occluded by comparing an opacity value of the ray to an opacity threshold.
85. The system of claim 82, wherein the at least one rendering unit is configured to perform space leaping on at least one of the rays of the set of rays in response to an indication that a current one of the voxel blocks is transparent.
86. The system of claim 82, wherein the at least one rendering unit is configured to interpolate samples along one or more of the rays of said set of rays based on voxels of the transferred voxel block.
87. The system of claim 82, wherein the array of voxels is a rectangular array.
88. The system of claim 82, wherein the array of voxels is a cubic array.
89. The system of claim 82, wherein said determination includes determining that the voxel block is not selected for transfer based on information indicating that the voxel block is occluded relative to the current viewing position.
90. The system of claim 82, wherein said determination includes determining that the voxel block is selected for transfer based on information indicating that the voxel block is not occluded relative to the current viewing position and information indicating that the voxel block is not transparent.
91. The system of claim 82, wherein said determination includes determining that the voxel block is selected for transfer based on: information indicating that said voxel block is transparent, information indicating that the voxel block is not occluded relative to a current viewing position, and information indicating that neighboring voxel blocks of said voxel block are transparent.
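The block selection and ray termination recited in the claims above can be illustrated with a minimal sketch. It is not the claimed hardware: it follows a single ray through a front-to-back list of voxel blocks, treats a block as occluded when that one ray is occluded, and omits space leaping and the case in which a transparent block is still selected because its neighbors are transparent. All names (`Ray`, `composite`, `block_selected`, `render_ray`) and the threshold value are assumptions introduced for illustration.

```python
from dataclasses import dataclass

# Assumed value; the claims only require comparison with "an opacity threshold".
OPACITY_THRESHOLD = 0.95


@dataclass
class Ray:
    color: float = 0.0    # accumulated grey-level color
    opacity: float = 0.0  # accumulated opacity

    def occluded(self):
        """A ray is treated as occluded once its opacity reaches the threshold."""
        return self.opacity >= OPACITY_THRESHOLD


def composite(ray, sample_color, sample_opacity):
    """Front-to-back 'over' compositing of one sample onto the ray."""
    weight = (1.0 - ray.opacity) * sample_opacity
    ray.color += weight * sample_color
    ray.opacity += weight


def block_selected(transparent, occluded):
    """Select a block for transfer only if it is neither transparent nor occluded."""
    return not transparent and not occluded


def render_ray(blocks):
    """Process blocks in front-to-back order.

    blocks is a list of (transparent, samples) pairs, where samples is a
    list of (color, opacity) pairs along the ray inside that block.
    Returns the finished ray and the number of blocks actually fetched.
    """
    ray = Ray()
    fetched = 0
    for transparent, samples in blocks:
        if not block_selected(transparent, ray.occluded()):
            continue  # rejected blocks are never transferred from memory
        fetched += 1
        for color, opacity in samples:
            if ray.occluded():
                break  # early ray termination
            composite(ray, color, opacity)
    return ray, fetched


blocks = [
    (True, [(1.0, 0.5)]),               # flagged transparent: skipped
    (False, [(1.0, 0.8), (0.5, 0.9)]),  # contributes to the image
    (False, [(0.2, 1.0)]),              # hidden behind the previous block
]
ray, fetched = render_ray(blocks)
print(fetched, round(ray.color, 2), round(ray.opacity, 2))
```

In this example only the second block is fetched: the first is rejected as transparent, and by the time the third is reached the ray's accumulated opacity already exceeds the assumed threshold, so the block is rejected as occluded.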
US11305902 2000-12-20 2005-12-16 Resample and composite engine for real-time volume rendering Expired - Fee Related USRE42638E1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US09741558 US6664961B2 (en) 2000-12-20 2000-12-20 Resample and composite engine for real-time volume rendering
US11305902 USRE42638E1 (en) 2000-12-20 2005-12-16 Resample and composite engine for real-time volume rendering

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US09741558 Reissue US6664961B2 (en) 2000-12-20 2000-12-20 Resample and composite engine for real-time volume rendering

Publications (1)

Publication Number Publication Date
USRE42638E1 (en) 2011-08-23

Family

ID=24981211

Family Applications (2)

Application Number Title Priority Date Filing Date
US09741558 Active 2021-09-13 US6664961B2 (en) 2000-12-20 2000-12-20 Resample and composite engine for real-time volume rendering
US11305902 Expired - Fee Related USRE42638E1 (en) 2000-12-20 2005-12-16 Resample and composite engine for real-time volume rendering

Country Status (1)

Country Link
US (2) US6664961B2 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080270561A1 (en) * 2005-06-30 2008-10-30 Cascada Mobile Corp. System and Method of Recommendation and Provisioning of Mobile Device Related Content and Applications
US8244018B2 (en) 2010-11-27 2012-08-14 Intrinsic Medical Imaging, LLC Visualizing a 3D volume dataset of an image at any position or orientation from within or outside
US8725476B1 (en) * 2010-05-04 2014-05-13 Lucasfilm Entertainment Company Ltd. Applying details in a simulation
US8970592B1 (en) 2011-04-19 2015-03-03 Lucasfilm Entertainment Company LLC Simulating an arbitrary number of particles
US20150228110A1 (en) * 2014-02-10 2015-08-13 Pixar Volume rendering using adaptive buckets
US20160027204A1 (en) * 2014-07-22 2016-01-28 Samsung Electronics Co., Ltd. Data processing method and data processing apparatus
US20160042553A1 (en) * 2014-08-07 2016-02-11 Pixar Generating a Volumetric Projection for an Object

Families Citing this family (65)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3254451B2 (en) * 2000-03-06 2002-02-04 敏晴 中井 Colorization method and apparatus according to the multi-channel mri image processing
US20040218269A1 (en) * 2002-01-14 2004-11-04 Divelbiss Adam W. General purpose stereoscopic 3D format conversion system and method
US6826297B2 (en) * 2001-05-18 2004-11-30 Terarecon, Inc. Displaying three-dimensional medical images
WO2002095686A1 (en) * 2001-05-23 2002-11-28 Vital Images, Inc. Occlusion culling for object-order volume rendering
EP1397782B1 (en) * 2001-06-07 2008-10-01 Mental Images GmbH Rendering images using a strictly-deterministic methodology for generating a coarse sequence of sample points
DE10239672B4 (en) * 2002-08-26 2005-08-11 MAX-PLANCK-Gesellschaft zur Förderung der Wissenschaften e.V. Method and apparatus for generating a two-dimensional image of a three-dimensional structure
US7356178B2 (en) * 2002-12-31 2008-04-08 Koninklijke Philips Electronics N.V. System and method for improved multiple-dimension image displays
JP2006518510A (en) * 2003-02-21 2006-08-10 Koninklijke Philips Electronics N.V. Cache for the volume visualization
US7301538B2 (en) * 2003-08-18 2007-11-27 Fovia, Inc. Method and system for adaptive direct volume rendering
US20050110793A1 (en) * 2003-11-21 2005-05-26 Steen Erik N. Methods and systems for graphics processing in a medical imaging system
US20050110791A1 (en) * 2003-11-26 2005-05-26 Prabhu Krishnamoorthy Systems and methods for segmenting and displaying tubular vessels in volumetric imaging data
DE102004007835A1 (en) * 2004-02-17 2005-09-15 Universität des Saarlandes Device for display of dynamic complex scenes
EP1577837A1 (en) * 2004-03-17 2005-09-21 Deutsches Krebsforschungszentrum Stiftung des öffentlichen Rechts 3D cone beam reconstruction
US9492114B2 (en) 2004-06-18 2016-11-15 Banner Health Systems, Inc. Accelerated evaluation of treatments to prevent clinical onset of alzheimer's disease
EP1761191A4 (en) * 2004-06-18 2008-05-14 Banner Health Evaluation of a treatment to decrease the risk of a progressive brain disorder or to slow brain aging
US7717849B2 (en) * 2004-07-06 2010-05-18 General Electric Company Method and apparatus for controlling ultrasound system display
US7525543B2 (en) * 2004-08-09 2009-04-28 Siemens Medical Solutions Usa, Inc. High performance shading of large volumetric data using screen-space partial derivatives
US9471978B2 (en) * 2004-10-04 2016-10-18 Banner Health Methodologies linking patterns from multi-modality datasets
US7298372B2 (en) * 2004-10-08 2007-11-20 Mitsubishi Electric Research Laboratories, Inc. Sample rate adaptive filtering for volume rendering
US9024949B2 (en) * 2004-10-13 2015-05-05 Sony Corporation Object representation using distance functions
WO2006077506A1 (en) * 2005-01-18 2006-07-27 Koninklijke Philips Electronics N.V. Multi-view display device
US8150111B2 (en) * 2005-03-15 2012-04-03 The University Of North Carolina At Chapel Hill Methods, systems, and computer program products for processing three-dimensional image data to render an image from a viewpoint within or beyond an occluding region of the image data
US7532214B2 (en) * 2005-05-25 2009-05-12 Sectra Ab Automated medical image visualization using volume rendering with local histograms
JP2006338630A (en) * 2005-05-31 2006-12-14 Terarecon Inc Three-dimensional image display device for creating three-dimensional image while sequentially and partially decompressing compressed image data
US7362330B2 (en) * 2005-09-15 2008-04-22 International Business Machines Corporation Adaptive span computation when ray casting
JP4724517B2 (en) * 2005-09-29 2011-07-13 キヤノン株式会社 An image processing system and image processing method and program, and storage medium
KR100768043B1 (en) 2005-12-23 2007-10-18 주식회사 사이버메드 Method of correcting the orientation of 3d volume data in real time
US8041129B2 (en) * 2006-05-16 2011-10-18 Sectra Ab Image data set compression based on viewing parameters for storing medical image data from multidimensional data sets, related systems, methods and computer products
US7839404B2 (en) * 2006-07-25 2010-11-23 Siemens Medical Solutions Usa, Inc. Systems and methods of direct volume rendering
US7864174B2 (en) * 2006-08-24 2011-01-04 International Business Machines Corporation Methods and systems for reducing the number of rays passed between processing elements in a distributed ray tracing system
US7852336B2 (en) * 2006-11-28 2010-12-14 International Business Machines Corporation Dynamic determination of optimal spatial index mapping to processor thread resources
US7830381B2 (en) * 2006-12-21 2010-11-09 Sectra Ab Systems for visualizing images using explicit quality prioritization of a feature(s) in multidimensional image data sets, related methods and computer products
CN102160087B (en) * 2007-01-05 2013-09-18 兰德马克绘图国际公司,哈里伯顿公司 Systems and methods for visualizing multiple volumetric data sets in real time
WO2008115534A1 (en) * 2007-03-20 2008-09-25 The University Of North Carolina At Chapel Hill Methods, systems, and computer readable media for flexible occlusion rendering
US20080232694A1 (en) * 2007-03-21 2008-09-25 Peter Sulatycke Fast imaging data classification method and apparatus
DE102007020060B4 (en) * 2007-04-27 2013-12-05 Siemens Aktiengesellschaft Distributed computation of images of volumetric objects by means of ray casting
US8134556B2 (en) * 2007-05-30 2012-03-13 Elsberg Nathan Method and apparatus for real-time 3D viewer with ray trace on demand
US8056086B2 (en) * 2008-05-19 2011-11-08 International Business Machines Corporation Load balancing for image processing using multiple processors
WO2009149332A1 (en) * 2008-06-06 2009-12-10 Landmark Graphics Corporation, A Halliburton Company Systems and methods for imaging a three-dimensional volume of geometrically irregular grid data representing a grid volume
US20100033482A1 (en) * 2008-08-11 2010-02-11 Interactive Relighting of Dynamic Refractive Objects
US8379024B2 (en) * 2009-02-18 2013-02-19 Autodesk, Inc. Modular shader architecture and method for computerized image rendering
US8416238B2 (en) * 2009-02-18 2013-04-09 Autodesk, Inc. Modular shader architecture and method for computerized image rendering
US8368694B2 (en) * 2009-06-04 2013-02-05 Autodesk, Inc Efficient rendering of multiple frame buffers with independent ray-tracing parameters
US8351689B2 (en) * 2009-09-30 2013-01-08 Disney Enterprises, Inc. Apparatus and method for removing ink lines and segmentation of color regions of a 2-D image for converting 2-D images into stereoscopic 3-D images
US8947422B2 (en) * 2009-09-30 2015-02-03 Disney Enterprises, Inc. Gradient modeling toolkit for sculpting stereoscopic depth models for converting 2-D images into stereoscopic 3-D images
US9530189B2 (en) * 2009-12-31 2016-12-27 Nvidia Corporation Alternate reduction ratios and threshold mechanisms for framebuffer compression
US20110157155A1 (en) * 2009-12-31 2011-06-30 Disney Enterprises, Inc. Layer management system for choreographing stereoscopic depth
US9042636B2 (en) 2009-12-31 2015-05-26 Disney Enterprises, Inc. Apparatus and method for indicating depth of one or more pixels of a stereoscopic 3-D image comprised from a plurality of 2-D layers
CA2795835C (en) 2010-04-30 2016-10-04 Exxonmobil Upstream Research Company Method and system for finite volume simulation of flow
US9187984B2 (en) 2010-07-29 2015-11-17 Exxonmobil Upstream Research Company Methods and systems for machine-learning based simulation of flow
EP2599032A4 (en) 2010-07-29 2018-01-17 Exxonmobil Upstream Research Company Method and system for reservoir modeling
CA2807300C (en) 2010-09-20 2017-01-03 Exxonmobil Upstream Research Company Flexible and adaptive formulations for complex reservoir simulations
KR20120066305A (en) * 2010-12-14 2012-06-22 한국전자통신연구원 Caching apparatus and method for video motion estimation and motion compensation
EP2756382A4 (en) 2011-09-15 2015-07-29 Exxonmobil Upstream Res Co Optimized matrix and vector operations in instruction limited algorithms that perform eos calculations
US9152744B2 (en) * 2012-03-29 2015-10-06 Airbus Operations (S.A.S.) Methods, systems, and computer readable media for generating a non-destructive inspection model for a composite part from a design model of the composite part
WO2013169429A1 (en) 2012-05-08 2013-11-14 Exxonmobil Upstream Research Company Canvas control for 3d data volume processing
WO2014051903A1 (en) 2012-09-28 2014-04-03 Exxonmobil Upstream Research Company Fault removal in geological models
US9591309B2 (en) 2012-12-31 2017-03-07 Nvidia Corporation Progressive lossy memory compression
US9607407B2 (en) 2012-12-31 2017-03-28 Nvidia Corporation Variable-width differential memory compression
US10043234B2 (en) 2012-12-31 2018-08-07 Nvidia Corporation System and method for frame buffer decompression and/or compression
US9292953B1 (en) 2014-01-17 2016-03-22 Pixar Temporal voxel buffer generation
US9292954B1 (en) 2014-01-17 2016-03-22 Pixar Temporal voxel buffer rendering
US9311737B1 (en) * 2014-01-17 2016-04-12 Pixar Temporal voxel data structure
US9832388B2 (en) 2014-08-04 2017-11-28 Nvidia Corporation Deinterleaving interleaved high dynamic range image by using YUV interpolation
GB201415534D0 (en) * 2014-09-02 2014-10-15 Bergen Teknologioverforing As Method and apparatus for processing three-dimensional image data

Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5113357A (en) 1989-05-18 1992-05-12 Sun Microsystems, Inc. Method and apparatus for rendering of geometric volumes
US5499323A (en) 1993-06-16 1996-03-12 International Business Machines Corporation Volume rendering method which increases apparent opacity of semitransparent objects in regions having higher specular reflectivity
US5557734A (en) 1994-06-17 1996-09-17 Applied Intelligent Systems, Inc. Cache burst architecture for parallel processing, such as for image processing
US5594842A (en) 1994-09-06 1997-01-14 The Research Foundation Of State University Of New York Apparatus and method for real-time volume visualization
US5847711A (en) 1994-09-06 1998-12-08 The Research Foundation Of State University Of New York Apparatus and method for parallel and perspective real-time volume visualization
US5861891A (en) 1997-01-13 1999-01-19 Silicon Graphics, Inc. Method, system, and computer program for visually approximating scattered data
US5917937A (en) 1997-04-15 1999-06-29 Microsoft Corporation Method for performing stereo matching to recover depths, colors and opacities of surface elements
US6008813A (en) 1997-08-01 1999-12-28 Mitsubishi Electric Information Technology Center America, Inc. (Ita) Real-time PC based volume rendering system
US6034697A (en) 1997-01-13 2000-03-07 Silicon Graphics, Inc. Interpolation between relational tables for purposes of animating a data visualization
US6078332A (en) 1997-01-28 2000-06-20 Silicon Graphics, Inc. Real-time lighting method using 3D texture mapping
US6111582A (en) 1996-12-20 2000-08-29 Jenkins; Barry L. System and method of image generation and encoding using primitive reprojection
US6304266B1 (en) 1999-06-14 2001-10-16 Schlumberger Technology Corporation Method and apparatus for volume rendering
US6310620B1 (en) 1998-12-22 2001-10-30 Terarecon, Inc. Method and apparatus for volume rendering with multiple depth buffers
US6456285B2 (en) * 1998-05-06 2002-09-24 Microsoft Corporation Occlusion culling for complex transparent scenes in computer generated graphics
US6636215B1 (en) * 1998-07-22 2003-10-21 Nvidia Corporation Hardware-assisted z-pyramid creation for host-based occlusion culling
US6826297B2 (en) * 2001-05-18 2004-11-30 Terarecon, Inc. Displaying three-dimensional medical images
US7136064B2 (en) * 2001-05-23 2006-11-14 Vital Images, Inc. Occlusion culling for object-order volume rendering
US7167181B2 (en) * 1998-08-20 2007-01-23 Apple Computer, Inc. Deferred shading graphics pipeline processor having advanced features

Patent Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5113357A (en) 1989-05-18 1992-05-12 Sun Microsystems, Inc. Method and apparatus for rendering of geometric volumes
US5499323A (en) 1993-06-16 1996-03-12 International Business Machines Corporation Volume rendering method which increases apparent opacity of semitransparent objects in regions having higher specular reflectivity
US5557734A (en) 1994-06-17 1996-09-17 Applied Intelligent Systems, Inc. Cache burst architecture for parallel processing, such as for image processing
US5594842A (en) 1994-09-06 1997-01-14 The Research Foundation Of State University Of New York Apparatus and method for real-time volume visualization
US5847711A (en) 1994-09-06 1998-12-08 The Research Foundation Of State University Of New York Apparatus and method for parallel and perspective real-time volume visualization
US6111582A (en) 1996-12-20 2000-08-29 Jenkins; Barry L. System and method of image generation and encoding using primitive reprojection
US5861891A (en) 1997-01-13 1999-01-19 Silicon Graphics, Inc. Method, system, and computer program for visually approximating scattered data
US6034697A (en) 1997-01-13 2000-03-07 Silicon Graphics, Inc. Interpolation between relational tables for purposes of animating a data visualization
US6078332A (en) 1997-01-28 2000-06-20 Silicon Graphics, Inc. Real-time lighting method using 3D texture mapping
US5917937A (en) 1997-04-15 1999-06-29 Microsoft Corporation Method for performing stereo matching to recover depths, colors and opacities of surface elements
US6008813A (en) 1997-08-01 1999-12-28 Mitsubishi Electric Information Technology Center America, Inc. (Ita) Real-time PC based volume rendering system
US6456285B2 (en) * 1998-05-06 2002-09-24 Microsoft Corporation Occlusion culling for complex transparent scenes in computer generated graphics
US6636215B1 (en) * 1998-07-22 2003-10-21 Nvidia Corporation Hardware-assisted z-pyramid creation for host-based occlusion culling
US7167181B2 (en) * 1998-08-20 2007-01-23 Apple Computer, Inc. Deferred shading graphics pipeline processor having advanced features
US6310620B1 (en) 1998-12-22 2001-10-30 Terarecon, Inc. Method and apparatus for volume rendering with multiple depth buffers
US6304266B1 (en) 1999-06-14 2001-10-16 Schlumberger Technology Corporation Method and apparatus for volume rendering
US6826297B2 (en) * 2001-05-18 2004-11-30 Terarecon, Inc. Displaying three-dimensional medical images
US7136064B2 (en) * 2001-05-23 2006-11-14 Vital Images, Inc. Occlusion culling for object-order volume rendering
US7362329B2 (en) * 2001-05-23 2008-04-22 Vital Images, Inc. Occlusion culling for object-order volume rendering

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Hirai, T., et al: "Hybrid Volume Ray Tracing of Multiple Isosurfaces with Arbitrary Opacity Values", IEICE Transactions on Information and Systems, Institute of Electronics Information and Comm. Eng. Tokyo, JP, vol. E79-D, No. 7, Jul. 1, 1996, pp. 965-972, XP000628406.
Mueller, K., et al: "Fast Perspective Volume Rendering with Splatting by Utilizing a Ray-Driven Approach", Visualization '96. Proceedings of the Visualization Conference, San Francisco, Oct. 27-Nov. 1, 1996, Proceedings of the Visualization Conference, New York, IEEE/ACM, US, Oct. 27, 1996, pp. 65-72, XP000704171.
Ray, H., et al: "Ray Casting Architectures for Volume Visualization", IEEE Transactions on Visualization and Computer Graphics, IEEE Service Center, Piscataway, NJ, US, vol. 5, No. 3, Jul. 1999, pp. 210-223, XP000865305.
Yagel, R., et al: "Accelerating Volume Animation by Space-Leaping", Proceedings of the Conference on Visualization, San Jose, Oct. 25-29, 1993, New York, IEEE, US, Oct. 25, 1993, pp. 62-69, XP000475412.

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080270561A1 (en) * 2005-06-30 2008-10-30 Cascada Mobile Corp. System and Method of Recommendation and Provisioning of Mobile Device Related Content and Applications
US8725476B1 (en) * 2010-05-04 2014-05-13 Lucasfilm Entertainment Company Ltd. Applying details in a simulation
US8244018B2 (en) 2010-11-27 2012-08-14 Intrinsic Medical Imaging, LLC Visualizing a 3D volume dataset of an image at any position or orientation from within or outside
US8970592B1 (en) 2011-04-19 2015-03-03 Lucasfilm Entertainment Company LLC Simulating an arbitrary number of particles
US20150228110A1 (en) * 2014-02-10 2015-08-13 Pixar Volume rendering using adaptive buckets
US9842424B2 (en) * 2014-02-10 2017-12-12 Pixar Volume rendering using adaptive buckets
US20160027204A1 (en) * 2014-07-22 2016-01-28 Samsung Electronics Co., Ltd. Data processing method and data processing apparatus
US20160042553A1 (en) * 2014-08-07 2016-02-11 Pixar Generating a Volumetric Projection for an Object

Also Published As

Publication number Publication date Type
US6664961B2 (en) 2003-12-16 grant
US20020113787A1 (en) 2002-08-22 application

Similar Documents

Publication Publication Date Title
Wilhelms et al. A coherent projection approach for direct volume rendering
Regan et al. Priority rendering with a virtual reality address recalculation pipeline
US6972769B1 (en) Vertex texture cache returning hits out of order
US6025853A (en) Integrated graphics subsystem with message-passing architecture
US7505036B1 (en) Order-independent 3D graphics binning architecture
US6664955B1 (en) Graphics system configured to interpolate pixel values
US5798770A (en) Graphics rendering system with reconfigurable pipeline sequence
Kaufman et al. Memory and processing architecture for 3D voxel-based imagery
US6111584A (en) Rendering system with mini-patch retrieval from local texture storage
US6525723B1 (en) Graphics system which renders samples into a sample buffer and generates pixels in response to stored samples at different rates
US6496186B1 (en) Graphics system having a super-sampled sample buffer with generation of output pixels using selective adjustment of filtering for reduced artifacts
Scharsach Advanced CPU Raycasting
US6624823B2 (en) Graphics system configured to determine triangle orientation by octant identification and slope comparison
Mueller et al. High-quality splatting on rectilinear grids with efficient culling of occluded voxels
US6956576B1 (en) Graphics system using sample masks for motion blur, depth of field, and transparency
US6466206B1 (en) Graphics system with programmable real-time alpha key generation
US4985856A (en) Method and apparatus for storing, accessing, and processing voxel-based data
Lefohn et al. A streaming narrow-band algorithm: interactive computation and visualization of level sets
US20060059494A1 (en) Load balancing
US5856829A (en) Inverse Z-buffer and video display system having list-based control mechanism for time-deferred instructing of 3D rendering engine that also responds to supervisory immediate commands
US6650323B2 (en) Graphics system having a super-sampled sample buffer and having single sample per pixel support
Molnar et al. PixelFlow: high-speed rendering using image composition
US6650333B1 (en) Multi-pool texture memory management
US6717578B1 (en) Graphics system with a variable-resolution sample buffer
US6489956B1 (en) Graphics system having a super-sampled sample buffer with generation of output pixels using selective adjustment of filtering for implementation of display effects

Legal Events

Date Code Title Description
AS Assignment

Owner name: RUTGERS, THE STATE UNIVERSITY OF NEW JERSEY, NEW J

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:RAY, HARVEY;SILVER, DEBORAH;SIGNING DATES FROM 20001129 TO 20001203;REEL/FRAME:026392/0571

CC Certificate of correction
REMI Maintenance fee reminder mailed
LAPS Lapse for failure to pay maintenance fees