WO2006115716A2 - System and method of visible surface determination in computer graphics using interval analysis - Google Patents

System and method of visible surface determination in computer graphics using interval analysis

Info

Publication number
WO2006115716A2
Authority
WO
WIPO (PCT)
Prior art keywords
computer executable
subdivision
methodology
executable area
interval
Application number
PCT/US2006/012548
Other languages
French (fr)
Other versions
WO2006115716A3 (en)
Inventor
Nathan T. Hayes
Original Assignee
Sunfish Studio, Llc
Application filed by Sunfish Studio, Llc filed Critical Sunfish Studio, Llc
Publication of WO2006115716A2 publication Critical patent/WO2006115716A2/en
Publication of WO2006115716A3 publication Critical patent/WO2006115716A3/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 17/005 Tree description, e.g. octree, quadtree
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 15/00 3D [Three Dimensional] image rendering
    • G06T 15/06 Ray-tracing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 15/00 3D [Three Dimensional] image rendering
    • G06T 15/10 Geometric effects
    • G06T 15/40 Hidden part removal

Definitions

  • interval analysis provides a solid platform to solve the nonlinear equations that are out of reach of, and hardly contemplated by Warnock, Greene and others following their lead.
  • an area subdivision method introduces, among other things, the novel concept of using interval analysis to attack and solve directly the required nonlinear equations.
  • the result is an area subdivision method that employs interval analysis to solve the visible surface determination problem to any desired measure of accuracy. This promulgates all the benefits of the original Warnock subdivision method to the general class of nonlinear functions.
  • the present invention provides a system and attendant methodology for digital image reconstruction using interval analysis.
  • This image reconstruction system, rather than refining the work of others, abandons that work. That is to say, heretofore known point sampling techniques and point arithmetic are discarded in favor of the subject approach to reconstructing two-dimensional digital images of a three-dimensional representation of a visual scene or an actual, tangible three-dimensional object.
  • Integral to the subject invention is the use of an area analysis framework instead of conventional point sampling to compute, accurately and deterministically, the visible solution set of a pixel and to integrate its color.
  • Preferred embodiments of the subject invention, more particularly the system, provide several advantages over conventional rendering techniques.
  • the visible solution set of a pixel is determined through interval analysis, since traditional point-based numerical analysis cannot "solve” such computational problems to a degree of imperceptible aliasing.
  • interval analysis guarantees accuracy to a user-specified level of detail when computing the color of a pixel.
  • preferred embodiments of the system may be used with any general form of projection.
  • RGB red-green-blue
  • a process for automatic adaptive resolution is possible, i.e., the dimension of a more and more refined rendering interval can be compared to the fidelity of the rendering machine to adaptively stop rendering smaller intervals once an optimum presentation has been obtained.
  • the system is a massively parallel and scalable rendering engine, and therefore useful for distributing across servers in a network grid to optimally create more realistic images from a scene database, or for implementing in a graphics accelerator for improved performance.
  • still images or video images may be more efficiently broadcast to remote clients as the interval analysis methods operate directly on the mathematical functions that describe a scene as opposed to a piecewise geometric or tessellated model of the scene, thus providing efficient data compaction for transmission over limited bandwidth connections.
  • an entire scene can be loaded on each computer connected to an output device and synchronously display an image either by sequentially displaying data from each computer, displaying disjoint pieces from each computer, or a combination of both.
  • the system can casually seek edges of objects or transitional areas, i.e., areas with increased levels of information to concentrate the rendering effort. Convergence to the proper visible solution set of a pixel is a deterministic operation which exhibits quadratic convergence (i.e., O(x²)). This is in contrast to point-sampling methods which are probabilistic and exhibit square-root convergence (i.e., O(x^(1/2))).
  • interval analysis methods are used to compute tight bounds on digital scene information across one or more functions or dimensions.
  • the digital scene information will typically contain dimensions of space, time, color, and opacity.
  • other dimensions such as temperature, mass or acceleration may also exist.
  • interval consistency methods are used to compute tight bounds on the visible solution set of a pixel for each dimension that is specified, and according to a further aspect of the present invention, interval consistency methods are used to compute tight bounds on the visible solution set of a hemisphere subtending a surface element in the digital scene. Integrating the visible solution set over the hemisphere for each dimension that is specified yields a solution to the irradiance function, which can be used to compute global illumination effects, such as soft shadows and blurry reflections.
  • an interval analysis subdivision is performed on the parameters of a visual scene employing a hierarchical occlusion-culling step, with an interleaved interval contraction step advantageously coupled therewith.
  • a merit function is further advantageously introduced during the contraction step so as to perform an optimal contraction, and the contracted results are then processed in an order that maximizes the hierarchical occlusion-culling step.
  • the hierarchical occlusion-culling step accelerates overall computational performance of the subdivision method by deleting geometry in the scene that can, at the earliest possible iteration in the subdivision process, be proven not to be visible. This is accomplished with a hierarchical occlusion buffer, that is, a hierarchical worst-case depth representation of the rectangular array of screen pixels, using this information to perform consistency checks at each stage of the subdivision process. If the depth interval of the current subdivision iteration is strictly greater than the corresponding value stored in the hierarchical occlusion buffer, the recursive subdivision process can be terminated. All depth intervals that become part of the visible solution set are, in turn, used to dynamically propagate the state of the hierarchical occlusion buffer. As more objects are processed, the hierarchical occlusion buffer becomes a closer and closer approximation to the true visible solution set, facilitating dramatic computational savings for scenes with a high depth complexity.
  • the interleaved interval contraction step accelerates overall computational performance by deleting the excess interval width of, for example, parametric variables at a rate and spatial frequency that is optimal for each successive iteration of the subdivision process. It also guarantees that the parametric domain of a geometric shape will be subdivided into a regular paving, maximizing coherence and improving overall speed.
  • the present invention is an ideal solution for the direct processing of highly sophisticated and complex nonlinear geometric shapes, as no linear approximations to the underlying nonlinear geometry are performed or required.
  • the subject features alone or in combination, facilitate a dramatic capacity to process increasingly large scenes in sub-linear time.
  • the hierarchical occlusion-culling process works regardless of the order individual shapes within the scene are processed, making the present invention ideal for processing scenes so large they cannot fit into computer memory.
  • increasingly large scenes that contain a high degree of occlusion can generally be processed in logarithmic time.
  • FIG. 1 is an illustration of the well-known grid technique by which an artist makes a perspective drawing of a scene
  • FIG. 2 is a block diagram of a typical computer graphic process of rendering a digital image
  • FIG. 3 is a schematic diagram of the operation of computer graphic process using an existing point-sampling process
  • FIG. 4 represents a variety of spatial notions fundamental to image synthesis
  • FIG. 5 is a representation of hemispherical coordinates, integral to global illumination assessment
  • FIGS. 6(a)-6(f) are pictorial representations of the results of using the point-sampling approach of FIG. 3;
  • FIGS. 7 (a) - (c) illustrate filtering techniques for point sampling methods
  • FIGS. 8(a)-8(f) are pictorial representations of the results of using a modified stochastic point-sampling method in the process of FIG. 3;
  • FIG. 9 is a depiction showing the tradeoff between speed of rendering and realism of an image
  • FIG. 10 is a schematic representation of the photorealistic image synthesis method of the subject invention
  • FIG. 11 is a representation as FIG. 10, wherein an exemplary system, and corresponding display are shown;
  • FIG. 12 is a static unified model language representation of the content of FIG. 10;
  • FIG. 13 is a temporal unified model language representation of the solvers of FIG. 12;
  • FIGS. 14-18 are schematic depictions of the work flow of the solvers of FIG. 13, namely, SCREEN, PIXEL, COVERAGE, DEPTH, and IMPORTANCE;
  • FIGS. 19 (a) -19 (f) are pictorial representations of the operation of an interval set technique for rendering in accordance with the present invention.
  • FIGS. 20 (a) - (c) illustrate occlusion principles, more particularly, depth of field and area masking notions
  • FIG. 21 depicts a hierarchical occlusion buffer of the method of the subject invention, more particularly, a binary tree stored as a linear array;
  • FIG. 22 illustrates an advantageous subdivision process, namely, processes implicating tile and solution vectors
  • FIG. 23 generally illustrates notions of consistency as a preferred threshold requirement of the process of FIG. 22;
  • FIG. 24 illustrates exemplary measurement techniques used in computing a merit function for the solution vector of FIG. 22;
  • FIGS. 25 (a) - (d) illustrate user selectable scaling effects in connection with solution vector contraction
  • FIG. 26 representatively illustrates solution vector contraction in relation to a merit function application.
  • FIG. 1 shows a classic example of how a renaissance artist uses the well-known grid technique to translate real world images into two-dimensional drawings.
  • An artist 20 uses a stick 22 as the reference point for the artist's viewing position.
  • the artist 20 looks through the cells 24 created by a rectangular grid of twine 26 into a scene 28 behind the grid.
  • a drawing paper 30 on which the artist 20 will render a two-dimensional representation of the scene 28 is divided into the same number of rectangular cells 32 as there are cells 24. The artist carefully copies only what is seen in a given cell 24 in the grid of twine 26 onto the corresponding cell 32 on the paper 30.
  • FIG. 2 shows the overall process of how a computer graphics system 40 turns a three dimensional digital representation of a scene of a scene database 42 into multiple two-dimensional digital images 44.
  • the digital graphics system 40 divides an image 44 to be displayed into thousands of pixels in order to digitally display two-dimensional representations of three dimensional scenes.
  • a typical computer generated image used by the modern motion picture industry, for example, is formed of a rectangular array of pixels 1,920 wide and 1,080 high.
  • a modeler 41 defines geometric models 43 for each of a series of objects in a scene.
  • a graphic artist 45 adds light, color and texture features 46 to geometric models 43 of each object, and an animator 47 then defines a set of motions and dynamics 48 defining how the objects will interact with each other and with light sources in the scene. All of this information is then collected and related in the scene database 42.
  • a render farm e.g., a network 49 comprised of multiple servers 51 then utilizes the scene database 42 to perform the calculations necessary to color in each of the pixels 50 in each frame 44 that are sequenced together to create the illusion of motion and action of the scene.
  • a pixel may only be assigned a single color.
  • the system 40 of FIG. 2 simulates the process of looking through a rectangular array of pixels into the scene from the artist's viewpoint.
  • Current methodology uses a ray 62 that starts at the viewing position 60 and shoots through a location within pixel 50. The intersection of the ray with the pixel is called a point sample. The color of each point sample of pixel 50 is computed by intersecting this ray 62 with objects 64 in the scene. If several points of intersection exist between a ray 62 and objects in the scene, the visible intersection point 66 is the intersection closest to the origin of the viewing position 60 of the ray 62. The color computed at the visible point of intersection 66 is assigned to the point sample. If a ray does not hit any objects in the scene, the point sample is simply assigned a default "background" color. The final color of the pixel is then determined by filtering a neighborhood of point samples, as sketched below.
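  • The following is a minimal sketch (added for illustration; not the patent's implementation) of the point-sampling loop just described, assuming a scene of spheres; Sphere, trace_pixel and the flat per-object color are invented stand-ins:

      import math

      class Sphere:
          """Illustrative scene object: a center, a radius and a flat color."""
          def __init__(self, center, radius, color):
              self.center, self.radius, self.color = center, radius, color

          def intersect(self, origin, direction):
              """Distance along a unit-length ray to the nearest hit, or None."""
              px, py, pz = (origin[i] - self.center[i] for i in range(3))
              b = 2.0 * (px * direction[0] + py * direction[1] + pz * direction[2])
              c = px * px + py * py + pz * pz - self.radius ** 2
              disc = b * b - 4.0 * c
              if disc < 0.0:
                  return None                      # ray misses the sphere
              t = (-b - math.sqrt(disc)) / 2.0
              return t if t > 0.0 else None

      def trace_pixel(scene, origin, direction, background=(0, 0, 0)):
          """One point sample: the visible intersection is the one closest
          to the origin of the ray; a miss returns the background color."""
          nearest, color = float("inf"), background
          for obj in scene:
              t = obj.intersect(origin, direction)
              if t is not None and t < nearest:
                  nearest, color = t, obj.color
          return color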

SCENE OBJECTS AND THEIR REPRESENTATION

  • an initial untransformed primitive is presented having unit dimensions, and is most conveniently positioned and aligned in its own local coordinate/model system.
  • a primitive sphere would have a radius of one with its center at the origin of its local/model coordinate system; the modeler would then specify a new center and radius to create the desired specific instance of a sphere in the world coordinate system (i.e., space) or the scene (i.e., camera space).
  • Characteristic of the explicit equation is that it only has one result value for each set of input coordinates.
  • any point on the "object" may be computed by plugging in values for the parameters. It is easy to generate a few points that are on the surface and then, as heretofore done, approximate the rest of the surface by linear approximation or some other iterative process (i.e., tessellation).
  • because the system effectively converts a two-dimensional pair of parametric coordinates into three dimensions, the surface has a natural two-dimensional coordinate system, thereby making it easy to map other two-dimensional data onto the surface, the most obvious example being texture maps.
  • a hemisphere consists of all directions in which a viewer can look when positioned at the oriented surface point: a viewer can look from the horizon all the way up to the zenith, and all around in 180°.
  • the parameterization of a hemisphere is therefore a two- dimensional space, in which each point on the hemisphere defines a direction.
  • Spherical coordinates are a useful way of parameterizing the hemisphere 72.
  • each direction is characterized by two angles φ and θ.
  • the first angle, φ, represents the azimuth, and is measured with regard to an arbitrary axis located in the tangent plane at x; the second angle, θ, gives the elevation, measured from the normal vector Nx at surface point x.
  • a direction Θ can be expressed as the pair (φ, θ).
  • a distance r along the direction Θ is added. Any three-dimensional point is then defined by three coordinates (φ, θ, r).
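  • As a concrete aid (supplied here; the source gives the convention only in words), the mapping from these hemispherical coordinates to a 3-D point might be sketched as follows, assuming the normal Nx is taken as the +z axis of a local frame:

      import math

      def direction_from_angles(phi, theta):
          """phi = azimuth in the tangent plane; theta = elevation measured
          from the normal, here assumed to be the +z axis. Returns a unit
          direction vector."""
          s = math.sin(theta)
          return (s * math.cos(phi), s * math.sin(phi), math.cos(theta))

      def point_from_angles(phi, theta, r, origin=(0.0, 0.0, 0.0)):
          """A 3-D point is the direction (phi, theta) plus a distance r."""
          d = direction_from_angles(phi, theta)
          return tuple(origin[i] + r * d[i] for i in range(3))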
  • the most commonly used unit when modeling the physics of light is radiance (L) , which is defined as the radiant flux per unit solid angle per unit projected area. Flux measures the rate of flow through a surface. In the particular case of computer graphics, radiance measures the amount of electromagnetic energy passing through a region on the surface of a sphere and arriving at a point in space located at the center of the sphere. The region on the surface of the sphere is called the solid angle.
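  • In standard radiometric notation (supplied here for clarity; the source states this definition only in words), radiance is L = d²Φ / (dω dA cos θ), where Φ is the radiant flux, ω the solid angle, A the surface area, and θ the angle between the surface normal and the direction of flow; the cos θ factor is what makes dA a projected area.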
  • FIG. 6(a) represents a single pixel containing four scene objects, with FIGS. 6(b)-(f) generally showing a point-sampling algorithm at work in furtherance of assigning the pixel a single color.
  • the color of the pixel might be some kind of amalgamation (i.e., integrated value) of the colors of the scene objects.
  • FIG. 6 (b) only a single point sample is used, and it does not intersect with any of the objects in the scene; so the value of the pixel is assigned the default background color.
  • FIG. 6(c) four point samples are used, but only one object in the scene is intersected; so the value of the pixel is assigned a color that is 75% background color and 25% the color of the intersected scene object.
  • additional point samples are used to compute better approximations (i.e., more accurate representations) for the color of the pixel.
  • two of the scene objects are not intersected (i.e., spatial aliasing: missing objects), and only in FIG. 6(f) does a computed color value of the pixel actually contain color contributions from all four scene objects.
  • point sampling cannot guarantee that all scene objects contained within a pixel will be intersected, regardless of how many samples are used.
  • each pixel is point sampled on a regular grid or matrix grid (e.g., n×m).
  • the scene is then rendered n×m times, each time with a different subpixel offset. After each subpixel rendering is completed, the results are summed and divided by n×m.
  • the subject approach need not be confined to regular n×m grids.
  • a relaxation technique can be used to automatically generate irregular super-sampling patterns for any sample count.
  • the aforementioned sampling process will create partially antialiased images that are "box filtered" (FIG. 7(a)).
  • Geometry can be sampled using a gaussian, or other sample function, in several distinct and known ways, for instance to weight the distribution of point samples, say the n×m box filtered sample of FIG. 7(a), using a gaussian distribution, and thereby achieve a weighted filtering of the n×m matrix as shown in FIG. 7(b).
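  • The n×m supersampling and filtering just described can be sketched as below; sample_color is a stand-in for tracing one subpixel ray, and the gaussian width sigma and weighting scheme are illustrative assumptions rather than the patent's method:

      import math

      def sample_color(px, py, sx, sy):
          # Stand-in: a real renderer would trace a ray through subpixel
          # offset (sx, sy) of pixel (px, py) and return its color.
          return (0.5, 0.5, 0.5)

      def supersample(px, py, n, m, gaussian=False, sigma=0.4):
          """Average an n x m grid of subpixel samples: a box filter by
          default, a gaussian-weighted filter if requested."""
          total, weight_sum = [0.0, 0.0, 0.0], 0.0
          for i in range(n):
              for j in range(m):
                  sx, sy = (i + 0.5) / n, (j + 0.5) / m     # regular grid
                  if gaussian:
                      d2 = (sx - 0.5) ** 2 + (sy - 0.5) ** 2
                      w = math.exp(-d2 / (2.0 * sigma * sigma))
                  else:
                      w = 1.0                               # box filter
                  c = sample_color(px, py, sx, sy)
                  for k in range(3):
                      total[k] += w * c[k]
                  weight_sum += w
          return tuple(t / weight_sum for t in total)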
  • FIG. 8(a) represents the single pixel of FIG. 6(a).
  • FIG. 9 depicting the trade-off between the speed of conventional local illumination (i.e., "fast,” e.g. FIG. 6(b) or 8 (b) ) and the realism of global illumination (i.e., "realistic,” that is to say, less “fast,” e.g. , FIG. 6(f) or 8 (f) ) .
  • aliasing is impossible to completely eliminate from a point-sampled image
  • an area analysis framework is instead used to more accurately represent or define the problem.
  • the mathematics involved in a global illumination model have been summarized in a high level continuous equation known as the rendering equation (i.e., a formal statement of light balancing in an environment) , with most image synthesis algorithms being viewed as solutions to an equation written as an approximation of the rendering equation (i.e., a numerical method solution approach) . That is to say, a solution to the rendering equation in the case of light falling on an image plane, is a solution of the global illumination problem.
  • the color of a pixel is determined by actually integrating the visible solution set over the area of a parameterized oriented surface, such as a pixel or hemisphere, as previously discussed.
  • interval arithmetic and interval analysis (i.e., the application of interval arithmetic to problem domains)
  • interval arithmetic and interval analysis soon lost its status as a popular computing paradigm because of its tendency to produce pessimistic results.
  • Modern advances in interval computing have resolved many of these problems, and interval researchers are continuing to make advancements, see for example Eldon Hansen and William Walster, Global Optimization Using Interval Analysis, Second Edition 2003; L. Jaulin et al., Applied Interval Analysis, First Edition 2001; and, Miguel Sainz, Modal Intervals.
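  • For orientation (a sketch added here, not taken from the source), the elementary operations of interval arithmetic, including the pessimism just mentioned, fit in a few lines; the outward rounding of bounds that a rigorous implementation requires is deliberately omitted:

      class Interval:
          """Toy interval arithmetic; a rigorous version would round lower
          bounds down and upper bounds up (outward rounding)."""
          def __init__(self, lo, hi=None):
              self.lo, self.hi = lo, (hi if hi is not None else lo)

          def __add__(self, other):
              return Interval(self.lo + other.lo, self.hi + other.hi)

          def __sub__(self, other):
              return Interval(self.lo - other.hi, self.hi - other.lo)

          def __mul__(self, other):
              p = (self.lo * other.lo, self.lo * other.hi,
                   self.hi * other.lo, self.hi * other.hi)
              return Interval(min(p), max(p))

          def intersects(self, other):
              """The consistency test used throughout: do intervals overlap?"""
              return self.lo <= other.hi and other.lo <= self.hi

      # The classic source of pessimism: x - x over [0, 1] yields [-1, 1],
      # not [0, 0], because the two occurrences of x are treated independently.
      x = Interval(0.0, 1.0)
      print((x - x).lo, (x - x).hi)   # -1.0 1.0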
  • the electronic computing device consists of one or more processing units connected by a common bus, a network or both. Each processing unit has access to local memory on the bus.
  • the framebuffer which receives information from the processing units, applies the digital image, on a separate bus, to a standard video output or, by a virtual memory apparatus, to a digital storage peripheral such as a disk or tape unit.
  • Suitable electronic computing devices include, but are not limited to, a general purpose desktop computer with one or more processors on the bus; a network of such general purpose computers; a large supercomputer or grid computing system with multiple processing units and busses; a consumer game console apparatus; an embedded circuit board with one or more processing units on one or more busses; or a silicon microchip that contains one or more sub-processing units.
  • the framebuffer is a logical mapping into a rectangular array of pixels which represent the visible solution set of the digital scene. Each pixel in the image will typically consist of red, green, blue, coverage and depth components, but additional components such as geometric gradient, a unique geometric primitive identifier, or parametric coordinates, may also be stored at each pixel.
  • the image stored in the framebuffer is a digital representation of a single frame of film or video.
  • the digital representation of a visual scene in memory is comprised of geometric primitives; geometric primitives reside in the local memory of the processing units. If there is more than one bus in the system, the geometric primitives may be distributed evenly across all banks of local memory.
  • Each geometric primitive is represented in memory as a system of three linear or nonlinear equations that map an n- dimensional parameter domain to the x, y and z domain of the digital image; alternatively, a geometric primitive may be represented implicitly by a zero-set function of the x, y and z domain of the digital image.
  • interval domains are interval functions.
  • an interval consistency method is performed on X, Y, and Z over interval values of x, y, z, t, u and v that represent the entire domain of each variable.
  • the interval domain of x, y and z will typically be the width, height, and depth, respectively, of the image in pixels, and the interval domain of t, u, and v will depend on the parameterization of the geometric primitive .
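  • A hedged sketch of such an interval evaluation follows: it bounds the screen-space extent of a sphere translating along x over its entire (u, v, t) domain, using the deliberately crude enclosure [-1, 1] for the trigonometric terms; the function names, motion model and numbers are all illustrative:

      def iadd(a, b):
          return (a[0] + b[0], a[1] + b[1])

      def imul(a, b):
          p = (a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1])
          return (min(p), max(p))

      TRIG = (-1.0, 1.0)   # crude but guaranteed enclosure of sin/cos terms

      def moving_sphere_bounds(center, radius, t_dom, speed=1.0):
          """Enclosure of X, Y, Z over the whole (u, v, t) domain for a
          sphere moving `speed` pixels per unit time along x."""
          r = (radius, radius)
          x = iadd((center[0], center[0]), imul(r, TRIG))   # cx + r*cos(v)cos(u)
          x = iadd(x, imul((speed, speed), t_dom))          # ... + speed*t
          y = iadd((center[1], center[1]), imul(r, TRIG))   # cy + r*cos(v)sin(u)
          z = iadd((center[2], center[2]), imul(r, TRIG))   # cz + r*sin(v)
          return x, y, z

      # Whole-domain bounds for a sphere of radius 50 at (360, 240, 0.5),
      # motion-blurred over t in [0, 5]:
      print(moving_sphere_bounds((360.0, 240.0, 0.5), 50.0, (0.0, 5.0)))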
  • with reference to FIGS. 10-12, the subject photorealistic image synthesis process is generally shown, with particular emphasis and general specification of the methodology of the subject interval consistency approach, in furtherance of photorealistic image synthesis processing, outlined in FIG. 12, which utilizes Unified Modeling Language (UML).
  • as shown in FIG. 12, central to the process are a plurality of interval consistency solvers.
  • Operatively and essentially linked to the interval consistency solvers is a system input, exemplified in FIG. 10 by a series of generic parametric equations, each function having two or more variables, for example the arguments t, u, and v as shown, and as representatively illustrated in FIG. 11, wherein the system is a sphere, the x-y-z functions being parameterized in t, u, v.
  • the system need not be limited to parametric expressions, which have the greatest utility and are most challenging/problematic; other geometric primitives, or alternate system expressions, are similarly contemplated and amenable to the subject methodology and process, as is to be gleaned from the discussion to this point.
  • the system can similarly render strictly mathematical formulae selectively input by a user.
  • interval consistency solvers are further reliant upon user-defined shading routines as are well known to those of skill in the art (FIGS. 10-11) .
  • the dependency is mutual, the interval consistency solvers exporting information for use or work by the shader, a valuation or assessment by the user-defined shading routines returned (i.e., imported) to the interval consistency solvers for consideration and/or management in the framework of the process.
  • An output of the interval consistency solvers is indicated as pixel data (i.e., the task of the interval consistency solvers is to quantitatively assign a quality or character to a pixel) .
  • the pixel data output is ultimately used in image synthesis or reconstruction, vis-a-vis forwarding the quantitatively assigned pixel quality or character to a display in furtherance of defining (i.e., forming) a 2-D array of pixels. For the parameterized system input of FIG. 11, a 2-D array of pixels, associated with a defined set of intervals, is illustrated.
  • the relationship and interrelationships between the SOLVER, INPUT, CALLBACKS, and OUTPUT are defined, and will be generally outlined; further, the relationships between and among the several solvers, e.g., SCREEN, PIXEL, COVERAGE, DEPTH and IMPORTANCE, are defined in the figures subordinate thereto, namely FIGS. 13-18, and will be subsequently outlined.
  • solvers e.g., SCREEN, PIXEL, COVERAGE, DEPTH and IMPORTANCE
  • the solver, more particularly the most preferred components thereof, namely SCREEN, PIXEL, COVERAGE, DEPTH, and IMPORTANCE, are shown in relation to the input (i.e., dim and system, that is to say, a geometric function), callbacks (i.e., shader), and output (i.e., pixel data and display).
  • the interrelationships between the individual most preferred elements of constituents of the solver, and the general temporal hierarchy between and among each, as well as their relationships between the callbacks (i.e., the shader) and the output (i.e., the display) are schematically shown in FIG. 12.
  • the screen solver effectively conducts an analysis of the screen (e.g., FIG. 3) or "image plane" of the camera space of FIG. 4, essentially performing a set inversion in x, y.
  • the objective or job of the screen solver is a preliminary one, namely, to chop the x-y screen into x-y subunits, effectively "stopping" upon achieving (i.e., identifying) a unit (i.e., area) commensurate or corresponding to a pixel (see, e.g., FIGS. 19(b)-19(f), wherein the chopping of a pixel is illustrated rather than that of the image plane, the chopping of the image plane being a preliminary step or prerequisite to chopping the pixel area).
  • SCREEN is a point from which parallel processing is pursued; further desirable for such purposes is PIXEL, as will become readily apparent as the discussion progresses.
  • chopping of the x-y image plane begins with an initial step analogous to that illustrated in FIG. 19 (b).
  • the idea is to parse the x-y image plane into subunits that dimensionally equate to a pixel.
  • if initial chopping yields a subdivided x-y area more extensive than a pixel, more chopping is conducted, namely a preferential chopping depending upon the nature of the x-y image plane subunit (i.e., a rectangle): in the event that the subunit is landscape, the x dimension is further split; in the event that the subunit is portrait, the y dimension is then split.
  • the pixel solver depicted in FIG. 15, is essentially a liaison between screen and the other solvers, acting as a synchronization point and performing a housekeeping function.
  • PIXEL seeks an answer to the question: is the nature of the x-y interval corresponding to a pixel area, and thereby the t, u, v solutions associated therewith, such that the shader has been invoked (i.e., color and opacity, for example, have been assigned or designated)? If the shader has been invoked, by calling upon the coverage solver, no further parsing of the x-y space (e.g., FIGS. 19(b)-19(f)) is required, and the x-y pixel data is sent to the display.
  • the coverage solver essentially replicates the iterations of SCREEN, based upon a user defined limit epsilon (eps) .
  • COVERAGE seeks to delimit, via the retention of contributing t, u, v aspects based upon the user specified chop area "eps," those portions (i.e., areas) of the object surface within the pixel subunit (again, see FIGS. 19(b)-19(f)).
  • the next procedural task is a consideration of depth (i.e., assessment of Z (t, u, v) of the parametric system with a fixed or set x and y) .
  • the depth solver is essentially doing the job of FIG. 17 (a) . More particularly, DEPTH initially ascertains where in the z dimension, ultimately from the image plane (see FIG. 4 camera space) , does the object surface, heretofore defined in x, y, t, u, v aspects, first appear or reside (i.e., in which depth cell), and thereafter step into space, via iterative cells, until the x, y, t, u, v object surface is no longer present in a cell (i.e., cell X of FIG. 17 (a) ) .
  • the depth variable (more accurately, depth function) is initialized for all depth space, namely set to the infinite interval z∞, i.e., set to an interval at an infinite distance from the viewer.
  • t, u, v contraction begins in the depth field (z∞).
  • there is a trivial accept/reject query as to whether there is in fact a depth component of the x-y parameterization, with searching commencing thereafter (z search).
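  • The depth search can be sketched roughly as follows; surface_in_cell is a stand-in for the per-cell accept/reject test (with its attendant t, u, v contraction) described above, and the fixed cell count is an illustrative simplification of the recursive scheme:

      def walk_depth_cells(surface_in_cell, z_near=0.0, z_far=1.0, cells=64):
          """Step through z cells from the viewer outward; stop at the first
          empty cell after the surface has been entered (cell "X")."""
          step = (z_far - z_near) / cells
          occupied, entered = [], False
          for k in range(cells):
              cell = (z_near + k * step, z_near + (k + 1) * step)
              if surface_in_cell(cell):
                  occupied.append(cell)
                  entered = True
              elif entered:
                  break               # surface no longer present: walked off
          return occupied             # empty list = trivial reject

      # Example: a surface occupying depths [0.25, 0.40]
      cells = walk_depth_cells(lambda c: c[0] <= 0.40 and 0.25 <= c[1])
      print(len(cells))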
  • the importance solver (i.e., the t, u, v chopper wherein a set inversion is executed in t, u, v so as to contract same)
  • the shader is called upon, and it is necessary to next assess if the shader was invoked.
  • if so, the output of the shader is accumulated into the importance sums and the depth parsing continues in furtherance of accounting for all z components of the x-y object surface; if not, steps, on a cell-by-cell basis, are "walked off."
  • while the parsing or chopping of z space has been described as a serial or loop-type progression, it is certainly amenable to recursive splitting, as in the case of the x-y space.
  • via the importance solver, as detailed in FIG. 18, t, u, v are to be narrowed as much or as finely as possible.
  • the function of the importance solver is to fit or optimally match the t, u, v with the x, y, z, in a way that is overreaching, not underreaching.
  • φ, θ and r represent the parametric domain of the hemisphere as previously outlined for integration, the t, u, and v representing geometric primitives in the digital scene. If the shading routine performs this procedure recursively, very accurate bounds on the radiance function for a pixel can be computed. It should be noted, and/or again emphasized, that to the extent that the subject description has only used a 3-dimensional parameter domain as an example, the method described herein works for any n-dimensional parameter domain. For example, additional parameters such as temperature, mass or pressure can be used to specify the manifold that appears in the x, y and z domain of the image. This allows more complex and accurate simulations to be represented in the digital scene and applied to the rendered image.
  • a visual scene is comprised of geometric primitives (e.g., explicit, implicit, and/or parametric equations representing at least a portion of a surface of a scene object).
  • each geometric primitive is comprised of two parametric functions, but need not be so limited: one is a geometric shape function and the other is a shading function.
  • the variables u and v are the intrinsic parameters of the surface and t is the variable of time.
  • the function P maps the R³ domain of (u, v, t) to the R³ domain of (x, y, z), wherein the x, y and z variables are coordinates in a rectangular array of pixels comprising the output image.
  • the valid domain of the u, v and t variables can be any arbitrary subset of R³ and will depend entirely on the function P.
  • P and S are both interval functions.
  • the heretofore described preferred method involves a subdivision step of the (x, y, z) domain followed by a contraction of the (u, v, t) domain, with the subject advantageous, non-limiting embodiment following the general concept while utilizing bisection to perform contraction on the (u, v, t) domain.
  • preferably, a merit function is used to choose the optimal dimension to split.
  • a merit function is used to determine whether bisection along the u, v or t dimension will be more likely to yield optimal results. This prevents wasted iterations of contraction that can occur when scaling in the map from (u, v, t) to (x, y, z) is highly- nonlinear or irregular.
  • An improvement over the methodology of FIGS. 13 et seq., among other things, lies in the utilization of a hierarchical occlusion culling step or process which essentially deletes scene geometry that can, at the earliest possible iteration in a subdivision process, be proven not to be visible.
  • occlusion culling is germane in connection with processing the x-y or area dimension for area or coverage mask assessment, and/or with the processing of the z dimension for depth of field assessments.
  • in FIGS. 20(a)-(c), the subject occlusion notions are illustrated and briefly noted.
  • depth of field occlusion is manifest, for example, in the relatively distal image planes containing structures 80, or shore line elements 82.
  • significant savings in computational time can be gained since b2 can be deleted, and further subdivision and contraction steps can thereby be completely avoided.
  • the values of x and y are measured in pixels, and z is typically normalized to lie within the range [0,1], mapping to the near and far clipping planes, respectively, of the viewing volume.
  • the hierarchical occlusion buffer is a dynamically-updated data structure that stores the most current pessimistic variable estimate, e.g., depth, for each tile in the paving of the x and y dimensions of the screen.
  • the implementation requires the use of a hierarchical data structure which can be efficiently accessed and updated. Many choices exist, each with particular advantages and disadvantages.
  • the preferred embodiment uses a binary tree 90 stored in virtual memory as a linear array 92. While the subsequent discussion is directed to depth assessment, the previously noted area assessment follows the general format.
  • the root item of the tree is stored at array index 1.
  • an array index i representing a tile 94 (e.g., "4")
  • the left and right child of the binary tree is stored at array index 2i (i.e., "8") and 2i + 1 (i.e., "9"), respectively.
  • the parent of an array index i is floor(i/2), e.g., the parent of index "4" is 2. All of these computations can be performed using a combination of integer shift and bitwise-or operations. Because of the cheap and easy addressing scheme, pointers between parent and children do not need to be explicitly stored in memory.
  • the only storage required for a tree item is, in the case of a depth of field assessment, a double-precision floating point value representing the worst-case depth information of a tile 94 (cf. a binary flag/bit, true (1)/false (0), in the context of a coverage mask assessment).
  • the binary tree is represented in memory as a linear array of double-precision depth values.
  • a downside to this scheme is that for image dimensions which are not exact powers of two, some array indices will not be used, wasting storage. However, the speed and simplicity of the method generally outweigh the wasted memory, which in practice turns out to be rather nominal.
  • indices into the linear array are computed on the fly. This means that reading a value from the hierarchical occlusion buffer is a simple indexed virtual memory access. Updating the state of the array is only slightly more complicated. In general, updates will be initiated once subdivision has reached the leaf nodes of the tree. The parent of the leaf is then set to the maximum depth of its children. This process is repeated recursively, and can be implemented efficiently as a loop, bit-shifting the index of the original leaf item to walk up the tree and propagate changes into the linear array. Of course, the propagation can terminate early if the depth value of a parent is greater-than or equal-to the maximum depth of its children. The entire array is generally initialized so that all tree items represent a depth value which maps to the far clipping plane or beyond, for example, 1 or ∞.
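  • A compact sketch of this linear-array scheme follows, assuming a power-of-two number of leaf tiles; class and method names are invented for illustration:

      class OcclusionBuffer:
          """Binary tree of worst-case depths stored as a linear array:
          root at index 1, children of i at 2*i and 2*i + 1, parent at
          i >> 1 (i.e., floor(i/2))."""
          def __init__(self, leaves, far=1.0):
              # every item starts at (or beyond) the far clipping plane
              self.depth = [far] * (2 * leaves)

          def worst_case(self, path):
              """Indexed read of the pessimistic depth estimate for a tile."""
              return self.depth[path]

          def update_leaf(self, path, z_upper):
              """Store a visible solution's upper z bound at a leaf, then
              walk up the tree: a parent is the max (worst case) of its
              two children; stop early once a parent already reflects it."""
              if z_upper < self.depth[path]:
                  self.depth[path] = z_upper
                  i = path >> 1
                  while i >= 1:
                      worst = max(self.depth[2 * i], self.depth[2 * i + 1])
                      if self.depth[i] <= worst:
                          break
                      self.depth[i] = worst
                      i >>= 1

      buf = OcclusionBuffer(leaves=4)      # leaves occupy indices 4..7
      buf.update_leaf(4, 0.3)              # one tile occluded beyond 0.3
      print(buf.worst_case(1))             # root stays 1.0: others still open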
  • the tile vector 96 is comprised of the two interval variables x and y, each initialized so as to contain the entire domain, measured in pixels, of the screen. For example, if the screen is 720 pixels wide and 480 pixels high, then the x and y values of the tile vector are initialized to [0,720] and [0,480], respectively.
  • the tile vector also contains a variable, path, which is an index into the linear array representation of the hierarchical occlusion buffer (FIG. 21) .
  • the tile comprising the entire screen is always associated with the root item of the binary tree (FIG. 21), and so the path variable is initialized to 1.
  • the u, v and t variables are initialized to intervals containing each of their respective domains. The particular intervals will depend on the function P. For example, if P is the function of a moving sphere,
  • u and v would most likely be initialized to [-π, π] and [-π/2, π/2], respectively, and the t variable could be initialized to [0,5] if it was desired to have the sphere moving (motion-blurred) 5 pixels to the right.
  • the x, y and z values of the solution vector would then be initialized to the intervals obtained by evaluating P over these parameter domains.
  • the last member of the solution vector, axis, is a flag that indicates the next dimension to be bisected during the contraction step. It will be updated by the merit function during processing, but for starters, it is sufficient to initialize this flag such that the u dimension will be selected next for bisection.
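  • In code, the initialization just described might look as follows; the dataclass layout is an assumption for illustration, and the x, y, z enclosures are left as placeholders to be filled by an interval evaluation of P:

      import math
      from dataclasses import dataclass

      @dataclass
      class TileVector:
          x: tuple            # interval of pixel columns covered by the tile
          y: tuple            # interval of pixel rows
          path: int = 1       # index into the occlusion-buffer array (root = 1)

      @dataclass
      class SolutionVector:
          u: tuple
          v: tuple
          t: tuple
          x: tuple            # screen-space enclosure of P over (u, v, t)
          y: tuple
          z: tuple
          axis: str = "u"     # next dimension to bisect; set by merit function

      # 720 x 480 screen; moving-sphere domains as suggested in the text
      tile = TileVector(x=(0.0, 720.0), y=(0.0, 480.0))
      sol = SolutionVector(u=(-math.pi, math.pi), v=(-math.pi / 2, math.pi / 2),
                           t=(0.0, 5.0), x=(0.0, 0.0), y=(0.0, 0.0), z=(0.0, 0.0))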
  • Processing commences by first performing a consistency check 100 between the tile vector 96 and the solution vector 98, that is, the x and y interval variables of each vector are tested for intersection, see representation of FIG. 23. There must be intersection between the x and y dimensions of both vectors in order to continue. If no intersection exists, then the consistency is disjoint, meaning the solution vector can be deleted from the visible solution set of the tile vector. If the consistency check is true, processing can continue.
  • a hierarchical occlusion or occlusion-culling step 102 is performed.
  • the path variable of the tile vector is used as an index to read the worst-case depth estimate from the hierarchical occlusion buffer (FIG. 21) . If the worst-case depth estimate is strictly less-than the z value of the solution vector, it is proof that the solution vector can be deleted from the visible solution set of the tile vector. If the hierarchical occlusion-culling step fails, processing can continue.
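  • Both tests reduce to a handful of comparisons; the sketch below (names invented here) operates on plain (lo, hi) tuples:

      def overlaps(a, b):
          """Interval intersection test underlying the consistency check."""
          return a[0] <= b[1] and b[0] <= a[1]

      def consistent(tile_x, tile_y, sol_x, sol_y):
          """Disjoint x or y proves the solution vector invisible in the tile."""
          return overlaps(tile_x, sol_x) and overlaps(tile_y, sol_y)

      def occlusion_culled(worst_case_depth, sol_z):
          """Cull when the solution's entire depth interval lies strictly
          behind the tile's pessimistic depth estimate."""
          return sol_z[0] > worst_case_depth

      print(consistent((0, 8), (0, 8), (6, 12), (2, 4)))   # True: overlap
      print(occlusion_culled(0.3, (0.5, 0.7)))             # True: hidden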
  • the solution vector needs contraction if λ·a1 < a2, that is, the solution vector needs contraction if its projected screen area a2 is greater than a scaled value, i.e., λ, of the projected screen area a1 of the tile vector.
  • the scale value λ is a parameter that allows the contraction to be tuned for speed or quality. For example, if λ > 1, speed will improve and resolution will be degraded; if λ < 1, resolution will improve at the cost of more contractions on the solution vector, see, e.g., FIGS. 25(a)-(d), wherein FIG. 25(a) shows the output of the FIG. 22 process with λ equal to 1.
  • FIGS. 25(b)-(d) are as FIG. 25(a), with λ equal to 1/2, 1/4, and 1/8, respectively.
  • λ can thus be "tuned" to accommodate any desired subpixel resolution.
  • contraction of the solution vector is performed, preferably via bisection.
  • two new solution vectors are created, L and R, initialized as copies of the solution vector, B (FIG. 26 (a)), which is being bisected.
  • the dimension specified by the axis variable of the B vector is bisected; the L and R vectors are updated accordingly (see, e.g., FIG. 26(b) or FIG. 26(c), depending upon which axis has been bisected).
  • the last step is to use a merit function to determine if the axis variables of L and R should be updated or not.
  • the last step of contraction is to recursively process both L and R by going back to the first processing stage, that is, performing the consistency check, and so forth. It is most efficient, however, to consider the L or R with the smallest possible z value as higher priority, making sure it is processed first. This allows the worst-case depth values from the solution vector closest to the near clipping plane to be dynamically propagated through the system before the solution vector further away is processed. This maximizes the occlusion between L and R, allowing significant amounts of unnecessary computations to be avoided in many cases.
  • the three-way branch condition (FIG. 22) will determine that contraction is not the necessary step. In this case, the remaining choices are to subdivide the tile vector or to shade the solution vector because termination criteria have been met.
  • the termination criteria are met if the width of the x and y variables of the tile vector is less-than or equal-to some specified constant. This is usually the case when the width of x and y are the size of a pixel, but subpixel tolerances are also possible. If the termination criteria have been met, then both the tile and solution vectors are added to the visible solution set and the hierarchical occlusion buffer (FIG. 21) is updated.
  • the upper bound of the z variable of the solution vector is propagated through the linear array of depth values.
  • the path variable of the tile vector specifies at which index in the linear array the propagation will start.
  • the three-way branch condition will eventually determine that subdividing the tile vector is the next step. This condition occurs after both the contraction and termination conditions have failed.
  • subdivision of the tile vector is accomplished with bisection. Two new tile vectors are created, L and R, initialized as copies of the tile vector, B, which is being bisected. Bisection of a tile vector always occurs along the widest dimension, so the x and y values of L and R are updated accordingly. Additionally, the path variables of L and R are computed as 2i and 2i + 1, respectively, where i is the path variable of B. After L and R have both been properly updated, the last step of subdivision is to recursively process L and R by going back to the first processing stage, that is, performing the consistency check, and so forth. Once processing has completely finished, the final output is a large collection of shaded solution vectors which collectively represent the entire visible solution set of the image.
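  • Tying the stages together, the three-way branch can be sketched as a recursive driver; contract() and shade() are stand-ins for the contraction and shading machinery described above, buf follows the occlusion-buffer sketch, and the dict-of-intervals representation is an illustrative simplification:

      def width(iv):
          return iv[1] - iv[0]

      def overlaps(a, b):
          return a[0] <= b[1] and b[0] <= a[1]

      def area(x, y):
          return width(x) * width(y)

      def bisect_tile(tile):
          """Split the widest screen axis; children get paths 2i and 2i + 1."""
          x, y, i = tile["x"], tile["y"], tile["path"]
          if width(x) >= width(y):
              mid = 0.5 * (x[0] + x[1])
              kids = [dict(tile, x=(x[0], mid)), dict(tile, x=(mid, x[1]))]
          else:
              mid = 0.5 * (y[0] + y[1])
              kids = [dict(tile, y=(y[0], mid)), dict(tile, y=(mid, y[1]))]
          kids[0]["path"], kids[1]["path"] = 2 * i, 2 * i + 1
          return kids

      def process(tile, sol, buf, contract, shade, lam=1.0, eps=1.0):
          """Consistency check, occlusion cull, then the three-way branch:
          contract the solution vector, shade at termination, or subdivide
          the tile."""
          if not (overlaps(tile["x"], sol["x"]) and overlaps(tile["y"], sol["y"])):
              return                                  # disjoint: delete
          if sol["z"][0] > buf.worst_case(tile["path"]):
              return                                  # occlusion-culled
          if lam * area(tile["x"], tile["y"]) < area(sol["x"], sol["y"]):
              halves = sorted(contract(sol), key=lambda s: s["z"][0])
              for s in halves:                        # nearest half first
                  process(tile, s, buf, contract, shade, lam, eps)
          elif width(tile["x"]) <= eps and width(tile["y"]) <= eps:
              shade(tile, sol)                        # joins visible solution set
              buf.update_leaf(tile["path"], sol["z"][1])   # propagate depth
          else:
              for t in bisect_tile(tile):
                  process(t, sol, buf, contract, shade, lam, eps)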

Abstract

A system (40) is provided for visible surface determination in furtherance of photorealistic rendering in a computer graphics environment. The system includes a scene database (42) and a processor; visual characteristics of objects (64) of an image frame (44) of a scene of the scene database (42) are delimited as geometric primitives, more particularly, nonlinear functions. The processor, executing an interval analysis, accurately and deterministically ascertains, to a user-specified degree of certainty, a visible solution set of an area not exceeding a pixel dimension for a pixel (50) of an array (62) of pixels (50) that form said image frame (44). Hierarchical occlusion buffering, in combination with interleaved interval contraction, is advantageously utilized to greatly reduce processing time.

Description

SYSTEM AND METHOD OF VISIBLE SURFACE DETERMINATION IN COMPUTER GRAPHICS USING INTERVAL ANALYSIS
This is an international application filed under 35 USC §363, more particularly, a continuation-in-part of presently pending U.S. Appl. No. 10/532,907, which is the National Stage of International Appl. No. PCT/US2003/036836 filed November 17, 2003, having a claim of priority under 35 USC §119(e) of U.S. Prov. Appl. No. 60/426,763 filed November 15, 2003, the instant application claiming priority under 35 USC §365(c) of said international application, and further claiming priority under 35 USC §119(e) of U.S. Prov. Appl. No. 60/668,543 filed April 5, 2005, each and every one of the applications cited being incorporated herein by reference in its entirety.
TECHNICAL FIELD
The present invention generally relates to computer imaging or graphics, more particularly, to the field of photorealistic image synthesis utilizing interval-based techniques for integrating digital scene information in furtherance of constructing and/or reconstructing an image of the digital scene, and/or the construction of an image based solely on mathematical formulae.
BACKGROUND OF THE INVENTION
Photorealism for computer-generated scenes, that is to say, the production of a computer-generated scene that is indistinguishable from a photograph of the actual scene, as for instance, the elimination of aliasing, remains the "holy grail" for computer graphic artisans. So much so that Jim Blinn has proclaimed: "Nobody will ever solve the antialiasing problem," emphasis original, Jim Blinn, Jim Blinn's Corner: Notation, Notation, Notation, 2003, p. 166. In furtherance of a general appreciation and understanding of the single most important obstacle to photorealism, i.e., the antialiasing problem, an overview of heretofore known image synthesizing processing, beginning with the notion of rendering, must be had. Rendering is the process of reconstructing a three-dimensional visual scene as a two-dimensional digital image, with the fundamental components thereof being geometry and color. A camera that takes a photograph is one example of how a two-dimensional image of the natural three-dimensional world can be rendered. The well-known grid technique for drawing real world images is another example of how to translate real world images into two-dimensional drawings. A stick is used as the reference point for the artist's viewing position, and the artist looks through a rectangular grid of twine into a scene behind the grid. The paper the artist draws on is also divided into rectangular cells. The artist carefully copies only what is seen in a given cell in the grid of twine onto the corresponding cell on the paper.
The process of rendering a digital scene inside a computer is very similar. Where the artist creates a paper drawing, the computer creates a digital image. The artist's paper is divided into rectangular cells, and a digital image is divided into small rectangles called pixels. Unlike the rectangular cells on the artist's paper, a pixel may only be shaded with a single color. A typical computer generated image used by the modern motion picture industry is formed of a rectangular array of pixels 1,920 wide and 1,080 high. Because each pixel can only be shaded a single color, the realism of a digital image is completely determined by the total number of pixels in the image and by how accurately the computer computes the color of each pixel.
To determine the color of a pixel, a computer must "look" through the rectangular area of the pixel, much like the artist looks through a rectangular cell in the grid of twine. While the artist looks through the grid into the natural world, the computer has access to a digital scene stored in memory. The computer must determine which parts of the digital scene, if any, are present in the rectangular area of a pixel. As in the natural world, objects in the foreground of the digital scene occlude objects in the background. All non-occluded parts of the digital scene that are present in the rectangular area of a pixel belong to the "visible solution set" of the pixel. The method of finding the visible solution set of a pixel is called "visible surface determination," in contradistinction to "hidden surface determination," i.e., the process used to determine which surfaces and parts of surfaces are not visible from a select view point, for example, back face culling, viewing frustum culling, occlusion culling and contribution culling. Once visible surface determination is complete, the visible solution set can be integrated to yield a single color value that the pixel may be assigned. Many modern rendering programs sample the rectangular area (i.e., two dimensional boundary) of a pixel with points. This method, known as point sampling, is used to compute an approximate visible solution set for a pixel. A point-sample is a ray that starts at the viewing position and shoots through a location within the pixel into the scene. The color of each point sample is computed by intersecting objects in the scene with the ray, and determining the color of the object at the point of intersection. If several points of intersection exist between the ray and the objects of, or in the scene, the visible intersection point is the intersection closest to the origin of the ray. The final color of the pixel is then determined by filtering a neighborhood of point samples. A wide variety of point-sampling techniques are known and are pervasive in modern computer graphics. A broad class of algorithms, collectively called global illumination, simulates the path of all light in a scene arriving at a pixel via the visible points of intersection. For example, additional rays can be shot from each visible point of intersection into the scene; this type of global illumination algorithm is often called ray tracing (i.e., an image synthesizing technique using geometrical optics and rays to evaluate recursive shading and visibility). The intersection points of these additional rays are integrated into a single color value, which is then assigned to the visible point sample. Another class of algorithms that compute the color of a sample without the use of additional rays is called local illumination. Popular examples of local illumination are simple ray-casting algorithms, scan-line algorithms, and the ubiquitous z-buffer algorithm. It is common to find local illumination algorithms implemented in hardware because the results require less computational effort. Local illumination, however, typically does not provide the level of quality and realism found in the global illumination algorithms.
RenderMan® is the name of a software program created and owned by Pixar that allows computers to render pseudo life-like digital images. RenderMan, a point-sampling global illumination rendering system and subject of U.S. Pat. No. 5,239,624, is the only software package to ever receive an Oscar® award from the Academy of Motion Picture Arts and Sciences. RenderMan clearly represents the current state of the art in pseudo-realistic point sampling software. On the other end of the spectrum, game consoles such as the Sony PlayStation® or Microsoft X-Box® clearly do not exhibit the quality of realism found in RenderMan, but these hardware-based local illumination gaming appliances have a tremendous advantage over RenderMan in terms of speed. The realistic frames of animation produced by RenderMan take hours, even days, to compute, whereas the arcade-style graphics of gaming appliances are rendered at a rate of several frames per second. This disparity or tradeoff between speed and realism is typical of the current state of computer graphics, and its nature is due to the point-sampling techniques used in modern rendering implementations. Because each pixel can only be assigned a single color, the "realism" of a digital image is completely determined by the total number of pixels, and by how accurately a computer chooses the color of each pixel. With a point-sampling algorithm, the most common method of increasing the accuracy of the computation is to increase the number of point samples. RenderMan and ray tracing programs, for example, use many point samples for each pixel, and so the image appears more realistic. Hardware implementations like the X-Box, on the other hand, often use only a single point sample per pixel in order to render images more quickly.
Although point sampling is used almost exclusively to render digital images, a fundamental problem of point sampling theory is aliasing, caused by using an inadequate number of point samples (i.e., an undersampled signal) to reconstruct the image. When a signal is undersampled, high-frequency components of the original signal can appear as lower-frequency components in the sampled version. These high frequencies assume the alias (i.e., false identity) of the low frequencies, because after sampling these different phenomena cannot be distinguished, and visual artifacts not specified in the scene appear in the reconstruction of the image. Such artifacts appear whenever the rendering method does not compute an accurate approximation to the visible solution set of a pixel.
Aliasing is commonly categorized as "spatial" or "temporal." Common spatial alias artifacts include jagged lines or chunky edges (i.e., "jaggies") and missing objects. In spatial aliasing the artifacts are borne of the uniform nature of the pixel grid, and are independent of resolution. A "use more pixels" strategy is not curative: no matter how closely the point samples are packed, they will, in the case of jaggies, only make the jaggies smaller, and in the case of missing objects, they will inevitably miss a small object or a large object far enough away. Temporal aliasing typically manifests as jerky motion (e.g., "motion blur," namely, the blurry path left on a time-averaged image by a fast-moving object: things happen too fast for accurate recordation) or as a popping (i.e., blinking) object: as a very small object moves across the screen, it will only infrequently be hit by a point sample, appearing in the synthesized image only when hit. The essential aliasing problem is the representation of continuous phenomena with discrete samples (i.e., point sampling, for example, ray tracing).
Despite the fact that rigorous mathematical models for the cause of aliasing in point-sampling algorithms have been well established and understood for years, local and global illumination algorithms based on point sampling continue to suffer from visual artifacts due to the aliasing problem. A tremendous amount of prior art in the field of computer graphics deals explicitly with the problem of aliasing.
Increasing the number of point samples to improve realism and avoid aliasing is not a viable solution because it simply causes the aliasing to occur at higher frequencies in the image. In fact, the current literature available on computer graphics seems to indicate that point sampling techniques have reached their practical limits in terms of speed and realism.
Increasingly elaborate and sophisticated probabilistic and statistical point sampling techniques are being investigated to gain marginal improvements in the realism of global illumination. Advances in point-sampling hardware are being used to improve the speed of local illumination techniques; but even with unlimited hardware speed, the best that can be hoped for is that hardware systems will some day be able to generate images of the same quality as existing global illumination algorithms, which still suffer from aliasing problems caused by point sampling. The finger of history points to the final and inevitable conclusion that point sampling is a dead end for computer graphics. For this reason, it is perhaps ironic that J. Warnock disclosed an analytical area-sampling method in 1969 which promised to eliminate aliasing (see Warnock, J. E., A Hidden-Surface Algorithm for Computer Generated Halftone Pictures, Computer Science Department, University of Utah, TR 4-15, June 1969). In this method, a logarithmic search for visible tiles in a quadtree subdivision of a polygon is conducted such that the area of a pixel is recursively subdivided until only a small number of polygons are present, and then an analytical solution to the visible contribution of each polygon is computed. The result is a highly efficient computational method that can prevent certain types of aliasing from appearing in the image; this approach, however, like traditional incremental scan conversion, is not well suited to tiling scenes of moderate depth complexity.
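A loose Python sketch of such a Warnock-style recursive area subdivision follows; the classify, shade_exact and shade_nearest callables are hypothetical stand-ins for the analytical machinery of the original algorithm:

    # Hypothetical sketch of Warnock-style area subdivision (1969).
    # classify reports "empty", "simple" or "complex" for a region;
    # shade_exact computes the analytic visible contribution and
    # shade_nearest resolves a pixel-sized region by the closest
    # polygon; all three are assumed helpers.
    def warnock(polygons, region, classify, shade_exact, shade_nearest,
                min_size=1.0):
        status = classify(polygons, region)
        if status in ("empty", "simple"):
            shade_exact(polygons, region)      # analytic area solution
        elif region.width <= min_size and region.height <= min_size:
            shade_nearest(polygons, region)    # pixel-sized: closest wins
        else:
            for quadrant in region.split_into_four():  # quadtree recursion
                warnock(polygons, quadrant, classify, shade_exact,
                        shade_nearest, min_size)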
The computational performance of the Warnock area subdivision method was later improved by Greene, who introduced a hierarchical occlusion-culling step to the process (see, e.g., Hierarchical Polygon Tiling with Coverage Masks, Apple Computer, 1996) and subsequently refined that methodology as evidenced by one or more of at least U.S. Pat. Nos. 6,480,205; 6,636,215; 6,646,639; and 6,89,489. Even with Greene's improvements, the method is limited to the processing of polygons and linear equations, prohibiting direct support for nonlinear geometry. As a consequence, nonlinear geometry must first be tessellated into linear approximations. Because the tessellation process is itself a form of point sampling, this allows aliasing to occur in the final image. It is not an ideal solution.
By contrast and by way of departure from heretofore known approaches, interval analysis provides a solid platform to solve the nonlinear equations that are out of reach of, and hardly contemplated by, Warnock, Greene and others following their lead. In Applicant's published international application WO 2004/046881, an area subdivision method introduces, among other things, the novel concept of using interval analysis to attack and solve directly the required nonlinear equations. The result is an area subdivision method that employs interval analysis to solve the visible surface determination problem to any desired measure of accuracy. This extends all the benefits of the original Warnock subdivision method to the general class of nonlinear functions.
While tremendous advances have been made in the realism and speed by which two-dimensional images of digital scenes are rendered, there is a continuing need to further improve the speed and realism of digital image reconstruction in furtherance of photorealistic image synthesis. Furthermore, there exist opportunities to improve even further the interval-based area subdivision method, primarily in terms of computational performance. It is therefore the object of the present invention to introduce a new and improved visible surface determination system and methodology.
SUMMARY OF THE INVENTION
The present invention provides a system and attendant methodology for digital image reconstruction using interval analysis. This image reconstruction system, rather than refining the work of others, abandons that work. That is to say, heretofore known point sampling techniques and point arithmetic are discarded in favor of the subject approach to reconstructing two-dimensional digital images of a three dimensional representation of a visual scene or an actual, tangible three dimensional object. Integral to the subject invention is the use of an area analysis framework instead of conventional point sampling to compute, accurately and deterministically, the visible solution set of a pixel and to integrate its color.

Preferred embodiments of the subject invention, more particularly the system, provide several advantages over conventional rendering techniques. Most modern ray tracing algorithms only support geometric primitives of a low degree, such as planes, triangles, spheres, and quadrics, because the methods commonly used to find the visible point of intersection between a point-sampling ray and the primitive are numerically unstable for higher-degree functions. Moreover, because it has been heretofore impossible to compute in an accurate and deterministic manner the visible solution set of a pixel using traditional point-sampling techniques, undesirable aliasing must appear on images reconstructed using conventional point sampling. Prior strategy has been the reduction of aliasing, not the elimination of aliasing. In the context of the subject invention, the visible solution set of a pixel is determined through interval analysis, since traditional point-based numerical analysis cannot "solve" such computational problems to a degree of imperceptible aliasing. Unlike point sampling techniques, interval analysis guarantees accuracy to a user-specified level of detail when computing the color of a pixel. In fact, it is possible to eliminate aliasing to the extent of hardware precision or to any user-defined precision above that threshold. Taken together, these benefits facilitate a new framework of scalable rendering, where speed and realism are no longer competing forces in the rendering process, and users can easily adjust the parameters of the rendering algorithm to define a ratio of speed and realism that suits their specific needs.
Given the above advantages, preferred embodiments of the system may be used with any general form of projection. For example, by representing RGB (red-green-blue) coloration as three intervals rather than three points, a process of automatic adaptive resolution is possible, i.e., the dimension of a more and more refined rendering interval can be compared to the fidelity of the rendering machine to adaptively stop rendering smaller intervals once an optimum presentation has been obtained. The system is a massively parallel and scalable rendering engine, and therefore useful for distributing across servers in a network grid to optimally create more realistic images from a scene database, or for implementing in a graphics accelerator for improved performance. Moreover, still images or video images may be more efficiently broadcast to remote clients because the interval analysis methods operate directly on the mathematical functions that describe a scene, as opposed to a piecewise geometric or tessellated model of the scene, thus providing efficient data compaction for transmission over limited-bandwidth connections.
With preferred embodiments of the system, an entire scene can be loaded on each computer connected to an output device and an image synchronously displayed, either by sequentially displaying data from each computer, by displaying disjoint pieces from each computer, or by a combination of both. The system can casually seek edges of objects or transitional areas, i.e., areas with increased levels of information, so as to concentrate the rendering effort. Convergence to the proper visible solution set of a pixel is a deterministic operation which exhibits quadratic convergence (i.e., O(x²)). This is in contrast to point-sampling methods, which are probabilistic and exhibit square-root convergence (i.e., O(x^(1/2))).
As suggested, interval analysis methods are used to compute tight bounds on digital scene information across one or more functions or dimensions. For example, the digital scene information will typically contain dimensions of space, time, color, and opacity. In addition, other dimensions, such as temperature, mass or acceleration, may also exist. According to one aspect of the present invention, interval consistency methods are used to compute tight bounds on the visible solution set of a pixel for each dimension that is specified, and according to a further aspect of the present invention, interval consistency methods are used to compute tight bounds on the visible solution set of a hemisphere subtending a surface element in the digital scene. Integrating the visible solution set over the hemisphere for each dimension that is specified yields a solution to the irradiance function, which can be used to compute global illumination effects, such as soft shadows and blurry reflections.
In yet a further embodiment of the subject invention, an interval analysis subdivision is performed on the parameters of a visual scene employing a hierarchical occlusion-culling step, with an interleaved interval contraction step advantageously coupled therewith. A merit function is further advantageously introduced during the contraction step so as to perform an optimal contraction, and the contracted results are then processed in an order that maximizes the hierarchical occlusion-culling step.
The hierarchical occlusion-culling step accelerates overall computational performance of the subdivision method by deleting geometry in the scene that can, at the earliest possible iteration in the subdivision process, be proven not to be visible. This is accomplished with a hierarchical occlusion buffer, that is, a hierarchical worst-case depth representation of the rectangular array of screen pixels, with this information used to perform consistency checks at each stage of the subdivision process. If the depth interval of the current subdivision iteration is strictly greater than the corresponding value stored in the hierarchical occlusion buffer, the recursive subdivision process can be terminated. All depth intervals that become part of the visible solution set are, in turn, used to dynamically propagate the state of the hierarchical occlusion buffer. As more objects are processed, the hierarchical occlusion buffer becomes a closer and closer approximation to the true visible solution set, facilitating dramatic computational savings for scenes with a high depth complexity.
The interleaved interval contraction step accelerates overall computational performance by deleting the excess interval width of, for example, parametric variables at a rate and spatial frequency that is optimal for each successive iteration of the subdivision process. It also guarantees that the parametric domain of a geometric shape will be subdivided into a regular paving, maximizing coherence and improving overall speed. The present invention is an ideal solution for the direct processing of highly sophisticated and complex nonlinear geometric shapes, as no linear approximations to the underlying nonlinear geometry are performed or required.
As such, the subject features, alone or in combination, facilitate a dramatic capacity to process increasingly large scenes in sub-linear time. The hierarchical occlusion-culling process works regardless of the order individual shapes within the scene are processed, making the present invention ideal for processing scenes so large they cannot fit into computer memory. In particular, increasingly large scenes that contain a high degree of occlusion can generally be processed in logarithmic time.
Additional items, advantages and features of the various aspects of the present invention will become apparent from the description of its preferred embodiments, which description should be taken in conjunction with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is an illustration of the well-known grid technique by which an artist makes a perspective drawing of a scene;
FIG. 2 is a block diagram of a typical computer graphic process of rendering a digital image;
FIG. 3 is a schematic diagram of the operation of computer graphic process using an existing point-sampling process;
FIG. 4 represents a variety of spatial notions fundamental to image synthesis;
FIG. 5 is a representation of hemispherical coordinates, integral to global illumination assessment;
FIGS. 6(a)-6(f) are pictorial representations of the results of using the point-sampling approach of FIG. 3;
FIGS. 7(a)-(c) illustrate filtering techniques for point sampling methods;
FIGS. 8(a)-8(f) are pictorial representations of the results of using a modified stochastic point-sampling method in the process of FIG. 3;
FIG. 9 is a depiction showing the tradeoff between speed of rendering and realism of an image;
FIG. 10 is a schematic representation of the photorealistic image synthesis method of the subject invention;
FIG. 11 is a representation as FIG. 10, wherein an exemplary system and corresponding display are shown;
FIG. 12 is a static unified model language representation of the content of FIG. 10;
FIG. 13 is a temporal unified model language representation of the solvers of FIG. 12;
FIGS. 14-18 are schematic depictions of the work flow of the solvers of FIG. 13, namely, SCREEN, PIXEL, COVERAGE, DEPTH, and IMPORTANCE;
FIGS. 19(a)-19(f) are pictorial representations of the operation of an interval set technique for rendering in accordance with the present invention;
FIGS. 20(a)-(c) illustrate occlusion principles, more particularly, depth of field and area masking notions;
FIG. 21 depicts a hierarchical occlusion buffer of the method of the subject invention, more particularly, a binary tree stored as a linear array;
FIG. 22 illustrates an advantageous subdivision process, namely, processes implicating tile and solution vectors;
FIG. 23 generally illustrates notions of consistency as a preferred threshold requirement of the process of FIG. 22;
FIG. 24 illustrates exemplary measurement techniques used in computing a merit function for the solution vector of FIG. 22;
FIGS. 25(a)-(d) illustrate user selectable scaling effects in connection with solution vector contraction; and,
FIG. 26 representatively illustrates solution vector contraction in relation to a merit function application.
DETAILED DESCRIPTION OF THE INVENTION

The present invention abandons existing point sampling techniques, and instead provides a system and attendant methodology for reconstructing two-dimensional digital images of a three dimensional digital representation of a visual scene, a process referred to as rendering, so as to photorealistically synthesize images. In furtherance of detailed invention development, a rendering framework is preliminarily outlined.

RENDERING FRAMEWORK

FIG. 1 shows a classic example of how a renaissance artist uses the well-known grid technique to translate real world images into two-dimensional drawings. An artist 20 uses a stick 22 as the reference point for the artist's viewing position. The artist 20 looks through the cells 24 created by a rectangular grid of twine 26 into a scene 28 behind the grid. A drawing paper 30 on which the artist 20 will render a two-dimensional representation of the scene 28 is divided into the same number of rectangular cells 32 as there are cells 24. The artist carefully copies only what is seen in a given cell 24 in the grid of twine 26 onto the corresponding cell 32 on the paper 30.
This grid technique is the real world analogy to the computer graphic process that forms the basis of modern day digital graphics. FIG. 2 shows the overall process of how a computer graphics system 40 turns a three dimensional digital representation of a scene of a scene database 42 into multiple two-dimensional digital images 44. Just as the artist uses the cells 24 and 32 to divide the representation of an entire scene 28 (FIG. 1) into several smaller and more manageable components, the digital graphics system 40 divides an image 44 to be displayed into thousands of pixels in order to digitally display two-dimensional representations of three dimensional scenes. A typical computer generated image used by the modern motion picture industry, for example, is formed of a rectangular array of pixels 1,920 wide and 1,080 high.
In a conventional digital animation process, for example, a modeler 41 defines geometric models 43 for each of a series of objects in a scene. A graphic artist 45 adds light, color and texture features 46 to geometric models 43 of each object, and an animator 47 then defines a set of motions and dynamics 48 defining how the objects will interact with each other and with light sources in the scene. All of this information is then collected and related in the scene database 42.
A render farm, e.g., a network 49 comprised of multiple servers 51 then utilizes the scene database 42 to perform the calculations necessary to color in each of the pixels 50 in each frame 44 that are sequenced together to create the illusion of motion and action of the scene. Unlike the rectangular cells on the artist's paper, a pixel may only be assigned a single color.
With reference to FIG. 3, like the artist via the viewing position 22, the system 40 of FIG. 2 simulates the process of looking through a rectangular array of pixels into the scene from the artist's viewpoint. Current methodology uses a ray 62 that starts at the viewing position 60 and shoots through a location within pixel 50. The intersection of the ray with the pixel is called a point sample. The color of each point sample of pixel 50 is computed by intersecting this ray 62 with objects 64 in the scene. If several points of intersection exist between a ray 62 and objects in the scene, the visible intersection point 66 is the intersection closest to the origin of the viewing position 60 of the ray 62. The color computed at the visible point of intersection 66 is assigned to the point sample. If a ray does not hit any objects in the scene, the point sample is simply assigned a default "background" color. The final color of the pixel is then determined by filtering a neighborhood of point samples.

SCENE OBJECTS AND THEIR REPRESENTATION
Prior to any further development or discussion of traditional point-sampling methods, some fundamental understanding of a scene object, more particularly its quality or character, will facilitate an appreciation of the problem at hand. The curves and/or surfaces that are used in computer graphics are all derived from various types of mathematical equations. Plugging values into the variables of an equation identifies which points are on the object; all other points are not. For the most part, primitives, that is to say simple solid shapes, have a position and orientation initially set within the primitive's local coordinate system (i.e., model space), as appreciated by reference to FIG. 4 wherein there are depicted the primary notions of space, i.e., model space 66, world space 68 and camera space 70. In some modeling systems, an initial untransformed primitive is presented having unit dimensions, and is most conveniently positioned and aligned in its own local coordinate/model system. For example, a primitive sphere would have a radius of one with its center at the origin of its local/model coordinate system; the modeler would then specify a new center and radius to create the desired specific instance of a sphere in the world coordinate system (i.e., space) or the scene (i.e., camera space).
As to the surfaces of scene objects, there are three types of equations which provide the basis for computer graphics geometric primitives: explicit, implicit, and parametric. An explicit equation is one that evaluates one coordinate of the position of the object from the values of the other coordinates (e.g., z = 2x + y is the explicit equation for a plane). Characteristic of the explicit equation is that it only has one result value for each set of input coordinates.
An implicit equation is one in which certain values of input coordinates satisfy an equation: surface(x, y, z) = 0. Points that satisfy the equation are "on" the primitive, while others that do not are "not on" the primitive. The points that are generated by complex implicit equations are not always connected; they can be isolated points or small isolated regions that satisfy the equation. A parametric surface is a surface generated by a system of equations with two or more variables: p = surface(u, v). For example, a sphere may be generated by the following parametric equations:
X = cos(θ) cos(φ)
Y = sin(θ) cos(φ)
Z = sin(φ)

For the most part, the value of a parametric system is believed to be two-fold. First, because parametric equations can be evaluated directly, any point on the "object" may be computed by plugging in values for the parameters. It is easy to generate a few points that are on the surface and then, as heretofore done, approximate the rest of the surface by linear approximation or some other iterative process (i.e., tessellation). Second, because the system effectively converts a two-dimensional pair of parametric coordinates into three dimensions, the surface has a natural two-dimensional coordinate system, thereby making it easy to map other two-dimensional data onto the surface, the most obvious example being texture maps.
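By way of a brief, non-limiting Python illustration, such direct evaluation is trivial; the coarse grid of evaluated points below is exactly what a tessellator would connect into triangles:

    import math

    # Evaluate the parametric sphere equations above at (theta, phi).
    def sphere_point(theta, phi):
        x = math.cos(theta) * math.cos(phi)
        y = math.sin(theta) * math.cos(phi)
        z = math.sin(phi)
        return (x, y, z)

    # A coarse grid of surface points; connecting such points into
    # triangles (tessellation) is itself a form of point sampling.
    grid = [sphere_point(i * 2.0 * math.pi / 8, j * math.pi / 4 - math.pi / 2)
            for i in range(8) for j in range(5)]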
Returning again now to the notion of ray tracing, in traditional point-sampling methods a ray is defined parametrically as r(t) = at + b, wherein a and b are vectors and t is a scalar, r(t) thereby being a vector. Points on the ray are defined for t = [0, +∞], where b is the origin of the ray and a is its direction. In the general case, surfaces in the scene are represented as the zero set of a vector-valued implicit function:

G(x) = 0
Determining the visible point of intersection between the ray and the implicit function reduces to finding the smallest positive root of the univariate equation:
G(r(t)) = 0

The roots of this equation can be computed algebraically for linear, quadratic and cubic functions, and this is the reason for the ubiquitous use of planes, triangles, spheres and quadrics as the geometric primitives of choice in modern ray tracing algorithms. Numerical analysis, including bisection methods or Newton's method, must be used to find the roots of functions of higher degree. Such methods are numerically unstable for functions of high degree, and there are no point analysis methods that can guarantee a solution by finding all the roots, or even the proper roots, of such functions. This is why most modern ray tracing algorithms only support geometric primitives of low degree, and why more complex objects are tessellated into simpler geometric primitives.

In photorealistic rendering, it is desirable to work with functions defined over a hemisphere 72 centered around an oriented surface point 74 (FIG. 5). A hemisphere consists of all directions in which a viewer can look when positioned at the oriented surface point: a viewer can look from the horizon all the way up to the zenith, and all around in 360°. The parameterization of a hemisphere is therefore a two-dimensional space, in which each point on the hemisphere defines a direction.
Spherical coordinates are a useful way of parameterizing the hemisphere 72. In the spherical coordinate system, each direction is characterized by two angles, φ and θ. The first angle, φ, represents the azimuth, and is measured with regard to an arbitrary axis located in the tangent plane at x; the second angle, θ, gives the elevation, measured from the normal vector Nx at surface point x. A direction can thus be expressed as the pair (φ, θ). The values of the angles φ and θ belong to the intervals φ = [0, 2π] and θ = [0, π/2]. At this juncture, directions or points on the hemisphere have been defined. Should it be desirable to specify every three-dimensional point in space (i.e., not only points on the surface of the hemisphere), a distance r along the direction is added. Any three-dimensional point is then defined by three coordinates (φ, θ, r).

The most commonly used unit when modeling the physics of light is radiance (L), which is defined as the radiant flux per unit solid angle per unit projected area. Flux measures the rate of flow through a surface. In the particular case of computer graphics, radiance measures the amount of electromagnetic energy passing through a region on the surface of a sphere and arriving at a point in space located at the center of the sphere. The region on the surface of the sphere is called the solid angle. From a point-sampling perspective, calculating radiance is exactly equivalent to computing the entire set of visible intersection points for all rays originating at the origin of the sphere and passing through the solid angle. Since there are an infinite number of rays that subtend any given solid angle, it is clearly impossible to compute an exact value of radiance by using traditional point-sampling techniques, as it would require an infinite number of samples. Instead, practical algorithms use only a finite number of samples to compute a discrete approximation, and this provides an opportunity for aliasing in a synthesized or reconstructed image. It is precisely the fact that point-sampling algorithms do not compute an infinite number of samples that is the cause of aliasing in modern computer graphics.

THE PIXEL AND VISIBLE SOLUTION APPROXIMATION

Returning back again to the notion of point sampling, and with reference now to FIGS. 6(a)-(f), FIG. 6(a) represents a single pixel containing four scene objects, with FIGS. 6(b)-(f) generally showing a point-sampling algorithm at work in furtherance of assigning the pixel a single color. As should be readily apparent, and generally intuitive, the color of the pixel might be some kind of amalgamation (i.e., integrated value) of the colors of the scene objects. In FIG. 6(b), only a single point sample is used, and it does not intersect with any of the objects in the scene; so the value of the pixel is assigned the default background color. In FIG. 6(c), four point samples are used, but only one object in the scene is intersected; so the value of the pixel is assigned a color that is 75% background color and 25% the color of the intersected scene object. In FIGS. 6(d), 6(e) and 6(f), additional point samples are used to compute better approximations (i.e., more accurate representations) for the color of the pixel. Even with the increased number of point samples in FIG. 6(e), two of the scene objects are not intersected (i.e., spatial aliasing: missing objects), and only in FIG. 6(f) does a computed color value of the pixel actually contain color contributions from all four scene objects.
In general, point sampling cannot guarantee that all scene objects contained within a pixel will be intersected, regardless of how many samples are used.
For example, in a "super sample" operation (i.e., when using many rays to compute the color of a pixel), each pixel is point sampled on a regular grid or matrix (e.g., n×m). The scene is then rendered n×m times, each time with a different subpixel offset. After each subpixel rendering is completed, it is summed and divided by n×m. The approach need not be confined to regular n×m grids: a relaxation technique can be used to automatically generate irregular super-sampling patterns for any sample count. Ultimately, the aforementioned sampling process will create partially antialiased images that are "box filtered" (FIG. 7(a)); however, there is no reason to limit samples to the area of a single pixel. By distributing point samples in the regions surrounding each pixel center, improved, but nonetheless deficient and thereby unacceptable, antialiasing results may be obtained. Geometry can be sampled using a gaussian, or other sample function, in several distinct and known ways, for instance to weight the distribution of point samples, say the n×m box filtered sample of FIG. 7(a), using a gaussian distribution, and thereby achieve a weighted filtering of the n×m matrix as shown in FIG. 7(b). As illustrated in FIG. 7(c), when the uniform n×m matrix is abandoned in favor of a gaussian spatial distribution, and there is a homogeneity of weight with regard to the sample points, a so-called importance filtering is achieved.

Improved image synthesis has been obtained in the context of supersampling by concentrating the rays where they will do the most good (e.g., starting with five rays per pixel, namely, one at each of the pixel's corners and one through the center). However, even with such adaptive supersampling, aliasing problems nonetheless arise due to the use of regular grids (i.e., subdivisions), even though the grid is somewhat more finely, preferentially subdivided in some places. It has been found that by introducing randomness into the point-sampling process (i.e., getting rid of the grid), aliasing artifacts in a reconstructed image are disguised as "noise," which the human visual system is much less apt to perceive as objectionable (i.e., a better or improved perceptual color of the pixel is obtained with this approach; however, it is most often not any more mathematically correct). Two common algorithms that use a randomness approach are the so-called "Monte Carlo" and "stochastic point sampling" techniques. Pixar's RenderMan, for example, uses stochastic point sampling, which perturbs the position of samples within a pixel by small amounts of random displacement. Such an approach is illustrated in FIGS. 8(a)-(f), wherein FIG. 8(a) represents the single pixel of FIG. 6(a), with FIG. 9 depicting the trade-off between the speed of conventional local illumination (i.e., "fast," e.g., FIG. 6(b) or 8(b)) and the realism of global illumination (i.e., "realistic," that is to say, less "fast," e.g., FIG. 6(f) or 8(f)).
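A minimal Python sketch of the jittered (stochastic) sample placement just described follows; the n-by-n subpixel grid and the fixed seed are illustrative choices only, not features of any particular prior art system:

    import random

    # Sketch of stochastic (jittered) point sampling: one sample per
    # cell of an n-by-n subpixel grid, each perturbed by a random
    # displacement within its cell.  Aliasing energy is traded for
    # noise, which the eye tolerates better; the result is not any
    # more mathematically correct.
    def jittered_offsets(n, seed=0):
        rng = random.Random(seed)
        return [((i + rng.random()) / n, (j + rng.random()) / n)
                for i in range(n) for j in range(n)]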
ALIASING AND AREA ANALYSIS

Because aliasing is impossible to completely eliminate from a point-sampled image, an area analysis framework is instead used to more accurately represent or define the problem. The mathematics involved in a global illumination model have been summarized in a high-level continuous equation known as the rendering equation (i.e., a formal statement of light balancing in an environment), with most image synthesis algorithms being viewed as solutions to an equation written as an approximation of the rendering equation (i.e., a numerical method solution approach). That is to say, a solution to the rendering equation in the case of light falling on an image plane is a solution of the global illumination problem. Consistent with the model, the color of a pixel is determined by actually integrating the visible solution set over the area of a parameterized oriented surface, such as a pixel or hemisphere as previously discussed.
Historically, several attempts have been made to find exact solutions to this equation. For example, the initial ambition of Turner Whitted, who invented ray tracing in 1980, was to analytically compute an exact visible solution set between the solid angle of a cone through a pixel and the objects in the scene. He ultimately abandoned this approach due to the complexity of the intersection calculations, and this is how he instead arrived at the idea of using point sampling with rays as an approximation. In 1984, John Amanatides tried the same method. He successfully created an algorithm that approximated the visible solution set between the solid angle of a cone and simple algebraic scene objects, such as planes and spheres. Like Whitted, however, Amanatides could not solve the problem for arbitrarily complex objects or scenes. Even to this day, traditional point-based numerical analysis cannot solve, in general, such surface intersections. Instead, point sampling has become firmly entrenched as the preferred and de facto method of approximating the visible solution set. The problem formulation, and work to date, in this area is presented by Sung, Pearce & Wang, Spatial-Temporal Antialiasing, 2002, incorporated herein by reference.

INTERVAL ANALYSIS

The present invention, in all its embodiments, abandons point arithmetic and point-sampling techniques altogether, and instead turns to an interval analysis approach. First invented and published in 1966 by Ramon Moore, interval arithmetic is a generalization of the familiar point arithmetic. After a brief period of enthusiastic response from the technical community, interval arithmetic and interval analysis (i.e., the application of interval arithmetic to problem domains) soon lost its status as a popular computing paradigm because of its tendency to produce pessimistic results. Modern advances in interval computing have resolved many of these problems, and interval researchers are continuing to make advancements; see, for example, Eldon Hansen and William Walster, Global Optimization Using Interval Analysis, Second Edition 2003; L. Jaulin et al., Applied Interval Analysis, First Edition 2001; and Miguel Sainz, Modal Intervals.
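For orientation only, a minimal Python sketch of the interval arithmetic referred to here follows; a production implementation must additionally perform outward (directed) rounding, which is omitted for clarity:

    # Minimal interval arithmetic sketch; directed rounding omitted.
    class Interval:
        def __init__(self, lo, hi):
            self.lo, self.hi = min(lo, hi), max(lo, hi)

        def __add__(self, other):
            return Interval(self.lo + other.lo, self.hi + other.hi)

        def __mul__(self, other):
            p = (self.lo * other.lo, self.lo * other.hi,
                 self.hi * other.lo, self.hi * other.hi)
            return Interval(min(p), max(p))

    # x in [1, 2] and y in [-1, 3] imply that x*y lies in [-2, 6]:
    # the result is guaranteed to contain the true range.
    x, y = Interval(1, 2), Interval(-1, 3)
    print((x * y).lo, (x * y).hi)   # prints: -2 6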
In one embodiment of the subject invention, the electronic computing device consists of one or more processing units connected by a common bus, a network or both. Each processing unit has access to local memory on the bus. The framebuffer, which receives information from the processing units, applies the digital image, on a separate bus, to a standard video output or, by a virtual memory apparatus, to a digital storage peripheral such as a disk or tape unit. Examples of suitable electronic computing devices include, but are not limited to, a general purpose desktop computer with one or more processors on the bus; a network of such general purpose computers; a large supercomputer or grid computing system with multiple processing units and busses; a consumer game console apparatus; an embedded circuit board with one or more processing units on one or more busses; or a silicon microchip that contains one or more sub-processing units. The framebuffer is a logical mapping into a rectangular array of pixels which represent the visible solution set of the digital scene. Each pixel in the image will typically consist of red, green, blue, coverage and depth components, but additional components such as geometric gradient, a unique geometric primitive identifier, or parametric coordinates, may also be stored at each pixel. The image stored in the framebuffer is a digital representation of a single frame of film or video.
As previously discussed, the digital representation of a visual scene in memory is comprised of geometric primitives; geometric primitives reside in the local memory of the processing units. If there is more than one bus in the system, the geometric primitives may be distributed evenly across all banks of local memory. Each geometric primitive is represented in memory as a system of three linear or nonlinear equations that map an n-dimensional parameter domain to the x, y and z domain of the digital image; alternatively, a geometric primitive may be represented implicitly by a zero-set function of the x, y and z domain of the digital image. John Snyder's book, Generative Modeling for Computer Graphics and CAD: Symbolic Shape Design Using Interval Analysis, describes a compact, general-purpose method for representing geometric primitives of both kinds in the memory of a computer. Such a method is compatible with the requirements of the present invention, and it is also the preferred method.
Because the nature of such a general-purpose method of representing geometric primitives in memory has so many possible encodings, only a single representation will be used for the sake of clarity and simplicity in the remainder of this description. The example to be used is a system of nonlinear equations that map a 3-dimensional parameter domain, specified by the parametric variables t, u, and v, to the x, y and z domain of the digital image. The resulting manifold is a parametric surface of the form
X(t, u, v) = x
Y(t, u, v) = y
Z(t, u, v) = z,
wherein said system of equations comprises interval functions. To compute pixels in the framebuffer, an interval consistency method is performed on X, Y, and Z over interval values of x, y, z, t, u and v that represent the entire domain of each variable. For example, the interval domain of x, y and z will typically be the width, height, and depth, respectively, of the image in pixels, and the interval domain of t, u, and v will depend on the parameterization of the geometric primitive.
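As a hedged Python illustration of one such consistency check: the test below asks whether the interval extensions X and Y (assumed here to return (lo, hi) bounds that enclose the true range, per the guarantee discussed elsewhere in this description) can possibly place the surface within a given pixel; a False answer proves the (t, u, v) box contributes nothing to that pixel:

    # Sketch of an interval consistency (overlap) test.  X and Y are
    # assumed interval extensions of the surface equations, taking
    # (lo, hi) tuples for t, u, v and returning (lo, hi) bounds.
    def may_cover_pixel(X, Y, t, u, v, px, py):
        x_lo, x_hi = X(t, u, v)
        y_lo, y_hi = Y(t, u, v)
        # The surface can contribute to pixel (px, py) only if its
        # bounds overlap the pixel square [px, px+1] x [py, py+1].
        return (x_hi >= px and x_lo <= px + 1 and
                y_hi >= py and y_lo <= py + 1)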
PHOTOREALISTIC IMAGE SYNTHESIS

Referring now to FIGS. 10-12, the subject photorealistic image synthesis process is generally shown, with particular emphasis and general specification of the methodology of the subject interval consistency approach, in furtherance of photorealistic image synthesis processing, outlined in FIG. 12, which utilizes Unified Modeling Language (UML). As shown, central to the process are a plurality of interval consistency solvers. Operatively and essentially linked to the interval consistency solvers is a system input, exemplified in FIG. 10 by a series of generic parametric equations, each function having two or more variables, for example the arguments t, u, and v as shown, and as representatively illustrated in FIG. 11, wherein the "system" is a sphere, the x-y-z functions being parameterized in t, u, v. It is to be understood that the system need not be limited to parametric expressions, which have the greatest utility and are most challenging/problematic; other geometric primitives, or alternate system expressions, are similarly contemplated and amenable to the subject methodology and process, as is to be gleaned from the discussion to this point. For example, the system can similarly render strictly mathematical formulae selectively input by a user.
As shown, the interval consistency solvers are further reliant upon user-defined shading routines as are well known to those of skill in the art (FIGS. 10-11) . The dependency is mutual, the interval consistency solvers exporting information for use or work by the shader, a valuation or assessment by the user-defined shading routines returned (i.e., imported) to the interval consistency solvers for consideration and/or management in the framework of the process.
An output of the interval consistency solvers is indicated as pixel data (i.e., the task of the interval consistency solvers is to quantitatively assign a quality or character to a pixel). The pixel data output is ultimately used in image synthesis or reconstruction, vis-a-vis forwarding the quantitatively assigned pixel quality or character to a display in furtherance of defining (i.e., forming) a 2-D array of pixels. For the parameterized system input of FIG. 11, a 2-D array of pixels, associated with a defined set of intervals, is illustrated.
With particular reference now to FIG. 12, the relationship and interrelationships between the SOLVER, INPUT, CALLBACKS, and OUTPUT are defined, and will be generally outlined; further, the relationships between and among the several solvers, e.g., SCREEN, PIXEL, COVERAGE, DEPTH and IMPORTANCE, are defined in the figures subordinate thereto, namely FIGS. 13-18, and will be subsequently outlined.
The solver, more particularly its most preferred components, namely SCREEN, PIXEL, COVERAGE, DEPTH, and IMPORTANCE, is shown in relation to the input (i.e., dim and system, that is to say, a geometric function), callbacks (i.e., shader), and output (i.e., pixel data and display). The interrelationships between the individual most preferred elements or constituents of the solver, and the general temporal hierarchy between and among each, as well as their relationships to the callbacks (i.e., the shader) and the output (i.e., the display), are schematically shown in FIG. 12. As will be subsequently discussed in the flow schematics for each of the solvers, and as is appreciated by reference to the subject figure, hierarchical, iterative sieving progresses, in nested fashion, from the screen solver to the importance solver, with each solver exporting a constraint in consideration of which the subsequent solver is to act. Values from successively embedded solvers are returned as shown, the pixel solver ultimately bundling qualities or character of color, opacity, depth, and coverage, for instance, and "issuing" such bundled information package (i.e., a pixel reflecting that scene object subtending same) to the display as shown in furtherance of synthesizing the 2-D array corresponding to the image plane.
The screen solver effectively conducts an analysis of the screen (e.g., FIG. 3) or "image plane" of the camera space of FIG. 4, essentially performing a set inversion in x, y. The objective or job of the screen solver is a preliminary one, namely, to chop the x-y screen into x-y subunits, effectively "stopping" upon achieving (i.e., identifying) a unit (i.e., area) commensurate or corresponding to a pixel (see, e.g., FIGS. 19(b)-19(f), wherein the chopping illustrated is of a pixel rather than the image plane, the chopping of the image plane being a preliminary step or prerequisite to chopping the pixel area). Most preferably, SCREEN is a point from which parallel processing is pursued; further desirable for such purposes is PIXEL, as will become readily apparent as the discussion progresses.
Referring now to FIG. 14, chopping of the x-y image plane begins with an initial step analogous to that illustrated in FIG. 19(b). The idea is to parse the x-y image plane into units dimensionally equating to a pixel. As shown, in the event that initial chopping yields a subdivided x-y area more extensive than a pixel, more chopping is conducted, namely a preferential chopping. More particularly, the nature of the x-y image plane subunit (i.e., a rectangle) is assessed and characterized as being either "landscape" or "portrait." In the event the subunit is landscape, the x dimension is further split; in the event that the subunit is portrait, the y dimension is split. For each iterative step in x or y (see FIGS. 19(b) et seq.), the arguments t, u, and v are contracted so as to eliminate values thereof outside the specific or "working" x-y interval (i.e., with each iteration in x and y, it is advantageous to eliminate the t, u, and v values that are not contributing, and which thereby potentially contribute to aliasing).
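A loose Python sketch of this preferential chop follows; the contract_tuv and emit_pixel callables are hypothetical stand-ins for the contraction machinery and the downstream solvers:

    # Sketch of the screen solver's preferential chop.  A working
    # rectangle is (x0, y0, x1, y1) in pixel units; contract_tuv is an
    # assumed helper that narrows (t, u, v) to values consistent with
    # the rectangle, returning None if the surface is absent from it.
    def screen_solve(rect, tuv, contract_tuv, emit_pixel):
        tuv = contract_tuv(rect, tuv)
        if tuv is None:
            return                              # nothing visible here
        x0, y0, x1, y1 = rect
        if x1 - x0 <= 1 and y1 - y0 <= 1:
            emit_pixel(rect, tuv)               # pixel-sized unit reached
            return
        if x1 - x0 >= y1 - y0:                  # "landscape": split x
            xm = 0.5 * (x0 + x1)
            halves = ((x0, y0, xm, y1), (xm, y0, x1, y1))
        else:                                   # "portrait": split y
            ym = 0.5 * (y0 + y1)
            halves = ((x0, y0, x1, ym), (x0, ym, x1, y1))
        for half in halves:
            screen_solve(half, tuv, contract_tuv, emit_pixel)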
The pixel solver, depicted in FIG. 15, is essentially a liaison between SCREEN and the other solvers, acting as a synchronization point and performing a housekeeping function. Preliminarily, PIXEL seeks an answer to the question: is the nature of the x-y interval corresponding to a pixel area, and thereby the t, u, v solutions associated therewith, such that the shader has been invoked (i.e., color and opacity, for example, have been assigned or designated)? If the shader has been invoked, by calling upon the coverage solver, no further parsing of the x-y space (e.g., FIGS. 19(b)-19(f)) is required, and the x-y pixel data is sent to the display.
The coverage solver, as detailed in FIG. 16, essentially replicates the iterations of SCREEN, based upon a user-defined limit epsilon (eps). COVERAGE, as the name suggests, seeks to delimit, via the retention of contributing t, u, v aspects based upon the user-specified chop area "eps," those portions (i.e., areas) of the object surface within the pixel subunit (again, see FIGS. 19(b)-19(f)). Upon ascertaining the values associated with the x-y space or area, they are added or compiled to provide or define the total coverage of the object surface (i.e., a mapping of the entire x-y space). At this point, analysis, more particularly processing, in x-y space is complete. The next procedural task is a consideration of depth (i.e., assessment of Z(t, u, v) of the parametric system with a fixed or set x and y).
The depth solver, as detailed in FIG. 17, is essentially doing the job of FIG. 17(a). More particularly, DEPTH initially ascertains where in the z dimension, ultimately from the image plane (see FIG. 4, camera space), the object surface, heretofore defined in x, y, t, u, v aspects, first appears or resides (i.e., in which depth cell), and thereafter steps into space, via iterative cells, until the x, y, t, u, v object surface is no longer present in a cell (i.e., cell X of FIG. 17(a)). In furtherance thereof, the depth variable, more accurately the depth function, is initialized for all depth space, namely, set to an interval at an infinite distance from the viewer (an infinite z depth). Thereafter, t, u, v contraction begins in the depth field (z0). Subsequently, there is a trivial accept/reject query as to whether there is in fact a depth component of the x-y parameterization, with searching commencing thereafter (z search). For each depth cell, the importance solver (i.e., the t, u, v chopper, wherein a set inversion is executed in t, u, v so as to contract same) is called upon, and it is necessary to next assess whether the shader was invoked. If the shader is invoked (i.e., a first visible root is identified), the outputs of the shader are accumulated into the importance sums and the depth parsing continues in furtherance of accounting for all z components of the x-y object surface; if not, cells are "walked off" on a cell-by-cell basis. Although the parsing or chopping of z space has been described as a serial or loop-type progression, it is certainly amenable to recursive splitting, as in the case of the x-y space. The importance solver, as detailed in FIG. 18, when called, essentially completes a set inversion in t, u, v; that is to say, for the smallest x, y, z (i.e., each specific z cell for, or in, which an object surface element x-y resides), t, u, v are to be narrowed as much or as finely as possible. The function of the importance solver is to fit or optimally match the t, u, v with the x, y, z, in a way that is overreaching, not underreaching.
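A loose Python sketch of the depth solver's cell-by-cell walk described above; consistent and shade are hypothetical stand-ins for the consistency check and shader callback:

    # Sketch of the depth solver's walk through depth cells, near to
    # far (cf. FIG. 17).  consistent is an assumed helper returning
    # the contracted (t, u, v) box if the surface can occupy depth
    # cell z within the fixed x-y cell, else None; shade is the
    # assumed shader callback invoked from the first visible root on.
    def depth_solve(xy_cell, tuv, z_cells, consistent, shade):
        found = False
        for z in z_cells:
            tuv_z = consistent(xy_cell, z, tuv)
            if tuv_z is not None:
                shade(xy_cell, z, tuv_z)
                found = True
            elif found:
                break          # the surface has been "walked off"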
As should be readily appreciated, the same methodology may be used in a shading routine to integrate radiance over the solid angle of a hemisphere. The only change needed to accomplish this is to define a function

φ(t, u, v)
θ(t, u, v)
r(t, u, v)

for integration, where φ, θ and r represent the parametric domain of the hemisphere as previously outlined, the t, u, and v representing geometric primitives in the digital scene. If the shading routine performs this procedure recursively, very accurate bounds on the radiance function for a pixel can be computed. It should be noted, and/or again emphasized, that to the extent that the subject description has only used a 3-dimensional parameter domain as an example, the method described herein works for any n-dimensional parameter domain. For example, additional parameters such as temperature, mass or pressure can be used to specify the manifold that appears in the x, y and z domain of the image. This allows more complex and accurate simulations to be represented in the digital scene and applied to the rendered image.
As previously noted in connection with FIG. 2, stored within the virtual memory, which may represent physical random access memory or a virtual map to other forms of digital storage such as magnetic or optical disk, is a visual scene comprised of geometric primitives (e.g., explicit, implicit, and/or parametric equations representing at least a portion of a surface of a scene object). In the context of the preferred, non-limiting embodiment, each geometric primitive is comprised of two parametric functions, but need not be so limited: one is a geometric shape function and the other is a shading function.
The geometric shape function is preferably and advantageously a parametric surface, generally taking the form P(u, v, t) = (x, y, z). In this case, the variables u and v are the intrinsic parameters of the surface and t is the variable of time. The function P maps the R3 domain of (u, v, t) to the R3 domain of (x, y, z), wherein the x, y and z variables are coordinates in a rectangular array of pixels comprising the output image. The valid domain of the u, v and t variables can be any arbitrary subset of R3 and will depend entirely on the function P.
The parametric shading function is a mapping of the u, v and t variables into an output vector specified by the output format of the reconstructed image. As previously noted, typically this will at least comprise red, green and blue channels of color; however, it need not be so limited, with other channels of information being selectively contemplated, e.g., additional color channels, opacity, object identifier tags and spectral samples are all possible types of additional information that may be present in an output vector of the shading function. As such, the shading function generally takes the form S(u, v, t) = (r, g, b, ...), where r, g and b are the red, green and blue color components (i.e., a commonly used color format in computer graphics) followed by any number of additional output variables.
In the context of the present invention, P and S are both interval functions. In other words, all of the input and output variables are intervals. For example, submitting interval values of u, v and t to P(u, v, t) = (x, y, z) will yield interval values for x, y and z that are guaranteed to contain the range of the function over the submitted domain.
HIERARCHICAL OCCLUSION
As described in connection with FIG. 13 et seq., a hierarchical "sieving" process advantageously utilizes nested solvers to operate on the geometric function P(u, v, t) = (x, y, z). The heretofore described preferred method involves a subdivision step of the (x, y, z) domain followed by a contraction of the (u, v, t) domain, with the subject advantageous, non-limiting embodiment following the general concept while utilizing bisection to perform contraction on the (u, v, t) domain. Furthermore, as the case warrants, e.g., rendering of the aforedescribed parametric surfaces, it is advantageous that at each subdivision stage a merit function be used to choose the optimal dimension to split. For example, prior to contraction, a merit function is used to determine whether bisection along the u, v or t dimension will be more likely to yield optimal results. This prevents wasted iterations of contraction that can occur when scaling in the map from (u, v, t) to (x, y, z) is highly nonlinear or irregular.
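One plausible, non-authoritative Python sketch of such a merit function, bisecting whichever parametric dimension most reduces the estimated screen footprint; P is an assumed interval extension returning (lo, hi) bounds for x, y and z:

    # Sketch of a merit function for the contraction step: estimate,
    # for each of u, v, t (each a (lo, hi) tuple), the screen footprint
    # of P over both halves, and split the dimension whose bisection
    # shrinks the footprint the most.
    def halve(iv):
        lo, hi = iv
        mid = 0.5 * (lo + hi)
        return (lo, mid), (mid, hi)

    def footprint(P, u, v, t):
        (xlo, xhi), (ylo, yhi), _ = P(u, v, t)
        return (xhi - xlo) + (yhi - ylo)

    def choose_split_dimension(P, u, v, t):
        base = footprint(P, u, v, t)
        gains = {
            "u": base - max(footprint(P, h, v, t) for h in halve(u)),
            "v": base - max(footprint(P, u, h, t) for h in halve(v)),
            "t": base - max(footprint(P, u, v, h) for h in halve(t)),
        }
        return max(gains, key=gains.get)   # dimension with best gain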
An improvement over the methodology of FIGS. 13 et seq., among other things, lies in the utilization of a hierarchical occlusion culling step or process which essentially deletes scene geometry that can, at the earliest possible iteration in a subdivision process, be proven not to be visible. For instance, such occlusion culling is germane in connection with processing the x-y or area dimension for area or coverage mask assessment, and/or with the processing of the z dimension for depth of field assessments.
With general reference to FIGS. 20(a)-(c), the subject occlusion notions are illustrated and briefly noted. In connection with FIG. 20(a), which illustrates a rendered target image, depth of field occlusion is manifest, for example, in the relatively distal image planes containing structures 80, or shore line elements 82. As a general proposition, if the projected screen area of a box b1 = (x1, y1, z1) intersects the projected screen area of another box b2 = (x2, y2, z2), then it is proof that b2 is not part of the visible solution set belonging to b1 if z1 is strictly less than z2. This is true regardless of how large or tiny the two boxes are. In fact, if such a relationship between b1 and b2 can be detected before subdividing to a pixel or subpixel level, significant savings in computational time can be gained, since b2 can be deleted and further subdivision and contraction steps thereby completely avoided.
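A minimal Python sketch of this proposition as a predicate; each box is a triple of (lo, hi) interval tuples for its screen-space x, y and z extents:

    # Sketch of the occlusion proof stated above.
    def provably_occludes(b1, b2):
        (x1, y1, z1), (x2, y2, z2) = b1, b2
        overlaps = (x1[0] <= x2[1] and x2[0] <= x1[1] and
                    y1[0] <= y2[1] and y2[0] <= y1[1])
        # Within the overlapping screen region, b2 cannot belong to
        # the visible solution set of b1 if b1 is strictly nearer:
        # the worst case of b1 precedes the best case of b2 in depth.
        return overlaps and z1[1] < z2[0]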
In connection with area occlusion, an initialized hierarchical area mask, at the pixel level, is shown in FIG. 20(b). With a threshold determination of the masked or obscured areas, i.e., the "black" area or template, rendering of the target image is had vis-a-vis subsequent iterative subdivision processing of the "white" area or space to ultimately yield the masked image of FIG. 20(c).
In the context of interval analysis, there exists a general awareness of the interval bounds on all variables under consideration. The function P(u, v, t) = (x, y, z) represents a mapping of the (u, v, t) variables into the (x, y, z) domain of the screen. The values of x and y are measured in pixels, and z is typically normalized to lie within the range [0, 1], mapping to the near and far clipping planes, respectively, of the viewing volume. With the hierarchical worst-case depth representation, max(z1), of the rectangular array of screen pixels utilized, and dynamically updated during the subdivision process, contractions of the u, v and t variables are unnecessary once it can be proven that they no longer contribute to the final visible solution set.
The hierarchical occlusion buffer, from a conceptual perspective, is a dynamically-updated data structure that stores the most current pessimistic variable estimate, e.g., depth, for each tile in the paving of the x and y dimensions of the screen. In practice, the implementation requires the use of a hierarchical data structure which can be efficiently accessed and updated. Many choices exist, each with particular advantages and disadvantages.
With reference now to FIG. 21, the preferred embodiment uses a binary tree 90 stored in virtual memory as a linear array 92. While the subsequent discussion is directed to depth assessment, the previously noted area assessment follows the same general format.
As shown in FIG. 21, the root item of the tree is stored at array index 1. For any array index i representing a tile 94 (e.g., "4"), the left and right children of the binary tree are stored at array indices 2i (i.e., "8") and 2i + 1 (i.e., "9"), respectively. Conversely, the parent of an array index i is floor(i/2); e.g., the parent of index "4" is 2. All of these computations can be performed using a combination of integer shift and bitwise-or operations. Because of the cheap and easy addressing scheme, pointers between parent and children do not need to be explicitly stored in memory. The only storage required for a tree item is, in the case of a depth of field assessment, a double-precision floating point value representing the worst-case depth information of a tile 94 (cf. a binary flag/bit, true (1)/false (0), in the context of a coverage mask assessment). As a consequence, the binary tree is represented in memory as a linear array of double-precision depth values. A downside to this scheme is that for image dimensions which are not exact powers of two, some array indices will not be used, wasting storage. However, the speed and simplicity of the method generally outweigh the wasted memory, which in practice turns out to be rather nominal.
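A hypothetical C++ rendering of this addressing scheme, under the assumption of a complete tree with one double-precision value per tile (all names are illustrative):

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Implicit binary tree in a linear array: item 1 is the root, the children of
// item i live at indices 2i and 2i + 1, and the parent of item i is floor(i/2).
// No pointers are stored -- only one double (worst-case tile depth) per item,
// initialized to infinity, i.e., at or beyond the far clipping plane.
struct OcclusionBuffer {
    std::vector<double> depth;

    explicit OcclusionBuffer(std::size_t leafCount)
        : depth(2 * leafCount, std::numeric_limits<double>::infinity()) {}

    static std::size_t left(std::size_t i)   { return i << 1; }        // 2i
    static std::size_t right(std::size_t i)  { return (i << 1) | 1; }  // 2i + 1
    static std::size_t parent(std::size_t i) { return i >> 1; }        // floor(i/2)
};
```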
During the subdivision process, indices into the linear array are computed on the fly. This means that reading a value from the hierarchical occlusion buffer is a simple indexed virtual memory access. Updating the state of the array is only slightly more complicated. In general, updates will be initiated once subdivision has reached the leaf nodes of the tree. The parent of the leaf is then set to the maximum depth of its children. This process is repeated recursively, and can be implemented efficiently as a loop, bit-shifting the index of the original leaf item to walk up the tree and propagate changes into the linear array. Of course, the propagation can terminate early if the depth value of a parent is already less-than or equal-to the maximum depth of its children. The entire array is generally initialized so that all tree items represent a depth value which maps to the far clipping plane or beyond, for example, 1 or ∞.
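Continuing the hypothetical sketch above, a read is one indexed access, and an update walks leaf-to-root with the early termination just described:

```cpp
#include <algorithm>

// Reading is a single indexed access; an update writes the new worst-case
// depth at a leaf and walks toward the root, replacing each ancestor with
// the maximum depth of its two children, terminating early once an ancestor
// is already less-than or equal-to that maximum.
double readDepth(const OcclusionBuffer& buf, std::size_t path) {
    return buf.depth[path];
}

void updateDepth(OcclusionBuffer& buf, std::size_t leaf, double z) {
    buf.depth[leaf] = z;
    for (std::size_t i = OcclusionBuffer::parent(leaf); i >= 1;
         i = OcclusionBuffer::parent(i)) {
        double worst = std::max(buf.depth[OcclusionBuffer::left(i)],
                                buf.depth[OcclusionBuffer::right(i)]);
        if (buf.depth[i] <= worst) break;  // already pessimistic enough
        buf.depth[i] = worst;
    }
}
```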
TILE AND SOLUTION VECTORS
With reference to FIG. 22, which generally represents a refinement of the processing or processes of FIGS. 16-18, to begin subdivision processing, two vectors are initialized: the tile vector 96 and the solution vector 98. The tile vector (path, x, y) is comprised of the two interval variables x and y, each initialized so as to contain the entire domain, measured in pixels, of the screen. For example, if the screen is 720 pixels wide and 480 pixels high, then the x and y values of the tile vector are initialized to [0, 720] and [0, 480], respectively. The tile vector also contains a variable, path, which is an index into the linear array representation of the hierarchical occlusion buffer (FIG. 21). The tile comprising the entire screen is always associated with the root item of the binary tree (FIG. 21), and so the path variable is initialized to 1.
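A hypothetical mirror of the tile vector, reusing the Interval type of the earlier sketch (names and layout are illustrative only):

```cpp
#include <cstddef>

// Hypothetical mirror of the (path, x, y) tile vector: path indexes the
// linear array of the occlusion buffer, while x and y bound the tile's
// screen extent in pixels (Interval as in the earlier sketch).
struct TileVector {
    std::size_t path;
    Interval x, y;
};

// Root tile for a 720 x 480 screen: the whole screen, buffer item 1.
TileVector rootTile{1, {0.0, 720.0}, {0.0, 480.0}};
```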
The solution vector (x, y, z, u, v, t, axis) is initialized so as to encompass the full domain of the u, v and t variables and P(u, v, t) = (x, y, z). In other words, the u, v and t variables are initialized to intervals containing each of their respective domains. The particular intervals will depend on the function P. For example, if P is the function of a moving sphere

P(u, v, t) = (cos(u)cos(v) + t, sin(u)cos(v), sin(v)),

then u and v would most likely be initialized to [-π, π] and [-π/2, π/2], respectively, and the t variable could be initialized to [0, 5] if it were desired to have the sphere moving (motion-blurred) 5 pixels to the right. The x, y and z values of the solution vector would then be initialized to the intervals

P([-π, π], [-π/2, π/2], [0, 5]) = (x, y, z) = ([-1, 6], [-1, 1], [-1, 1]).
The last member of the solution vector, axis, is a flag that indicates the next dimension to be bisected during the contraction step. It will be updated by the merit function during processing, but for starters, it is sufficient to initialize this flag such that the u dimension will be selected next for bisection.
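Collecting the foregoing, a hypothetical C++ mirror of the solution vector and its initialization for the moving-sphere example (the Interval type is from the earlier sketch; names and layout are illustrative only):

```cpp
// Hypothetical mirror of the (x, y, z, u, v, t, axis) solution vector.
enum Axis { AXIS_U, AXIS_V, AXIS_T };

struct SolutionVector {
    Interval x, y, z;  // screen-space enclosure, i.e., P(u, v, t)
    Interval u, v, t;  // parametric domain
    Axis axis;         // next dimension to bisect during contraction
};

// Initialization for the moving-sphere example: u in [-pi, pi], v in
// [-pi/2, pi/2], t in [0, 5], (x, y, z) set to the interval evaluation
// ([-1, 6], [-1, 1], [-1, 1]), and axis starting at the u dimension.
const double PI = 3.14159265358979323846;
SolutionVector rootSolution{
    {-1.0, 6.0}, {-1.0, 1.0}, {-1.0, 1.0},
    {-PI, PI}, {-PI / 2.0, PI / 2.0}, {0.0, 5.0},
    AXIS_U
};
```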
Processing commences by first performing a consistency check 100 between the tile vector 96 and the solution vector 98; that is, the x and y interval variables of each vector are tested for intersection, see the representation of FIG. 23. There must be intersection between the x and y dimensions of both vectors in order to continue. If no intersection exists, then the vectors are disjoint, meaning the solution vector can be deleted from the visible solution set of the tile vector. If the consistency check is true, processing can continue.
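As a sketch, the consistency check reduces to two interval-overlap tests, assuming the tile and solution types of the earlier sketches:

```cpp
// The consistency check reduces to two interval-overlap tests: the solution
// vector can contribute to the tile only if their screen-space enclosures
// intersect in both the x and y dimensions.
bool overlaps(Interval a, Interval b) {
    return a.lo <= b.hi && b.lo <= a.hi;
}

bool consistent(const TileVector& tile, const SolutionVector& s) {
    return overlaps(tile.x, s.x) && overlaps(tile.y, s.y);
}
```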
After a successful consistency check, a hierarchical occlusion or occlusion-culling step 102 is performed. The path variable of the tile vector is used as an index to read the worst-case depth estimate from the hierarchical occlusion buffer (FIG. 21). If the worst-case depth estimate is strictly less-than the z value of the solution vector, it is proof that the solution vector can be deleted from the visible solution set of the tile vector. If the hierarchical occlusion-culling step fails, processing can continue.
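A sketch of the culling test, under the assumption that "strictly less-than the z value" is read, for intervals, against the lower bound of the solution vector's z interval:

```cpp
// Occlusion cull: the buffer entry for the tile bounds, from behind, every
// depth already covering the tile. If that worst case is strictly nearer
// than the nearest depth the solution vector could possibly have (its z
// lower bound), the solution cannot be visible within this tile.
bool occluded(const OcclusionBuffer& buf,
              const TileVector& tile, const SolutionVector& s) {
    return buf.depth[tile.path] < s.z.lo;
}
```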
At this stage of processing, it has been proven that the solution vector might contribute to the visible solution set of the tile vector. A three-way branch decision must now be made: (1) subdivide the tile vector 96, (2) contract the solution vector 98, or (3) shade the solution vector 98 because termination criteria have been reached. A few observations are noted; namely, it is not necessary to subdivide the tile vector unless the other two branches cannot be taken. Similarly, termination criteria cannot be reached unless it is proven that contraction of the solution vector is not necessary. As a result, it is most efficient to test for this condition first.
To determine if the solution vector needs to be contracted, a simple but highly effective test is to compare the projected screen area of the tile vector with the projected screen area of the solution vector. For example, if A(x, y) = wid(x) · wid(y) (FIG. 24(a)), where x and y are intervals and wid(·) returns the width of a given interval, then a1 = A(x, y) of the tile vector and a2 = A(x, y) of the solution vector. Advantageously, the solution vector needs contraction if σ · a1 < a2; that is, the solution vector needs contraction if its projected screen area a2 is greater than a scaled value, by the factor σ, of the projected screen area a1 of the tile vector. The scale value σ is a parameter that allows the contraction to be tuned for speed or quality: if σ ≥ 1, speed will improve and resolution will be degraded; if σ < 1, resolution will improve at the cost of more contractions on the solution vector. See, e.g., FIGS. 25(a)-(d), wherein FIG. 25(a) shows the output of the FIG. 22 process when σ equals one; squares a1 represent the regular paving of the screen (i.e., each square represents the area of a tile vector), the smallest of the squares a1 being tile vectors representing individual pixels, and squares a2 represent the set of visible solution vectors for this particular nonlinear shape. FIGS. 25(b)-(d) are as FIG. 25(a), with σ equal to 1/2, 1/4 and 1/8, respectively. Advantageously, σ can be "tuned" to accommodate any desired subpixel resolution.
Comparing projected screen areas corresponding to the tile and solution vectors can lead to problems in pathological cases. For example, it is possible for the projected screen area of the solution vector to be many pixels wide but only a fraction of a pixel tall and still fail the contraction test. As a result, a more robust contraction test is to compare the lengths of the hypotenuses of the projected screen areas (FIG. 24(b)). In this case, the values of a1 and a2 are computed from the function H(x, y) = sqrt(wid(x)² + wid(y)²).
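Both forms of the contraction test admit a compact sketch, assuming the Interval type of the earlier sketches; the sigma parameter is the σ tuning value discussed above:

```cpp
#include <cmath>

double wid(Interval i) { return i.hi - i.lo; }

// Area form of the contraction test, A(x, y) = wid(x) * wid(y).
double A(Interval x, Interval y) { return wid(x) * wid(y); }

// Hypotenuse form, H(x, y) = sqrt(wid(x)^2 + wid(y)^2), which stays
// meaningful even for enclosures many pixels wide but only a fraction
// of a pixel tall.
double H(Interval x, Interval y) {
    return std::sqrt(wid(x) * wid(x) + wid(y) * wid(y));
}

// Contract the solution vector when its projected measure exceeds sigma
// times that of the tile; sigma trades speed against (sub)pixel resolution.
bool needsContraction(const TileVector& tile, const SolutionVector& s,
                      double sigma) {
    return sigma * H(tile.x, tile.y) < H(s.x, s.y);
}
```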
To the extent that the contraction test succeeds, contraction of the solution vector is performed, preferably via bisection. With reference to FIG. 26, two new solution vectors are created, L and R, initialized as copies of the solution vector, B (FIG. 26(a)), which is being bisected. Next, the dimension specified by the axis variable of the B vector is bisected; the L and R vectors are updated accordingly (see, e.g., FIG. 26(b) or FIG. 26(c), depending upon which axis has been bisected). At this point, the x, y and z values of L and R are recomputed by evaluating P(u, v, t) = (x, y, z), once each for L and R, each time using the x, y, z, u, v and t variables of the respective solution vector as input and output. The last step is to use a merit function to determine whether the axis variables of L and R should be updated or not. This is advantageously done by measuring the relative error of the hypotenuse of the projected screen area, computed as E(h1, h2) = 1 - h1/h2, where h1 = H(x, y) of L or R (i.e., H(L) or H(R)), in turn, and h2 = H(x, y) of B (i.e., H(B)) (FIG. 26(d)). In general, if E(h1, h2) ≤ δ, then the respective axis variable of L or R is advanced, round-robin style, to the next dimension. Only the input variables u, v and t are cycled through. In practice, δ = 1/3 has proven to be an effective setting, but other values may be used to tune a particular system or implementation without departing from the spirit of the subject method.
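A hypothetical sketch of one such contraction step, reusing the stand-in P and the H measure from the earlier sketches:

```cpp
#include <initializer_list>

// One contraction step: bisect B along its current axis into halves L and R,
// re-enclose each half with P, then advance each half's axis flag round-robin
// (u -> v -> t -> u) whenever the relative error E(h1, h2) = 1 - h1/h2 of the
// hypotenuse is at or below delta, i.e., whenever the split shrank too little.
void contract(const SolutionVector& B, SolutionVector& L, SolutionVector& R,
              double delta /* e.g., 1.0 / 3.0 */) {
    L = B;
    R = B;
    Interval* lHalf = (B.axis == AXIS_U) ? &L.u : (B.axis == AXIS_V) ? &L.v : &L.t;
    Interval* rHalf = (B.axis == AXIS_U) ? &R.u : (B.axis == AXIS_V) ? &R.v : &R.t;
    double mid = 0.5 * (lHalf->lo + lHalf->hi);
    lHalf->hi = mid;  // L keeps the lower half of the bisected dimension
    rHalf->lo = mid;  // R keeps the upper half

    P(L.u, L.v, L.t, L.x, L.y, L.z);  // recompute the screen enclosures
    P(R.u, R.v, R.t, R.x, R.y, R.z);

    const double h2 = H(B.x, B.y);
    for (SolutionVector* c : {&L, &R}) {
        double e = 1.0 - H(c->x, c->y) / h2;
        if (e <= delta)
            c->axis = static_cast<Axis>((c->axis + 1) % 3);
    }
}
```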
The last step of contraction is to recursively process both L and R by going back to the first processing stage, that is, performing the consistency check, and so forth. It is most efficient, however, to consider the L or R with the smallest possible z value as higher priority, making sure it is processed first. This allows the worst-case depth values from the solution vector closest to the near clipping plane to be dynamically propagated through the system before the solution vector further away is processed. This maximizes the occlusion between L and R, allowing significant amounts of unnecessary computation to be avoided in many cases.
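A sketch of this prioritization, with 'process' standing in for the whole pipeline stage (consistency check, occlusion cull, and so on; the callback form is an assumption of the sketch):

```cpp
#include <functional>

// Recurse into the nearer half first: the child whose z interval has the
// smaller lower bound is processed before its sibling, so its worst-case
// depths reach the occlusion buffer first.
void recurseNearestFirst(SolutionVector& L, SolutionVector& R,
                         const std::function<void(SolutionVector&)>& process) {
    if (L.z.lo <= R.z.lo) { process(L); process(R); }
    else                  { process(R); process(L); }
}
```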
As recursion continues, eventually the three-way branch condition (FIG. 22) will determine that contraction is not the necessary step. In this case, the remaining choices are to subdivide the tile vector or to shade the solution vector because termination criteria have been met. Checking for termination criteria under these circumstances is now easy. For example, termination should occur if the widths of the x and y variables of the tile vector are less-than or equal-to some specified constant. This is usually the case when the widths of x and y are the size of a pixel, but subpixel tolerances are also possible. If the termination criteria have been met, then both the tile and solution vectors are added to the visible solution set and the hierarchical occlusion buffer (FIG. 21) is updated. To perform the update, the upper bound of the z variable of the solution vector is propagated through the linear array of depth values. In this case, the path variable of the tile vector specifies at which index in the linear array the propagation will start. Once propagation is complete, shading of the solution vector can be performed by calling the function S(u, v, t) = (r, g, b, ...), where the u, v and t arguments to S are from the solution vector. After shading, the processing state of the recursive program can begin to unwind.
During further processing, the three-way branch condition will eventually determine that subdividing the tile vector is the next step. This condition occurs after both the contraction and termination conditions have failed. In this case, subdivision of the tile vector is accomplished with bisection. Two new tile vectors are created, L and R, initialized as copies of the tile vector, B, which is being bisected. Bisection of a tile vector always occurs along the widest dimension, so the x and y values of L and R are updated accordingly. Additionally, the path variables of L and R are computed as 2i and 2i + 1, respectively, where i is the path variable of B. After L and R have both been properly updated, the last step of subdivision is to recursively process L and R by going back to the first processing stage, that is, performing the consistency check, and so forth. Once processing has completely finished, the final output is a large collection of shaded solution vectors which collectively represent the entire visible solution set of the image.
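A hypothetical sketch of the tile bisection, reusing the earlier TileVector type and wid helper:

```cpp
// Tile bisection: split the wider screen dimension at its midpoint, and
// derive the children's occlusion-buffer indices from the parent's path
// as 2i and 2i + 1.
void subdivide(const TileVector& B, TileVector& L, TileVector& R) {
    L = B;
    R = B;
    L.path = B.path << 1;        // 2i
    R.path = (B.path << 1) | 1;  // 2i + 1
    if (wid(B.x) >= wid(B.y)) {
        double mid = 0.5 * (B.x.lo + B.x.hi);
        L.x.hi = mid;
        R.x.lo = mid;
    } else {
        double mid = 0.5 * (B.y.lo + B.y.hi);
        L.y.hi = mid;
        R.y.lo = mid;
    }
}
```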
There are other variations of this invention which will become obvious to those skilled in the art. It will be understood that this disclosure, in many respects, is only illustrative. Although the various aspects of the present invention have been described with respect to various preferred embodiments thereof, it will be understood that the invention is entitled to protection within the full scope of the appended claims.

Claims

What is claimed is:
1. In a computer executable area subdivision methodology for visible surface determination of elements of a visual scene in connection with photorealistic image synthesis, the visual scene comprised of non-linear functions stored within a virtual memory, the step comprising:
a. utilization of a hierarchical occlusion-culling buffer, operatively accessible by a processor for executing an interval analysis, to facilitate determination of obscured scene geometry of the visual scene at the earliest possible iteration of a subdivision process.
2. The computer executable area subdivision methodology of claim 1 further comprising an interleaved interval contraction step.
3. The computer executable area subdivision methodology of claim 2 wherein said hierarchical occlusion buffer comprises a hierarchical data structure accessible by an indexing mechanism.
4. The computer executable area subdivision methodology of claim 3 wherein said hierarchical data structure comprises a binary tree stored in virtual memory as a linear array.
5. The computer executable area subdivision methodology of claim 2 wherein said interleaved interval contraction step identifies excess interval width of parametric variables of parametric functions of the non-linear functions of the visual scene.
6. The computer executable area subdivision methodology of claim 5 wherein said interleaved interval contraction step includes, as a prerequisite, application of a merit function so as to select an optimal variable of said parametric variables of said parametric functions to contract.
7. The computer executable area subdivision methodology of claim 6 wherein said interleaved interval contraction step eliminates said excess interval width from each subsequent successive iteration in the subdivision process.
8. The computer executable area subdivision methodology of claim 2 wherein said hierarchical occlusion-culling buffer stores a characteristic of a depth interval of a select x-y subdivision interval.
9. The computer executable area subdivision methodology of claim 2 wherein said hierarchical occlusion-culling buffer stores a characteristic of an area of a select x-y subdivision.
10. The computer executable area subdivision methodology of claim 2 wherein said interleaved interval contraction step identifies excess interval width of implicit variables of implicit functions of the non-linear functions of the visual scene.
11. The computer executable area subdivision methodology of claim 10 wherein said hierarchical occlusion-culling buffer stores a characteristic of a depth interval of a select x-y subdivision interval.
12. The computer executable area subdivision methodology of claim 10 wherein said hierarchical occlusion-culling buffer stores a characteristic of an area of a select x-y subdivision.
13. In a computer executable area subdivision methodology for visible surface determination in connection with photorealistic image synthesis wherein a visual scene comprised of at least a parametric surface is stored within virtual memory, the steps comprising:
a. identifying a value estimate for a limiting depth plane of a viewing volume for a select mapping of (u, v, t) variables into the (x, y, z) screen domain of a function of a form P(u, v, t) = (x, y, z) for the parametric surface; and,
b. storing said value estimate for a limiting depth plane, and further value estimates thereof for each tile in a paving of x-y dimensions of said screen domain, in a dynamically updated data structure.
14. The computer executable area subdivision methodology of claim 13 wherein said dynamically updated data structure comprises a binary tree.
15. The computer executable area subdivision methodology of claim 13 wherein said dynamically updated data structure comprises a binary tree stored in virtual memory as a linear array.
16. The computer executable area subdivision methodology of claim 13 wherein said identifying step includes initialization of a tile vector and a solution vector.
17. The computer executable area subdivision methodology of claim 13 wherein said tile vector includes variables (path, x, y).
18. The computer executable area subdivision methodology of claim 17 wherein path comprises an index into said dynamically updated data structure.
19. The computer executable area subdivision methodology of claim 18 wherein said solution vector includes variables (x, y, z, u, v, t, axis).
20. The computer executable area subdivision methodology of claim 19 wherein axis comprises a flag indicating a next dimension for bisection during a contraction step.
21. The computer executable area subdivision methodology of claim 20 wherein said axis is initialized such that a dimension associated with the u variable is selected for bisection.
22. The computer executable area subdivision methodology of claim 16 wherein said solution vector includes variables (x, y, z, u, v, t, axis).
23. The computer executable area subdivision methodology of claim 22 wherein axis comprises a flag indicating a next dimension for bisection during a contraction step.
24. The computer executable area subdivision methodology of claim 23 wherein said axis is initialized such that a dimension associated with the u variable is selected for bisection.
25. The computer executable area subdivision methodology of claim 13 further comprising a subdivision step of the (x, y, z) domain followed by a contraction of the (u, v, t) domain.
26. The computer executable area subdivision methodology of claim 25 wherein said contraction deletes excessive interval width of parametric variables.
27. The computer executable area subdivision methodology of claim 25 wherein said contraction results in the parametric domain being subdivided into a regular paving.
28. The computer executable area subdivision methodology of claim 25 wherein at each stage of said subdivision step a merit function is used to selectively contract one of variables u, v, t of the (u, v, t) domain.
29. The computer executable area subdivision methodology of claim 13 further comprising initialization of a tile vector and a solution vector.
30. The computer executable area subdivision methodology of claim 29 wherein said tile vector is characterized by interval variables and an index into said dynamically updated data structure.
31. The computer executable area subdivision methodology of claim 30 wherein said solution vector is characterized by the parametric surface and a flag indicating a next dimension to be split in a subsequent contraction step.
32. The computer executable area subdivision methodology of claim 31 further comprising assessing whether said solution vector contributes to a visible solution set of said tile vector.
33. The computer executable area subdivision methodology of claim 32 wherein upon determining that said solution vector contributes to said visible solution set of said tile vector, contraction of said solution vector is had.
34. The computer executable area subdivision methodology of claim 33 wherein said contraction of said solution vector is based upon an area test.
35. The computer executable area subdivision methodology of claim 34 wherein said area test comprises a comparison of a projected screen area of said tile vector with a projected screen area of said solution vector.
36. The computer executable area subdivision methodology of claim 35 wherein contraction of said solution vector proceeds if its projected screen area exceeds a scaled value of said projected screen area of said tile vector.
37. The computer executable area subdivision methodology of claim 36 wherein said scaled value is user selectable.
38. The computer executable area subdivision methodology of claim 33 wherein said contraction of said solution vector is based upon a comparison of a hypotenuse of a projected screen area of said tile vector with a hypotenuse of a projected screen area of said solution vector.
39. In a photorealistic image synthesis method wherein stored digital representations of physical three-dimensional visual scenes are selectively input as parametric surfaces, and one or more user-defined shading routines are selectively called upon in the course of assessment of the stored digital representations of physical three-dimensional visual scenes in furtherance of the production of a rectangular output array of pixels representing the visible set of surfaces of each of the stored digital representations of physical three-dimensional visual scenes, the step comprising:
a. performing an interval analysis subdivision on parameters of a visual scene of the visual scenes, said interval analysis subdivision comprising a hierarchical occlusion culling step for determination, at the earliest possible iteration in the performance of said interval analysis subdivision, of geometry of said visual scene absent from the visible solution set of said visual scene, and elimination of said geometry of said visual scene absent from the visible solution set of said visual scene from subsequent interval analysis subdivisions.
40. A system for visible surface determination of portions of a visual scene in furtherance of photorealistic rendering in a computer graphics environment, said system comprising:
a. a scene database wherein visual characteristics of objects of an image frame of a scene of said scene database are delimited as geometric primitives comprising at least a parametric shape function; and,
b. a hierarchical occlusion-culling buffer, operatively accessible by a processor for executing an interval analysis, to facilitate determination of obscured scene geometry of said visual scene at the earliest possible iteration of a subdivision process.
PCT/US2006/012548 2005-04-05 2006-04-05 System and method of visible surface determination in computer graphics using interval analysis WO2006115716A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US66854305P 2005-04-05 2005-04-05
US60/668,543 2005-04-05

Publications (2)

Publication Number Publication Date
WO2006115716A2 true WO2006115716A2 (en) 2006-11-02
WO2006115716A3 WO2006115716A3 (en) 2007-05-18

Family

ID=37215196

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2006/012548 WO2006115716A2 (en) 2005-04-05 2006-04-05 System and method of visible surface determination in computer graphics using interval analysis

Country Status (1)

Country Link
WO (1) WO2006115716A2 (en)



Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6480205B1 (en) * 1998-07-22 2002-11-12 Nvidia Corporation Method and apparatus for occlusion culling in graphics systems

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DUFF T.: 'Interval Arithmetic and Recursive Subdivision for Implicit Functions and Constructive Solid Geometry' ACM 1992, pages 131 - 138, XP003011838 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9384564B2 (en) 2007-11-19 2016-07-05 Microsoft Technology Licensing, Llc Rendering of data sets comprising multiple-resolution samples
US10163229B2 (en) 2007-11-19 2018-12-25 Microsoft Technology Licensing, Llc Rendering of data sets comprising multiple-resolution samples
CN102103512A (en) * 2009-12-22 2011-06-22 Intel Corporation Compiling for programmable culling unit
JP2011134326A (en) * 2009-12-22 2011-07-07 Intel Corp Compiling for programmable culling unit
EP2348407A1 (en) * 2009-12-22 2011-07-27 Intel Corporation Compiling for programmable culling unit
US9038034B2 (en) 2009-12-22 2015-05-19 Intel Corporation Compiling for programmable culling unit
CN110033520A (en) * 2017-12-24 2019-07-19 达索系统公司 The visibility function of three-dimensional scenic
CN111937039A (en) * 2018-01-25 2020-11-13 顶点软件有限公司 Method and apparatus for facilitating 3D object visualization and manipulation across multiple devices
US11924442B2 (en) 2018-11-20 2024-03-05 Koninklijke Kpn N.V. Generating and displaying a video stream by omitting or replacing an occluded part
CN110276839A (en) * 2019-06-20 2019-09-24 武汉大势智慧科技有限公司 A kind of bottom fragment minimizing technology based on outdoor scene three-dimensional data

Also Published As

Publication number Publication date
WO2006115716A3 (en) 2007-05-18


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase in:

Ref country code: DE

NENP Non-entry into the national phase in:

Ref country code: RU

122 Ep: pct application non-entry in european phase

Ref document number: 06749274

Country of ref document: EP

Kind code of ref document: A2