EP2052366A2 - Simultaneous simulation of markov chains using quasi-monte carlo techniques - Google Patents
Simultaneous simulation of markov chains using quasi-monte carlo techniquesInfo
- Publication number
- EP2052366A2 EP2052366A2 EP07814103A EP07814103A EP2052366A2 EP 2052366 A2 EP2052366 A2 EP 2052366A2 EP 07814103 A EP07814103 A EP 07814103A EP 07814103 A EP07814103 A EP 07814103A EP 2052366 A2 EP2052366 A2 EP 2052366A2
- Authority
- EP
- European Patent Office
- Prior art keywords
- simulating
- graphics system
- computer graphics
- further improvement
- sorting
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/06—Ray-tracing
Definitions
- Source code listing is referred to herein as the "Computer Program Listing,” and is organized into sections identified by a three-digit reference number in the format "1.1.1.”
- the present invention relates generally to methods and systems for image rendering in and by digital computing systems, such as for motion pictures and other applications, and in particular, relates to methods, systems, devices, and computer software for real-time, precision ray tracing. Background of the Invention
- ray tracing describes a technique for synthesizing photorealistic images by identifying all light paths that connect light sources with cameras and summing up these contributions. The simulation traces rays along the line of sight to determine visibility, and traces rays from the light sources in order to determine illumination.
- L'Ecuyer et al. Randomized Quasi- Monte Carlo Simulation of Markov Chains with an Ordered State Space (2004) (“L'Ecuyer”), which is incorporated herein by reference as if set forth in its entirety.
- L'Ecuyer discusses the use of a randomized quasi-Monte Carlo method for estimating the state distribution at each step of a Markov chain with totally ordered (discrete or continuous) state space.
- the number of steps in the chain can be random and unbounded.
- the method can be used in particular to get a low-variance unbiased estimator of the expected total cost up to some random stopping time, when state dependent costs are paid at each step. That paper provides numerical illustrations where the variance reduction with respect to standard Monte Carlo is substantial.
- step j of the chain it reorders the n copies according to their states and simulates the transitions (i.e., next states) for the n copies by employing the elements njto nj + n - ⁇ of the (0, 2)- sequence in place of uniform random numbers to drive the simulation. It assumes that simulating each transition of the chain requires a single uniform random variate.
- L'Ecuyer attempts to generalize the technique to Markov chains with continuous state space, with a random and unbounded number of steps (and it is claimed that such a technique permits one to cover regenerative simulation, in particular), and for which the number d of uniform random variates that are required to generate the next state in one step of the Markov chain can be larger than 1.
- the present invention provides methods, systems, apparatus, and computer software/computer code products suitable for use in a computer graphics system for generating a pixel value for a pixel in an image, the pixel value being representative of points in a scene as recorded on an image plane of a simulated camera, the computer graphics system being configured to generate the pixel value for an image using a selected ray-tracing methodology comprising the simulating of at least one ray shot from the pixel into a scene along a selected direction, the ray-tracing methodology further comprising the calculating of the intersections of rays and surfaces of objects in the scene and the simulating of trajectories of rays illuminating objects in the scene, the simulating of trajectories using Markov chains.
- methods, systems, apparatus and computer software/computer code products in accordance with the invention comprise simulating (and/or means for simulating) the Markov chains using a quasi-Monte Carlo methodology, wherein the simulating of Markov chains comprises sorting states, and wherein the sorting comprises proximity sorting.
- a further aspect of the invention comprises simulating a plurality of Markov chains simultaneously, and the simulating of a plurality of Markov chains simultaneously comprises utilizing quasi-Monte Carlo points, which can be selected using any of a Halton sequence, a variant of a Halton sequence, a lattice sequence or a (t, s)-sequence.
- the simulating of a plurality of Markov chains simultaneously can utilize any number of chains.
- the simulating of a plurality of Markov chains simultaneously comprises adding additional trajectories on the fly, such as by trajectory splitting.
- the simulating can also comprise utilizing a technique simulating particle absorption, and can include using a Russian Roulette technique.
- the simulating of a plurality of Markov chains simultaneously for an s-dimensional problem comprises the following technique:
- the invention can also comprise ordering states by proximity in state space, and techniques of randomization. Any selected set or sequence of points with any selected randomization can be employed in conjunction with the methods of the present invention.
- the invention further comprises the following technique:
- Another aspect of the invention comprises sorting by luminosity, or the norm of state; sorting using a spatial hierarchy; and sorting using buckets enumerated by at least one space filling curve.
- the sorting using a spatial hierarchy can comprise utilizing a binary hierarchy, and the spatial hierarchy can comprise any of a BSP-tree, kD-tree or other axis-aligned subset of a BSP-tree structure, or bounding volume hierarchies, or regular voxels.
- the binary hierarchy can be constructed by recursively subdividing using planes selected by a selected heuristic; and then traversing the hierarchy in in- order to enumerate the leaves of the hierarchy in an order of proximity.
- Sorting using a spatial hierarchy can also comprise utilizing bucket sorting and a selected space-filling curve; and in one aspect of the invention, using regular voxels.
- the simulation and/or sorting processing comprises: bounding an object to be rendered by an axis-aligned bounding box; recursively or iteratively dividing the bounding box into successive left and right branches, each terminating in a leaf, by applying splitting planes passing through selected points of the object, thereby generating a plurality of leaves; ordering the leaves hierarchically according to their respective luminosities; performing a bucket sort on a matrix of selected size, divided into a selected number of buckets; and using a selected space-filling curve to proceed through the matrix, wherein within an individual bucket, a smaller space-filling curve is used to proceed through individual cells in the matrix.
- the simulation processes of the present invention can be adapted for configuration to solve integral equations.
- an underlying light transport simulating integral equation can be reformulated as a path integral, such that sampling path space corresponds to simulating Markov chains, in which paths are established by ray tracing and scattering events; wherein an initial distribution is determined by emission characteristics of light sources or sensors, and transition probabilities are determined by bi-directional reflectance distribution functions and bi-directional subsurface scattering distribution functions for a surface in the image. Transparency effects can be processed by utilizing the path integral, initial distribution and transition probabilities.
- the path integral can be solved by utilizing either high dimensional quasi- Monte Carlo points or by padding low dimensional quasi-Monte Carlo points, and the integral equation underlying light transport simulation is considered either as a Fredholm or a Volterra integral equation.
- FIG. 1 is a schematic diagram of a conventional digital processing system in which the present invention can be deployed.
- FIG. 2 is a schematic diagram of a conventional personal computer, or like computing apparatus, in which the present invention can be deployed.
- FIG. 3 is a diagram illustrating an overall method in accordance with a first aspect of the present invention.
- FIG. 4 is diagram of a ray tracing procedure, illustrating the problem of self-intersection.
- FIG. 5 shows a diagram, in elevation view, of a partitioned axis-aligned bounding box that is used as an acceleration data structure in accordance with a further aspect of the invention.
- FIGS. 6-8 are a series of diagrams, in isometric view, of the axis-aligned bounding box shown in FIG. 5, illustrating the partitioning of the bounding box with L- and R-planes.
- FIGS. 9 and 10 are flowcharts of ray tracing methods according to further aspects of the invention.
- FIG. 11 is diagram illustrating the shift ⁇ s(n).
- FIG. 12 shows an illustrative Halton sequence along with its stratification properties.
- FIG. 13 is a diagram illustrating a sampling transport path space by bidirectional path tracing.
- FIGS. 14A-B and 15A-B are rendered images illustrating differences between simulations using a Monte Carlo technique and simulations using an array-randomized quasi-Monte Carlo technique.
- FIGS. 16A-D are a series of diagrams illustrating photon trajectories from a plurality of starting points on a light source.
- FIGS. 17A-G, 18A-B, and 19A-C are a series of diagrams illustrating the sorting of states into leaves of a spatial hierarchy, thereby defining an order of proximity by traversing the hierarchy in in-order.
- FIGS. 20A-D are, respectively, a test scene "Shirley 6," a graph showing average RMS error to a master solution, a graph showing global variance, and a graph showing pixel-based variance.
- FIGS. 21 A-D are, respectively, a test scene "Invisible Date," a graph showing average RMS error to a master solution, a graph showing global variance, and a graph showing pixel-based variance.
- FIGS. 22A-C are a visual comparison of the test scene "Invisible Date” rendered, respectively, using Monte Carlo, randomized quasi-Monte Carlo, and randomized quasi-Monte Carlo with sorting.
- FIGS. 23 A-D are, respectively, a schematic view of a labyrinth test scene, a graph showing the average amount of total radiance received by the camera using each technique, an enlarged portion of the graph, and a table displaying the deviation from a master solution.
- FIGS. 24-30 are flowcharts of a number of techniques and sub-techniques according to various described aspects of the invention.
- the present invention provides improved techniques for ray tracing, and for the efficient construction of acceleration data structures required for fast ray tracing.
- the following discussion describes methods, structures and systems in accordance with these techniques. It will be understood by those skilled in the art that the described methods and systems can be implemented in software, hardware, or a combination of software and hardware, using conventional computer apparatus such as a personal computer (PC) or equivalent device operating in accordance with, or emulating, a conventional operating system such as Microsoft Windows, Linux, or Unix, either in a standalone configuration or across a network.
- PC personal computer
- the various processing means and computational means described below and recited in the claims may therefore be implemented in the software and/or hardware elements of a properly configured digital processing device or network of devices. Processing may be performed sequentially or in parallel, and may be implemented using special purpose or reconfigurable hardware.
- Methods, devices or software products in accordance with the invention can operate on any of a wide range of conventional computing devices and systems, such as those depicted by way of example in FIG. 1 (e.g., network system 100), whether standalone, networked, portable or fixed, including conventional PCs 102, laptops 104, handheld or mobile computers 106, or across the Internet or other networks 108, which may in turn include servers 110 and storage 112.
- a software application configured in accordance with the invention can operate within, e.g., a PC 102 like that shown in FIG. 2, in which program instructions can be read from a CD-ROM 116, magnetic disk or other storage 120 and loaded into RAM 114 for execution by CPU 11 S. Data can be input into the system via any known device or means, including a conventional keyboard, scanner, mouse or other elements 103.
- FIG. 3 is a diagram depicting an overall method 200 in accordance with an aspect of the present invention.
- the method is practiced in the context of a computer graphics system, in which a pixel value is generated for each pixel in an image. Each generated pixel value is representative of a point in a scene as recorded on an image plane of a simulated camera.
- the computer graphics system is configured to generate the pixel value for an image using a selected ray-tracing methodology.
- the selected ray-tracing methodology includes the use of a ray tree that includes at least one ray shot from the pixel into a scene along a selected direction, and further includes calculations of the intersections of rays and objects (and/or surfaces of objects) in the scene.
- bounding volume hierarchies are used to calculate the intersections of rays and surfaces in the scene.
- a bounding box of a scene is computed.
- the termination criterion is defined as a condition at which the bounding box coordinates differ only in one unit of resolution from a floating point representation of the ray/surface intersection point.
- the scope of the present invention extends to other termination criteria.
- bounding volume hierarchies as an acceleration structure is advantageous for a number of reasons.
- the memory requirements for bounding volume hierarchies can be linearly bounded in the number of obj ects to be ray traced.
- bounding volume hierarchies can be constructed much more efficiently than 3D-trees, which makes them very suitable for an amortized analysis, as required for fully animated scenes.
- FIG. 4 is a diagram illustrating the "self-intersection" problem.
- FIG. 4 shows a ray tracing procedure 300, including an image surface 302, an observation point 304, and a light source 306.
- a series of computations are performed in order to locate rays extending between the observation point 304 and the surface 302.
- FIG. 4 shows one such ray 308.
- the calculated ray/surface intersection point 312 due to floating point arithmetic computations on computers, it is sometimes possible for the calculated ray/surface intersection point 312 to be different from the actual intersection point 310. Further, as illustrated in FIG. 4, it is possible for the calculated point 312 to be located on the "wrong" side of the surface 302. In that case, when computations are performed to locate a secondary ray 314 extending from the calculated ray/surface intersection point 312 to the light source 306, these computations indicate that the secondary ray 314 hits the surface 302 at a second intersection point 316 rather than extending directly to the light source 306, thus resulting in an imaging error.
- An aspect of the invention provides a more precise alternative. Alter arriving at a calculated ray/surface intersection point 312, the calculated point 312 and the direction of the ray 308 are then used to re-compute an intersection point that is closer to the actual intersection point 310. This re-computation of the intersection point is incorporated into the ray tracing technique as an iteration that increases precision. If the iteratively computed intersection point turns out to be on the "wrong" side of the surface 302, it is moved to the "correct" side of the surface 302. The iteratively computed intersection point can be moved along the surface normal, or along the axis determined by the longest component of the normal. Instead of using a global floating point £, the point is moved by an integer e to the last bits of the floating point mantissas.
- the described procedure avoids computations in double precision and has the advantage that it implicitly adapts to the scale of the floating point number, which is determined by its exponent.
- all secondary rays directly start from these modified points making an e-offset unnecessary.
- intersection computation it can therefore be assumed that the ray interval of validity to begin at 0 rather than some offset.
- Modifying the integer representation of the mantissa also avoids numerical problems when intersecting a triangle and a plane in order to decide which points are on what side.
- intersections of rays and freeform surfaces can be found by refining an axis-aligned bounding box, which contains the point of intersection nearest to the ray origin. This refinement can be continued until the resolution of floating point numbers is reached, i.e., until the bounding box coordinates differ only in one unit of resolution from the floating point representation.
- the self-intersection problem then is avoided by selecting the bounding box corner that is closest to the surface normal in the center of the bounding box. This corner point then is used to start the secondary ray. This "ray object intersection test" is very efficient and benefits from the avoidance of the self-intersection problem.
- the test first determines the intersection of the ray and the plane of the triangle and then excludes intersections outside the valid interval ]0, resulttfar] on the ray. This is achieved by only one integer test. Note that the +0 is excluded from the valid interval. This is important if denormalized floating point numbers are not supported. If this first determination is successful, the test proceeds by computing the Barycentric coordinates of the intersection. Note that again only an integer test, i.e., more specifically only testing two bits, is required to perform the complete inclusion test. Thus the number of branches is minimal. In order to enable this efficient test, the edges and the normal of the triangle are scaled appropriately in the transformation step.
- the precision of the test is sufficient to avoid wrong or missed ray intersections. However, during traversal situations may occur in which it is appropriate to extend the triangles for a robust intersection test. This can be done before transforming the triangles. Since the triangles are projected along the axis identified by the longest component of their normal, this projection case has to be stored. This is achieved by counters in the leaf nodes of the acceleration data structure: The triangle references are sorted by the projection case and a leaf contains a byte for the number of triangles in each class.
- a further aspect of the present invention provides an improved approach for constructing acceleration data structures for ray tracing.
- the approach described herein yields significantly flatter trees with superior ray tracing performance.
- Candidates for splitting planes are given by the coordinates of the triangle vertices inside the axis-aligned bounding box to be partitioned. Note that this includes vertices that actually lie outside the bounding box, but have at least one coordinate that lies in one of the three intervals defined by the bounding box. Out of these candidates, there is selected the plane closest to middle of the longest side of the current axis- aligned bounding box.
- a further optimization selects only coordinates of triangles whose longest component of the surface normal matches the normal of the potential splitting plane. This procedure yields much flatter trees, since placing splitting planes through the triangle vertices implicitly reduces the number of triangles split by splitting planes. In addition, the surface is approximated tightly and empty space is maximized. If the number of triangles is higher than a specified threshold and there are no more candidates for splitting planes, the box is split in the middle along its longest side. This avoids inefficiencies of other approaches, including the use, for example, of long diagonal objects.
- the recursive procedure of deciding which triangles belong to the left and right child of a node in the hierarchy has typically required extensive bookkeeping and memory allocation. There is a much simpler approach that only fails in exceptional cases.
- the first array is initialized with the object references.
- a stack of the elements on the left is grown from the beginning of the array, while the elements, which are classified right, are kept on a stack growing from the end of the array towards the middle.
- the second array keeps a stack of them.
- backtracking is efficient and simple. Instead of pruning branches of the tree by using the surface area heuristic, tree depth is pruned by approximately left-balancing the binary space partition starting from a fixed depth.
- Section 2.3.1 of the Computer Program Listing sets forth a listing of source code in accordance with this aspect of the invention.
- each object to be ray traced is referenced exactly once.
- no mailbox mechanisms are required to prevent the multiple intersection of an object with a ray during the traversal of the hierarchy. This is a significant advantage from the viewpoint of system performance and makes implementations on a shared memory system much simpler.
- a second important consequence is that there cannot be more inner nodes in the tree of a bounding volume hierarchy than the total number of objects to be ray- traced.
- the memory footprint of the acceleration data structure can be linearly bounded in the number of objects before construction.
- FIG. S is a diagram illustrating the principal data structure 400.
- FIG. S shows an axis-aligned bounding box 402, in elevation view.
- An L-plane 402 and an R-plane 404 which are axis-aligned and parallel with each other, are used to partition bounding box 402 into left and right axis-aligned bounding box.
- the left bounding box extends from the left wall 406 of the original bounding box 402 to the L-plane 402.
- the right bounding box extends from the R-plane 404 to the right wall 408 of the original bounding box 402.
- the traversal of ray 412 is determined by the positions of intersection with the L- and R-planes 402 and 404 relative to the interval of validity [N, F
- the L- and R-planes 402 and 404 are positioned with respect to each other to partition the set of objects contained within the original bounding box 402, rather than the space contained within the bounding box 402.
- having two planes offers the possibility of maximizing the empty space between the two planes. Consequently the boundary of the scene can be approximated much faster.
- FIGS. 6-8 are a series of three-dimensional diagrams further illustrating data structure 400.
- FIG. 6 shows a diagram of bounding box 402.
- virtual objects within bounding box 402 are depicted as abstract circles 416.
- L-plane 404 and R-plane 406 are then used to partition bounding box 402 into a left bounding box 402a and a right bounding box 402b.
- the L- and R-planes are selected such that the empty space between them is maximized.
- Each virtual object 416 ends up in either the left bounding box 402a or the right bounding box 402b. As shown at the bottom of FIG.
- the virtual objects 416 are partitioned into "left" objects 416a and "right” objects 416b.
- Each of the resulting bounding boxes 402a and 402b are themselves partitioned, and so on, until a termination criterion has been satisfied.
- FIG. 9 is a flowchart of the described method 500.
- step 501 a bounding box of a scene is computed.
- step 502 parallel L- and R-planes are used to partition the axis-aligned bounding box left and right axis-aligned bounding boxes, which may overlap.
- step 503 the left and right bounding boxes are used to partition the set of virtual objects contained with the original axis-aligned bounding box into a set of left objects and a set of right objects.
- step 504 the left and right objects are processed recursively until a termination criterion is met.
- two split parameters are stored within a node. Since the number of nodes is linearly bounded by the number of objects to be ray traced, an array of all nodes can be allocated once. Thus, the costly memory management of 3D-trees during construction becomes unnecessary.
- the construction technique is much simpler than the analog for 3D-tree construction and is easily implemented in a recursive way, or by using an iterative version and a stack.
- Given a list of objects and an axis-aligned bounding box the L- and R-planes are determined, and the set of objects is determined accordingly.
- the left and right objects are then processed recursively until some termination criterion is met. Since the number of inner nodes is bounded, it is safe to rely on termination when there is only one object left. It should be noted that the partition only relies on sorting objects along planes that are perpendicular to the x-, y-, and z-a ⁇ es, which is very efficient and numerically absolutely stable.
- one technique is to determine a planeM418 using the 3D-tree construction technique described above and partition the objects such that the overlap of the resulting L-plane and R-plane of the new axis-aligned bounding boxes minimally overlaps the suggested splitting plane ⁇ /418.
- the resulting tree is very similar to the corresponding 3D-tree, however, since the object sets are partitioned rather than space, the resulting tree is much flatter.
- Another approach is to select the R-plane and L-plane in such a way that the overlap of child boxes is minimal and the empty space is maximized if possible. It should be noted that for some objects axis- aligned bounding boxes are inefficient. An example of such a situation is a long cylinder with small radius on the diagonal of an axis-aligned bounding box.
- FIG. 10 is a flowchart of a method 600 according to this aspect of the invention.
- a bounding box of a scene is computed.
- a 3D-tree construction is executed to determine a splitting plane M.
- parallel L- and R-planes are used to partition the axis-aligned bounding box into left and right axis-aligned bounding boxes that minimally overlap the splitting plane M
- the left and right bounding boxes are used to partition the set of virtual objects contained within the original axis-aligned bounding box into a set of left objects and a set of right objects.
- the left and right objects are processed recursively until a termination criterion is met.
- method 600 illustrated in FIG. 10, as well as the method 200 illustrated in FIG. 3, may be combined with other techniques described herein, including techniques relating to 3D-tree construction, real-time processing, bucket sorting, self-intersection, and the like.
- the spatial subdivision is continued so as to cut off the empty portions of the space around the object.
- partitioning such objects into smaller ones results in a similar behavior.
- a maximum bounding box size is defined. All objects with an extent that exceeds the maximum bounding box size are split into smaller portions to meet the requirement. The maximum allowed size can be found by scanning the data set for the minimal extent among all objects.
- the data structure described herein allows the transfer of the principles of fast 3D-tree traversal to bounding volume hierarchies.
- the cases of traversal are similar: (1) only the left child; (2) only the right child; (3) the left child and then the right child; (4) the right child and then the left child; or (S) the ray is between split planes (i.e., empty space). Since one node in the described technique is split by two parallel planes, the order of how to traverse the boxes is determined by the ray direction.
- FIG. 6 sets forth a source code listing incorporating the techniques described above. Previous bounding volume hierarchy techniques could not efficiently determine the order of how to traverse the child nodes or required additional effort, such as updating a heap data structure.
- the described bounding volume hierarchy also can be applied to efficiently find ray freeform surface intersections by subdividing the freeform surface. Doing so allows the intersection of a freeform surface with a convex hull property and a subdivision technique efficiently to be computed up to floating point precision, depending on the actual floating point arithmetic.
- a subdivision step is performed, for example, for polynomial surfaces, rational surfaces, and approximating subdivision surfaces. For each axis in space the possibly overlapping bounding boxes are determined as discussed above. In case of a binary subdivision, the intersection of the L-boxes and the intersection of the R-boxes for new bounding boxes of the new meshes. Now the above-described traversal can be efficiently performed, since the spatial order of the boxes is known.
- Section 2.1.1 of the Computer Program Listing sets forth a code listing in accordance with this aspect of the invention. Using regular grids as an acceleration data structure in ray tracing is simple, but efficiency suffers from a lack of spatial adaptivity and the subsequent traversal of many empty grid cells.
- Hierarchical regular grids can improve on the situation, but still are inferior as compared to bounding volume hierarchies and 3D-trees.
- regular grids can be used to improve on the construction speed of acceleration data structures.
- the technique for constructing the acceleration data structures are similar to quick sorting and are expected to run in O ⁇ n log ri).
- An improvement can be obtained by applying bucket sorting, which runs in linear time. Therefore the axis-aligned bounding box of the objects is partitioned into n x * n y x n z axis-aligned boxes. Each object then is sorted into exactly one of these boxes by one selected point, e.g., the center of gravity or the first vertex of each triangle could be used.
- the acceleration data structures can be built on demand, i.e., at the time when a ray is traversing a specific axis-aligned bounding box with its objects.
- the acceleration data structure never becomes refined in regions of space, which are invisible to the rays, and caches are not polluted by data that is never touched.
- the objects possibly intersected by a ray are already in the caches.
- the present invention addresses long known issues in ray tracing and provides techniques for ray tracing having improved precision, overall speed and memory footprint of the acceleration data structures.
- the improvements in numerical precision transfer to other number systems as well as, for example, to the logarithmic number system used in the hardware of the ART ray tracing chips.
- the specific implementation of the IEEE floating point standard on a processor or a dedicated hardware can severely influence performance. For example, on a Pentium 4 chip denormalized numbers can degrade performance by a factor of 100 and more. As discussed above, an implementation of the invention avoids these exceptions.
- the view of bounding volume hierarchies described herein makes them suited for realtime ray tracing.
- the described techniques outperform the previous state of the art, thus allowing more precise techniques to be used, for example, for computing motion blur in fully animated scene, as in a production setting or the like. It will be apparent from the above discussion that the described bounding volume hierarchies have significant advantages when compared with 3D-trees and other techniques, particularly in hardware implementations and for huge scenes. In an amortized analysis, the described bounding volume hierarchies outperform current 3D-trees by at least a factor of two. In addition, the memory footprint can be determined beforehand and is linear in the number of objects.
- the following described aspect of the invention provides methods, techniques and systems for efficient photorealistic simulation of light transport, suitable for use in computer graphics and other applications in which a computer or other digital processor is used to calculate pixel values for pixels in an image, using ray-tracing techniques to calculate ray-surface intersections.
- the present invention provides improvements to conventional ray-tracing techniques, including methods for making ray-surface intersection testing more efficient, by applying sorting strategies that enable rapid processing of objects within a scene illuminated by packets or units of light, such as photons, the trajectories of which are simulated using Markov chains, the Markov chains being simulated using a suitable quasi-Monte Carlo (QMC) methodology.
- the sorting strategies include proximity sorting, and the invention advantageously exploits the low dimensionality of the QMC methodology using such proximity sorting.
- the described techniques and systems can be used to solve the integral equation in close to real-time, a task that has presented significant difficulty in conventional systems.
- a Markov chain is a discrete-time stochastic sequence of system states.
- a Markov chain is a "memoryless" process, in which previous states are irrelevant for predicting the probability of subsequent states, given knowledge of the current state.
- Markov chains can be used to model and analyze a number of different phenomena. For example, the English language has been modeled using a Markov chain, based upon the calculation of the relative frequencies of one word following another. Using these transition probabilities, it is possible to generate English sentences.
- Many other evolution processes can also be described using Markov chains, including, for example: the analysis of queues, Brownian motion, particle transport, and, of particular relevance to the present discussion, light transport simulation.
- Markov chains can be used to simulate the trajectories of photons illuminating objects in a scene in a simulated image.
- Light transport simulation is characterized by complicated domains and discontinuities, and reduction to high-dimensional integrals.
- light transport simulation is characterized by a low-dimensional structure, due to repeated application of identical operators.
- Markov chains may themselves be simulated using quasi-Monte Carlo (QMC) techniques.
- QMC quasi-Monte Carlo
- the general idea behind QMC techniques is that, rather than attempting to condition randomly generated numbers in some way, one starts with a deterministic sequence that is known to provide very uniform coverage of a selected interval.
- randomized QMC the sequence is randomized in such a way that the numbers have a random appearance, but as a group retain the desirable property of covering the selected interval in a regular way.
- Markov chains may be simulated simultaneously.
- the idea of improving the simultaneous simulation of Markov chains with quasi-Monte Carlo methods by an intermediate sorting step was originally introduced by Lecot in a series of papers dealing with the Boltzmann equation and, later on, the heat equation. This idea was then used and refined for solving the heat equation on a grid by Morokoff and Caflisch and recently extended by L'Ecuyer, Tuffin, Demers et al. to incorporate randomized versions of the technique and splitting for rare event simulation. Independent research, in a way related to the above approaches, has been conducted in the field of computer graphics.
- X n,l is the state of chain / at time step n.
- the simulation may be performed as follows:
- N b m chains
- a (t, 2)-sequence x i in base b such that for m > t, the b m subsequent components x i, 1 form a (0, m, 1)-net.
- the technique then continues by sorting the states (described in detail in Section 3, below) and continues the chains by computing using for 0 ⁇ / ⁇ N to realize transitions according to P.
- the index for selecting the next state for transition in fact uses the van der Corput sequence ⁇ 2 in
- base 2 which is a (0, l)-sequence and thus a sequence of (0, m, l)-nets.
- all indices used during the different time steps n are in fact identical for all m.
- FIG. 11 shows a two-dimensional array 700 illustrating the shift ⁇ s(n).
- the low dimensional setting may result in a misleading interpretation of the samples being a shifted lattice or stratified samples, as the entirety of the ⁇ s(l) for
- the radical inversion implies stratification, with subsequent points that fall into the biggest "holes.” This property is good for any starting point; (t, s)-sequences have the same property.
- This observation can be taken as an explanation for some of the effects seen using earlier approaches: Sobol and Korobov points used in the array-(R)QMC simulation are worse up to an order of magnitude in variance reduction than their transformed (Gray-code for Sobol, Baker transform for Korobov) counterparts. The explanation for this is found in the structure of the points.
- the sequence of (t, m, s)- nets extracted from the Sobol sequence is locally worse than its Gray-code variant. The same goes for the Korobov lattice and its transformed variant.
- the technique has a number of interesting properties.
- First, the technique is unbiased for one tabulated set of x; and for a continuous stream of x / .
- FIG. 13 is a diagram illustrating sampling transport path space 740 by bidirectional path tracing. Trajectories from the eye 742 and the light sources 744 to scene objects 746 are generated by Markov chains and connected to determine the transported amount of light.
- the sampling path space, shown in FIG. 13, corresponds to simulating Markov chains, where the paths are established by ray tracing and scattering events.
- the initial distribution is determined by the emission characteristics of the light sources 744 and the transition probabilities are given by bidirectional reflectance distribution functions on the hit surfaces.
- FIGS. 14A-B and 15A-B are rendered images 760, 770, 780 and 790 illustrating differences between simulations using a Monte Carlo technique and simulations using an array-randomized quasi-Monte Carlo (A-RQMC) technique.
- the simulations using a Monte Carlo technique are shown in images 760 and 780 in FIGS. 14A and 15A.
- the simulations using an array-randomized quasi-Monte Carlo technique are shown in images 770 and 790 in FIGS. 14B and 15B.
- the integral equation underlying light transport can be considered either as a Fredholm or Volterra integral equation, which matters for implementation.
- FIGS. 16A-D show a series of diagrams 800, 810, 820, and 830 illustrating photon trajectories 802, 812, 822, and 832 that are started from a plurality of starting points 804, 814, 824, and 834 on a light source.
- the bidirectional reflectance distribution function is sampled to determine a direction of scattering to continue the path.
- ⁇ r is the bidirectional reflectance distribution function describing the optical surface properties and L is the incident radiance. Using this integral operator results in a Volterra integral equation of the second kind, as the integration domain depends on x.
- (t, s)-sequences are sequences of (t, m, s)-nets in base b, and are similar to Halton sequences and their variants. Similar states X n , t use successive points. Therefore, states should be ordered by proximity in state space. The closer the states are in the state space, the more uniformly it will be explored.
- a second possibility to enable multidimensional sorting is the usage of a spatial hierarchy to define an order on the states.
- Efficient data structures for this purpose are the BSP-tree, its specialized axis aligned subset, the kD-tree, or bounding volume hierarchies.
- the construction of such binary hierarchies is simple: The space is recursively subdivided using planes selected by some heuristic. The construction runs in O(Nlog N) on the average. Traversing the hierarchy in in-order enumerates the leaves in an order of proximity. This traversal becomes trivial, if the tree is left- balanced and in consequence can be stored in an array.
- FIGS. 17A-G, 18A-B, and 19A-C are a series of diagrams illustrating that sorting the states into the leaves of a spatial hierarchy defines an order of proximity by traversing the hierarchy in in-order.
- FIG. 17A shows an exemplary scene 840, including a light source 850 and an object 860 to be rendered.
- the object 860 is a five-pointed star.
- the object 860 is bounded by an axis-aligned bounding box 870.
- the bounding box 870 is iteratively subdivided into branches. Each branch terminates in a leaf.
- the bounding box 870 is divided into a first-generation left branch L and a first-generation right branch R by a splitting plane 871 passing through a first vertex 861 of the object 860.
- the left branch L is divided by a second splitting plane 872 passing through a second vertex 862 of the object 860, resulting in a second-generation left branch LL and a second-generation right branch LR.
- a third splitting plane 873 passes through a third vertex 863 of the object 860, splitting branch LR into third-generation left and right branches LRL and LRR.
- fourth and fifth splitting planes 874 and 875 pass through fourth and fifth vertices 864 and 865 of the object 860.
- the result is five leafs LRL, LRR, RR, LLL, LLR and RL, one for each of the five points of the star 860.
- FIG. 17G illustrates the illumination of individual points within each leaf.
- each leaf has been shaded to indicate its luminosity.
- the leaves are hierarchically ordered according to their respective luminosities. From “darkest” to “brightest,” the leaves are arranged as follows: LRL, LRR, LLR, RR, RL.
- bucket sorting can be used.
- multidimensional states were sorted into buckets by the first dimension of the state, then the states of each bucket were sorted into buckets according to the second dimension, and so forth.
- This procedure works well, but has the problem that states close in state space can be separated by the sorting procedure.
- a stratification of each dimension has to be used, which induces the curse of dimension in the number of Markov chains to be simulated simultaneously.
- the state space is therefore into equal volume cells, or "voxels,” which serve as buckets.
- the bucket of each state is found by truncating the state coordinates according to the resolution of the voxel grid. Note that this applies for continuous as well as discrete state spaces. Enumerating the voxels by proximity yields the desired order on the states and can be done in linear time in the number of voxels.
- FIGS. 19A-C show a series of diagrams 890, 892, and 894 illustrates the Z- curve 891, 893, and 895 in two dimensions for 2x2 (FIG. 19A), 4x4 (FIG. 19B), and 16 x 16 (FIG. 19C) buckets.
- the origin (0, 0) top left the point marked by x has the integer coordinates (3, 4), which corresponds to (011, 100) 2 in the binary system.
- Its binary Z-curve index 100101 2 is computed by bitwise interleaving the binary coordinates. This is straightforward and fast to compute for any dimension and problem size, and requires no additional memory. It has been found that the results are not as good as, for example, the Hilbert curve in a global context.
- MC Uniform random numbers generated by the Mersenne Twister were used for classical Monte Carlo sampling.
- RQMC Used the high-dimensional Halton sequence with permutations by Faure randomized by a Cranley-Patterson rotation, where pairs of components were used to sample the two-dimensional emission and scattering events.
- lo-dim Used the two-dimensional Halton sequence randomized by a RQMCS: Cranley-Patterson rotation. The Z-curve was used to enumerate the bucket-sorted states.
- hi-dim Used the high-dimensional Halton sequence with permutations RQMCS: by Faure randomized by a Cranley-Patterson rotation. The Z-curve was used to enumerate the bucket-sorted states.
- FIG. 2OA shows a test scene "Shirley 6," which is a relatively simple light transport setting
- FIG. 21 A shows a test scene "Invisible Date,” which is a more complicated light transport setting
- FIGS. 2OB and 21B are graphs 901 and 911 showing average RMS error to a master solution
- FIGS. 2OC and 21C are graphs 902 and 912 showing global variance (averaged over the whole image)
- FIGS. 2OD and 21D are graphs 903 and 913 showing pixel-based variance.
- the numbers were obtained by averaging 10 independent runs for a varying number of Markov chains. The measured numbers only convince for the simple test scene. In even the complicated cases, no performance gain over Monte Carlo sampling can be measured, because the number of independent runs is too small. More experiments were not possible due to excessive running times.
- FIGS. 22A-C show a visual comparison for the test scene "Invisible Date" using 300 chains for simulation.
- FIG. 22A shows the test scene rendered using Monte Carlo
- FIG. 22B shows the test scene rendered using randomized quasi-Monte Carlo
- FIG. 22C shows the test scene rendered using randomized quasi-Monte Carlo with sorting.
- the only light source is not visible as it is placed on the ceiling of the neighboring room. Due to the better distribution of the photons randomized quasi-Monte Carlo (RQMC) (FIG.
- RQMC randomized quasi-Monte Carlo
- RQMC with sorting (RQMCS) (FIG. 22C) (using 256 3 voxels for the bucket sort) is even superior as more photons made it into the second room and even the back of the door is lit very well.
- FIGS. 23 A-D shows measurements for a very difficult light transport problem, where photons were traced directly from the light sources and the final path segment was connected to the camera (one technique of bidirectional path tracing).
- FIG. 23 A shows a schematic view of the labyrinth test scene 940, where floor and roof have been removed for illustration. The camera is situated in the room at the bottom, while a light source is located on the other end of the connecting tunnel.
- FIG. 23B is a "Box-and-Whisker" plot, showing the average amount (marked by x) of the total radiance received by the camera for 1048576 simulated light paths for 50 independent realizations using each technique.
- the graph 960 shown in FIG. 23C enlarges the interesting part of the graph shown in FIG. 23B.
- the table 970 shown in FIG. 23D finally displays the deviation from a master solution using the Ll- and L2- norm (variance) for the total radiance received by the camera and received by each pixel (256x256 pixels resolution) averaged over the 50 independent realizations. In this difficult setting the new method (RQMCS) is clearly superior.
- rank- 1 lattice sequences could be applied.
- FIGS. 24-30 are flowcharts of a number of techniques and sub-techniques according to the above-described aspects of the invention. It should be noted that some or all of the following techniques and sub-techniques, including individual boxes and groups of boxes thereof, may be combined as desired to practice the invention, as described herein.
- FIG. 24 shows a flowchart of a first general technique 1000 according to the present invention.
- the technique 1000 includes the following:
- Box 1001 Simulate Markov chains using Quasi-Monte Carlo methodology.
- Box 1002 Sort the states. (May include sorting by state comprising proximity sorting.)
- Box 1003 Simulate plurality of Markov chains simultaneously.
- FIG. 25 shows a flowchart of a further technique 1100 according to the present invention for simulating a plurality of Markov chains simultaneously for an s-dimensional problem.
- the technique 1100 includes the following:
- Box 1103 Sort state vector using a suitable order.
- Box 1104: n : n + l
- Box 1105 Continue chain by computing X n , ! using subsequent quasi-Monte Carlo points xi.
- Boxes 1103-1105 are performed in a loop.
- FIG. 26 is a flowchart of additional techniques 1200, one or more of which may be performed in conjunction with the technique 1100 illustrated in FIG. 25.
- These techniques 1200 include the following: Box 1201 : Sort by luminosity, or the norm of state.
- Box 1202 Sort using a spatial hierarchy.
- binary hierarchy can be constructed by recursively subdividing using planes selected by a selected heuristic; and then traversing the hierarchy in in-order to enumerate the leaves of the hierarchy in an order of proximity; can also include left- balancing the tree and storing the tree in an array data structure to enable efficient traversal of the hierarchy.
- FIG. 27 is a flowchart of a further technique 1300 according to the invention.
- the technique 1300 includes the following:
- Box 1303 Initialize X 0,l using rand(x l ).
- Box 1304 Sort state vector using some order.
- Box 1307 Continue chain by computing X n,l using rand(x l ). Boxes 1304-1307 are performed in a loop.
- FIG. 28 is a flowchart of an additional technique 1400 that may be performed in conjunction with the FIG. 27 technique 1300.
- the additional technique 1400 randomizes Quasi-Monte Carlo points in each time step n, and includes the following:
- Box 1402 XOR samples by s-dimensional random vector, the random vector being generated after each transition step.
- FIG. 29 is a flowchart of a further technique 1500 according to the invention.
- the technique 1500 includes the following: Box 1501: Bound object to be rendered by axis-aligned bounding box.
- Box 1502 Recursively or iteratively divide bounding box into successive left and right branches, each terminating in a leaf, by applying splitting planes passing through selected points of object, thereby generating plurality of leaves.
- Box 1503 Order leaves hierarchically according to their respective luminosities.
- Box 1504 Perform bucket sort on matrix of selected size, divided into selected number of buckets.
- Box 1505 Use a selected space-filling curve to proceed through matrix, wherein within an individual bucket, smaller space-filling curve used to proceed through individual cells in matrix.
- FIG. 30 is a flowchart of a further technique 1600 according to the invention.
- the technique 1600 configures a simulation to enable light-transport simulation for synthesizing realistic-appearing images, and includes the following:
- Box 1601 Reformulate underlying light transport-simulating integral equation as path integral, such that sampling path space corresponds to simulating Markov chains, in which paths are established by ray tracing and scattering events.
- Box 1602 Determine initial distribution by emission characteristics of light sources or sensors, and determine transition probabilities by bi-directional reflectance distribution functions and bi-directional subsurface scattering distribution functions for a surface in image.
- Box 1603 Process transparency effects by utilizing path integral, initial distribution and transition probabilities.
- Box 1604 Solve path integral, with or without any kind of randomization, by utilizing either high-dimensional Quasi -Monte Carlo points or by padding low- dimensional Quasi-Monte Carlo points.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Graphics (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Generation (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Image Analysis (AREA)
Abstract
Description
Claims
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US82241706P | 2006-08-15 | 2006-08-15 | |
PCT/US2007/075966 WO2008022173A2 (en) | 2006-08-15 | 2007-08-15 | Simultaneous simulation of markov chains using quasi-monte carlo techniques |
Publications (1)
Publication Number | Publication Date |
---|---|
EP2052366A2 true EP2052366A2 (en) | 2009-04-29 |
Family
ID=39083085
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP07814103A Withdrawn EP2052366A2 (en) | 2006-08-15 | 2007-08-15 | Simultaneous simulation of markov chains using quasi-monte carlo techniques |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP2052366A2 (en) |
JP (1) | JP4947394B2 (en) |
CA (1) | CA2660190A1 (en) |
WO (1) | WO2008022173A2 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11170254B2 (en) | 2017-09-07 | 2021-11-09 | Aurora Innovation, Inc. | Method for image analysis |
US11334762B1 (en) | 2017-09-07 | 2022-05-17 | Aurora Operations, Inc. | Method for image analysis |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8131770B2 (en) | 2009-01-30 | 2012-03-06 | Nvidia Corporation | System, method, and computer program product for importance sampling of partitioned domains |
JP5633961B2 (en) | 2010-05-19 | 2014-12-03 | インターナショナル・ビジネス・マシーンズ・コーポレーションInternational Business Machines Corporation | Simulation system, method and program |
US20130033507A1 (en) * | 2011-08-04 | 2013-02-07 | Nvidia Corporation | System, method, and computer program product for constructing an acceleration structure |
CN114528728B (en) * | 2022-01-20 | 2024-05-28 | 北京航空航天大学 | Monte Carlo computer simulation method for canopy reflection of aquatic vegetation in line sowing |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7133041B2 (en) * | 2000-02-25 | 2006-11-07 | The Research Foundation Of State University Of New York | Apparatus and method for volume processing and rendering |
EP1405271B1 (en) * | 2001-06-07 | 2008-07-23 | Mental Images GmbH | Rendering images using a russian roulette methodology for evaluating global illumination |
US7184071B2 (en) * | 2002-08-23 | 2007-02-27 | University Of Maryland | Method of three-dimensional object reconstruction from a video sequence using a generic model |
-
2007
- 2007-08-15 EP EP07814103A patent/EP2052366A2/en not_active Withdrawn
- 2007-08-15 WO PCT/US2007/075966 patent/WO2008022173A2/en active Application Filing
- 2007-08-15 CA CA002660190A patent/CA2660190A1/en not_active Abandoned
- 2007-08-15 JP JP2009524776A patent/JP4947394B2/en not_active Expired - Fee Related
Non-Patent Citations (1)
Title |
---|
See references of WO2008022173A3 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11170254B2 (en) | 2017-09-07 | 2021-11-09 | Aurora Innovation, Inc. | Method for image analysis |
US11334762B1 (en) | 2017-09-07 | 2022-05-17 | Aurora Operations, Inc. | Method for image analysis |
US11748446B2 (en) | 2017-09-07 | 2023-09-05 | Aurora Operations, Inc. | Method for image analysis |
US12056209B2 (en) | 2017-09-07 | 2024-08-06 | Aurora Operations, Inc | Method for image analysis |
Also Published As
Publication number | Publication date |
---|---|
JP2010501100A (en) | 2010-01-14 |
WO2008022173A3 (en) | 2008-12-04 |
JP4947394B2 (en) | 2012-06-06 |
CA2660190A1 (en) | 2008-02-21 |
WO2008022173A2 (en) | 2008-02-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7773088B2 (en) | Simultaneous simulation of markov chains using quasi-monte carlo techniques | |
US7499053B2 (en) | Real-time precision ray tracing | |
US7952583B2 (en) | Quasi-monte carlo light transport simulation by efficient ray tracing | |
US8411088B2 (en) | Accelerated ray tracing | |
US8188997B2 (en) | Accelerated ray tracing using shallow bounding volume hierarchies | |
US7495664B2 (en) | Instant ray tracing | |
Ernst et al. | Early split clipping for bounding volume hierarchies | |
EP2008249B1 (en) | Instant ray tracing | |
Jönsson et al. | Historygrams: Enabling interactive global illumination in direct volume rendering using photon mapping | |
US7990380B2 (en) | Diffuse photon map decomposition for parallelization of global illumination algorithm | |
WO2009044282A2 (en) | Quasi-monte carlo light transport simulation by efficient ray tracing | |
WO2009063319A2 (en) | Shallow bounding volume hierarchies for accelerated ray tracing | |
Ogayar et al. | Point in solid strategies | |
Wang et al. | GPU-based out-of-core many-lights rendering | |
WO2007002494A2 (en) | Real-time precision ray tracing | |
WO2008022173A2 (en) | Simultaneous simulation of markov chains using quasi-monte carlo techniques | |
Morrical et al. | Quick clusters: A GPU-parallel partitioning for efficient path tracing of unstructured volumetric grids | |
Lagae et al. | Geometry synthesis by example | |
Figueiredo et al. | An efficient collision detection algorithm for point cloud models | |
Wächter et al. | Efficient simultaneous simulation of Markov chains | |
Dietrich et al. | Terrain guided multi-level instancing of highly complex plant populations | |
Ye et al. | 3D Model Occlusion Culling Optimization Method Based on WebGPU Computing Pipeline. | |
Fan et al. | IBCD: a fast collision detection algorithm based on image space using OBB | |
Semwal | The slicing extent technique for fast ray tracing | |
Dyken et al. | Histopyramids in iso-surface extraction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 20090212 |
|
AK | Designated contracting states |
Kind code of ref document: A2 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC MT NL PL PT RO SE SI SK TR |
|
AX | Request for extension of the european patent |
Extension state: AL BA HR MK RS |
|
RIN1 | Information on inventor provided before grant (corrected) |
Inventor name: WAECHTER, CARSTEN Inventor name: KELLER, ALEXANDER |
|
RIN1 | Information on inventor provided before grant (corrected) |
Inventor name: WAECHTER, CARSTEN Inventor name: KELLER, ALEXANDER |
|
DAX | Request for extension of the european patent (deleted) | ||
RBV | Designated contracting states (corrected) |
Designated state(s): DE ES FR GB |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
|
18D | Application deemed to be withdrawn |
Effective date: 20160301 |