JP4947394B2

JP4947394B2 - Simultaneous simulation of Markov chains using the quasi-Monte Carlo method

Info

Publication number: JP4947394B2
Application number: JP2009524776A
Authority: JP
Inventors: アレクサンドリアケラー，; カーステンヴェヒター，
Original assignee: Mental Images GmbH
Current assignee: Nvidia ARC GmbH
Priority date: 2006-08-15
Filing date: 2007-08-15
Publication date: 2012-06-06
Anticipated expiration: 2027-08-15
Also published as: WO2008022173A2; JP2010501100A; CA2660190A1; EP2052366A2; WO2008022173A3

Description

Cross-reference of related applications

本特許出願（整理番号ＭＥＮＴ−１０１−Ｂ）は、２００６年８月１５日に出願された米国仮特許出願第６０／８２２，４１７号の優先権の利益を主張する。本出願は、２００６年６月２３日に出願された米国特許出願第１１／４７４，５１７号（整理番号ＭＥＮＴ−１０１−ＵＳ）の一部継続出願でもあり、同出願は、２００５年６月２３日に出願された米国仮特許出願第６０／６９３，２３１号（整理番号ＭＥＮＴ−１０１−ＰＲ）の利益を主張し、また２００２年１１月１９日に出願された米国特許出願第１０／２９９，５７４号（整理番号ＭＥＮＴ−０７５）の一部継続出願でもある。 This patent application (Docket MENT-101-B) claims the benefit of the priority of US Provisional Patent Application No. 60 / 822,417, filed August 15, 2006. This application is also a continuation-in-part of US Patent Application No. 11 / 474,517 (reference number MENT-101-US) filed on June 23, 2006. Claims the benefit of US Provisional Patent Application No. 60 / 693,231 (Docket No. MENT-101-PR) filed on Nov. 19, 2002, and also filed on Nov. 19, 2002. It is also a partial continuation application of No. 574 (reference number MENT-075).

米国特許出願第１０／２９９，５７４号（整理番号ＭＥＮＴ−０７５）は、２００１年６月１９日に出願された米国特許出願第０９／８８４，８６１号（整理番号ＭＥＮＴ−０６１）の一部継続出願であり、同出願は、２００１年２月１日に出願された米国仮特許出願第６０／２６５，９３４号、及び２０００年６月１９日に出願された米国仮特許出願第６０／２１２，２８６号の優先権の利益を主張する。 US patent application Ser. No. 10 / 299,574 (Docket No. MENT-075) is a continuation of US patent application No. 09 / 884,861 (Docket No. MENT-061) filed Jun. 19, 2001. This application is a U.S. Provisional Patent Application No. 60 / 265,934 filed on Feb. 1, 2001, and U.S. Provisional Patent Application No. 60/212, filed Jun. 19, 2000. Insist on the benefits of 286 priority.

整理番号ＭＥＮＴ−０６１、ＭＥＮＴ−０７５、ＭＥＮＴ−１０１−ＰＲ、及びＭＥＮＴ−１０１−ＵＳを含むが、それらに限定されない、上に列挙された特許出願、並びにそれらの仮特許出願の各々は、その全体が本明細書に記載されているかのように、その全体が参照により本明細書に組み込まれる。 Each of the patent applications listed above, including, but not limited to, docket numbers MENT-061, MENT-075, MENT-101-PR, and MENT-101-US The entirety of which is hereby incorporated by reference as if set forth in full herein.

Browse computer program list

本出願には、以下にソースコードリストが添付される。そのソースコードリストは、本明細書では「コンピュータプログラムリスト」と呼ばれ、「１．１．１」の形式の３桁の参照番号によって識別されるセクションに構成される。 A source code list is attached to the present application as follows. The source code list is referred to herein as a “computer program list” and is organized into sections identified by a three-digit reference number of the form “1.1.1”.

Field of Invention

本発明は、一般に、動画及び他の応用例のためなどの、デジタルコンピューティングシステムにおける、またデジタルコンピューティングシステムによる、画像レンダリングのための方法及びシステムに関し、詳細には、リアルタイム高精度レイトレーシングのための方法、システム、デバイス、及びコンピュータソフトウェアに関する。 The present invention relates generally to a method and system for image rendering in and by a digital computing system, such as for moving pictures and other applications, and in particular, for real-time high precision ray tracing. The present invention relates to a method, a system, a device, and computer software.

Background of the Invention

「レイトレーシング」という用語は、光源をカメラと結びつけるすべての光路を識別し、これらの寄与を総計することによって、フォトリアリスティックな画像を合成するための技法を指すものである。シミュレーションは、可視性を決定するために視線に沿って光線をトレースし、照度を決定するために光源から光線をトレースする。 The term “ray tracing” refers to a technique for synthesizing photorealistic images by identifying all the light paths that link a light source with a camera and summing these contributions. The simulation traces rays along the line of sight to determine visibility and traces rays from the light source to determine illuminance.

レイトレーシングは、動画及び他の応用例における主流になっている。しかし、現在のレイトレーシング技法は、数値的問題、動的シーンを処理する能力の限界、アクセラレーションデータ構造（ａｃｃｅｌｅｒａｔｉｏｎｄａｔａｓｔｒｕｃｔｕｒｅ）の設定の遅さ、及び大きなメモリフットプリントを含む、多くの知られた限界及び弱点により影響をこうむる。したがって、現在のレイトレーシング技法は、風になびく森や人の髪の毛など、全体的に動きのあるシーンを効率的に処理するための能力を欠いている。現在のレイトレーシングシステムの限界の克服は、例えば、映画制作におけるより高品質なモーションブラー（ｍｏｔｉｏｎｂｌｕｒ）のレンダリングも可能にする。 Ray tracing has become the mainstream in video and other applications. However, current ray tracing techniques involve many known issues, including numerical problems, limitations in the ability to process dynamic scenes, slow acceleration data structure settings, and large memory footprints. It is affected by the limitations and weaknesses. Therefore, current ray tracing techniques lack the ability to efficiently process scenes with overall movement, such as forests that flutter in the wind and human hair. Overcoming the limitations of current raytracing systems also allows for the rendering of higher quality motion blur, for example in movie production.

レイトレーシングシステムの性能を改善するための現在の試みは、多くの理由で不十分である。例えば、現在のリアルタイムレイトレーシングシステムは一般に、そのアクセラレーション構造として、軸平行バイナリ空間分割（ａｘｉｓ−ａｌｉｇｎｅｄｂｉｎａｒｙｓｐａｃｅｐａｒｔｉｔｉｏｎ）に基づいた３Ｄツリーを使用する。これらのシステムの主な重点は、静的シーンをレンダリングすることにあるので、これらのシステムは一般に、全体的に動きのあるシーンに関して必要なデータ構造を構築するのに必要とされる、かなり長い設定時間に対処することができない。この線に沿って、ある製造業者は、効率的な３Ｄツリーを構築し、ツリーを横断するのに必要とされる時間を短縮できる技法を開発することによって、リアルタイムレイトレーシングを改良した。しかし、レイトレーシングされる物体の数が増加するにつれて、システムの予想メモリ要件が２次的に増大することが示され得る。 Current attempts to improve the performance of ray tracing systems are inadequate for a number of reasons. For example, current real-time ray tracing systems typically use a 3D tree based on an axis-aligned binary space partition as its acceleration structure. Since the main emphasis of these systems is on rendering static scenes, these systems are generally quite long, needed to build the necessary data structures for an overall moving scene The set time cannot be dealt with. Along this line, certain manufacturers have improved real-time ray tracing by developing techniques that can build efficient 3D trees and reduce the time required to traverse the tree. However, it can be shown that as the number of objects to be ray-traced increases, the expected memory requirement of the system increases secondarily.

別の製造業者は、システム性能を改善するために境界ボリューム階層（ｂｏｕｎｄｉｎｇｖｏｌｕｍｅｈｉｅｒａｒｃｈｙ）を使用する、レイトレーシング集積回路を設計した。しかし、そのアーキテクチャの性能は、あまりに多くのインコヒーレントな２次光線がトレースされる場合、損なわれることが判明した。 Another manufacturer has designed a ray tracing integrated circuit that uses a bounding volume hierarchy to improve system performance. However, the performance of the architecture has been found to be impaired if too many incoherent secondary rays are traced.

加えて、フィールドプログラマブルゲートアレイ（ＦＰＧＡ）を使用して３Ｄツリー横断技法を実施することによって、システム性能を改善する試みも行われた。これらのシステムの処理速度の主な増加は、コヒーレントな光線の束をトレースし、高速ハードワイヤード計算を実行するＦＰＧＡの能力を活用することによって獲得される。 In addition, attempts have been made to improve system performance by implementing 3D tree traversal techniques using field programmable gate arrays (FPGAs). The major increase in processing speed of these systems is obtained by taking advantage of the FPGA's ability to trace coherent beam bundles and perform fast hardwired computations.

アクセラレーション構造の構築は、いまだハードウェアでは実施されていない。ＦＰＧＡ実装は一般に、精度を抑えた浮動小数点法（ｆｌｏａｔｉｎｇｐｏｉｎｔｔｅｃｈｎｉｑｕｅ）を使用する。 The construction of the acceleration structure has not yet been implemented in hardware. FPGA implementations generally use a floating point technique with reduced precision.

多くの異なる技法が、マルコフ連鎖（Ｍａｒｋｏｖｃｈａｉｎ）を含む光子軌道をシミュレートするために使用できる。マルコフ連鎖をシミュレートする際の準モンテカルロ法（ｑｕａｓｉ−ＭｏｎｔｅＣａｒｌｏｔｅｃｈｎｉｑｕｅ）の使用が、例えば、Ｌ’Ｅｃｕｙｅｒ他、ＲａｎｄｏｍｉｚｅｄＱｕａｓｉ−ＭｏｎｔｅＣａｒｌｏＳｉｍｕｌａｔｉｏｎｏｆＭａｒｋｏｖＣｈａｉｎｓｗｉｔｈａｎＯｒｄｅｒｅｄＳｔａｔｅＳｐａｃｅ（２００４）（「Ｌ’Ｅｃｕｙｅｒ」）において説明されており、同文献は、その全体があたかも説明されているかのように、参照により本明細書に組み込まれる。 A number of different techniques can be used to simulate photon trajectories involving Markov chains. The use of the quasi-Monte Carlo method in simulating a Markov chain is described, for example, by L'Ecuyer et al., Randomized Quasi-Monte Carlo Simulation of Markov Chain Sce. Ecuyer "), which is incorporated herein by reference as if set forth in its entirety.

Ｌ’Ｅｃｕｙｅｒは、全順序（離散又は連続）状態空間を有するマルコフ連鎖の各ステップにおける状態分布を推定するための、ランダム化準モンテカルロ法（ｒａｎｄｏｍｉｚｅｄｑｕａｓｉ−ＭｏｎｔｅＣａｒｌｏｍｅｔｈｏｄ）の使用について説明している。連鎖内のステップの数は、ランダムで無制限とすることができる。この方法は、特に、各ステップにおいて状態依存コストが支払われる場合の、何らかのランダムな停止時間までの予想トータルコストの、低分散不偏推定器（ｌｏｗ−ｖａｒｉａｎｃｅｕｎｂｉａｓｅｄｅｓｔｉｍａｔｏｒ）を取得するために使用することができる。当該論文は、数値的な説明を提供しており、その説明では、標準モンテカルロと比べて、分散低減が大きい。 L'Ecuyer describes the use of a randomized quasi-Monte-Carlo method to estimate the state distribution at each step of a Markov chain with a total order (discrete or continuous) state space. . The number of steps in the chain can be random and unlimited. This method should be used to obtain a low-variance unbalanced estimator with an expected total cost up to some random downtime, especially when state-dependent costs are paid at each step Can do. The paper provides a numerical explanation, which has a large variance reduction compared to standard Monte Carlo.

Ｌ’Ｅｃｕｙｅｒは、全順序状態空間を有する離散時間及び離散状態マルコフ連鎖について、固定数のステップにわたって過渡的な量（ｔｒａｎｓｉｅｎｔｍｅａｓｕｒｅ）を推定するための決定論的準モンテカルロ法（ｄｅｔｅｒｍｉｎｉｓｔｉｃｑｕａｓｉ−ＭｏｎｔｅＣａｒｌｏｍｅｔｈｏｄ）が、以下の論文において提案され、研究されたことを述べており、以下の論文も、その全体が参照により本明細書に組み込まれる。 L'Ecuyer is a deterministic quasi-Monte Carlo method for estimating a transient measure over a fixed number of steps for discrete-time and discrete-state Markov chains with full ordered state space. method) is proposed and studied in the following paper, which is also incorporated herein by reference in its entirety:

Ｌｅｃｏｔ他、「Ｑｕａｓｉ−ＲａｎｄｏｍＷａｌｋＭｅｔｈｏｄｓ」、ＭｏｎｔｅＣａｒｌｏａｎｄＱｕａｓｉ−ＭｏｎｔｅＣａｒｌｏＭｅｔｈｏｄｓ２０００、ｐｐ．６３〜８５、Ｓｐｒｉｎｇｅｒ−Ｖｅｒｌａｇ（Ｂｅｒｌｉｎ、２００２） Lecot et al., “Quasi-Random Walk Methods”, Monte Carlo and Quasi-Monte Carlo Methods 2000, pp. 63-85, Springer-Verlag (Berlin, 2002)

Ｌｅｃｏｔ他、「Ｑｕａｓｉ−ＭｏｎｔｅＣａｒｌｏＭｅｔｈｏｄｓｆｏｒＥｓｔｉｍａｔｉｎｇＴｒａｎｓｉｅｎｔＭｅａｓｕｒｅｓｏｆＤｉｓｃｒｅｔｅＴｉｍｅＭａｒｋｏｖＣｈａｉｎｓ」、ＭｏｎｔｅＣａｒｌｏａｎｄＱｕａｓｉ−ＭｏｎｔｅＣａｒｌｏＭｅｔｈｏｄｓ２００２、ｐｐ．３２９〜４３、Ｓｐｒｉｎｇｅｒ−Ｖｅｒｌａｇ（Ｂｅｒｌｉｎ、２００４）（「Ｌｅｃｏｔ２００４」） Lecot et al., “Quasi-Monte Carlo Methods for Estimating Transient Measures of Discrete Time Markov Chains”, Monte Carlo and Qualiz-Monte Carlo Me. 329-43, Springer-Verlag (Berlin, 2004) ("Lecot 2004")

Ｌ’Ｅｃｕｙｅｒは、基数２の（０，２）−列を使用して、連鎖のｎ＝２^ｋ個のコピーを同数のステップにわたって並列にシミュレートする技法について説明している。連鎖のステップｊにおいて、当該技法は、ｎ個のコピーをそれらの状態に従って再度順序づけ、シミュレーションを行うために一様乱数の代わりに（０，２）−列のｎｊからｎｊ＋ｎ−１までの要素を利用することによって、ｎ個のコピーについて遷移（即ち次の状態）をシミュレートする。当該技法は、連鎖の各遷移のシミュレーションが単一の一様確率変数（ｕｎｉｆｏｒｍｒａｎｄｏｍｖａｒｉａｔｅ）を必要とすることを仮定する。Ｌｅｃｏｔ２００４において、マルコフ連鎖の遷移行列の構造についてのある条件下で、正しい値への収束が証明された。 L'Ecuyer is radix 2 (0,2) - using the column, which describes techniques to simulate the n = ^{2 k} pieces copies of the chain in parallel over the same number of steps. In step j of the chain, the technique reorders the n copies according to their state, and replaces the elements from nj to nj + n-1 in the (0,2) -sequence instead of a uniform random number for simulation. By using it, a transition (ie, the next state) is simulated for n copies. The technique assumes that the simulation of each transition in the chain requires a single uniform random variable. In Lecot 2004, convergence to the correct value was proved under certain conditions on the structure of the Markov chain transition matrix.

Ｌ’Ｅｃｕｙｅｒは、上記の技法を、ランダムで無制限な数のステップを有し、連続状態空間を有するマルコフ連鎖に一般化することを試みており（そのような技法は特に再生シミュレーションを扱うことを可能にすると主張されている）、それに対しては、マルコフ連鎖の１つのステップにおいて次の状態を生成するのに必要とされる一様確率変数の数ｄは、１より大きくなり得る。 L'Ecuyer has attempted to generalize the above technique to a Markov chain with a random and unlimited number of steps and having a continuous state space (such a technique is especially intended to handle regeneration simulations). In contrast, the number of uniform random variables d required to generate the next state in one step of a Markov chain can be greater than one.

しかし、光子軌道などをシミュレートするために、様々な研究者がマルコフ連鎖の使用を前提としているが、マルコフ連鎖、例えば光子軌道をシミュレートするために、準モンテカルロ法を使用するこれまでの技法は、ランダムサンプリングよりわずかに良い程度の改善を提供したにすぎなかった。しかし、以下に説明される本発明によるＱＭＣ法は、解演算子（ｓｏｌｕｔｉｏｎｏｐｅｒａｔｏｒ）の本質的な低次元構造を活用することができ、はるかに効率的な技法をもたらす。 However, in order to simulate photon orbits, various researchers assume the use of Markov chains, but conventional techniques that use quasi-Monte Carlo methods to simulate Markov chains, for example, photon orbits. Provided only a slight improvement over random sampling. However, the QMC method according to the present invention described below can take advantage of the inherent low-dimensional structure of the solution operator, resulting in a much more efficient technique.

したがって、本発明以前には、マルコフ連鎖を同時にシミュレートするための、より効率的な方法、システム、装置、デバイス、及びソフトウェアプログラム／コード製品に対する必要がまだ存在していた。 Thus, prior to the present invention, there was still a need for more efficient methods, systems, apparatus, devices, and software program / code products for simultaneously simulating Markov chains.

したがって、準モンテカルロ法及びソーティング戦略（ｓｏｒｔｉｎｇｓｔｒａｔｅｇｙ）を使用する、マルコフ連鎖をシミュレートするための、改良されたより効率的な方法、システム、装置、デバイス、及びソフトウェアプログラム／コード製品を提供するのが望ましい。 Accordingly, it is an object to provide an improved and more efficient method, system, apparatus, device, and software program / code product for simulating a Markov chain using quasi-Monte Carlo methods and sorting strategies. desirable.

本発明は、画像内のピクセルのピクセル値を生成するためのコンピュータグラフィックスシステムにおける使用に適した、方法、システム、装置、及びコンピュータソフトウェア／コンピュータコード製品を提供し、ピクセル値は、シーン内の点を、シミュレートされるカメラの画像平面上に記録されるように表し、コンピュータグラフィックスシステムは、選択された方向に沿ってピクセルからシーンに向かって放たれた少なくとも１つの光線をシミュレートするステップを含む、選択されたレイトレーシング法を使用して、画像のピクセル値を生成するように構成され、レイトレーシング法は、光線とシーン内の物体の表面との交わりを計算するステップと、シーン内の物体を照明する光線の軌道をシミュレートするステップとをさらに含み、軌道をシミュレートするステップは、マルコフ連鎖を使用する。 The present invention provides a method, system, apparatus, and computer software / computer code product suitable for use in a computer graphics system for generating pixel values of pixels in an image, wherein the pixel values are in a scene. The point is represented as recorded on the image plane of the simulated camera, and the computer graphics system simulates at least one ray emitted from the pixel toward the scene along the selected direction. The method is configured to generate a pixel value of the image using a selected ray-tracing method, the method comprising: calculating the intersection of the ray with the surface of an object in the scene; and Simulating the trajectory of rays that illuminate an object inside Look, step to simulate the track, using the Markov chain.

本発明の一態様では、本発明による方法、システム、装置、及びコンピュータソフトウェア／コンピュータコード製品は、準モンテカルロ法を使用してマルコフ連鎖をシミュレートするステップ（及び／又はシミュレートするための手段）を含み、マルコフ連鎖をシミュレートするステップは、状態をソートするステップを含み、ソートするステップは、近接性ソーティング（ｐｒｏｘｉｍｉｔｙｓｏｒｔｉｎｇ）を含む。 In one aspect of the invention, the method, system, apparatus, and computer software / computer code product according to the invention simulate (and / or means for simulating) a Markov chain using quasi-Monte Carlo methods. And simulating the Markov chain includes sorting states, and the sorting step includes proximity sorting.

本発明のさらなる態様は、複数のマルコフ連鎖を同時にシミュレートするステップを含み、複数のマルコフ連鎖を同時にシミュレートするステップは、Ｈａｌｔｏｎ列、Ｈａｌｔｏｎ列の変形、格子列、又は（ｔ，ｓ）−列のいずれかを使用して選択され得る、準モンテカルロ点を利用するステップを含む。複数のマルコフ連鎖を同時にシミュレートするステップは、任意の数の連鎖を利用することができる。 Further aspects of the invention include simulating a plurality of Markov chains simultaneously, the step of simultaneously simulating a plurality of Markov chains comprising a Halton sequence, a Halton sequence modification, a lattice sequence, or (t, s) −. Utilizing a quasi-Monte Carlo point that can be selected using any of the columns. The step of simultaneously simulating multiple Markov chains can utilize any number of chains.

本発明のさらなる態様では、複数のマルコフ連鎖を同時にシミュレートするステップは、軌道分割（ｔｒａｊｅｃｔｏｒｙｓｐｌｉｔｔｉｎｇ）などによって、追加的な軌道をオンザフライで追加するステップを含む。シミュレートするステップは、粒子吸収をシミュレートする技法を利用するステップも含むことができ、ロシアンルーレット法（ＲｕｓｓｉａｎＲｏｕｌｅｔｔｅｔｅｃｈｎｉｑｕｅ）を使用するステップを含むことができる。 In a further aspect of the invention, simulating multiple Markov chains simultaneously includes adding additional trajectories on the fly, such as by trajectory splitting. The simulating step can also include utilizing a technique for simulating particle absorption, and can include using a Russian Roulette technique.

本発明の一態様では、ｓ次元問題について、複数のマルコフ連鎖を同時にシミュレートするステップは、以下の技法を含む。
ｎ：＝０
準モンテカルロ点ｘ_ｌを使用して、Ｘ_０，ｌを初期化する。
ループ：
− 適切な順序を使用して、状態ベクトルをソートする。
− ｎ：＝ｎ＋１
− 後続の準モンテカルロ点ｘ_ｌを使用して、Ｘ_ｎ，ｌを計算することによって、連鎖を継続させる。 In one aspect of the invention, simultaneously simulating multiple Markov chains for an s-dimensional problem includes the following technique.
n: = 0
X _{0, l} is initialized using the quasi-Monte Carlo point _xl .
loop:
-Sort the state vector using the appropriate order.
-N: = n + 1
-Continue the chain by calculating _{Xn, l} using the subsequent quasi-Monte Carlo points _xl .

本発明は、状態空間における近接性によって状態を順序づけるステップ、及びランダム化の技法も含むことができる。任意の選択されたランダム化を施された任意の選択された点集合又は点列が、本発明の方法と併せて利用できる。 The present invention can also include ordering states by proximity in state space and randomization techniques. Any selected point set or point sequence with any selected randomization can be used in conjunction with the method of the present invention.

別の態様では、本発明は、以下の技法をさらに含む。
ｎ：＝０、点列ｘ_ｌを選択する。
ランダム化をインスタンス化する。
ｒａｎｄ（ｘ_ｌ）を使用して、Ｘ_０，ｌを初期化する。
ループ
・何らかの順序を使用して、状態ベクトルをソートする。
・ｎ：＝ｎ＋１
・ランダム化をインスタンス化する。
・ｒａｎｄ（ｘ_ｌ）を使用して、Ｘ_ｎ，ｌを計算することによって、連鎖を継続させる。 In another aspect, the present invention further includes the following techniques.
n: = 0, point sequence _xl is selected.
Instantiate randomization.
rand (x _l ) is used to initialize X _{0, l} .
Loop • Sort the state vector using some order.
・ N: = n + 1
• Instantiate randomization.
Continue the chain by calculating X _{n, l} using rand (x _l ).

ランダム化技法は、各時間ステップｎにおいて準モンテカルロ点をランダム化するステップを含むことができ、ランダム化するステップは、より詳細には、後続のサンプルがそこから取り出される、基数ｂ＝２の（ｔ，ｓ）−列を選択するステップと、サンプルにｓ次元ランダムベクトルによるＸＯＲ演算を施すステップであって、ランダムベクトルが、各遷移ステップの後に生成される、ステップを含む。 The randomization technique can include randomizing a quasi-Monte Carlo point at each time step n, and more specifically, the randomizing step is more specifically a radix b = 2 (from which subsequent samples are taken. selecting a t, s) -column and subjecting the sample to an XOR operation with an s-dimensional random vector, wherein a random vector is generated after each transition step.

本発明の別の態様は、明るさ、又は状態のノルムによってソートするステップと、空間階層を使用してソートするステップと、少なくとも１つの空間充填曲線（ｓｐａｃｅｆｉｌｌｉｎｇｃｕｒｖｅ）によって列挙されたバケットを使用してソートするステップとを含む。空間階層を使用してソートするステップは、バイナリ階層を利用するステップを含むことができ、空間階層は、ＢＳＰツリー、ｋＤツリー、若しくはＢＳＰツリー構造の他の軸平行サブセットのいずれか、又は境界ボリューム階層、又は規則的ボクセル（ｖｏｘｅｌ）を含むことができる。バイナリ階層は、選択された発見的手法によって選択された平面を使用して再帰的に細分を行い、その後、階層のリーフを近接性の順序で列挙するために、階層を順序正しく横断することによって、構築することができる。より効率的な横断は、ツリーを左平衡化（ｌｅｆｔ−ｂａｌａｎｃｅ）させ、ツリーを配列データ構造に保存することによって提供することができる。空間階層を使用してソートするステップは、バケットソーティング及び選択された空間充填曲線を利用するステップと、本発明の一態様では、規則的ボクセルを使用するステップとを含むこともできる。 Another aspect of the invention uses a bucket enumerated by sorting by brightness or state norm, sorting using a spatial hierarchy, and at least one space filling curve. And sorting. Sorting using a spatial hierarchy can include utilizing a binary hierarchy, where the spatial hierarchy is either a BSP tree, a kD tree, or any other axis-parallel subset of a BSP tree structure, or a boundary volume. It can contain hierarchies or regular voxels. The binary hierarchy is recursively subdivided using the plane selected by the chosen heuristics, and then traversing the hierarchy in order to enumerate the leaves of the hierarchy in order of proximity Can be built. More efficient traversal can be provided by left-balancing the tree and storing the tree in an array data structure. Sorting using the spatial hierarchy may also include utilizing bucket sorting and a selected space filling curve and, in one aspect of the invention, using regular voxels.

本発明の別の態様では、シミュレーション及び／又はソーティング処理は、
軸平行境界ボックス（ａｘｉｓ−ａｌｉｇｎｅｄｂｏｕｎｄｉｎｇｂｏｘ）によって、レンダリングされる物体の境界を定めるステップと、
物体の選択された点を通過する分割平面を適用することによって、境界ボックスを、各々がリーフで終了する、連続的な左ブランチ及び右ブランチに再帰的又は反復的に分割し、それによって、複数のリーフを生成するステップと、
リーフをそれぞれの明るさに従って階層的に順序づけるステップと、
選択された数のバケットに分割された、選択されたサイズの行列上で、バケットソートを実行するステップと、
行列内をたどるために選択された空間充填曲線を使用するステップであって、個々のバケット内において、より小さな空間充填曲線が、行列内の個々のセル内をたどるために使用される、ステップと
を含む。 In another aspect of the invention, the simulation and / or sorting process comprises
Delimiting an object to be rendered by an axis-aligned bounding box;
By applying a split plane that passes through a selected point of the object, the bounding box is split recursively or iteratively into successive left and right branches, each ending with a leaf, thereby Generating a leaf of
Ordering the leaves hierarchically according to their brightness;
Performing a bucket sort on a matrix of a selected size divided into a selected number of buckets;
Using a selected space-filling curve to traverse the matrix, wherein within each bucket a smaller space-filling curve is used to traverse within an individual cell within the matrix; and including.

本発明のシミュレーションプロセスは、積分方程式を解くための構成用に適合することができる。シミュレーションは、リアルに見える画像を合成するための光輸送（ｌｉｇｈｔｔｒａｎｓｐｏｒｔ）シミュレーションを可能にするように動作可能である。本発明のこの態様によれば、光輸送シミュレーションの基礎をなす積分方程式は、経路空間をサンプリングすることがマルコフ連鎖をシミュレートすることに対応するように、経路積分として定式化し直すことができ、その際、経路は、レイトレーシング及び散乱事象によって確立され、初期分布は、光源又はセンサの放出特性によって決定され、遷移確率は、画像内の表面についての双方向反射分布関数（ｂｉ−ｄｉｒｅｃｔｉｏｎａｌｒｅｆｌｅｃｔａｎｃｅｄｉｓｔｒｉｂｕｔｉｏｎｆｕｎｃｔｉｏｎ）及び双方向表面下散乱分布関数（ｂｉ−ｄｉｒｅｃｔｉｏｎａｌｓｕｂｓｕｒｆａｃｅｓｃａｔｔｅｒｉｎｇｄｉｓｔｒｉｂｕｔｉｏｎｆｕｎｃｔｉｏｎ）によって決定される。透過効果は、経路積分、初期分布、及び遷移確率を利用することによって、処理することができる。本発明の関連態様では、経路積分は、高次元準モンテカルロ点を利用することによって、又は低次元準モンテカルロ点をパディングすることによって、解くことができ、光輸送シミュレーションの基礎をなす積分方程式は、Ｆｒｅｄｈｏｌｍ積分方程式又はＶｏｌｔｅｒｒａ積分方程式と見なされる。 The simulation process of the present invention can be adapted for construction to solve integral equations. The simulation is operable to allow a light transport simulation to synthesize a real-looking image. According to this aspect of the invention, the integral equation underlying the light transport simulation can be reformulated as a path integral so that sampling the path space corresponds to simulating a Markov chain, In doing so, the path is established by ray tracing and scattering events, the initial distribution is determined by the emission characteristics of the light source or sensor, and the transition probability is a bi-directional reflection distribution function for the surface in the image. function) and bi-directional subsurface scattering distribution function. The transmission effect can be handled by utilizing path integrals, initial distributions, and transition probabilities. In a related aspect of the invention, the path integral can be solved by utilizing high-dimensional quasi-Monte Carlo points or by padding low-dimensional quasi-Monte Carlo points, and the integral equation underlying the light transport simulation is: It is regarded as a Fredholm integral equation or a Volterra integral equation.

さらなる詳細、実施形態、実践、及び例が、添付の図面と併せて読まれる以下の詳細な説明において説明され、又は図面と併せて詳細な説明を読むことから、当業者に容易に明らかとなるであろう。 Additional details, embodiments, practices, and examples are set forth in the following detailed description, read in conjunction with the accompanying drawings, or will be readily apparent to those skilled in the art from reading the detailed description in conjunction with the drawings. Will.

本発明が配備され得る、従来のデジタル処理システムの概略図である。1 is a schematic diagram of a conventional digital processing system in which the present invention may be deployed. 本発明が配備され得る、従来のパーソナルコンピュータ又は同様のコンピューティング装置の概略図である。1 is a schematic diagram of a conventional personal computer or similar computing device in which the present invention may be deployed. 本発明の第１の態様による、全体的な方法を示す図である。FIG. 2 shows the overall method according to the first aspect of the present invention. 自己交差の問題を図説する、レイトレーシング手順の図である。FIG. 6 is a ray tracing procedure illustrating the self-intersection problem. 本発明のさらなる態様による、アクセラレーションデータ構造として使用される分割された軸平行境界ボックスの立面図による図である。FIG. 6 is an elevational view of a split axis parallel bounding box used as an acceleration data structure according to a further aspect of the invention. Ｌ平面及びＲ平面を用いた境界ボックスの分割を示す、図５に示された軸平行境界ボックスの等角図による図である。FIG. 6 is an isometric view of the axis parallel bounding box shown in FIG. 5 showing the partitioning of the bounding box using the L and R planes. Ｌ平面及びＲ平面を用いた境界ボックスの分割を示す、図５に示された軸平行境界ボックスの等角図による図である。FIG. 6 is an isometric view of the axis parallel bounding box shown in FIG. 5 showing the partitioning of the bounding box using the L and R planes. Ｌ平面及びＲ平面を用いた境界ボックスの分割を示す、図５に示された軸平行境界ボックスの等角図による図である。FIG. 6 is an isometric view of the axis parallel bounding box shown in FIG. 5 showing the partitioning of the bounding box using the L and R planes. 本発明のさらなる態様による、レイトレーシング法のフローチャートである。4 is a flow chart of a ray tracing method according to a further aspect of the present invention. 本発明のさらなる態様による、レイトレーシング法のフローチャートである。4 is a flow chart of a ray tracing method according to a further aspect of the present invention. シフトΦ_Ｓ（ｎ）を図説する図である。It is a figure illustrating shift Φ _S (n). 例示的なＨａｌｔｏｎ列をその階層構造特性とともに示す図である。FIG. 5 illustrates an exemplary Halton column with its hierarchical characteristics. 双方向経路トレーシングによってサンプリング輸送経路空間を示す図である。It is a figure which shows sampling transport route space by bidirectional route tracing. モンテカルロ法を使用するシミュレーションと配列ランダム化準モンテカルロ法を使用するシミュレーションの間の差を示した、レンダリングされた画像の図である。FIG. 6 is a rendered image showing the difference between a simulation using the Monte Carlo method and a simulation using a sequence randomized quasi-Monte Carlo method. モンテカルロ法を使用するシミュレーションと配列ランダム化準モンテカルロ法を使用するシミュレーションの間の差を示した、レンダリングされた画像の図である。FIG. 6 is a rendered image showing the difference between a simulation using the Monte Carlo method and a simulation using a sequence randomized quasi-Monte Carlo method. モンテカルロ法を使用するシミュレーションと配列ランダム化準モンテカルロ法を使用するシミュレーションの間の差を示した、レンダリングされた画像の図である。FIG. 6 is a rendered image showing the difference between a simulation using the Monte Carlo method and a simulation using a sequence randomized quasi-Monte Carlo method. モンテカルロ法を使用するシミュレーションと配列ランダム化準モンテカルロ法を使用するシミュレーションの間の差を示した、レンダリングされた画像の図である。FIG. 6 is a rendered image showing the difference between a simulation using the Monte Carlo method and a simulation using a sequence randomized quasi-Monte Carlo method. 光源上の開始点からの光子軌道を示す図である。It is a figure which shows the photon orbit from the starting point on a light source. 光源上の開始点からの光子軌道を示す図である。It is a figure which shows the photon orbit from the starting point on a light source. 光源上の開始点からの光子軌道を示す図である。It is a figure which shows the photon orbit from the starting point on a light source. 光源上の開始点からの光子軌道を示す図である。It is a figure which shows the photon orbit from the starting point on a light source. 状態を空間階層のリーフに分類し、それによって、階層を順序正しく横断することによって、近接性の順序を定義するステップを示す図である。FIG. 4 shows the steps of defining the order of proximity by classifying states into leaves of a spatial hierarchy, thereby traversing the hierarchy in order. 状態を空間階層のリーフに分類し、それによって、階層を順序正しく横断することによって、近接性の順序を定義するステップを示す図である。FIG. 4 shows the steps of defining the order of proximity by classifying states into leaves of a spatial hierarchy, thereby traversing the hierarchy in order. 状態を空間階層のリーフに分類し、それによって、階層を順序正しく横断することによって、近接性の順序を定義するステップを示す図である。FIG. 4 shows the steps of defining the order of proximity by classifying states into leaves of a spatial hierarchy, thereby traversing the hierarchy in order. 状態を空間階層のリーフに分類し、それによって、階層を順序正しく横断することによって、近接性の順序を定義するステップを示す図である。FIG. 4 shows the steps of defining the order of proximity by classifying states into leaves of a spatial hierarchy, thereby traversing the hierarchy in order. 状態を空間階層のリーフに分類し、それによって、階層を順序正しく横断することによって、近接性の順序を定義するステップを示す図である。FIG. 4 shows the steps of defining the order of proximity by classifying states into leaves of a spatial hierarchy, thereby traversing the hierarchy in order. 状態を空間階層のリーフに分類し、それによって、階層を順序正しく横断することによって、近接性の順序を定義するステップを示す図である。FIG. 4 shows the steps of defining the order of proximity by classifying states into leaves of a spatial hierarchy, thereby traversing the hierarchy in order. 状態を空間階層のリーフに分類し、それによって、階層を順序正しく横断することによって、近接性の順序を定義するステップを示す図である。FIG. 4 shows the steps of defining the order of proximity by classifying states into leaves of a spatial hierarchy, thereby traversing the hierarchy in order. 状態を空間階層のリーフに分類し、それによって、階層を順序正しく横断することによって、近接性の順序を定義するステップを示す図である。FIG. 4 shows the steps of defining the order of proximity by classifying states into leaves of a spatial hierarchy, thereby traversing the hierarchy in order. 状態を空間階層のリーフに分類し、それによって、階層を順序正しく横断することによって、近接性の順序を定義するステップを示す図である。FIG. 4 shows the steps of defining the order of proximity by classifying states into leaves of a spatial hierarchy, thereby traversing the hierarchy in order. 状態を空間階層のリーフに分類し、それによって、階層を順序正しく横断することによって、近接性の順序を定義するステップを示す図である。FIG. 4 shows the steps of defining the order of proximity by classifying states into leaves of a spatial hierarchy, thereby traversing the hierarchy in order. 状態を空間階層のリーフに分類し、それによって、階層を順序正しく横断することによって、近接性の順序を定義するステップを示す図である。FIG. 4 shows the steps of defining the order of proximity by classifying states into leaves of a spatial hierarchy, thereby traversing the hierarchy in order. 状態を空間階層のリーフに分類し、それによって、階層を順序正しく横断することによって、近接性の順序を定義するステップを示す図である。FIG. 4 shows the steps of defining the order of proximity by classifying states into leaves of a spatial hierarchy, thereby traversing the hierarchy in order. テストシーン「Ｓｈｉｒｌｅｙ６」の図である。It is a figure of the test scene "Shirley 6". マスタ解に対する平均ＲＭＳ誤差を示すグラフである。It is a graph which shows the average RMS error with respect to a master solution. 大域的分散を示すグラフである。It is a graph which shows global dispersion | distribution. ピクセルベースの分散を示すグラフである。6 is a graph showing pixel-based dispersion. テストシーン「ＩｎｖｉｓｉｂｌｅＤａｔｅ」の図である。It is a figure of a test scene "Invisible Date". マスタ解に対する平均ＲＭＳ誤差を示すグラフである。It is a graph which shows the average RMS error with respect to a master solution. 大域的分散を示すグラフである。It is a graph which shows global dispersion | distribution. ピクセルベースの分散を示すグラフである。6 is a graph showing pixel-based dispersion. モンテカルロを使用してレンダリングされたテストシーン「ＩｎｖｉｓｉｂｌｅＤａｔｅ」の視覚的比較の図である。FIG. 6 is a visual comparison of a test scene “Invisible Date” rendered using Monte Carlo. ランダム化準モンテカルロを使用してレンダリングされたテストシーン「ＩｎｖｉｓｉｂｌｅＤａｔｅ」の視覚的比較の図である。FIG. 6 is a visual comparison of a test scene “Invisible Date” rendered using randomized quasi-Monte Carlo. ソーティングを伴うランダム化準モンテカルロを使用してレンダリングされたテストシーン「ＩｎｖｉｓｉｂｌｅＤａｔｅ」の視覚的比較の図である。FIG. 7 is a visual comparison of a test scene “Invisible Date” rendered using randomized quasi-Monte Carlo with sorting. 迷宮テストシーンの概略図である。It is the schematic of a labyrinth test scene. 各技法を使用した際のカメラによって受け取られた総放射の平均量を示すグラフである。Figure 6 is a graph showing the average amount of total radiation received by the camera when using each technique. グラフの拡大部分の図である。It is a figure of the enlarged part of a graph. マスタ解からの偏差を表示する表である。It is a table | surface which displays the deviation from a master solution. 本発明の説明された態様による、技法及び副技法のフローチャートである。4 is a flowchart of techniques and sub-techniques according to the described aspects of the invention. 本発明の説明された態様による、技法及び副技法のフローチャートである。4 is a flowchart of techniques and sub-techniques according to the described aspects of the invention. 本発明の説明された態様による、技法及び副技法のフローチャートである。4 is a flowchart of techniques and sub-techniques according to the described aspects of the invention. 本発明の説明された態様による、技法及び副技法のフローチャートである。4 is a flowchart of techniques and sub-techniques according to the described aspects of the invention. 本発明の説明された態様による、技法及び副技法のフローチャートである。4 is a flowchart of techniques and sub-techniques according to the described aspects of the invention. 本発明の説明された態様による、技法及び副技法のフローチャートである。4 is a flowchart of techniques and sub-techniques according to the described aspects of the invention. 本発明の説明された態様による、技法及び副技法のフローチャートである。4 is a flowchart of techniques and sub-techniques according to the described aspects of the invention.

本発明は、レイトレーシングのため、及び高速レイトレーシング用に必要とされるアクセラレーションデータ構造の効率的な構築のための、改良された技法を提供する。以下の説明は、これらの技法による方法、構造、及びシステムについて説明する。 The present invention provides an improved technique for the efficient construction of acceleration data structures required for ray tracing and for fast ray tracing. The following description describes methods, structures, and systems according to these techniques.

説明される方法及びシステムが、ＭｉｃｒｏｓｏｆｔＷｉｎｄｏｗｓ、Ｌｉｎｕｘ、又はＵｎｉｘなどの従来のオペレーティングシステムに従って動作する、又は従来のオペレーティングシステムをエミュレートする、スタンドアロン構成の、又はネットワーク上に分散する、パーソナルコンピュータ（ＰＣ）又は等価なデバイスなどの従来のコンピュータ装置を使用して、ソフトウェア、ハードウェア、又はソフトウェアとハードウェアの組合せで実施され得ることは、当業者であれば理解されよう。以下に説明され、特許請求の範囲で列挙される、様々な処理手段及び計算手段は、したがって、適切に構成されたデジタル処理デバイス又はデバイスのネットワークの、ソフトウェア及び／又はハードウェア要素で実施されてよい。処理は、順次又は並列に実行されてよく、専用又は再構成可能なハードウェアを使用して実施されてよい。 The described methods and systems operate in accordance with a conventional operating system such as Microsoft Windows, Linux, or Unix, or emulate a conventional operating system, in a stand-alone configuration or distributed over a network (PC) Those skilled in the art will appreciate that conventional computer equipment, such as) or equivalent devices, may be implemented in software, hardware, or a combination of software and hardware. The various processing and computing means described below and recited in the claims are thus implemented in software and / or hardware elements of a suitably configured digital processing device or network of devices. Good. Processing may be performed sequentially or in parallel, and may be performed using dedicated or reconfigurable hardware.

本発明による方法、デバイス、又はソフトウェア製品は、スタンドアロンであるか、ネットワーク接続されているか、ポータブルであるか、又は固定式であるかを問わない、図１（例えばネットワークシステム１００）に例として示されたような、広範な従来のコンピューティングデバイス及びシステムのいずれかにおいて動作することができ、従来のコンピューティングデバイス及びシステムは、従来のＰＣ１０２、ラップトップ１０４、ハンドヘルド若しくはモバイルコンピュータ１０６を含み、又はインターネット若しくは他のネットワーク１０８を介して接続され、インターネット若しくは他のネットワーク１０８は、同様にサーバ１１０及びストレージ１１２を含むことができる A method, device, or software product according to the present invention is illustrated by way of example in FIG. 1 (eg, network system 100), whether it is stand-alone, networked, portable, or fixed. Can operate on any of a wide range of conventional computing devices and systems, such as a conventional PC 102, laptop 104, handheld or mobile computer 106, or Connected via the Internet or other network 108, the Internet or other network 108 may similarly include a server 110 and storage 112.

従来のコンピュータソフトウェア及びハードウェア実施に沿って、本発明に従って構成されたソフトウェアアプリケーションは、例えば、図２に示されるようなＰＣ１０２内で動作することができ、ＰＣ１０２において、プログラム命令は、ＣＤ−ＲＯＭ１１６、磁気ディスク、又は他のストレージ１２０から読み出され、ＣＰＵ１１８による実行のためにＲＡＭ１１４にロードされることができる。データは、従来のキーボード、スキャナ、マウス、又は他の要素１０３を含む、任意の知られたデバイス又は手段を介して、システムに入力することができる。 In line with conventional computer software and hardware implementations, a software application configured in accordance with the present invention can run in, for example, a PC 102 as shown in FIG. 2, where program instructions are stored in a CD-ROM 116. Can be read from a magnetic disk or other storage 120 and loaded into the RAM 114 for execution by the CPU 118. Data can be entered into the system via any known device or means, including a conventional keyboard, scanner, mouse, or other element 103.

本発明の本説明は、２つの大きなセクションに分割される。
Ｉ．リアルタイム高精度レイトレーシング
ＩＩ．マルコフ連鎖及び準モンテカルロ法を使用してシミュレートされる光線のためのソーティング戦略 This description of the invention is divided into two large sections.
I. Real-time high-accuracy ray tracing II. A sorting strategy for rays simulated using Markov chains and quasi-Monte Carlo methods

Ｉ．リアルタイム高精度レイトレーシング
図３は、本発明の一態様による全体的な方法２００を示す図である。この方法は、コンピュータグラフィックスシステムに関連して実施され、システムでは、画像内の各ピクセルについて、ピクセル値が生成される。生成された各ピクセル値は、シーン内の点を、シミュレートされるカメラの画像平面上に記録されるように表す。コンピュータグラフィックスシステムは、選択されたレイトレーシング法を使用して、画像のピクセル値を生成するように構成される。選択されたレイトレーシング法は、選択された方向に沿ってピクセルからシーンに向かって放たれた少なくとも１つの光線を含む光線ツリー（ｒａｙｔｒｅｅ）の使用を含み、光線とシーン内の物体（及び／又は物体の表面）との交わりの計算をさらに含む。 I. Real-Time Precision Ray Tracing FIG. 3 illustrates an overall method 200 according to one aspect of the present invention. The method is implemented in connection with a computer graphics system, where a pixel value is generated for each pixel in the image. Each generated pixel value represents a point in the scene as recorded on the image plane of the simulated camera. The computer graphics system is configured to generate pixel values for the image using the selected ray tracing method. The selected ray tracing method includes the use of a ray tree that includes at least one ray emitted from the pixel toward the scene along the selected direction, and includes the ray and the object in the scene (and / or Or the calculation of the intersection with the surface of the object).

図３の方法２００において、光線とシーン内の表面との交わりを計算するために、境界ボリューム階層が使用される。ステップ２０１において、シーンの境界ボックスが計算される。ステップ２０２において、所定の終了基準が満たされているかどうかが判定される。満たされていない場合、ステップ２０３において、軸平行境界ボックスが精緻化される。プロセスは、終了基準が満たされるまで、再帰的に継続される。本発明の一態様によれば、終了基準は、境界ボックス座標が光線／表面の交点の浮動小数点表現から解像度（ｒｅｓｏｌｕｔｉｏｎ）の１単位分だけ異なるという条件として定義される。しかし、本発明の範囲は、他の終了基準にまで及んでいる。 In the method 200 of FIG. 3, a boundary volume hierarchy is used to calculate the intersection of the ray with the surface in the scene. In step 201, the bounding box of the scene is calculated. In step 202, it is determined whether predetermined termination criteria are met. If not, in step 203 the axis parallel bounding box is refined. The process continues recursively until termination criteria are met. According to one aspect of the invention, the termination criterion is defined as the condition that the bounding box coordinates differ from the floating point representation of the ray / surface intersection by one unit of resolution. However, the scope of the present invention extends to other termination criteria.

アクセラレーション構造としての境界ボリューム階層の使用は、多くの理由で有利である。境界ボリューム階層のためのメモリ要件は、レイトレーシングされる物体の数に線形に制限され得る。また、以下に説明されるように、境界ボリューム階層は、３Ｄツリーよりもはるかに効率的に構築することができ、このことが、全体的に動きのあるシーンに必要とされる償却解析（ａｍｏｒｔｉｚｅｄａｎａｌｙｓｉｓ）にとって、境界ボリューム階層を非常に適したものにする。 The use of a border volume hierarchy as an acceleration structure is advantageous for a number of reasons. Memory requirements for the boundary volume hierarchy can be linearly limited to the number of objects that are ray-traced. Also, as will be explained below, the boundary volume hierarchy can be built much more efficiently than a 3D tree, which is an amortized analysis required for an overall moving scene. analysis) makes the boundary volume hierarchy very suitable.

以下の説明は、レイトレーシング法におけるある種の問題と、それらの問題に対処する本発明の特定の態様について詳細に説明する。図４は、「自己交差（ｓｅｌｆ−ｉｎｔｅｒｓｅｃｔｉｏｎ）」問題を図説する図である。図４は、画像表面３０２と、観測点３０４と、光源３０６とを含む、レイトレーシング手順３００を示している。表面の画像を合成するため、一連の計算が、観測点３０４と表面３０２の間に延在する光線を見出すために実行される。図４は、１つのそのような光線３０８を示している。理想的には、その場合、光線３０８と表面３０２との正確な交点３１０が計算される。 The following description describes in detail certain problems in ray tracing methods and specific aspects of the present invention that address those problems. FIG. 4 is a diagram illustrating the “self-intersection” problem. FIG. 4 shows a ray tracing procedure 300 that includes an image surface 302, an observation point 304, and a light source 306. In order to synthesize the surface image, a series of calculations are performed to find rays extending between the observation point 304 and the surface 302. FIG. 4 shows one such ray 308. Ideally, in that case, the exact intersection 310 of the ray 308 and the surface 302 is calculated.

しかし、コンピュータにおける浮動小数点算術計算のせいで、計算された光線／表面の交点３１２が実際の交点３１０とは異なることが時にあり得る。さらに、図４に図説されるように、計算された点３１２が表面３０２の「誤った」側に見出されることがあり得る。その場合、計算された光線／表面の交点３１２から光源３０６まで延在する第２の光線３１４を見出すために計算が実行されたとき、これらの計算は、第２の光線３１４が、光源３０６まで直接的に延在する代わりに、第２の交点３１６において表面３０２に打ち当たることを示し、したがって、画像誤差をもたらす。 However, due to floating point arithmetic calculations in the computer, the calculated ray / surface intersection 312 may sometimes be different from the actual intersection 310. Further, as illustrated in FIG. 4, the calculated point 312 can be found on the “wrong” side of the surface 302. In that case, when calculations are performed to find a second ray 314 that extends from the calculated ray / surface intersection 312 to the light source 306, these calculations result in the second ray 314 reaching the light source 306. Instead of extending directly, it shows hitting the surface 302 at the second intersection point 316, thus leading to an image error.

自己交差問題に対する１つの知られた解決策は、表面３０２から安全な距離のところから各光線３０８を出発させることである。この安全な距離は一般に、グローバル浮動小数点（ｇｌｏｂａｌｆｌｏａｔｉｎｇｐｏｉｎｔ）εとして表現される。しかし、グローバル浮動小数点εの決定は、画像が合成される、シーン及びシーン自体の中の特定の位置に大きく依存する。 One known solution to the self-intersection problem is to start each ray 308 from a safe distance from the surface 302. This safe distance is generally expressed as a global floating point ε. However, the determination of the global floating point ε is highly dependent on the scene and the specific location within the scene itself where the image is synthesized.

本発明の一態様は、より高精度の代替案を提供する。計算された光線／表面の交点３１２に到達した後、実際の交点３１０により近い交点を再計算するために、計算された点３１２及び光線３０８の方向が使用される。交点のこの再計算は、精度を高める反復として、レイトレーシング法に組み込まれる。反復的に計算される交点が表面３０２の「誤った」側に存在すると判明した場合、その交点は、表面３０２の「正しい」側に移動される。反復的に計算される交点は、表面の法線に沿って、又は法線の最長成分によって決定される軸に沿って移動することができる。グローバル浮動小数点εを使用する代わりに、小数点は浮動小数点の仮数の最終ビットの方へ整数εだけ移動される。 One aspect of the present invention provides a more accurate alternative. After reaching the calculated ray / surface intersection 312, the direction of the calculated point 312 and ray 308 is used to recalculate the intersection closer to the actual intersection 310. This recalculation of intersection points is incorporated into the ray tracing method as an iteration that increases accuracy. If an iteratively calculated intersection is found to be on the “wrong” side of the surface 302, the intersection is moved to the “correct” side of the surface 302. The iteratively calculated intersection can move along the surface normal or along an axis determined by the longest component of the normal. Instead of using the global floating point ε, the decimal point is moved by the integer ε toward the last bit of the floating point mantissa.

説明された手順は、倍精度の計算を回避し、浮動小数点数の指数によって決定される浮動小数点数の位取り（ｓｃａｌｅ）に暗黙に適応するという利点を有する。したがって、この実施では、すべての２次光線は、εオフセットを不要にするこれらの修正された点から直接出発する。したがって、交わり計算の最中、有効な光線区間は、何らかのオフセットの代わりに、０から開始すると仮定することができる。 The described procedure has the advantage of avoiding double precision calculations and implicitly adapting to the floating point scale determined by the exponent of the floating point number. Thus, in this implementation, all secondary rays start directly from these modified points that eliminate the ε offset. Thus, during the intersection calculation, it can be assumed that the effective ray section starts at 0 instead of some offset.

仮数の整数表現の修正は、どの点がどの側に存在するかを決定するために三角形と平面とを交わらせる場合、数値問題も回避する。 The modification of the integer representation of the mantissa also avoids numerical problems when crossing a triangle and a plane to determine which point is on which side.

凸結合の凸閉包性を活用して、光線と自由曲面との交わりは、光線の源に最も近い交点を含む軸平行境界ボックスを精緻化することによって見出すことができる。この精緻化は、浮動小数点数の解像度に達するまで、即ち、境界ボックス座標が浮動小数点表現から解像度の１単位分だけ異なるまで、継続することができる。その場合、自己交差問題は、境界ボックスの中心で表面法線に最も近い境界ボックスの角を選択することによって回避される。この角点は、その後、第２の光線を出発させるために使用される。この「光線物体交わりテスト」は、非常に効率的であり、自己交差問題の回避から利益を得る。 Utilizing the convex hull nature of convex coupling, the intersection of a ray and a free-form surface can be found by refining the axis-parallel bounding box containing the intersection closest to the source of the ray. This refinement can continue until a floating point resolution is reached, i.e., the bounding box coordinates differ from the floating point representation by one unit of resolution. In that case, the self-intersection problem is avoided by selecting the corner of the bounding box that is closest to the surface normal at the center of the bounding box. This corner point is then used to start the second ray. This “ray object intersection test” is very efficient and benefits from avoiding the self-intersection problem.

アクセラレーションデータ構造を構築した後、三角形が面内で変形される。新しい表現は、特別な苦労なしに交わりテストが縮退三角形（ｄｅｇｅｎｅｒａｔｅｔｒｉａｎｇｌｅ）を扱えるように、縮退三角形を符号化する。新しい表現は、もちろん、縮退三角形がグラフィックスパイプラインに入ることを単に防止することも可能である。コンピュータプログラムリストのセクション２．２．１及び２．２．２は、本発明の本態様によるソースコードのリストを示している。 After building the acceleration data structure, the triangle is deformed in-plane. The new representation encodes the degenerate triangle so that the intersection test can handle the degenerate triangle without any special effort. The new representation can of course simply prevent the degenerate triangle from entering the graphics pipeline. Sections 2.2.1 and 2.2.2 of the computer program listing show a list of source code according to this aspect of the invention.

テストは最初に、光線と三角形の平面との交わりを決定し、その後、光線上の有効区間］０，ｒｅｓｕｌｔ．ｔｆａｒ］の外部の交わりを除外する。これは、ただ１回の整数テストによって達成される。＋０が有効区間から除外されることに留意されたい。これは、非正規化浮動小数点数がサポートされない場合に重要である。この最初の決定が成功した場合、交わりの重心座標を計算することによって、テストは先に進む。完全包含テストを実行するために、やはり整数テストだけが、即ち、より詳細には２つのビットのテストだけが必要とされることに留意されたい。したがって、ブランチの数は最小である。この効率的なテストを可能にするため、三角形の辺及び法線は、変形ステップにおいて、適切にスケーリングされる。 The test first determines the intersection of the ray with the triangular plane, then the effective interval on the ray] 0, result. tfar] is excluded. This is achieved by a single integer test. Note that +0 is excluded from the valid interval. This is important when denormalized floating point numbers are not supported. If this initial decision is successful, the test proceeds by calculating the barycentric coordinates of the intersection. Note that in order to perform a full inclusion test, again only an integer test is needed, i.e. more specifically only a test of two bits. Therefore, the number of branches is minimal. In order to allow this efficient testing, triangle edges and normals are appropriately scaled in the deformation step.

テストの精度は、誤った又は失われた光線交わりを回避するのに十分である。しかし、横断の最中、堅牢な交わりテストのために、三角形を拡大するのが適切である状況が生じることがある。これは、三角形を変形する前に行うことができる。三角形はその法線の最長成分によって識別される軸に沿って射影されるので、この射影ケースは、保存されなければならない。これは、アクセラレーションデータ構造のリーフノードにおいて、カウンタによって達成される。三角形参照（ｔｒｉａｎｇｌｅｒｅｆｅｒｅｎｃｅ）は、射影ケースによってソートされ、リーフは、各クラス内の三角形の数のためのバイトを含む。 The accuracy of the test is sufficient to avoid false or lost ray intersections. However, during traversal situations may arise where it is appropriate to enlarge the triangles for robust intersection testing. This can be done before deforming the triangle. Since the triangle is projected along the axis identified by the longest component of its normal, this projection case must be preserved. This is accomplished by a counter at the leaf node of the acceleration data structure. Triangle references are sorted by projection case, and the leaf contains bytes for the number of triangles in each class.

本発明のさらなる態様は、レイトレーシング用のアクセラレーションデータ構造を構築するための改良された手法を提供する。多くの異なる最適化に従った従来のソフトウェア実施と比較して、本明細書で説明される手法は、優れたレイトレーシング性能をもたらす著しく平たいツリーを生みだす。 A further aspect of the invention provides an improved approach for building an acceleration data structure for ray tracing. Compared to conventional software implementations following many different optimizations, the approach described herein yields a significantly flat tree that provides excellent ray tracing performance.

分割平面の候補は、分割される軸平行境界ボックスの内部の三角形頂点の座標によって与えられる。これは、実際には境界ボックスの外部にあるが、境界ボックスによって定義される３つの区間の１つに含まれる座標を少なくとも１つ有する、頂点を含むことに留意されたい。これらの候補から、現在の軸平行境界ボックスの最長辺の中央に最も近い平面が選択される。さらなる最適化は、表面法線の最長成分が潜在的な分割平面の法線と一致する三角形の候補のみを選択する。分割平面を三角形頂点を通るように置くことは、分割平面によって分割される三角形の数を暗黙的に減少させるので、この手順は、はるかに平たいツリーを生みだす。加えて、表面が緊密に近似され、空の空間が最大化される。三角形の数が指定された閾値よりも多くなり、分割平面の候補がもはや存在しない場合、ボックスは、その最長辺に沿って中央で分割される。これは、例えば、長対角線物体の使用を含む、他の手法の非効率性を回避する。 The split plane candidates are given by the coordinates of the triangle vertices inside the axis-parallel bounding box to be split. Note that this includes vertices that are actually outside the bounding box, but have at least one coordinate contained in one of the three intervals defined by the bounding box. From these candidates, the plane closest to the center of the longest side of the current axis-parallel bounding box is selected. Further optimization selects only those triangle candidates whose longest surface normal component matches the normal of the potential splitting plane. This procedure yields a much flatter tree because placing the splitting plane through the triangle vertices implicitly reduces the number of triangles split by the splitting plane. In addition, the surface is closely approximated and the empty space is maximized. If the number of triangles exceeds the specified threshold and there are no more split plane candidates, the box is split in the middle along its longest side. This avoids the inefficiencies of other approaches, including, for example, the use of long diagonal objects.

どの三角形が階層内のノードの左側及び右側チャイルドに属するかを決定する再帰的手順は一般に、広範囲のブックキーピングと、メモリ割り当てとを必要とした。例外ケースにだけうまくいかない、はるかに簡略な手法が存在する。レイトレーシングされる物体への参照の配列が、２つだけ割り当てられる。第１の配列は、物体参照（ｏｂｊｅｃｔｒｅｆｅｒｅｎｃｅ）を用いて初期化される。再帰的な空間分割の最中、左側の要素のスタックは、配列の先頭から成長し、一方、右側に分類される要素は、配列の末尾から中央に向かって成長するスタック上に維持される。分割平面と交わっている要素、即ち、左側でも右側でもある要素を速やかに回復できるように、第２の配列が、それらのスタックを維持する。したがって、バックトラッキングは、効率的で簡単である。 The recursive procedure for determining which triangles belong to the left and right children of a node in the hierarchy generally required extensive bookkeeping and memory allocation. There is a much simpler approach that does not work only for exception cases. Only two arrays of references to the ray-tracing object are assigned. The first array is initialized with an object reference. During the recursive spatial partitioning, the left element stack grows from the beginning of the array, while the elements classified on the right are maintained on the stack growing from the end of the array toward the center. The second array maintains their stack so that the elements that intersect the split plane, i.e., the elements that are both left and right, can be quickly recovered. Thus, backtracking is efficient and simple.

表面積ヒューリスティック（ｓｕｒｆａｃｅａｒｅａｈｅｕｒｉｓｔｉｃ）を使用することによってツリーのブランチを刈り込む代わりに、ツリーの深さは、固定された深さから開始するバイナリ空間分割を近似的に左平衡化することによって刈り込まれる。網羅的な実験によって観測されたように、グローバルな固定深さパラメータが、非常に多様なシーンにわたって指定され得る。これは、ある回数のバイナリ空間分割の後、空間において相対的に平らな接続された要素が通常は残っていることを観測することによって理解できる。コンピュータプログラムリストのセクション２．３．１は、本発明のこの態様によるソースコードのリストを示している。 Instead of pruning the branches of the tree by using a surface area heuristic, the depth of the tree is pruned by approximately left equilibrating binary spatial partitioning starting from a fixed depth. As observed by exhaustive experiments, global fixed depth parameters can be specified across a wide variety of scenes. This can be understood by observing that a relatively flat connected element usually remains in space after a certain number of binary space divisions. Section 2.3.1 of the computer program listing shows a list of source code according to this aspect of the invention.

境界ボリューム階層を使用して、レイトレーシングされる各物体は、正確に１度だけ参照される。その結果、３Ｄツリーとは対照的に、階層の横断の最中に、物体と光線との複数の交わりを防止するために、メールボックスメカニズム（ｍａｉｌｂｏｘｍｅｃｈａｎｉｓｍ）は必要とされない。これは、システム性能の観点から見て大きな利点であり、共用メモリシステム上での実施をはるかに簡単にする。第２の重要な結果は、境界ボリューム階層のツリーでは、レイトレーシングされる物体の総数よりも多い内部ノードが存在できないことである。したがって、アクセラレーションデータ構造のメモリフットプリントは、構築前に物体の数に線形に制限され得る。そのような事前の制限は、３Ｄツリーの構築では利用可能でなく、３Ｄツリーでは、メモリ計算量は、レイトレーシングされる物体の数につれて２次的に増大することが予想される。 Using the boundary volume hierarchy, each ray-tracing object is referenced exactly once. As a result, in contrast to 3D trees, a mailbox mechanism is not required to prevent multiple intersections of objects and rays during traversal of the hierarchy. This is a great advantage from a system performance perspective and makes it much easier to implement on a shared memory system. The second important result is that there can be no more internal nodes in the boundary volume hierarchy tree than the total number of ray-traced objects. Thus, the memory footprint of the acceleration data structure can be linearly limited to the number of objects prior to construction. Such pre-limits are not available in the construction of 3D trees, where it is expected that memory complexity will increase quadratically with the number of objects that are ray-traced.

したがって、現在の３Ｄツリーレイトレーシング法よりも著しく高速であり、レイトレーシングされる物体の数につれて、メモリ要件が２次的に予想されるのではなく線形に増大する、境界ボリューム階層の新しい概念が、今から説明される。境界ボリューム階層が３Ｄツリーよりも性能的にまさることを可能にする中核概念は、境界ボリューム自体に焦点を絞る代わりに、空間がどのように分割され得るかに焦点を絞ることである。 Thus, there is a new concept of boundary volume hierarchies that is significantly faster than current 3D tree ray tracing methods, and that memory requirements increase linearly rather than secondarily expected with the number of objects to be ray-traced. Will be explained from now on. The core concept that allows the boundary volume hierarchy to outperform the 3D tree is to focus on how the space can be divided instead of focusing on the boundary volume itself.

３Ｄツリーにおいては、境界ボックスは、単一の平面によって分割される。本発明の本態様によれば、２つの軸平行境界ボックスを定義するために、２つの平行な平面が使用される。図５は、主要なデータ構造４００を示す図である。 In a 3D tree, the bounding box is divided by a single plane. In accordance with this aspect of the invention, two parallel planes are used to define two axis parallel bounding boxes. FIG. 5 is a diagram showing a main data structure 400.

図５は、軸平行境界ボックス４０２を立面図で示している。軸平行であり、互いに平行である、Ｌ平面４０２及びＲ平面４０４は、境界ボックス４０２を左軸平行境界ボックスと右軸平行境界ボックスとに分割するために使用される。左境界ボックスは、元の境界ボックス４０２の左壁４０６からＬ平面４０２まで広がる。右境界ボックスは、Ｒ平面４０４から元の境界ボックス４０２の右壁４０８まで広がる。したがって、左境界ボックスと右境界ボックスとは、互いに重なり合うことがある。光線４１２の横断は、光線４１０の有効な区間［Ｎ，Ｆ］４１２に関するＬ平面４０２及びＲ平面４０４の交わりの位置によって決定される。 FIG. 5 shows the axis parallel bounding box 402 in elevation. The L plane 402 and the R plane 404 that are axially parallel and parallel to each other are used to divide the bounding box 402 into a left-axis parallel bounding box and a right-axis parallel bounding box. The left bounding box extends from the left wall 406 of the original bounding box 402 to the L plane 402. The right bounding box extends from the R plane 404 to the right wall 408 of the original bounding box 402. Therefore, the left bounding box and the right bounding box may overlap each other. The traversal of ray 412 is determined by the position of intersection of L plane 402 and R plane 404 with respect to the valid section [N, F] 412 of ray 410.

図５のデータ構造４００では、Ｌ平面４０２及びＲ平面４０４は、境界ボックス４０２内に含まれる空間ではなく、元の境界ボックス４０２内に含まれる物体の集合を分割するように、互いに対して位置づけられる。３Ｄツリー分割とは対照的に、２つの平面をもつことが、２つの平面の間の空の空間を最大化する可能性を提供する。その結果、シーンの境界が、はるかに高速に近似され得る。 In the data structure 400 of FIG. 5, the L plane 402 and the R plane 404 are positioned relative to each other so as to divide the set of objects contained within the original bounding box 402, rather than the space contained within the bounding box 402. It is done. In contrast to 3D tree partitioning, having two planes offers the possibility of maximizing the empty space between the two planes. As a result, the scene boundaries can be approximated much faster.

図６〜図８は、データ構造４００をさらに説明する、一連の３次元図である。図６は、境界ボックス４０２の図を示している。説明の目的で、境界ボックス４０２内に仮想物体が、抽象的な円４１６として示されている。図７及び図８に示されるように、境界ボックス４０２を左境界ボックス４０２ａと右境界ボックス４０２ｂとに分割するために、次にＬ平面４０４及びＲ平面４０６が使用される。Ｌ平面及びＲ平面は、それらの間の空の空間が最大化されるように選択される。各仮想物体４１６は、最終的には左境界ボックス４０２ａ又は右境界ボックス４０２ｂ内に入る。図８の下図に示されるように、仮想物体４１６は、「左」物体４１６ａと「右」物体４１６ｂとに分割される。結果の境界ボックス４０２ａ、４０２ｂの各々は、それ自体が分割され、それ以降も、終了基準が満たされるまで同様である。 FIGS. 6-8 are a series of three-dimensional diagrams that further illustrate the data structure 400. FIG. 6 shows a diagram of the bounding box 402. For illustrative purposes, a virtual object is shown in the bounding box 402 as an abstract circle 416. As shown in FIGS. 7 and 8, the L plane 404 and the R plane 406 are then used to divide the bounding box 402 into a left bounding box 402a and a right bounding box 402b. The L and R planes are chosen so that the empty space between them is maximized. Each virtual object 416 eventually enters the left bounding box 402a or the right bounding box 402b. As shown in the lower part of FIG. 8, the virtual object 416 is divided into a “left” object 416a and a “right” object 416b. Each of the resulting bounding boxes 402a, 402b is itself split and so on until the exit criteria are met.

図９は、説明される方法５００のフローチャートである。ステップ５０１において、シーンの境界ボックスが計算される。ステップ５０２において、軸平行境界ボックスを、重なり合うことがある左軸平行境界ボックスと右軸平行境界ボックスとに分割するために、Ｌ平面及びＲ平面が使用される。ステップ５０３において、元の軸平行境界ボックスに含まれる仮想物体の集合を、左物体の集合と右物体の集合とに分割するために、左境界ボックス及び右境界ボックスが使用される。ステップ５０４において、左物体及び右物体が、終了基準が満たされるまで、再帰的に処理される。 FIG. 9 is a flowchart of the method 500 described. In step 501, the bounding box of the scene is calculated. In step 502, the L and R planes are used to divide the axis parallel bounding box into left and right axis parallel bounding boxes that may overlap. In step 503, the left and right bounding boxes are used to divide the set of virtual objects contained in the original axis parallel bounding box into a set of left objects and a set of right objects. In step 504, the left and right objects are processed recursively until the exit criteria are met.

以前の実施において使用された１つの分割パラメータの代わりに、２つの分割パラメータが、ノード内に保存される。ノードの数はレイトレーシングされる物体の数によって線形的に制限されるので、すべてのノードの配列は、１度だけ割り当てられ得る。したがって、構築中の３Ｄツリーの高コストのメモリ管理が不要になる。 Instead of the one splitting parameter used in the previous implementation, two splitting parameters are stored in the node. Since the number of nodes is linearly limited by the number of ray-tracing objects, the array of all nodes can be assigned only once. This eliminates the need for expensive memory management of the 3D tree under construction.

構築技法は、３Ｄツリー構築用の類似物（ａｎａｌｏｇ）よりはるかに簡単であり、再帰的方法で、又は反復バージョン及びスタックを使用することによって、容易に実施される。物体のリスト及び軸平行境界ボックスを与えると、Ｌ平面及びＲ平面が決定され、物体の集合がしかるべく決定される。その後、左物体及び右物体が、何らかの終了基準が満たされるまで、再帰的に処理される。内部ノードの数は制限されているので、ただ１つの物体が残されているときに、終了を当てにしても安全である。 The construction technique is much simpler than the analog for 3D tree construction and is easily implemented in a recursive manner or by using iterative versions and stacks. Given a list of objects and an axis-parallel bounding box, the L and R planes are determined and the set of objects is determined accordingly. The left and right objects are then processed recursively until some termination criterion is met. Since the number of internal nodes is limited, it is safe to rely on termination when only one object remains.

分割が、ｘ軸、ｙ軸、及びｚ軸に垂直な平面に沿って、物体を分類することだけに依存しており、それは非常に効率的で、数値的に絶対安定であることに留意されたい。３Ｄツリーとは対照的に、物体と分割平面との正確な交わりは計算される必要がなく、それは、数値的に堅牢な方法で達成するには、より大きなコストがかかり困難である。頂点及び辺沿いでの三角形の喪失など、３Ｄツリーの数値問題は、境界ボリューム階層の構築前に三角形を拡大することによって回避され得る。また、３Ｄツリーでは、重なり合う物体は、左軸平行境界ボックスと右軸平行境界ボックスの両方に分類されなければならず、それによって、ツリーの予想される２次成長を引き起こす。 Note that segmentation relies solely on classifying objects along planes perpendicular to the x, y, and z axes, which is very efficient and numerically absolutely stable. I want. In contrast to the 3D tree, the exact intersection of the object and the splitting plane does not need to be calculated, which is more expensive and difficult to achieve in a numerically robust manner. Numerical problems with 3D trees, such as the loss of triangles along vertices and edges, can be avoided by expanding the triangles before building the boundary volume hierarchy. Also, in a 3D tree, overlapping objects must be classified into both a left-axis parallel bounding box and a right-axis parallel bounding box, thereby causing the expected secondary growth of the tree.

Ｌ平面及びＲ平面を、したがって実際のツリーレイアウトを決定するために、様々な技法が使用されてよい。図６〜図８に戻ると、１つの技法は、上述の３Ｄツリー構築技法を使用して平面Ｍ４１８を決定し、新しい軸平行境界ボックスの結果のＬ平面及びＲ平面の重なりが、提案された分割平面Ｍ４１８と最も小さく重なり合うように、物体を分割することである。結果のツリーは、対応する３Ｄツリーと非常に類似しているが、空間の代わりに、物体集合が分割されるので、結果のツリーは、はるかに平たい。別の手法は、チャイルドボックスの重なりが最小になり、空の空間が可能な限り最大化されるような方法で、Ｒ平面及びＬ平面を選択することである。ある物体にとっては、軸平行境界ボックスが非効率的であることに留意されたい。そのような状況の一例は、軸平行境界ボックスの対角線上の、半径の小さい長い円柱である。 Various techniques may be used to determine the L and R planes, and thus the actual tree layout. Returning to FIGS. 6-8, one technique determines the plane M418 using the 3D tree construction technique described above, and the resulting overlap of the L and R planes of the new axis parallel bounding box has been proposed. The object is divided so as to overlap with the dividing plane M418 the smallest. The resulting tree is very similar to the corresponding 3D tree, but the resulting tree is much flatter because the object set is split instead of space. Another approach is to select the R and L planes in such a way that child box overlap is minimized and the empty space is maximized as much as possible. Note that for some objects, the axis parallel bounding box is inefficient. One example of such a situation is a long cylinder with a small radius on the diagonal of an axis-parallel bounding box.

図１０は、本発明のこの態様による、方法６００のフローチャートである。ステップ６０１において、シーンの境界ボックスが計算される。ステップ６０２において、分割平面Ｍを決定するために、３Ｄツリー構築が実行される。ステップ６０３において、軸平行境界ボックスを、分割平面Ｍと最も小さく重なり合う左軸平行境界ボックスと右軸平行境界ボックスとに分割するために、平行なＬ平面及びＲ平面が使用される。ステップ６０４において、元の軸平行境界ボックスに含まれる仮想物体の集合を、左物体の集合と右物体の集合とに分割するために、左境界ボックス及び右境界ボックスが使用される。ステップ６０５において、左物体及び右物体が、終了基準が満たされるまで、再帰的に処理される。図１０で説明される方法６００は、図３で説明された方法２００と同様に、３Ｄツリー構築、リアルタイム処理、バケットソーティング、自己交差などに関する技法を含む、本明細書で説明される他の技法と組み合わされてよいことに留意されたい。 FIG. 10 is a flowchart of a method 600 according to this aspect of the invention. In step 601, the bounding box of the scene is calculated. In step 602, a 3D tree construction is performed to determine the split plane M. In step 603, the parallel L and R planes are used to divide the axis parallel bounding box into a left axis parallel bounding box and a right axis parallel bounding box that overlap the partition plane M the smallest. In step 604, the left and right bounding boxes are used to divide the set of virtual objects contained in the original axis parallel bounding box into a set of left objects and a set of right objects. In step 605, the left and right objects are processed recursively until the exit criteria are met. The method 600 described in FIG. 10 is similar to the method 200 described in FIG. 3 with other techniques described herein, including techniques related to 3D tree construction, real-time processing, bucket sorting, self-intersection, and the like. Note that may be combined with:

３Ｄツリーの場合、空間細分は、物体周囲の空間の空の部分を切り取るように継続される。説明される境界ボリューム階層の場合、そのような物体をより小さな物体に分割することが、同様の挙動をもたらす。メモリ要件の予測可能性を維持するため、最大境界ボックスサイズが定義される。最大境界ボックスサイズを超える大きさをもつすべての物体は、要件を満たすように、より小さい部分に分割される。最大許容サイズは、すべての物体のうちの最小の大きさを求めてデータセットをスキャンすることによって見出すことができる。 In the case of a 3D tree, spatial subdivision is continued to clip away the empty portion of the space around the object. For the boundary volume hierarchy described, splitting such objects into smaller objects yields similar behavior. In order to maintain predictability of memory requirements, a maximum bounding box size is defined. All objects with a size that exceeds the maximum bounding box size are divided into smaller parts to meet the requirement. The maximum allowable size can be found by scanning the data set for the smallest size of all objects.

本明細書で説明されるデータ構造は、高速３Ｄツリー横断の原理を境界ボリューム階層に移植することを可能にする。（１）左チャイルドのみ、（２）右チャイルドのみ、（３）左チャイルドの後、右チャイルド、（４）右チャイルドの後、左チャイルド、又は（５）光線が分割平面の間（即ち空の空間）にある、という、横断のケースは同様である。説明される技法における１つのノードは２つの平行な平面によって分割されるので、ボックスをどのような順序で横断するかは、光線の方向によって決定される。図６は、上述の技法を含むソースコードリストを説明している。 The data structures described herein allow fast 3D tree traversal principles to be ported to the boundary volume hierarchy. (1) Left child only, (2) Right child only, (3) After left child, right child, (4) After right child, left child, or (5) Rays between split planes (ie empty The case of crossing that is in the space is the same. Since one node in the described technique is divided by two parallel planes, the order in which the box is traversed is determined by the direction of the rays. FIG. 6 illustrates a source code listing that includes the techniques described above.

これまでの境界ボリューム階層技法は、チャイルドノード、又はヒープデータ構造の更新など必要とされる付加的な作業をどのような順序で横断するかを効率的に決定できなかった。加えて、全境界ボリュームが、ロードされ、光線に対してテストされなければならなかったが、本手法は、２つの平面距離を必要とするだけである。しかし、ソフトウェアで２つの平面に対して光線をチェックすることは、よりコスト高であるように思える。横断は、３Ｄツリーにおけるボトルネックであり、もう少し多くの計算をここで行うことが、メモリアクセスの待ち時間をより良く隠蔽する。加えて、境界ボリューム階層ツリーは、同じ性能の対応する３Ｄツリーよりもはるかに小さくなる傾向にある。 Previous boundary volume hierarchy techniques have not been able to efficiently determine in what order to traverse additional work required such as updating child nodes or heap data structures. In addition, although the entire bounding volume had to be loaded and tested for rays, this approach only requires two plane distances. However, checking the rays against two planes in software seems more expensive. Traversal is a bottleneck in the 3D tree, and doing a little more computation here better hides memory access latency. In addition, the boundary volume hierarchy tree tends to be much smaller than the corresponding 3D tree with the same performance.

本明細書では新しい境界ボリューム階層が説明されるが、３Ｄツリーの横断と強い関連が存在する。Ｌ＝Ｒと設定すると、従来のバイナリ空間分割が獲得され、横断技法は、３Ｄツリーの横断技法へと崩落する。 Although a new bounding volume hierarchy is described herein, there is a strong association with 3D tree traversal. Setting L = R acquires traditional binary space partitioning, and the traversal technique collapses into a 3D tree traversal technique.

説明される境界ボリューム階層は、自由曲面を細分することによって、光線自由曲面交わりを効率的に見つけるためにも適用することができる。そのようにすることは、自由曲面と凸閉包性との交わり、及び実際の浮動小数点計算に応じて浮動小数点精度まで効率的に計算される細分技法を可能にする。細分ステップは、例えば、多項式曲面、有理曲面、及び近似細分化曲面（ａｐｐｒｏｘｉｍａｔｉｎｇｓｕｂｄｉｖｉｓｉｏｎｓｕｒｆａｃｅ）について実行される。空間内の各軸について、重なり合う可能性のある境界ボックスが、上で説明されたように決定される。バイナリ細分の場合、新しいメッシュの新しい境界ボックスについてのＬボックスの交わり及びＲＬボックスの交わり。今は、ボックスの空間的順序が知られているので、上述の横断が効率的に実行できる。境界ボリュームの階層を事前計算する代わりに、それはオンザフライで計算することができる。この手順は、自由曲面について効率的であり、アクセラレーションデータ構造のためのメモリを節約することを可能にし、それは、バックトラッキングによって横断されなければならない境界ボリュームの小さなスタックによって置き換えられる。細分は、光線表面交わりが浮動小数点精度の点又は小さなサイズの区間へと崩落した境界ボリュームに入るまで、継続される。コンピュータプログラムリストのセクション２．１．１は、本発明のこの態様によるコードリストを示している。 The described boundary volume hierarchy can also be applied to efficiently find ray free surface intersections by subdividing the free surface. Doing so allows for the intersection of a free-form surface with convex hullability, and a subdivision technique that is efficiently computed to floating-point precision depending on the actual floating-point computation. The subdivision step is performed on, for example, a polynomial curved surface, a rational curved surface, and an approximate subdivision surface. For each axis in space, bounding boxes that may overlap are determined as described above. For binary subdivisions, L box intersection and RL box intersection for a new bounding box of the new mesh. Now that the spatial order of the boxes is known, the above traversal can be performed efficiently. Instead of precomputing the boundary volume hierarchy, it can be computed on the fly. This procedure is efficient for freeform surfaces and allows saving memory for acceleration data structures, which is replaced by a small stack of boundary volumes that must be traversed by backtracking. The subdivision continues until the ray surface intersection enters a boundary volume that collapses into a point of floating point precision or a small size interval. Section 2.1.1 of the computer program listing shows a code listing according to this aspect of the invention.

レイトレーシングにおけるアクセラレーションデータ構造としての規則的グリッドの使用は簡単であるが、効率は、空間適応性の欠如、及び多くの空のグリッドセルのその後の横断により影響をこうむる。階層的な規則的グリッドは、状況を改善できるが、境界ボリューム階層及び３Ｄツリーと比較すると依然として劣っている。しかし、アクセラレーションデータ構造の構築速度を改善するために、規則的グリッドが使用できる。アクセラレーションデータ構造を構築するための技法は、クイックソーティングと類似しており、○（ｎｌｏｇｎ）で動作することが予想される。線形時間で動作するバケットソーティングを適用することによって、改善が獲得され得る。したがって、物体の軸平行境界ボックスは、ｎ_ｘ×ｎ_ｙ×ｎ_ｚ個の軸平行ボックスに分割される。その後、各物体は、１つの選択点によって、これらのボックスの１つに正確に分類され、例えば、重心又は各三角形の第１の頂点が使用できる。その後、各グリッドセル内の物体の実際の軸平行境界ボックスが決定される。これらの軸平行境界ボックスは、ボックスが分割平面の１つと交わらない限り、それらが含む物体の代わりに使用される。交わる場合は、ボックスが開けられ、代わりに、ボックス内の物体が直接使用される。この手順は、多くの比較及びメモリアクセスを省き、構築技法のオーダの定数を目覚しく改善し、再帰的に適用することもできる。上記の技法は、物体のストリームを処理することによって実現され得るので、特にハードウェア実装にとって魅力的である。 Although the use of regular grids as an acceleration data structure in ray tracing is straightforward, efficiency is affected by a lack of spatial adaptability and subsequent traversal of many empty grid cells. Hierarchical regular grids can improve the situation, but are still inferior compared to boundary volume hierarchies and 3D trees. However, regular grids can be used to improve the speed of building acceleration data structures. The technique for building the acceleration data structure is similar to quick sorting and is expected to work with ◯ (nlogn). Improvements can be obtained by applying bucket sorting that operates in linear time. Accordingly, the object axis-aligned bounding box is divided into n _x × n _y × n _z number of axis-aligned box. Each object is then accurately classified into one of these boxes by one selected point, for example, the centroid or the first vertex of each triangle can be used. Thereafter, the actual axial parallel bounding box of the object in each grid cell is determined. These axis-parallel bounding boxes are used in place of the objects they contain unless the box intersects one of the split planes. If they intersect, the box is opened and instead the object in the box is used directly. This procedure can also be applied recursively, eliminating many comparisons and memory accesses, significantly improving the constants of the order of construction techniques. The above technique is particularly attractive for hardware implementations because it can be realized by processing a stream of objects.

アクセラレーションデータ構造は、要求時に、即ち、物体を有する特定の軸平行境界ボックスを光線が横断したときに、構築することができる。その場合、一方では、アクセラレーションデータ構造は、光線の射さない空間の領域では、精緻になることはなく、キャッシュは、決して触れられることのないデータによって汚染されない。他方、精緻化の後、光線と交わる可能性のある物体は、すでにキャッシュ内に存在する。 An acceleration data structure can be constructed on demand, that is, when a ray traverses a particular axis-parallel bounding box with an object. In that case, on the other hand, the acceleration data structure will not be elaborated in areas of light that do not shine, and the cache will not be polluted by data that will never be touched. On the other hand, after elaboration, objects that can intersect the light beam already exist in the cache.

上記の説明から、本発明が、レイトレーシングにおける長く知られていた問題に対処し、改善された精度、全体的速度、及びアクセラレーションデータ構造のメモリフットプリントを改善した、レイトレーシングのための技法を提供することが分かる。数値精度の改善は、他の数体系への転換ばかりでなく、例えば、ＡＲＴレイトレーシングチップのハードウェアで使用される対数体系への転換も行う。プロセッサ又は専用ハードウェアへのＩＥＥＥ浮動小数点標準の具体的な実装は、性能に深刻な影響を及ぼし得ることに留意されたい。例えば、Ｐｅｎｔｉｕｍ４チップ上で、非正規化数は、１００分の１以下に性能を低下させ得る。上で説明されたように、本発明の実施は、これらの例外を回避する。本明細書で説明される境界ボリューム階層の概念は、境界ボリューム階層をリアルタイムレイトレーシングにとって適したものにする。償却解析においては、説明された技法は、当技術分野のこれまでの状況よりも性能が優れており、したがって、制作セッティング（ｐｒｏｄｕｃｔｉｏｎｓｅｔｔｉｎｇ）などにおいて、例えば、全体的に動きのあるシーンでモーションブラーを計算するための技法など、より正確な技法の使用を可能にする。説明された境界ボリューム階層が、３Ｄツリー及び他の技法と比較した場合、特にハードウェア実装において、また巨大シーンについて、著しい利点を有することは、上記の説明から明らかである。償却解析においては、説明された境界ボリューム階層は、現在の３Ｄツリーよりも少なくとも２倍、性能が優れている。加えて、メモリフットプリントは、事前に決定することができ、物体の数に対して線形である。 From the above description, the present invention addresses a long known problem in ray tracing and improves the accuracy, overall speed, and memory footprint of acceleration data structures, and techniques for ray tracing. You can see that The improvement in numerical accuracy is not only a conversion to other numerical systems, but also a conversion to a logarithmic system used in, for example, the hardware of an ART ray tracing chip. Note that the specific implementation of the IEEE floating point standard on a processor or dedicated hardware can have a severe impact on performance. For example, on a Pentium 4 chip, a denormalized number can degrade performance to 1/100 or less. As explained above, implementation of the present invention avoids these exceptions. The boundary volume hierarchy concept described herein makes the boundary volume hierarchy suitable for real-time ray tracing. In amortization analysis, the described technique outperforms previous situations in the art, and thus, for example, in production settings, for example, motion blur in motion scenes. Allows the use of more accurate techniques, such as techniques for computing. It is clear from the above description that the described boundary volume hierarchy has significant advantages when compared to 3D trees and other techniques, especially in hardware implementations and for large scenes. In the amortization analysis, the described boundary volume hierarchy is at least twice as good as the current 3D tree. In addition, the memory footprint can be predetermined and is linear with the number of objects.

ＩＩ．マルコフ連鎖及び準モンテカルロ法を使用してシミュレートされる光線のためのソーティング戦略
以下に説明される本発明の態様は、光線表面交わりを計算するためにレイトレーシング法を使用する、画像内のピクセルのピクセル値を計算するためにコンピュータ又は他のデジタルプロセッサが使用されるコンピュータグラフィックス及び他の応用例における使用に適した、光輸送の効率的なフォトリアリスティックシミュレーションのための方法、技法、及びシステムを提供する。 II. Sorting Strategy for Rays Simulated Using Markov Chains and Quasi-Monte Carlo Methods Aspects of the invention described below use pixels in an image that use ray-tracing methods to calculate ray surface intersections. A method, technique, and method for efficient photorealistic simulation of light transport, suitable for use in computer graphics and other applications where a computer or other digital processor is used to calculate a pixel value of Provide a system.

特に、本発明は、光子など光のパケット又はユニットによって照明されるシーン内の物体の高速処理を可能にするソーティング戦略を適用することによって、光線表面交わりテストをより効率的にするための方法を含む、従来のレイトレーシング法に対する改良を提供し、光のパケット又はユニットの軌道は、マルコフ連鎖を使用してシミュレートされ、マルコフ連鎖は、適切な準モンテカルロ（ＱＭＣ）法を使用してシミュレートされる。ソーティング戦略は、近接性ソーティングを含み、本発明は、有利には、そのような近接性ソーティングを使用する低次元のＱＭＣ法を活用する。 In particular, the present invention provides a method for making ray surface intersection testing more efficient by applying a sorting strategy that allows for fast processing of objects in a scene illuminated by a packet or unit of light such as photons. Provides improvements over conventional ray-tracing methods, including: optical packet or unit trajectories are simulated using Markov chains, which are simulated using appropriate quasi-Monte Carlo (QMC) methods Is done. The sorting strategy includes proximity sorting, and the present invention advantageously exploits a low-dimensional QMC method that uses such proximity sorting.

説明される技法及びシステムは、従来のシステムでは非常な困難を提示した課題であった、リアルタイムに近い時間で積分方程式を解くことのために使用することができる。 The described techniques and systems can be used to solve integral equations in near real-time, a task that presents great difficulty with conventional systems.

これまでは、マルコフ連鎖、例えば光子の軌道をシミュレートするための準モンテカルロ法の使用は、ランダムサンプリングよりわずかに良い程度の改善を提供したにすぎなかった。しかし、本発明によるＱＭＣ法は、解演算子の本質的な低次元構造を活用することができ、はるかに効率的な技法をもたらす。 Previously, the use of quasi-Monte Carlo methods to simulate Markov chains, such as photon trajectories, provided only a slight improvement over random sampling. However, the QMC method according to the present invention can take advantage of the intrinsic low-dimensional structure of the solution operator, resulting in a much more efficient technique.

本発明のこの態様の説明は、以下のように構成される。
１．マルコフ連鎖及び準モンテカルロ法
２．マルコフ連鎖の同時シミュレーション
２．１１次元での決定論的技法の分析
２．１．１技法の簡略化
２．１．２いつ、なぜ、それが機能するか
２．２ｓ次元での簡略化技法
２．３ランダム化
２．４配列ランダム化準モンテカルロ
３．光輸送シミュレーションへの応用
３．１Ｆｒｅｄｈｏｌｍ積分及びＶｏｌｔｅｒｒａ積分
４．ソーティング技法及びシステム
４．１状態のノルム
４．２空間階層
４．３バケットソーティング及び空間充填曲線
５．結果
６．結論
７．一般的技法 The description of this aspect of the invention is organized as follows.
1. 1. Markov chain and quasi-Monte Carlo method Simultaneous simulation of Markov chains 2.1 Analysis of deterministic techniques in one dimension 2.1.1 Simplification of techniques 2.1.2 When and why it works 2.2 Simplification in s dimensions Technique 2.3 Randomization 2.4 Sequence Randomization Quasi-Monte Carlo 3. Application to light transport simulation 3.1 Fredholm integration and Volterra integration Sorting Techniques and Systems 4.1 State Norm 4.2 Spatial Hierarchy 4.3 Bucket Sorting and Space Filling Curves 5. Result 6. Conclusion 7. General technique

１．マルコフ連鎖及び準モンテカルロ法
数学的に、マルコフ連鎖は、システム状態の離散時間確率系列である。マルコフ連鎖は、現在の状態の知識が与えられれば、以降の状態の確率を予想する上で、以前の状態は無関係である、「記憶いらずの」プロセスである。マルコフ連鎖は、多くの異なる現象をモデル化し、分析するために使用することができる。例えば、英語は、１つの単語が別の単語の後に続く相対頻度の計算に基づいて、マルコフ連鎖を使用してモデル化された。これらの遷移確率を使用して、英語の文を生成することが可能である。例えば、待ち行列、ブラウン運動、粒子輸送の分析、及び本説明に特に関係が深い光輸送シミュレーションを含む、他の多くの進展プロセスも、マルコフ連鎖を使用して説明することができる。 1. Markov Chain and Quasi-Monte Carlo Method Mathematically, a Markov chain is a discrete-time probability sequence of system states. A Markov chain is a “memoryless” process in which, given the knowledge of the current state, the previous state is irrelevant in predicting the probability of the subsequent state. Markov chains can be used to model and analyze many different phenomena. For example, English was modeled using a Markov chain based on a calculation of the relative frequency of one word following another. Using these transition probabilities, English sentences can be generated. Many other evolutionary processes can also be described using Markov chains, including, for example, queuing, Brownian motion, particle transport analysis, and light transport simulations that are particularly relevant to the present description.

より具体的には、マルコフ連鎖は、シミュレート画像のシーン内の物体を照明する光子の軌道をシミュレートするために使用できる。光輸送シミュレーションは、複雑な領域及び不連続性と、高次元積分への還元とによって特徴づけられる。加えて、光輸送シミュレーションは、同一演算子の反復適用による低次元構造によって特徴づけられる。 More specifically, Markov chains can be used to simulate the trajectory of photons that illuminate objects in the scene of the simulated image. Light transport simulation is characterized by complex regions and discontinuities and reduction to higher dimensional integrals. In addition, light transport simulation is characterized by a low dimensional structure by repeated application of the same operator.

マルコフ連鎖自体も、準モンテカルロ（ＱＭＣ）法を使用して、シミュレートされ得る。ＱＭＣ法の背後にある一般的な考えは、何らかの方法でランダムに生成された数の調整を試みる代わりに、選択区間の非常に一様な被覆（ｃｏｖｅｒａｇｅ）を提供するために、知られた決定論的系列から開始するというものである。ランダム化ＱＭＣでは、数はランダムな様相を有するが、規則的な方法で選択区間を被覆するという所望の特性をグループとして保持するような方法で、系列はランダム化される。 The Markov chain itself can also be simulated using the quasi-Monte Carlo (QMC) method. The general idea behind the QMC method is that the known decision to provide a very uniform coverage of the selected interval instead of trying to adjust the number of randomly generated in some way. It starts with a theoretical line. In randomized QMC, the number has a random appearance, but the sequence is randomized in such a way that the desired property of covering the selected interval in a regular manner is retained as a group.

ＱＭＣを使用してマルコフ連鎖をシミュレートするために利用できる多くの手法が存在する。１つの手法は、高次元低食い違い量点集合（ｈｉｇｈ−ｄｉｍｅｎｓｉｏｎａｌ，ｌｏｗ−ｄｉｓｃｒｅｐａｎｃｙｐｏｉｎｔｓｅｔ）を適用することである。しかし、この手法は、中次元に対してだけ著しい利点を提供する。別の手法は、低次元ランダム化点集合のパディングである。この第２の手法は、実施するのがより簡単であり、ほぼ良好な結果を達成する。第３の手法は、状態のソーティングとパディングとを組み合わせる。この第３の手法は、ＱＭＣの能力が目に見える、性能の上昇をもたらし、決定論的技法及びランダム化技法の両方と併せて使用することができる。この手法の実施は、相対的に容易である。 There are many approaches that can be used to simulate Markov chains using QMC. One approach is to apply a high-dimensional, low-discrepancy point set. However, this approach offers significant advantages only for the middle dimension. Another approach is padding of a low-dimensional randomized point set. This second approach is easier to implement and achieves almost good results. The third technique combines state sorting and padding. This third approach results in an increase in performance where QMC capabilities are visible and can be used in conjunction with both deterministic and randomization techniques. Implementation of this technique is relatively easy.

プロセスの特性は、複数のマルコフ連鎖の寄与を平均することによって推定することができる。各軌道を個々にシミュレートする代わりに、相関サンプルを使用してマルコフ連鎖を同時にシミュレートし、各遷移ステップの後に１群の状態をソートすることで、収束を顕著に改善できることが見出された。 The characteristics of the process can be estimated by averaging the contributions of multiple Markov chains. Instead of simulating each trajectory individually, it was found that convergence can be significantly improved by simultaneously simulating Markov chains using correlation samples and sorting a group of states after each transition step. It was.

状態空間上で順序を与えた場合、これらの手法は、１群のマルコフ連鎖をソートすることによって改善され得る。以下に説明される本発明の態様によれば、決定論的手法の使用は、説明される技法において簡略化をもたらし、説明されたソーティングがいつ、なぜ、増大した性能をもたらすかに関する有益な徴候を提供する。以下にさらに説明されるように、説明されるソーティング戦略の一部又は全部は、有利には、光輸送シミュレーションの効率を増大させるために使用することができる。 Given the order on the state space, these approaches can be improved by sorting a group of Markov chains. In accordance with aspects of the present invention described below, the use of deterministic techniques provides simplification in the described techniques, and beneficial indications as to when and why the described sorting results in increased performance. I will provide a. As described further below, some or all of the described sorting strategies can be advantageously used to increase the efficiency of light transport simulations.

２．マルコフ連鎖の同時シミュレーション
効率を増大させるため、マルコフ連鎖は、同時にシミュレートされ得る。準モンテカルロ法を用いたマルコフ連鎖の同時シミュレーションを中間ソーティングステップによって改善するというアイデアは、ボルツマン方程式を扱い、後に熱方程式を扱った一連の論文において、最初にＬｅｃｏｔによって紹介された。その後、このアイデアは、グリッド上で熱方程式を解くために、Ｍｏｒｏｋｏｆｆ及びＣａｆｌｉｓｃｈによって使用及び洗練され、最近、当該技法のランダム化バージョンと、希少事象シミュレーションのための分割とを組み込むために、Ｌ’Ｅｃｕｙｅｒ、Ｔｕｆｆｉｎ、Ｄｅｍｅｒｓ他によって拡張された。独立した研究が、上記の手法に関連した方法で、コンピュータグラフィックスの分野において行われてきた。 2. Simultaneous simulation of Markov chains To increase efficiency, Markov chains can be simulated simultaneously. The idea of improving the simultaneous simulation of Markov chains using the quasi-Monte Carlo method by an intermediate sorting step was first introduced by Lecot in a series of papers dealing with the Boltzmann equation and later dealing with the thermal equation. This idea was then used and refined by Morokoff and Caflisch to solve the heat equation on the grid, and recently L 'to incorporate a randomized version of the technique and partitioning for rare event simulation. Extended by Ecuyer, Tuffin, Demers et al. Independent research has been conducted in the field of computer graphics in a manner related to the above approach.

以下の説明では、上記の方式にまさる改善を実施する技法が説明される。これらの改良技法の背景は、Ｈ．Ｎｉｅｄｅｒｒｅｉｔｅｒ編、ＭｏｎｔｅＣａｒｌｏａｎｄＱｕａｓｉ−ＭｏｎｔｅＣａｒｌｏＭｅｔｈｏｄｓｉｎＳｃｉｅｎｔｉｆｉｃＣｏｍｐｕｔｉｎｇ２００２、ｐｐ．３２９〜４４、ＳｐｒｉｎｇｅｒＶｅｒｌａｇ（２００４）に記載の、Ｌｅｃｏｔ他、「Ｑｕａｓｉ−ＭｏｎｔｅＣａｒｌｏＭｅｔｈｏｄｓｆｏｒＥｓｔｉｍａｔｉｎｇＴｒａｎｓｉｅｎｔＭｅａｓｕｒｅｓｏｆＤｉｓｃｒｅｔｅＴｉｍｅＭａｒｋｏｖＣｈａｉｎｓ」において説明されている技法によって部分的に提供され、同文献は、その全体があたかも説明されているかのように、参照により本明細書に組み込まれる。 In the following description, techniques for implementing improvements over the above scheme are described. The background of these improved techniques is described in H.W. Edited by Niederreiter, Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing 2002, pp. 329-44, Springer Verlag (2004), described in Lecot et al., “Quasi-Monte Carlo Methods for Estimating Transient Measurements of Discrete Time Markov Chains, partly by the literature,” It is incorporated herein by reference as if set forth in its entirety.

以下の説明は、説明される技法が、いつ、なぜ、従来の手法よりも優れたものとなるかに関する、概念的フレームワークを提供する。説明される技法の有利な実施も説明される。 The following description provides a conceptual framework for when and why the described techniques will be superior to conventional approaches. An advantageous implementation of the described technique is also described.

２．１１次元での決定論的技法の分析
１次元問題についての決定論的技法が今から説明される。Ｌｅｃｏｔにおいて提示された技法は、初期分布

及び遷移行列

を有する離散状態空間Ｅ上で、マルコフ連鎖を同時にシミュレートする。基数ｂの（ｔ，２）−列

を使用して、Ｎ＝ｂ^ｍ個の連鎖が、並列にシミュレートされ、ここで、Ｘ_ｎ，ｌは、時間ステップｎにおける連鎖ｌの状態であり、

は、ゼロを含む自然数の集合である。さらに、当該技法は、ｍ＞ｔについて、Ｎ＝ｂ^ｍ個の後続要素ｘ_i，１が、基数ｂの（０，ｍ，１）−ネットを形成することを要求する。Ｌｅｃｏｔに示されるように、以下の基準が満たされる場合、当該技法は収束する。

2.1 Analysis of deterministic techniques in one dimension Deterministic techniques for one-dimensional problems will now be explained. The technique presented in Lecot uses an initial distribution

And transition matrix

A Markov chain is simultaneously simulated on a discrete state space E having (T, 2) -sequence of radix b

N = b ^m chains are simulated in parallel, where X _{n, l} is the state of chain l at time step n,

Is a set of natural numbers including zero. Further, the technique requires that N = b ^m subsequent elements x _{i, 1} form a (0, m, 1) -net of radix b for m> t. As shown in Lecot, the technique converges if the following criteria are met:

Ｘ_ｎ，ｌは、時間ステップｎにおける連鎖ｌの状態である。したがって、シミュレーションは、以下のように実行され得る。
ｎ：＝０
０≦ｌ＜Ｎについて、ｘ_ｌ，２を使用して、

を初期化する。
ループ
何らかの順序を使用して、状態ベクトルをソートする。
ｎ：＝ｎ＋１
０≦ｌ＜Ｎについて、

を使用して、

を計算することによって、連鎖を継続させる。 X _{n, l} is the state of chain 1 at time step n. Therefore, the simulation can be performed as follows.
n: = 0
For 0 ≦ l <N, use x _{l, 2}

Is initialized.
Loop Sort the state vector using some order.
n: = n + 1
For 0 ≦ l <N,

using,

The chain is continued by computing

Ｎ＝ｂ^ｍ個の連鎖の場合、ｍ＞ｔについて、ｂ^ｍ個の後続する要素ｘ_ｉ，１が、（０，ｍ，１）−ネットを形成するように、基数ｂの（ｔ，２）−列ｘ^ｉが選択される。 For N = ^{b m} pieces of chain, for m> t, ^{b m-number} of subsequent elements _{x i, 1} is, (0, m, 1) - so as to form a net, radix b (t, 2 )-Column x ⁱ is selected.

基数ｂ＝２のＳｏｂｏｌ’（０，２）−列（φ_２（ｌ），φ_Ｓ（ｌ））が、以下の要件を満たすことが分かる。
ｍ＝３＞ｔ＝０について、Ｎ＝２^３＝８となり、

について、

となる。 It can be seen that the Sobol ′ (0,2) -column (φ ₂ (l), φ _S (l)) of the radix b = 2 satisfies the following requirements.
For m = 3> t = 0, N = 2 ³ = 8,

about,

It becomes.

すべての置換は同一である。 All substitutions are identical.

２．１．１技法の簡略化
一様分布を仮定した場合、安定なソーティングは、状態の順序を変化させないことが分かる。実際、収束条件は無変化であり続けるので、すべての置換は省略することができる。したがって、上述の技法は、簡略化することができる。 2.1.1 Simplification of technique It can be seen that stable sorting does not change the order of the states, assuming a uniform distribution. In fact, since the convergence condition remains unchanged, all substitutions can be omitted. Thus, the technique described above can be simplified.

基数ｂ＝２の（０，２）−列であり、収束条件を維持するために必要とされる仮定を満たす、Ｓｏｂｏｌ’列

について検討する。元のＳｏｂｏｌ’列は、Ｉ．Ｓｏｂｏｌ’、「ＯｎｔｈｅＤｉｓｔｒｉｂｕｔｉｏｎｏｆＰｏｉｎｔｓｉｎａＣｕｂｅａｎｄｔｈｅＡｐｐｒｏｘｉｍａｔｅＥｖａｌｕａｔｉｏｎｏｆＩｎｔｅｇｒａｌｓ」、Ｚｈ．Ｖｙｃｈｉｓｌ．Ｍａｔ．ｉＭａｔ．Ｆｉｚ．、７（４）：７８４−８０２（１９６７）において定義されており、同文献は、その全体があたかも説明されているかのように、参照により本明細書に組み込まれる。

についての簡単なコード例は、Ｔ．Ｋｏｌｌｉｇ及びＡ．Ｋｅｌｌｅｒ、「ＥｆｆｉｃｉｅｎｔＭｕｌｔｉｄｉｍｅｎｓｉｏｎａｌＳａｍｐｌｉｎｇ」、ＣｏｍｐｕｔｅｒＧｒａｐｈｉｃｓＦｏｒｕｍ、２１（３）：５５７〜５６３（２００２年９月）において提供されており、同文献は、その全体があたかも説明されているかのように、参照により本明細書に組み込まれる。 A Sobol 'sequence that is a (0,2) -sequence of radix b = 2 and satisfies the assumptions needed to maintain convergence conditions

To consider. The original Sobol 'column contains I.D. Sobol ', "On the Distribution of Points in a Cube and the Apploximation Evaluation of Integrals", Zh. Vychisl. Mat. iMat. Fiz. 7 (4): 784-802 (1967), which is incorporated herein by reference as if set forth in its entirety.

A simple code example for T. Kollig and A.K. Keller, “Efficient Multidimensional Sampling”, Computer Graphics Forum, 21 (3): 557-563 (September 2002), which is incorporated by reference as if set forth in its entirety. Incorporated herein.

シミュレーション自体は、時間ステップｎ＝０で開始し、μの実現のため、ｘ_ｌ，２を使用して、０≦ｌ≦Ｎについて、状態

を初期化する。その後、当該技法は、状態をソートすること（以下のセクション３において詳細に説明される）によって続行され、Ｐに従った遷移を実現するため、０≦ｌ≦Ｎについて、

を使用して、

を計算することによって、連鎖を継続させる。遷移の次の状態を選択するためのインデックス

は、実際には、基数２のｖａｎｄｅｒＣｏｒｐｕｔ列Φ_２を使用し、これは、（０，１）−列であり、したがって、（０，ｍ，１）−ネットの系列である。例えば、ｍ＝３＞ｔ＝０を選択すると、Ｎ＝２^３＝８となり、

について、

となる。したがって、異なる時間ステップｎの間に使用されるすべてのインデックスは、実際には、すべてのｍについて同一である。 The simulation itself starts at time step n = 0 and uses x _{l, 2} for the realization of μ and states for 0 ≦ l ≦ N

Is initialized. The technique then continues by sorting the states (described in detail in Section 3 below), and for achieving a transition according to P, for 0 ≦ l ≦ N,

using,

The chain is continued by computing An index to select the next state of the transition

Actually uses a radix-2 van der Corp sequence [Phi] ₂ , which is a (0,1) -sequence and therefore a (0, m, 1) -net sequence. For example, if m = 3> t = 0, N = 2 ³ = 8,

about,

It becomes. Thus, all indexes used during different time steps n are actually the same for all m.

一様確率

を仮定すると、収束定理（ｃｏｎｖｅｒｇｅｎｃｅｔｈｅｏｒｅｍ）が依然として適用されるが、より重要な安定なソーティングは、状態の順序を変化させない。したがって、実際には、インデックス置換は、収束条件に触れることなく、恒等置換（ｉｄｅｎｔｉｔｙ）として選択できることになる。同じことが、初期状態Ｘ_０，ｌを選択するために適用され、それが、（０，２）−列ｘ_ｌの第２の要素だけを使用する、以下の簡略化された、しかし等価の技法をもたらす。
・ｎ：＝０
・０≦ｌ＜２^ｍについて、ｘ_ｌ，２を使用して、Ｘ_０，ｌを初期化する。
・ループ
− 適切な順序を使用して、状態ベクトルをソートする。
− ｎ：＝ｎ＋１
− ０≦ｌ＜２^ｍについて、Φ_Ｓ（ｌ＋ｎ・２^ｍ）を使用して、Ｘ_ｎ，ｌを計算することによって、連鎖を継続させる。 Uniform probability

, The convergence theorem still applies, but the more important stable sorting does not change the order of the states. Therefore, in practice, index replacement can be selected as identity replacement without touching the convergence condition. The same applies to select the initial state X _{0, l} , which uses only the second element of (0,2) -column x _l , the following simplified but equivalent Bring technique.
・ N: = 0
Initialize X _0,1 using x _{l, 2} for 0 ≦ l <2 ^m .
• Loop-Sort the state vector using the appropriate order.
-N: = n + 1
-For 0 ≦ l <2 ^m , continue the chain by calculating X _{n, l} using Φ _S (l + n · 2 ^m ).

２．１．２いつ、なぜ、それが機能するか
多くの応用例において観測された、当該方式の改善された収束は、Ｐに従ったＸ_ｎ，ｌの遷移を実現するために使用されるサンプル
Φ_Ｓ（ｌ＋ｎ・２^ｍ）
の構造によって引き起こされることが分かる。これは、暗黙の階層構造を露呈させる、根基逆関数（ｒａｄｉｃａｌｉｎｖｅｒｓｅ）の分解

によって理解することができ、Φ_Ｓ（ｌ）は、状態数ｌに依存する間隔１／（２^ｍ）を有するオフセットであり、一方、シフトΦ_Ｓ（ｎ）は、時間ステップｎにおけるすべての区間について同一である。図１１は、シフトΦ_Ｓ（ｎ）を図説する２次元配列７００を示している。 2.1.2 When and why it works The improved convergence of the scheme, observed in many applications, is used to achieve a transition of X _{n, l} according to P Sample Φ _S (l + n · 2 ^m )
It can be seen that it is caused by the structure of This is a decomposition of the radical inverse that exposes an implicit hierarchical structure.

Φ _S (l) is an offset with an interval 1 / (2 ^m ) depending on the number of states l, while the shift Φ _S (n) is the entire interval at time step n Are the same. FIG. 11 shows a two-dimensional array 700 illustrating the shift Φ _S (n).

ここで、ｌ＝０，．．．，２^ｍ−１についてのΦ_Ｓ（ｌ）の全体は、実際には、（０，ｍ，１）−ネットでなければならず、したがって、サンプルの等距離集合でなければならないので、低次元設定は、サンプルがシフトされた格子又は階層化されたサンプルであるという、誤解を招く解釈をもたらすことがある。しかし、良好な性能は、Φ_Ｓ（ｌ）が（ｔ，ｓ）−列であり、したがって、ｔ≦ｍ’≦ｍである任意のｍ’についての（ｔ，ｍ’，ｓ）−ネットの系列であるという特性から生じる。これは、状態空間において同様であり、したがって、ソーティング後の順序によって連なるｂ^ｍ’個の状態が、（ｔ，ｍ’，ｓ）−ネットによって遷移をサンプリングすることを意味し、それが、良好な離散密度近似を保証する。最大の改善は、２^ｍ個の連鎖すべてが同じ状態にある場合に獲得される。連鎖の状態が状態空間において離れていればいるだけ、性能改善は小さくなる。 Here, l = 0,. . . , 2 ^m −1, the whole Φ _S (l) must actually be a (0, m, 1) -net, and therefore must be an equidistant set of samples, so the low dimension The setting may lead to a misleading interpretation that the samples are shifted grids or layered samples. However, good performance is that of (t, m ′, s) -net for any m ′ where Φ _S (l) is in the (t, s) -sequence and therefore t ≦ m ′ ≦ m. Arises from the property of being a series. This is the same in the state space, and therefore means that b ^{m ′} states that are linked by post-sorting order sample transitions by the (t, m ′, s) -net, which is good Guarantees a discrete density approximation. Maximum improvement is obtained when all 2 ^m chains are in the same state. The better the chain states are in the state space, the smaller the performance improvement.

２．２ｓ次元での簡略化技法
（ｔ，ｍ，ｓ）−ネットの系列である、基底ｂの（ｔ，ｓ）−列を使用して、当該方式は、ｓ次元でも機能する。ソーティング後に状態が同様になるマルコフ連鎖は、低食い違い量サンプルによる遷移確率のサンプリングを保証される。ｓ次元での簡略化技法は、今では以下のような様相を呈する。
・ｎ：＝０
・準モンテカルロ点ｘ_ｌを使用して、Ｘ_０，ｌを初期化する。
・ループ
− 適切な順序を使用して、状態ベクトルをソートする。
− ｎ：＝ｎ＋１
− （ｔ，ｓ）−列からの後続のサンプルｘ_ｌを使用して、Ｘ_ｎ，ｌを計算することによって、連鎖を継続させる。 2.2 Simplification technique in s dimension Using the (t, s) -sequence of base b, which is a sequence of (t, m, s) -net, the scheme also works in s dimension. A Markov chain whose states are similar after sorting is guaranteed to sample transition probabilities with low discrepancy samples. The simplification technique in the s dimension now has the following aspect.
・ N: = 0
Initialize X _{0, l} using the quasi-Monte Carlo point _xl .
• Loop-Sort the state vector using the appropriate order.
-N: = n + 1
- (t, s) - using subsequent samples _{x l} from _{column, X n,} by calculating the _l, to continue the chain.

いくつかのシミュレーションは、ある局所的な微妙な効果を獲得するために、軌道分割を必要とする。これは、より複雑な手法を使用して、すでに対処されていたが、実際には、分割される状態について、単に（ｔ，ｓ）−列からより多くのサンプルを取り出すことによって、より簡単な方法で達成することができる。 Some simulations require trajectory splitting to obtain some local subtle effects. This has already been addressed using a more complex approach, but in practice it is simpler for the state to be split by simply taking more samples from the (t, s) -column. Can be achieved in a way.

これは、正確にｂ^ｍ個の連鎖を同時にシミュレートする必要すらないという事実の結果である。最大性能利得を可能にするためには、（ｔ，ｓ）−列から後続のサンプルを取り出すこと、及び後続の（ｔ，ｍ，ｓ）−ネット内の点の数ｂ^ｍを最小化することだけが重要である。しかし、（ｔ，ｓ）−列の選択は、（０，ｓ）−列はｂ≧ｓの場合にのみ存在し、ｍ＞ｔであるという条件によって制限される。 This is a result of the fact that it is not necessary to simulate exactly b ^m chains simultaneously. To enable maximum performance gain, take subsequent samples from the (t, s) ^-sequence and minimize the number of points b ^m in the subsequent (t, m, s) -net Only is important. However, the selection of the (t, s) -column is limited by the condition that the (0, s) -column exists only if b ≧ s and m> t.

Ｈａｌｔｏｎ列又はそのスクランブルされた変形などの他の根基逆変換（ｒａｄｉｃａｌｉｎｖｅｒｓｉｏｎ）ベースの点集合は、（ｔ，ｓ）−列と同様の特性を満たし、同様の性能利得をもたらすことに留意されたい。図１２は、Ｈａｌｔｏｎ列、点ｘ_ｌ＝（φ_２（ｌ），φ_３（ｌ））を図説するグラフ７２０を示している。示されたＨａｌｔｏｎ列では、根基逆変換は、最大の「穴」に落ち込む後続の点を有する、階層構造を暗示している。この特性は、任意の開始点にとって好適であり、（ｔ，ｓ）−列も、同じ特性を有する。 Note that other radical inversion-based point sets, such as the Halton sequence or its scrambled variants, satisfy similar characteristics as the (t, s) -sequence and yield similar performance gains. . FIG. 12 shows a graph 720 illustrating the Halton row, the point x _l = (φ ₂ (l), φ ₃ (l)). In the Halton column shown, the root inverse transform implies a hierarchical structure with subsequent points falling into the largest “hole”. This property is suitable for any starting point, and the (t, s) -column has the same property.

２．３ランダム化
より高次元における決定論的技法の収束についての一般的証明は現在存在していないが、当該技法は、各時間ステップｎにおいて準モンテカルロ点を新たにランダム化することによって不偏になる。実際には、これはパディングされた複製のサンプリングの場合であるので、不偏性についての議論はより簡単になる。しかし、ランダム化は、特別な注意を払うに値する。 2.3 Randomization Although there is currently no general proof about the convergence of deterministic techniques in higher dimensions, the technique is unbiased by newly randomizing quasi-Monte Carlo points at each time step n. Become. In practice, this is the case for padded replica sampling, so the discussion of unbiasedness is simpler. However, randomization deserves special attention.

最も効率的な実施は、基底ｂ＝２の（ｔ，ｓ）−列を選択し、それから後続のサンプルを取り出し、そのサンプルにｓ次元ランダムベクトルによるＸＯＲ演算を施すことから成る。このランダムベクトルは、各遷移ステップの後、新たに取り出される。しかし、ランダムなスクランブリングは、点が列挙される順序を変化させるので、（ｔ，ｍ，ｓ）−ネットの系列の局所的特性も、変化させられる。 The most efficient implementation consists of selecting the (t, s) -column with basis b = 2, then taking the subsequent sample, and subjecting that sample to an XOR operation with an s-dimensional random vector. This random vector is newly extracted after each transition step. However, random scrambling changes the order in which the points are listed, so the local characteristics of the (t, m, s) -net sequence are also changed.

この結果は、これまでの手法を使用した場合に見られる効果のいくつかについての説明として利用することができる。配列（Ｒ）ＱＭＣシミュレーションで使用されるＳｏｂｏｌ点及びＫｏｒｏｂｏｖ点は、分散低減の１０倍までは、その変換された等価物（Ｓｏｂｏｌに対するグレイコード、Ｋｏｒｏｂｏｖに対するＢａｋｅｒ変換）よりも悪い。これについての説明は、点の構造において見出される。Ｓｏｂｏｌ列から抽出される（ｔ，ｍ，ｓ）−ネットの系列は、局所的には、そのグレイコード変形よりも悪い。同じことが、Ｋｏｒｏｂｏｖ格子とそれを変換した変形についても言える。 This result can be used as an explanation for some of the effects seen using previous approaches. The Sobol and Korovov points used in array (R) QMC simulations are worse than their transformed equivalents (Gray code for Sobol, Baker transform for Korovov) up to 10 times the variance reduction. An explanation for this is found in the dot structure. The (t, m, s) -net sequence extracted from the Sobol sequence is locally worse than its Gray code deformation. The same is true for the Korovov lattice and its transformed version.

２．４配列ランダム化準モンテカルロ
以下は、ｓ次元問題のためのランダム化技法である。
ｎ：＝０、点列ｘ_ｌを選択する。
ランダム化をインスタンス化する。
０≦ｌ＜Ｎについて、ｒａｎｄ（ｘ_ｌ）を使用して、Ｘ_０，ｌを初期化する。
ループ
・適切な順序を使用して、状態ベクトルをソートする。
・ｎ：＝ｎ＋１
・ランダム化をインスタンス化する。
・０≦ｌ＜Ｎについて、ｒａｎｄ（ｘ_ｌ）を使用して、Ｘ_ｎ，ｌを計算することによって、連鎖を継続させる。 2.4 Array Randomization Quasi-Monte Carlo The following is a randomization technique for the s-dimensional problem.
n: = 0, point sequence _xl is selected.
Instantiate randomization.
Initialize X _{0, l} using rand (x _l ) for 0 ≦ l <N.
Loop • Sort the state vector using the appropriate order.
・ N: = n + 1
• Instantiate randomization.
Continue the chain by computing X _{n, l} using rand (x _l ) for 0 ≦ l <N.

当該技法は、多くの興味深い特性を有する。第１に、当該技法は、ｘ_ｌの表にまとめられた１つの集合（ｏｎｅｔａｂｕｌａｔｅｄｓｅｔ）について、及びｘ_ｌの連続ストリームについて不偏である。第２に、ストリームを使用する場合、分割及び吸収は本質的である。これは、ロシアンルーレット方式において真実であり、ｘ_ｌからより多くの連続サンプルを取ることによって分割する場合にも真実である。 The technique has many interesting properties. First, the technique, one set was summarized in the table _{x l} for (one tabulated set), and a unbiased for continuous stream of _{x l.} Second, splitting and absorption are essential when using streams. This is true in Russian roulette system, it is also true in the case of dividing by taking more consecutive samples from x _l.

３．光輸送シミュレーション
数値的な証拠のため、上述の技法が、リアリスティックな画像を合成するための光輸送シミュレーションに適用される。基礎をなす積分方程式は、経路積分として定式化し直すことができる。 3. Light Transport Simulation For numerical evidence, the technique described above is applied to light transport simulation to synthesize realistic images. The underlying integral equation can be reformulated as a path integral.

図１３は、双方向経路トレーシングによってサンプリング輸送経路空間７４０を示す図である。目７４２からシーン物体７４６までの軌道と光源７４４からシーン物体７４６までの軌道とは、マルコフ連鎖によって生成され、輸送される光の量を決定するために接続される。図１３に示されるサンプリング経路空間は、マルコフ連鎖をシミュレートすることに対応し、経路は、レイトレーシング及び散乱事象によって確立される。初期分布は、光源７４４の放出特性によって決定され、遷移確率は、衝突表面上における双方向反射分布関数によって与えられる。 FIG. 13 is a diagram showing a sampling transport route space 740 by bidirectional route tracing. The trajectory from eye 742 to scene object 746 and the trajectory from light source 744 to scene object 746 are generated by a Markov chain and connected to determine the amount of light transported. The sampling path space shown in FIG. 13 corresponds to simulating a Markov chain, where the path is established by ray tracing and scattering events. The initial distribution is determined by the emission characteristics of the light source 744, and the transition probability is given by the bidirectional reflection distribution function on the impact surface.

経路積分を解くため、少なくとも２つの基本戦略が存在し、それらは、高次元低食い違い量点を使用するか、又は低次元低食い違い量点をパディングする。後者の手法は、上述の発見に適合しており、高次元事象は、マルコフ連鎖の後続する遷移として構成される。計測してみると、マルコフ連鎖シミュレーションのために高次元系列を使用することに対する差はないと言えるほど小さいが、パディングされた系列の使用は、計算的により高速であり、よりわずかな実施上の労力しか必要としない。それは、レンダリング業界の実務者にとってもより簡単である。 To solve the path integral, there are at least two basic strategies that use high-dimensional low discrepancy points or pad low-dimensional low discrepancy points. The latter approach is compatible with the above discovery, where high-dimensional events are configured as subsequent transitions of Markov chains. When measured, the use of high-dimensional sequences for Markov chain simulations is small enough that there is no difference, but the use of padded sequences is computationally faster and requires less implementation. Only effort is required. It is even easier for practitioners in the rendering industry.

加えて、（ｔ，ｓ）−列又はＨａｌｔｏｎ列及びそのスクランブルされた変形の階層構造特性は、上で説明されたように、次元が小さい場合にはるかに良好になるので、低次元手法は、はるかに良好な結果を可能にする。 In addition, since the hierarchical properties of the (t, s) -sequence or Halton sequence and its scrambled variants are much better when the dimension is small, as explained above, the low-dimensional approach is Allows much better results.

図１４Ａ〜Ｂ及び図１５Ａ〜Ｂは、モンテカルロ法を使用するシミュレーションと配列ランダム化準モンテカルロ（Ａ−ＲＱＭＣ）法を使用するシミュレーションの間の差を示した、レンダリングされた画像７６０、７７０、７８０、７９０である。モンテカルロ法を使用するシミュレーションは、図１４Ａ及び図１５Ａの画像７６０及び７８０に示されている。配列ランダム化準モンテカルロ法を使用するシミュレーションは、図１４Ｂ及び図１５Ｂの画像７７０及び７９０に示されている。各レンダリング画像の上部の矢印７６２、７７２、７８２、７９２は、放出位置及び方向を示している。図１４Ａ〜Ｂ及び図１５Ａ〜Ｂから、これらの例では、配列ランダム化ＱＭＣ法を使用してシミュレートされた光路は、モンテカルロ法を使用してシミュレートされた光路よりも、はるかに一様に分布していることが分かる。 14A-B and FIGS. 15A-B show rendered images 760, 770, 780 showing the difference between the simulation using the Monte Carlo method and the simulation using the array randomized quasi-Monte Carlo (A-RQMC) method. 790. Simulations using the Monte Carlo method are shown in images 760 and 780 in FIGS. 14A and 15A. Simulations using the array randomized quasi-Monte Carlo method are shown in images 770 and 790 in FIGS. 14B and 15B. Arrows 762, 772, 782, 792 at the top of each rendered image indicate the emission position and direction. 14A-B and 15A-B, in these examples, the optical path simulated using the array randomized QMC method is much more uniform than the optical path simulated using the Monte Carlo method. It can be seen that they are distributed.

３．１Ｆｒｅｄｈｏｌｍ積分及びＶｏｌｔｅｒｒａ積分
光輸送の基礎をなす積分方程式は、Ｆｒｅｄｈｏｌｍ積分方程式又はＶｏｌｔｅｒｒａ積分方程式と見なすことができるが、そのことは実施にとって重要となる。 3.1 Fredholm integral and Volterra integral The integral equation underlying the light transport can be regarded as a Fredholm integral equation or a Volterra integral equation, which is important for implementation.

は、点ｘにおいて方向ωの方に表面から反射された放射であり、積分演算子の定義域

は、ｘにおける法線ｎ（ｘ）と軸平行な半球である。

Is the radiation reflected from the surface in the direction ω at the point x, and the domain of the integral operator

Is a hemisphere that is axially parallel to the normal n (x) at x.

図１６Ａ〜Ｄは、光源上の複数の開始点８０４、８１４、８２４、８３４から開始させた光子軌道８０２、８１２、８２２、８３２を示した、一連の図８００、８１０、８２０、８３０である。光線をトレースした後、表面８０６、８１６、８２６、８３６に衝突したとき、散乱の方向を決定して経路を継続するために、双方向反射分布関数がサンプリングされる。 16A-D are a series of diagrams 800, 810, 820, 830 showing photon trajectories 802, 812, 822, 832 starting from a plurality of starting points 804, 814, 824, 834 on the light source. After tracing the ray, when it strikes the surface 806, 816, 826, 836, the bi-directional reflection distribution function is sampled to determine the direction of scattering and continue the path.

ｆ_ｒは、光表面特性（ｏｐｔｉｃａｌｓｕｒｆａｃｅｐｒｏｐｅｒｔｙ）を記述する双方向反射分布関数であり、Ｌは、入射放射である。この積分演算子の使用は、積分定義域がｘに依存するような、第２種のＶｏｌｔｅｒｒａ積分方程式

をもたらし、他方で、ｘとは独立に単位球Ｓ^２の全方向にわたって積分が実行されるような、第２種のＦｒｅｄｈｏｌｍ積分方程式をもたらす。 _fr is a bi-directional reflection distribution function describing the optical surface property and L is the incident radiation. The use of this integral operator is the second kind of Volterra integral equation where the integral domain depends on x.

On the other hand, a Fredholm integral equation of the second kind is obtained such that the integration is performed in all directions of the unit sphere S ² independently of x.

全方向を生成し、表面法線ｎ（ｘ）に対して負のスカラ積を有する方向を排除するという後者の手法の使用は、計算的に魅力があるが、第１の手法は、局所的な表面フレーム（ｓｕｒｆａｃｅｆｒａｍｅ）に変換されなければならない、半球における方向の生成を必要とし、これはよりコスト高になる。単位球からＳ^２又は

への写像は、当技術分野において知られている。 Although the use of the latter approach of generating all directions and eliminating directions with negative scalar products relative to the surface normal n (x) is computationally attractive, the first approach is locally Requires the generation of directions in the hemisphere, which must be converted to a smooth surface frame, which is more costly. ^{S 2} or from the unit sphere

The mapping to is known in the art.

全方向を生成するためにはるかに重要な議論は、これから説明される手法に関連する。ソーティングによって、例えば角にある、異なる表面法線を有する２つのすぐ近くの表面点が、散乱方向を生成するために、（ｔ，ｓ）−列の後続のサンプルを使用することが起こり得る。全方向の生成は、今は良好に機能するが、後続のサンプルを使用する、２つの異なる局所的フレームにおける方向の生成は、低食い違い量性を打ち壊す。これらの不連続性は、はっきりと目に見えるようになる。しかし、全方向の使用は、ｆ_ｒによるインポータンス（ｉｍｐｏｒｔａｎｃｅ）のサンプリング、又はコサイン項を可能にせず、それはしばしば不都合であり、さらなる研究に値する。 A much more important argument for generating omnidirectional is related to the approach to be described. By sorting, it can happen that two nearby surface points with different surface normals, for example at the corners, use subsequent samples in the (t, s) -sequence to generate the scattering direction. Although omnidirectional generation now works well, directional generation in two different local frames using subsequent samples breaks down the low discrepancy. These discontinuities become clearly visible. However, the use of all directions, the sampling of importance by _{f r} (importance), or does not allow the cosine term, it is often inconvenient, deserves further study.

４．ソーティング技法及びシステム
上述の技法の効率を増大させるために、ソーティングを使用することが可能である。本明細書で説明されるように、（ｔ，ｓ）−列は、基底ｂの（ｔ，ｍ，ｓ）−ネットの系列であり、Ｈａｌｔｏｎ列及びその変形に類似している。類似の状態Ｘ_ｎ，ｌは、連続する点を使用する。したがって、状態は、状態空間における近接性によって順序づけられるべきである。状態空間において状態がより近いほど、状態空間はより一様に探索される。 4). Sorting Techniques and Systems Sorting can be used to increase the efficiency of the techniques described above. As described herein, a (t, s) -sequence is a sequence of (t, m, s) -nets of base b and is similar to the Halton sequence and its variants. A similar state X _{n, l} uses successive points. Thus, the states should be ordered by proximity in the state space. The closer the state is in the state space, the more uniformly the state space is searched.

状態を互いに可能な限り近くするため、状態は、近隣の状態の距離の和が最小になるような順序で列挙されなければならない。これは、実際には、巡回セールスマン問題に関係するが、この問題では、与えられた都市の集合と、１つの都市から別の都市まで旅をするコストに対して、各都市を正確に１回訪れた後、出発した都市に戻る、最も安価な巡回旅行が求められる。この問題はすでに、１８世紀初頭にオイラーによって研究され、残念なことに、ＮＰ困難であることを示すことができる。 To make the states as close as possible to each other, the states must be listed in an order that minimizes the sum of the distances of neighboring states. This is actually related to the traveling salesman problem, where each city is exactly 1 for a given set of cities and the cost of traveling from one city to another. After the visit, the cheapest patrol trip is required to return to the departure city. This problem has already been studied by Euler in the early 18th century and unfortunately can be shown to be NP-hard.

われわれの問題は、ルートの最後の状態から最初の状態に戻る必要がないことを除いて、非常に類似している。近似ではあるが最適に近い、巡回セールスマン問題の解を効率的に計算するために、いくつかの技法がすでに利用可能である。しかし、これらの技法の実行時間は、距離行列の計算だけでもＯ（Ｎ^２）の計算量を示すので、シミュレーションでは許容可能ではないが、古典的なモンテカルロ法のＯ（Ｎ）の計算量に可能な限り近く、技法を維持することが望ましい。 Our problem is very similar except that it is not necessary to return from the last state of the route to the first state. Several techniques are already available to efficiently compute a solution to the traveling salesman problem that is an approximation but close to optimal. However, since the execution time of these techniques shows the amount of calculation of O (N ² ) only by calculating the distance matrix, it is not acceptable in the simulation, but the amount of calculation of O (N) in the classical Monte Carlo method It is desirable to maintain the technique as close as possible.

以下では、高次元状態空間用の高速ソーティングを達成するための、本発明の態様による多くの順序が説明される。 In the following, a number of sequences according to aspects of the present invention to achieve fast sorting for a high dimensional state space will be described.

４．１状態のノルム
クイックソートの平均計算量は、Ｏ（ＮｌｏｇＮ）であるが、あるシナリオの場合、基数ソート（ｒａｄｉｘｓｏｒｔ）など、Ｏ（Ｎ）の技法すら存在し、しかし、それは追加の一時メモリを必要とする。これらの１次元ソーティング技法を使用するため、多次元状態は、１次元まで引き下げられなければならない。多くの選択肢の中で、しばしば、何らかのノルム

が、状態空間上で順序を定義するために使用される。しかし、類似のノルムが、必ずしも状態空間における近接性を示すわけではない。これについての簡単な例は、輸送シミュレーションにおける、空間内で遠く離れて配置された粒子の類似のエネルギーである。 4.1 Norm of states The average complexity of quicksort is O (NlogN), but in some scenarios there are even O (N) techniques, such as radix sort, but it adds Requires temporary memory. In order to use these one-dimensional sorting techniques, the multidimensional state must be reduced to one dimension. Of many choices, often some norm

Are used to define the order on the state space. However, similar norms do not necessarily indicate proximity in state space. A simple example of this is the similar energy of particles located far away in space in a transport simulation.

４．２空間階層
多次元ソーティングを可能にする第２の可能性は、状態上で順序を定義するための、空間階層の使用である。この目的のために効率的なデータ構造は、ＢＳＰツリー、その特殊な軸平行サブセット、ｋＤツリー、又は境界ボリューム階層である。そのようなバイナリ階層の構築は簡単である。空間は、何らかの発見的方法によって選択された平面を使用して、再帰的に細分される。構築は、平均でＯ（ＮｌｏｇＮ）で実行される。階層の順序正しい横断が、近接性の順序でリーフを列挙する。この横断は、ツリーが左平衡化され、その結果、配列内に保存できる場合、自明なものとなる。 4.2 Spatial Hierarchy A second possibility to enable multidimensional sorting is the use of a spatial hierarchy to define order on the state. An efficient data structure for this purpose is a BSP tree, a special axis-parallel subset thereof, a kD tree, or a boundary volume hierarchy. Building such a binary hierarchy is simple. The space is recursively subdivided using a plane selected by some heuristic method. The construction is performed with O (NlogN) on average. Hierarchical traversal enumerates leaves in order of proximity. This traversal is trivial if the tree is left-balanced so that it can be stored in an array.

例えばレイトレーシングを加速するために、空間階層がとにかく使用されなければならない場合、階層のための付加的な構築時間は存在しない。その場合、階層のリーフに結びつけられたリンクリストとして、粒子が保存される。 If the spatial hierarchy has to be used anyway, for example to accelerate ray tracing, there is no additional construction time for the hierarchy. In that case, the particles are stored as a linked list associated with the leaves of the hierarchy.

図１７Ａ〜Ｇ、図１８Ａ〜Ｂ、及び図１９Ａ〜Ｃは、空間階層のリーフへの状態の分類が、階層を順序正しく横断することによって、近接性の順序を定義することを示した、一連の図である。 FIGS. 17A-G, 18A-B, and 19A-C show that the classification of states into leaves of a spatial hierarchy defines the order of proximity by traversing the hierarchy in order. FIG.

図１７Ａは、光源８５０と、レンダリングされる物体８６０とを含む、例示的なシーン８４０を示している。物体８６０は、五頂点の星形をしている。図１７Ｂでは、物体８６０は、軸平行境界ボックス８７０によって境界を定められる。図１７Ｃ〜Ｇでは、境界ボックス８７０が、再帰的にブランチに細分される。各ブランチは、リーフで終了する。 FIG. 17A shows an exemplary scene 840 that includes a light source 850 and an object 860 to be rendered. The object 860 has a star shape with five vertices. In FIG. 17B, the object 860 is bounded by an axis parallel bounding box 870. 17C-G, the bounding box 870 is recursively subdivided into branches. Each branch ends with a leaf.

図１７Ｃでは、境界ボックス８７０が、物体８６０の第１の頂点８６１を通過する分割平面８７１によって、第１世代の左ブランチＬと第１世代の右ブランチＲとに分割される。 In FIG. 17C, the bounding box 870 is divided into a first generation left branch L and a first generation right branch R by a dividing plane 871 passing through the first vertex 861 of the object 860.

図１７Ｄでは、左ブランチＬが、物体８６０の第２の頂点８６２を通過する第２の分割平面８７２によって分割され、第２世代の左ブランチＬＬと第２世代の右ブランチＬＲとを生じさせる。 In FIG. 17D, the left branch L is split by a second split plane 872 that passes through the second vertex 862 of the object 860, resulting in a second generation left branch LL and a second generation right branch LR.

図１７Ｅでは、第３の分割平面８７３が、物体８６０の第３の頂点８６３を通過し、ブランチＬＲを第３世代の左ブランチＬＲＬと右ブランチＬＲＲとに分割する。 In FIG. 17E, the third dividing plane 873 passes through the third vertex 863 of the object 860 and divides the branch LR into the third generation left branch LRL and right branch LRR.

図１７Ｆでは、第４及び第５の分割平面８７４及び８７５が、物体８６０の第４及び第５の頂点８６４及び８６５を通過する。図１７Ｆに示されるように、結果は、星形８６０の５つの点の各々に対して１つの、５つのリーフＬＲＬ、ＬＲＲ、ＲＲ、ＬＬＬ、ＬＬＲ、及びＲＬである。 In FIG. 17F, the fourth and fifth split planes 874 and 875 pass through the fourth and fifth vertices 864 and 865 of the object 860. As shown in FIG. 17F, the result is five leaves LRL, LRR, RR, LLL, LLR, and RL, one for each of the five points of star 860.

図１７Ｇは、各リーフ内の個々の点の照明を示している。図１８Ａでは、各リーフは、その明るさを示すために陰影づけされている。図１８Ｂでは、リーフは、それぞれの明るさに従って、階層的に順序づけられる。「最も暗い」から「最も明るい」まで、リーフは、以下のように、ＬＲＬ、ＬＲＲ、ＬＬＲ、ＲＲ、ＲＬの順に並べられる。 FIG. 17G shows the illumination of individual points within each leaf. In FIG. 18A, each leaf is shaded to show its brightness. In FIG. 18B, the leaves are hierarchically ordered according to their brightness. From “darkest” to “lightest”, the leaves are arranged in the following order: LRL, LRR, LLR, RR, RL.

残念ながら、この順序の品質は、シミュレーション用に使用される空間階層の品質によって強く決定され、そのことは、階層内のリーフの数が連鎖Ｎの数よりもはるかに少ない場合、これは複数の状態が同じリーフにマッピングされることを引き起こすので、特に問題となる。 Unfortunately, the quality of this order is strongly determined by the quality of the spatial hierarchy used for the simulation, which means that if the number of leaves in the hierarchy is much less than the number of chains N, this This is especially problematic because it causes the state to map to the same leaf.

４．３バケットソーティング及び空間充填曲線
線形時間の計算量を保証するため、バケットソーティングが使用できる。セクション２．１で略述された簡単な技法のｓ次元への拡張では、多次元の状態が、状態の第１の次元によってバケットに分類され、その後、各バケットの状態が、第２の次元に従ってバケットに分類され、以降も同様に行われた。この手順は、うまく機能するが、状態空間内で近い状態がソーティング手順によって引き離され得るという問題を有する。加えて、各次元の階層構造が使用されなければならず、それが、同時にシミュレートされるマルコフ連鎖の数において、次元の呪い（ｃｕｒｓｅｏｆｄｉｍｅｎｓｉｏｎ）を引き起こす。 4.3 Bucket sorting and space filling curves Bucket sorting can be used to guarantee linear time complexity. In the extension of the simple technique outlined in section 2.1 to the s dimension, multi-dimensional states are classified into buckets by a first dimension of states, after which the state of each bucket is converted to a second dimension. According to the above, it was classified into buckets and so on. This procedure works well, but has the problem that states that are close in the state space can be separated by the sorting procedure. In addition, the hierarchical structure of each dimension must be used, which causes a curse of dimension in the number of Markov chains that are simulated simultaneously.

したがって、状態空間は、バケットとして役立つ、等しいボリュームセル即ち「ボクセル」になる。各状態のバケットは、ボクセルグリッドの解像度に従って、状態座標を切り捨てることによって見出される。これは、連続状態空間にも、離散状態空間にも適用されることに留意されたい。近接性によるボクセルの列挙は、状態に所望の順序をもたらし、ボクセルの数に対する線形時間で実行できる。 Thus, the state space becomes an equal volume cell or “voxel” that serves as a bucket. Each state bucket is found by truncating the state coordinates according to the resolution of the voxel grid. Note that this applies to both continuous and discrete state spaces. The enumeration of voxels by proximity brings the desired order to the state and can be performed in linear time for the number of voxels.

近接性によってボクセルを列挙する順序は、ペアノ曲線、ヒルベルト曲線、又はＨインデクシング（Ｈ−ｉｎｄｅｘｉｎｇ）などの空間充填曲線によって与えられる。これらの曲線は、すべてのボクセルが正確に１度だけ訪れられ、全体の経路長が相対的に短いことを保証する。われわれ自身のシミュレーションにおける場合である、大きな形状に関連する問題にとって、これは、巡回セールスマン問題に対する高速でメモリ効率のよい近似解を生成するための、数少ない可能性の１つにすらなり得る。しかし、これらの曲線は、むしろ評価するのにコストがかかり、効率的であるために表にまとめられる必要があり、又はより高い次元に対しては利用可能でない。 The order in which voxels are listed by proximity is given by a space filling curve such as a Peano curve, a Hilbert curve, or H-indexing. These curves ensure that all voxels are visited exactly once and the overall path length is relatively short. For problems related to large shapes, which is the case in our own simulations, this can even be one of the few possibilities for generating fast, memory-efficient approximate solutions to the traveling salesman problem. However, these curves are rather expensive to evaluate and need to be tabulated to be efficient, or are not available for higher dimensions.

幸い、ルベーグ曲線又はモートン順序（Ｍｏｒｔｏｎｏｒｄｅｒ）としても知られるＺ曲線は、これらの問題を回避する。多次元空間においてバケットの整数座標を与えると、その１次元Ｚ曲線インデックスが、図１９Ａ〜Ｃに示されるように、座標値をビット単位にインタリーブすることによって、簡単に計算される。 Fortunately, the Z curve, also known as Lebesgue curve or Morton order, avoids these problems. Given the integer coordinates of a bucket in a multidimensional space, its one-dimensional Z-curve index is simply calculated by interleaving the coordinate values bit by bit, as shown in FIGS.

図１９Ａ〜Ｃは、２×２（図１９Ａ）、４×４（図１９Ｂ）、及び１６×１６（図１９Ｃ）のバケットについて、２次元のＺ曲線８９１、８９３、８９５を示した、一連の図８９０、８９２、８９４を示している。左上を原点（０，０）とすると、×によって印を付けられた点は、整数座標（３，４）を有し、それは、２進法の（０１１，１００）_２に対応する。そのバイナリＺ曲線インデックス１００１０１_２は、バイナリ座標をビット単位にインタリーブすることによって計算される。 19A-C show a series of two-dimensional Z-curves 891, 893, 895 for 2 × 2 (FIG. 19A), 4 × 4 (FIG. 19B), and 16 × 16 (FIG. 19C) buckets. 890, 892, and 894 are shown. If the upper left is the origin (0, 0), the point marked by x has integer coordinates (3, 4), which corresponds to binary (011, 100) ₂ . Its binary Z curve index 100101 ₂ is calculated by interleaving the binary coordinates in bit units.

これは、任意の次元及び問題サイズについて、計算が容易で高速であり、追加的なメモリを必要としない。その結果は、全体的な文脈では、例えばヒルベルト曲線ほど良好でないことが分かった。しかし、平均的な場合の分割品質、及び平均的な場合／最悪の場合の対数インデックス範囲は、同程度に良好である。問題は、数値的実験用に使用された、図２０Ａに示されたＳｈｉｒｌｅｙの「Ｓｃｅｎｅ６」のような対称性の高いシーンにおいて生じる。（ｘ，ｙ）平面に平行な壁上の状態は、非常に良好にソートされるが、（ｙ，ｚ）平面に平行な他の２つの壁上に配置された状態は、曲線によって交互に訪れられ、それが、いくつかのシナリオでは、相関アーチファクトをもたらし得る。 This is easy and fast to compute for any dimension and problem size and does not require additional memory. The results were found not to be as good in the overall context as for example the Hilbert curve. However, the average split quality and the average / worst case log index range are equally good. The problem arises in highly symmetric scenes, such as Shirley's “Scene 6” shown in FIG. 20A, used for numerical experiments. The states on the wall parallel to the (x, y) plane are sorted very well, but the states placed on the other two walls parallel to the (y, z) plane are alternated by the curves. Visited, which may lead to correlation artifacts in some scenarios.

５．数値的な証拠
上述の説明に続いて、不偏誤差推定を得るために、Ｃｒａｎｌｅｙ−Ｐａｔｔｅｒｓｏｎ回転によってランダム化された、Ｆａｕｒｅによる置換を行った、Ｈａｌｔｏｎ列が使用された。ソーティングについては、Ｚ曲線順序が、特定のセッティングに対して最良の結果をもたらし、以下の実験のために使用された。ランダム化の省略は結果の精度に顕著な影響をもたないことが数値的に検証されたことにさらに留意されたい。数値的な実験では、マルコフ連鎖をシミュレートするための４つの手法が比較された。 5. Numerical Evidence Following the discussion above, a Halton sequence, randomized by Cranley-Patterson rotation, replaced by Faure, was used to obtain an unbiased error estimate. For sorting, the Z curve order gave the best results for a particular setting and was used for the following experiments. It is further noted that the omission of randomization has been numerically verified that it has no significant effect on the accuracy of the results. In numerical experiments, four approaches for simulating Markov chains were compared.

ＭＣ：メルセンヌツイスタ（ＭｅｒｓｅｎｎｅＴｗｉｓｔｅｒ）によって生成された一様乱数が、古典的モンテカルロサンプリング用に使用された。 MC: Uniform random numbers generated by Mersenne Twister were used for classical Monte Carlo sampling.

ＲＱＭＣ：Ｃｒａｎｌｅｙ−Ｐａｔｔｅｒｓｏｎ回転によってランダム化された、Ｆａｕｒｅによる置換を行った、高次元Ｈａｌｔｏｎ列が使用され、要素のペアが、２次元放出及び散乱事象をサンプリングするために使用された。 A high-dimensional Halton sequence, randomized by RQMC: Cranley-Patterson rotation, with substitution by Faure, was used, and pairs of elements were used to sample two-dimensional emission and scattering events.

ｌｏ−ｄｉｍＲＱＭＣＳ：Ｃｒａｎｌｅｙ−Ｐａｔｔｅｒｓｏｎ回転によってランダム化された、２次元Ｈａｌｔｏｎ列が使用された。Ｚ曲線が、バケットソーティングされた状態を列挙するために使用された。 lo-dim RQMCS: A two-dimensional Halton sequence randomized by Cranley-Patterson rotation was used. The Z curve was used to enumerate the bucket sorted state.

ｈｉ−ｄｉｍＲＱＭＣＳ：Ｃｒａｎｌｅｙ−Ｐａｔｔｅｒｓｏｎ回転によってランダム化された、Ｆａｕｒｅによる置換を行った、高次元Ｈａｌｔｏｎ列が使用された。Ｚ曲線が、バケットソーティングされた状態を列挙するために使用された。 hi-dim RQMCS: A high-dimensional Halton sequence randomized by Cranley-Patterson rotation, with substitution by Faure, was used. The Z curve was used to enumerate the bucket sorted state.

第１の実験では、経路積分を計算するために、強い全体的な照明技法が使用された。図２０Ａは、相対的に簡単な光輸送設定である、テストシーン「Ｓｈｉｒｌｅｙ６」を示し、図２１Ａは、より複雑な光輸送設定である、テストシーン「ＩｎｖｉｓｉｂｌｅＤａｔｅ」を示している。図２０Ｂ及び図２１Ｂは、マスタ解に対する平均ＲＭＳ誤差を示す、グラフ９０１及び９１１であり、図２０Ｃ及び図２１Ｃは、（画像全体にわたって平均化された）大域的分散を示す、グラフ９０２及び９１２であり、図２０Ｄ及び図２１Ｄは、ピクセルベースの分散を示す、グラフ９０３及び９１３である。数値は、マルコフ連鎖の数を変化させた、１０回の独立実行を平均することによって獲得された。測定された数値は、簡単なテストシーンについてのみ説得力がある。複雑なケースでさえ、独立実行の回数が少なすぎるため、モンテカルロサンプリングを上回る性能利得は測定できない。より多くの実験は、過大な実行時間のせいで可能でなかった。 In the first experiment, a strong global illumination technique was used to calculate the path integral. FIG. 20A shows a test scene “Shirley 6”, which is a relatively simple light transport setting, and FIG. 21A shows a test scene “Invisible Date”, which is a more complicated light transport setting. 20B and 21B are graphs 901 and 911 showing the average RMS error for the master solution, and FIGS. 20C and 21C are graphs 902 and 912 showing the global variance (averaged over the entire image). 20D and FIG. 21D are graphs 903 and 913 showing pixel-based variance. The figure was obtained by averaging 10 independent runs with varying number of Markov chains. The measured numbers are convincing only for simple test scenes. Even in complex cases, the performance gain over Monte Carlo sampling cannot be measured because the number of independent runs is too small. More experiments were not possible due to excessive execution time.

しかし、視覚誤差は、図２２Ａ〜Ｃに見られるような、劇的に異なったストーリを語り、非常に困難なセッティングにおいてさえ、新しい技法の紛れもない優位性が明らかになる。図２２Ａ〜Ｃは、シミュレーション用に３００個の連鎖を使用した、テストシーン「ＩｎｖｉｓｉｂｌｅＤａｔｅ」についての視覚的比較を示している。図２２Ａは、モンテカルロを使用してレンダリングされたテストシーンを示し、図２２Ｂは、ランダム化準モンテカルロを使用してレンダリングされたテストシーンを示し、図２２Ｃは、ソーティングを伴うランダム化準モンテカルロを使用してレンダリングされたテストシーンを示している。 However, visual error tells a dramatically different story, as seen in FIGS. 22A-C, revealing the unmistakable superiority of the new technique, even in very difficult settings. 22A-C show a visual comparison for the test scene “Invisible Date” using 300 chains for simulation. 22A shows a test scene rendered using Monte Carlo, FIG. 22B shows a test scene rendered using randomized quasi-Monte Carlo, and FIG. 22C uses randomized quasi-Monte Carlo with sorting. Shows the rendered test scene.

図２２Ａ〜Ｃでは、唯一の光源は、隣の部屋の天井に配置されているので、見えていない。光子のより良い分布のため、ランダム化準モンテカルロ（ＲＱＭＣ）（図２２Ｂ）は、減少した陰影アーチファクトによって分かるように、モンテカルロ（ＭＣ）（図２２Ａ）よりも視覚的に性能がよい。（バケットソートのために２５６^３個のボクセルを使用した）ソーティングを伴うＲＱＭＣ（ＲＱＭＣＳ）（図２２Ｃ）は、より多くの光子が第２の部屋に入り込み、ドアの裏側さえも非常によく照明されるので、さらに優れている。 22A-C, the only light source is not visible because it is located on the ceiling of the adjacent room. Because of the better distribution of photons, randomized quasi-Monte Carlo (RQMC) (FIG. 22B) performs better visually than Monte Carlo (MC) (FIG. 22A), as can be seen by the reduced shadow artifacts. RQMC (RQMCS) with sorting (using 256 ³ voxels for bucket sort) (FIG. 22C), more photons enter the second room and even the back of the door is very well illuminated So it is even better.

このケースは例外でなく、多くのテストケースについて観測することができる。標準的な誤差測定は視覚的品質にとって適切な誤差測定でなく、このことは、コンピュータグラフィックスにおける、知られてはいるが未解決の問題であることだけを強調する。 This case is no exception and can be observed for many test cases. Standard error measurement is not an appropriate error measurement for visual quality, and this only highlights a known but unsolved problem in computer graphics.

図２３Ａ〜Ｄは、非常に困難な光輸送問題についての測定を示しており、光子は光源から直接トレースされ、最終経路セグメントはカメラに接続された（双方向経路トレーシングの１つの技法）。図２３Ａは、説明のために床と天井が取り除かれた、迷宮テストシーン９４０の概略図を示している。カメラは下方の部屋の中に配置され、光源は接続トンネルの他の終端に配置される。図２３Ｂに示されたグラフ９５０は、「箱ひげ」図であり、各技法を使用した５０回の独立した実現について、１０４８５７６のシミュレートされた光経路の、カメラによって受け取られた総放射の平均量（×によって印される）を示している。図２３Ｃに示されたグラフ９６０は、図２３Ｂに示されたグラフの関心のある部分を拡大したものである。図２３Ｄに示された表９７０は、５０回の独立した実現にわたって平均された、カメラ及び各ピクセル（２５６×２５６のピクセル解像度）によって受け取られた総放射のＬ１及びＬ２ノルム（分散）を使用した、マスタ解からの偏差を最終的に表示している。この困難なセッティングにおいて、新しい方法（ＲＱＭＣＳ）は、紛れもなく優れている。 Figures 23A-D show measurements for a very difficult light transport problem, with photons traced directly from the light source and the final path segment connected to the camera (one technique of bidirectional path tracing). FIG. 23A shows a schematic diagram of a labyrinth test scene 940 with the floor and ceiling removed for illustration. The camera is placed in the lower room and the light source is placed at the other end of the connecting tunnel. Graph 950 shown in FIG. 23B is a “box-and-whisker” diagram, and is the average of the total radiation received by the camera of 1048576 simulated light paths for 50 independent implementations using each technique. The quantity (marked by x) is shown. The graph 960 shown in FIG. 23C is an enlarged view of the portion of interest in the graph shown in FIG. 23B. Table 970 shown in FIG. 23D used the L1 and L2 norms (dispersion) of the total radiation received by the camera and each pixel (256 × 256 pixel resolution), averaged over 50 independent implementations. The deviation from the master solution is finally displayed. In this difficult setting, the new method (RQMCS) is undoubtedly superior.

上記の測定とは反対に、ただ１つの同時にシミュレートされるマルコフ連鎖が検討される。今は、十分な回数の実験が、計算的に実現可能であり、新しい技法の優位性が、明瞭に見えるものとなった。 Contrary to the above measurement, only one simultaneously simulated Markov chain is considered. Now, a sufficient number of experiments are computationally feasible, and the superiority of the new technique is clearly visible.

６．結論
上述のシステム及び技法は、従来技術にまさる多くの利点を有することが分かる。これらは、以下のもの、即ち、直観的には系列特性に焦点が絞られた技法の簡略化、改善された収束、線形の計算量、及び光輸送シミュレーションへの適用を含む。上述のシステム及び技法は、技法を積分として定式化することに一歩近づき、その点において、一貫性を示すことが相対的に容易である。 6). CONCLUSION It can be seen that the systems and techniques described above have many advantages over the prior art. These include the following: simplification of techniques that are intuitively focused on sequence characteristics, improved convergence, linear complexity, and application to light transport simulations. The systems and techniques described above are one step closer to formulating the technique as an integral, and in that respect it is relatively easy to show consistency.

発明者らは、首尾よく、マルコフ連鎖を同時にシミュレートするための技法を簡略化し、いつ、なぜ、状態のソーティングが収束を改善できるかについての直観を提供した。加えて、技法はもはや次元の呪いによって制限されず、シミュレーションは時間につれて変化し得る遷移確率Ｐ≡Ｐ_ｎをまさに使用できるので、斉次マルコフ連鎖（ｈｏｍｏｇｅｎｅｏｕｓＭａｒｋｏｖｃｈａｉｎ）への制限は存在しない。したがって、本明細書で説明された技法は、非斉次マルコフ連鎖（ｉｎｈｏｍｏｇｅｎｅｏｕｓＭａｒｋｏｖｃｈａｉｎ）をシミュレートするために、即ち、Ｐ≡Ｐ_ｎが時間ステップｎに依存する場合に、使用することができる。 The inventors have successfully simplified techniques for simultaneously simulating Markov chains and provided an intuition as to when and why state sorting can improve convergence. In addition, the technique is no longer limited by the curse of dimension, and the simulation can just use the transition probabilities P≡P _n that can change over time, so there is no limit to the homogenous Markov chain. Thus, the techniques described herein can be used to simulate an inhomogeneous Markov chain, i.e., where P≡P _n depends on time step n. .

実験は、必ずしもすべての（ｔ，ｓ）−列又は根基逆変換ベースの点列が等しく良好ではないことを示した。これは、さらなる特性解析に値する。 Experiments have shown that not all (t, s) -sequences or root inverse-based point sequences are equally good. This deserves further characterization.

本明細書で説明された技法は、ランク１格子の系列が適用できる場合、さらに簡単になる。しかし、これまでの構築は、改善された性能にとって必要とされる、（ｔ，ｓ）−列の特性を欠いている。本発明のさらなる一態様によれば、所望の結果を達成するために、適切なランク１格子の系列を構築することが可能なこともある。 The techniques described herein are even simpler when rank 1 lattice sequences are applicable. However, previous constructions lack the (t, s) -sequence characteristics required for improved performance. According to a further aspect of the invention, it may be possible to construct a suitable rank-1 lattice sequence to achieve the desired result.

発明者らは全方向を使用しているので、即ち、球の積（ｐｒｏｄｕｃｔｏｆｓｐｈｅｒｅｓ）にわたって積分するので、その方向で最近の研究との関連性を確立することも興味深い。 It is also interesting to establish relevance to recent work in that direction, since the inventors use all directions, i.e., integrate over the product of spheres.

７．一般的技法
図２４〜図３０は、本発明の上述の態様による、多くの技法及び副技法のフローチャートである。個々のボックス及びボックスのグループを含む、以下の技法及び副技法の一部又は全部は、本明細書で説明された本発明を実施するために、望みどおりに組み合わされてよいことに留意されたい。 7). General Techniques FIGS. 24-30 are flowcharts of a number of techniques and sub-techniques according to the above aspects of the invention. It should be noted that some or all of the following techniques and sub-techniques, including individual boxes and groups of boxes, may be combined as desired to implement the invention described herein. .

図２４は、本発明による、第１の一般的技法１０００のフローチャートである。技法１０００は以下を含む。
ボックス１００１：準モンテカルロ法を使用して、マルコフ連鎖をシミュレートする。
ボックス１００２：状態をソートする。
（近接性ソーティングを含む、状態によるソーティングを含むことができる）
ボックス１００３：複数のマルコフ連鎖を同時にシミュレートする。
（準モンテカルロ点を利用できる。Ｈａｌｔｏｎ列、Ｈａｌｔｏｎ列の変形、格子列、又は（ｔ，ｓ）−列のいずれかを使用して、準モンテカルロ点を選択できる。任意の数の連鎖を利用できる。軌道分割によるなど、軌道をオンザフライで追加するステップを含むことができる。ロシアンルーレット法又は粒子吸収をシミュレートする他の技法を利用できる） FIG. 24 is a flowchart of a first general technique 1000 according to the present invention. Technique 1000 includes the following.
Box 1001: Simulate a Markov chain using a quasi-Monte Carlo method.
Box 1002: Sort status.
(Can include sorting by state, including proximity sorting)
Box 1003: Simulate multiple Markov chains simultaneously.
(A quasi-Monte Carlo point can be used. A quasi-Monte Carlo point can be selected using either a Halton sequence, a variant of the Halton sequence, a grid sequence, or a (t, s) -sequence. Any number of chains can be used. (This can include adding trajectories on the fly, such as by trajectory splitting. Russian roulette methods or other techniques to simulate particle absorption can be used)

図２５は、ｓ次元問題について複数のマルコフ連鎖を同時にシミュレートするための、本発明による、さらなる技法１１００のフローチャートである。技法１１００は以下を含む。
ボックス１１０１：ｎ：＝０
（ｎを０に等しく設定する）
ボックス１１０２：準モンテカルロ点ｘ_ｌを使用して、Ｘ_０，ｌを初期化する。
ボックス１１０３：適切な順序を使用して、状態ベクトルをソートする。
ボックス１１０４：ｎ：＝ｎ＋１
（ｎを１だけ増やす）
ボックス１１０５：次の準モンテカルロ点ｘ_ｌを使用して、Ｘ_ｎ，ｌを計算することによって、連鎖を継続させる。
（任意の選択されたランダム化を施された任意の選択された点集合又は点列、及び状態空間における近接性による状態の順序づけも利用できる）
ボックス１１０３〜１１０５は、ループ内で実行される。 FIG. 25 is a flowchart of a further technique 1100 according to the present invention for simultaneously simulating multiple Markov chains for an s-dimensional problem. Technique 1100 includes the following:
Box 1101: n: = 0
(Set n equal to 0)
Box 1102: X _{0, l} is initialized using the quasi-Monte Carlo point _xl .
Box 1103: Sort state vector using appropriate order.
Box 1104: n: = n + 1
(Increase n by 1)
Box 1105: Continue the chain by calculating X _{n, l} using the next quasi-Monte Carlo point _xl .
(Any selected point set or sequence with any chosen randomization and state ordering by proximity in state space can also be used)
Boxes 1103 to 1105 are executed in a loop.

図２６は、追加技法１２００のフローチャートであり、その１つ又は複数は、図２５に示された技法１１００と併せて実行されてよい。これらの技法１２００は以下を含む。
ボックス１２０１：明るさ、又は状態のノルムによってソートする。
ボックス１２０２：空間階層を使用してソートする。
（バイナリ階層、又はＢＳＰツリー、ｋＤツリー、若しくはＢＳＰツリー構造の他の軸平行サブセットのいずれか、又は境界ボリューム階層、又は規則的ボクセルを利用するステップを含むことができる。バイナリ階層は、選択された発見的手法によって選択された平面を使用して再帰的に細分を行い、その後、階層のリーフを近接性の順序で列挙するために、階層を順序正しく横断することによって構築することができる。階層の効率的な横断を可能にするために、ツリーを左平衡化させるステップと、ツリーを配列データ構造に保存するステップも含むことができる）
ボックス１２０３：少なくとも１つの空間充填曲線によって列挙されたバケットを使用してソートする。 FIG. 26 is a flowchart of an additional technique 1200, one or more of which may be performed in conjunction with technique 1100 shown in FIG. These techniques 1200 include:
Box 1201: Sort by brightness or state norm.
Box 1202: Sort using spatial hierarchy.
(May include using a binary hierarchy, or any other axis-parallel subset of a BSP tree, kD tree, or BSP tree structure, or a boundary volume hierarchy, or regular voxels. The binary hierarchy is selected. Can be constructed by recursively subdividing using the plane selected by the heuristic and then traversing the hierarchy in order to enumerate the leaves of the hierarchy in order of proximity. (It can also include left balancing the tree and saving the tree in an array data structure to allow for efficient traversal of the hierarchy)
Box 1203: Sort using buckets enumerated by at least one space filling curve.

図２７は、本発明による、さらなる技法１３００のフローチャートである。技法１３００は以下を含む。
ボックス１３０１：ｎ：＝０、点列ｘ_ｌを選択する。
ボックス１３０２：ランダム化をインスタンス化する。
ボックス１３０３：ｒａｎｄ（ｘ_ｌ）を使用して、Ｘ_０，ｌを初期化する。
ボックス１３０４：何らかの順序を使用して、状態ベクトルをソートする。
ボックス１３０５：ｎ：＝ｎ＋１
ボックス１３０６：ランダム化をインスタンス化する。
ボックス１３０７：ｒａｎｄ（ｘ_ｌ）を使用して、Ｘ_ｎ，ｌを計算することによって、連鎖を継続させる。
ボックス１３０４〜１３０７は、ループ内で実行される。 FIG. 27 is a flowchart of a further technique 1300 according to the present invention. Technique 1300 includes the following.
Box 1301: Select n: = 0, point sequence _xl .
Box 1302: Instantiate randomization.
Box 1303: Initialize X _{0, l} using rand (x _l ).
Box 1304: Sort the state vector using some order.
Box 1305: n: = n + 1
Box 1306: Instantiate randomization.
Box 1307: Continue chaining by calculating X _{n, l} using rand (x _l ).
Boxes 1304-1307 are executed in a loop.

図２８は、図２７の技法１３００と併せて実行されてよい、追加技法１４００のフローチャートである。追加技法１４００は、各時間ステップｎにおいて準モンテカルロ点をランダム化し、以下を含む。
ボックス１４０１：後続のサンプルがそこから取り出される、基数ｂ＝２の（ｔ，ｓ）−列を選択する。
ボックス１４０２：サンプルにｓ次元ランダムベクトルによるＸＯＲ演算を施し、ランダムベクトルは、各遷移ステップの後に生成される。 FIG. 28 is a flowchart of an additional technique 1400 that may be performed in conjunction with technique 1300 of FIG. The additional technique 1400 randomizes the quasi-Monte Carlo points at each time step n and includes:
Box 1401: Select a (t, s) -column of radix b = 2 from which subsequent samples are taken.
Box 1402: Sample is XORed with s-dimensional random vector, and a random vector is generated after each transition step.

図２９は、本発明による、さらなる技法１５００のフローチャートである。技法１５００は以下を含む。
ボックス１５０１：軸平行境界ボックスによって、レンダリングされる物体の境界を定める。
ボックス１５０２：物体の選択された点を通過する分割平面を適用することによって、境界ボックスを、各々がリーフで終了する、連続的な左ブランチ及び右ブランチに再帰的又は反復的に分割し、それによって、複数のリーフを生成する。
ボックス１５０３：リーフをそれぞれの明るさに従って階層的に順序づける。
ボックス１５０４：選択された数のバケットに分割された、選択されたサイズの行列上で、バケットソートを実行する。
ボックス１５０５：行列内をたどるために選択された空間充填曲線を使用し、個々のバケット内において、より小さな空間充填曲線が、行列内の個々のセル内をたどるために使用される。 FIG. 29 is a flowchart of a further technique 1500 according to the present invention. Technique 1500 includes the following.
Box 1501: An axis parallel bounding box delimits the object to be rendered.
Box 1502: recursively or iteratively splits the bounding box into successive left and right branches, each ending with a leaf, by applying a splitting plane that passes through the selected points of the object; To generate a plurality of leaves.
Box 1503: The leaves are ordered hierarchically according to their brightness.
Box 1504: Perform a bucket sort on a matrix of a selected size divided into a selected number of buckets.
Box 1505: Use the selected space filling curve to follow in the matrix, and within each bucket, a smaller space filling curve is used to follow within the individual cells in the matrix.

図３０は、本発明による、さらなる技法１６００のフローチャートである。技法１６００は、リアルに見える画像を合成するための光輸送シミュレーションを可能にするシミュレーションを構成し、以下を含む。
ボックス１６０１：光輸送シミュレーションの基礎をなす積分方程式を、経路空間をサンプリングすることがマルコフ連鎖をシミュレートすることに対応するように、経路積分として定式化し直し、その際、経路は、レイトレーシング及び散乱事象によって確立される。
ボックス１６０２：初期分布を、光源又はセンサの放出特性によって決定し、遷移確率を、画像内の表面についての双方向反射分布関数及び双方向表面下散乱分布関数によって決定する。
ボックス１６０３：透過効果を、経路積分、初期分布、及び遷移確率を利用することによって処理する。
ボックス１６０４：経路積分を、任意の種類のランダム化を施して、又は施さずに、高次元準モンテカルロ点を利用することによって、又は低次元準モンテカルロ点をパディングすることによって解く。
（光輸送シミュレーションの基礎をなす積分方程式を、Ｆｒｅｄｈｏｌｍ積分方程式又はＶｏｌｔｅｒｒａ積分方程式と見なす） FIG. 30 is a flowchart of a further technique 1600 according to the present invention. The technique 1600 configures a simulation that enables a light transport simulation to synthesize a real-looking image and includes:
Box 1601: The integral equation underlying the light transport simulation is reformulated as a path integral so that sampling the path space corresponds to simulating a Markov chain, where the path is a ray tracing and Established by scattering events.
Box 1602: The initial distribution is determined by the emission characteristics of the light source or sensor, and the transition probability is determined by the bidirectional reflection distribution function and the bidirectional subsurface scattering distribution function for the surface in the image.
Box 1603: Process transmission effects by utilizing path integrals, initial distribution, and transition probabilities.
Box 1604: The path integral is solved by using high-dimensional quasi-Monte Carlo points, with or without any kind of randomization, or by padding low-dimensional quasi-Monte-Carlo points.
(The integral equation underlying the light transport simulation is regarded as a Fredholm integral equation or a Volterra integral equation)

上記の説明は、当業者が本発明を実施することを可能にする詳細を含むが、説明が本質的に例示的なものであること、これらの教示から利益を得る当業者には本発明の多くの修正及び変形が明らかであることを理解されたい。したがって、本明細書の発明が、本明細書に添付された特許請求の範囲によってのみ確定されること、特許請求の範囲が、従来技術によって許される限り広く解釈されることが意図されている。 While the above description includes details that enable one skilled in the art to practice the invention, the description is intended to be exemplary in nature and to those skilled in the art who would benefit from these teachings. It should be understood that many modifications and variations are apparent. Accordingly, it is intended that the invention herein be determined only by the claims appended hereto, and that the claims be interpreted as broadly as permitted by the prior art.

コンピュータプログラムリスト

CODE LISTING 2.2.1

void Triangle::Transform()
{
Point *p = (Point *)this;
Vector n3d;
Vector n_abs = n3d =(p[1]-p[0])|(p[2]-p[0]);
// search largest component forprojection (0=x,1=y,2=z)
uintCast(n_abs.dx) &= 0x7FFFFFFF;
uintCast(n_abs.dy) &= 0x7FFFFFFF;
uintCast(n_abs.dz) &= 0x7FFFFFFF;
// Degenerated Triangles must be handled(set edge-signs)
if(!((n_abs.dx+n_abs.dy+n_abs.dz) >DEGEN_TRI_EPSILON))
//(!(...) > EPS) to handle NaN’s
{
d = p[0].x;
p0.u = -p[0].y;
p0.v = -p[0].z;
n.u=n.v = 0.0f;
e[0].u = e[1].u = e[0].v = e[1].v = 1.0f;
return;
}
U32 axis = 2;
if(n_abs.dx > n_abs.dy)
{
if(n_abs.dx > n_abs.dz)
axis = 0;
}
else if(n_abs.dy > n_abs.dz)
axis = 1;
Point p03d = p[0];
Point p13d = p[1];
Point p23d = p[2];
float t_inv = 2.0f/n3d[axis];
e[0].u =(p23d[PlusOneMod3[axis]]-p03d[PlusOneMod3[axis]])*t_inv;
e[0].v =(p23d[PlusOneMod3[axis+1]]-p03d[PlusOneMod3[axis+1]])*t_inv;
e[1].u =(p13d[PlusOneMod3[axis]]-p03d[PlusOneMod3[axis]])*t_inv;
e[1].v =(p13d[PlusOneMod3[axis+1]]-p03d[PlusOneMod3[axis+1]])*t_inv;
t_inv *= 0.5f;
n.u = n3d[PlusOneMod3[axis]] *t_inv;
n.v = n3d[PlusOneMod3[axis+1]]*t_inv;
p0.u = -p03d[PlusOneMod3[axis]];
p0.v = -p03d[PlusOneMod3[axis+1]];
d = p03d[axis] +n.u*p03d[PlusOneMod3[axis]] + n.v*p03d[PlusOneMod3[axis+1]];
}

CODE LISTING 2.2.2

U32 *idx = pointer_to_face_indices;
U32 ofs = projection_case;
for(U32 ii = num_triData; ii;ii--,idx++)
{
float t = (triData[*idx].d -ray.from[ofs]
-triData[*idx].n.u*ray.from[PlusOneMod3[ofs]]
-triData[*idx].n.v*ray.from[PlusOneMod3[ofs+1]])
/ (ray.d[ofs] +triData[*idx].n.u*ray.d[PlusOneMod3[ofs]]
+ triData[*idx].n.v*ray.d[PlusOneMod3[ofs+1]]);
if(uintCast(t)-1 >uintCast(result.tfar)) //-1 for +0.0f
continue;
float h1 = t*ray.d[PlusOneMod3[ofs]] +ray.from[PlusOneMod3[ofs]]
+ triData[*idx].p0.u;
float h2 = t*ray.d[PlusOneMod3[ofs+1]] +ray.from[PlusOneMod3[ofs+1]]
+ triData[*idx].p0.v;
float u = h1*triData[*idx].e[0].v -h2*triData[*idx].e[0].u;
float v = h2*triData[*idx].e[1].u -h1*triData[*idx].e[1].v;
float uv = u+v;
if((uintCast(u) | uintCast(v) |uintCast(uv)) > 0x40000000)
continue;
result.tfar = t;
result.tri_index = *idx;
}

CODE LISTING 2.3.1

Point *p = (Point*)&triData[tri_index];
int boxMinIdx, boxMaxIdx;
// boxMinIdx and boxMaxIdx index thesmallest and largest vertex of the triangle
// in the component dir[0] of the splitplane
if(p[0][dir[0]] < p[1][dir[0]])
{
if(p[2][dir[0]] < p[0][dir[0]])
{
boxMinIdx = 2;
boxMaxIdx = 1;
}
else
{
boxMinIdx = 0;
boxMaxIdx = p[2][dir[0]] <p[1][dir[0]] ? 1 : 2;
}
}
else
{
if(p[2][dir[0]] < p[1][dir[0]])
{
boxMinIdx = 2;
boxMaxIdx = 0;
}
else
{
boxMinIdx = 1;
boxMaxIdx = p[2][dir[0]] <p[0][dir[0]] ? 0 : 2;
}
}
/* If the triangle is in the split planeor completely on one side of the split plane
is decided without any numerical errors,i.e. at the precision the triangle is
entered to the rendering system. Usingepsilons here is wrong and not necessary.
*/
if((p[boxMinIdx][dir[0]] == split)&& (p[boxMaxIdx][dir[0]] == split)) // in split plane ?
{
on_splitItems++;
if(split < middle_split) // put tosmaller volume
left_num_divItems++;
else
{
unsorted_border--;
U32 t = itemsList[unsorted_border];
right_num_divItems--;
itemsList[right_num_divItems] =itemsList[left_num_divItems];
itemsList[left_num_divItems] = t;
}
}
else if(p[boxMaxIdx][dir[0]] <=split) // triangle completely left ?
left_num_divItems++;
else if(p[boxMinIdx][dir[0]] >=split) // triangle completely right ?
{
unsorted_border--;
U32 t = itemsList[unsorted_border];
right_num_divItems--;
itemsList[right_num_divItems] =itemsList[left_num_divItems];
itemsList[left_num_divItems] = t;
}
else
// and now detailed decision, trianglemust intersect split plane ...
{
/* In the sequel we determine whether atriangle should go left and/or right, where we already know that it mustintersect the split plane in a line segment.
All computations are ordered so that themore precise computations are done first. Scalar products and cross productsare evaluated last.
In some situations it may be necessaryto expand the bounding box by
an epsilon. This, however, will blow upthe required memory by large amounts.
If such a situation is encountered, itmay be better to analyze it numerically in order not to use any epsilons...
Arriving here we know thatp[boxMaxIdx][dir[0]] < split < p[boxMaxIdx][dir[0]]
and that p[boxMidIdx][dir[0]] \in[p[boxMaxIdx][dir[0]], p[boxMaxIdx][dir[0]]].
We also know, that the triangle has anon-empty intersection with the current
voxel. The triangle also cannot lie inthe split plane, and its vertices cannot
lie on one side only.
*/
int boxMidIdx = 3 - boxMaxIdx -boxMinIdx; // missing index, found by 3 = 0 + 1 + 2
/* We now determine the vertex that isalone on one side of the split plane.
Depending on whether the lonely vertexis on the left or right side,
we have to later swap the decision,whether the
triangle should be going to the left orright.
*/
int Alone = (split <p[boxMidIdx][dir[0]]) ? boxMinIdx : boxMaxIdx;
int NotAlone = 3 - Alone - boxMidIdx;
// == (split < p[boxMidIdx][dir[0]])? boxMaxIdx : boxMinIdx;
// since sum of idx = 3 = 0 + 1 + 2
float dist = split - p[Alone][dir[0]];
U32 swapLR = uintCast(dist)>>31;// == p[Alone][dir[0]] > split;
/* Now the line segments connecting thelonely vertex with the remaining two verteces
are intersected with the split plane. a1and a2 are the intersection points.
The special case "if(p[boxMidIdx][dir[0]]== split)" [yields a x / x, which could
be optimized] does not help at all sinceit only can happen as often as the highest
valence of a vertex of the mesh is...
*/
float at = dist / (p[boxMidIdx][dir[0]]- p[Alone][dir[0]]);
float at2 = dist / (p[NotAlone][dir[0]]- p[Alone][dir[0]]);
float a1x = (p[boxMidIdx][dir[1]] -p[Alone][dir[1]]) * at;
float a1y = (p[boxMidIdx][dir[2]] -p[Alone][dir[2]]) * at;
float a2x = (p[NotAlone][dir[1]] -p[Alone][dir[1]]) * at2;
float a2y = (p[NotAlone][dir[2]] - p[Alone][dir[2]])* at2;
// n is a vector normal to the line ofintersection a1a2 of the triangle
// and the split plane
float nx = a2y - a1y;
float ny = a2x - a1x;
// The signs indicate the quadrant ofthe vector normal to the intersection line
U32 nxs = uintCast(nx)>>31; // == (nx< 0.0f)
U32 nys = uintCast(ny)>>31; // ==(ny < 0.0f)
/* Numerical precision: Due tocancellation, floats of approximately same exponent
should be subtracted first, beforeadding something of a different order of
magnitude. All brackets in the sequelare ESSENTIAL for numerical precision.
Change them and you will see more errorsin the BSP...
pMin is the lonely point in thecoordinate system with the origin at
bBox.bMinMax[0]
pMax is the lonely point in thecoordinate system with the origin at
bBox.bMinMax[1]
*/
float pMinx = p[Alone][dir[1]] -bBox.bMinMax[0][dir[1]];
float pMiny = p[Alone][dir[2]] -bBox.bMinMax[0][dir[2]];
float pMaxx = p[Alone][dir[1]] -bBox.bMinMax[1][dir[1]];
float pMaxy = p[Alone][dir[2]] -bBox.bMinMax[1][dir[2]];
// Determine coordinates of the boundingbox, however, with respect to p + a1 being the origin.
float boxx[2];
float boxy[2];
boxx[0] = (pMinx + a1x) * nx;
boxy[0] = (pMiny + a1y) * ny;
boxx[1] = (pMaxx + a1x) * nx;
boxy[1] = (pMaxy + a1y) * ny;
/* Test, whether line of intersection ofthe triangle and the split plane passes by the
bounding box. This is done by indexingthe coordinates of the bounding box by the
quadrant of the vector normal to theline of intersection. In fact this is
the nifty implementation of the 3d testintroduced by in the book with Haines:
"Real-Time Rendering"
By the indexing the vertices areselected, which are farthest from the line.
Note that the triangle CANNOT completelypass the current voxel, since it must have
a nonempty intersection with it.
*/
U32 resultS;
if(pMinx + MAX(a1x,a2x) < 0.0f) //line segment of intersection a1a2 left of box
resultS = uintCast(pMinx)>>31;
else if(pMiny + MAX(a1y,a2y) < 0.0f)// line segment of intersection a1a2 below box
resultS = uintCast(pMiny)>>31;
else if(pMaxx + MIN(a1x,a2x) > 0.0f)// line segment of intersection a1a2 right of box
resultS = (pMaxx > 0.0f);
else if(pMaxy + MIN(a1y,a2y) > 0.0f)// line segment of intersection a1a2 above box
resultS = (pMaxy > 0.0f);
else if(boxx[1^nxs] > boxy[nys])
// line passes beyond bbox ? =>triangle can only be on one side
resultS = (a1y*a2x > a1x*a2y);
// sign of cross product a1 x a2 ischecked to determine side
else if(boxx[nxs] < boxy[1^nys])
resultS = (a1y*a2x < a1x*a2y);
else
// Ok, now the triangle must be bothleft and right
{
stackList[currStackItems++] =itemsList[left_num_divItems];
unsorted_border--;
itemsList[left_num_divItems] =itemsList[unsorted_border];
continue;
}
if(swapLR != /*^*/ resultS)
{
unsorted_border--;
U32 t = itemsList[unsorted_border];
right_num_divItems--;
itemsList[right_num_divItems] =itemsList[left_num_divItems];
itemsList[left_num_divItems] = t;
}
else
left_num_divItems++;
}

CODE LISTING 2.4.1

Intersection Boundary::Intersect(Ray&ray) //ray.tfar is changed!
{
// Optimized inverse calculation (saves2 of 3 divisions)
float inv_tmp =(ray.d.dx*ray.d.dy)*ray.d.dz;
if((uintCast(inv_tmp)&0x7FFFFFFF)> uintCast(DIV_EPSILON))
{
inv_tmp = 1.0f / inv_tmp;
ray.inv_d.dx =(ray.d.dy*ray.d.dz)*inv_tmp;
ray.inv_d.dy = (ray.d.dx*ray.d.dz)*inv_tmp;
ray.inv_d.dz =(ray.d.dx*ray.d.dy)*inv_tmp;
}
else
{
ray.inv_d.dx =((uintCast(ray.d.dx)&0x7FFFFFFF) > uintCast(DIV_EPSILON)) ?
(1.0f / ray.d.dx) :INVDIR_LUT[uintCast(ray.d.dx) >> 31];
ray.inv_d.dy =((uintCast(ray.d.dy)&0x7FFFFFFF) > uintCast(DIV_EPSILON)) ?
(1.0f / ray.d.dy) :INVDIR_LUT[uintCast(ray.d.dy) >> 31];
ray.inv_d.dz =((uintCast(ray.d.dz)&0x7FFFFFFF) > uintCast(DIV_EPSILON)) ?
(1.0f / ray.d.dz) :INVDIR_LUT[uintCast(ray.d.dz) >> 31];
}
Intersection result;
result.tfar = ray.tfar;
result.tri_index = -1;
//
//BBox-Check
//
float tnear = 0.0f;
worldBBox.Clip(ray,tnear);
if(uintCast(ray.tfar) == 0x7F7843B0)//ray.tfar==3.3e38f //!!
return(result);
//
U32 current_bspStack = 1; //wegen emptystack case == 0
U32 node = 0;
//
//BSP-Traversal
//
const U32 whatnode[3] ={(uintCast(ray.inv_d.dx)>>27) & sizeof(BSPNODELEAF),
(uintCast(ray.inv_d.dy)>>27) &sizeof(BSPNODELEAF),
(uintCast(ray.inv_d.dz)>>27) &sizeof(BSPNODELEAF)};
U32 bspStackNode[128];
float bspStackFar[128];
float bspStackNear[128];
bspStackNear[0] = -3.4e38f; // sentinel
do
{
//Ist Node ein Leaf (type<0) oder nur neVerzweigung (type>=0)
while(((BSPNODELEAF&)bspNodes[node]).type >= 0)
{
//Split-Dimension (x|y|z)
U32 proj =((BSPNODELEAF&)bspNodes[node]).type & 3;
float distl =(((BSPNODELEAF&)bspNodes[node]).splitlr[whatnode[proj]>>4]
- ray.from[proj])*ray.inv_d[proj];
float distr =(((BSPNODELEAF&)bspNodes[node]).splitlr[(whatnode[proj]>>4)^1]
- ray.from[proj])*ray.inv_d[proj];
node =(((BSPNODELEAF&)bspNodes[node]).type - proj) | whatnode[proj];
//type & 0xFFFFFFF0
if(tnear <= distl)
{
if(ray.tfar >= distr)
{
bspStackNear[current_bspStack] =MAX(tnear,distr);
bspStackNode[current_bspStack] =node^sizeof(BSPNODELEAF);
bspStackFar[current_bspStack] =ray.tfar;
current_bspStack++;
}
ray.tfar = MIN(ray.tfar,distl);
}
else
if(ray.tfar >= distr)
{
tnear = MAX(tnear,distr);
node ^= sizeof(BSPNODELEAF);

}
else
goto stackPop;
}
//
//Faces-Intersect
... code omitted ...
//
//
//Hit gefunden?
//
do //!! NEEDS bspStackNear[0] = -3.4e38f;!!!!!
{
stackPop:
current_bspStack--;
tnear = bspStackNear[current_bspStack];
}while(result.tfar < tnear);
if(current_bspStack == 0)
return(result);
node = bspStackNode[current_bspStack];
ray.tfar =bspStackFar[current_bspStack];
} while (true);
} Computer program list

CODE LISTING 2.2.1

void Triangle :: Transform ()
{
Point * p = (Point *) this;
Vector n3d;
Vector n_abs = n3d = (p [1] -p [0]) | (p [2] -p [0]);
// search largest component forprojection (0 = x, 1 = y, 2 = z)
uintCast (n_abs.dx) & = 0x7FFFFFFF;
uintCast (n_abs.dy) & = 0x7FFFFFFF;
uintCast (n_abs.dz) & = 0x7FFFFFFF;
// Degenerated Triangles must be handled (set edge-signs)
if (! ((n_abs.dx + n_abs.dy + n_abs.dz)> DEGEN_TRI_EPSILON))
// (! (...)> EPS) to handle NaN's
{
d = p [0] .x;
p0.u = -p [0] .y;
p0.v = -p [0] .z;
nu = nv = 0.0f;
e [0] .u = e [1] .u = e [0] .v = e [1] .v = 1.0f;
return;
}
U32 axis = 2;
if (n_abs.dx> n_abs.dy)
{
if (n_abs.dx> n_abs.dz)
axis = 0;
}
else if (n_abs.dy> n_abs.dz)
axis = 1;
Point p03d = p [0];
Point p13d = p [1];
Point p23d = p [2];
float t_inv = 2.0f / n3d [axis];
e [0] .u = (p23d [PlusOneMod3 [axis]]-p03d [PlusOneMod3 [axis]]) * t_inv;
e [0] .v = (p23d [PlusOneMod3 [axis + 1]]-p03d [PlusOneMod3 [axis + 1]]) * t_inv;
e [1] .u = (p13d [PlusOneMod3 [axis]]-p03d [PlusOneMod3 [axis]]) * t_inv;
e [1] .v = (p13d [PlusOneMod3 [axis + 1]]-p03d [PlusOneMod3 [axis + 1]]) * t_inv;
t_inv * = 0.5f;
nu = n3d [PlusOneMod3 [axis]] * t_inv;
nv = n3d [PlusOneMod3 [axis + 1]] * t_inv;
p0.u = -p03d [PlusOneMod3 [axis]];
p0.v = -p03d [PlusOneMod3 [axis + 1]];
d = p03d [axis] + nu * p03d [PlusOneMod3 [axis]] + nv * p03d [PlusOneMod3 [axis + 1]];
}

CODE LISTING 2.2.2

U32 * idx = pointer_to_face_indices;
U32 ofs = projection_case;
for (U32 ii = num_triData; ii; ii-, idx ++)
{
float t = (triData [* idx] .d -ray.from [ofs]
-triData [* idx] .nu * ray.from [PlusOneMod3 [ofs]]
-triData [* idx] .nv * ray.from [PlusOneMod3 [ofs + 1]])
/ (ray.d [ofs] + triData [* idx] .nu * ray.d [PlusOneMod3 [ofs]]
+ triData [* idx] .nv * ray.d [PlusOneMod3 [ofs + 1]]);
if (uintCast (t) -1> uintCast (result.tfar)) //-1 for + 0.0f
continue;
float h1 = t * ray.d [PlusOneMod3 [ofs]] + ray.from [PlusOneMod3 [ofs]]
+ triData [* idx] .p0.u;
float h2 = t * ray.d [PlusOneMod3 [ofs + 1]] + ray.from [PlusOneMod3 [ofs + 1]]
+ triData [* idx] .p0.v;
float u = h1 * triData [* idx] .e [0] .v -h2 * triData [* idx] .e [0] .u;
float v = h2 * triData [* idx] .e [1] .u -h1 * triData [* idx] .e [1] .v;
float uv = u + v;
if ((uintCast (u) | uintCast (v) | uintCast (uv))> 0x40000000)
continue;
result.tfar = t;
result.tri_index = * idx;
}

CODE LISTING 2.3.1

Point * p = (Point *) & triData [tri_index];
int boxMinIdx, boxMaxIdx;
// boxMinIdx and boxMaxIdx index thesmallest and largest vertex of the triangle
// in the component dir [0] of the splitplane
if (p [0] [dir [0]] <p [1] [dir [0]])
{
if (p [2] [dir [0]] <p [0] [dir [0]])
{
boxMinIdx = 2;
boxMaxIdx = 1;
}
else
{
boxMinIdx = 0;
boxMaxIdx = p [2] [dir [0]] <p [1] [dir [0]]? 1: 2;
}
}
else
{
if (p [2] [dir [0]] <p [1] [dir [0]])
{
boxMinIdx = 2;
boxMaxIdx = 0;
}
else
{
boxMinIdx = 1;
boxMaxIdx = p [2] [dir [0]] <p [0] [dir [0]]? 0: 2;
}
}
/ * If the triangle is in the split planeor completely on one side of the split plane
is decided without any numerical errors, ie at the precision the triangle is
entered to the rendering system.Usingepsilons here is wrong and not necessary.
* /
if ((p [boxMinIdx] [dir [0]] == split) && (p [boxMaxIdx] [dir [0]] == split)) // in split plane?
{
on_splitItems ++;
if (split <middle_split) // put tosmaller volume
left_num_divItems ++;
else
{
unsorted_border--;
U32 t = itemsList [unsorted_border];
right_num_divItems--;
itemsList [right_num_divItems] = itemsList [left_num_divItems];
itemsList [left_num_divItems] = t;
}
}
else if (p [boxMaxIdx] [dir [0]] <= split) // triangle completely left?
left_num_divItems ++;
else if (p [boxMinIdx] [dir [0]]> = split) // triangle completely right?
{
unsorted_border--;
U32 t = itemsList [unsorted_border];
right_num_divItems--;
itemsList [right_num_divItems] = itemsList [left_num_divItems];
itemsList [left_num_divItems] = t;
}
else
// and now detailed decision, trianglemust intersect split plane ...
{
/ * In the sequel we determine whether atriangle should go left and / or right, where we already know that it mustintersect the split plane in a line segment.
All computations are ordered so that themore precise computations are done first.Scalar products and cross productsare evaluated last.
In some situations it may be necessaryto expand the bounding box by
an epsilon. This, however, will blow up the required memory by large amounts.
If such a situation is encountered, itmay be better to analyze it numerically in order not to use any epsilons ...
Arriving here we know thatp [boxMaxIdx] [dir [0]] <split <p [boxMaxIdx] [dir [0]]
and that p [boxMidIdx] [dir [0]] \ in [p [boxMaxIdx] [dir [0]], p [boxMaxIdx] [dir [0]]].
We also know, that the triangle has anon-empty intersection with the current
voxel.The triangle also cannot lie inthe split plane, and its vertices cannot
lie on one side only.
* /
int boxMidIdx = 3-boxMaxIdx -boxMinIdx; // missing index, found by 3 = 0 + 1 + 2
/ * We now determine the vertex that isalone on one side of the split plane.
Depending on whether the lonely vertexis on the left or right side,
we have to later swap the decision, whether the
triangle should be going to the left orright.
* /
int Alone = (split <p [boxMidIdx] [dir [0]])? boxMinIdx: boxMaxIdx;
int NotAlone = 3-Alone-boxMidIdx;
// == (split <p [boxMidIdx] [dir [0]])? boxMaxIdx: boxMinIdx;
// since sum of idx = 3 = 0 + 1 + 2
float dist = split-p [Alone] [dir [0]];
U32 swapLR = uintCast (dist) >>31; // == p [Alone] [dir [0]]>split;
/ * Now the line segments connecting thelonely vertex with the remaining two verteces
are intersected with the split plane.a1and a2 are the intersection points.
The special case "if (p [boxMidIdx] [dir [0]] == split)" [yields ax / x, which could
be optimized] does not help at all sinceit only can happen as often as the highest
valence of a vertex of the mesh is ...
* /
float at = dist / (p [boxMidIdx] [dir [0]]-p [Alone] [dir [0]]);
float at2 = dist / (p [NotAlone] [dir [0]]-p [Alone] [dir [0]]);
float a1x = (p [boxMidIdx] [dir [1]] -p [Alone] [dir [1]]) * at;
float a1y = (p [boxMidIdx] [dir [2]] -p [Alone] [dir [2]]) * at;
float a2x = (p [NotAlone] [dir [1]] -p [Alone] [dir [1]]) * at2;
float a2y = (p [NotAlone] [dir [2]]-p [Alone] [dir [2]]) * at2;
// n is a vector normal to the line ofintersection a1a2 of the triangle
// and the split plane
float nx = a2y-a1y;
float ny = a2x-a1x;
// The signs indicate the quadrant of the vector normal to the intersection line
U32 nxs = uintCast (nx) >>31; // == (nx <0.0f)
U32 nys = uintCast (ny) >>31; // == (ny <0.0f)
/ * Numerical precision: Due tocancellation, floats of approximately same exponent
should be subtracted first, beforeadding something of a different order of
magnitude.All brackets in the sequelare ESSENTIAL for numerical precision.
Change them and you will see more errors in the BSP ...
pMin is the lonely point in thecoordinate system with the origin at
bBox.bMinMax [0]
pMax is the lonely point in thecoordinate system with the origin at
bBox.bMinMax [1]
* /
float pMinx = p [Alone] [dir [1]] -bBox.bMinMax [0] [dir [1]];
float pMiny = p [Alone] [dir [2]] -bBox.bMinMax [0] [dir [2]];
float pMaxx = p [Alone] [dir [1]] -bBox.bMinMax [1] [dir [1]];
float pMaxy = p [Alone] [dir [2]] -bBox.bMinMax [1] [dir [2]];
// Determine coordinates of the boundingbox, however, with respect to p + a1 being the origin.
float boxx [2];
float boxy [2];
boxx [0] = (pMinx + a1x) * nx;
boxy [0] = (pMiny + a1y) * ny;
boxx [1] = (pMaxx + a1x) * nx;
boxy [1] = (pMaxy + a1y) * ny;
/ * Test, whether line of intersection of the triangle and the split plane passes by the
this is done by indexing the coordinates of the bounding box by the
quadrant of the vector normal to theline of intersection.In fact this is
the nifty implementation of the 3d testintroduced by in the book with Haines:
"Real-Time Rendering"
By the indexing the vertices areselected, which are farthest from the line.
Note that the triangle CANNOT completelypass the current voxel, since it must have
a nonempty intersection with it.
* /
U32 resultS;
if (pMinx + MAX (a1x, a2x) <0.0f) // line segment of intersection a1a2 left of box
resultS = uintCast (pMinx) >>31;
else if (pMiny + MAX (a1y, a2y) <0.0f) // line segment of intersection a1a2 below box
resultS = uintCast (pMiny) >>31;
else if (pMaxx + MIN (a1x, a2x)> 0.0f) // line segment of intersection a1a2 right of box
resultS = (pMaxx>0.0f);
else if (pMaxy + MIN (a1y, a2y)> 0.0f) // line segment of intersection a1a2 above box
resultS = (pMaxy>0.0f);
else if (boxx [1 ^ nxs]> boxy [nys])
// line passes beyond bbox? => triangle can only be on one side
resultS = (a1y * a2x> a1x * a2y);
// sign of cross product a1 x a2 ischecked to determine side
else if (boxx [nxs] <boxy [1 ^ nys])
resultS = (a1y * a2x <a1x * a2y);
else
// Ok, now the triangle must be bothleft and right
{
stackList [currStackItems ++] = itemsList [left_num_divItems];
unsorted_border--;
itemsList [left_num_divItems] = itemsList [unsorted_border];
continue;
}
if (swapLR! = / * ^ * / resultS)
{
unsorted_border--;
U32 t = itemsList [unsorted_border];
right_num_divItems--;
itemsList [right_num_divItems] = itemsList [left_num_divItems];
itemsList [left_num_divItems] = t;
}
else
left_num_divItems ++;
}

CODE LISTING 2.4.1

Intersection Boundary :: Intersect (Ray & ray) //ray.tfar is changed!
{
// Optimized inverse calculation (saves2 of 3 divisions)
float inv_tmp = (ray.d.dx * ray.d.dy) * ray.d.dz;
if ((uintCast (inv_tmp) &0x7FFFFFFF)> uintCast (DIV_EPSILON))
{
inv_tmp = 1.0f / inv_tmp;
ray.inv_d.dx = (ray.d.dy * ray.d.dz) * inv_tmp;
ray.inv_d.dy = (ray.d.dx * ray.d.dz) * inv_tmp;
ray.inv_d.dz = (ray.d.dx * ray.d.dy) * inv_tmp;
}
else
{
ray.inv_d.dx = ((uintCast (ray.d.dx) &0x7FFFFFFF)> uintCast (DIV_EPSILON))?
(1.0f / ray.d.dx): INVDIR_LUT [uintCast (ray.d.dx) >>31];
ray.inv_d.dy = ((uintCast (ray.d.dy) &0x7FFFFFFF)> uintCast (DIV_EPSILON))?
(1.0f / ray.d.dy): INVDIR_LUT [uintCast (ray.d.dy) >>31];
ray.inv_d.dz = ((uintCast (ray.d.dz) &0x7FFFFFFF)> uintCast (DIV_EPSILON))?
(1.0f / ray.d.dz): INVDIR_LUT [uintCast (ray.d.dz) >>31];
}
Intersection result;
result.tfar = ray.tfar;
result.tri_index = -1;
//
// BBox-Check
//
float tnear = 0.0f;
worldBBox.Clip (ray, tnear);
if (uintCast (ray.tfar) == 0x7F7843B0) // ray.tfar == 3.3e38f // !!
return (result);
//
U32 current_bspStack = 1; // wegen emptystack case == 0
U32 node = 0;
//
// BSP-Traversal
//
const U32 whatnode [3] = {(uintCast (ray.inv_d.dx) >> 27) & sizeof (BSPNODELEAF),
(uintCast (ray.inv_d.dy) >> 27) & sizeof (BSPNODELEAF),
(uintCast (ray.inv_d.dz) >> 27) & sizeof (BSPNODELEAF)};
U32 bspStackNode [128];
float bspStackFar [128];
float bspStackNear [128];
bspStackNear [0] = -3.4e38f; // sentinel
do
{
// Ist Node ein Leaf (type <0) oder nur neVerzweigung (type> = 0)
while (((BSPNODELEAF &) bspNodes [node]). type> = 0)
{
// Split-Dimension (x | y | z)
U32 proj = ((BSPNODELEAF &) bspNodes [node]). Type &3;
float distl = ((((BSPNODELEAF &) bspNodes [node]). splitlr [whatnode [proj] >> 4]
-ray.from [proj]) * ray.inv_d [proj];
float distr = ((((BSPNODELEAF &) bspNodes [node]). splitlr [(whatnode [proj] >> 4) ^ 1]
-ray.from [proj]) * ray.inv_d [proj];
node = (((BSPNODELEAF &) bspNodes [node]). type-proj) | whatnode [proj];
// type & 0xFFFFFFF0
if (tnear <= distl)
{
if (ray.tfar> = distr)
{
bspStackNear [current_bspStack] = MAX (tnear, distr);
bspStackNode [current_bspStack] = node ^ sizeof (BSPNODELEAF);
bspStackFar [current_bspStack] = ray.tfar;
current_bspStack ++;
}
ray.tfar = MIN (ray.tfar, distl);
}
else
if (ray.tfar> = distr)
{
tnear = MAX (tnear, distr);
node ^ = sizeof (BSPNODELEAF);

}
else
goto stackPop;
}
//
// Faces-Intersect
... code omitted ...
//
//
// Hit gefunden?
//
do // !! NEEDS bspStackNear [0] = -3.4e38f; !!!!!
{
stackPop:
current_bspStack--;
tnear = bspStackNear [current_bspStack];
} while (result.tfar <tnear);
if (current_bspStack == 0)
return (result);
node = bspStackNode [current_bspStack];
ray.tfar = bspStackFar [current_bspStack];
} while (true);
}

Claims

A computer graphics system for generating pixel values of pixels in an image;
The pixel values represent points in the scene as recorded on the image plane of the simulated camera;
The computer graphics system generates the pixel value of an image using a selected ray tracing method that includes simulating at least one ray from the pixel into the scene along a selected direction. Configured to
The ray-tracing method further includes the steps of calculating the intersection of a ray with a surface of an object in the scene, and simulating a trajectory of the ray that illuminates the object in the scene, simulating the trajectory The computer graphics system , wherein the step uses a Markov chain,
Simulating the Markov chain using a quasi-Monte Carlo method,
Said step of simulating a Markov chain comprises sorting states;
A computer graphics system wherein the step of sorting includes proximity sorting .

The computer graphics system of claim 1, comprising simultaneously simulating a plurality of Markov chains .

The computer graphics system of claim 2, wherein the step of simultaneously simulating a plurality of Markov chains comprises utilizing a quasi-Monte Carlo point .

4. The computer graphics system of claim 3, wherein the quasi-Monte Carlo points are selected using any of a Halton sequence, a Halton sequence variant, a grid sequence, or a (t, s) -sequence .

The computer graphics system of claim 2, wherein the step of simultaneously simulating a plurality of Markov chains utilizes an arbitrary number of chains .

The computer graphics system of claim 2, wherein the step of simultaneously simulating a plurality of Markov chains further comprises the step of adding additional trajectories on the fly .

The computer graphics system of claim 2, wherein the step of simultaneously simulating a plurality of Markov chains further comprises trajectory splitting .

The computer graphics system of claim 2, wherein the step of simultaneously simulating a plurality of Markov chains comprises utilizing a technique for simulating particle absorption .

9. The computer graphics system of claim 8, wherein the step of simultaneously simulating multiple Markov chains further comprises using a Russian roulette method .

For the s-dimensional problem, said step of simultaneously simulating multiple Markov chains
n: = 0
Initialize X0, l using quasi-Monte Carlo point xl Loop:
-Sort the state vector using the appropriate order-n: = n + 1
The computer graphics system according to claim 2, comprising the step of continuing the chain by calculating Xn, l using the next quasi-Monte Carlo point xl .

The computer graphics system of claim 10, comprising ordering states by proximity in the state space .

The computer graphics system of claim 11, comprising randomization .

13. The computer graphics system of claim 12, comprising utilizing any selected point set or point sequence with any selected randomization .

Randomization and the following techniques:
n: = 0, select a sequence of points xi Instantiate randomization Initialize X0, l using rand (xl) Loop sort the state vector using some order n: = N + 1
12. The computer graphics system of claim 11, comprising: instantiating randomization; applying continuation of chain by calculating Xn, l using rand (xl) .

Randomizing the quasi-Monte Carlo points at each time step n;
Said step of randomizing comprises:
Selecting a (t, s) -sequence of radix b = 2 from which subsequent samples are taken;
The computer graphics system of claim 11, comprising performing an XOR operation on the sample with an s-dimensional random vector, wherein the random vector is generated after each transition step .

The computer graphics system of claim 15, comprising adding a selected number of additional trajectories on the fly .

The computer graphics system of claim 16, comprising trajectory splitting .

The computer graphics system of claim 17, comprising utilizing a technique for simulating particle absorption .

The computer graphics system of claim 18, comprising using a Russian roulette method .

Sorting by brightness or state norm;
Sorting using a spatial hierarchy;
And sorting using buckets enumerated by at least one space-filling curve .

21. The computer graphics system of claim 20, wherein sorting using a spatial hierarchy includes utilizing a binary hierarchy .

The computer graphics system of claim 21, wherein the spatial hierarchy includes any of a BSP tree, a kD tree, or any other axis-parallel subset of a BSP tree structure, or a boundary volume hierarchy, or regular voxels .

The binary hierarchy recursively subdivides using the plane selected by the selected heuristic,
21. The computer graphics system of claim 20, wherein the computer graphics system is then constructed by traversing the hierarchy in order to enumerate the leaves of the hierarchy in proximity order .

24. The computer graphics system of claim 23, comprising left balancing the tree and storing the tree in an array data structure to allow efficient traversal of the hierarchy .

The spatial hierarchy includes leaves;
25. The computer graphics system of claim 24, wherein sorting using the spatial hierarchy further comprises utilizing bucket sorting and a selected space filling curve .

25. The computer graphics system of claim 24, wherein sorting using the spatial hierarchy further comprises using regular voxels .

Delimiting the object to be rendered by an axis parallel bounding box;
By recursively or iteratively dividing the bounding box into successive left and right branches, each ending with a leaf, by applying a dividing plane that passes through a selected point of the object Generating a plurality of leaves;
Ordering the leaves hierarchically according to their brightness;
Performing a bucket sort on a matrix of a selected size divided into a selected number of buckets;
Using a selected space filling curve to follow within the matrix, wherein within each bucket a smaller space filling curve is used to follow within individual cells within the matrix. including the door, the computer graphics system of claim 25.

The computer graphics system of claim 2, wherein the simulation is operable for a configuration for solving an integral equation .

30. The computer graphics system of claim 28, wherein the simulation is operable to allow light transport simulation to synthesize a real-looking image .

The integral equation underlying the light transport simulation is reformulated as a path integral so that sampling the path space corresponds to simulating a Markov chain, where the path is a ray tracing and scattering event. Established by
30. The computer of claim 29, wherein the initial distribution is determined by the emission characteristics of the light source or sensor, and the transition probability is determined by a bidirectional reflection distribution function and a bidirectional subsurface scattering distribution function for the surface in the image. Graphics system .

31. The computer graphics system of claim 30, comprising processing a transmission effect by utilizing the path integral, initial distribution, and transition probability .

31. The computer graphics system of claim 30, wherein the path integral is solved by utilizing a high-dimensional quasi-Monte Carlo point or by padding a low-dimensional quasi-Monte Carlo point .

33. The computer graphics system of claim 32, wherein the integral equation underlying a light transport simulation is considered a Fredholm integral equation or a Volterra integral equation .

A computer graphics system for generating pixel values of pixels in an image;
The pixel values represent points in the scene as recorded on the image plane of the simulated camera;
The computer graphics system generates the pixel value of an image using a selected ray tracing method that includes simulating at least one ray from the pixel into the scene along a selected direction. Configured to
The ray-tracing method further includes calculating intersections of rays and surfaces of objects in the scene; and simulating trajectories of rays that illuminate objects in the scene;
The method in the computer graphics system, wherein the step of simulating a trajectory uses a Markov chain,
Simulating the Markov chain using a quasi-Monte Carlo method,
Said step of simulating a Markov chain comprises sorting states;
The method wherein the step of sorting includes proximity sorting.

A computer graphics system for generating pixel values of pixels in an image;
The pixel values represent points in the scene as recorded on the image plane of the simulated camera;
The computer graphics system generates the pixel value of an image using a selected ray tracing method that includes simulating at least one ray from the pixel into the scene along a selected direction. Configured to
The ray-tracing method further includes calculating intersections of rays and surfaces of objects in the scene; and simulating trajectories of rays that illuminate objects in the scene;
The step of simulating a trajectory is a system in the computer graphics system using a Markov chain,
Means for simulating the Markov chain using a quasi-Monte Carlo method;
Said means for simulating a Markov chain comprises means for sorting states;
The system wherein the means for sorting comprises means for proximity sorting.

A computer graphics system for generating pixel values of pixels in an image;
The pixel values represent points in the scene as recorded on the image plane of the simulated camera;

The computer graphics system generates the pixel value of an image using a selected ray tracing method that includes simulating at least one ray from the pixel into the scene along a selected direction. Configured to
The ray-tracing method further includes calculating intersections of rays and surfaces of objects in the scene; and simulating trajectories of rays that illuminate objects in the scene;
The step of simulating a trajectory is a computer executable program that causes a computer to function as the computer graphics system using a Markov chain ;
In the computer,
Computer code means for simulating the Markov chain using a quasi-Monte Carlo method ;
Said computer code means for simulating a Markov chain, computer code means for sorting states ;
A computer-executable program for causing the computer code means for sorting to implement computer code means for proximity sorting.

The computer graphics system of claim 19 including simulating an inhomogeneous Markov chain .