JP2003346178A

JP2003346178A - Parallel processing method of radiosity and its device

Info

Publication number: JP2003346178A
Application number: JP2002151536A
Authority: JP
Inventors: Yoshiki Arakawa; 佳樹荒川; Daisuke Iwamoto; 大輔岩本
Original assignee: Telecommunications Advancement Organization
Current assignee: Telecommunications Advancement Organization
Priority date: 2002-05-24
Filing date: 2002-05-24
Publication date: 2003-12-05
Anticipated expiration: 2022-05-24
Also published as: JP4122379B2

Abstract

PROBLEM TO BE SOLVED: To provide a parallel processing method for radiosity, and an apparatus therefor, that reduce the amount of data communicated during processing and achieve efficient load distribution when simulating with the radiosity method. SOLUTION: The surfaces of the shape models in a scene are defined as patches, and the patches are allocated to the nodes so that the total patch area per node is equal. That is, the method adopts a total-area equalization criterion as the distribution criterion for parallel processing. COPYRIGHT: (C)2004,JPO

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of the Invention] The present invention relates to a parallel processing method for radiosity and an apparatus therefor, and more particularly to a parallel processing method and apparatus that enable high-speed radiosity three-dimensional computer graphics processing.

[0002]

[Description of the Related Art] In BS digital broadcasting, HD (high-definition) image processing and communication (broadcasting) technology is the core technology, and high-definition real-image processing and communication (broadcasting) has been realized. Compared with real images, however, technology for high image quality and real-time processing in computer graphics (CG) has not advanced much.

[0003] Network technology is expected to move from the gigabit level to the tera level within a few years. In next-generation communication, ultra-high-definition images exceeding HD will be transmitted not only wirelessly, as in current broadcasting, but also as data over broadband networks. In such next-generation ultra-high-definition image data communication, ultra-high image quality and real-time processing and transmission will be strongly required for CG as well as for real images.

[0004] The design of industrial products in recent years is unthinkable without CG. However, current CG systems are still inadequate in many respects, such as image quality (resolution) and processing speed. In automobile design, for example, when a CG model is compared with a photograph of the real object taken with a still camera, the former is still considerably inferior in realism and texture. In architecture, there is strong demand for landscape simulation of buildings and the like, and for interior design functions, that use highly realistic light simulation.

[0005] Furthermore, it has recently become more common to transmit three-dimensional CG data between remote sites in order to design industrial products, but because the data volume is large, this cannot yet be done in real time.

[0006] Generating a realistic three-dimensional CG space requires simulating the physical behavior of light on a computer and pursuing realism that approaches live-action footage. The radiosity method, a technique for generating such a space, computes the physical behavior of light (reflection, diffusion, mirrored reflections, shading, and so on) with a mathematical-physical model on a computer and simulates indirect illumination, thereby producing a photorealistic space.

[0007]

[Problems to be Solved by the Invention] The radiosity method, one of the rendering methods for three-dimensional CG, is an image generation method based on a global illumination model that applies heat-transfer engineering, and it is often used, for example, to generate high-quality CG of indoor lighting. Because the radiosity method computes the image taking into account not only direct light from the light sources but also diffuse interreflection between objects, it is suited to rendering the non-uniform shadows cast by linear and area light sources and interiors with a lot of indirect lighting, and it can generate highly realistic images. In the radiosity method, however, computing the form factors accounts for most of the computation time. Speeding up the form factor computation through parallelization is therefore important.

[0008] The problem to be solved by the present invention is to provide a parallel processing method for radiosity, and an apparatus therefor, that reduce the amount of data communicated during processing and achieve efficient load distribution when simulating with the radiosity method.

[0009]

[Means for Solving the Problems] To solve the above problem, the gist of the parallel processing method for radiosity according to the present invention is that it comprises a step of defining the surfaces of shape models in a scene as patches, a step of determining the relative magnitudes of the total areas of the patches already assigned to each node of a processor having a plurality of nodes, and a step of assigning the next patch to the node whose already-assigned patches have the smallest total area.

[0010] The gist of the parallel processing apparatus for radiosity according to the present invention is that it comprises a processor having a plurality of nodes, means for defining the surfaces of shape models in a scene as patches, means for determining the relative magnitudes of the total areas of the patches already assigned to each node, and means for assigning the next patch to the node whose already-assigned patches have the smallest total area.

[0011] The present invention relates to a method and an apparatus that group the faces constituting a shape, distribute these face groups to the CPUs, and process them in parallel. A feature of this parallel processing technique is that little data needs to be communicated between CPUs. That is, the only data communicated between the CPUs is, for each face group (each CPU), the data of the single face whose receiving-surface energy value is the largest.

[0012] The present invention adopts a total-area equalization criterion as the criterion for equalizing this computational load. Under this criterion, faces are grouped so that the sum of the areas of the faces in each face group is as close as possible to that of the other face groups. The computational load of each CPU is thereby equalized, and a high degree of parallelism can be achieved.

[0013]

[Embodiments of the Invention] An embodiment of the present invention is described in detail below with reference to the drawings. The present invention proposes a parallel computation model for the radiosity method that takes load distribution into account, and clarifies its characteristics. The basic algorithm distributes the faces of the shapes constituting the objects (hereinafter, patches) evenly when computing form factors. Because this method requires only a small amount of communication between processors, it is considered effective on distributed-memory parallel computers with relatively slow communication, such as PC cluster systems.

[0014] However, if the initial faces are simply divided among the processors, the uneven distribution of the subdivision meshes used for the radiosity computation (hereinafter, elements) causes differences in processing time between processors. The patches must therefore be assigned so that the processing time of each processor is equal. This algorithm was implemented on a massively parallel PC cluster, and its effectiveness and the problems in running it on a massively parallel computer are examined.

[0015] 1. Massively parallel computer PC-cluster

Parallel computers are broadly classified, by the positional relationship between CPU and memory, into shared-memory and distributed-memory types. In the shared-memory type (FIG. 1), multiple processors are connected to a single central memory. The advantage of this type is that data partitioning need not be considered during programming, so automatic parallelization is easy. A further advantage is that, because no memory-to-memory communication is needed, performance is high when the number of processors is small. The disadvantage is that, as the number of processors grows, memory access contention with other tasks and jobs congests communication and performance falls.

[0016] In the distributed-memory type (FIG. 2), one memory and one processor form one node, and multiple nodes are connected by an interconnection network. The advantage of this type is that communication does not become congested by memory access contention with other tasks and jobs, so overall performance improves. However, managing the multiple memories is difficult.

[0017] To perform parallel processing of the radiosity method, the method was implemented on a massively parallel PC cluster as shown in FIG. 3. MPICH was used as the parallelization library. Table 1 shows the specifications of the PC cluster system used.

[0018]

[Table 1]

[0019] 2. Parallel processing by face distribution

The present invention devises a method that divides the patches, distributes them to the nodes, and processes them in parallel. In this method, processing is carried out in parallel by a host that manages the data and nodes that perform the radiosity computation.

[0020] First, the host reads the input data of the object models that make up the scene, divides it into partial regions that serve as patches, and creates the data for each resulting partial region. It then decides which node is responsible for the radiosity computation of each partial region. Next, it creates the data needed for the radiosity computation of each partial region and sends that data to the assigned node. After sending the data, it waits for the radiosity results to come back. Once it has received the results for all partial regions, it updates the radiosity values and determines whether they have converged. If they have converged, the computation ends; if not, the host returns to the step of creating the radiosity computation data for each partial region and repeats the same computation.

[0021] Meanwhile, each node responsible for the radiosity computation of a partial region receives the radiosity computation data, performs the computation, and on completion sends the result to the host and the other nodes. This is repeated until convergence. The computationally efficient progressive method is used for the radiosity processing.
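As a rough illustration of the progressive method mentioned above, the following Python sketch runs a single-process progressive-refinement (shooting) loop. It is not the patented implementation: the function name, its parameters, the precomputed form-factor matrix, and the threshold value are all assumptions made for the example.

```python
def progressive_radiosity(emission, reflectance, area, form_factor,
                          threshold=1e-6, max_iters=10_000):
    """Progressive-refinement (shooting) radiosity for n patches.

    form_factor[i][j] is assumed to be the fraction of energy leaving
    patch i that reaches patch j (a hypothetical, precomputed input).
    """
    n = len(emission)
    radiosity = list(emission)  # current radiosity estimate per patch
    unshot = list(emission)     # radiosity not yet distributed
    for _ in range(max_iters):
        # Shoot from the patch with the largest unshot energy.
        i = max(range(n), key=lambda k: unshot[k] * area[k])
        if unshot[i] * area[i] <= threshold:
            break  # remaining unshot energy is at or below the threshold
        for j in range(n):
            if j == i:
                continue
            # Reciprocity (A_i * F_ij = A_j * F_ji) gives what patch j receives.
            delta = (reflectance[j] * unshot[i]
                     * form_factor[i][j] * area[i] / area[j])
            radiosity[j] += delta
            unshot[j] += delta
        unshot[i] = 0.0
    return radiosity
```

In the parallel scheme described in this document, the inner loop over receiving patches is the part that each node would run only for its own patches.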

[0022] 2.1 Data placement on each processor

In placing the scene data, every node holds a copy of the polygon data, and the patch data is distributed among the nodes so that the area is equalized. Element data is held by the processor that holds the patch data containing it. Because every processor holds the polygon data needed to render the scene, coordinate data need not be transferred between nodes during parallel form factor computation.

[0023] Furthermore, the patch data and element data are distributed across the nodes, and the search for the patch with the maximum radiosity value among the distributed patches and the update of each element's radiosity value are carried out by the node that holds the data, which saves communication time and memory.

[0024] 2.2 Area equalization of distributed patches

The surfaces of the shape models in the scene are defined as quadrilateral and triangular patches, and when these patches are assigned to the nodes, they are distributed so that the total patch area of each node is equal. As an example, consider how to assign the patch data of a scene such as that in FIG. 4 to processors when parallelizing over three nodes, under the following conditions: (1) patches A through K are distributed in order; (2) all of this processing is performed by the host processor alone.

[0025] Step 1. As shown in FIG. 5, patches are first assigned to the nodes one at a time in the order of the data array.

Step 2. Each subsequent patch is then assigned to the node whose assigned patches have the smallest total area (see FIG. 6).

Repeating this processing is assumed to give roughly the result shown in FIG. 7.
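Steps 1 and 2 can be sketched in Python as a greedy assignment. This is a minimal illustration, not the patent's code: the function name and the patch areas used in the usage note are hypothetical, since the actual areas of patches A to K appear only in the figures.

```python
def assign_patches(patch_areas, num_nodes):
    """Assign patches to nodes so that total area per node stays balanced.

    Returns (owner, totals): owner[i] is the node index for patch i and
    totals[k] is the summed area assigned to node k.
    """
    totals = [0.0] * num_nodes
    owner = []
    for index, area in enumerate(patch_areas):
        if index < num_nodes:
            # Step 1: the first patches go to the nodes in array order.
            node = index
        else:
            # Step 2: pick the node with the smallest total area so far
            # (ties go to the lowest node index).
            node = min(range(num_nodes), key=lambda k: totals[k])
        totals[node] += area
        owner.append(node)
    return owner, totals
```

With three nodes and the made-up areas `[4, 2, 3, 1, 5, 2, 2, 3, 1, 4, 2]` for patches A to K, the node totals come out as 11, 10, and 8. The totals are close but not identical, which is consistent with the text describing the result as only roughly balanced.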

[0026] At this point it has been decided which node will process which patch, so the assignment is recorded in the patch data structure that holds the patch information, as shown in Table 2.

[0027]

[Table 2]

[0028] When the processing so far is complete, the result is saved to a file named "data.rad" together with all other information. This "data.rad" file is broadcast from the host processor to all node processors (see FIG. 8). Each node then needs to perform the radiosity computation only for the patches assigned to it.

[0029] 2.3 Parallel form factor computation

First, the host broadcasts to each node the information on the patch with the largest unshot energy, that is, the light source patch (hereinafter, Shoot Patch). Each node then computes form factors only for the patches it is responsible for, based on the Shoot Patch information it received.

[0030] After the form factor computation, each node determines the energy held by the patches it is responsible for. Each node sends the host the total energy of its patches, En, and the maximum energy among its patches, Emaxn. The host adds up all the received En values to obtain the total energy, Etotal (Equation 1).

[0031]

[Equation 1] $E_{\mathrm{total}} = E_1 + E_2 + \cdots + E_n$

[0032] From Etotal, the host can tell how far the energy has decayed. By comparing the Emaxn values, it also obtains the maximum unshot energy value and the ID of the node that will handle the next Shoot Patch. For example (FIG. 9), if Emax3 < Emax1 < Emax2, node 2 handles the next Shoot Patch.
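The host-side bookkeeping described here (summing the per-node energies and picking the node with the largest per-node maximum) can be sketched as follows. The function name and the dictionary layout of the reports are assumptions for illustration; in the actual system these values would arrive as messages from the nodes.

```python
def reduce_on_host(reports):
    """Combine per-node reports on the host.

    reports maps a node ID to a pair (E_n, Emax_n): the total energy of
    that node's patches and the largest energy among them.
    Returns (E_total, largest Emax_n, ID of the node holding it).
    """
    e_total = sum(e_n for e_n, _ in reports.values())
    next_node = max(reports, key=lambda node: reports[node][1])
    return e_total, reports[next_node][1], next_node
```

For instance, with made-up reports `{1: (10.0, 3.0), 2: (12.0, 5.0), 3: (8.0, 1.0)}`, Emax3 < Emax1 < Emax2 holds, so node 2 is selected for the next Shoot Patch and the total energy is 30.0.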

[0033] Next, the node responsible for the Shoot Patch sends the Shoot Patch information to the other nodes, and the nodes not responsible for it receive that information. This is repeated until the unshot energy falls to or below a threshold. The resulting radiosity values are sent to the host processor, which renders the data sent from the node processors and displays the result.

[0034]

[Example] In this experiment, the devised parallel radiosity method is evaluated with three data sets: CornelBox, TestModel01, and TestModel02. For CornelBox, the scene data is very simple, so beyond a certain point the load of transmitting partial form factors, which is needed because the number of nodes has grown, becomes larger than the radiosity processing time, and with 8 or more nodes the overall processing time becomes slower (see FIG. 10).

[0035] FIGS. 11 and 12 suggest that the more complex the scene data, the more shooting iterations are required and the smaller the share taken by communication time, so the speedup improves with the number of nodes. However, the speedup does not grow linearly with the number of nodes.

[0036] FIG. 13 shows the load balance degree, where Tmin is the shortest and Tmax the longest processing time among all nodes. The value is 1 when the load is perfectly balanced and becomes smaller as the load becomes more unbalanced. In the method used in the present invention, the shape faces (patches) are distributed among the nodes, and each node performs the radiosity computation for the part it is responsible for. Because the number of subdivision meshes (elements) varies widely with the location of the patch, the processing time of the nodes becomes unbalanced, as FIG. 13 shows. To address this problem, the initial patches were subdivided by simple quadtree division down to the size best suited to load distribution before the parallel radiosity processing was run. FIG. 14 shows the result.
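The two ideas in this paragraph can be sketched in Python. Both functions are illustrative assumptions: the text does not give the formula for the load balance degree, so the ratio Tmin/Tmax is inferred from the stated behavior (1 when balanced, smaller when unbalanced), and the rectangle representation and `max_area` threshold for the quadtree split are hypothetical.

```python
def load_balance(node_times):
    """Load balance degree, assumed here to be Tmin / Tmax."""
    return min(node_times) / max(node_times)


def quadtree_split(patch, max_area):
    """Split an axis-aligned rectangle into four until pieces are small enough.

    patch is (x, y, width, height); max_area must be positive, otherwise
    the recursion would never stop.
    """
    x, y, w, h = patch
    if w * h <= max_area:
        return [patch]
    hw, hh = w / 2, h / 2
    children = [
        (x, y, hw, hh), (x + hw, y, hw, hh),
        (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh),
    ]
    pieces = []
    for child in children:
        pieces.extend(quadtree_split(child, max_area))
    return pieces
```

Splitting large initial patches this way gives the area-based assignment more, smaller units to distribute, which is one plausible reason it helps even out node processing times.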

[0037] Parallel processing with this method, which is based on the area of the shape faces (patches), gave a speedup of about 8.5 times on 16 nodes. However, because of load distribution problems, the speedup did not increase linearly with the number of nodes.
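To put the reported figure in perspective, the standard definitions of speedup and parallel efficiency can be applied to it. The helper names below are hypothetical, and the efficiency value is derived arithmetic from the stated 8.5 times on 16 nodes, not a number reported in the document.

```python
def speedup(serial_time, parallel_time):
    """Speedup: sequential run time divided by parallel run time."""
    return serial_time / parallel_time


def efficiency(speedup_value, num_nodes):
    """Parallel efficiency: speedup divided by the number of nodes."""
    return speedup_value / num_nodes


# A speedup of about 8.5 on 16 nodes corresponds to roughly 53% efficiency.
reported_efficiency = efficiency(8.5, 16)
```

A linear speedup would correspond to an efficiency of 1.0, so the gap reflects the communication and load imbalance costs discussed above.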

[0038] These results show that, for parallel computation of the radiosity method, distributing the patches so that their area is spread evenly gives a speedup of about 8.5 times over ordinary sequential radiosity computation. However, because the load could not be balanced well, a linear speedup was not obtained.

[0039] An embodiment of the present invention has been described in detail above, but the present invention is in no way limited to that embodiment, and various modifications are possible without departing from the gist of the invention.

[0040]

[Effects of the Invention] For parallel computation of the radiosity method, the present invention uses a technique that distributes patches evenly by area, so efficient load distribution can be achieved and processing is faster than with conventional methods.

[Brief description of the drawings]

FIG. 1 is a schematic configuration diagram of a shared-memory parallel computer.

FIG. 2 is a schematic configuration diagram of a distributed-memory parallel computer.

FIG. 3 is a diagram showing the massively parallel PC cluster system used for the implementation.

FIG. 4 is a diagram showing an example of a shape model.

FIG. 5 is a schematic diagram showing how the first patches are assigned to the nodes.

FIG. 6 is a schematic diagram showing how the second and subsequent patches are assigned to the nodes.

FIG. 7 is a schematic diagram showing the result of assigning the patches to the nodes.

FIG. 8 is a block diagram showing how the held information is broadcast.

FIG. 9 is a block diagram showing an example of parallel processing by three nodes.

FIG. 10 is a diagram showing the experimental results for Cornel Box.

FIG. 11 is a diagram showing the experimental results for Test Model 01.

FIG. 12 is a diagram showing the experimental results for Test Model 02.

FIG. 13 is a diagram showing the evaluation of load balance by number of nodes.

FIG. 14 is a diagram showing the result of load distribution with simple quadtree division added.

Continuation of front page. F-terms (reference): 5B045 AA01 BB17 BB28 BB42 GG02 GG11 5B080 AA13 CA03 GA08

Claims

[Claims]

1. A parallel processing method for radiosity, comprising: a step of defining the surfaces of shape models in a scene as patches; a step of determining the relative magnitudes of the total areas of the patches already assigned to each node of a processor having a plurality of nodes; and a step of assigning the next patch to the node whose already-assigned patches have the smallest total area.

2. A parallel processing apparatus for radiosity, comprising: a processor having a plurality of nodes; means for defining the surfaces of shape models in a scene as patches; means for determining the relative magnitudes of the total areas of the patches already assigned to each of the nodes; and means for assigning the next patch to the node whose already-assigned patches have the smallest total area.