JP5216130B2

JP5216130B2 - A device to accelerate the processing of the extended primitive vertex cache

Info

Publication number: JP5216130B2
Application number: JP2011231190A
Authority: JP
Inventors: Maxim Kazakov (カザコフマキシム)
Original assignee: Digital Media Professionals Inc
Current assignee: Digital Media Professionals Inc
Priority date: 2006-11-01
Filing date: 2011-10-20
Publication date: 2013-06-19
Anticipated expiration: 2027-10-31
Also published as: WO2008053597A1; JPWO2008053597A1; JP2012014744A; JP4913823B2

Description

The present invention relates to the field of three-dimensional computer graphics. More specifically, it relates to a method and system for high-speed hardware processing of complex geometric primitives with many vertices, referred to herein as extended primitives.

Traditionally, interactive computer graphics has approximated complex geometric shapes with simple geometric primitives such as points, lines, and triangles. Existing computer graphics hardware is optimized to accelerate the processing of such simple primitives, triangle meshes in particular. The triangle is the simplest polygon that can be used for a linear approximation of a fragment of a 3D surface. Fairly complex shapes can be approximated with collections of triangles, lines, and points. Existing hardware designs benefit greatly from the fact that the vertices in a stream of primitives can be processed independently of one another.

The need to process complex multi-vertex primitives arises in several areas of 3D computer graphics. In practice, many complex shapes are modeled with higher-order surface primitives such as NURBS patches and subdivision surfaces. In both cases, a coarse control mesh and a set of rules are used to generate a shape that is smooth, finely detailed, and easily deformed by manipulating the control mesh. A patch itself is usually formed by a variable-size set of control-mesh vertices together with additional information that controls the tessellation process. Enabling NURBS or subdivision-surface patch tessellation in hardware would greatly reduce storage and bandwidth requirements and simplify shape manipulation, to the considerable benefit of interactive 3D graphics. Another class of algorithms needs access to a more complex portion of the processed shape than a single triangle of the approximation at a time. Typically, access to a limited neighborhood of vertices around a triangle is needed to detect silhouette edges, to compute curvature at mesh vertices, and so on.

Supporting the processing of complex primitives is difficult in existing 3D graphics accelerator architectures for several reasons. Whereas a simple triangle needs access to only three vertices, the processing of a complex primitive requires fast access to many vertices at once. If the attribute data for the vertices of a complex primitive, such as those used for subdivision surfaces and NURBS patches, is scattered across memory, accessing it incurs considerable data-transfer latency and the processing algorithm may perform poorly. The programmable processing elements of the geometry pipeline in current programmable architectures are optimized to process a single vertex at a time, and primitives are assembled from the vertex stream by fixed-function logic that cannot operate on any primitive type other than the simple ones.

The core architecture of the 3D graphics hardware accelerators available for personal computers and workstations is, in some cases, likewise adapted to the fast processing and representation of lists of simple primitives such as triangles, lines, and points. FIG. 1 shows part of a typical 3D graphics hardware accelerator pipeline. A list of primitives is described by an index buffer 1100 and a vertex buffer 1200, which are usually stored in host machine memory 1000. The contents of the index and vertex buffers are illustrated in FIGS. 2A and 2B. In FIG. 2A, a set of triangles 101, some of which may share vertices, is represented by vertex data packed contiguously in the vertex buffer 1200 and by the index buffer 1100. The contents of the index buffer reference vertices in the vertex buffer. For the triangle set 101, three positions in the index buffer 1100 define a triangle, the next three positions define the next triangle in set 101, and so on. If most of the vertices in the vertex buffer 1200 are reused (reused vertices are marked with a dotted fill pattern), this representation of a triangle set can be quite compact, because vertex data usually requires far more storage than the indices in the index buffer. Referring to FIG. 2B, the representation is even more concise for a triangle strip 102. In this case, each new index forms a triangle together with the two vertices referenced just before it in the index buffer, so only one index is needed to define each further triangle after the first has been processed. The index buffer therefore describes the triangle set more efficiently in terms of the number of indices per triangle. A set of line segments, each defined by two vertices, and a set of points, each defined by one vertex, can be described with index and vertex buffers in the same way.
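The two index buffer layouts of FIGS. 2A and 2B can be sketched as follows. This is only an illustration in Python, not a description of the hardware: the function names are hypothetical, indices are plain integers into a vertex buffer, and the winding-order alternation that real strip formats usually apply is ignored.

```python
def triangles_from_list(indices):
    """Triangle set (FIG. 2A style): every three indices define one triangle."""
    if len(indices) % 3 != 0:
        raise ValueError("a triangle list needs a multiple of 3 indices")
    return [tuple(indices[i:i + 3]) for i in range(0, len(indices), 3)]


def triangles_from_strip(indices):
    """Triangle strip (FIG. 2B style): each index after the first two
    forms a triangle with the two indices that precede it."""
    return [tuple(indices[i:i + 3]) for i in range(len(indices) - 2)]


if __name__ == "__main__":
    # Two triangles sharing an edge: 6 indices as a list, 4 as a strip.
    print(triangles_from_list([0, 1, 2, 1, 3, 2]))
    print(triangles_from_strip([0, 1, 2, 3]))
```

In both layouts a shared vertex is stored once in the vertex buffer and only its index is repeated.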

A vertex cache is used to accelerate the processing of sets of primitives defined with index and vertex buffers. As FIG. 1 shows, the contents of the index buffer 1100 are used not only to fetch from the vertex buffer 1200 but also to detect whether a vertex with the same index was processed recently and may therefore still be available in the cache. The vertex cache controller 2000 fetches the contents of the index buffer 1100 and analyzes them. Because the cache is initially empty, the vertex cache controller initiates the transfer, into the primary vertex cache 3000, of vertex buffer contents containing the vertex data for the indices fetched from the index buffer 1100. To minimize memory access latency penalties, data is usually fetched from the vertex buffer in relatively large contiguous memory blocks, which may contain vertex data that does not correspond to the indices the vertex cache controller is currently processing. Such data is nevertheless retained in the primary vertex cache 3000 because it may be used by indices examined later. If the vertex data for an index is already in the primary vertex cache 3000, no host memory access is performed.
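The block-fetch behavior described above can be modeled in a few lines. The sketch below is a simplification under stated assumptions: the function name, the block size of four vertices, and the unbounded cache are all hypothetical choices made for illustration, and "host access" simply counts how many contiguous blocks would be read from host memory.

```python
def fetch_through_cache(index_buffer, vertex_buffer, block_size=4):
    """Return the vertex data for each index and the number of host reads.

    On a miss, the whole aligned block containing the vertex is loaded,
    so neighboring vertices are already cached when later indices ask
    for them. The cache here never evicts (an assumption).
    """
    cache = {}
    host_accesses = 0
    fetched = []
    for index in index_buffer:
        if index not in cache:
            host_accesses += 1
            start = (index // block_size) * block_size
            end = min(start + block_size, len(vertex_buffer))
            for j in range(start, end):
                cache[j] = vertex_buffer[j]
        fetched.append(cache[index])
    return fetched, host_accesses


if __name__ == "__main__":
    vertices = list("abcdefgh")  # stand-ins for vertex attribute records
    data, reads = fetch_through_cache([0, 1, 2, 1, 3, 2, 5], vertices)
    print(data, reads)
```

With these inputs only the first index and the index 5 cause a host read; every other lookup is served from the cached block.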

The vertex data sent to the primary vertex cache 3000 usually needs to be transformed. For example, vertex positions may need to be transformed from one coordinate system to another, and vertex colors are computed from normals, positions, and so on. The vertex cache controller 2000 controls the delivery of vertex data from the primary vertex cache to the vertex processing unit 4000, and the transformed vertex data is delivered to the secondary vertex cache 5000. From the secondary vertex cache 5000, the transformed vertex data reaches the fixed primitive assembly 6000, which forms primitives of a fixed set of types, such as triangles, lines, and points, from the vertex stream and passes the assembled primitives to the rest of the fixed processing pipeline, including triangle setup 7000 and rasterizer 8000. If the vertex for an index is already present in the secondary vertex cache 5000, the transformed vertex data is delivered to primitive assembly 6000 without any further overhead.

Because hardware-accelerated processing of extended primitives can bring substantial benefits, particularly for modeling and rendering complex shapes, considerable effort has gone into designing hardware that can process such primitives on chip, including on-chip subdivision-surface and NURBS-surface tessellation. These efforts, however, have either introduced entirely different hardware whose architecture is dedicated to the particular complex-primitive processing task, or attempted to use the programmable elements of existing architectures. Conventional architectures still support only a fixed set of simple primitives and lack any facility for generating primitives.

WO 03/081528 (Patent Document 1) discloses a dedicated hardware unit that processes a special description of the control mesh of a subdivision surface. It offers no possibility of reusing the hardware for any other purpose or of sharing the unit with simple-primitive processing, and it requires the application to prepare a complex structure that describes the connectivity of the control mesh in a form suited to hardware processing. A similar approach is disclosed in US Patent Application Publication 2005/0012750 (Patent Document 2), which implements hardware-accelerated processing of NURBS patches.

With the introduction of programmable 3D acceleration hardware, several attempts have been made to use it for processing complex primitives. No efficient implementation has been achieved on commercially available hardware, however, because processing is still limited to simple primitives and there is no facility for generating primitives.

One of the problems in processing extended primitives, particularly for subdivision-surface tessellation, is giving the processing algorithm random access to the vertex data it needs. Random memory access is expensive on existing hardware and is possible at only a few points in the processing pipeline. The most versatile capability in existing hardware is texture sampling, which on programmable hardware can be controlled by vertex or fragment programs. To exploit this form of random memory access, implementations fetch mesh connectivity information that has been packed into textures. Dedicated tessellation hardware uses special mesh representations that allow it to fetch the information the tessellation algorithm needs, at the cost of high preparation and maintenance overhead. The other unit in conventional architectures that can perform random memory access is the vertex cache.

Patent Document 1: WO 03/081528 pamphlet
Patent Document 2: US Patent Application Publication 2005/0012750

The present invention aims to solve the problems of on-chip processing of extended geometric primitives, such as subdivision surface patches, NURBS patches, and triangles with adjacent triangles, that serve as input to algorithms used in 3D computer graphics. Such algorithms include various schemes implemented in known computer graphics systems: subdivision schemes such as Catmull-Clark, Loop, and 4-3 subdivision, NURBS surface tessellation, silhouette detection, and other algorithms that require more vertices than the three that form a triangle, the simplest planar geometric primitive.

Processing extended geometric primitives entirely on chip makes it possible to render complex geometric shapes quickly. It also leads to more realistic computer-generated images and to realizing extended primitives directly with simple primitives such as triangles.

The size of a primitive in current 3D computer graphics is fixed: three vertices for a triangle, two for a line, and one for a point. An extended primitive, by contrast, especially one used for subdividing a surface patch, can be of arbitrary size. Supporting extended primitives on chip therefore requires handling primitives with varying numbers of vertices, up to some reasonable maximum.

The next problem is the difficulty of achieving the random memory access needed when processing extended primitives. As with simple primitives, the vertex data that makes up a primitive may be scattered in memory. The difficulty is that an extended primitive has several times more vertices than usual, so memory must be accessed randomly to fetch correspondingly more vertex data. Random access in graphics hardware causes severe performance penalties unless it is properly cached, so it is restricted and is normally available only in the units associated with the vertex cache and with texture sampling.

A further problem is chip complexity. If the hardware handles only simple geometric primitives such as triangles efficiently, the gate count cannot be reduced and the chip design becomes complicated. Providing a logic circuit for rendering extended primitives separately from the processing circuit for simple primitives likewise complicates the chip. If simple and extended primitives can be processed by a single processing path rather than by separate circuits, the problem of chip complexity is resolved.

There is also the problem of how to represent extended primitives. To unify processing and the API (application programming interface) for simple and extended primitives, so that the representation can be used broadly across the various types of extended primitive, it is desirable to reduce storage requirements and data transfer bandwidth just as for simple primitives. The following describes how the present invention solves these problems.

The present invention basically assigns an index to each vertex, stores the indices in an index buffer, and expresses the vertices of a primitive by those stored indices. The vertex information associated with each index is then read from the vertex buffer and used in primitive processing. Ordinary primitives have fixed shapes such as triangles and quadrilaterals. The present invention uses these, but also uses primitives with four or more vertices, more than usual. Such primitives, whose vertex count is variable and larger than usual, are called variable-size extended primitives.

When vertex information for a primitive is input, the invention determines whether the primitive is an ordinary primitive or a variable-size extended primitive. This is easily determined from the number of vertices in the primitive. An ordinary primitive may be processed in the usual way by the usual hardware. For a variable-size extended primitive, the indices of the central polygon are read first, followed by the indices of the polygons adjacent to the central triangle or quadrilateral. The vertex data associated with those indices is then fetched and the primitive is assembled.

The system of the present invention comprises a vertex engine, which receives various data from a computer and transforms vertex information, and a primitive engine, which receives the transformed vertex information from the vertex engine and assembles primitives. Primitives assembled by the primitive engine are rasterized by a rasterizer, stored in a frame buffer or the like as 3D computer graphics, and displayed on a monitor or the like. Vertex transformation means operations such as view transformation applied to vertices. The required operations are performed on the vertex information regardless of whether the primitive is a point, a line, a triangle, or a polygon with four or more sides. Primitive assembly means assembling the individually transformed vertices into primitives. After primitive assembly, processing is performed per primitive.

In the present invention, primitives may have variable size. When the primitive size is 3, as for triangles, the index table is simply accessed in steps of 3. The hardware only needs to remember the number 3, and the size need not be written in the data. Because the primitive size can vary from primitive to primitive in the present invention, however, it becomes hard to tell where one primitive ends and the next begins. It is therefore preferable to record the data so that it is always known how many of the following elements make up the primitive. In the present invention, the primitive size is written in the index table. When there are multiple index tables, which is a preferred aspect of the invention, the primitive size is preferably written in the index table that contains the position attribute.
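Writing the primitive size into the index table lets a reader find primitive boundaries without any other information. The following sketch assumes one possible layout, in which each primitive is stored as its size followed by that many vertex indices; the function name and the exact layout are illustrative assumptions, not the encoding fixed by the patent.

```python
def split_primitives(index_buffer):
    """Split a flat index buffer of the assumed form
    [size, i0, i1, ..., size, j0, j1, ...] into per-primitive index lists."""
    primitives = []
    pos = 0
    while pos < len(index_buffer):
        size = index_buffer[pos]
        if size < 1:
            raise ValueError(f"invalid primitive size {size} at position {pos}")
        body = index_buffer[pos + 1:pos + 1 + size]
        if len(body) != size:
            raise ValueError("index buffer ends inside a primitive")
        primitives.append(body)
        pos += 1 + size  # skip the size entry and the indices it announces
    return primitives


if __name__ == "__main__":
    # A 3-vertex primitive followed by a 5-vertex extended primitive.
    print(split_primitives([3, 0, 1, 2, 5, 2, 1, 4, 6, 7]))
```

For fixed-size primitives such as triangles the size entry would be unnecessary, which matches the remark above that the hardware only has to remember the number 3.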

To draw more objects, it is preferable to reduce the number of vertices that must be described. Consider, for example, drawing two adjacent triangles by describing the vertices of each triangle primitive separately. Let the first triangle consist of the vertices (v_1, v_2, v_3), where v_m = (x_m, y_m, z_m), colored red, white, and yellow respectively. The second triangle, adjacent to the first, shares v_2 and v_3 and can therefore be described as (v_2, v_3, v_4), where v_4 is blue. Although there are really only four vertices, six vertex descriptions are required. This method is simple, but drawing n triangles requires 3 × n vertex descriptions, and so many descriptions consume a large amount of memory.

Next, consider triangle strips. In this method, after the vertices of the first primitive have been described, the description of the next primitive reuses the shared vertices. For triangles, each new triangle is drawn on the condition that it uses two vertices of the previous triangle. That is, the vertices of the previously described triangle other than its first vertex become two vertices of the next triangle, which is completed by one new vertex. This method reduces memory cost considerably. Nevertheless, 2 × n or more vertices must still be described to draw n triangles.
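The two-triangle example can be checked with a short script. The coordinates below are made up for illustration (the text gives only the colors), and the helper name is hypothetical; the point is just to count how many vertex descriptions each method writes out.

```python
# Hypothetical vertex records: (position, color). Coordinates are invented.
v1 = ((0.0, 0.0, 0.0), "red")
v2 = ((1.0, 0.0, 0.0), "white")
v3 = ((0.0, 1.0, 0.0), "yellow")
v4 = ((1.0, 1.0, 0.0), "blue")


def independent_from_strip(strip):
    """Expand a strip into the per-triangle description, in which every
    triangle lists all three of its vertices again."""
    described = []
    for i in range(len(strip) - 2):
        described.extend(strip[i:i + 3])
    return described


if __name__ == "__main__":
    strip = [v1, v2, v3, v4]
    independent = independent_from_strip(strip)
    print(len(independent))       # descriptions written per triangle
    print(len(strip))             # descriptions written as a strip
    print(len(set(independent)))  # distinct vertices actually present
```

For these two triangles the per-triangle description writes six vertex records although only four distinct vertices exist, which is the redundancy the paragraph above points out.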

In the present invention, an index table is used to express (v_1, v_2, v_3), where v_m = (x_m, y_m, z_m). The index table stores not coordinate values but simply references such as (v_1, v_2, v_3), and these are used to read out each vertex's information (for example, that v_1 is (x_1, y_1, z_1) and is red). Indices are stored in the index table so that v_1, v_2, v_3 form one triangle and v_2, v_3, v_4 form the next. Because the index table allows vertex data to be read from the vertex buffer of the vertex engine, each vertex needs to be described only once, and the volume of vertex description is small. On the other hand, the indices themselves must be described. The width of the index table depends on the primitive and is 3 for triangles. With this method, because one vertex is shared by several triangles, the information for that vertex has to be delivered repeatedly.

There is normally a single index table, which is sufficient in most situations. The OpenGL ES interface, in fact, has only one index table. Some interfaces, however, have an independent index table for each vertex attribute, such as XYZ coordinates (position attribute), color (vertex color attribute), and texture coordinates (texture attribute). Direct3D is a well-known example of such an interface.
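The difference between a single index table and one index table per attribute can be sketched as below. All names and data are hypothetical, and the code models only the lookup, not any particular graphics API.

```python
# Attribute arrays for four vertices (values invented for illustration).
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
colors = ["red", "white", "yellow", "blue"]

# One row per triangle; the table width is 3 because the primitive is a triangle.
index_table = [(0, 1, 2), (1, 2, 3)]


def resolve_single(table, positions, colors):
    """Single index table: one index selects every attribute of a vertex."""
    return [[(positions[i], colors[i]) for i in row] for row in table]


def resolve_per_attribute(position_table, color_table, positions, colors):
    """Separate tables: position and color are indexed independently."""
    return [
        [(positions[p], colors[c]) for p, c in zip(prow, crow)]
        for prow, crow in zip(position_table, color_table)
    ]


if __name__ == "__main__":
    print(resolve_single(index_table, positions, colors))
    # Same geometry, but each triangle uses one flat color.
    flat_colors = [(0, 0, 0), (3, 3, 3)]
    print(resolve_per_attribute(index_table, flat_colors, positions, colors))
```

Separate tables allow a position to be paired with different colors in different triangles without duplicating the position data, at the price of storing more indices.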

The gist of the present invention is the use of the vertex cache facilities of a suitable graphics accelerator to enable the processing of extended geometric primitives. This is achieved by combining a primitive engine, a unit added relative to the prior art and used to assemble and process the extended primitives proposed in the present invention, with an extension of the index/vertex buffer representation of polygon meshes that is used to represent extended primitives.

In the present invention, extended primitives are represented on the basis of an extension of the index/vertex buffer representation of polygon meshes. Current 3D graphics libraries such as Direct3D and OpenGL provide this representation as a way of processing polygon meshes quickly in hardware. In this representation, the vertex buffer stores the vertex attributes of each vertex of the polygon mesh being processed, while the index buffer holds the mesh connectivity information. The index buffer describes a sequence of polygons of the same size, each given by a run of indices that point into the vertex buffer. The run of indices belonging to a polygon describes the polygon's vertex sequence, by reference to the vertex data in the vertex buffer, and hence the sequence of polygon edges formed by connecting consecutive vertices. In the present invention, an index buffer of the same kind can be used to describe a sequence of extended primitives of a fixed size. To represent extended primitives of varying sizes, an index buffer is used in which the primitive size is stored as an index value prefixed to the other indices, which reference the vertices forming the variable-size extended primitive.
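As a rough illustration of the size-prefixed layout, the sketch below packs a list of variable-size primitives into a flat index buffer. The function name is hypothetical and the layout (size first, then the vertex indices) is one reading of the description, offered as an assumption.

```python
def pack_primitives(primitives):
    """Flatten primitives into [size, indices..., size, indices..., ...].

    Each primitive is a sequence of vertex-buffer indices. The size entry
    stored ahead of the indices tells a reader how many entries follow.
    """
    index_buffer = []
    for prim in primitives:
        if len(prim) == 0:
            raise ValueError("a primitive needs at least one vertex index")
        index_buffer.append(len(prim))
        index_buffer.extend(prim)
    return index_buffer


if __name__ == "__main__":
    # A triangle and a five-vertex extended primitive sharing vertices 1 and 2.
    print(pack_primitives([[0, 1, 2], [2, 1, 4, 6, 7]]))
```

Shared vertices such as 1 and 2 appear in the buffer only as repeated indices; their attribute data stays in the vertex buffer once.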

The problems described above can be solved by extending the index buffer and vertex buffer in this way to handle extended primitives. To represent fixed-size extended primitives of different sizes and variable-size extended primitives, it is sufficient to specify the number of vertices, or the number of additional vertices, in the vertex list given through the indices stored in the index buffer alone. The size, the vertex list, and the formation of a particular type of primitive are handled by the primitive-forming algorithm itself, so the representation does not depend on the primitive size. This is the same as the representation of simple primitives. From the point of view of the graphics library, the representation is the same as for simple primitives, so processing a sequence of extended primitives requires no major API change compared with processing a sequence of simple primitives, and memory requirements are reduced.

This representation reduces memory cost when vertices are shared between primitives, because each shared vertex is stored in the vertex buffer only once. As with simple primitives, referencing vertices by index requires far less memory.

To process extended primitives quickly in hardware, the present invention combines a vertex cache with a primitive engine. The vertex cache provides low-latency access to recently processed vertices. With the index/vertex buffer representation, a vertex's index can serve as the cache tag: if the index of a requested vertex matches one held in the cache, the cached data are used for further processing. The present invention preferably uses the vertex cache for processing extended primitives as well.
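The tag-matching behaviour described above can be sketched in software. The following Python model is only an illustration under stated assumptions: the class name `VertexCache`, the FIFO replacement policy, the capacity, and the `transform` callback are all hypothetical and are not taken from this specification, which describes a hardware unit. The index sequence used is the one this description later quotes for two separate TWN primitives.

```python
from collections import OrderedDict


class VertexCache:
    """Software model of a vertex cache tagged by vertex index (illustrative only)."""

    def __init__(self, capacity, transform):
        self.capacity = capacity        # assumed cache size, in vertices
        self.transform = transform      # hypothetical per-vertex processing function
        self.entries = OrderedDict()    # tag (vertex index) -> processed vertex
        self.hits = 0
        self.misses = 0

    def fetch(self, index, vertex_buffer):
        if index in self.entries:       # tag match: reuse the processed vertex
            self.hits += 1
            return self.entries[index]
        self.misses += 1
        processed = self.transform(vertex_buffer[index])
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # assumed FIFO policy: evict the oldest entry
        self.entries[index] = processed
        return processed


vertex_buffer = [(float(i), 0.0, 0.0) for i in range(8)]
cache = VertexCache(capacity=8, transform=lambda v: tuple(2.0 * c for c in v))

# Index sequence of two separate six-vertex primitives that share four vertices.
for idx in [2, 1, 0, 3, 4, 5, 3, 1, 2, 5, 6, 7]:
    cache.fetch(idx, vertex_buffer)

print(cache.hits, cache.misses)
```

In this run the four shared vertices of the second primitive are served from the cache, so only eight of the twelve references require fetching and processing vertex data.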

Because simple and extended primitives are represented similarly, extended primitives can use the vertex cache for low-latency access to recently processed vertices just as simple primitives do. This removes the performance degradation caused by the many random accesses otherwise needed to fetch the vertex data of an extended primitive. Because the same vertex cache hardware serves both simple and extended primitives, hardware size is reduced. Because the vertex cache is used, there is no fixed upper limit on the maximum size of an extended primitive, and the cache serves the vertices of fixed-size and variable-size extended primitives alike. Processing is efficient, however, when a reasonable upper limit is placed on the maximum size of the extended primitives to be processed, so that the vertex data of the primitive being processed reside in the vertex cache. This solves the problem of processing fixed-size or variable-size extended primitives that are larger than simple primitives.

In the present invention, extended primitives are assembled and processed by a primitive engine, a module added to conventional hardware. This module receives as input the transformed vertex data for each vertex of an extended primitive from the vertex cache and, for variable-size extended primitives, additionally receives size information from the cache controller.

The primitive engine executes the processing algorithm for extended primitives. The algorithm interprets the vertex sequence of an extended primitive and outputs the result as a sequence of simple primitives. This addresses the following problems. The first is versatility in processing extended primitives: if the primitive engine is programmable, any extended primitive that can be expressed in the extended index/vertex buffer representation proposed by the present invention can be processed, with fast access to its vertex data. Because the engine is connected directly to the vertex cache, the latency of obtaining vertex data to assemble and process extended primitives is greatly reduced. Reusing the processing pipeline and processing extended primitives on chip also contribute to this. Because the result of processing an extended primitive is a sequence of simple primitives, which the rest of the processing pipeline supports directly, extended primitive sequences are converted directly into simple primitive sequences without returning results to the host computer, so processing can be performed entirely on chip.

Combining the extended index/vertex buffer representation, minor modifications to the vertex cache logic used for simple primitives, and the primitive engine makes it possible to process fixed-size and variable-size extended primitives. It also enables on-chip processing, such as NURBS tessellation and subdivision surface rendering, that is not available in current 3D graphics hardware.

FIG. 1 shows the conventional hardware architecture of a vertex cache device.
FIG. 2A shows the index/vertex buffer layout for a sample sequence of triangles.
FIG. 2B shows the index/vertex buffer layout for a sample triangle strip.
FIG. 3 shows the vertex cache architecture of the present invention.
FIG. 4A shows a fixed-size extended primitive consisting of a triangle and its neighborhood.
FIG. 4B shows a strip of fixed-size extended primitives, each consisting of a triangle and its neighborhood.
FIG. 4C shows the index/vertex buffer layout for fixed-size extended primitives consisting of a triangle and its neighborhood.
FIG. 4D shows the index/vertex buffer layout for a strip of fixed-size extended primitives consisting of a triangle and its neighborhood.
FIG. 5A shows a fan of fixed-size extended primitives, each consisting of a triangle and its neighborhood.
FIG. 5B shows the silhouette structure of an edge-based flap.
FIG. 5C shows the index/vertex buffer layout for a fan of fixed-size extended primitives consisting of a triangle and its neighborhood.
FIG. 6A shows a variable-size extended primitive.
FIG. 6B shows a Catmull-Clark subdivision patch as a variable-size extended primitive.
FIG. 6C shows the index/vertex buffer layout for a Catmull-Clark subdivision patch as a variable-size extended primitive.
FIG. 7 shows the communication paths among the vertex cache controller, the primitive engine, and the fixed-size primitive integrated circuit unit introduced by the present invention.
FIG. 8A shows a rendering result without silhouette detection and visualization.
FIG. 8B shows a rendering result with silhouette detection and visualization.
FIG. 9A shows a wireframe rendering of a shape without subdivision.
FIG. 9B shows a wireframe rendering of a shape with subdivision.
FIG. 10A shows a wireframe rendering without subdivision; the box marks a coarse element of the mesh.
FIG. 10B shows a wireframe rendering with subdivision; the box marks the coarse mesh element after smoothing during subdivision.
FIG. 11A shows a rendered image without subdivision.
FIG. 11B shows a rendered image with subdivision.

101 Simple triangle sequence
102 Simple triangle strip
103 Simple triangle with adjacent primitives
104 Simple triangle strip with adjacent primitives
105 Simple triangle fan with adjacent primitives
106 Catmull-Clark subdivision surface patches
107 Mosaic of Catmull-Clark subdivision surface patches
110-115 Triangles of the mesh fragment
210 Cache destination multiplexer
220 Fixed-size primitive integrated circuit unit source multiplexer
1000 Host memory
1100 Index buffer
1200 Vertex buffer
2000 Vertex cache controller
2100 Vertex counter
2300 Signaling path carrying the vertex-delivery-complete signal between the vertex cache controller and the primitive engine
3000 Vertex cache storage
4000 Vertex processing unit
5000 Second vertex cache storage
6000 Fixed-size primitive integrated circuit unit
7000 Fixed-size primitive setup unit
8000 Primitive rasterizer
9000 Primitive engine
9100 Primitive size register

Embodiments of the present invention are described below. The present invention basically relates to on-chip processing of complex geometric primitives (also called extended primitives) formed by a fixed or variable number of vertices.

A first aspect of the present invention relates to a method for 3D computer graphics that describes simple primitives and also describes a sequence of variable-size extended primitives using four or more vertices per variable-size extended primitive. The method comprises: a vertex buffer specification step, which uses a vertex buffer capable of storing a group of vertices having multiple attributes, in which the memory location of an attribute is obtained by multiplying the index (the vertex number in the vertex sequence) by an integer and biasing the result by a number indicating the attribute type, so that the memory location of a given vertex's attribute in the vertex buffer is determined from the index and the attribute type number; an index buffer specification step, which specifies indices in an index buffer by storing the sequence of vertex position attribute indices corresponding to the vertex sequence and by storing the size of each variable-size extended primitive as an index, so that a fixed-size primitive sequence can be reconstructed from the vertex sequence and a variable-size extended primitive sequence can be reconstructed by additionally using the primitive sizes; and a step of specifying further index buffers for all remaining vertex attributes, in the same way as for the vertex position attribute except that no primitive-size index is used, even for variable-size extended primitives. In this specification, an extended primitive may contain four or more vertices, for example 4, 5, 6, 7, 8, 9 or 10.

Because the method includes the step of specifying an index buffer that stores the index sequence corresponding to the vertex sequence for the vertex position attribute, and that stores the size of each variable-size extended primitive as an index value, a sequence of geometric primitives of any fixed size can be reconstructed from the vertex sequence that composes it. Furthermore, because the primitive size can be placed before the indices that reference the vertex position attributes of a variable-size extended primitive, the method can describe variable-size extended primitives, each of which is reconstructed from its size and from the vertex sequence of that size. Because the method may also include the step of specifying multiple index buffers for all remaining vertex attributes, it can specify every required vertex attribute for the vertices of an extended primitive, including neighboring vertices, in the case where each attribute is addressed by its own index and therefore needs a separate index buffer.

The steps for specifying extended primitive sequences introduced by the present invention have the following advantages. They can specify various types of extended primitives, namely any type that can be reconstructed from a fixed-size or variable-size vertex sequence. The representation is compact, because vertices shared among the primitives of a sequence are referenced through compact indices. Because the method extends the index buffer/vertex buffer representation that 3D graphics libraries use to specify triangle and quadrilateral meshes, the library APIs need only slight modification for applications to prepare and duplicate extended primitive sequences, compared with sequences of simple primitives such as triangles, quadrilaterals, lines and points. Each step is described in detail below.

The method extends the index/vertex buffer representation of simple primitive sequences so that it can describe sequences of fixed-size or variable-size extended primitives. In this specification, a simple primitive is a basic shape used in computer graphics, such as a triangle, quadrilateral, line or point. Processing sequences of such primitives is normally the main concern of 3D graphics libraries such as OpenGL and Direct3D. In this specification, an extended primitive is a geometric primitive formed by a fixed or variable number of vertices, with four or more vertices.

The index buffer/vertex buffer representation of simple primitives represents a sequence of simple primitives as the sequence of vertices that make up those primitives. A vertex buffer is storage for vertex attributes used in computer graphics. In this specification, a vertex attribute is an attribute associated with a point in four-dimensional homogeneous space that is used as a vertex of a primitive. Preferred examples of vertex attributes include position in space, color, texture coordinates, normal vector and tangent vector. An attribute may have various dimensions (scalar, two-component vector, three-component vector, and so on) and various value types (1-byte integer, 2-byte integer, 4-byte integer, 4-byte floating point, and so on). Each attribute instance requires a fixed amount of storage determined by its dimension and value type. Specifying a vertex buffer involves laying out the attribute sequences in memory so that values of the same attribute type are placed in the same way. The location of an attribute in the vertex buffer can therefore be recovered easily from its position in the sequence for its attribute type and from the layout. An attribute value can thus be identified by its integer position, or index, in that sequence. If all attribute sequences have the same length and the attribute indices of the vertices simply increase with the vertex number in the vertex sequence, the vertex sequence, and hence a sequence of simple primitives, can be specified without an index buffer.
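The address calculation described above (an index scaled by an integer, then biased according to the attribute type) can be illustrated with a small Python sketch. The interleaved layout, the attribute names, the 28-byte stride and the helper names `attribute_address`, `read_attribute` and `pack_vertex` are assumptions made for this example only; the specification does not prescribe a particular layout.

```python
import struct

# Hypothetical interleaved layout: attribute name -> (byte offset, struct format).
ATTRIBUTES = {
    "position": (0, "<3f"),   # three 4-byte floats
    "normal": (12, "<3f"),    # three 4-byte floats
    "color": (24, "<4B"),     # four 1-byte integers
}
STRIDE = 28  # bytes per vertex in this assumed layout


def attribute_address(index, attribute):
    """Byte address = index * stride (scaling) + per-attribute offset (bias)."""
    offset, _ = ATTRIBUTES[attribute]
    return index * STRIDE + offset


def read_attribute(buffer, index, attribute):
    """Decode one attribute of one vertex from the raw vertex buffer."""
    _, fmt = ATTRIBUTES[attribute]
    return struct.unpack_from(fmt, buffer, attribute_address(index, attribute))


def pack_vertex(position, normal, color):
    """Serialize one vertex in the assumed interleaved layout."""
    return struct.pack("<3f3f4B", *position, *normal, *color)


vertex_buffer = b"".join(
    pack_vertex((float(i), i + 0.5, 0.0), (0.0, 0.0, 1.0), (255, 128, 0, 255))
    for i in range(4)
)

print(attribute_address(2, "normal"))
print(read_attribute(vertex_buffer, 2, "position"))
```

Because every vertex occupies the same number of bytes, a single integer index is enough to locate any attribute of any vertex, which is what allows the index to double as a cache tag.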

In this specification, that case is called the "vertex buffer-only representation". The term "index buffer" means an array in memory containing a sequence of integer values that specify vertex attribute indices. If each vertex in the vertex sequence has a single index shared by all of its attribute types, one index buffer is sufficient to describe the vertex sequence completely for all vertex attributes. Conversely, each vertex attribute type may have its own index for a vertex in the vertex sequence; in that case as many index buffers must be specified as there are attribute types. The vertex buffer, together with an index buffer containing all of the required index arrays, forms the index/vertex buffer representation of a sequence of simple primitives.

As described above, the vertex buffer specification step specifies the vertex attributes associated with each vertex, in the same way as in the index buffer and/or vertex buffer representation of a simple primitive sequence.

The index buffer specification step specifies the vertex position attribute indices of the vertices that form the extended primitives. For variable-size extended primitives, this step includes specifying the primitive size as an index value in the index buffer used for vertex positions. When variable-size primitives are used, this step may specify an extended primitive sequence by giving, for each primitive, its size together with the vertex sequence that makes it up. The present invention applies to any extended primitive for which the primitive size, the vertex sequence forming the primitive, and the values specified for that vertex sequence supply all of the necessary information.
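As a minimal sketch of how such an index buffer could be decoded, the Python functions below split a flat index list into primitives, first for the fixed-size case and then for the variable-size case in which each primitive's size is stored as a prefix. The function names `split_fixed` and `split_variable` and the sample index values are hypothetical; in the invention this decoding is performed by the vertex cache controller and primitive engine hardware.

```python
def split_fixed(indices, size):
    """Split a flat index list into primitives of one fixed size."""
    if len(indices) % size != 0:
        raise ValueError("index count is not a multiple of the primitive size")
    return [indices[i:i + size] for i in range(0, len(indices), size)]


def split_variable(indices):
    """Split a flat index list in which each primitive is stored as
    [size, index_0, ..., index_{size-1}]."""
    primitives = []
    pos = 0
    while pos < len(indices):
        size = indices[pos]                     # size prefix, stored as an index value
        body = indices[pos + 1:pos + 1 + size]  # vertex position indices
        if len(body) != size:
            raise ValueError("index buffer ends inside a primitive")
        primitives.append(body)
        pos += 1 + size
    return primitives


# Two six-vertex fixed-size primitives.
print(split_fixed([2, 1, 0, 3, 4, 5, 3, 1, 2, 5, 6, 7], 6))

# One four-vertex and one five-vertex primitive, each preceded by its size.
print(split_variable([4, 0, 1, 2, 3, 5, 3, 2, 4, 5, 6]))
```

Index buffers for the remaining attributes would use the same vertex order but, as stated in the first aspect, would not carry the size prefix.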

In the present invention, an example of a fixed-size extended primitive is the "Triangle with Neighborhood" (TWN) primitive. Such a primitive can be formed for each triangle of a mesh by taking the triangle together with the three triangles adjacent to its edges, where the mesh forms a closed surface in 3D space (that is, its triangles are joined to one another by shared edges).

This is explained with reference to FIG. 4A, a conceptual diagram of a triangle mesh that shows mesh fragment 103.
For the triangle formed by vertices {v₁, v₂, v₃}, the TWN primitive consists of that triangle together with the triangles {v₀, v₁, v₂}, {v₁, v₅, v₃} and {v₂, v₃, v₄} adjacent to its edges {v₂, v₁}, {v₁, v₃} and {v₃, v₂}. In this specification the triangle {v₁, v₂, v₃} is also called the "center" triangle. A TWN primitive therefore requires a sequence of six vertices in total. A preferred vertex order for the TWN primitive, that is, a mapping between positions in the vertex sequence and the connectivity of the vertices within the primitive, is as follows. For a center triangle formed by the vertex sequence {v₀, v₁, v₂}, let v₀₁, v₁₂ and v₂₀ be the vertices corresponding to edges {v₀, v₁}, {v₁, v₂} and {v₂, v₀}, that is, the vertices other than v₀, v₁ and v₂ of the adjacent triangles that share those edges with the center triangle. The vertex sequence representing the TWN primitive is then {v₀, v₁, v₀₁, v₂, v₂₀, v₁₂}. For the triangle {v₂, v₁, v₃} shown in FIG. 4A, the vertex sequence representing the TWN primitive is {v₂, v₁, v₀, v₃, v₄, v₅}. Writing vⱼ for the vertex whose position attribute index is j, the vertex position index sequence of the TWN primitive with center triangle {v₂, v₁, v₃} is {2,1,0,3,4,5}. Even if the original mesh has open edges, that is, edges with only one adjacent triangle, TWN primitives can still be generated for all of its triangles provided that no edge in the mesh has more than two adjacent triangles. For an edge with only one adjacent triangle, the missing triangle can be supplied artificially. One way is to create a degenerate triangle by using a vertex of the open edge twice. Another way is to use the vertex of the center triangle opposite the open edge, in which case the artificial triangle is not degenerate and coincides with the center triangle.

A sequence of fixed-size extended primitives can be formed by concatenating the vertex sequences of the individual primitives. The index buffer for the vertex position attribute is then the sequence of position attribute indices of the vertices in the concatenated sequence. For TWN primitives there are three preferred ways to represent a primitive sequence: as a sequence of separate TWN primitives, as a TWN strip, and as a TWN fan. These are based on a sequence of separate center triangles, a strip of center triangles, and a fan of center triangles, respectively.

A sequence of separate TWN primitives is built by taking each triangle as a center triangle, generating its TWN primitive, and concatenating the vertices of the generated primitives. FIG. 4A shows a mesh fragment from which the TWN primitives with center triangles {v₂, v₁, v₃} and {v₃, v₁, v₅} are formed. The corresponding vertex sequence is {v₂, v₁, v₀, v₃, v₄, v₅, v₃, v₁, v₂, v₅, v₆, v₇}, and the corresponding index sequence for the vertex position attribute is {2,1,0,3,4,5,3,1,2,5,6,7}, as shown in FIG. 4C.
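The construction of separate TWN primitives can be sketched as follows. The triangle list `TRIANGLES` is a hypothetical mesh fragment chosen to be consistent with the vertex numbering quoted in the text for FIG. 4A (the figure itself is not reproduced here), and the helper names are illustrative. The sketch assumes a mesh without open edges around the chosen center triangles and does not implement the artificial-triangle fallback.

```python
# Hypothetical mesh fragment consistent with the indices quoted for FIG. 4A / 4C.
TRIANGLES = [(0, 1, 2), (2, 1, 3), (2, 3, 4), (3, 1, 5), (3, 5, 6), (1, 7, 5)]


def opposite_vertex(triangles, a, b, exclude):
    """Third vertex of the triangle that shares edge (a, b) and does not contain `exclude`."""
    for tri in triangles:
        if a in tri and b in tri and exclude not in tri:
            (third,) = set(tri) - {a, b}
            return third
    raise LookupError(f"edge ({a}, {b}) has no neighboring triangle")


def twn_primitive(triangles, center):
    """Vertex order {v0, v1, v01, v2, v20, v12} for one center triangle."""
    v0, v1, v2 = center
    v01 = opposite_vertex(triangles, v0, v1, v2)
    v12 = opposite_vertex(triangles, v1, v2, v0)
    v20 = opposite_vertex(triangles, v2, v0, v1)
    return [v0, v1, v01, v2, v20, v12]


def separate_twn_indices(triangles, centers):
    """Concatenate the TWN vertex sequences of the given center triangles."""
    indices = []
    for center in centers:
        indices.extend(twn_primitive(triangles, center))
    return indices


print(separate_twn_indices(TRIANGLES, [(2, 1, 3), (3, 1, 5)]))
```

For the two center triangles above, the result reproduces the twelve-entry index sequence given in the text.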

A TWN strip is built from a triangle strip. Because two consecutive triangles in a triangle strip share two vertices, the second and each subsequent TWN primitive in the strip can be defined by only two additional vertices per primitive. Consider a triangle strip formed by the vertex sequence {v₀, v₁, v₂, v₃, v₄, v₅, …}, and let {v₀₁, v₀₂, v₁₃, v₂₄, v₃₅, …} be the vertices of the triangles adjacent to the strip along edges {v₀v₁, v₀v₂, v₁v₃, v₂v₄, v₃v₅, …}, each lying on the opposite side of its edge from the strip. The vertex sequence defining the TWN strip is then {v₀, v₁, v₀₁, v₂, v₀₂, v₃, v₁₃, v₄, v₂₄, v₅, v₃₅, …}, where the first six vertices define the first TWN primitive and each following pair of vertices defines the next TWN primitive. FIG. 4B shows a mesh fragment in which triangle strip 104 consists of the two triangles {v₂, v₁, v₃} and {v₃, v₁, v₅}. This strip is given by the four-vertex sequence {v₂, v₁, v₃, v₅}. In the example of FIG. 4B, the vertex sequence of the TWN strip defining the two TWN primitives is {v₂, v₁, v₀, v₃, v₄, v₅, v₇, v₆}, and the corresponding index sequence for the vertex position attribute is {2,1,0,3,4,5,7,6}, as shown in FIG. 4D.
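The strip ordering can be sketched in the same style. The mesh `TRIANGLES` is again a hypothetical fragment consistent with the numbering quoted for FIG. 4B, and the function names are illustrative. One point is an inference rather than something the text states: the general pattern in the text is open-ended ("…"), and this sketch closes a finite strip by appending the neighbor across the last strip edge, which is what the eight-entry FIG. 4D example implies.

```python
# Hypothetical mesh fragment consistent with the indices quoted for FIG. 4B / 4D.
TRIANGLES = [(0, 1, 2), (2, 1, 3), (2, 3, 4), (3, 1, 5), (3, 5, 6), (1, 7, 5)]


def opposite_vertex(triangles, a, b, exclude):
    """Third vertex of the triangle that shares edge (a, b) and does not contain `exclude`."""
    for tri in triangles:
        if a in tri and b in tri and exclude not in tri:
            (third,) = set(tri) - {a, b}
            return third
    raise LookupError(f"edge ({a}, {b}) has no neighboring triangle")


def twn_strip_indices(triangles, strip):
    """TWN strip order {v0, v1, v01, v2, v02, v3, v13, ...} for a triangle strip."""
    n = len(strip)
    if n < 3:
        raise ValueError("a triangle strip needs at least three vertices")
    out = [strip[0], strip[1],
           opposite_vertex(triangles, strip[0], strip[1], strip[2])]
    for i in range(2, n):
        out.append(strip[i])
        # Neighbor across the outer edge (v[i-2], v[i]) of strip triangle i-2.
        out.append(opposite_vertex(triangles, strip[i - 2], strip[i], strip[i - 1]))
    # Assumed closing rule: neighbor across the last edge (v[n-2], v[n-1]).
    out.append(opposite_vertex(triangles, strip[n - 2], strip[n - 1], strip[n - 3]))
    return out


print(twn_strip_indices(TRIANGLES, [2, 1, 3, 5]))
```

For a strip of t triangles this yields 2t + 4 indices, compared with 6t for separate TWN primitives, which is the saving the strip form is intended to provide.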

ＴＷＮストリップと同様にして，ＴＷＮファンも三角形ファンから形成することができる。ＴＷＮファンによっても，相当する中心となる三角形ファン列のＴＷＮプリミティブ列を表現するために必要とされるインデックスバッファの数を軽減することができる。本発明において，三角形ファンは，頂点列{ｖ_０, ｖ_１, ｖ_２, ｖ_３, ｖ_４, ｖ_５,…}と，{ｖ_０ｖ_１, ｖ_１ｖ_２, ｖ_２ｖ_３, ｖ_３ｖ_４, ｖ_４ｖ_５,…}辺に沿って，その辺の反対側の頂点における三角形ファンにおける三角形に隣接する三角形の頂点{ｖ_０１, ｖ_１２, ｖ_２３, ｖ_３４, ｖ_４５,…}により形成することができる。ＴＷＮファンを定義づける頂点列は，{ｖ_０, ｖ_１, ｖ_０１, ｖ_２, ｖ_１２, ｖ_３, ｖ_２３, ｖ_４, ｖ_３４, ｖ_５, ｖ_４５,…}で表され，最初の６つの頂点は最初のＴＷＮプリミティブを意味し，続く２つごとの頂点は，連続するＴＷＮプリミティブを定義するものである。ここで，最初の６つの頂点列がＴＷＮストリップの場合と異なることに留意されたい。図５（Ａ）は，三角形ファン１０５を構成する２つの三角形{ｖ_２, ｖ_１, ｖ_３}と{ｖ_２, ｖ_３, ｖ_４}とによるメッシュフラグメントを示す。このファンは，長さが４の頂点列{ｖ_１, ｖ_２, ｖ_３, ｖ_４}により表現することができる。図５（Ａ）に示される本発明においては，２つのＴＷＮプリミティブによるＴＷＮファンの頂点列は{ｖ_２, ｖ_１, ｖ_０, ｖ_３, ｖ_５, ｖ_４, ｖ_６, ｖ_７}となり，それに対応する頂点位置の属性に対するインデックス列は図５(C)に示されるように{2,1,0,3,5,4,6,7}となる。 Like the TWN strip, a TWN fan can be formed from a triangle fan. The TWN fan likewise reduces the amount of index buffer data needed to represent the sequence of TWN primitives whose center triangles form the triangle fan. In the present invention, given a triangle fan with the vertex sequence {v₀, v₁, v₂, v₃, v₄, v₅, …}, let {v₀₁, v₁₂, v₂₃, v₃₄, v₄₅, …} be the vertices of the triangles adjacent to the fan's triangles along the edges {v₀v₁, v₁v₂, v₂v₃, v₃v₄, v₄v₅, …}, each being the vertex opposite the shared edge. The vertex sequence that defines the TWN fan is then {v₀, v₁, v₀₁, v₂, v₁₂, v₃, v₂₃, v₄, v₃₄, v₅, v₄₅, …}: the first six vertices define the first TWN primitive, and each subsequent pair of vertices defines the next TWN primitive. Note that the order of the first six vertices differs from that of a TWN strip. FIG. 5A shows a mesh fragment in which the two triangles {v₂, v₁, v₃} and {v₂, v₃, v₄} form the triangle fan 105. This fan can be represented by the vertex sequence {v₁, v₂, v₃, v₄} of length 4. For the TWN fan of two TWN primitives shown in FIG. 5A, the vertex sequence is {v₂, v₁, v₀, v₃, v₅, v₄, v₆, v₇}, and the corresponding index string for the vertex position attribute is {2,1,0,3,5,4,6,7}, as shown in FIG. 5C.
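
The "first six vertices, then two per primitive" rule for a TWN fan can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function name `twn_fan_primitives` is hypothetical, and the assignment of each neighbour vertex to a particular edge of the center triangle is an assumption read off the sequence {v₀, v₁, v₀₁, v₂, v₁₂, v₃, v₂₃, …} given in the text.

```python
def twn_fan_primitives(seq):
    """Split a TWN fan vertex sequence into (center_triangle, neighbours) pairs.

    seq follows the layout {v0, v1, v01, v2, v12, v3, v23, v4, ...}:
    six vertices for the first primitive, two more for each later one.
    """
    if len(seq) < 6 or (len(seq) - 6) % 2 != 0:
        raise ValueError("a TWN fan needs 6 vertices plus 2 per extra primitive")
    center = seq[0]  # fan center, shared by every triangle
    primitives = []
    for j in range((len(seq) - 6) // 2 + 1):
        a = seq[2 * j + 1]  # first rim vertex of triangle j
        b = seq[2 * j + 3]  # second rim vertex of triangle j
        # Neighbour across edge (center, a): v01 for the first primitive,
        # afterwards the rim vertex of the previous fan triangle.
        across_ca = seq[2] if j == 0 else seq[2 * j - 1]
        across_ab = seq[2 * j + 4]  # vertex opposite rim edge (a, b)
        across_bc = seq[2 * j + 5]  # next rim vertex, opposite edge (b, center)
        primitives.append(((center, a, b), (across_ca, across_ab, across_bc)))
    return primitives
```

Applied to the index string {2,1,0,3,5,4,6,7} of FIG. 5C, the sketch yields the center triangles (2, 1, 3) and (2, 3, 4), which agree with the two fan triangles {v₂, v₁, v₃} and {v₂, v₃, v₄} named in the text.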

可変サイズの拡張されたプリミティブの好ましい例は，カトマル−クラーク（Ｃａｔｍｕｌｌ−Ｃｌａｒｋ）の細分割パッチプリミティブ（ＣＣＳＰ）を含む。本発明において，ＣＣＳＰは，それぞれのポリゴンにおいて，４つと異なる隣接するエッジと異なる番号のひとつ以上の頂点を有する四角形メッシュにより表現されるポリゴンから構成される。Ｃａｔｍｕｌｌ−Ｃｌａｒｋの再分割による四角形メッシュという観点から，４つと異なる隣接するエッジと異なる番号のひとつ以上の頂点（a vertex with a number of adjacent edges being different from four）は，イレギュラーな頂点とよばれる。また，ある頂点に隣接するエッジの数は，頂点価（a valence of vertex）とよばれる。つまり，頂点価が４以外の場合は，Ｃａｔｍｕｌｌ−Ｃｌａｒｋの再分割の観点からするとイレギュラーな頂点とみなされる。ＣＣＳＰプリミティブは，四角形における各頂点と，それら四角形の各頂点を共有するポリゴンにおける隣接する全ての頂点とによって形成される。ＣＣＳＰプリミティブにおいて，その中心に位置する四角形を中心四角形とよぶ。もしもイレギュラー頂点の頂点値が存在する場合，中心四角形は，メッシュにおける他の四角形と異なり，中心四角形の周囲の四角形や，ＣＣＳＰプリミティブにおける頂点の数も変化する。ＣＣＳＰプリミティブでは，イレギュラーな頂点価に関連する頂点の数は，NをＣＣＳＰプリミティブにおける頂点の数とし，Vをイレギュラーな頂点の頂点価とすると，N=2*V+8のように表すことができる。なお，この場合においてVに所定数を乗算することをスケーリング，２をスケールとよび，８など所定の数を加えることをバイアスとよぶ。 A preferred example of a variable-size extended primitive is the Catmull-Clark subdivision patch (CCSP) primitive. In the present invention, a CCSP is built from the polygons of a quadrilateral mesh in which a polygon may contain a vertex whose number of adjacent edges differs from four. In terms of Catmull-Clark subdivision of a quadrilateral mesh, a vertex with a number of adjacent edges different from four is called an irregular vertex, and the number of edges adjacent to a vertex is called the valence of the vertex. That is, a vertex whose valence is other than 4 is regarded as irregular from the viewpoint of Catmull-Clark subdivision. A CCSP primitive is formed by the vertices of a quadrilateral together with all adjacent vertices of the polygons that share any vertex of that quadrilateral. The quadrilateral at the center of a CCSP primitive is called the center quadrilateral. If the center quadrilateral has an irregular vertex, the number of quadrilaterals surrounding it, and hence the number of vertices in the CCSP primitive, differs from that of the other quadrilaterals in the mesh. The number of vertices in a CCSP primitive can be expressed as N = 2*V + 8, where N is the number of vertices in the CCSP primitive and V is the valence of the irregular vertex. Here, multiplying V by a predetermined number is called scaling, the factor 2 is called the scale, and adding a predetermined number such as 8 is called the bias.

サイズが知られているので，ＣＣＳＰプリミティブは，連続する頂点列における頂点間にマッピングされる頂点列（sequence）によって表現することができ，それによりＣＣＳＰプリミティブの位置が決定される。本発明の好ましい態様では，頂点列（sequence）とマッピングは以下のように形成される。四角形のメッシュについては，ＣＣＳＰプリミティブは，頂点{ｖ_０, ｖ_１, ｖ_２, ｖ_３}を有しており，それらは四角形の辺（ｖ_０ｖ_１, ｖ_１ｖ_２, ｖ_２ｖ_３, ｖ_３ｖ_０）をそれぞれ形成する。なお，イレギュラーな頂点がもしあればｖ_０である。ＣＣＳＰプリミティブを記述する頂点列は以下のように形成される。まず，四角形の頂点は，番号順{ｖ_０, ｖ_１, ｖ_２, ｖ_３}に並べられる。その後，例えば，ｖ_０とｖ_１とのメッシュを構成する四角形に属する，ｖ_０を共有する隣接する辺の頂点が選択される。六番目は，ｖ_１と同じ辺上にあり，ｖ_０と同じ四角形状の頂点である。残りの隣接する頂点は，以下の規則に従って選択される。隣接するある頂点が選択された後に，以前に選択された辺を共有するように次の頂点が選択される。図６（Ａ）に示されるように，制御メッシュ（１０６）のフラグメントは，２つの隣接するＣＣＳＰプリミティブ{ｖ_９, ｖ_５, ｖ_６, ｖ_１０}及び{ｖ_９, ｖ_１０, ｖ_１６, ｖ_１５}を中心四角形として記述される。頂点ｖ_９は，頂点価が５のイレギュラーな頂点である。最初のプリミティブ及び第２のプリミティブを記述する頂点列は，それぞれ{ｖ_９, ｖ_５, ｖ_６, ｖ_１０, ｖ_８, ｖ_４, ｖ_０, ｖ_１, ｖ_２, ｖ_３, ｖ_７, ｖ_１１, ｖ_１７, ｖ_１６, ｖ_１５, ｖ_１４, ｖ_１３, ｖ_１２}と{ｖ_９, ｖ_１０, ｖ_１６, ｖ_１５, ｖ_５, ｖ_６, ｖ_７, ｖ_１１, ｖ_１７, ｖ_２１, ｖ_２０, ｖ_１９, ｖ_１８, ｖ_１４, ｖ_１３, ｖ_１２, ｖ_８, ｖ_４}である。本発明において，ＣＣＳＰプリミティブの頂点位置の属性（attribute）を示すインデックスは，頂点位置の属性を示すインデックスを含むインデックスバッファにおいてＣＣＳＰプリミティブとともに前もって決定されている。すなわち，図６（Ａ）に示される２つのＣＣＳＰプリミティブにおける頂点郡の内容として，図６（Ｃ）に示されるように，インデックスバッファ１２００(または頂点バッファ１１００)は{18,9,5,6,10,8,4,0,1,2,3,7,11,17,16,15,14,13,12,18,9,10,16,15,5,6,7,11,17,21,20,19,18,14,13,12,8,4}を記憶する。 Since the size is known, a CCSP primitive can be represented by a vertex sequence that is mapped between vertices in successive vertex sequences, thereby determining the position of the CCSP primitive. In a preferred embodiment of the present invention, the vertex sequence and mapping are formed as follows. For square meshes, the CCSP primitive has vertices {v ₀ , v ₁ , v ₂ , v ₃ }, which are square edges (v ₀ v ₁ , v ₁ v ₂ , v ₂ v ₃ , v ₃ v ₀ ), respectively. If there is an irregular vertex, it is v ₀ . Vertex sequences describing CCSP primitives are formed as follows. First, the quadrangular vertices are arranged in the order of numbers {v ₀ , v ₁ , v ₂ , v ₃ }. 
Next, for example, the fifth vertex is chosen as a vertex that shares an edge with v₀ and belongs to a mesh quadrilateral containing both v₀ and v₁. The sixth vertex shares an edge with v₁ and lies in that same quadrilateral as v₀. The remaining adjacent vertices are selected according to the following rule: after an adjacent vertex has been selected, the next vertex is chosen so that it shares an edge with the previously selected one. As shown in FIG. 6A, the fragment of the control mesh 106 contains two adjacent CCSP primitives whose center quadrilaterals are {v₉, v₅, v₆, v₁₀} and {v₉, v₁₀, v₁₆, v₁₅}. Vertex v₉ is an irregular vertex with valence 5. The vertex sequences describing the first and second primitives are {v₉, v₅, v₆, v₁₀, v₈, v₄, v₀, v₁, v₂, v₃, v₇, v₁₁, v₁₇, v₁₆, v₁₅, v₁₄, v₁₃, v₁₂} and {v₉, v₁₀, v₁₆, v₁₅, v₅, v₆, v₇, v₁₁, v₁₇, v₂₁, v₂₀, v₁₉, v₁₈, v₁₄, v₁₃, v₁₂, v₈, v₄}, respectively. In the present invention, the indices for the vertex position attribute of a CCSP primitive are preceded by the size of that primitive in the index buffer that holds the vertex position indices. That is, for the vertices of the two CCSP primitives shown in FIG. 6A, the index buffer 1200 (or vertex buffer 1100) stores {18, 9, 5, 6, 10, 8, 4, 0, 1, 2, 3, 7, 11, 17, 16, 15, 14, 13, 12, 18, 9, 10, 16, 15, 5, 6, 7, 11, 17, 21, 20, 19, 18, 14, 13, 12, 8, 4}, as shown in FIG. 6C.
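
The size-prefixed layout of the vertex position index buffer can be walked with a short loop. The Python sketch below is illustrative only: the function name `split_ccsp_primitives` and its `scale` and `bias` parameters are hypothetical, and recovering the valence from the size assumes the relation N = 2*V + 8 stated for CCSP primitives.

```python
def split_ccsp_primitives(position_indices, scale=2, bias=8):
    """Split a size-prefixed position index buffer into CCSP primitives.

    Each primitive is stored as [size, idx_0, ..., idx_{size-1}].
    Returns a list of (valence, vertex_indices) pairs.
    """
    primitives = []
    pos = 0
    while pos < len(position_indices):
        size = position_indices[pos]  # the size entry precedes the vertices
        vertices = position_indices[pos + 1 : pos + 1 + size]
        if len(vertices) != size:
            raise ValueError("index buffer ends inside a primitive")
        # Invert N = scale * V + bias to recover the irregular valence V.
        valence, remainder = divmod(size - bias, scale)
        if remainder != 0:
            raise ValueError("size is not of the form scale * V + bias")
        primitives.append((valence, vertices))
        pos += size + 1  # offset of the next primitive's size entry
    return primitives
```

For the buffer of FIG. 6C, the loop reads the size 18, takes the next 18 indices as the first primitive, and finds the second size entry at offset 19; both primitives come out with valence 5, matching the irregular vertex v₉.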

残りの頂点属性についてのインデックスバッファ列を特定する工程は，頂点の全ての属性を参照するために同じインデックスが用いられないように別のインデックスが要求されるように頂点属性が特定される工程である。この場合において，頂点は属性インデックスの組によって形成され，そのインデックスは，頂点の異なる属性に関連するインデックスである。上記が当てはまらない場合，全ての頂点属性は，頂点位置属性に相当する一つのインデックスによって特定されうる。そうでなければ，全ての頂点属性を特定するためにインデックスバッファの組は特定されなければならない。頂点位置属性に関するインデックスバッファはプリミティブサイズを含むので，上述したように可変なサイズの拡張されたプリミティブを扱う場合は特別な扱いがなされる。他のインデックスバッファは，プリミティブサイズに関する情報を持たず，可変サイズのプリミティブを表現する場合に頂点位置属性のものと比べて小さい長さとすることができる。本発明では，ある頂点に対する頂点属性のインデックスは以下のように決められる。固定サイズの拡張された属性の場合，ｉ番目の位置のインデックスバッファ価はNを頂点列の長さとして，iは０以上N未満であるインデックスバッファ中のｉ番目頂点のインデックス列に由来するすべての属性の価となる。サイズが可変な拡張されたプリミティブの場合は，状況がより複雑となる。可変サイズの拡張されたプリミティブ列を表現するための頂点列を連結することにより形成される頂点列については，ｉ番目の頂点のインデックス列は，頂点位置属性を示すインデックスバッファ中の(ｉ＋Ｎ_ｉ−１)番目の価と他のインデックスバッファのｉ番目の価とに基づいて形成され，ここで，Ｎ_ｉはプリミティブ数の初期値を１として，ｉ番目の頂点が含むプリミティブの数である。 Specifying index buffers for the remaining vertex attributes is needed when the vertex attributes are specified in such a way that a single index cannot be used to reference all attributes of a vertex and separate indices are required. In that case, a vertex is formed by a tuple of attribute indices, each referring to a different attribute of the vertex. When this is not the case, all vertex attributes can be specified by the single index for the vertex position attribute; otherwise, a set of index buffers must be specified to cover all vertex attributes. Because the index buffer for the vertex position attribute contains the primitive sizes, it receives special treatment when variable-size extended primitives are handled, as described above. The other index buffers carry no primitive size information and can therefore be shorter than the one for the vertex position attribute when variable-size primitives are represented. In the present invention, the vertex attribute indices for a given vertex are determined as follows. For fixed-size extended primitives, the index tuple of the i-th vertex consists of the values at the i-th position of every attribute index buffer, where N is the length of the vertex sequence and 0 ≤ i < N. For variable-size extended primitives, the situation is more complicated. For a vertex sequence formed by concatenating the vertex sequences that represent a sequence of variable-size extended primitives, the index tuple of the i-th vertex is formed from the (i + N_i − 1)-th value in the index buffer for the vertex position attribute and the i-th value in each of the other index buffers, where N_i is the number of the primitive that contains the i-th vertex, with primitives numbered starting from 1.
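
The offset rule for variable-size primitives can be illustrated with a small Python sketch. It rests on stated assumptions: the function name `vertex_index_tuples` is hypothetical, the vertex counter `i` is 0-based, and buffer slots are 0-based, so the position slot is `i + N_i`. This equals the (i + N_i − 1) of the text only if the text counts vertices from 1 and buffer slots from 0, which is an interpretation rather than something the text states.

```python
def vertex_index_tuples(position_indices, other_index_buffers):
    """Build one attribute-index tuple per vertex.

    position_indices: size-prefixed buffer [size, idx..., size, idx..., ...].
    other_index_buffers: buffers for other attributes, without size entries.
    Returns a list of (position_index, other_index_0, other_index_1, ...).
    """
    tuples = []
    i = 0                 # 0-based vertex counter over the concatenated sequence
    pos = 0               # offset of the current primitive's size entry
    primitive_number = 0  # N_i, numbered from 1
    while pos < len(position_indices):
        size = position_indices[pos]
        primitive_number += 1
        for _ in range(size):
            # Skip one size entry for every primitive up to and including
            # the one that contains vertex i.
            position_slot = i + primitive_number
            others = tuple(buf[i] for buf in other_index_buffers)
            tuples.append((position_indices[position_slot],) + others)
            i += 1
        pos += size + 1
    return tuples
```

With a position buffer `[3, 10, 11, 12, 2, 13, 14]` (two primitives of sizes 3 and 2) and a single normal index buffer `[0, 1, 2, 3, 4]`, the fourth vertex reads its position index from slot 5 and its normal index from slot 3.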

本発明の第二の側面は，頂点キャッシュ（頂点バッファ）を用いて，固定サイズ又は可変サイズの拡張されたプリミティブを高速に処理するための方法であって，頂点位置属性のインデックスバッファから，可変サイズのプリミティブのプリミティブサイズをフェッチングし，前記プリミティブサイズを特定の回路か又は拡張されたプリミティブアセンブリを処理できるようにプログラマブルにされた演算装置であるプリミティブエンジンに配送する工程と，頂点キャッシュから頂点データを得られない場合は，頂点キャッシュにおける拡張されたプリミティブの頂点についての頂点データをフェッチし，変換して，蓄積する工程と，拡張されたプリミティブのアセンブリと演算処理のために，変換された頂点をプリミティブエンジンへ配送する工程と，プリミティブエンジンにおいて，拡張されたプリミティブをアセンブリし，演算処理する工程と，プリミティブエンジンにおける拡張されたプリミティブの演算処理により得られる固定サイズの単純プリミティブをプリミティブラスタライズのためのパイプラインを経由して固定サイズのプリミティブ集積回路へ配送する工程と，を含む。 A second aspect of the present invention is a method for processing fixed-size or variable-size extended primitives at high speed using a vertex cache (vertex buffer). The method includes: fetching the primitive size of a variable-size primitive from the index buffer of the vertex position attribute and delivering the primitive size to a primitive engine, which is either a dedicated circuit or a programmable processing unit capable of assembling and processing extended primitives; when the vertex data cannot be obtained from the vertex cache, fetching the vertex data for the vertices of the extended primitive, transforming it, and storing it in the vertex cache; delivering the transformed vertices to the primitive engine for assembly and processing of the extended primitive; assembling and processing the extended primitive in the primitive engine; and delivering the fixed-size simple primitives obtained from that processing to the fixed-size primitive integrated circuit via the pipeline for primitive rasterization.

本方法は，固定されたサイズ又は可変なサイズの拡張されたプリミティブのプロセッシングを実装するための拡張された３次元コンピュータグラフィックスの高速ハードウェアにおける頂点キャッシュ装置を拡張することに基づくものである。本明細書において，頂点キャッシュ装置（vertex cache facility）は，インデックスや頂点バッファからの価を用いて特定された簡単なプリミティブの列を高速にプロセッシングすることを達成するためのシステムを意味する。本明細書において頂点キャッシュ装置の機能は，インデックス／頂点バッファからの価を用いて特定された頂点列を横断（掃引）し，プロセッシングを行う頂点列における対応する頂点の属性列を決定する。頂点キャッシュ記憶部（ストア）としての，頂点キャッシュ装置における記憶部における入手可能な属性と同じ列の頂点かどうか決める。インデックスバッファからサンプリングされた頂点属性インデックスに従って頂点バッファから頂点属性値をサンプリングするか，もしも頂点が頂点キャッシュになく，頂点キャッシュ記憶部に記憶されている場合は，頂点バッファを順次生成しても良い。頂点変換についての属性値に基づいてアセンブルされた頂点を送り，そして付加的に，変形結果を頂点キャッシュの記憶部に記憶する。変換された頂点を，インデックス／頂点バッファにより表現される単純プリミティブ列を表現するために，固定サイズプリミティブ集積回路装置へ配送する。頂点変換を加速するか，又は，頂点データが頂点キャッシュに存在する場合，頂点バッファをサンプリングする必要をなくし，頂点を再度変換する必要をなくするために，変換された頂点を固定サイズプリミティブ集積回路装置へ配送してもよい。実装した態様に応じて付加的な装置を適宜具備しても良い。固定されたプリミティブアセンブル装置（固定されたプリミティブアセンブル装置）との用語は，頂点列から単純なプリミティブを集めて再構築することを実現するシステムを意味する。本明細書において，プリミティブを集めること（assembling of primitive）は，プリミティブを再構築するために必要な全ての情報を集めること，例えば，更に処理を行うためにプリミティブにおける全ての頂点についての情報を蓄積することを意味する。例えば，別々の三角形列の場合は，各々の三角形に存在する３つの頂点列が必要な情報に相当する。同様にして，別々の線については，その線を構成する２つの連続する頂点がそれに相当する。三角形ストリップの場合は，最初の三角形の３つの頂点と，隣接する三角形の頂点列がそれに相当する。固定されたプリミティブアセンブル装置は，演算処理され，配送されるプリミティブのタイプに応じた頂点列を，単純プリミティブを演算処理するのみのラスタライズパイプラインに送れるよう単純プリミティブの状態に戻す。 The method is based on extending a vertex cache device in high speed hardware of extended 3D computer graphics to implement fixed size or variable size extended primitive processing. In this specification, a vertex cache facility refers to a system for achieving high-speed processing of a sequence of simple primitives specified using an index or a value from a vertex buffer. In this specification, the function of the vertex cache device traverses (sweeps) the specified vertex sequence using the value from the index / vertex buffer, and determines the attribute sequence of the corresponding vertex in the vertex sequence to be processed. 
It determines whether a vertex with the same attribute index sequence is already available in the store of the vertex cache device, referred to as the vertex cache store. If the vertex is not found in the vertex cache store, vertex attribute values are sampled from the vertex buffer according to vertex attribute indices that are either sampled from the index buffer or generated sequentially. The vertex assembled from those attribute values is sent for vertex transformation, and the transformation result is optionally stored in the vertex cache store. The transformed vertices are delivered to the fixed-size primitive integrated circuit device, that is, the fixed primitive assembly unit, so that the simple primitive sequence represented by the index/vertex buffers can be reconstructed. When the vertex data is already present in the vertex cache, the transformed vertex may be delivered to that unit directly, which speeds up vertex transformation by removing the need to sample the vertex buffer and to transform the vertex again. Additional units may be provided as appropriate for the implementation. The term fixed primitive assembly unit refers to a system that collects simple primitives from a vertex sequence and reconstructs them. In this specification, assembling a primitive means collecting all the information needed to reconstruct the primitive, for example by accumulating information about all of its vertices for further processing. For a sequence of separate triangles, the necessary information is the three vertices of each triangle. Likewise, for separate lines it is the two consecutive vertices that make up each line. For a triangle strip, it is the three vertices of the first triangle followed by the vertices of the adjacent triangles. The fixed primitive assembly unit restores the vertex sequence to simple primitives according to the type of primitive being processed and delivered, so that they can be sent to a rasterization pipeline that handles only simple primitives.
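
The lookup, fetch, transform, and store behaviour of the vertex cache described above can be sketched in Python. This is a simplified software model under stated assumptions, not the hardware design: the class name `VertexCache`, the `fetch` and `transform` callbacks, the `capacity` parameter, and the first-in-first-out eviction policy are all hypothetical, since the text does not specify a replacement policy.

```python
from collections import OrderedDict


class VertexCache:
    """Toy model of a vertex cache keyed by the attribute-index tuple."""

    def __init__(self, fetch, transform, capacity=16):
        self.fetch = fetch          # index tuple -> raw vertex attributes
        self.transform = transform  # raw vertex -> transformed vertex
        self.capacity = capacity
        self.store = OrderedDict()  # the "vertex cache store"
        self.hits = 0
        self.misses = 0

    def get(self, index_tuple):
        """Return the transformed vertex, transforming it only on a miss."""
        if index_tuple in self.store:
            self.hits += 1          # no buffer sampling, no re-transformation
            return self.store[index_tuple]
        self.misses += 1
        transformed = self.transform(self.fetch(index_tuple))
        if len(self.store) >= self.capacity:
            self.store.popitem(last=False)  # evict the oldest entry (FIFO)
        self.store[index_tuple] = transformed
        return transformed
```

In this model the same `get` call serves simple and extended primitives alike; only the consumer of the returned vertices differs, which mirrors the sharing of fetch and transform logic described in the text.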

本発明の第２の側面にかかる方法は，様々な面で頂点キャッシュ装置を拡張する。第一に，本発明の第２の側面によれば，可変サイズの拡張されたプリミティブをフェッチするために新たなロジック回路が必要となる。次に，頂点キャッシュ装置と固定されたプリミティブアセンブル装置との間で機能するプリミティブエンジン（拡張されたプリミティブに関する情報をアッセンブルし，処理するための別個の装置）との情報交換をもたらす。 The method according to the second aspect of the present invention extends the vertex cache device in various aspects. First, according to the second aspect of the present invention, a new logic circuit is required to fetch variable-size extended primitives. It then provides information exchange with the primitive engine (separate device for assembling and processing information about extended primitives) that functions between the vertex cache device and the fixed primitive assembly device.

その方法は，頂点位置属性についてインデックスバッファからプリミティブのサイズをフェッチする工程を含むので，本発明の第１の側面による方法を用いて表現された可変サイズの拡張されたプリミティブの頂点列をプロセッシングするために用いることができる。頂点キャッシュにおける拡張されたプリミティブの頂点データをフェッチし，移送し，そして蓄積する工程により，頂点の属性値から頂点をアッセンブルすることができ，頂点を移送することができ，そして，同じ頂点を後に参照する場合は迅速に取戻しが行われ，頂点キャッシュストアに頂点を記憶することができる。この工程は，拡張されたプリミティブ列や単純プリミティブ列がプロセッシングされるか否かに拘わらず同様である。この方法は，移送された頂点をプリミティブエンジンに配送する工程，プリミティブエンジンにおいて拡張されたプリミティブのプロセッシングを行う工程，及びプロセッシングされた結果を単純プリミティブ列の形で残りのプロセッシングパイプラインに配送する工程をも含む。最初の工程について検討すると，その工程により拡張されたプリミティブをアッセンブルするための必要な情報や，プリミティブエンジンを駆動するプロセッシングアルゴリズムが供給される、そして，プリミティブの干渉が無かったかのような場合におけると同様のプロセッシングを行うために出力結果が配送される。組み合わせることで，以下のようなメリットがある。頂点キャッシュにおいて頂点データをフェッチし，配送し，そして蓄積する工程は，単純プリミティブを処理するのであっても，拡張されたプリミティブ列を処理するのであっても実質的には同様であるので，この方法は，インデックスバッファからフェッチされた属性インデックスに関連した属性から拡張されたプリミティブの頂点をアセンブルするための頂点バッファへのランダムなアクセスを特に特別な方法を用いずに実現することができる。先に説明したと同じ理由で，頂点データをフェッチし，配送し，そして蓄積する工程に用いられるほとんどの論理回路は，単純プリミティブの演算処理と拡張されたプリミティブの演算処理とで共用することができ，拡張されたプリミティブの処理を実現するために実装される論理回路のハードウェアコストを軽減することができる。頂点キャッシュを用いることは，プリミティブエンジンの頂点データを拡張されたプリミティブのアセンブル及び演算処理アルゴリズムへと配送する際の待ち時間を軽減することになる。これにより，拡張されたプリミティブの演算処理に関するパフォーマンスが向上することとなる。インデックスバッファから可変サイズのプリミティブのプリミティブサイズをフェッチングする工程とそれをプリミティブエンジンへ配送する工程と，拡張されたプリミティブをアセンブルし，演算処理するために変換された頂点をプリミティブエンジンへ配送する工程は，プリミティブエンジン上で，本発明の第一の側面により導入された方法を用いて拡張されたプリミティブを演算処理するために必要な全ての情報を配送することを可能とする。プリミティブエンジンによって演算処理された拡張されたプリミティブの出力は，単純プリミティブのものと同様な方法により達成され，拡張されたプリミティブの演算処理を行うためにプリミティブ演算処理用のパイプラインは特に修正する必要がない。さらに，単純プリミティブ列の処理は，単純プリミティブ列の演算処理による変換された頂点は，頂点キャッシュからプリミティブエンジンによりバイパスされる残りの演算処理パイプラインへと直接配送される。この方法は，単純プリミティブ列を演算処理することに比べて追加となるような付加を何ら要求しない。 Since the method includes fetching the size of the primitive from the index buffer for the vertex position attribute, the variable size extended primitive vertex sequence expressed using the method according to the first aspect of the present invention is processed. 
The method is therefore suited to such sequences. The step of fetching, transforming, and storing the vertex data of an extended primitive in the vertex cache assembles a vertex from its attribute values, transforms it, and stores it in the vertex cache store so that it can be retrieved quickly if the same vertex is referenced again later. This step is the same whether an extended primitive sequence or a simple primitive sequence is being processed. The method also includes delivering the transformed vertices to the primitive engine, processing the extended primitives in the primitive engine, and delivering the results to the rest of the processing pipeline in the form of a simple primitive sequence. The first of these steps supplies the information needed to assemble the extended primitive and to drive the processing algorithm of the primitive engine; the last delivers the output for the same processing that would take place had there been no intervening step. Combined, these steps have the following advantages. Because fetching, transforming, and storing vertex data in the vertex cache is essentially the same for simple primitives and for extended primitive sequences, the method provides random access to the vertex buffer for assembling the vertices of extended primitives from the attributes referenced by the attribute indices fetched from the index buffer, without any special mechanism. For the same reason, most of the logic used for fetching, transforming, and storing vertex data can be shared between simple and extended primitive processing, which reduces the hardware cost of the logic added to support extended primitives. Using the vertex cache reduces the latency of delivering vertex data to the extended primitive assembly and processing algorithm of the primitive engine, which improves the performance of extended primitive processing. The steps of fetching the primitive size of a variable-size primitive from the index buffer and delivering it to the primitive engine, and of delivering the transformed vertices to the primitive engine for assembly and processing, together supply the primitive engine with all the information it needs to process extended primitives represented by the method of the first aspect of the present invention. The output of extended primitives processed by the primitive engine is delivered in the same way as that of simple primitives, so the primitive processing pipeline needs no particular modification to handle extended primitives. Furthermore, when a simple primitive sequence is processed, the transformed vertices are delivered directly from the vertex cache to the rest of the processing pipeline, bypassing the primitive engine. The method therefore adds no overhead to the processing of simple primitive sequences.

頂点位置属性のインデックスバッファから，可変サイズのプリミティブのプリミティブサイズをフェッチングし，前記プリミティブサイズを特定の回路か又は拡張されたプリミティブアセンブリを処理できるようにプログラマブルにされた演算装置であるプリミティブエンジンに配送する工程は，以下のような内容である。すなわち，頂点位置属性のインデックスバッファからプリミティブサイズを取り戻す工程である。本発明の第１の側面と関連して求められた拡張されたプリミティブを表現するために可変サイズの拡張されたプリミティブをそこに位置させるものである。プリミティブサイズは，拡張されたプリミティブの演算処理において必要とされるかもしれないので，プリミティブエンジンへと配送される。 The step of fetching the primitive size of a variable-size primitive from the index buffer of the vertex position attribute and delivering it to the primitive engine, which is either a dedicated circuit or a programmable processing unit capable of assembling and processing extended primitives, is as follows. The primitive size is retrieved from the index buffer of the vertex position attribute, where it is placed when variable-size extended primitives are represented as described in connection with the first aspect of the present invention. The primitive size is delivered to the primitive engine because it may be needed in processing the extended primitive.

本発明においては，プリミティブサイズに関する情報が，頂点位置属性のインデックスバッファにおけるプリミティブの頂点に関する情報よりも先行する。プリミティブサイズのフェッチングが終わった結果として，後者がプリミティブに関するほかのどの情報にも先駆けてプリミティブエンジンへと配送される。プリミティブサイズを知ることは，次の可変サイズのプリミティブの始まりに関する頂点位置属性のインデックスアレイのオフセットを決めることができ，それにより，前記インデックスは次のプリミティブのサイズ情報を含むこととなる。プリミティブサイズは，可変サイズのプリミティブに関する演算処理を始めることができる状態とするためにも必要とされる。後者は，プリミティブサイズによって制御される全てのプリミティブの頂点をプリミティブエンジンが得た後に終えられる。この工程は，可変サイズの拡張されたプリミティブを実現する場合にのみ必要とされる工程である。他のプリミティブの演算処理を行う場合は，この工程を省略できる。 In the present invention, the primitive size information precedes the information about the vertices of the primitive in the index buffer of the vertex position attribute. Consequently, once the primitive size has been fetched, it is delivered to the primitive engine ahead of any other information about the primitive. Knowing the primitive size makes it possible to determine the offset, within the index array of the vertex position attribute, at which the next variable-size primitive begins and where that primitive's size information is stored. The primitive size is also needed to reach the state in which processing of the variable-size primitive can begin; that state is reached once the primitive engine has obtained all vertices of the primitive, the number of which is governed by the primitive size. This step is required only when variable-size extended primitives are processed and can be omitted for other primitives.

頂点キャッシュから頂点データを得られない場合は，頂点キャッシュにおける拡張されたプリミティブの頂点についての頂点データをフェッチし，変換して，蓄積する工程は，以下のような工程である。上述したように，頂点キャッシュ装置により頂点キャッシュにおいて，フェッチし，変換して，蓄積される工程である。頂点キャッシュ装置は，拡張されたプリミティブの演算処理を行うものではないので，この工程の処理は，頂点キャッシュ装置に関して単純プリミティブであっても固定されたサイズの拡張されたプリミティブ列の処理であっても変わらない。それゆえ，単純プリミティブの演算処理と，拡張されたプリミティブの演算処理を行うための回路など装置面でシェアすることができる。この工程は，頂点キャッシュに既に必要な頂点情報が蓄積されている場合は，省略することができる。この工程は，インデックスバッファ／頂点バッファが拡張されたプリミティブ列を表現する場合でも行われる。そのような場合，頂点属性インデックスがインデックスバッファからサンプリングされる。また，頂点バッファが固定サイズの拡張されたプリミティブのみを表現する場合，頂点属性インデックスは，順番に生成される。 If the vertex data cannot be obtained from the vertex cache, the process of fetching, converting, and storing the vertex data of the extended primitive vertex in the vertex cache is as follows. As described above, this is a process of fetching, converting, and accumulating in the vertex cache by the vertex cache device. Since the vertex cache device does not perform processing of the extended primitive, the processing of this step is processing of an extended primitive sequence having a fixed size even if it is a simple primitive with respect to the vertex cache device. Will not change. Therefore, it is possible to share in terms of devices such as a circuit for performing simple primitive calculation processing and extended primitive calculation processing. This step can be omitted if necessary vertex information is already stored in the vertex cache. This process is performed even when the index buffer / vertex buffer represents an extended primitive sequence. In such a case, the vertex attribute index is sampled from the index buffer. Also, when the vertex buffer represents only fixed size extended primitives, vertex attribute indexes are generated in order.

可変サイズの拡張されたプリミティブを演算処理する場合はプリミティブサイズをフェッチングする工程では，もし異なる頂点属性の別々のインデックスバッファが用いられる場合はプリミティブの頂点データをフェッチングする工程を修正する必要が生じうる。その理由は，頂点位置属性のインデックスバッファの長さの相違と，他の全ての属性列を記述するインデックスバッファはプリミティブサイズに関する情報を含んでおり，その長さはプリミティブ列の数だけ他のインデックスバッファよりも大きいからである。ある頂点について頂点データをフェッチングする工程は，問題とされている頂点に対応するそれぞれの頂点属性のインデックス列を形成する必要がある。異なる頂点属性のための分離されたインデックスアレイや，固定サイズのプリミティブ，又は単純プリミティブ列を演算処理する場合は，このフォーメーションが，頂点の位置に対応した全てのインデックスバッファの位置をサンプリングすることにより行われる。可変サイズの拡張されたプリミティブ列を演算処理する場合は，このフォーメーションは以下のように修正される。すなわち，頂点位置属性以外の全ての属性についてのインデックス値が，頂点位置によって決められるインデックスバッファの位置をサンプリングすることにより得られるが，頂点位置属性のインデックスが他の属性のものと現在演算処理を行っている以前のプリミティブの数を足したものにより得られる位置の頂点位置属性のインデックスバッファにより得られる。この工程は，可変サイズの拡張されたプリミティブ列を，インデックスバッファ及び頂点バッファを用いるもので，前記本発明の第１の側面により導入されるものである。 When variable-size extended primitives are processed, the step of fetching the primitive size may require the step of fetching the vertex data of a primitive to be modified if separate index buffers are used for different vertex attributes. The reason is that the index buffer of the vertex position attribute differs in length from the index buffers describing all the other attributes: it contains the primitive size information and is therefore longer than the other index buffers by the number of primitives in the sequence. Fetching the vertex data for a vertex requires forming the tuple of attribute indices that corresponds to that vertex. When separate index arrays are used for different vertex attributes and fixed-size primitives or simple primitive sequences are processed, this tuple is formed by sampling every index buffer at the position corresponding to the position of the vertex. When variable-size extended primitive sequences are processed, the formation is modified as follows: the index values for all attributes other than the vertex position attribute are still obtained by sampling the index buffers at the position determined by the vertex position, whereas the index for the vertex position attribute is obtained from its index buffer at that position plus the number of primitives encountered so far, counting the one currently being processed. This step handles variable-size extended primitive sequences represented with index buffers and vertex buffers as introduced by the first aspect of the present invention.

拡張されたプリミティブのアセンブリと演算処理のために，変換された頂点をプリミティブエンジンへ配送する工程は，入力された頂点情報を拡張されたプリミティブのアセンブルと演算処理を行うためのプリミティブエンジンによるアルゴリズムへと配送するための工程である。本発明においては，この工程は，固定されたプリミティブアセンブル装置の代わりに，頂点キャッシュからの変換された頂点が配送されるプリミティブエンジンを選択することにより達成される。本明細書においては，“プリミティブエンジン”という語は，以下により，拡張されたプリミティブの演算処理を実現するための固定回路又はプログラマブルシステムを意味する。可変サイズの拡張されたプリミティブの場合は，プリミティブサイズに関する情報を受け取って内部に記憶するプリミティブの頂点に関する情報を蓄積する。蓄積されたプリミティブに関する情報に基づいてプリミティブを再構築する。可変サイズの拡張されたプリミティブを演算処理した結果,固定されたプリミティブアセンブル装置によってアクセスできる単純プリミティブ列となるようなアルゴリズムに従って，プリミティブを再構築する演算処理を行う。 The process of delivering the converted vertices to the primitive engine for extended primitive assembly and computation is to convert the input vertex information to an algorithm by the primitive engine for assembling and computing the extended primitive. It is a process for delivering. In the present invention, this step is accomplished by selecting a primitive engine to which the transformed vertices from the vertex cache are delivered instead of a fixed primitive assembler. In this specification, the term “primitive engine” means a fixed circuit or a programmable system for realizing the operation processing of an extended primitive by the following. In the case of a variable-size extended primitive, information on the primitive size is received and stored on the primitive vertex stored therein. Reconstruct primitives based on information about the accumulated primitives. Arithmetic processing for reconstructing primitives is performed according to an algorithm that results in a simple primitive sequence that can be accessed by a fixed primitive assembling device as a result of arithmetic processing of extended primitives of variable size.

プリミティブエンジンにおける拡張されたプリミティブの演算処理により得られる固定サイズの単純プリミティブをプリミティブラスタライズのためのパイプラインを経由して固定サイズのプリミティブ集積回路へ配送する工程は，以下の工程を含む。固定されたプリミティブアセンブル装置により処理できる形式とされた拡張されたプリミティブを配送する工程である。本発明において，この配送は，頂点キャッシュ装置により単純プリミティブ列が処理された場合に変換された頂点が頂点キャッシュから固定されたプリミティブアッセンブル装置へと直接配送されるのとまさに同じ方法で配送される。従って，固定されたプリミティブアセンブル装置を修正する必要はなく,拡張されたプリミティブの演算処理を行えるようにするためにラスタライゼーションパイプラインも修正する必要がない。 The process of delivering a fixed-size simple primitive obtained by the operation processing of the extended primitive in the primitive engine to the fixed-size primitive integrated circuit via the pipeline for primitive rasterization includes the following steps. Delivering extended primitives that are formatted to be processed by a fixed primitive assembling device. In the present invention, this delivery is delivered in exactly the same way that the transformed vertices are delivered directly from the vertex cache to the fixed primitive assembler when a simple primitive sequence is processed by the vertex cache device. . Therefore, there is no need to modify the fixed primitive assembly device, and there is no need to modify the rasterization pipeline in order to be able to perform extended primitive operations.

プリミティブエンジンにおいて，拡張されたプリミティブをアセンブルし，演算処理する工程は，プリミティブエンジンにより所定のアルゴリズムを実装し，これを用いて拡張されたプリミティブに関する演算処理を行うための工程である。プリミティブエンジンによって実現される固定サイズの拡張されたプリミティブを演算処理するアルゴリズムの好ましい例は，以下のものがあげられるが，これに限定されるものではない。すなわち，メッシュシルエットの検出と視覚化を実現するためにＴＷＮプリミティブ列を演算処理するというものである。本明細書において，メッシュシルエットは，一方が視点方向を向き他方がそうではない方向を向いている１組の三角形によって共有されるトライアングル辺の集合を意味する。メッシュシルエットを視覚化するとは，メッシュをレンダリングする際に，シルエットエッジを視覚的に強調する方法を意味する。シルエットを検出し，視覚化するためのアルゴリズムの概要は以下のとおりである。ＴＷＮプリミティブのための頂点データを蓄積する。中心三角形と隣接する三角形の方向を決定する。この工程は，部分的に並行して行われても良い。そして，中心三角形の三つの辺のそれぞれから三角形の方向を比較することでシルエットエッジを検出する。検出されたシルエットエッジの四角形フラップを生成し，ラスタライズのためにそれを配送する。次のプリミティブについて同様の処理を行う。プリミティブサイズが固定されているので，プリミティブサイズをフェッチする工程は行う必要がない。 In the primitive engine, the process of assembling and calculating the extended primitive is a process for implementing a predetermined algorithm using the primitive engine and performing an arithmetic process on the extended primitive. Preferred examples of algorithms for processing fixed-size extended primitives implemented by the primitive engine include, but are not limited to, the following. In other words, the TWN primitive sequence is processed in order to realize detection and visualization of the mesh silhouette. In this specification, a mesh silhouette means a set of triangle sides shared by a set of triangles, one facing the viewpoint direction and the other facing the other direction. Visualizing a mesh silhouette means a method of visually enhancing silhouette edges when rendering a mesh. The outline of the algorithm for detecting and visualizing silhouettes is as follows. Accumulate vertex data for TWN primitives. Determine the direction of the triangle adjacent to the central triangle. This process may be performed partially in parallel. A silhouette edge is detected by comparing the directions of the triangles from each of the three sides of the central triangle. Generate a rectangular flap of the detected silhouette edge and deliver it for rasterization. 
The same processing is performed for the next primitive. Since the primitive size is fixed, it is not necessary to perform the process of fetching the primitive size.

ＴＷＮストリップやＴＷＮファンの場合は，ストリップ／ファンにおいて最初とそれ以降のプリミティブは異なる方法で蓄積される。最初のプリミティブは，プリミティブに関する演算処理が始まる前に，６つの頂点を必要とする。第２及びそれ以降のプリミティブは，既に蓄積された頂点を利用することができるので，さらなる２つの頂点のみを要求する。分離されたＴＷＮプリミティブ列の場合は，第２以降のプリミティブであっても，最初のプリミティブと同様に頂点データが蓄積される。 In the case of a TWN strip or TWN fan, the first and subsequent primitives in the strip / fan are stored differently. The first primitive requires six vertices before the operation on the primitive begins. The second and subsequent primitives require only two additional vertices since they can make use of already stored vertices. In the case of the separated TWN primitive sequence, vertex data is accumulated in the same way as the first primitive even for the second and subsequent primitives.
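
The saving from strips and fans follows directly from these counts. The small Python sketch below makes the arithmetic explicit; the function name `twn_vertices_required` and its `connected` flag are hypothetical and serve only to illustrate the rule of six vertices for the first primitive and two for each later one.

```python
def twn_vertices_required(num_primitives, connected=True):
    """Number of vertices delivered for a run of TWN primitives.

    connected=True models a TWN strip or fan (6 vertices, then 2 each);
    connected=False models separate TWN primitives (6 vertices each).
    """
    if num_primitives < 0:
        raise ValueError("num_primitives must be non-negative")
    if num_primitives == 0:
        return 0
    if connected:
        return 6 + 2 * (num_primitives - 1)
    return 6 * num_primitives
```

For the two-primitive examples of FIG. 4 and FIG. 5 this gives 8 vertices for a strip or fan, matching the 8-entry index strings, against 12 for two separate TWN primitives.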

As shown in FIG. 4(B), the TWN primitive strip corresponding to the two central triangles 111 and 113, formed by the vertex sequences {v₂, v₁, v₃} and {v₃, v₁, v₅} respectively, is represented, as shown in FIG. 4(D), by the index buffer contents {2,1,0,3,4,5,7,6}, which correspond to the sequence of vertex position attributes. The vertex cache device reads the vertices corresponding to the indices in the index buffer and passes them to the primitive engine. For example, the sixth vertex passed from the vertex cache device to the primitive engine is the one with index value 5. Once the primitive engine has the information on all vertices of the central triangle 111 and of the adjacent triangles 110, 112, and 113, it can begin determining the triangle orientations, silhouette edges, and so on for the first triangle of the TWN strip. The orientation of a triangle is evaluated as the scalar triple product of three three-component vectors, each made up of the x, y, and w components of one vertex position of the triangle in system coordinates, and is given by the following matrix.

Here the subscripts 0, 1, and 2 denote the first, second, and third vertices of the triangle, respectively. If the sign of this value differs between the two triangles sharing an edge, that edge is a silhouette edge. Determining the silhouette edges therefore requires four evaluations: one for the central triangle of the TWN primitive and one for each of its three adjacent triangles.
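
The sign test can be sketched in code. The sketch assumes that the matrix expression is the 3×3 determinant whose rows are the (x, y, w) components of the three vertices, which is one standard way to write a scalar triple product. The function names, and the choice of a positive sign as "facing the viewer", are illustrative assumptions rather than part of the specification.

```python
def orientation(v0, v1, v2):
    """Determinant of the 3x3 matrix with rows (x_i, y_i, w_i), i = 0, 1, 2."""
    (x0, y0, w0), (x1, y1, w1), (x2, y2, w2) = v0, v1, v2
    return (x0 * (y1 * w2 - w1 * y2)
            - y0 * (x1 * w2 - w1 * x2)
            + w0 * (x1 * y2 - y1 * x2))


def faces_viewer(triangle):
    # Assumed convention: a positive determinant means the triangle faces the viewer.
    return orientation(*triangle) > 0.0


def silhouette_flags(center, neighbors):
    """One flag per adjacent triangle: True if the shared edge is a silhouette edge.

    Four orientation evaluations are made in total: one for the central
    triangle and one for each of the three adjacent triangles.
    """
    center_front = faces_viewer(center)
    return [faces_viewer(n) != center_front for n in neighbors]
```

An edge is flagged only when the central triangle and the neighbor across that edge have orientations of opposite sign.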

To visualize a detected silhouette edge, extra geometry forming a rectangular flap is constructed. The flap lies in a direction perpendicular to the viewer's viewing direction, that is, to the Z axis in eye coordinates, and extends outward from the object. The outward direction is approximated by the normal direction, given by the normal vector at each mesh vertex, which is available together with the other vertex attributes once the vertices have been delivered from the vertex cache device to the primitive engine. For example, let the first and second vertices of the silhouette edge be v₀ and v₁, with eye-space coordinates (x₀, y₀, z₀, w₀) and (x₁, y₁, z₁, w₁) and associated normal vectors n₀ and n₁. The generated flap is the rectangle whose vertices have the following coordinates in the eye coordinate system: {x₀, y₀, z₀, w₀}, {x₁, y₁, z₁, w₁}, {x₁ + offset_X*n_1x*w₁, y₁ + offset_Y*n_1y, z₁, w₁}, and {x₀ + offset_X*n_0x, y₀ + offset_Y*n_0y, z₀, w₀}. A flap obtained in this way has a constant width in screen space regardless of its distance from the viewer. FIG. 5(B) shows the construction of such a flap when the coefficients offset_X and offset_Y are equal and n_0z and n_1z are 0.
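
The flap construction can be sketched as follows. One assumption is made loudly here: every offset term is multiplied by the w of its vertex, because that is what a constant screen-space width requires after the perspective divide, whereas the coordinate list above shows the w factor on a single term only. The function name `make_flap` and its parameter names are hypothetical.

```python
def make_flap(v0, v1, n0, n1, offset_x, offset_y):
    """Return the four corners of the flap for silhouette edge (v0, v1).

    v0, v1: (x, y, z, w) vertex positions; n0, n1: (nx, ny, nz) normals.
    Assumption: each offset is scaled by the vertex's w so that the flap
    width is the same in screen space at any distance.
    """
    x0, y0, z0, w0 = v0
    x1, y1, z1, w1 = v1
    return [
        (x0, y0, z0, w0),
        (x1, y1, z1, w1),
        (x1 + offset_x * n1[0] * w1, y1 + offset_y * n1[1] * w1, z1, w1),
        (x0 + offset_x * n0[0] * w0, y0 + offset_y * n0[1] * w0, z0, w0),
    ]
```

With this scaling, dividing the offset corners by w gives the same screen-space displacement for a vertex at w = 1 and a vertex at w = 2.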

For the next primitive, the vertex cache device delivers two vertices, with index values 7 and 6, which define two further triangles, 115 and 114. With this information, the silhouette computation for the second triangle of the strip can be performed. Note that four previously delivered vertices are reused. In addition, two previously computed triangle orientations are reused: that of the triangle now under consideration and that of one adjacent triangle. In the case shown in FIG. 4(B), these are triangles 113 and 111, respectively. This reuse greatly reduces the computational cost of the silhouette detection algorithm. To avoid generating duplicate flaps for a silhouette edge, flaps should be generated only for TWN primitives whose triangle has a particular orientation, for example facing the viewer. Other primitives can be ignored.

Finally, for each processed TWN primitive, the primitive engine can output simple fixed-size primitives. In the particular case of the silhouette detection and visualization algorithm, the output consists of the unmodified central triangle and the two triangles that make up the geometric flap of each silhouette edge. In implementing the algorithm, some attributes of the flap may be changed; for example, the color is replaced with black to produce a black outline. The output triangles are delivered to the fixed primitive assembler and processed as if they had been delivered directly from the vertex cache device.

A non-limiting example of an algorithm by which the primitive engine processes variable-size extended primitives is an implementation of Catmull-Clark subdivision. Other subdivision schemes, such as the Loop subdivision scheme or the 4-3 subdivision scheme, can be implemented in the same way. Catmull-Clark (CC) subdivision produces a smooth, finely tessellated mesh from a coarse mesh in several steps, each of which subdivides the mesh obtained in the previous step; that is, it refines the base mesh by applying a set of rules recursively. The rules generate a new vertex for each face of the mesh from the previous step, generate a new vertex for each edge of the mesh from the previous step, and reposition each vertex of the mesh relative to its position in the previous step. In all three cases, the vertex position in the next step's mesh is restricted to a linear combination of the vertices of the edges and faces adjacent to the element it belongs to. More specifically, a face point is placed at the average of the positions of the vertices of the original face. An edge point is computed as the average of the midpoint of the original edge and the average of the two new adjacent face points. A vertex from the previous step is repositioned according to S' = (Q + 2R + S(n-3)) / n, where S' is the new position of the vertex, Q is the average of the new face points surrounding the vertex, R is the average of the midpoints of the edges sharing the vertex, S is the vertex position in the previous step, and n is the number of edges sharing the vertex, that is, the vertex valence.
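
The three rules can be sketched directly from the description above. This is a minimal illustration that operates on explicit point lists; it does not build or traverse a mesh, and the function names are hypothetical.

```python
def average(points):
    """Component-wise average of a list of equal-length point tuples."""
    count = len(points)
    return tuple(sum(p[i] for p in points) / count for i in range(len(points[0])))


def face_point(face_vertices):
    """New face point: average of the original face's vertex positions."""
    return average(face_vertices)


def edge_point(e0, e1, f0, f1):
    """New edge point: average of the edge midpoint and the average of the
    two adjacent new face points f0 and f1."""
    midpoint = average([e0, e1])
    return average([midpoint, average([f0, f1])])


def reposition_vertex(S, face_points, edge_midpoints):
    """S' = (Q + 2R + S(n - 3)) / n, with n the vertex valence."""
    n = len(edge_midpoints)      # number of edges sharing the vertex
    Q = average(face_points)     # average of the surrounding new face points
    R = average(edge_midpoints)  # average of the incident edge midpoints
    return tuple((q + 2 * r + s * (n - 3)) / n for q, r, s in zip(Q, R, S))
```

For a valence-4 vertex at height 1 whose face points lie at height 0 and whose edge midpoints lie at height 0.5, the rule gives a new height of (0 + 2·0.5 + 1·1) / 4 = 0.5.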

The number of vertices needed to compute a vertex position in the next subdivision step is fixed at six for an edge point and four for a face point. By contrast, repositioning a vertex from the previous step requires information on all vertices belonging to the 1-neighborhood of the vertex being repositioned. The 1-neighborhood of a vertex is formed by all vertices that share a mesh face with it. In the case shown in FIGS. 6(A), 6(B), and 6(C), the 1-neighborhood of vertex v₉ in base mesh 106 is formed by vertices v₁₃, v₁₂, v₈, v₄, v₅, v₆, v₁₀, v₁₆, v₁₅, and v₁₄, and its valence is 5. A vertex may have any valence, so applying the subdivision rule requires information on the number of vertices in the neighborhood. In practice, however, the maximum valence can be bounded, and the subdivision rules can be implemented without being unduly restrictive for the shapes to be subdivided.

The CC scheme can be applied to the whole mesh, but rules can also be applied patch by patch. A subdivision surface patch can be formed from a face of the base mesh as the sequence of vertices belonging to the 1-neighborhood of that face. Because the subdivision rules are confined to the 1-neighborhood, the 1-neighborhood of a face is the union of the 1-neighborhoods of its vertices and includes the vertices of the face itself. It is known that after two CC subdivision steps all faces of the subdivided mesh are quadrilaterals and no face has more than one irregular vertex. The base faces can be designed to satisfy this condition from the start, or a single CC subdivision step can be performed to achieve the same result. In either case, without loss of generality of the CC scheme, it suffices to consider the subdivision of patches formed from quadrilaterals that have no more than one irregular vertex per face. The CCSP primitive described in the first aspect of the present invention satisfies these requirements, and CC subdivision can be implemented with a set of CCSP primitives corresponding to the set of quadrilaterals of the base mesh.

With reference to FIG. 6, the following describes how CCSP primitives are formed from the index and vertex buffers and the algorithm by which the primitive engine performs CC subdivision. FIG. 6(A) shows a base control mesh 106 containing a sequence of two adjacent CCSP primitives, formed by the central quadrilaterals {v₉, v₅, v₆, v₁₀} and {v₉, v₁₀, v₁₆, v₁₅}, which share the irregular vertex v₉ of valence 5. The index buffer containing the index sequence for the vertex position attribute is as follows: {18,9,5,6,10,8,4,0,1,2,3,7,11,17,16,15,14,13,12,18,9,10,16,15,5,6,7,11,17,21,20,19,18,14,13,12,8,4}. The CCSP sequence is processed as follows. The step of fetching the primitive size is performed by the vertex cache device, which retrieves the primitive size value 18 from the first position in the index sequence for the vertex position attribute and delivers it to the primitive engine. The vertex cache device then delivers the 18 vertices to the primitive engine according to the vertex position attribute indices. When the last vertex of the primitive, corresponding to vertex v₁₂, has been delivered to the primitive engine, processing of the CCSP primitive begins. Because the primitive size and the vertex sequence are available, the algorithm implemented by the primitive engine can reconstruct the CCSP primitive according to the description method introduced in the first aspect of the present invention. Once the primitive has been reconstructed, all the information needed for subdivision is available, so the reconstructed CCSP primitive can be subdivided by a per-patch subdivision algorithm such as that for Catmull-Clark subdivision surfaces (Jeffrey Bolz and Peter Schroder, in Proceedings of the Web3D 2002 Symposium (WEB3D-02), pages 11-18, New York, February 24-28, 2002, ACM Press). The subdivision scheme used yields a set of quadrilaterals corresponding to a fine tessellation of the central quadrilateral of the processed CCSP primitive, drawn as a fragment of the tessellated mesh (107). Each resulting quadrilateral is split into two triangles for further processing and rasterization and is delivered to the fixed primitive assembler. These steps are repeated for the next CCSP primitive.
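
The size-prefixed layout of the index buffer can be illustrated with a short parser. This is a sketch of the buffer layout only, assuming each primitive is stored as its size followed by that many indices, as in the example above; the function name is hypothetical, and reconstruction of the CCSP topology is not attempted.

```python
def split_variable_size_primitives(index_buffer):
    """Split a size-prefixed index buffer into per-primitive index lists."""
    primitives = []
    pos = 0
    while pos < len(index_buffer):
        size = index_buffer[pos]                    # primitive size comes first
        indices = index_buffer[pos + 1:pos + 1 + size]
        if len(indices) != size:
            raise ValueError("index buffer ends inside a primitive")
        primitives.append(indices)
        pos += 1 + size                             # skip to the next size entry
    return primitives


ccsp_buffer = [18, 9, 5, 6, 10, 8, 4, 0, 1, 2, 3, 7, 11, 17, 16, 15, 14, 13, 12,
               18, 9, 10, 16, 15, 5, 6, 7, 11, 17, 21, 20, 19, 18, 14, 13, 12, 8, 4]
primitives = split_variable_size_primitives(ccsp_buffer)
```

Applied to the example buffer, this yields two primitives of 18 indices each; the first ends with index 12 and the second with index 4. Note that the value 18 inside the second primitive is an ordinary vertex index, which is why the parser has to track position rather than search for size markers.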

The third aspect of the present invention relates to a system that uses the method according to the second aspect to implement the processing method according to the first aspect, which processes sequences of fixed-size or variable-size primitives.

As shown in FIG. 3, the system according to the third aspect of the present invention modifies the vertex processing stages of the prior art in order to process extended primitives. Vertex data is fetched from the vertex buffer 1200 according to indices that are stored in the index buffer 1100 or generated, and is transformed by the vertex processing unit 4000. The transformed vertex data is delivered to the primitive engine 9000, where extended primitives are assembled and processed. The simple primitives produced by the primitive engine 9000 from the extended primitives are delivered to the rest of the processing pipeline, such as the fixed primitive assembler 6000 (the circuit that processes fixed primitives). This stage applies only when extended primitives are processed. When simple primitives are processed, the second stage is absent and the transformed vertices are delivered directly to the fixed primitive assembler 6000. A known design, such as the one shown in FIG. 1, may be used for the fixed primitive assembler 6000.

As shown in FIG. 3, a specific example of the system of the present invention includes the following modules: a vertex cache control unit (VCC) 2000, a first vertex cache storage unit (PVC) 3000, a second vertex cache storage unit (SVC) 5000, one or more vertex processing units (VPU) 4000, a primitive engine (PE) 9000, and a fixed-size primitive assembler (FPA) 6000. The rest of the pipeline comprises a fixed-size primitive setup unit 7000 and a rasterizer 8000, which process fixed sequences of simple primitives such as triangles.

FIG. 3 does not show the other units that exchange information with, or do processing for, the system described above. They are omitted for brevity and to keep the focus on what is characteristic of the present invention. The omitted units include the host CPU, the host memory, and the triangle rasterization pipeline. Their interactions are described only to the extent needed in relation to the present invention. Among the modules above, the primitive engine and the logic of the vertex cache control unit that concerns the processing of variable-size extended primitives are adapted as needed, and known designs may be used for the rest. That is, the other modules may be similar to those used in current three-dimensional computer graphics hardware to process simple fixed-size primitives such as points, lines, and triangles.

The vertex cache control unit (VCC) 2000 performs the following operations. It parses the contents of the index buffer. Depending on the state of the PVC 3000 and the SVC 5000, it fetches vertex buffer contents into the PVC 3000 according to the vertex indices fetched from the index buffer 1100. It controls the transfer of vertex data from the PVC 3000 to the VPU(s) 4000 for per-vertex processing. It controls the storage in the SVC 5000 of the transformed vertex data output by the VPU(s) 4000. Depending on the primitive type (whether or not it is a variable-size extended primitive), it delivers the contents of the SVC 5000 to either the PE 9000 or the FPA 6000; specifically, it determines the primitive type and, if the primitive is a variable-size extended primitive, delivers the contents of the SVC 5000 to the PE 9000. When variable-size extended primitives are processed, it delivers the primitive size to the PE 9000. After the primitive size and the vertex data for all vertices of the primitive have been delivered to the PE 9000, it notifies the PE 9000 that processing of the variable-size extended primitive may begin. As noted above, in a preferred embodiment of the present invention the primitive size is delivered only to the PE 9000. Likewise, the contents of the SVC are delivered to the PE, and the PE 9000 is notified that processing of the variable-size extended primitive may begin once the primitive size and the vertex data for all vertices of the primitive have been delivered. The step of parsing the index buffer is also specific to the present invention, insofar as it concerns extraction of the primitive size and processing of variable-size extended primitives. For the other operations, known processing used for simple primitives may be adopted as appropriate.
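
The routing decision and the message order described above can be sketched as a small function. The message tuples, the three `kind` strings, and the function name are hypothetical modeling choices; the specification describes hardware signalling, not a software interface.

```python
def vcc_messages(kind, vertices):
    """Return the ordered (destination, message, payload) tuples the VCC would
    send for one primitive.

    kind: "simple", "fixed_extended", or "variable_extended" (hypothetical labels).
    """
    if kind == "simple":
        # Simple primitives bypass the primitive engine.
        return [("FPA", "vertex", v) for v in vertices]
    messages = []
    if kind == "variable_extended":
        messages.append(("PE", "size", len(vertices)))   # size is sent first
    messages += [("PE", "vertex", v) for v in vertices]
    if kind == "variable_extended":
        messages.append(("PE", "start", None))           # all data delivered
    return messages
```

For a variable-size primitive the primitive engine sees the size, then every vertex, then the start notification; for a fixed-size extended primitive neither the size nor the notification is needed.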

The first vertex cache storage unit (PVC) caches large, contiguous blocks of vertex buffer data brought in from host memory. The PVC also serves as the storage from which untransformed vertex data is delivered to the VPU(s) for per-vertex processing. Filling the PVC with untransformed vertex data in relatively large contiguous memory blocks reduces the impact of memory transfer latency. In addition, because the PVC and the VPU(s) are located physically or electrically close to each other, the transfer latency is reduced when untransformed vertex data that must be processed is already in the PVC.

A vertex processing unit (VPU) may be a fixed-function module or a programmable module capable of executing vertex processing as defined by 3D graphics APIs such as OpenGL and/or Direct3D. The VPU(s) take as input various untransformed vertex attributes, such as position, color, and texture coordinates, and produce various attributes of the transformed vertex, such as position, color, texture coordinates, and view vector. The number and dimensions of the inputs and outputs, that is, the formats of the input and output vertex data, may differ. The VPU(s) receive their input from the PVC and send their output to the SVC.

The primitive engine (PE) is either a fixed-function module or a programmable module that processes extended primitives. Implementing the PE as a fixed-function module is effective where a limited set of functions must be delivered at high performance. Implementing the PE as a programmable module is preferable because it allows a free choice of algorithm for processing extended primitives. In the extended-primitive processing mode, the PE receives transformed vertices from the SVC, assembles the primitive, processes it, and outputs the result as a sequence of simple primitives that the FPA can understand. The PE is a module specific to the present invention. When implemented as a programmable module, however, it can have much the same functionality as a programmable VPU, sharing much of its functional logic, apart from the added control for sending the results of extended-primitive processing to the FPA as a sequence of simple primitives. Like the FPA, the PE receives its input from the SVC and the VCC. The main difference between the PE and the other units arises from the need to exchange the extended-primitive size and, when variable-size extended primitives are processed, a notification signal indicating that the vertex data for all vertices of the extended primitive has been delivered. As described below, the VCC conveys the primitive size to the PE over the same data channel that is used for vertex data.

Finally, the fixed-size primitive assembler (FPA) is a fixed-function module that assembles simple primitives such as points, lines, and triangles from the vertex sequence delivered from the SVC. The module supports the set of primitives available in recent 3D graphics APIs such as OpenGL and Direct3D: points, lines, line loops, triangles, triangle strips, and triangle fans. When extended primitives are processed, the FPA receives its input from the PE. Because the PE passes transformed vertices using the same protocol that the SVC and the VCC use to deliver vertex data to the FPA, the source of the data can be completely transparent to this module.

The vertex cache control unit controls all primitive processing on the chip. When the vertex data of a primitive is referenced indirectly through the information stored in the index buffer, the unit can accelerate index buffer and vertex buffer processing using the information stored in the PVC and the SVC. When the vertex data of a primitive is formed by the sequence of vertex data in the vertex buffer, so that only the vertex buffer is referenced, the unit can accelerate processing using the PVC. The VCC operates in one of two modes according to the primitive: a mode for processing fixed-size primitives and a mode for processing extended-size primitives. The VCC processes fixed-size extended primitives in the same way as simple primitives.

First, to process the vertex attribute index set corresponding to the vertex currently being handled, the untransformed vertex data is loaded and sent to the VPU(s). If the SVC holds no transformed vertex for the currently sampled vertex attribute index set, and the PVC holds no untransformed vertex data for it either, the VCC loads a chunk of vertex buffer contents from host memory into the PVC. The chunk contains the vertex data for the index set in question and is placed in free space in the PVC or overwrites unused data stored there earlier. The chunk may also contain vertex data for other indices, so it may be used further for as long as it remains in the PVC. Once the vertex data is available in the PVC, the VCC begins delivering it to the VPU(s), in round-robin fashion or by some other scheme. If the transformed vertex data is not in the SVC but the untransformed data is in the PVC, the VCC begins transferring the vertex data from the PVC to the VPU(s) without accessing host memory, which saves host memory access time. In this way, the PVC speeds up the processing of untransformed vertex data.
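
The lookup order described above (SVC first, then PVC, then host memory) can be modeled with two dictionaries. This is a simplified sketch under several assumptions that are not in the specification: chunks are aligned to a fixed `chunk_size`, both caches are unbounded so no eviction occurs, and `transform` stands in for the VPU.

```python
def fetch_vertex(index, svc, pvc, host_vertex_buffer, transform, chunk_size=4):
    """Return (transformed_vertex, source), where source tells where it was found.

    svc: dict index -> transformed vertex; pvc: dict index -> untransformed vertex.
    """
    if index in svc:
        return svc[index], "svc_hit"          # already transformed: no VPU work
    source = "pvc_hit"
    if index not in pvc:
        # Load a whole chunk around the requested vertex from host memory.
        start = (index // chunk_size) * chunk_size
        end = min(start + chunk_size, len(host_vertex_buffer))
        for i in range(start, end):
            pvc[i] = host_vertex_buffer[i]
        source = "host_load"
    svc[index] = transform(pvc[index])        # VPU transforms, result goes to SVC
    return svc[index], source
```

Because a chunk carries the data of neighboring vertices as well, a later request for a nearby index is served from the PVC without touching host memory.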

The different vertex attributes are preferably stored at separate locations in host memory, with different strides and attribute sizes. The VCC assembles the input data for the VPU from these scattered vertex attributes. In a specific implementation, the input to the VPU consists of four-component floating-point vectors, and the input data for each vertex may span several such vectors. The VCC assembles the input data from the scattered attributes, performing any type conversion required, for example converting 2-byte integer values to floating point, and packing the attributes into four-component vectors according to the configuration. For example, two sets of two-component texture coordinate attributes are packed into one four-component input vector. The data is then sent to a VPU as a batch of input vectors when the VPU becomes available.
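
The two conversions mentioned above can be illustrated in a few lines. The sketch assumes little-endian signed 16-bit integers and no normalization, neither of which is stated in the specification; the function names are hypothetical.

```python
import struct


def int16_to_float(raw: bytes) -> float:
    """Convert a 2-byte integer to floating point.

    Assumptions: little-endian, signed, value kept as-is (not normalized).
    """
    (value,) = struct.unpack("<h", raw)
    return float(value)


def pack_texcoords(uv0, uv1):
    """Pack two 2-component texture coordinate sets into one 4-component vector."""
    return (float(uv0[0]), float(uv0[1]), float(uv1[0]), float(uv1[1]))
```

Packing two texture coordinate sets into one vector halves the number of input vectors the VPU has to receive for those attributes.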

When one or more VPUs finish processing vertex data and become available, the input for the next vertex to be transformed is delivered to a VPU as four-component floating-point vectors. After receiving a certain number of vectors, the VPU begins processing the vertex. That number depends on the number of attributes per vertex and on their packing, and is determined via the VPU. When the VPU finishes the transformation, it outputs four-component floating-point vectors to the SVC. When several VPUs operate in parallel, the VCC manages VPU access to the SVC so that the vertex sequence is represented correctly. The format of the transformed vertex, that is, the number of attributes and their packing into output vectors, may differ from that of the input.

The storage entity of the SVC is also a four-component floating-point vector, so each transformed vector occupies one SVC entry. In an actual embodiment, the SVC simply operates as a FIFO queue: if the FIFO overflows when the transformed data of the most recently processed vertex is stored, the oldest data is flushed out. Of course, this does not prevent the SVC from using another replacement policy such as LRU (least recently used). The essence of the present invention does not depend on such details.
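The FIFO behavior can be illustrated with the short sketch below. It is a simplified software model under stated assumptions: the class name `FifoSVC` is hypothetical, and capacity is counted in vertices rather than in individual four-component vectors as in the embodiment.

```python
from collections import OrderedDict


class FifoSVC:
    """Toy model of a secondary vertex cache with FIFO replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # vertex index -> list of vec4 tuples

    def lookup(self, index):
        # FIFO policy: a hit does not change the eviction order.
        return self.entries.get(index)

    def store(self, index, vectors):
        if index in self.entries:
            return
        while len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # flush the oldest entry
        self.entries[index] = vectors


svc = FifoSVC(capacity=2)
svc.store(1, [(0.0, 0.0, 0.0, 1.0)])
svc.store(2, [(1.0, 0.0, 0.0, 1.0)])
svc.lookup(1)                          # hit, but vertex 1 stays the oldest
svc.store(3, [(2.0, 0.0, 0.0, 1.0)])   # overflow: vertex 1 is flushed
```

An LRU variant would differ only in `lookup`, which would move the hit entry to the newest position before returning it.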

When a simple primitive sequence is processed, the transformed vertex data is delivered from the SVC to the FPA in vertex sequence order. The FPA starts processing a vertex once it has received a predetermined number of four-component vectors from the SVC; that number equals the number of vectors the VPU outputs for a transformed vertex. The FPA assembles triangles from the transformed vertex sequence and delivers them to the rasterization pipeline for rasterization.

Because extended primitives are processed in the PE, the transformed vertex data is first sent to the PE. The PE also has two main operating modes: processing of fixed-size primitives and processing of variable-size extended primitives. In the fixed-size mode, the PE, like the FPA, starts processing a vertex once it has received a predetermined number of four-component vectors from the SVC. The result of processing an extended primitive is delivered from the PE to the FPA as a sequence of simple primitives. This sequence is output in a form that can be reproduced as a vertex sequence of simple primitives, expressed as four-component vectors. This means that most of the hardware used for processing simple primitives, such as the VCC, PVC, VPUs, SVC, and FPA, can be reused for extended primitives merely by adding the PE as the unit that processes them. Furthermore, for vertex attributes stored in host memory, the vertex cache is used to the same extent as in simple primitive processing, so extended primitives can be processed quickly and on-chip.

In the mode for processing fixed-size extended primitives, the PE runs its processing algorithm each time it receives the data for one vertex, that is, after receiving a specific number of four-component vectors from the SVC. The algorithm manages the accumulation of the received vertex data and detects the moment when a fixed-size extended primitive has been reconstructed from the vertex sequence and its processing can be completed.
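The accumulate-and-detect behavior can be sketched as a small state machine. This is an illustration only: the class name `FixedSizeAccumulator` is hypothetical, each vertex is represented by a single value instead of a group of four-component vectors, and primitives are assumed to occupy disjoint runs of the vertex sequence.

```python
class FixedSizeAccumulator:
    """Collects vertices until a fixed-size primitive is complete."""

    def __init__(self, primitive_size):
        self.primitive_size = primitive_size
        self.pending = []

    def push(self, vertex):
        """Accept one vertex; return the finished primitive or None."""
        self.pending.append(vertex)
        if len(self.pending) < self.primitive_size:
            return None
        primitive, self.pending = self.pending, []
        return primitive


# Demonstration with six-vertex primitives (an assumed size).
acc = FixedSizeAccumulator(primitive_size=6)
results = [acc.push(v) for v in range(12)]
```

Only the sixth and twelfth calls return a primitive, which corresponds to the moment the algorithm detects that a fixed-size primitive has been reconstructed.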

One preferred example of system operation with fixed-size extended primitives is the mode that processes TWN primitive sequences, which follows the fixed-size extended primitive procedure described above. A TWN primitive sequence has spatial coherency similar to that of its central triangle sequence. However, since the primitive is twice the size of a plain triangle, the secondary cache must be enlarged for the cache hit rate to match that obtained when processing the central triangle sequence.

A TWN primitive sequence is processed in the same way as a simple primitive sequence with respect to fetching, transformation, and caching of vertex data. The difference is that the transformed vertex data is sent to the PE rather than to the FPA. The transformed vertex data is delivered from the SVC to the PE one vertex at a time, in the order of the input vertices.

To process variable-size extended primitives, the VCC logic is modified in the present invention to provide the following means: means for correctly parsing the contents of the index buffer and separating the primitive size from the vertex attribute indices of the vertex position attribute of the vertices forming the primitive; means for delivering the primitive size information to the PE; and means for counting the vertices of the variable-size extended primitive that have not yet been delivered to the PE and for signaling the PE that vertex data delivery is complete so that it can start processing the primitive. This modification is specific to the present invention. The remaining VCC functionality is the same as that used for fixed-size primitives and can be reused unchanged.

In a specific implementation, in the variable-size extended primitive mode of the VCC logic, the first index of a primitive in the index buffer for the vertex position attribute specifies the size of that primitive. Referring to FIG. 7, the VCC 2000 initializes its internal vertex counter 2100 with this size and decrements the counter for each subsequent index processed, in order to detect the start of the next primitive. So that the size is available to the PE while it processes the primitive, the size is delivered to the primitive size register 9100 in the PE over the same path by which the vertex attributes of the vertex sequence forming the extended primitive are delivered from the SVC. The primitive size register 9100 is implemented so that it can be accessed by the primitive processing algorithm in the PE, allowing the primitive to be reconstructed from the vertex sequence.
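The size-prefixed index layout and the counter-based detection of primitive boundaries can be sketched as follows. The function name `split_primitives` and the error handling for a truncated buffer are assumptions added for illustration; the hardware counter 2100 is modeled by the local variable `counter`.

```python
def split_primitives(index_buffer):
    """Split a size-prefixed index stream into (size, vertex indices) pairs."""
    primitives = []
    pos = 0
    while pos < len(index_buffer):
        size = index_buffer[pos]   # first index of a primitive holds its size
        counter = size             # vertex counter initialized from the size
        pos += 1
        vertices = []
        while counter > 0:
            if pos >= len(index_buffer):
                raise ValueError("index buffer ends inside a primitive")
            vertices.append(index_buffer[pos])
            pos += 1
            counter -= 1           # one decrement per delivered index
        # counter == 0: the next entry, if any, starts a new primitive.
        primitives.append((size, vertices))
    return primitives


stream = [3, 10, 11, 12, 4, 12, 11, 13, 14]
parsed = split_primitives(stream)
```

In this sketch the point where `counter` reaches zero plays the role of the completion signal that tells the PE it may start processing the primitive.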

Once the primitive size information has been delivered from the VCC to the PE, the VCC can start sending transformed vertex data from the SVC to the PE. Fetching, transformation, and delivery of vertex data to the SVC proceed as for fixed-size primitives. The vertex delivery order is determined by the index sequence stored in the index buffer after the first index, which contains the size information. The primitive processing algorithm in the PE runs after the attribute set of each vertex has been delivered to the PE; it accumulates the vertex information, and the VCC vertex counter 2100 is decremented. When the counter reaches zero, all vertex data has been delivered to the PE, and a notification signal is sent from the VCC to the PE over connection 2300. On receiving the signal over connection 2300, the PE begins its internal primitive processing.

One preferred example of system operation is processing a CCSP primitive sequence to generate a subdivision surface from a control mesh described as a sequence of CCSP primitives. The processing follows the variable-size extended primitive procedure. The spatial coherency of a CCSP primitive sequence is extremely high. In fact, for adjacent patches without irregular vertices, consecutive CCSP primitives can share as many as 12 of their 16 vertices, so only 4 additional vertices need to be transformed and stored in the SVC to process the next CCSP primitive. The use of the vertex cache for extended primitives thus greatly reduces data access cost and enables on-chip processing of subdivision patches together with their neighborhood. The example shown in FIG. 6 consists of two consecutive patches that share 14 of 22 vertices in total; the SVC hit rate is about 64% and is expected to be higher for longer sequences. Without a vertex cache and index/vertex buffers, processing such large primitives would incur an extremely large memory cost for vertex processing.
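The vertex-sharing figures can be checked with a small calculation. The sketch below assumes regular patches taken from a rectangular control grid, each using a 4x4 neighborhood of control points, and an idealized cache that never evicts; the helper names `patch_indices` and `overall_hit_rate` are hypothetical. How FIG. 6 counts vertex references is not reproduced here, so the last line only notes that 14/22 is consistent with the "about 64%" figure.

```python
def patch_indices(col, row, grid_width):
    """Indices of the 4x4 control-point block whose corner is (col, row)."""
    return [(row + r) * grid_width + (col + c)
            for r in range(4) for c in range(4)]


def overall_hit_rate(patches):
    """Fraction of vertex references served by an idealized, unbounded cache."""
    cached, hits, refs = set(), 0, 0
    for patch in patches:
        for idx in patch:
            refs += 1
            if idx in cached:
                hits += 1
            cached.add(idx)
    return hits / refs


a = patch_indices(0, 0, 16)
b = patch_indices(1, 0, 16)                  # neighbor one column to the right
shared = len(set(a) & set(b))                # 12 shared control points
reuse_for_next_patch = shared / len(b)       # 12 / 16 = 0.75
row_of_ten = overall_hit_rate([patch_indices(c, 0, 16) for c in range(10)])
fig6_ratio = round(14 / 22, 3)               # 0.636, i.e. about 64%
```

Under these assumptions every patch after the first finds 12 of its 16 vertices already cached, so the overall rate for a row of patches rises toward 75% as the row gets longer.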

In the case of a CCSP subdivision patch, once the signal over connection 2300 described above has been received, all the information needed for processing has been delivered to the PE. The PE can therefore begin generating the subdivided mesh contained in the central quadrilateral of the CCSP primitive, following the Catmull-Clark subdivision surface algorithm disclosed in Jeffrey Bolz and Peter Schroder, in Proceedings of the Web3D 2002 Symposium (WEB3D-02), pages 11-18, New York, February 24-28, 2002, ACM Press, or another per-patch subdivision algorithm. As with other extended primitives, the result of processing the subdivision patch is delivered to the FPA as triangles, triangle strips, and so on, depending on the input. Another aspect of the present invention realizes the above system in an integrated circuit that can process fixed-size or variable-size extended primitives on-chip. A graphics card including such an integrated circuit and a video game device including such an integrated circuit are also provided.

The present invention relates to a system that uses an integrated circuit as described above to achieve high-speed processing of subdivision surfaces generated by the Catmull-Clark subdivision scheme. The apparatus for processing extended primitives of the present invention can be used for interactive three-dimensional computer graphics with real-time rendering. Examples include implementations of algorithms that access many vertices of a polygon mesh at once, and real-time rendering of complex geometric shapes represented by tessellated NURBS patches or subdivision surfaces.

For example, such an apparatus can be used in hardware accelerator cards for 3D computer graphics that visualize three-dimensional images, personal digital assistants, video game devices, car navigation systems, and the like.

Various three-dimensional computer graphics images were obtained by implementing the apparatus of the present invention on an emulator known as PICA from Digital Media Professionals Inc. FIGS. 8A, 8B, 9A, 9B, 10A, 10B, 11A, and 11B show the images obtained with the apparatus of the present invention. FIGS. 8A and 8B compare results without and with silhouette visualization: the silhouette is not visualized in FIG. 8A and is visualized in FIG. 8B. Silhouette visualization as shown in FIG. 8B can be used where the outline of a shape must be emphasized, for example when animating cartoon characters.

FIGS. 9A and 9B show renderings of a simple disc-like shape before and after subdivision according to the present invention. FIG. 9A uses the original coarse mesh, and FIG. 9B uses the mesh refined by adding polygons in the subdivision process. FIGS. 10A and 10B show subdivision applied to a more complex shape than that of FIGS. 9A and 9B. FIGS. 11A and 11B show the shaded versions of FIGS. 10A and 10B, respectively, without the wireframe. The present invention is accelerated by the vertex cache and allows the subdivision algorithm to be implemented on-chip. It therefore enables real-time generation and visualization of complex shapes obtained by subdividing simple meshes, which is an important feature for video games.

The apparatus for processing extended primitives of the present invention can be used in fields such as real-time three-dimensional computer graphics.

Claims

A method for processing fixed-size simple primitive sequences and fixed-size or variable-size extended primitive sequences by using a vertex cache in a 3D computer graphics system having a vertex cache control unit, the method comprising:
an analysis step in which the vertex cache control unit analyzes the contents of an index buffer that stores the simple primitive sequence and the extended primitive sequence, and determines, according to the type of a primitive, whether the primitive is the variable-size extended primitive;
a step in which, when the vertex cache control unit determines in the analysis step that the primitive is the variable-size extended primitive, the primitive size of the variable-size extended primitive is fetched from the index buffer of the vertex position attribute and delivered to a primitive engine, which is a unit for assembling and processing the variable-size extended primitive;
a step in which, when the vertex data is not in the vertex cache, the vertex cache control unit fetches vertex data for the vertices of the extended primitive into the vertex cache, transforms the vertex data, and stores it;
a step in which the vertex cache control unit delivers the transformed vertex data to the primitive engine for assembling and processing of the extended primitive;
a step in which the primitive engine assembles and processes the extended primitive;
a step in which the primitive engine outputs fixed-size simple primitives resulting from the processing of the extended primitive to a pipeline for rasterizing primitives; and
a step in which, when the vertex cache control unit determines in the analysis step that the primitive is the simple primitive, the simple primitive is delivered to the pipeline for rasterizing primitives.

Fetching the primitive size of the variable-size extended primitive from the index buffer of the vertex position attribute
is done by accessing the index buffer that describes the vertex position attribute sequence for the vertices of the variable-size extended primitive, and retrieving the first index value for that primitive together with the next N_i indices that reference the vertices forming the primitive,
where N_i denotes the value retrieved for the i-th primitive, and the size of the next primitive is stored at the (N_i + 1)-th position following the size position of the previous primitive in the index buffer,
The method of claim 1.

When there is no more vertex data from the vertex cache, the steps of fetching vertex data relating to the vertices of the extended primitive to the vertex cache, converting the vertex data, and storing are as follows:
In the case of fixed-size extended primitives, this is done from the vertex buffer indicated by aligning the fixed-size extended primitives or those stored in the vertex buffer;
When performing operations on variable-size extended primitives, fixed-size extensions with the first index associated with the primitive in the index buffer, including the index of the vertex position attribute that is not relevant by referring to the contents in the vertex buffer In the case of a selected primitive or a variable-size extended primitive, from the vertex buffer of the column selected from one or more index buffers,
The method of claim 1.

A system that, in 3D computer graphics, processes fixed-size simple primitive sequences and fixed-size or variable-size extended primitive sequences using a vertex cache, the system comprising:
an initial vertex buffer store (PVC) for accessing a recently used portion of the vertex buffer and obtaining vertex data from the vertex buffer;
one or more vertex processing units (VPUs) for transforming the vertex data obtained from the initial vertex buffer store (PVC);
a second vertex cache store (SVC) for accessing transformed vertex data recently processed by the vertex processing unit (VPU);
a primitive engine (PE) for storing and processing simple and extended primitives assembled from the vertex data delivered from the second vertex cache store (SVC), the result of the processing being a series of simple primitives;
a rasterizer for collecting a series of fixed-size simple primitives from the series of simple primitives, rasterizing the series of fixed-size simple primitives into fragments, and outputting the fragments to a processing pipeline; and
a vertex cache control unit (VCC),
wherein the vertex cache control unit (VCC):
accesses the index buffer storing the simple primitive sequence and the extended primitive sequence, analyzes the contents of the index buffer, and determines, according to the type of a primitive, whether the primitive is the variable-size extended primitive;
when processing the variable-size extended primitive, conveys information about the primitive size of the variable-size extended primitive to the primitive engine (PE);
controls the initial vertex buffer store (PVC) to obtain vertex data from the vertex buffer;
controls the collection of vertex data stored in the initial vertex buffer store (PVC) and, if the transferred vertex data is lost from the initial vertex buffer store (PVC), controls the transfer of the collected vertex data to the vertex processing unit (VPU); and
in order to perform these processes and to collect information about the simple primitives and the extended primitives, controls the delivery of vertex data from the second vertex cache store (SVC) to the primitive engine (PE).

When processing variable-size extended primitives, the process that delivers the primitive size to the primitive engine also delivers to the primitive engine the transformed vertex data of the vertices forming the primitive,
the system according to claim 4.

The vertex cache control unit (VCC) is designed to handle fixed-size simple primitives and fixed-size or variable-size extended primitives,
the fixed-size simple primitive processing mode is used for processing fixed-size simple primitives and fixed-size or variable-size extended primitives,
and the variable-size extended primitive processing mode is used for processing variable-size extended primitives,
The system according to claim 4.

An integrated circuit comprising the system of claim 4, wherein the series of fixed size extended primitives operates on a series of variable size extended primitives.

A graphics card comprising the integrated circuit according to claim 7.

A video game machine comprising the integrated circuit according to claim 7.