JP5832284B2

JP5832284B2 - Shader complex having distributed level 1 cache system and centralized level 2 cache

Info

Publication number: JP5832284B2
Application number: JP2011511651A
Authority: JP
Inventors: ピー．ドローリエアンソニー; レザーマーク; エス．ハートグロバート; ジェイ．メンターマイケル; シー．ファウラーマーク; ピー．ジニマルコス
Original assignee: Advanced Micro Devices Inc
Current assignee: Advanced Micro Devices Inc
Priority date: 2008-05-30
Filing date: 2009-06-01
Publication date: 2015-12-16
Anticipated expiration: 2029-06-01
Also published as: WO2009145919A1; EP2294571A1; KR20110015034A; CN102047316A; JP2011523745A; EP2294571A4; CN102047316B; KR101427409B1

Description

The present invention relates generally to computational processing performed within a computer system, and more particularly to graphics processing tasks performed within a computer system.

A graphics processing unit (GPU) is a complex integrated circuit specifically designed to perform graphics processing tasks. A GPU can, for example, execute graphics processing tasks required by an end-user application such as a video game application. In such an example, there are several layers of software between the end-user application and the GPU.

The end-user application communicates with an application programming interface (API). The API allows the end-user application to output graphics data and commands in a standardized format rather than in a format that depends on the GPU. Several types of APIs are commercially available, including DirectX (registered trademark), developed by Microsoft Corp., and OpenGL (registered trademark), developed by Silicon Graphics, Inc. The API communicates with a driver. The driver translates the standard code received from the API into a native instruction format understood by the GPU. The driver is typically written by the manufacturer of the GPU. The GPU then executes the instructions from the driver.

In a process known as rendering, a GPU generates the pixels that make up an image from a higher-level description of its components. GPUs typically use the concept of continuous rendering through pipelines that process pixel, texture, and geometric data. These pipelines are often described as a collection of fixed-function, special-purpose pipelines, such as rasterizers, setup engines, color blenders, hierarchical depth, and texture mapping, together with a collection of programmable stages that can be implemented in shader pipes or shader pipelines. "Shader" is a computer graphics term for a set of software instructions used by a graphics resource to produce rendering effects. A GPU can also employ multiple programmable pipelines in a parallel processing design to obtain higher throughput. Multiple shader pipelines are sometimes referred to as a shader pipe array.

GPUs also support a concept known as texture mapping. Texture mapping is a process used to determine the texture color of a texture-mapped pixel using the colors of nearby texture pixels, or texels. The process is also called texture smoothing or texture interpolation. However, high-quality texture mapping requires a high degree of computational complexity.

Furthermore, GPUs with unified shaders simultaneously support many types of shader processing, including pixel, vertex, primitive, and surface processing, and general-purpose computing is raising the demand for higher-performance general-purpose memory access capabilities.

Texture filters rely on fast access to local cache memory for pixel data. However, using dedicated local cache memory for texture filters typically precludes the use of more general-purpose shared memory. General-purpose shared memory is more flexible, but it typically has slower response times and is therefore less performant.

Given the growing complexity of new software applications, there is also a growing demand for GPUs to provide efficient, high-quality rendering, texture filtering, and error correction.

Accordingly, what is needed is a system and/or method that alleviates the aforementioned shortcomings. In particular, what is needed is a distributed level 1 cache system for each texture filter, combined with a centralized, sharable level 2 cache system.

This section is intended to summarize some aspects of the present invention and to briefly introduce some preferred embodiments. Simplifications or omissions may be made to avoid obscuring the purpose of this section. Such simplifications or omissions are not intended to limit the scope of the invention. In accordance with the principles of the present invention as embodied and broadly described herein, the invention includes a method and apparatus whereby a shader pipe texture filter uses a level 1 cache system as its primary means of storage, while the level 1 cache system can read from and write to a level 2 cache system as needed. Each level 1 cache system is associated with a specific shader pipe texture filter, whereas the level 2 cache memory has no such association and is therefore available to all level 1 cache systems. In addition, a level 1 cache system can allocate a memory region defined to be sharable among other resources.

In an embodiment of the invention, the level 1 cache system is configured with dual access so that two shader pipe texture filters have access to a single level 1 cache system.

In another embodiment, more than one level 2 cache system is configured to be accessible by each level 1 cache system.

In another embodiment, communication between the level 1 cache systems and the level 2 cache systems uses more than one memory channel, which provides greater data throughput.

In another embodiment, one or more level 1 cache systems can allocate a region of memory defined to be shared among other resources, including other level 1 cache systems. In certain instances, this approach allows faster fetch times for texel data when the requested data has already been moved from the level 2 cache system to a level 1 cache system.

Further features and advantages of the invention, as well as the structure and operation of various embodiments of the invention, are described in detail below with reference to the accompanying drawings. Note that the invention is not limited to the specific embodiments described herein. These embodiments are presented for illustrative purposes only. Additional embodiments will be apparent to persons skilled in the relevant art based on the teachings contained herein.

The accompanying drawings, which are incorporated herein and form part of the specification, illustrate embodiments of the invention and, together with the general description given above and the detailed description of the embodiments given below, serve to explain the principles of the invention.

FIG. 1 is a system diagram illustrating an implementation of a single level 1 cache system with a single level 2 cache system.

FIG. 2 is a system diagram illustrating an implementation of multiple level 1 and level 2 cache systems.

FIG. 3 is a system diagram illustrating an implementation of dual-port level 1 cache systems with multiple level 2 cache systems.

FIG. 4 is a flowchart illustrating an implementation of a method for a shader filter cache system.

The features and advantages of the invention will become more apparent from the detailed description set forth below when taken in conjunction with the drawings, in which like reference characters identify corresponding elements throughout. In the drawings, like reference numbers generally indicate identical, functionally similar, and/or structurally similar elements. The drawing in which an element first appears is indicated by the leftmost digit or digits of the corresponding reference number.

The invention will be better understood from the following description of various "embodiments" of the invention. The specific "embodiments" are thus views of the invention, but none of them is in itself the whole invention. In one respect, the invention relates to a distributed level 1 cache system with a centralized level 2 cache system. Each shader pipe texture filter has a dedicated level 1 cache system, which provides read and write access to the texel data it contains. There are also one or more level 2 cache systems, which are not dedicated to any shader pipe texture filter and are therefore accessible by all level 1 cache systems.

Although specific configurations, arrangements, and steps are discussed, it should be understood that this is done for illustrative purposes only. Persons skilled in the relevant art will recognize that other configurations, arrangements, and steps can be used without departing from the spirit and scope of the invention. It will be apparent to persons skilled in the relevant art that this invention can also be employed in a variety of other applications.

Note that references in the specification to "one embodiment," "an embodiment," "an exemplary embodiment," and the like indicate that the embodiment described may include a particular feature, structure, or characteristic, but not every embodiment necessarily includes that feature, structure, or characteristic. Moreover, such phrases do not necessarily refer to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is within the knowledge of one skilled in the art to implement that feature, structure, or characteristic in connection with other embodiments, whether or not explicitly described.

Although the invention is described herein with reference to illustrative embodiments for particular applications, it should be understood that the invention is not limited to them. Those skilled in the art with access to the teachings provided herein will recognize additional modifications, applications, and embodiments within its scope, as well as additional fields in which the invention would be of significant utility.

FIG. 1 is a diagram of a system 100 with a single level 1 cache system and a single level 2 cache system, according to an embodiment of the invention. System 100 includes a single shader pipe texture filter 110, whose associated level 1 cache system 120 is configured to communicate with a level 2 cache system 130 over a wide channel memory bus 125.

In the embodiment shown in FIG. 1, shader pipe texture filter 110 uses bilinear filtering to determine the color of a particular pixel. During bilinear filtering, shader pipe texture filter 110 analyzes the texel data for the four pixels closest to the pixel in question. The texel data for the four texels is then combined by a distance-weighted average to compute the desired result. The texel data in question is retrieved from the level 1 cache system 120 associated with the active shader pipe texture filter 110.
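The distance-weighted average of four texels can be illustrated with a minimal Python sketch. The function name `bilinear_sample`, the list-of-rows texture layout, the scalar texel values, and the clamping at the texture edge are all assumptions made for illustration; the patent does not specify how the filter hardware computes the blend.

```python
import math


def bilinear_sample(texture, u, v):
    """Blend the four texels nearest to (u, v), weighted by distance.

    `texture` is a list of rows of scalar texel values. `u` and `v` are
    texel-space coordinates (hypothetical convention: texel centers sit
    at integer coordinates).
    """
    height, width = len(texture), len(texture[0])
    # Clamp the sample point so that all four neighbors exist.
    u = min(max(u, 0.0), width - 1.0)
    v = min(max(v, 0.0), height - 1.0)
    x0, y0 = int(math.floor(u)), int(math.floor(v))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    # Fractional offsets act as the distance-based weights.
    fx, fy = u - x0, v - y0
    top = texture[y0][x0] * (1 - fx) + texture[y0][x1] * fx
    bottom = texture[y1][x0] * (1 - fx) + texture[y1][x1] * fx
    return top * (1 - fy) + bottom * fy
```

A sample taken exactly halfway between four texels returns their plain average, and a sample taken at a texel center returns that texel unchanged.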

However, if the desired texel data is not present in level 1 cache system 120 when it is needed, level 1 cache system 120 issues a read request for the desired texel data to level 2 cache system 130. In that case, the requested data is copied from level 2 cache system 130 to level 1 cache system 120 so that shader pipe texture filter 110 can analyze and process it.
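The miss-and-fill behavior between the two cache levels can be sketched in a few lines of Python. The class names `Level1Cache` and `Level2Cache`, the dictionary-based storage, and the `reads` counter are hypothetical; the sketch ignores capacity, eviction, and bus timing, none of which the text describes.

```python
class Level2Cache:
    """Centralized cache holding texel data (modeled as a dict)."""

    def __init__(self, backing):
        self.data = dict(backing)
        self.reads = 0  # counts read requests arriving from level 1

    def read(self, address):
        self.reads += 1
        return self.data[address]


class Level1Cache:
    """Per-filter cache that fills itself from level 2 on a miss."""

    def __init__(self, level2):
        self.level2 = level2
        self.lines = {}

    def read(self, address):
        if address not in self.lines:
            # Miss: issue a read request to level 2 and keep a copy.
            self.lines[address] = self.level2.read(address)
        return self.lines[address]
```

In this model, the first read of an address reaches level 2 once, and later reads of the same address are served from level 1.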

FIG. 2 is a diagram of multiple shader pipe texture filters with level 1 and level 2 cache systems, according to an embodiment of the invention. System 200 includes one or more shader pipe texture filters, shown here as shader pipe texture filter 1 through shader pipe texture filter N and denoted 110-1 through 110-N, where N is a positive integer greater than 1. System 200 also includes a level 1 cache system associated with each shader pipe texture filter, shown here as L1-1 cache system through L1-N cache system and denoted 120-1 through 120-N, where N is a positive integer greater than 1. Also included is a wide channel memory bus 125 that links level 1 cache systems 120-1 through 120-N to a set of level 2 cache systems. In this embodiment, there are one or more level 2 cache systems, shown here as L2-1 cache system through L2-M cache system and denoted 130-1 through 130-M, where M is an integer greater than 1 that need not equal the number N of level 1 cache systems.

In the embodiment of FIG. 2, each of shader pipe texture filters 110-1 through 110-N uses bilinear filtering to determine the color of a particular pixel. As described above, each shader pipe texture filter 110-1 through 110-N must analyze the texel data for the four pixels closest to the pixel in question. The texel data in question for each shader pipe texture filter is therefore retrieved from its associated level 1 cache system. Accordingly, shader pipe texture filter 110-1 issues its requests for texel data to L1-1 cache system 120-1. The remaining shader pipe texture filters issue their texel data requests in the same way.

However, if the desired texel data for any particular shader pipe texture filter is not present in its associated level 1 cache system 120, that level 1 cache system can issue a read request for the desired texel data to a level 2 cache system 130. Because the embodiment of FIG. 2 has multiple level 2 cache systems, one or more of them can respond to a level 1 cache system's request for texel data. In other embodiments, however, there may be a single level 2 cache system serving multiple shader pipe texture filters and their associated level 1 cache systems.
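The arrangement of N level 1 caches in front of M level 2 caches can be modeled as follows. This is only a sketch: the text does not say how a level 1 cache chooses which level 2 cache to ask, so the modulo-based selection in `_select_l2` is an assumption, as are the names `L1Cache`, `L2Cache`, and `build_system`.

```python
class L2Cache:
    """One of M centralized level 2 caches (modeled as a dict)."""

    def __init__(self):
        self.data = {}

    def read(self, address):
        return self.data.get(address)


class L1Cache:
    """Level 1 cache that can reach every level 2 cache."""

    def __init__(self, l2_caches):
        self.l2_caches = l2_caches
        self.lines = {}

    def _select_l2(self, address):
        # Assumed policy: interleave integer addresses across the L2 caches.
        return self.l2_caches[address % len(self.l2_caches)]

    def read(self, address):
        if address not in self.lines:
            self.lines[address] = self._select_l2(address).read(address)
        return self.lines[address]


def build_system(n_l1, m_l2):
    """Create N level 1 caches that all share the same M level 2 caches."""
    l2_caches = [L2Cache() for _ in range(m_l2)]
    l1_caches = [L1Cache(l2_caches) for _ in range(n_l1)]
    return l1_caches, l2_caches
```

Because every level 1 cache holds a reference to the same list of level 2 caches, N and M can be chosen independently, which matches the statement that M need not equal N.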

In another embodiment relating to FIG. 2, one or more level 1 cache systems can allocate a region of memory defined to be shared among other resources, including other level 1 cache systems. In certain instances, this approach allows faster fetch times for texel data when the requested data has already been moved from the level 2 cache system to a level 1 cache system.
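One way to picture the shared region is a level 1 cache that consults its peers' shared regions before going to level 2. The sketch below is an interpretation, not the patented mechanism: the class `SharedL1`, the `share` flag, the `peers` list, and the lookup order are all hypothetical.

```python
class SharedL1:
    """Level 1 cache with a private area and a region shared with peers."""

    def __init__(self, l2_data):
        self.l2_data = l2_data  # stand-in for the level 2 cache contents
        self.private = {}
        self.shared = {}        # region allocated for sharing with other resources
        self.peers = []         # other level 1 caches that may be consulted
        self.l2_reads = 0

    def read(self, address, share=False):
        if address in self.private:
            return self.private[address]
        if address in self.shared:
            return self.shared[address]
        # Check whether a peer already pulled this data out of level 2.
        for peer in self.peers:
            if address in peer.shared:
                return peer.shared[address]
        # Fall back to level 2 and store the data in the chosen area.
        self.l2_reads += 1
        value = self.l2_data[address]
        (self.shared if share else self.private)[address] = value
        return value
```

In this model, once one cache has fetched a texel into its shared region, a peer can read it without issuing its own level 2 request, which is the faster-fetch case the paragraph describes.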

FIG. 3 is a diagram of multiple shader pipe texture filters with dual-port level 1 caches and multiple level 2 cache systems, according to an embodiment of the invention. System 300 includes one or more dual-port level 1 cache systems, each supporting up to two shader pipe texture filters, and level 2 cache systems. In this embodiment, each level 1 cache system serves the texel data requests of up to two shader pipe texture filters. There are one or more level 1 cache systems L1-1 through L1-N, denoted 320-1 through 320-N, where N is a positive integer. Each level 1 cache supports two shader pipe texture filters, shown as shader pipe texture filter A 310 and shader pipe texture filter B 312. Level 1 caches 320-1 through 320-N also have access, over wide channel memory bus 125, to the level 2 cache systems denoted L2-1 through L2-M, where M is a positive integer.
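The dual-port arrangement can be sketched as a cache object that accepts at most two attached filters, both of which see the same stored lines. The class name `DualPortL1`, the port-index scheme, and the error handling are assumptions made for illustration.

```python
class DualPortL1:
    """Level 1 cache serving up to two shader pipe texture filters."""

    MAX_PORTS = 2

    def __init__(self):
        self.lines = {}
        self.ports = []  # names of the attached filters, at most two

    def attach(self, filter_name):
        """Attach a filter and return the port index it should use."""
        if len(self.ports) >= self.MAX_PORTS:
            raise ValueError("dual-port cache already serves two filters")
        self.ports.append(filter_name)
        return len(self.ports) - 1

    def _check(self, port):
        if not 0 <= port < len(self.ports):
            raise IndexError("no filter attached to this port")

    def write(self, port, address, value):
        self._check(port)
        self.lines[address] = value

    def read(self, port, address):
        self._check(port)
        return self.lines.get(address)
```

Data written through one port is visible through the other, so filters A and B share a single copy of any texel held in their common level 1 cache.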

In another embodiment relating to FIG. 3, one or more level 1 cache systems can allocate a region of memory defined to be shared among other resources, including other level 1 cache systems. In certain instances, this approach allows faster fetch times for texel data when the requested data has already been moved from the level 2 cache system to a level 1 cache system.

FIG. 4 is a flowchart of a method 400 by which a shader pipe texture filter uses a level 1 cache system as its primary means of storage, with the ability to access a level 2 cache as needed. Method 400 begins at step 402. In step 404, each level 1 cache system can allocate a memory region defined to be sharable among other resources. In step 406, the shader pipe texture filter issues a read or write instruction to its associated level 1 cache system. In step 408, the level 1 cache system retrieves or writes texel data as needed. In step 410, each level 1 cache system can issue read and write requests to the level 2 cache system. Method 400 ends at step 412.

The functions, processes, systems, and methods outlined in FIGS. 1, 2, 3, and 4 can be implemented in software, firmware, hardware, or any combination of these. If programmable logic is used, the logic can execute on a commercially available processing platform or on a special-purpose device.

As will be apparent to persons skilled in the relevant art based on the description herein, embodiments of the invention can be designed in software using a hardware description language (HDL) such as Verilog or VHDL. An HDL design can model the behavior of an electronic system, and the design can be synthesized and ultimately fabricated into a hardware device. The HDL design can also be stored in a computer product and loaded into a computer system prior to hardware manufacture.

It should be understood that the Detailed Description section, and not the Summary and Abstract sections, is intended to be used to interpret the claims. The Summary and Abstract sections may set forth one or more, but not all, exemplary embodiments of the invention contemplated by the inventors, and thus are in no way intended to limit the invention or the appended claims.

The invention has been described above with the aid of functional building blocks illustrating the implementation of specified functions and their relationships. The boundaries of these functional building blocks have been arbitrarily defined herein for convenience of description. Alternate boundaries can be defined as long as the specified functions and their relationships are appropriately performed.

The foregoing description of the specific embodiments will so fully reveal the general nature of the invention that others can, by applying knowledge within the skill of the art, readily modify and/or adapt such specific embodiments for various applications without undue experimentation and without departing from the general concept of the invention. Accordingly, such adaptations and modifications are intended to be within the meaning and range of equivalents of the disclosed embodiments, based on the teaching and guidance presented herein. It should be understood that the phraseology and terminology herein are for the purpose of description and not of limitation, and that they are to be interpreted by those skilled in the art in light of the teaching and guidance.

While various embodiments of the invention have been described above, it should be understood that they have been presented by way of example only, and not limitation. It will be apparent to persons skilled in the relevant art that various changes in form and detail can be made without departing from the spirit and scope of the invention. Thus, the breadth and scope of the invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

Claims

A method for caching shader filter data, comprising:
Receiving read and write instructions from a shader pipe filter for access to a plurality of level 1 cache systems;
Executing the instructions from the shader pipe filter;
Allocating a defined memory area within each level 1 cache system to be shared among the plurality of level 1 cache systems; and
Managing read and write requests from the plurality of level 1 cache systems to a level 2 cache system.

The method of claim 1, wherein the receiving comprises receiving from a plurality of shader pipe filters.

The method of claim 1, wherein the shader pipe filter accesses data in the shared level 1 cache system.

The method of claim 1, wherein managing the read and write requests from the plurality of level 1 cache systems to the level 2 cache system comprises managing read and write requests from the plurality of level 1 cache systems to a plurality of level 2 cache systems.

The method of claim 1, wherein the method is performed by synthesizing hardware description language instructions.

A system for caching shader filter data, comprising:
A processor; and
A memory in communication with the processor,
Wherein the memory is configured to store a plurality of processing instructions for directing the processor to:
Receive read and write instructions from a shader pipe filter for access to a plurality of level 1 cache systems;
Execute the instructions from the shader pipe filter;
Allocate a defined memory area within each level 1 cache system to be shared among the plurality of level 1 cache systems; and
Manage read and write requests from the plurality of level 1 cache systems to a level 2 cache system.

The system of claim 6, further comprising instructions configured to cause the processor to manage instructions from a plurality of shader pipe filters.

The system of claim 6, further comprising instructions configured to cause the processor to manage the plurality of level 1 cache systems.

The system of claim 6, further comprising instructions configured to cause the processor to manage a plurality of level 2 cache systems.

A system for caching shader filter data, comprising:
Means for receiving read and write instructions from a shader pipe filter for access to a plurality of level 1 cache systems;
Means for executing the instructions from the shader pipe filter;
Means for allocating a defined memory area within each level 1 cache system to be shared among the plurality of level 1 cache systems; and
Means for managing read and write requests from the plurality of level 1 cache systems to a level 2 cache system.

The system of claim 10, further comprising means for managing instructions from a plurality of shader pipe filters.

The system of claim 10, further comprising means for managing the plurality of level 1 cache systems.

The system of claim 10, further comprising means for managing a plurality of level 2 cache systems.

A computer-readable storage medium carrying one or more sequences of one or more instructions for execution by one or more processor-based computer systems, the instructions causing the computer systems to perform a method of caching shader filter data, the method comprising:
Receiving read and write instructions from a shader pipe filter for access to a plurality of level 1 cache systems;
Executing the instructions from the shader pipe filter;
Allocating a defined memory area within each level 1 cache system to be shared among the plurality of level 1 cache systems; and
Managing read and write requests from the plurality of level 1 cache systems to a level 2 cache system.

The computer-readable storage medium of claim 14, wherein the method further comprises managing commands from a plurality of shader pipe filters.

The computer readable storage medium of claim 14, wherein the method further comprises managing the plurality of level 1 cache systems.

The computer-readable storage medium of claim 14, wherein the method further comprises managing a plurality of level 2 cache systems.