JP5835879B2 - Method and apparatus for controlling reading of an array of data from memory

Info

Publication number: JP5835879B2
Application number: JP2010213508A
Authority: JP
Inventors: Daren Croxford; Lars Ericsson; Jon Erik Oterhals
Original assignee: ARM Limited
Priority date: 2009-09-25
Filing date: 2010-09-24
Publication date: 2015-12-24
Anticipated expiration: 2030-09-24
Also published as: CN102033728B; CN102033809A; GB2474115B; CN102033728A; GB2474114A; GB2474115A; JP2011070671A; GB2474114B; GB201016162D0; CN102033809B; JP5751782B2; JP2011070672A; GB201016165D0

Description

The present invention relates to reading arrays of data from memory for processing. One example is the operation of a display controller when it processes an image from a frame buffer for display.

As is known in the art, many electronic devices and systems need to process arrays of data, such as images. For example, an image to be displayed to a user is usually processed for display by a so-called “display controller” of the display device.

Typically, the display controller reads the output image to be displayed from a so-called “frame buffer” in memory, which stores the image as a data array, and sends the image data to the display as appropriate. In a graphics processing system, for example, the output image is stored in the frame buffer in memory once it is ready for display; the display controller then reads the frame buffer and sends it to the display (which may be, for example, a screen or a printer).

As is known in the art, the frame buffer itself is usually stored in the so-called “main” memory of the system in question, and is therefore external to the display device and the display controller. Reading data from the frame buffer for display can consequently consume relatively significant amounts of power and memory bandwidth. For example, new image frames may need to be read from the frame buffer and displayed at a rate of 30 frames per second or more, and each frame can require a significant amount of data, particularly for high-resolution displays and high-definition (HD) graphics.
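As a rough illustration of the scale involved, the following sketch estimates the read bandwidth when every frame is fetched in full from the frame buffer. The 1920×1080 resolution and 4 bytes per pixel are assumed figures chosen for the example; only the rate of 30 frames per second comes from the description above.

```python
def framebuffer_read_bandwidth(width, height, bytes_per_pixel, frames_per_second):
    """Return the bytes read per second when every frame is fetched in full."""
    return width * height * bytes_per_pixel * frames_per_second

# Assumed example figures: a 1920x1080 frame with 4 bytes per pixel at 30 fps.
bandwidth = framebuffer_read_bandwidth(1920, 1080, 4, 30)
print(f"{bandwidth / 1e6:.1f} MB/s")
```

Under these assumptions the display controller reads roughly 249 MB every second, which is the kind of traffic that avoiding redundant block reads is intended to reduce.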

It is therefore known to be desirable to try to reduce the power consumption of frame buffer operations, and various techniques have been proposed to try to achieve this.

These techniques include providing an on-chip (as opposed to external) frame buffer, frame buffer caching (buffering), frame buffer compression, and dynamic color depth control. However, each of these techniques has its own drawbacks and disadvantages.

For example, an on-chip frame buffer may require a large amount of on-chip resources, particularly for high-resolution displays. Frame buffer caching or buffering may not be practical, because frame generation is typically asynchronous to frame buffer display. Frame buffer compression can help, but the necessary logic is relatively complex and the frame buffer format is altered. Lossy frame buffer compression reduces image quality. Dynamic color depth control is likewise a lossy scheme and therefore also reduces image quality.

Other arrangements in which a data array may need to be read from memory for processing include, for example, the situation where a CPU needs to read an image generated by a graphics processor and modify it, and the case where a graphics processor needs to read an externally generated texture for later use in graphics processing. These arrangements can likewise consume relatively large amounts of memory bandwidth and power when reading the stored data array for processing.

The applicants therefore believe that there remains scope for improvement to operations that read data arrays, such as frame buffer read operations.

According to a first aspect of the present invention, there is provided a method of processing an array of data in which a processing device processes the array of data by processing successive blocks of data, each representing a particular region of the array of data, and in which blocks of data representing particular regions of the array of data are read from a first memory in which the array of data is stored and are stored in a memory of the processing device before being processed by the processing device, the method comprising:
determining whether a block of data to be processed for the data array is similar to a block of data that is already stored in the memory of the processing device and, on the basis of the similarity determination, processing either the block of data already stored in the memory of the processing device or a new block of data from the array of data stored in the first memory as the block of data to be processed.

According to a second aspect of the present invention, there is provided a system comprising:
a first memory for storing an array of data to be processed;
a processing device having a local memory, which processes an array of data stored in the first memory by processing successive blocks of data, each representing a particular region of the array of data;
a read controller configured to read blocks of data representing particular regions of the array of data stored in the first memory and to store the blocks of data in the local memory of the processing device before they are processed by the processing device; and
a controller configured to determine whether a block of data to be processed for the data array is similar to a block of data already stored in the memory of the processing device and, on the basis of the similarity determination, to cause the processing device to process either the block of data already stored in the memory of the processing device or a new block of data from the array of data stored in the first memory as the block of data to be processed.

According to a third aspect of the present invention, there is provided a processing device for processing an array of data stored in a first memory, the processing device being configured to process the array of data by processing successive blocks of data, each representing a particular region of the array of data, and comprising:
a local memory;
a read controller configured to read blocks of data representing particular regions of the array of data stored in the first memory and to store the blocks of data in the local memory of the processing device before they are processed by the processing device; and
a controller configured to determine whether a block of data to be processed for the data array is similar to a block of data already stored in the memory of the processing device and, on the basis of the similarity determination, to cause the processing device to process either the block of data already stored in the memory of the processing device or a new block of data from the array of data stored in the first memory as the block of data to be processed.

The present invention relates to, and is implemented in, systems in which an array of data to be processed (which may be, and in a preferred embodiment is, a frame to be displayed) is read from memory for processing by a processing device (which may be, and in a preferred embodiment is, a display controller) in the form of blocks of data, each representing a particular region of the array of data.

In essence, therefore, the present invention relates to, and is intended to be implemented in, systems in which the data array to be processed is read from memory and processed block by block, rather than directly as a single, whole output “array”.

As discussed above, this may be the case, for example, when displaying images generated by a tile-based graphics processing system. In that case the display controller may process each frame for display from the frame buffer tile by tile (although, as discussed further below, this is not essential and indeed may not always be preferred).

As is known in the art, in tile-based rendering the two-dimensional output array or frame of the rendering process (the “render target”), which typically will be displayed to show the scene being rendered, is subdivided or partitioned into a plurality of smaller regions, usually referred to as “tiles”, for the rendering process. The tiles (sub-regions) are each rendered separately (typically one after another). The rendered tiles (sub-regions) are then recombined to provide the complete output array (frame) (render target), for example for display.

Other terms commonly used for “tiling” and “tile-based” rendering include “chunking” (the sub-regions are referred to as “chunks”) and “bucket” rendering. The terms “tile” and “tiling” are used herein for convenience, but it should be understood that they are intended to encompass all alternative and equivalent terms and techniques.

In the present invention, rather than each data block (e.g. rendered tile) simply being read out from the memory in which the data array is stored and processed in turn, when a data block is to be processed (e.g. for display), it is first determined whether that block is similar to a data block (e.g. tile) already stored in the (local) memory of the processing device (e.g. display controller) that is to process the data array. On the basis of the similarity determination, it is then decided whether to process the existing data block in the local memory or a new data block from the data array stored in memory as the data block to be processed.

As discussed in more detail below, the applicants have found and recognized that this process can very significantly reduce the number of data blocks (e.g. rendered tiles) read from main memory (e.g. the frame buffer) for processing in use, thereby significantly reducing the number of main memory (e.g. frame buffer) read transactions and hence the power and memory bandwidth consumption associated with main memory (e.g. frame buffer) read operations. It can also correspondingly facilitate the use of lower-performance, lower-power memory systems, which may be particularly advantageous in the context of, for example, low-power, low-cost portable devices.

For example, if the data block to be processed is found to be the same as a data block (e.g. rendered tile) already present in the local memory of the processing device, it can be (and preferably is) determined that the data block need not be read from the stored data array into the local memory of the processing device, thereby eliminating that read “transaction”. Thus, if the data block to be processed is determined to be similar to a data block already stored in the local memory of the processing device, the (appropriate) existing block in the local memory is preferably processed by the processing device, and vice versa (that is, if no similar block is found, a new block is read and processed).

Moreover, the applicants have recognized that, in the case of graphics processing for example, it may be relatively common for a new data block (e.g. rendered tile) to be processed to be the same as, or similar to, a data block (e.g. rendered tile) that is already in, for example, the local memory of the display controller. In graphics processing, for instance, there are often regions of an image that are similar to one another, such as sky, sea, or other uniform backgrounds, and large parts of the user interface of many applications. By facilitating the identification of such regions (e.g. tiles) and then, if desired, avoiding reading them again into the local memory of the display controller, significant savings in read traffic (read transactions) into, for example, the display controller's local memory can be achieved.

Thus, by effectively facilitating the identification and elimination of unnecessary memory (e.g. frame buffer) read transactions, the present invention can significantly reduce the power consumption and memory bandwidth used for frame buffer and other memory read operations.

Furthermore, compared with the prior-art schemes discussed above, the present invention requires relatively little on-chip hardware, can be a lossless process, and does not change the data array (e.g. frame buffer) format. It can also readily be used in conjunction with, and as a complement to, existing output (e.g. frame buffer) power reduction schemes, thereby facilitating further power savings if desired.

As discussed further below, the present invention can also be used to avoid writing data blocks to the data array in the first place. Eliminating write transactions in this way can further reduce the power and memory bandwidth consumed by memory (e.g. frame buffer) transactions (although, because a data array may be read more often than it is written (updated), eliminating read transactions will generally be the more beneficial).

As discussed above, in a particularly preferred embodiment the processing device decides, on the basis of the similarity determination, whether to read a new data block from the data array in main memory into the local memory of the processing device.

Thus, in a particularly preferred embodiment, if it is determined that the (e.g. next) block of data to be processed is to be considered similar to a block of data already stored in the local memory of the processing device, a new block of data is not read from the data array in main memory and stored in the local memory of the processing device; instead, the existing block of data in the local memory is processed as the (e.g. next) block of data to be processed by the processing device.

On the other hand, if it is determined that the (e.g. next) block of data to be processed is not to be considered similar to a block of data already stored in the local memory of the processing device, a new block of data is read from the data array in main memory, stored in the local memory of the processing device, and then processed as the (e.g. next) block of data to be processed by the processing device.
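The reuse-or-fetch decision described above can be sketched as follows. This is a minimal illustration only, assuming a local memory that holds a single block and a caller-supplied similarity test; the names process_blocks, is_similar and labels are hypothetical and do not come from this description.

```python
def process_blocks(main_memory, block_ids, is_similar, process):
    """Process blocks in order, reading from main memory only when needed.

    main_memory: dict mapping block id -> block data (stands in for the first memory)
    is_similar:  callable(new_id, resident_id) -> bool (hypothetical similarity test)
    process:     callable(block_data), applied once per block to be processed
    Returns the number of read transactions made against main_memory.
    """
    resident_id = None    # id of the block currently held in local memory
    resident_data = None  # the local memory itself: a single block buffer
    reads = 0
    for block_id in block_ids:
        if resident_id is None or not is_similar(block_id, resident_id):
            # Not similar: read a new block into local memory.
            resident_data = main_memory[block_id]
            resident_id = block_id
            reads += 1
        # Similar (or freshly read): process whatever local memory holds.
        process(resident_data)
    return reads

# Three identical "sky" blocks followed by one different block.
main_memory = {0: b"\x10" * 4, 1: b"\x10" * 4, 2: b"\x10" * 4, 3: b"\x80" * 4}
labels = {0: "sky", 1: "sky", 2: "sky", 3: "tree"}  # stand-in similarity information
processed = []
reads = process_blocks(main_memory, [0, 1, 2, 3],
                       lambda new, old: labels[new] == labels[old],
                       processed.append)
```

In this example four blocks are processed but only two are read from main memory, because the second and third blocks reuse the block already held locally.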

As discussed further below, the similarity determination is preferably based on similarity information (metadata) associated with the data blocks in question. The generation of such similarity information is another aspect of the present invention, and is discussed in more detail below.

The present invention can be used in any system in which data is stored as an array and is read out and processed by a processing device block by block. Thus it can be used, for example, in graphics processors, CPUs, video processors, composition engines, display controllers, and so on.

In general, the present invention is useful for eliminating read transactions (and write transactions) where nearby data blocks in the data array to be processed are likely to be similar or the same. Thus the scheme can be used, for example, to eliminate read transactions (and write transactions) when image data is transferred between any two of a graphics processor (GPU), a CPU, a video processor, a camera controller, and a display controller.

For example, as well as in the operation of a display controller as discussed above, which would potentially and typically process an image to be displayed in the form of blocks of data, a video processor could generate an image that is to be transferred to a graphics processor for use as a texture, in which case the technique of the present invention could be used to eliminate read transactions when reading the image (texture) for use by the graphics processor. Similarly, a frame generated by a graphics processor may be manipulated by a CPU, in which case operating the CPU in the manner of the present invention can reduce the read transactions the CPU needs in order to read the frame and manipulate it. This has the additional benefit that fewer cache lines may be used by the CPU.

Similarly, a camera (video or still) may process the images generated by its sensor block by block, for example for storage in memory and subsequent provision to a data processing system, such as a computer or display, that processes the image.

The memory in which the array of data is stored can comprise any suitable such memory and can be configured in any suitable and desired manner. For example, it may be memory that is on-chip with the processing device, or it may be external memory. In a preferred embodiment it is external memory, such as the main memory of the system. It may be dedicated memory for this purpose, or part of a memory that is used for other data as well. In the case of a graphics processing system, in a preferred embodiment the memory in which the data array is stored is the frame buffer to which the graphics processing system outputs.

The array of data that is stored in, and read from, the first (e.g. main) memory for processing can be any suitable and desired such array of data. It may comprise, for example, any suitable and desired array of data that can be generated using a graphics processor. In a preferred embodiment it is data representing an image, for example an image to be displayed.

In a particularly preferred embodiment it comprises an output frame for display, but it may also or instead comprise other outputs of a graphics processor, such as a graphics texture (where, for example, the render “target” is a texture that the graphics processor is being used to generate, e.g. in a “render to texture” operation) or another surface to which the output of the graphics processing system is to be written. It may also comprise, for example, an image generated by a video processor or a CPU, as discussed above.

The processing device can be any device that reads and processes (block by block) the data array, for example to use it or to modify its contents. Thus it may be, and in a preferred embodiment is, one of a display controller, a CPU, a video processor, and a graphics processor.

The local memory of the processing device may similarly be any suitable such memory. It is preferably a buffer or cache memory of, or associated with, the processing device. A cache may be, for example, fully associative or set associative.

As discussed above, in a particularly preferred embodiment the present invention is implemented in respect of data arrays generated by a graphics processing system (graphics processor), in which case the data array to be processed is preferably an output frame to be displayed and the first, main memory in which the data array is stored is preferably the frame buffer of the graphics processing system. Similarly, the processing device that is to process the data array so that the output frame is displayed is preferably a display controller of, or for, a display device (e.g. a screen or printer). It may also be, for example, a CPU that is to manipulate a frame generated by the graphics processor, as discussed above.

The blocks of data that are processed (and compared) can each represent any suitable and desired region (area) of the overall array of data. The array may be subdivided into blocks of data as desired, so long as the overall array is divided or partitioned into a plurality of identifiable smaller regions, each representing a part of the overall array, that can be represented as blocks of data which can be identified and considered accordingly.

Each block of data preferably represents a different part (sub-region) of the overall data array (although the blocks could overlap if desired). Each block should represent an appropriate portion (area) of the data array, such as a plurality of data positions within the array. Suitable data block sizes are 8×8, 16×16, or 32×32 data positions in the data array.

In a particularly preferred embodiment, the array of data is divided into regularly sized and shaped regions (blocks of data), preferably squares or rectangles. However, this is not essential, and other arrangements can be used if desired.
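By way of illustration only, the subdivision of an array into regular square blocks might be sketched as below. The function name split_into_blocks and the list-of-rows representation are assumptions made for the example; the description does not prescribe any particular representation.

```python
def split_into_blocks(array, block_size):
    """Yield ((block_row, block_col), block) for each square tile of the array.

    array is a list of equal-length rows. Tiles on the right and bottom edges
    are smaller when the dimensions are not multiples of block_size.
    """
    height = len(array)
    width = len(array[0]) if height else 0
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            block = [row[left:left + block_size]
                     for row in array[top:top + block_size]]
            yield (top // block_size, left // block_size), block

# A 32-wide, 16-high array of distinct values, divided into 8x8 blocks.
frame = [[x + 32 * y for x in range(32)] for y in range(16)]
blocks = dict(split_into_blocks(frame, 8))
```

Here the array yields eight blocks, indexed by block row and block column, each covering 8×8 data positions.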

The similarity determination, and the consequent decision to process either a block of data already stored in the memory of the processing device or a new block of data from the array of data stored in the first memory, can be carried out in any suitable and desired manner, and at any suitable and desired point and time, as the data array is processed.

For example, the similarity determination and the consequent data block selection could be (and in one embodiment are) performed for each data block as that data block falls to be processed. In this case, for example, it is determined whether the next block of data to be processed after the block currently being processed is similar to a block of data already stored in the memory of the processing device, and a new or existing block of data is then processed as that next block accordingly.

In a particularly preferred embodiment, however, the similarity determination and the consequent data block selection are performed in advance of the data blocks actually being processed. In this case the similarity determination is used, for example, to control the effective “prefetching” of data blocks into the local memory of the processing device, ahead of those data blocks subsequently being retrieved from the local memory and processed. This arrangement is suitable where, for example, the processing device (e.g. display controller) operates by placing data blocks to be processed in a queue in its local memory and then processing those blocks for display from the queue one by one. In such an arrangement, the similarity determination can be used to control the fetching of data blocks into the queue in local memory (that is, in effect, whether to repeat a data block that is already in the queue or to fetch a new data block from the stored data array into the queue).
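The queue-based arrangement just described can be sketched as follows. This is an illustrative simplification: the similar_to mapping is a hypothetical, precomputed form of similarity information, and the local memory is treated as unbounded, whereas a real local memory would hold only a limited number of blocks.

```python
from collections import deque

def fill_queue(block_ids, similar_to, fetch):
    """Build the processing queue, repeating already-fetched blocks where possible.

    similar_to: dict mapping a block id to the id of an earlier block that it
                may reuse (hypothetical similarity information)
    fetch:      callable(block_id) -> data, one read from the stored data array
    """
    queue = deque()
    fetched = {}  # blocks already in local memory (unbounded here for simplicity)
    for block_id in block_ids:
        source = similar_to.get(block_id)
        if source in fetched:
            queue.append(fetched[source])  # repeat a queued block, no read
        else:
            fetched[block_id] = fetch(block_id)  # fetch a new block into the queue
            queue.append(fetched[block_id])
    return queue

stored_array = {0: "A", 1: "A", 2: "B", 3: "B"}
read_log = []

def fetch(block_id):
    read_log.append(block_id)  # record each read from the stored data array
    return stored_array[block_id]

queue = fill_queue([0, 1, 2, 3], {1: 0, 3: 2}, fetch)
```

The queue ends up holding four entries for processing, while only blocks 0 and 2 are actually read from the stored array.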

The determination of whether a new data block to be processed is similar to a block already stored in the local memory of the processing device (e.g. display controller) can be carried out in any suitable and desired manner. For example, the new data block to be read from the stored data array could be compared with one or more blocks already stored in the local memory to determine whether the blocks are similar. Thus, for example, some or all of the content of the new data block could be compared with some or all of the content of the data block or blocks already stored in the local memory.
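A direct content comparison of this kind might look like the sketch below. The per-value tolerance max_difference is an assumption introduced for the example (a value of zero requires identical content); the description does not specify how similarity of content is to be measured.

```python
def find_similar_block(new_block, local_blocks, max_difference=0):
    """Return the index of the first locally stored block similar to new_block.

    Blocks are flat sequences of integer data values. Two blocks are treated as
    similar when they have the same length and every pair of corresponding
    values differs by at most max_difference. Returns None if none match.
    """
    for index, candidate in enumerate(local_blocks):
        if len(candidate) == len(new_block) and all(
            abs(a - b) <= max_difference for a, b in zip(candidate, new_block)
        ):
            return index
    return None

local_blocks = [[0, 0, 0, 0], [200, 200, 200, 200]]
exact = find_similar_block([0, 0, 0, 0], local_blocks)
strict = find_similar_block([200, 201, 200, 199], local_blocks)
tolerant = find_similar_block([200, 201, 200, 199], local_blocks, max_difference=1)
```

With a zero tolerance the slightly different block finds no match, whereas allowing a difference of one per value lets it reuse the second stored block.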

特に好ましい一実施形態では、データ配列に関連付けられている情報が、所定のブロックが互いに類似していると考えるべきかどうかを判定するために使用される。したがって、特に好ましい一実施形態では、データブロックそれ自体の内容を比較するのではなく、類似度判定プロセスが、データの配列に関連付けられている情報を使用して処理すべきデータブロックがローカルメモリ内にすでに格納されているブロックに類似しているかどうかを判定する。 In one particularly preferred embodiment, information associated with the data array is used to determine whether given blocks should be considered similar to each other. Thus, in a particularly preferred embodiment, rather than comparing the content of the data blocks themselves, the similarity determination process uses information associated with the array of data to determine whether the data block to be processed is similar to a block already stored in the local memory.

言い換えると、類似度判定プロセスは、好ましくは、データ配列に関連付けられている「メタデータ」(情報)を使用して、処理すべきデータブロックが処理デバイスのローカルメモリ内にすでにあるブロックに類似しているかどうかを判定する。以下でさらに説明されるように、この目的のためにデータ配列に関連付けられているメタデータを使用することで、処理デバイスにかかる負担が軽減され、使用中の読み込みトランザクションの回数を減らすのに特に有効なメカニズムが実現されうる。 In other words, the similarity determination process preferably uses “metadata” (information) associated with the data array to determine whether the data block to be processed is similar to a block already in the local memory of the processing device. As described further below, using metadata associated with the data array for this purpose reduces the burden on the processing device and can provide a particularly effective mechanism for reducing the number of read transactions in use.

データブロックが類似していると考えるべきかどうかを判定するために処理デバイスによって使用されうる好適な任意の形式のメタデータ(情報)を使用することができる(またデータの格納されている配列に適宜関連付けることができる)。 Any suitable form of metadata (information) that the processing device can use to determine whether data blocks should be considered similar can be used (and can be associated with the stored array of data as appropriate).

例えば、メタデータは、処理デバイスそれ自体がデータブロックが互いに類似していると考えるべきかどうかを評価するために利用できる情報を含むことが可能であり、また好ましい一実施形態では、そのような情報を含む。 For example, the metadata can include, and in a preferred embodiment does include, information that the processing device itself can use to assess whether data blocks should be considered similar to each other.

このような好ましい一実施形態では、データの配列に関連付けられ、データのブロックが類似しているかどうかを判定するために使用すべき情報(メタデータ)は、注目するデータブロックの内容を表す、および/または注目するデータブロックの内容から導出される情報を含む。この場合、次いで、類似度判定プロセスは、好ましくは、新しいデータブロックの内容を表し、および/または新しいデータブロックの内容から導出される情報を、ローカルメモリ内にすでに格納されているデータブロックの内容を表し、および/またはローカルメモリ内にすでに格納されているデータブロックの内容から導出される情報と比較することによって、各データブロックが類似しているかどうかを判定する。 In one such preferred embodiment, the information (metadata) that is associated with the array of data and is to be used to determine whether blocks of data are similar comprises information that represents, and/or is derived from, the content of the data block in question. In this case, the similarity determination process then preferably determines whether the respective data blocks are similar by comparing information representing, and/or derived from, the content of the new data block with information representing, and/or derived from, the content of a data block already stored in the local memory.

これらの配置構成におけるそれぞれのデータブロックの内容を表す情報は、好適な任意の形式のものとしてよいが、好ましくは、データブロック上の内容に基づくか、またはデータブロック上の内容から導出される。最も好ましくは、これは、データブロックの内容から生成されるか、またはデータブロックの内容に基づくデータブロックに対する「署名」の形をとる。このようなデータブロックの内容の「署名」は、例えば、また好ましくは、データブロックから導出される(データブロックに関して生成される)チェックサム、CRC、またはハッシュ値などの、データブロックの内容を表すと考えることができる好適な任意の一組の導出情報を含むことができる。好適な署名としては、CRC32などの標準的なCRC、またはMD5、SHA-1などの他の形式の署名が挙げられる。 The information representing the content of each data block in these arrangements may take any suitable form, but is preferably based on, or derived from, the content of the data block. Most preferably, it takes the form of a “signature” for the data block that is generated from, or based on, the content of the data block. Such a data block content “signature” can comprise, for example and preferably, any suitable set of derived information that can be considered to represent the content of the data block, such as a checksum, a CRC, or a hash value derived from (generated for) the data block. Suitable signatures include standard CRCs, such as CRC32, or other forms of signature such as MD5 and SHA-1.

したがって、特に好ましい一実施形態では、データブロックの内容を示すか、または表す、および/またはデータブロックの内容から導出される署名は、比較されるべき各データブロック毎に生成され、類似度判定プロセスが、各データブロックの署名を比較して、それらのブロックが類似しているかどうかを判定する。 Thus, in a particularly preferred embodiment, a signature that indicates or represents, and/or is derived from, the content of the data block is generated for each data block to be compared, and the similarity determination process compares the signatures of the respective data blocks to determine whether those blocks are similar.

例えば、1つの、例えばRGBA、データブロック(例えば、レンダリング済みタイル)に対する単一の署名を生成することが可能であるか、またはそれぞれの色平面に対し別の署名(例えば、CRC)を生成することが可能である。同様に、色変換を実行し、別の署名を必要に応じてY、U、V平面に対し生成することが可能である。 For example, a single signature can be generated for a whole (e.g., RGBA) data block (e.g., rendered tile), or a separate signature (e.g., CRC) can be generated for each color plane. Similarly, color conversion can be performed and separate signatures generated for the Y, U, and V planes if desired.
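As an illustration of the signature options above, the following sketch computes either one CRC32 for a whole block or one CRC32 per color plane. It assumes, purely for the example, that a block is held as interleaved 8-bit RGBA bytes; the function names are hypothetical. `zlib.crc32` is the standard-library CRC32.

```python
import zlib

def block_signature(block: bytes) -> int:
    """Single CRC32 signature over the whole data block."""
    return zlib.crc32(block) & 0xFFFFFFFF

def plane_signatures(block: bytes, channels: int = 4) -> tuple:
    """One CRC32 per color plane, assuming interleaved samples (e.g. RGBA)."""
    return tuple(zlib.crc32(block[c::channels]) & 0xFFFFFFFF
                 for c in range(channels))

def signatures_match(sig_new, sig_stored) -> bool:
    """Blocks are treated as similar when their signatures are equal."""
    return sig_new == sig_stored
```

Per-plane signatures cost more storage but allow, for instance, a match to be detected on some planes and not others.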

当業者であれば理解するように、データブロックに対し生成される署名が長ければ長いほど(署名がデータブロックをより正確に表すほど)、署名間の誤った「一致」(したがって、例えば、新しいデータブロックを読み込まないという誤り)が生じる可能性が低くなる。そこで、一般的に所望される、必要な精度に応じて(および例えば署名生成および処理に必要なメモリおよび処理リソースに関するトレードオフとして)、使用する署名(例えば、CRC)を長くも短くもすることが可能である。 As those skilled in the art will appreciate, the longer the signature generated for a data block (the more accurately the signature represents the data block), the less likely a false “match” between signatures (and thus, for example, an erroneous failure to read a new data block) becomes. Thus, in general, a longer or shorter signature (e.g., CRC) can be used depending on the accuracy desired or required (and, for example, as a trade-off against the memory and processing resources required for signature generation and processing).
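As a rough guide to this trade-off, and assuming (an idealization that the text does not state) that an n-bit signature behaves like a uniformly distributed random value for blocks whose contents differ, the likelihood of a false match can be estimated as:

```latex
P(\text{false match}) \approx 2^{-n},
\qquad
E[\text{false matches per array}] \approx N \cdot 2^{-n}
```

Here N is the number of signature comparisons made between differing blocks for one data array. Under this assumption a 32-bit signature gives roughly 2.3 × 10^-10 per comparison, and each bit removed from the signature doubles that figure.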

署名は、データブロックの内容の特定の1つまたは複数の態様に向けて重み付けすることも可能であり、これにより、例えば、署名の所定の全長が出力全体に対してより効果的なデータブロック内容(データ)の部分に対し署名の重み付けをすることによって全体的によい結果をもたらすようにできる(例えば、データ配列が表す画像の観察者が知覚するように)。 The signature can also be weighted towards one or more particular aspects of the content of the data block, so that, for example, a given overall signature length gives better overall results (e.g., as perceived by a viewer of the image that the data array represents) by weighting the signature towards those parts of the data block content (data) that have more effect on the overall output.

例えば、アプリケーション、例えば、ディスプレイ、の要件に応じて、異なるアプリケーションなどに対する異なる長さの署名を使用することも可能である。これは、消費電力を低減するのにさらに役立ちうる。したがって、好ましい一実施形態では、使用される署名の長さは、使用中に変更することができる。好ましくは、署名の長さは、使用中のアプリケーションに応じて変更することができる(使用しているアプリケーションに応じて適応チューニングすることができる)。 For example, different length signatures for different applications, etc. may be used depending on the requirements of the application, eg, display. This can further help reduce power consumption. Thus, in a preferred embodiment, the signature length used can be changed during use. Preferably, the signature length can be changed according to the application in use (adapted tuning can be made according to the application being used).

これらの実施形態の特に好ましい配置構成では、データブロック署名は、作成時に「ソルト」が付けられている(つまり、他の数値(ソルト値)が生成された署名値に付加されている)。ソルト値は、都合のよいことに、例えば、起動以降のデータ配列(例えば、フレーム)の数値、またはランダム値としてよい。これは、当技術分野で知られているように、署名比較プロセスにおける不正確さによって引き起こされる誤りを非決定論的にするのに役立つ(つまり、例えば、プロセスが映画またはテレビ番組を表示するために使用されている場合など、画像の所定のシーケンスの反復ビューイングについて同じ点で誤りが常に発生するのを回避する)。 In a particularly preferred arrangement of these embodiments, data block signatures are “salt” ed when created (ie, other numerical values (salt values) are appended to the generated signature value). The salt value may conveniently be, for example, a numerical value of a data array (eg, a frame) after activation, or a random value. This is useful for making errors caused by inaccuracies in the signature comparison process non-deterministic as known in the art (i.e., for the process to display a movie or television program, for example). Avoid errors always occurring at the same point for iterative viewing of a given sequence of images, such as when used).

上記の配置構成において、類似度判定プロセスは、2つ(またはそれ以上)のデータブロックに関連付けられているメタデータ(情報)を使用して、処理すべき新しいデータブロックが処理デバイスのローカルメモリ内にすでに格納されているデータブロックに類似しているかどうかを判定する。 In the above arrangements, the similarity determination process uses the metadata (information) associated with two (or more) data blocks to determine whether the new data block to be processed is similar to a data block already stored in the local memory of the processing device.

しかし、他の特に好ましい実施形態では、データ配列に関連付けられているメタデータ(情報)は、データ配列内の所定のデータブロックがデータ配列内の他のブロックに類似しているかどうかを直接示す類似度情報の形をとる。この場合、処理デバイスは、メタデータを単純に読み込んで、新しいデータブロックが処理デバイスのローカルメモリにすでに格納されているデータブロックに類似していると考えられるかどうかを判定することができ、処理デバイス側でメタデータを使用してブロックそれ自体の何らかの形式の類似度評価を実行する必要はない。これにより、データ配列処理オペレーションの実行中に処理デバイスにかかる必要な処理負担が軽減される。 However, in another particularly preferred embodiment, the metadata (information) associated with the data array takes the form of similarity information that directly indicates whether a given data block in the data array is similar to another block in the data array. In this case, the processing device can simply read the metadata to determine whether the new data block is to be considered similar to a data block already stored in its local memory, and does not itself need to use the metadata to perform some form of similarity assessment of the blocks. This reduces the processing burden on the processing device while the data array processing operation is being performed.

したがって、好ましい一実施形態では、第1の(メイン)メモリ内のデータの配列に関連付けられている情報(メタデータ)は、各データブロック(上述のようなデータブロック「署名」などの)間の類似度を評価するために使用することができる情報を含むが、特に好ましい一実施形態では、この情報(メタデータ)は、各データブロックがデータ配列内の他のデータブロックに類似していると考えられるかどうかを(直接)示す情報を含む。 Thus, in a preferred embodiment, the information (metadata) associated with the array of data in the first (main) memory is between each data block (such as the data block “signature” as described above). Includes information that can be used to assess similarity, but in one particularly preferred embodiment, this information (metadata) indicates that each data block is similar to other data blocks in the data array. Contains information that indicates (directly) whether it can be considered.

メタデータは、データブロックがデータ配列内の他のデータブロックに類似していると考えられるかどうかを直接示す場合、メタデータは、それを行うために適している、望ましい形式をとりうる。これは、例えば、階層的四分木を含むことが可能である。好ましい一実施形態では、(2D)ビットマップの形式である。 If the metadata directly indicates whether the data block is considered similar to other data blocks in the data array, the metadata can take any desirable form suitable for doing so. This can include, for example, a hierarchical quadtree. In a preferred embodiment, it is in the form of a (2D) bitmap.

特に好ましいこのような一実施形態において、メタデータ(例えば、ビットマップ)は、データ配列から読み込むべきデータブロックを表し、それぞれのメタデータ(例えば、ビットマップ)エントリは、対応するデータブロックについて、そのデータブロックがデータ配列内の他のデータブロックに類似しているかどうかを示す。最も好ましくは、データ配列内のそれぞれのデータブロック位置は、そのブロックが他のブロックに類似している(かどう)かを示すメタデータエントリに関連付けられている。この場合、類似度判定プロセスは、データブロックが処理デバイスのローカルメモリ内にすでに格納されているデータブロックに類似しているかどうかを判定するために、注目するデータブロック位置に対する関連するメタデータエントリを単純に読み込むだけでよい。 In one such particularly preferred embodiment, the metadata (e.g., bitmap) represents the data blocks to be read from the data array, and each metadata (e.g., bitmap) entry indicates, for its corresponding data block, whether that data block is similar to another data block in the data array. Most preferably, each data block position in the data array is associated with a metadata entry that indicates whether (or not) that block is similar to another block. In this case, to determine whether a data block is similar to a data block already stored in the local memory of the processing device, the similarity determination process need simply read the associated metadata entry for the data block position in question.

したがって、特に好ましい一実施形態では、データ配列にはビットマップなどのメタデータが関連付けられており、これにより、データ配列内の各それぞれのデータブロック毎にそのデータブロックがデータ配列内の他のデータブロックに類似しているかを示し、類似度判定プロセス(処理デバイス)は、処理すべき新しいデータブロックが処理デバイスのローカルメモリ内にすでに格納されているデータブロックに類似しているかを、新しいデータブロックに対する関連するメタデータを使用することで判定する。 Thus, in a particularly preferred embodiment, the data array has associated with it metadata, such as a bitmap, that indicates for each respective data block in the data array whether that data block is similar to another data block in the data array, and the similarity determination process (the processing device) uses the metadata associated with a new data block to determine whether the new data block to be processed is similar to a data block already stored in the local memory of the processing device.

これらの配置構成において、メタデータは、望み通りに構成し配置することができる。例えば、これは、データブロックがデータ配列内の直前のデータブロックに類似しているかどうかを単純に示すことが可能であり、また好ましい一実施形態では、単純に示す。この場合、それぞれのメタデータエントリは、単一ビットのみを含むだけでよく、一方の値(例えば、「1」)は対応するブロックが直前のブロックに類似していることを示し、他方の値(例えば、「0」)は類似していないことを示す。 In these arrangements, the metadata can be configured and arranged as desired. For example, it can simply indicate, and in a preferred embodiment does simply indicate, whether a data block is similar to the immediately preceding data block in the data array. In this case, each metadata entry need contain only a single bit, with one value (e.g., “1”) indicating that the corresponding block is similar to the preceding block and the other value (e.g., “0”) indicating that it is not.
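A single-bit-per-block bitmap of this kind might be packed and read as in the sketch below. The bit ordering (least significant bit first within each byte) and the function names are assumptions made for illustration; the text does not prescribe a layout.

```python
def pack_bitmap(flags) -> bytes:
    """Pack one 'similar to previous block' bit per block, LSB first."""
    out = bytearray((len(flags) + 7) // 8)
    for i, flag in enumerate(flags):
        if flag:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)

def is_similar_to_previous(bitmap: bytes, index: int) -> bool:
    """Read the metadata entry for the block at the given position."""
    return bool((bitmap[index // 8] >> (index % 8)) & 1)
```

The reader only needs one bit lookup per block position to decide whether a read transaction can be skipped.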

これを容易に行えるようにするために、データブロックは、特定の定義済み順序で処理されるべきである(データ配列に書き込む場合とその配列から読み込む場合の両方)。好ましくは、ブロック間の空間的コヒーレンスを利用することができる順序が、使用される。 To make this easy, the data blocks should be processed in a specific predefined order (both when writing to and reading from the data array). Preferably, an order in which spatial coherence between blocks can be utilized is used.

例えば、データブロックが直前のデータブロックに関して考えられるだけでなく、データ配列内の複数のデータブロックに関しても考えられる場合に、より高度なメタデータ配置構成を使用することも可能である。この場合、それぞれの各それぞれのブロック位置に関連付けられているメタデータ(例えば、ビットマップエントリ)は、対応するデータブロックがデータ配列内の他のデータブロックに類似していることだけでなく、データ配列内のどのデータブロックに類似しているかも示すべきである。この場合、それぞれのデータブロック位置に関連付けられているメタデータ(例えば、ビットマップエントリ)は、それぞれのブロック位置について伝達される情報がより多いので単一ビットよりも大きくなる。メタデータエントリの実際のサイズは、例えば、類似度を目的としてそれぞれのデータブロックが比較されるデータ配列内のデータブロックの個数に依存する(その後それぞれのメタデータエントリが表すことができなければならない可能な類似のブロック置換の数を決定するので)。 For example, a more sophisticated metadata arrangement can be used in which a data block is considered not only in relation to the immediately preceding data block but also in relation to plural data blocks in the data array. In this case, the metadata (e.g., bitmap entry) associated with each respective block position should indicate not only that the corresponding data block is similar to another data block in the data array, but also which data block in the data array it is similar to. The metadata (e.g., bitmap entry) associated with each data block position will then be larger than a single bit, because more information is conveyed for each block position. The actual size of the metadata entries depends, for example, on the number of data blocks in the data array with which each data block is compared for similarity purposes (since that determines the number of possible similar-block permutations that each metadata entry must be able to represent).

これらの配置構成において、それぞれの類似度値(メタデータエントリ)は、例えば、注目しているデータブロックがデータ配列内の他のどのデータブロックに類似しているかを示す相対的指標(例えば「001」は現在のデータブロックに相対的に前にあるデータブロックを示す)、または注目してるデータブロックがデータ配列内の他のどのデータブロックに類似しているかを示す絶対的指標(例えば、メタデータ「125」は、ブロックが注目しているデータ配列内の125番目のデータブロックに類似していることを示す)を与えることができる。 In these arrangements, each similarity value (metadata entry) can give, for example, a relative indication of which other data block in the data array the data block in question is similar to (e.g., “001” indicating the data block preceding the current data block), or an absolute indication of which other data block in the data array the data block in question is similar to (e.g., metadata “125” indicating that the block is similar to the 125th data block in the data array in question).
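One possible reading of the relative form, and of how entry size grows with the number of blocks compared, is sketched below. The encoding (entry 0 meaning "no similar block", entry k meaning "similar to the block k positions earlier") is an assumption for the example, as are the function names.

```python
def resolve_relative(current_index: int, entry: int):
    """Map a relative entry to the index of the similar block, or None.

    Assumed encoding: 0 = no similar block; k > 0 = similar to the
    block k positions before the current one.
    """
    if entry == 0:
        return None
    return current_index - entry

def relative_entry_bits(search_depth: int) -> int:
    """Bits needed to name any of search_depth earlier blocks, or 'none'."""
    # search_depth + 1 distinct values must be representable.
    return max(1, search_depth.bit_length())
```

Under this encoding, comparing against only the preceding block needs 1 bit per entry, while comparing against up to seven earlier blocks needs 3 bits.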

メタデータエントリのサイズの選択は、メタデータを準備し格納する際のオーバーヘッドとメタデータがデータ配列内のさらに多くの他のデータブロックへの類似度を示すことができる場合に排除される読み込みトランザクションの回数を表す潜在的にさらに大きな数との間のトレードオフまたは最適化である。したがって、使用するメタデータ配置構成の選択は、例えば、これらの基準に基づいて、また例えば、システムの予期もしくは予想される使用または実装状態に基づいて行うことができる。(なお、ここでは、本発明の実施形態の仕方でメタデータを使用すると、データブロック1つ当たりのメタデータのオーバーヘッドが比較的小さいものとなりうるので、かなり小さなデータブロックサイズを(キャッシュラインのレベルなどにおいて)使用しやすくすることができることに留意されたい。) The choice of metadata entry size is a trade-off or optimization between the overhead of preparing and storing the metadata and the potentially greater number of read transactions that are eliminated when the metadata can indicate similarity to more of the other data blocks in the data array. The metadata arrangement to use can therefore be chosen, for example, on the basis of these criteria and, for example, on the basis of the expected or anticipated use or implementation of the system. (It should be noted here that using metadata in the manner of the embodiments of the present invention can make the metadata overhead per data block relatively small, which in turn makes it easier to use fairly small data block sizes (e.g., at the cache line level).)

これらの配置構成において、それぞれのメタデータエントリとともに、各データブロックがどれだけ類似しているかを示す「ライクネス(likeness)」値を含むことも可能である。次いで、類似度判定プロセスが、例えば、このライクネス値を使用して、データ配列から新しいブロックを読み込むか、または使用中の処理デバイスのローカルメモリ内のすでに存在している類似データブロックを再利用するかを決定することが可能である。例えば、類似度判定プロセスは、ライクネス閾値を設定し、新しいデータブロックに対するライクネス値をその閾値と比較して新しいデータブロックを読み込むか、または読み込まないように、しかるべく設定することが可能である。次いで、これにより、読み込みプロセスを修正し、例えば、使用時に、例えば、使用中にライクネス閾値を変えることによって、データ配列読み込みプロセスの正確さを上げ下げすることができる。 In these arrangements, it is also possible to include a “likeness” value indicating how similar each data block is with each metadata entry. The similarity determination process then uses this likeness value, for example, to read a new block from the data array or to reuse an already existing similar data block in the local memory of the processing device being used. Can be determined. For example, the similarity determination process may set a likeness threshold and compare the likeness value for a new data block with the threshold to read or not read a new data block accordingly. This in turn can modify the read process and increase or decrease the accuracy of the data array read process, for example, by changing the likeness threshold during use, eg, during use.
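The likeness-threshold decision could look like the following sketch. The 0 to 255 likeness scale and the names used are assumptions for illustration only; the text does not define a scale.

```python
def choose_source(likeness: int, threshold: int) -> str:
    """Decide whether to reuse a locally stored block or read a new one.

    likeness: stored per-block value, assumed here to range from 0
    (unrelated) to 255 (identical).
    threshold: minimum likeness at which reuse is accepted.
    """
    return "reuse" if likeness >= threshold else "read"

def count_reads(likeness_values, threshold: int) -> int:
    """Number of read transactions issued for one array at a threshold."""
    return sum(choose_source(v, threshold) == "read" for v in likeness_values)
```

Raising the threshold during use trades extra read transactions for higher accuracy, and lowering it does the opposite.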

他の特に好ましい実施形態では、データ配列に関連付けられているメタデータ(類似度情報)は、それらの相対的類似度に応じて処理デバイスのローカルメモリ内にデータブロックを読み込むように処理デバイスに命令するコマンドリストの形態をとる。例えば、ブロック1を処理デバイスのローカルメモリ内に読み込むならば、次の3つのブロックについてそのブロックを繰り返し、次いで、データ配列からローカルメモリ内へ5番目のデータブロックを読み込み、そのブロックを1回繰り返し、ローカルメモリから1番目のデータブロックを撤去し、データ配列から7番目のブロックを読み込み、データ配列から8番目のブロックを読み込んで処理し、というようにコマンドリストを作成することが可能である。このようなコマンドリストは、直接的に生成することが可能であるか、または例えば、類似度ビットマップを最初に生成し、次いで解析してその後データ配列について格納されるコマンドリストを作成することが可能である。 In another particularly preferred embodiment, the metadata (similarity information) associated with the data array takes the form of a command list that instructs the processing device to read data blocks into its local memory in accordance with their relative similarity. For example, a command list could be prepared along the lines of: read block 1 into the local memory of the processing device; repeat that block for the next three blocks; then read the fifth data block from the data array into the local memory; repeat that block once; evict the first data block from the local memory; read the seventh block from the data array; read and process the eighth block from the data array; and so on. Such a command list can be generated directly, or, for example, a similarity bitmap can first be generated and then parsed to create the command list that is then stored for the data array.
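Deriving a command list from a similarity bitmap might be done along the following lines. The command names `READ` and `REPEAT` and the tuple representation are hypothetical, block indices are zero-based, and eviction commands are left out to keep the sketch short.

```python
def build_command_list(similar_to_prev):
    """Turn per-block 'similar to previous block' flags into commands.

    Returns a list of ("READ", block_index) and ("REPEAT", count) tuples.
    """
    commands = []
    for i, similar in enumerate(similar_to_prev):
        if similar and commands:
            if commands[-1][0] == "REPEAT":
                # Extend the current run of repeats.
                commands[-1] = ("REPEAT", commands[-1][1] + 1)
            else:
                commands.append(("REPEAT", 1))
        else:
            commands.append(("READ", i))
    return commands
```

For flags describing eight blocks where blocks 2 to 4 repeat the first and block 6 repeats the fifth, this yields read, repeat three times, read, repeat once, read, read, which mirrors the sequence given in the text.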

類似度メタデータ(情報)がデータ配列に関連付けられている場合、データ配列に関連付けられるべき必要なメタデータを生成することも必要になる。本発明は、その好ましい実施形態において少なくとも、メタデータの生成に拡大適用される。 When similarity metadata (information) is associated with a data array, it is also necessary to generate necessary metadata to be associated with the data array. The present invention, in its preferred embodiment, extends at least to the generation of metadata.

メタデータを生成して、所望の、好適な仕方でデータ配列に関連付けることができる。好ましくは、データ配列の生成とともに生成される。好ましい一実施形態では、メタデータは、データ配列を生成しているデバイスによって生成される(このデバイスは、上述のように、例えば、グラフィックスプロセッサ、ビデオプロセッサ、カメラコントローラ(カメラのセンサーによって生成されるデータを処理する)、またはCPUとすることができる)。 Metadata can be generated and associated with the data array in any desired and preferred manner. Preferably, it is generated together with the generation of the data array. In a preferred embodiment, the metadata is generated by the device generating the data array (this device is generated by, for example, a graphics processor, video processor, camera controller (camera sensor) as described above. Data can be processed), or it can be a CPU).

メタデータが、それぞれのデータブロックに対する内容の「署名」を含む場合、それらの署名は、データブロックが生成されるときに生成し、次いで、適切な仕方で生成されたデータブロックに関連付けて格納することが可能である。 If the metadata includes a “signature” of content for each data block, those signatures are generated when the data block is generated and then stored in association with the data block generated in an appropriate manner. It is possible.

メタデータが、データブロックが上述の「類似度」ビットマップなどの他のデータブロックと同じであると考えられるかを直接示す場合、データ配列生成プロセスは、好ましくは、データのブロックをそれらが生成されると同時に比較し、それに応じて類似度情報、例えばビットマップを生成することを含む。 If the metadata directly indicates whether the data blocks are considered the same as other data blocks such as the “similarity” bitmap described above, the data array generation process preferably generates the blocks of data As well as making comparisons and generating similarity information accordingly, eg, bitmaps.
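Generating the similarity information while the blocks themselves are being produced could be sketched as below. The function name and the pluggable `similar` predicate are assumptions; the sketch compares each block only with the one generated immediately before it.

```python
def generate_with_similarity(blocks, similar):
    """Store blocks and build a similarity flag list as they arrive.

    blocks: iterable of generated data blocks (e.g. rendered tiles).
    similar(a, b): hypothetical predicate deciding block similarity.
    Returns (stored_blocks, flags), where flags[i] is True when block i
    was judged similar to block i - 1.
    """
    stored, flags = [], []
    previous = None
    for block in blocks:
        flags.append(previous is not None and similar(block, previous))
        stored.append(block)
        previous = block
    return stored, flags
```

Only the previously generated block has to be held on chip for this comparison, so the flags can be produced in the same pass that writes the array.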

この場合、データブロック比較は、例えば、データブロックの内容を表し、および/またはデータブロックの内容から導出される、上述の署名などの情報を、他のデータブロックの内容を表し、および/または他のデータブロックの内容から導出される情報と比較し、データブロックの類似度または他のものを評価することによって、実行することが可能である。 In this case, the data block comparison represents, for example, the content of the data block and / or information such as the signature described above derived from the content of the data block, the content of the other data block, and / or other This can be done by comparing the information derived from the contents of the data block and evaluating the similarity of the data block or others.

しかし、特に好ましい一実施形態では、ブロックの実際の内容(その内容の何らかの表現ではなく)が比較され、これにより、ブロックが類似していると考えられるかどうかを判定する。そうするために、データ配列のデータブロックの内容の一部または全部を、データ配列の他のデータブロック(複数のデータブロック)の内容の一部または全部と比較することができる。データブロックの実際の内容の一部または全部を比較することで、比較プロセスにおいて、複雑さを低減し、誤りを少なくすることができる。 However, in a particularly preferred embodiment, the actual contents of the blocks (rather than some representation of their contents) are compared to determine whether the blocks are considered similar. To do so, part or all of the contents of the data blocks of the data array can be compared with part or all of the contents of other data blocks (multiple data blocks) of the data array. Comparing some or all of the actual contents of the data blocks can reduce complexity and errors in the comparison process.

比較プロセスは、好ましくは、ある種の形の限界基準を使用して、ブロックが他のブロックに類似していると考えるべきかどうかを判定する。例えば、好ましくは、各ブロックの内容の選択された数のビットが一致している場合、ブロックは類似していると考えられる。好ましくは、許容されるブロック間に何らかの最大の視覚的偏差がある(データ配列が画像を表す場合)。 The comparison process preferably uses some form of threshold criterion to determine whether a block should be considered similar to another block. For example, blocks are preferably considered similar if a selected number of bits of the content of each block match. Preferably, some maximum visual deviation between blocks is permitted (where the data array represents an image).

最も好ましくは、ピクセルのLSBの差の量などの最大偏差は、ブロックが類似していないと考えられる前に許容される。好ましくは、この閾値、例えば、最大内容偏差は、使用中に変更する(例えば、プログラムする)ことができる。例えば、静的フレームデータと動的フレームデータの割合に基づき、および/または使用中の電力モード(例えば、低電力モードかどうか)に基づき、アプリケーション毎に設定することが可能である。 Most preferably, maximum deviations, such as the amount of pixel LSB difference, are allowed before the blocks are considered dissimilar. Preferably, this threshold, eg, maximum content deviation, can be changed (eg, programmed) during use. For example, it can be set for each application based on the ratio of static frame data to dynamic frame data and / or based on the power mode in use (eg, whether it is a low power mode).
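A content comparison with a programmable maximum deviation might be written as follows. The sketch assumes blocks of 8-bit samples and treats the deviation as a per-sample absolute difference, which is one possible interpretation of the LSB-difference criterion; the names are illustrative.

```python
def blocks_within_deviation(a: bytes, b: bytes, max_deviation: int) -> bool:
    """True when every sample of a and b differs by at most max_deviation.

    max_deviation = 0 demands identical content; larger values tolerate
    differences confined to the low-order bits of each sample.
    """
    if len(a) != len(b):
        return False
    return all(abs(x - y) <= max_deviation for x, y in zip(a, b))
```

A low power mode could, for example, program a larger `max_deviation` so that more blocks count as similar and fewer are read.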

特に好ましい一実施形態では、考察されるデータのブロックは、それぞれ、処理デバイスのローカルメモリの1つのキャッシュライン、またはデータ配列の2D部分タイルを含む(タイルベースのグラフィックス処理システムの場合などの、配列が別々のタイルから構成される場合)。これらは、データ配列を処理すべき処理デバイスの処理要素によって効率的に操作することができ、またデータ配列を処理すべき処理デバイスによってメモリから効率的にフェッチすることができる格納済みデータのユニットを使用するので、特に効果的な実装である。 In one particularly preferred embodiment, the blocks of data considered each comprise one cache line of the local memory of the processing device, or a 2D sub-tile of the data array (where the array is made up of separate tiles, as in a tile-based graphics processing system). These are particularly effective implementations because they use units of stored data that can be operated on efficiently by the processing elements of the processing device that is to process the data array, and that can be fetched efficiently from memory by that processing device.

グラフィックス処理システムでは、好ましい一実施形態において、それぞれのデータブロックは、グラフィックスプロセッサがそのレンダリング出力として生成するレンダリング済みタイルに対応する。これは、グラフィックスプロセッサがレンダリングタイルを直接生成し、したがって、考察され、比較されるデータブロックを「生成する」ためにさらなる処理を必要としないという点で有益である。 In a graphics processing system, in a preferred embodiment, each data block corresponds to a rendered tile that the graphics processor generates as its rendering output. This is beneficial in that the graphics processor generates the rendering tile directly and therefore does not require further processing to “generate” the data blocks to be considered and compared.

これらの配置構成において、レンダーターゲット(データ配列)は、レンダリングのために、所望の、および好適な任意のサイズおよび形状のいくつかの(レンダリング)タイルに分割することができる。レンダリング済みタイルは、好ましくは、当技術分野で知られているように、すべて同じサイズおよび形状であるが、ただし、これは本質的なことではない。好ましい一実施形態では、それぞれのレンダリング済みタイルは、矩形であり、好ましくは、サイズに関して8×8、16×16、または32×32のサンプリング位置をとる。 In these arrangements, the render target (data array) can be divided into several (rendering) tiles of any size and shape desired and suitable for rendering. The rendered tiles are preferably all the same size and shape as is known in the art, although this is not essential. In a preferred embodiment, each rendered tile is rectangular and preferably takes a sampling position of 8 × 8, 16 × 16, or 32 × 32 with respect to size.

特に好ましい他の実施形態では、レンダリングプロセスが操作する(生成する)タイルと異なるサイズおよび/または形状のデータブロックを使用することができ、また好ましくは使用する。 In other particularly preferred embodiments, data blocks of different sizes and / or shapes can be used, and preferably used, from the tiles that the rendering process manipulates (generates).

例えば、好ましい一実施形態では、考察され、比較される1つの、またはそれぞれのデータブロックは、一組の複数の「レンダリング済み」タイルで構成され、および/またはレンダリング済みタイルの小部分のみを含みうる。これらの場合には、実際には、グラフィックスプロセッサが生成する1つまたは複数のレンダリング済みタイルから所望のデータブロックを「生成する」中間段階がありうる。 For example, in a preferred embodiment, one or each data block considered and compared is composed of a set of multiple “rendered” tiles and / or includes only a small portion of the rendered tiles. sell. In these cases, there may actually be an intermediate stage that “generates” the desired data block from one or more rendered tiles generated by the graphics processor.

好ましい一実施形態では、同じブロック(領域)構成(サイズおよび形状)が、データの配列全体にわたって使用される。しかし、他の好ましい実施形態では、異なるブロック構成(例えば、そのサイズおよび/または形状に関して)が、所定のデータ配列の異なる領域に対して使用される。したがって、好ましい一実施形態では、異なるデータブロックサイズが、同じデータ配列の異なる領域に対して使用されうる。 In a preferred embodiment, the same block (region) configuration (size and shape) is used throughout the array of data. However, in other preferred embodiments, different block configurations (eg, with respect to their size and / or shape) are used for different regions of a given data array. Thus, in a preferred embodiment, different data block sizes can be used for different regions of the same data array.

特に好ましい一実施形態では、ブロック構成(例えば、考察されているブロックのサイズおよび/または形状に関して)は、使用中に、例えば、データ配列毎にデータ配列(例えば、出力フレーム)上で変更することができる。最も好ましくは、ブロック構成は、排除される(回避される)読み込み(および/または書き込み)トランザクションの数または割合に応じて、例えば、また好ましくは、使用中に適応変更することができる。例えば、また好ましくは、特定のブロックサイズを使用しても、ブロックがメインメモリから読み込まれる必要がない確率が低いだけであることがわかっていれば、データのブロックをメインメモリから読み込まなくて済むようにする確率を高めることを試みるため、考察されているブロックサイズをデータのその後の配列に対して変更する(例えば、また好ましくは、小さくする)ことが可能である。 In one particularly preferred embodiment, the block configuration (e.g., in terms of the size and/or shape of the blocks considered) can be changed in use, for example on a data array (e.g., output frame) by data array basis. Most preferably, the block configuration can be adaptively changed in use, for example and preferably depending on the number or proportion of read (and/or write) transactions that are eliminated (avoided). For example, and preferably, if it is found that using a particular block size gives only a low probability that a block does not need to be read from main memory, the block size being considered can be changed (e.g., and preferably, reduced) for subsequent arrays of data, to try to increase the probability of avoiding having to read blocks of data from main memory.
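An adaptive block-size policy of the kind described could be as simple as the sketch below. The candidate sizes, the 25% cut-off, and the function name are all assumptions chosen for illustration, not values taken from the text.

```python
# Candidate square block edge lengths, largest first (assumed values).
SIZES = (32, 16, 8)

def next_block_size(current: int, blocks_total: int, blocks_not_read: int,
                    min_avoided: float = 0.25) -> int:
    """Pick the block size for the next data array.

    If too small a fraction of reads was avoided with the current size,
    step down to the next smaller size; otherwise keep the current one.
    """
    avoided = blocks_not_read / blocks_total if blocks_total else 0.0
    position = SIZES.index(current)
    if avoided < min_avoided and position + 1 < len(SIZES):
        return SIZES[position + 1]
    return current
```

The policy would be evaluated once per frame (or per sequence of frames) using the counts gathered while that frame was read.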

データブロックサイズを使用中に変更する場合、その変更は、例えば、データ配列全体にわたって、またはデータ配列の特定の部分のみにわたって、望み通りに実行できる。 If the data block size is changed in use, the change can be performed as desired, for example, over the entire data array or only over a specific portion of the data array.

データブロックを1つの他のデータブロックと、または複数の他のデータブロックと比較することができる。好ましくは、比較は、オンチップバッファ/キャッシュ内に各ブロックを格納することによって行われる。 A data block can be compared to one other data block or to multiple other data blocks. Preferably, the comparison is done by storing each block in an on-chip buffer / cache.

好ましい一実施形態では、データブロックは、単一の格納済みデータブロックとのみ、好ましくはデータ配列内の直前のデータブロックと比較される。 In a preferred embodiment, the data block is compared only with a single stored data block, preferably the previous data block in the data array.

他の好ましい実施形態では、データブロックは、データ配列の複数の他のデータブロックと比較される。これは、データ配列内の他の位置にあるデータブロックに類似しているデータブロックの読み込みをなくすことができるため、データ配列から読み込む必要のあるデータブロックの個数をさらに減らすのに役立ちうる。 In other preferred embodiments, the data block is compared to a plurality of other data blocks in the data array. This can help to further reduce the number of data blocks that need to be read from the data array because it can eliminate reading data blocks that are similar to data blocks at other locations in the data array.

データブロックが、データ配列の複数の他のデータブロックと比較される場合、それぞれのデータブロックをデータ配列のすべてのデータブロックと比較することが可能であるが、好ましくは、それぞれのデータブロックは、データ配列の他のデータブロックの全部ではなく一部と、例えば、また好ましくは、注目するデータブロックと同じ、データ配列のエリア内のデータブロックとのみ比較される(例えば、これらのデータブロックはデータブロックの位置を覆い、取り囲む)。これは、データ配列内のすべてのデータブロックをチェックしなくても、データブロックの一致を検出する可能性を高める。最も好ましくは、データブロックは、データ配列内の同じライン上にあるデータブロックと比較される(ブロックが生成される順に)。 Where a data block is compared with plural other data blocks of the data array, each data block could be compared with all of the data blocks of the data array, but preferably each data block is compared with only some, and not all, of the other data blocks of the data array, for example and preferably only with data blocks in the same area of the data array as the data block in question (e.g., those data blocks that cover and surround the position of the data block). This gives a good likelihood of detecting data block matches without having to check every data block in the data array. Most preferably, a data block is compared with the data blocks that lie on the same line in the data array (in the order in which the blocks are generated).
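A bounded search over earlier blocks on the same line might be sketched as follows. It assumes a relative encoding in which entry 0 means "no match" and entry k means "similar to the block k positions earlier"; the names and the `similar` predicate are hypothetical.

```python
def relative_entries(line_blocks, search_depth: int, similar):
    """Compare each block with up to search_depth earlier blocks on its line.

    Returns one relative entry per block: 0 for no match, or the distance
    back to the nearest earlier block judged similar.
    """
    entries = []
    for i, block in enumerate(line_blocks):
        entry = 0
        for distance in range(1, min(search_depth, i) + 1):
            if similar(block, line_blocks[i - distance]):
                entry = distance
                break
        entries.append(entry)
    return entries
```

Increasing `search_depth` finds more matches but widens each metadata entry, which is the trade-off between search depth and metadata width noted in the text.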

例えば、フレーム毎に、使用中にそれぞれのデータブロックの比較相手となる他のデータブロックの個数を変えることも可能である。データブロック比較の探索深さを変えることで、メタデータ幅も変えることができる。 For example, it is possible to change the number of other data blocks to be compared with each data block during use for each frame. The metadata width can also be changed by changing the search depth of the data block comparison.

好ましい一実施形態では、データ配列のあらゆるデータブロックは、他の1つまたは複数データブロックと比較される。しかし、これは本質的なことではなく、したがって、他の好ましい実施形態では、所定のデータ配列(例えば、出力フレーム)のデータブロックの全部ではなく一部に関して比較が実行される。 In a preferred embodiment, every data block of the data array is compared with one or more other data blocks. However, this is not essential and, therefore, in other preferred embodiments, the comparison is performed on some but not all of the data blocks of a given data array (eg, output frame).

特に好ましい一実施形態では、各データ配列に対する他の1つまたは複数のデータブロックと比較されるデータブロックの個数は、例えば、また好ましくは、データ配列毎に(例えば、フレーム毎に)、またはでデータ配列(例えば、フレーム)の複数のシーケンスに基づいて変更される。これは、好ましくは、連続するデータ配列(例えば、フレーム)間の予期される相関に基づく(か、または基づかない)。 In one particularly preferred embodiment, the number of data blocks that are compared with one or more other data blocks for each data array can be changed, for example and preferably, on a data array by data array (e.g., frame by frame) basis, or over a sequence of plural data arrays (e.g., frames). This is preferably based on the expected correlation (or lack of correlation) between successive data arrays (e.g., frames).

Thus, the metadata generation process preferably comprises means for, or a step of, selecting the number of data blocks in a given data array that are to be compared with one or more other data blocks.

In a particularly preferred embodiment, the number of data blocks that are compared can be, and preferably is, different for different regions of the data array.

In a preferred embodiment, a software application (e.g., the application that triggers generation of the data array) can indicate and control for which regions of the data array the data block comparison process is to be performed. This allows the application to turn the comparison process off for regions of the data array that the application "knows" will always be different.

This can be achieved as desired. In a preferred embodiment, registers are provided that enable or disable data block (e.g., rendered tile) comparison for regions of the data array, and the software application sets the registers accordingly (e.g., via the graphics processor driver).
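A per-region enable register of this kind might be modeled as follows. This is only a sketch: the one-bit-per-region layout, the class name, and the default of "enabled everywhere" are assumptions made for illustration, not details given in the specification.

```python
class CompareEnableRegister:
    """Hypothetical register with one enable bit per data array region."""

    def __init__(self, num_regions: int):
        self.num_regions = num_regions
        # Assumption: comparison is enabled for every region by default.
        self.value = (1 << num_regions) - 1

    def set_enabled(self, region: int, enabled: bool) -> None:
        """Set or clear the enable bit for one region (e.g., via a driver)."""
        if not 0 <= region < self.num_regions:
            raise IndexError("region out of range")
        if enabled:
            self.value |= 1 << region
        else:
            self.value &= ~(1 << region)

    def is_enabled(self, region: int) -> bool:
        """True if block comparison should run for this region."""
        return bool((self.value >> region) & 1)
```

An application that knows region 2 changes on every frame could clear that bit, so that the comparison hardware skips the region entirely.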

As discussed above, the generation of "similarity" metadata for data blocks of an array of data to be processed may be new and advantageous in its own right.

Thus, according to a fourth aspect of the present invention, there is provided a method of generating metadata for use when processing an array of data stored in memory, the method comprising:
for each of one or more blocks of data representing particular regions of the array of data to be processed:
determining whether the block of data should be considered similar to another block of data for the data array; and
storing, in association with the array of data, similarity information indicating whether the block of data has been determined to be similar to another block of data for the data array.

According to a fifth aspect of the present invention, there is provided a data processing system comprising:
a data processor for generating an array of data for processing;
means for determining, for each of one or more blocks of data representing particular regions of the array of data, whether the block of data should be considered similar to another block of data for the data array; and
means for storing, in association with the array of data, similarity information indicating whether the block of data has been determined to be similar to another block of data for the data array.

According to a sixth aspect of the present invention, there is provided a data processor comprising:
means for generating an array of data for processing;
means for determining, for each of one or more blocks of data representing particular regions of the array of data, whether the block of data should be considered similar to another block of data for the data array; and
means for storing, in association with the array of data, similarity information indicating whether the block of data has been determined to be similar to another block of data for the data array.

As will be appreciated by those skilled in the art, these aspects and embodiments of the present invention can, and preferably do, include any one or more or all of the preferred and optional features of the invention described herein, as appropriate. Thus, for example, the similarity indication information preferably takes the form of a bitmap associated with the array of data. The similarity of data blocks is preferably determined by comparing the data blocks, preferably by comparing their contents directly. The array of data is preferably data representing an image, and the data processor (the processor generating the data array) is preferably a graphics processor (but could, for example, be a video processor or a CPU).

Preferably, in these aspects and arrangements, the system generates an output data array together with a set of associated similarity information (metadata) indicating, as discussed above, which regions (blocks) of the output data array are the same (can be considered similar).

Most preferably, the whole data array is divided into a suitable number of data blocks, and for each data block into which the array is divided it is determined whether that data block is similar to another data block of the data array (and similarity information is stored for the data block accordingly).

In a particularly preferred embodiment, the similarity information is generated as the data array is being written to memory (i.e., while the data array is being generated). This avoids the need to process the data array again after it has been generated in order to produce the similarity information. In this case, the data array is preferably generated by writing data to it block by block, and as each new block is generated for writing to the array, it is preferably determined whether that block is similar to another block already generated for the data array, and the similarity information (metadata) is generated accordingly.

Thus, in a particularly preferred embodiment, the array of data is stored in memory (e.g., in a frame buffer) by writing blocks of data representing particular regions of the array to the stored array in memory. When a new block of data is generated for the data array, it is determined whether that new block should be considered similar to a block of data already generated for the data array, and similarity information indicating whether the new block has been determined to be similar to an already generated block is generated and stored in association with the array of data accordingly.
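The generation of similarity information while blocks are being written can be sketched as below. The sketch makes several simplifying assumptions that are not requirements of the text: each block is a `bytes` object, a new block is compared only with the immediately preceding block on the same line, "similar" means byte-for-byte identical, and the metadata is a list of booleans rather than a packed bitmap. The function name is hypothetical.

```python
def write_array_with_similarity(blocks, blocks_per_line):
    """Write blocks to a modeled frame buffer and build similarity flags.

    Returns (frame_buffer, similar_to_previous), where flag i is True when
    block i is identical to block i - 1 on the same line.
    """
    frame_buffer = []
    similar_to_previous = []
    previous = None  # locally buffered copy of the last generated block
    for index, block in enumerate(blocks):
        at_line_start = index % blocks_per_line == 0
        # Direct content comparison against the locally buffered block,
        # so nothing has to be read back from the frame buffer.
        similar = (not at_line_start) and block == previous
        similar_to_previous.append(similar)
        frame_buffer.append(block)
        previous = block
    return frame_buffer, similar_to_previous
```

Because the flags are produced during the write pass, no second pass over the stored array is needed to obtain the metadata.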

In these arrangements, the data blocks are preferably buffered or cached in local memory for the similarity information generation process, so that, for example, blocks do not have to be read back from the main memory in which the data array is stored in order to generate the similarity information.

It would also, or instead, be possible to generate "signatures" (as discussed above) for the blocks of data as the array is generated, and then to use those signatures to generate further similarity information, such as a similarity bitmap, for the data array.
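As a rough illustration of deriving a similarity bitmap from signatures, the sketch below uses CRC-32 from Python's standard `zlib` module as the signature. The choice of CRC-32, the comparison of adjacent blocks only, and the function names are assumptions for the example; the text here does not fix the signature function.

```python
import zlib


def block_signature(block: bytes) -> int:
    """Signature representing a block's content (assumed here: CRC-32)."""
    return zlib.crc32(block)


def similarity_bitmap_from_signatures(signatures) -> int:
    """Bit i is set when block i has the same signature as block i - 1."""
    bitmap = 0
    for index in range(1, len(signatures)):
        if signatures[index] == signatures[index - 1]:
            bitmap |= 1 << index
    return bitmap
```

Comparing signatures is cheaper than comparing full blocks, at the cost of a small chance that two different blocks share a signature.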

In the above aspects and embodiments, the metadata (information) for the data blocks, such as a block similarity bitmap and/or signatures, that is associated with the data array and is to be used when the data array is processed must be stored appropriately. In a preferred embodiment it is stored together with the data array in memory (in the first memory). However, this need not be the case, and the similarity metadata could, if desired, be stored somewhere other than with the array of data, such as another suitable location in the system. Indeed, because the similarity metadata may be relatively small, it could, for example, be stored in on-chip memory or a buffer rather than in off-chip memory, if desired.

When the metadata is to be used, it can be fetched by the processing device as appropriate. Metadata, e.g., signatures, for one or more data blocks, and preferably for a plurality of data blocks, is preferably cached locally to the processing device, e.g., on the processing device itself, for example in an on-chip metadata (e.g., signature) buffer. This can avoid the need to fetch metadata from external memory each time a block similarity assessment is made, and so helps to reduce the memory bandwidth used for reading the metadata.

Most preferably, the metadata for the data array being processed is fetched (read) in portions (each corresponding to a plurality of blocks of the data array) in advance of the reading and processing of the data blocks concerned. Thus, the similarity metadata (information) is preferably prefetched for the reading process. This allows the similarity determination to be carried out more quickly.

Where metadata such as data block signatures is cached locally to the processing device, e.g., stored in an on-chip buffer, the data blocks are preferably processed in a suitable order, such as Hilbert order, so as to increase the likelihood of matches with the data block(s) whose metadata is cached locally (stored in the on-chip buffer).
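For readers unfamiliar with Hilbert order, the following sketch converts a position along the Hilbert curve into block coordinates using the well-known iterative mapping. It assumes a square grid whose side is a power of two; the function names are illustrative and not part of the specification.

```python
def hilbert_d2xy(side: int, d: int) -> tuple[int, int]:
    """Map distance d along a Hilbert curve to (x, y) on a side x side grid.

    `side` must be a power of two and 0 <= d < side * side.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Reflect the sub-square before rotating it.
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x  # rotate by swapping coordinates
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_block_order(side: int) -> list[tuple[int, int]]:
    """Block coordinates of a side x side grid in Hilbert order."""
    return [hilbert_d2xy(side, d) for d in range(side * side)]
```

Consecutive blocks in this order are always direct neighbors in the array, which is why recently cached metadata is more likely to belong to a nearby, and therefore similar, block than it would be under plain raster order.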

As will be appreciated by those skilled in the art, generating and storing metadata for the data blocks (e.g., rendered tiles) requires some processing and memory resource, but the Applicants believe that this is outweighed by the potential savings in power consumption and memory bandwidth that can be achieved by then using that data in the manner described above.

As will be appreciated by those skilled in the art, in a particularly preferred embodiment the generated data array and metadata are then read and used by a processing device in the manner described above.

Thus, according to another aspect of the present invention, there is provided a method of processing an array of data, the method comprising:
generating an array of data to be processed;
for each of one or more blocks of data representing particular regions of the array of data to be processed:
determining whether the block of data should be considered similar to another block of data of the data array; and
generating similarity information indicating whether the block of data has been determined to be similar to another block of data of the data array;
storing the array of data and its associated generated similarity information in a first memory;
reading blocks of data, each representing a particular region of the array of data, from the first memory and storing the blocks of data in a memory of the processing device that is to process the data array before the blocks of data are processed by the processing device;
using the similarity information generated for the data array to determine whether a block of data to be processed for the data array is similar to a block of data already stored in the memory of the processing device; and
processing, for the block of data to be processed, either the block of data already stored in the memory of the processing device or a new block of data from the array of data stored in the first memory, on the basis of the similarity determination.
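The read side of the method above can be sketched as follows. The sketch assumes, for illustration only, that the similarity information is one boolean per block meaning "identical to the previous block", that the processing device's local memory holds a single block, and that `process` is whatever the device does with a block (for example, sending it to a display). All names are hypothetical.

```python
def process_array(frame_buffer, similar_to_previous, process):
    """Process every block, fetching from the first memory only when needed.

    Returns the number of blocks actually fetched from `frame_buffer`.
    """
    local_block = None  # the processing device's local copy
    fetches = 0
    for index, similar in enumerate(similar_to_previous):
        if similar and local_block is not None:
            # Reuse the block already held in local memory: no read.
            pass
        else:
            local_block = frame_buffer[index]  # read from the first memory
            fetches += 1
        process(local_block)
    return fetches
```

For example, processing three blocks of which the second is flagged as similar to the first results in only two reads from the frame buffer, while the device still processes three blocks.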

According to another aspect of the present invention, there is provided a data processing system comprising:
a first memory for storing an array of data to be processed;
a data processor for generating the array of data to be processed;
means for determining, for each of one or more blocks of data representing particular regions of the array of data, whether the block of data should be considered similar to another block of data of the data array;
means for generating similarity information indicating whether the block of data has been determined to be similar to another block of data of the data array;
means for storing the array of data and its associated generated similarity information in the first memory;
a processing device having a local memory, which processes the array of data stored in the first memory by processing successive blocks of data, each representing a particular region of the array of data;
a read controller configured to read blocks of data representing particular regions of the array of data stored in the first memory and to store the blocks of data in the local memory of the processing device before the blocks of data are processed by the processing device; and
control circuitry configured to use the similarity information generated for the data array to determine whether a block of data to be processed for the data array is similar to a block of data already stored in the memory of the processing device, and, on the basis of the similarity determination, to cause the processing device to process, for the block of data to be processed, either the block of data already stored in the memory of the processing device or a new block of data from the array of data stored in the first memory.

As will be appreciated by those skilled in the art, these aspects and arrangements can, and preferably do, include any one or more or all of the preferred and optional features of the invention described herein, as appropriate.

Although, as discussed above, the present technology is concerned in particular with the process of reading data from memory for use, the Applicants have recognized that its principles can also be used to improve the process of writing the data array to memory in the first place. In particular, the Applicants have recognized that it may not be necessary to store a new data block in the data array if that block is determined to be sufficiently similar to a block already generated for the data array.

Thus, in a particularly preferred embodiment, when data blocks for a data array are being written to the data array in memory, a completed data block (e.g., a rendered tile) is not written to the data array in memory if it is determined that the block should be considered similar to a data block that has already been generated for the data array (and that will therefore already be stored in the data array). This avoids writing to the data array a data block that has been determined to be the same as a data block that is already stored in the data array.

Thus, in this case, as each data block to be written to the data array is generated, it is compared with another data block or blocks of the data array, and the new data block is then written, or not written, to the data array on the basis of that comparison.

Thus, in a particularly preferred embodiment, there is a step of, or means for, comparing a data block for the data array, when that block is completed, with at least one other data block of the data array, and determining on the basis of the comparison whether or not to write the completed data block to the data array.

This process preferably uses the same block comparison arrangements as discussed above to determine whether the blocks are similar, such as comparing signatures representing the contents of the data blocks or, most preferably, comparing the contents of the blocks directly.

In these arrangements, although a data block itself may not be written to the data array, similarity metadata must still be generated and stored for the block position in question, because that information is needed to determine which other block of the data array the processing device should process instead.
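Write elimination together with the mandatory metadata update can be sketched as below. As in the text, a matching block is not written, but its similarity flag is still recorded. The assumptions made for the example are that a new block is compared only with the previously generated block on the same line, that comparison is by direct content equality, and that `memory` is a pre-allocated list standing in for the frame buffer; the names are hypothetical.

```python
def write_with_elimination(blocks, blocks_per_line, memory):
    """Write only non-matching blocks to `memory`; always record metadata.

    Returns (similar_to_previous, writes_performed). Positions whose write
    was eliminated keep whatever `memory` already held, so a reader must
    consult the flags to know which block to use instead.
    """
    similar_to_previous = []
    writes = 0
    previous = None  # locally buffered copy of the last generated block
    for index, block in enumerate(blocks):
        at_line_start = index % blocks_per_line == 0
        similar = (not at_line_start) and block == previous
        similar_to_previous.append(similar)  # metadata is always stored
        if not similar:
            memory[index] = block  # only non-matching blocks are written
            writes += 1
        previous = block
    return similar_to_previous, writes
```

With four blocks `A, A, A, B` on one line, only two writes reach memory, and the slots for the second and third blocks are left untouched; this is why a reader cannot treat the stored array as complete without the metadata.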

In a preferred embodiment of these arrangements, the write elimination process is performed (by comparing blocks) only in respect of blocks generated for the same data array (the current data array).

However, the comparison could be extended to include data blocks from a previous data array that is already stored in memory (e.g., in the frame buffer), so that a data block is not written to memory again for the data array if a similar data block from the previous data array is already present in memory. This may be particularly useful where a series of similar data arrays (such as the frames of a video sequence) is being generated. In this case, a newly generated data block can be compared (e.g., on the basis of its content or a signature of its content) with one or more blocks of the data array already stored in memory.

In these arrangements, the system is preferably configured always to write a newly generated data block to the data array in memory periodically, e.g., once a second, in respect of each given data block (data block position). This ensures that a new data block is written into the data array at least periodically for every data block position, and thereby avoids, for example, an erroneously matched data block (e.g., because the data block signatures happen to match even though the contents of the blocks actually differ) being retained in the data array for longer than a given, e.g., desired or selected, period of time. This can be done, for example, by simply writing out an entire new data array periodically (e.g., once a second), or by writing new data blocks out to the data array on a rolling basis in a cyclic pattern, so that over time all the data block positions are eventually written out as new.
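The rolling refresh mentioned above can be illustrated with a small sketch in which the refresh period is counted in data arrays (frames) rather than in seconds, and each block position is forced out in the frame whose number is congruent to the position modulo the period. Both the frame-based period and this particular cyclic pattern are assumptions for the example, and the function names are hypothetical.

```python
def must_write(position: int, frame_number: int, refresh_period: int) -> bool:
    """True when this block position is due a forced write in this frame."""
    return position % refresh_period == frame_number % refresh_period


def should_write(block_matches: bool, position: int, frame_number: int,
                 refresh_period: int) -> bool:
    """Write if the block did not match, or if a forced refresh is due."""
    return (not block_matches) or must_write(position, frame_number,
                                             refresh_period)
```

With a period of 60 frames at 60 frames per second, every position would be rewritten once a second, with the forced writes spread evenly across frames instead of all landing in one.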

In a particularly preferred embodiment, the present invention is used in conjunction with one or more other power and bandwidth reduction schemes, for example, and preferably, a data array (e.g., frame buffer) compression scheme (which may be lossy or lossless, as desired).

Although, as discussed above, the present technology has particular application to graphics processor operation, the Applicants have recognized that it can equally be applied to other systems that process data in the form of blocks in a manner similar to, for example, tile-based graphics processing systems, and that read, for example, frame buffers, textures and/or images. Thus it can be applied, for example, to a host processor manipulating a frame buffer, a graphics processor reading a texture, a composition engine reading images to be composited, or a video processor reading reference frames for video decoding. The technology can therefore equally be used, for example, for video processing (since video processing operates on blocks of data analogous to tiles in graphics processing) and for composite image processing (since, again, the composition frame buffer is processed as distinct blocks of data). It can also be used, for example, when a digital camera processes data (images) generated by the camera's sensor, and when processing data (images) generated by a digital camera, e.g., for display.

The present technology can also be used where there are multiple master devices each writing to the same data array, e.g., to a frame in a frame buffer. This may be the case, for example, where a host processor generates an "overlay" to be displayed on an image that is being generated by a graphics processor.

In this case, each device writing to the data array could update the similarity metadata accordingly, or the metadata for the parts of the data array written to by another master could, for example, be invalidated or cleared (those parts of the data array will then be read in full by the processing device). The latter is necessary where a given master device is unable to update the similarity metadata. It would also be possible, for example, to invalidate (clear) the metadata for the entire data array if another master modifies a relatively large part of the data array (or modifies the data array at all).

More specifically, where there is a "third party" device that also reads and/or writes the data array, and only read elimination is in use, the third party device, when reading from the data array, could simply read the data array in the normal way without using the similarity metadata (or, indeed, without knowing about it), or it could use the metadata to eliminate read transactions.

If the third party device is writing to the data array, it could update the metadata associated with the data array, or it could invalidate some or all of the similarity metadata for the data array. In the latter case there could be, for example, a metadata invalidation bit for the data array at the very start of the metadata.
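A whole-array invalidation bit of the kind just mentioned might behave as in the sketch below. The layout (a single flag held ahead of the per-block flags), the class name, and the method names are assumptions for illustration; the text says only that such a bit could sit at the very start of the metadata.

```python
class SimilarityMetadata:
    """Per-block similarity flags preceded by one invalidation flag."""

    def __init__(self, flags):
        self.invalid = False      # assumed to be stored first in the metadata
        self.flags = list(flags)  # one similarity flag per block

    def invalidate(self) -> None:
        """Called by a writer that cannot update the per-block flags."""
        self.invalid = True

    def is_similar(self, index: int) -> bool:
        """When invalidated, every block is treated as not similar."""
        return (not self.invalid) and self.flags[index]
```

Once the flag is set, a reader falls back to fetching every block from memory, which is always safe provided that all blocks have actually been written (that is, only read elimination is in use).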

Where both read and write transaction elimination are in use, a third party device reading from the data array uses the similarity metadata to eliminate read transactions. (Unlike the case where only read elimination is in use, in which a third party device reading the data array can use or not use the metadata to eliminate reads as it wishes, where write elimination is enabled the third party device must read and use the metadata when reading from the data array. This is because, when write elimination is used, the data array may not be "complete": for those data blocks whose writing to the data array has been "eliminated", the reading device has to determine from the metadata which block to use instead.)

When writing to the data array in this case, the third party device can, as in the case above where only read elimination is enabled, either update the metadata as it writes data to the data array or invalidate some or all of the metadata.

The metadata generation process (and the data block comparison process, where used) can be implemented as desired. In one preferred embodiment it is performed by the data array generating processor (e.g., GPU, CPU, etc.) itself, but in another preferred embodiment there is a separate block or hardware element (logic), intermediate between the data array generating processor and the memory (e.g., frame buffer) in which the data array is to be stored, that does this. Where the metadata generation "unit" is separate from (external to) the data array generating processor, it can be provided as a standalone logic block or can, for example, be part of a bus fabric and/or interconnect.

Thus, in one preferred embodiment there is a metadata generation hardware element (logic) that is separate from the data array generating processor (e.g., graphics processor), and in another preferred embodiment the metadata generation logic is integrated into (is part of) that processor. That is, in one preferred embodiment the metadata generation means and so on are part of the data generating processor (e.g., graphics processor) itself, while in other preferred embodiments the system comprises a data generating processor and a separate "metadata generation" unit or element.

The present invention also extends to the provision of a particular hardware element for performing the comparison and the consequent similarity metadata determination. As discussed above, this hardware element (logic) may, for example, be provided as an integral part of, e.g., a graphics processor, or may be a standalone element that can, e.g., interface between a graphics processor and an external memory controller. It may be a programmable or a dedicated hardware element.

Thus, according to another aspect of the present invention, there is provided a metadata generation apparatus for use in a data processing system in which an array of data generated by the data processing system is read from an output buffer by reading blocks of data representing particular regions of the array of data from the output buffer, the apparatus comprising:
means for comparing a block of data for the data array with at least one other block of data for the data array and for generating, on the basis of the comparison, information indicating whether the block of data should be considered similar to another block of data of the data array; and
means for storing that similarity information in association with the data array.

As will be appreciated by those skilled in the art, these aspects and embodiments of the invention can, and preferably do, include any one or more or all of the preferred and optional features described herein. Thus, for example, the comparison preferably comprises comparing some or all of the content of the respective data blocks.

The similarity determination process (and the consequent data block selection process) can similarly be carried out as desired. In one preferred embodiment it is performed by the processing device (e.g., display controller, GPU, CPU, etc.) itself. In another preferred embodiment it is performed by a separate block or hardware element (logic) that sits between the data processing device and the memory (e.g., frame buffer) in which the data array is stored. Where the similarity determination "unit" is separate from (external to) the processing device, it may again be provided as a standalone logic block or may, for example, form part of the bus fabric and/or interconnect.

Thus, in one preferred embodiment there is a similarity determination hardware element (logic) that is separate from the data array processing device (e.g., display controller), and in another preferred embodiment the similarity determination logic is integrated into (forms part of) the data array processing device. In other words, in one preferred embodiment the similarity determining means and so on (the read controller and the controller of the system) are part of the processing device (e.g., display controller) itself, whereas in another preferred embodiment the system comprises a processing device and a separate "similarity determination" unit or element (comprising the read controller and/or the controller).

The invention also extends to the provision of a particular hardware element for performing the similarity determination and the consequent data block determination. As discussed above, this hardware element (logic) may, for example, be provided as an integral part of, e.g., a display controller, or may be a standalone element that can, e.g., interface between a display controller and an external memory controller. It may be a programmable or a dedicated hardware element.

Thus, according to another aspect of the present invention, there is provided a similarity determining apparatus for use when processing an array of data stored in a first memory, the apparatus comprising:
a read controller configured to read blocks of data, each representing a particular region of the array of data stored in the first memory, and to store the blocks of data in a local memory of a processing device that is to process the array of data, before the blocks of data are processed by the processing device; and
a controller configured to determine whether a block of data to be processed for the data array is similar to a block of data already stored in the memory of the processing device and, on the basis of that similarity determination, to cause the processing device to process, for the block of data to be processed, either the block of data already stored in the memory of the processing device or a new block of data from the array of data stored in the first memory.

As will be appreciated by those skilled in the art, these aspects and embodiments can, and preferably do, include any one or more or all of the preferred and optional features described herein. Thus, for example, the similarity determination is preferably based on similarity metadata associated with the data array.

Various other preferred and alternative arrangements are possible. For example, in the case of a stereoscopic display, for which a left image and a right image are generated and used, the respective "left" and "right" blocks to be displayed are preferably compared with one another for the purposes of read (and, optionally, write) elimination, rather than comparing blocks for the "left" image of the frame only with other "left" blocks and "right" blocks only with other "right" blocks. In other words, the left and right parts of the image are preferably compared with each other, as well as blocks within each part of the image being compared. As the applicants have recognised, many of the left and right tiles in the image will be the same as each other, so this helps to reduce the number of read transactions further. Similar arrangements can be (and preferably are) used for displays that use more than two images and for volumetric displays.

In a particularly preferred embodiment, the determined similarity information is also used to manage the storage of data blocks in the local memory of the processing device, and in particular as a factor in deciding which data blocks to evict from the local memory. For example, in one preferred embodiment the metadata is used to identify one or more data blocks that the processing device will use repeatedly (e.g., within the frame being displayed). Once written to the local memory of the processing device, that data block (or those data blocks) is then temporarily locked there so that it is available in the local memory when it is needed later. The metadata is thus preferably used to try to identify in advance the data blocks that it would be advantageous to retain (where possible) in the local memory of the processing device, and the local memory is then managed accordingly. This could be done, for example, by counting, when the metadata is created, the number of other data blocks that are indicated as being similar to a given data block. That information can then be used to control the storage of data blocks in the local memory of the processing device accordingly.
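The counting approach mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the metadata is assumed to take the form of a hypothetical list `similar_to`, in which entry `i` holds the index of the block that block `i` is considered similar to (or `None` for a new block), and the function name `lock_candidates` and the parameter `max_locked` are invented for the example.

```python
from collections import Counter


def lock_candidates(similar_to, max_locked):
    """Pick the blocks referenced most often by other blocks.

    similar_to[i] is the index of the block that block i is considered
    similar to, or None if block i is a new block (hypothetical format).
    Returns up to max_locked block indices, most-referenced first.
    """
    counts = Counter(ref for ref in similar_to if ref is not None)
    return [block for block, _ in counts.most_common(max_locked)]


# Block 0 is referenced three times, block 3 once.
similar_to = [None, 0, 0, None, 3, 0]
top_one = lock_candidates(similar_to, 1)
top_two = lock_candidates(similar_to, 2)
```

Blocks returned by such a function would be the natural ones to lock into the local memory, since they stand in for the largest number of other blocks.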

It would also be possible to keep a count of the number of times a given data block in the local memory will be used in the near future (e.g., based on metadata prefetched for the part of the data array being processed), and to evict the data block from the local memory only once that "use" count has reached zero.

Thus, in a particularly preferred embodiment, the eviction of data blocks from the local memory of the processing device is controlled, at least in part, in accordance with the similarity metadata associated with the data array in question.
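A minimal sketch of use-count-driven eviction is given below. It is not the implementation described in the specification: the class `LocalBuffer`, its methods, the fixed `capacity`, and the fallback policy used when every resident block is still needed are all assumptions made for illustration. The pending-use counts are assumed to have been derived from prefetched metadata.

```python
from collections import Counter


class LocalBuffer:
    """Toy local memory that evicts only blocks with no pending uses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = {}          # block id -> block data
        self.pending = Counter()  # block id -> remaining future uses

    def set_pending_uses(self, upcoming_ids):
        # upcoming_ids: block ids needed in the near future, as derived
        # from prefetched similarity metadata (assumed input format).
        self.pending = Counter(upcoming_ids)

    def use(self, block_id, fetch):
        if block_id not in self.blocks:
            if len(self.blocks) >= self.capacity:
                self._evict_one()
            self.blocks[block_id] = fetch(block_id)
        if self.pending[block_id] > 0:
            self.pending[block_id] -= 1
        return self.blocks[block_id]

    def _evict_one(self):
        # Prefer a block whose use count has reached zero.
        for candidate in list(self.blocks):
            if self.pending[candidate] == 0:
                del self.blocks[candidate]
                return
        # Fallback (assumption): evict the block with fewest pending uses.
        victim = min(self.blocks, key=lambda b: self.pending[b])
        del self.blocks[victim]


fetches = []


def fetch(block_id):
    fetches.append(block_id)
    return block_id.upper()


schedule = ["a", "b", "a", "c", "a"]
buf = LocalBuffer(capacity=2)
buf.set_pending_uses(schedule)
results = [buf.use(block_id, fetch) for block_id in schedule]
```

In this run block "b" is evicted to make room for "c" because its use count is already zero, while "a" stays resident and is fetched only once.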

The present invention can be implemented in any suitable system, such as a suitably configured microprocessor-based system. In a preferred embodiment, the present invention is implemented in a computer and/or microprocessor-based system.

The various functions of the present invention can likewise be carried out in any desired and suitable manner. For example, the functions of the present invention can be implemented in hardware or software, as desired. Thus, for example, the various functional elements and "means" of the invention may comprise a suitable processor or processors, controller or controllers, functional units, circuitry, processing logic, microprocessor arrangements, etc., that are operable to perform the various functions, such as appropriately dedicated hardware elements and/or programmable hardware elements that can be programmed to operate in the desired manner.

In a preferred embodiment, the output data array generating processor and/or the metadata generation unit is implemented as a hardware element (e.g., an ASIC). Thus, in another aspect, the present invention comprises a hardware element that includes the apparatus of, or that is operated in accordance with the method of, any one or more of the aspects of the invention described herein.

It should also be noted that, as will be appreciated by those skilled in the art, the various functions and so on of the present invention may be duplicated and/or carried out in parallel on a given processor.

When used in a graphics processing system, the present invention is applicable to any suitable form or configuration of graphics processor and renderer, such as processors having a "pipelined" rendering arrangement (in which case the renderer takes the form of a rendering pipeline). It is particularly applicable to tile-based graphics processors and graphics processing systems.

As will be appreciated from the above, the present invention is particularly, although not exclusively, applicable to 2D and 3D graphics processors and processing devices. It accordingly extends to a 2D and/or 3D graphics processor and a 2D and/or 3D graphics processing platform that includes the apparatus of, or is operated in accordance with the method of, any one or more of the aspects of the invention described herein. Subject to any hardware necessary to carry out the specific functions discussed above, such a 2D and/or 3D graphics processor can otherwise include any one or more or all of the usual functional units and so on that 2D and/or 3D graphics processors include.

Those skilled in the art will also appreciate that all of the described aspects and embodiments of the present invention can include, as appropriate, any one or more or all of the preferred and optional features described herein.

The methods in accordance with the present invention may be implemented at least partially using software, for example computer programs. Thus, viewed from further aspects, the present invention provides: computer software specifically adapted to carry out the methods described herein when installed on data processing means; a computer program element comprising computer software code portions for performing the methods described herein when the program element is run on data processing means; and a computer program comprising code means adapted to perform all the steps of a method or of the methods described herein when the program is run on a data processing system. The data processing system may be a microprocessor, a programmable FPGA (field programmable gate array), etc.

The invention also extends to a computer software carrier comprising such software which, when used to operate a processor or system comprising data processing means, causes that processor or system, in conjunction with the data processing means, to carry out the steps of the methods of the present invention. Such a computer software carrier could be a physical storage medium such as a ROM chip, CD-ROM, or disk, or could be a signal such as an electronic signal over wires, an optical signal, or a radio signal such as to a satellite or the like.

It will further be appreciated that not all steps of the methods of the invention need be carried out by computer software. Thus, from a further broad aspect, the present invention provides computer software, and such software installed on a computer software carrier, for carrying out at least one of the steps of the methods described herein.

The present invention may accordingly suitably be embodied as a computer program product for use with a computer system. Such an implementation may comprise a series of computer readable instructions fixed on a tangible medium, such as a non-transitory computer readable medium, for example a diskette, CD-ROM, ROM, or hard disk. It could also comprise a series of computer readable instructions transmittable to a computer system, via a modem or other interface device, either over a tangible medium, including but not limited to optical or analogue communications lines, or intangibly using wireless techniques, including but not limited to microwave, infrared, or other transmission techniques. The series of computer readable instructions embodies all or part of the functionality previously described herein.

Those skilled in the art will appreciate that such computer readable instructions can be written in a number of programming languages for use with many computer architectures or operating systems. Further, such instructions may be stored using any memory technology, present or future, including but not limited to semiconductor, magnetic, or optical, or transmitted using any communications technology, present or future, including but not limited to optical, infrared, or microwave. It is contemplated that such a computer program product may be distributed as a removable medium with accompanying printed or electronic documentation, for example shrink-wrapped software; pre-loaded with a computer system, for example on a system ROM or fixed disk; or distributed from a server or electronic bulletin board over a network, for example the Internet or World Wide Web.

A number of preferred embodiments of the present invention will now be described, by way of example only, with reference to the accompanying drawings.

FIG. 1 is a schematic diagram of a first embodiment in which the present invention is used in conjunction with a tile-based graphics processor.
FIG. 2 is a schematic diagram showing how the relevant data is stored in memory in an embodiment of the present invention.
FIG. 3 is a schematic diagram showing, in more detail, the display controller of the embodiment shown in FIG. 1.
FIG. 4 is a diagram illustrating the operation of the display controller in the embodiment shown in FIG. 1.
FIG. 5 is a schematic diagram showing, in more detail, the graphics processor of the embodiment shown in FIG. 1.
FIG. 6 is a diagram illustrating the operation of the graphics processor in the embodiment shown in FIG. 1.

A number of preferred embodiments of the present invention will now be described. These embodiments are described primarily with reference to the processing, by a display controller for display, of images generated by a graphics processing system. However, as noted above, the present invention is applicable to other arrangements in which a data array is processed as blocks that each represent a region of the overall array.

FIG. 1 shows schematically an arrangement of a system that can operate in accordance with an embodiment of the present invention.

As shown in FIG. 1, the system includes a tile-based graphics processor (GPU) 1. In this embodiment, this is the element of the system that generates the data arrays to be processed. As is known in the art, the data array will typically be an output frame intended for display on a display device 2, such as a screen or printer, but it may also comprise, for example, a "render to texture" output of the graphics processor 1.

As is known in the art, the graphics processor generates the output data arrays to be processed, such as output frames, by generating tiles that represent different regions of each output data array.

As is known in the art, in such an arrangement, once a tile has been generated by the graphics processor 1, it would then normally be written to an output buffer, in the form of a frame buffer 3 in the main memory 4 of the system (which memory may be DDR-SDRAM), via an interconnect 5 that is connected to a memory controller 6.

At some later time, the data array in the frame buffer 3 is read by a display controller 7 and output to the display 2. (The display controller 7 is thus the processing device that is to process (in this case, display) the data array generated by the graphics processor 1.)

As part of this process, the display controller reads blocks of data from the frame buffer 3 and stores them in a local memory buffer 8 of the display controller 7 before outputting those blocks of data to the display 2. The display device 2 may be, for example, a screen or a printer.

In the present embodiment, this process further comprises the display controller 7 determining whether a new block of data to be output (processed) for display should be considered similar to a block of data already stored in the local memory 8 of the display controller 7. To do this, in the present embodiment, the display controller 7 uses similarity metadata that is associated with the output frame in the frame buffer and that was generated by the graphics processor 1 when it generated the output frame. (This process is described in more detail below.)

In essence, and as described in more detail below, the display controller 7 determines whether a data block to be processed should be considered similar to a data block already stored in the local buffer 8. If the data block to be processed is found to be similar to a data block already stored in the local buffer 8 of the display controller 7, the display controller does not read a new data block from the frame buffer 3, but instead provides the existing data block in the buffer 8 to the display 2.

In this way, the present embodiment can avoid read traffic between the display controller 7 and the frame buffer 3 for blocks of data in the frame buffer 3 that are similar to blocks of data already stored in the local buffer 8 of the display controller 7. (In the case of a game, for example, this would typically be the case for much of the user interface, the sky, and so on, as well as for most of the playfield when the camera position is static.) This can save a significant amount of bandwidth and power consumption in relation to the frame read operation.

If, on the other hand, it is determined that the data block to be processed is not similar to a data block already stored in the local buffer 8 of the display controller 7, the display controller reads the new data block from the frame buffer 3 into its local buffer 8 and then provides that new data block to the display 2.

In the present embodiment, the data blocks that are read from the frame buffer 3 and compared with the data blocks already stored in the buffer 8 of the display controller 7 are cache lines, since a cache line is the amount of data that the display controller 7 reads from the frame buffer 3 in each read operation. Other arrangements would, however, be possible. For example, the display controller could operate this process on data blocks that correspond to the rendered tiles generated by the graphics processor 1, or to 2D "sub-tiles" of the rendered tiles.

FIG. 1 also shows a host CPU 9, which can also communicate with the main memory 4 via the interconnect 5 and can, for example, also write to the frame buffer 3 in the main memory 4. This possibility is discussed in more detail below.

In the present embodiment, as discussed above, the display controller 7 determines whether a given data block (cache line) to be processed for display should be considered similar to a data block already stored in the local buffer 8 by assessing metadata, in the form of a bitmap, that is stored in association with the data blocks making up the frame in question.

Each data block position (cache line) in the data array stored in the frame buffer 3 is associated with a single bit in the bitmap that corresponds to the frame (that is, each bit in the bitmap corresponds to one data block position, in this case a cache line, of the frame). The bit in the bitmap for a data block (cache line) is set to "1" if that data block is considered to be the same as the preceding data block (cache line) to be read (processed) from the frame, and is set to "0" if the data block is considered to differ from the preceding data block.
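The producer side of this bitmap can be sketched as below. The sketch makes assumptions that the text does not fix: blocks are modelled as Python `bytes` objects, "considered the same" is taken to mean byte-for-byte equality, and the function name `build_similarity_bitmap` is hypothetical.

```python
def build_similarity_bitmap(blocks):
    """Return one bit per block position.

    A bit is 1 if the block equals the immediately preceding block in read
    order, and 0 otherwise. The first block always gets 0 because there is
    no preceding block to compare against. Exact equality is an assumption;
    the text only requires that blocks be "considered" the same.
    """
    bitmap = []
    previous = None
    for block in blocks:
        bitmap.append(1 if previous is not None and block == previous else 0)
        previous = block
    return bitmap


frame = [b"sky", b"sky", b"sky", b"hud", b"hud", b"sky"]
bitmap = build_similarity_bitmap(frame)
```

With this encoding the metadata costs one bit per cache line, which is small compared with the cache line itself.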

The display controller can thus read the bitmap entry associated with the data block that it is to process. If that bitmap entry is set to "1", the display controller knows that the data block is considered to be the same as the preceding data block that was read into the buffer 8 of the display controller 7 (and so it can display the data block that is already in the buffer 8 instead of reading new data into the local memory 8 of the display controller 7). If, on the other hand, the metadata associated with the data block to be processed is "0", the display controller knows that it should read the new data block from the frame buffer 3 into the local buffer 8 and display that new block on the display 2.
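The consumer side can be sketched in the same spirit. Again this is an illustration under assumptions rather than the controller's actual logic: the frame buffer is modelled as a Python list, the local buffer holds a single block, and the name `scan_out` and the `reads` counter are invented for the example.

```python
def scan_out(frame_buffer, bitmap):
    """Walk the blocks in read order, skipping reads where the bit is 1.

    Returns the sequence of blocks sent to the display and the number of
    frame buffer reads actually performed.
    """
    local_block = None  # single-entry stand-in for the local buffer
    reads = 0
    displayed = []
    for index, bit in enumerate(bitmap):
        if bit == 0 or local_block is None:
            # "0": fetch the new block from the frame buffer.
            local_block = frame_buffer[index]
            reads += 1
        # "1": reuse the block already held locally.
        displayed.append(local_block)
    return displayed, reads


frame_buffer = [b"A", b"A", b"B", b"B", b"B"]
bitmap = [0, 1, 0, 1, 1]
displayed, reads = scan_out(frame_buffer, bitmap)
```

In this example five blocks are displayed with only two frame buffer reads, which is the bandwidth saving the scheme is aiming for.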

FIG. 2 shows an exemplary memory layout for a data array in the frame buffer 3 and its associated metadata (data block similarity information) 10. In this case, the data blocks making up the frame are stored in the frame buffer 3, and the associated data block similarity bitmap 10 is stored in another part of the memory 4. (Other arrangements would, of course, be possible.)

As shown in FIG. 2, each data block in the data array in the frame buffer 3 has an associated entry in the similarity information bitmap 10. Thus, for example, the data block 11 in the frame buffer 3 is associated with the bitmap entry 13 in the bitmap 10, and the data block 12 in the frame buffer 3 is associated with the bitmap entry 14 in the similarity bitmap 10.

FIG. 2 also shows the nature of the bitmap entries. The bitmap entry 13 has the value "0", indicating that the data block 11 in the data array in the frame buffer 3 is not the same as the preceding data block (and is therefore a "new" data block that should be read from the frame buffer into the local memory 8 of the display controller 7). The bitmap entry 14 for the next data block 12, on the other hand, has the value "1", indicating that the data block 12 is the same as the data block 11 in the frame buffer 3. As a result, the display controller will display the data block 11 stored in its local memory 8 instead of reading the new data block 12 from the frame buffer 3.

Other similarity metadata arrangements could be used if desired. For example, each data block could potentially be indicated as being similar to any of a number of data blocks in the data array, in which case each bitmap entry could include more bits so as to indicate to the display controller 7 which of the data blocks in the data array the corresponding data block is considered similar to. In such arrangements, each similarity value (metadata entry) could give, for example, a relative indication of which other data block in the data array the data block in question is similar to (e.g., "001" indicating the data block immediately preceding the current data block), or an absolute indication of which other data block it is similar to (e.g., metadata "125" indicating that the block is similar to the 125th data block in the data array in question).
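The difference between relative and absolute indications can be illustrated with a short sketch. The entry format used here, a `(kind, value)` tuple with the kinds "new", "relative", and "absolute", is a hypothetical encoding chosen for readability, and whether an absolute value such as 125 counts blocks from zero or from one is left open by the text, so the sketch simply passes the value through.

```python
def resolve_reference(current_index, entry):
    """Return the index of the block that `entry` points at, or None.

    entry is a (kind, value) tuple in a hypothetical encoding:
      ("new", _)       -> no similar block; read the block itself
      ("relative", n)  -> the block n positions before the current one
      ("absolute", k)  -> block number k in the data array
    """
    kind, value = entry
    if kind == "new":
        return None
    if kind == "relative":
        return current_index - value
    if kind == "absolute":
        return value
    raise ValueError(f"unknown metadata kind: {kind!r}")


previous_block = resolve_reference(10, ("relative", 1))
fixed_block = resolve_reference(10, ("absolute", 125))
no_reference = resolve_reference(10, ("new", 0))
```

A relative encoding keeps entries small when similar blocks tend to be neighbours, whereas an absolute encoding can point anywhere in the array at the cost of wider entries.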

It is also possible to include with each metadata entry a “likeness” value indicating how similar the respective data blocks are. The similarity determination process can then use this likeness value to decide, for example, whether to read a new block from the data array or to reuse a similar data block already present in the local memory of the processing device in use. For example, the similarity determination process can set a likeness threshold, compare the likeness value for a new data block with that threshold, and read or not read the new data block accordingly.
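
The threshold test described above might be sketched as follows. The `MetadataEntry` type, its field names, and the 0 to 255 likeness scale are hypothetical; the text only says that a likeness value is compared with a threshold.

```python
from dataclasses import dataclass


@dataclass
class MetadataEntry:
    similar_to: int  # position of the block this one resembles (hypothetical field)
    likeness: int    # assumed scale: 0 = unrelated, 255 = identical


def should_reuse(entry: MetadataEntry, threshold: int) -> bool:
    """Reuse the locally stored block only if the likeness reaches the threshold."""
    return entry.likeness >= threshold
```

Raising the threshold trades fewer skipped reads for a closer match between the displayed block and the block actually held in the frame buffer.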

Arrangements other than a bitmap, such as a hierarchical quadtree, can also be used. The metadata (similarity information) associated with the data array can also take the form of a command list that instructs the processing device to read data blocks into its local memory according to their relative similarity.

As described further below, although in the bitmap example above the similarity metadata (bitmap) indicates directly to the display controller 7 whether each data block should be considered similar to another data block in the data array, it is also possible to associate with each data block some information that allows the display controller itself to perform a comparison between data blocks and determine whether they should be considered similar. For example, information representing the content of each data block could be stored instead, and the display controller 7 could then compare the content information of the respective data blocks to determine whether they should be considered similar.

FIG. 3 shows the structure of the display controller 7 in more detail, and FIG. 4 is a flowchart showing the operation of the display controller 7 described above.

As shown in FIG. 3, the display controller 7 comprises a bus interface unit 20, a metadata buffer 21, a display formatter and output unit 22, and a state machine controller 23, in addition to the local buffer 8 that stores data blocks from the frame buffer 3 in the main memory 4 before they are displayed.

The state machine controller 23 controls the display controller 7 so that it performs the operation of the embodiment described above. The metadata buffer 21 is used to store chunks of the metadata bitmap 10 for the frame (data array) in question, which improves the efficiency of off-chip memory accesses. Other arrangements are possible, such as the display controller always reading the metadata directly from the main memory 4.

When a new frame is to be displayed, the display controller first reads the relevant portion of the metadata 10 associated with that frame from the main memory 4 and stores it in the metadata buffer 21. The display controller then reads blocks of data from the frame buffer 3 in the main memory 4 into the data cache/buffer 8 and supplies those blocks to the display 2 as appropriate via the display formatter/output unit 22. The display controller operates to prefetch the blocks of data to be displayed into the local memory 8, to ensure that there is always data available to display (a buffer/memory underrun can cause glitches in the displayed image). The blocks are then read out of the local memory 8 one after another for display. However, this operation is modified, under the control of the state machine 23, so as to follow the process shown in FIG. 4 (and described above).

As shown in FIG. 4, when a new data block (cache line) is to be prefetched into the local memory 8 to be processed for display (this may be triggered, for example, by the display of a block from the local memory 8, which indicates that a new block needs to be fetched to add to the “queue” in the local memory 8), the state machine controller 23 reads the appropriate location in the similarity metadata bitmap in the metadata buffer 21 for that new data block (step 31). It then determines whether the bit stored at that location in the similarity bitmap has the value “1” (step 32).

If the value at the bitmap location is determined to be “1”, this indicates that the new data block is the same as the previous data block (which should therefore already be in the local memory 8 of the display controller). Instead of reading a new data block from the frame buffer 3, the state machine controller 23 therefore causes the display controller to use (at the appropriate time) the previous data block already in the local buffer 8, i.e. to supply that previous data block from the local buffer 8 to the display 2 (step 33). (It will be appreciated that where there is a sequence of similar blocks (i.e. blocks whose metadata has the value “1”), the state machine controller in effect causes the display controller to reuse (repeat) the first block of the sequence for each successive similar data block.)

If, on the other hand, the value in the bitmap is “0”, this indicates that the data block is not the same as the previous data block, and so it needs to be prefetched from the frame buffer 3 into the local memory 8 for display. In this case the state machine controller 23 causes the display controller to read the data block from the frame buffer 3 in the main memory 4 (step 34) and store it in the local buffer 8 of the display controller (step 35). The new block is then supplied (at the appropriate time) from the local buffer 8 of the display controller 7 to the display device 2 (step 36).

The data block is then displayed (block 37).

This process is then repeated for the next data block to be processed (prefetched into the local memory 8), then the next, and so on.
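
The read-side flow of FIG. 4 (steps 31 to 37) can be sketched in software as below. This is only an illustration of the control flow under simplifying assumptions: the local memory 8 is modelled as a single held block, the frame buffer as a Python list, and the function name `display_frame` is hypothetical.

```python
def display_frame(frame_buffer, similarity_bits):
    """Return (blocks sent to the display, number of frame buffer reads)."""
    local_block = None  # stands in for the block held in local memory 8
    displayed = []
    reads = 0
    for position, bit in enumerate(similarity_bits):  # steps 31-32: read and test the bit
        if bit == 1 and local_block is not None:
            pass  # step 33: reuse the block already in local memory
        else:
            local_block = frame_buffer[position]  # steps 34-35: fetch and store locally
            reads += 1
        displayed.append(local_block)  # steps 36-37: supply the block for display
    return displayed, reads
```

For a bitmap of `[0, 1, 1, 0]` the sketch performs two frame buffer reads instead of four, and the positions marked “1” are never read from the frame buffer at all.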

In the present embodiment, the metadata used by the display controller 7 to determine whether a new block to be processed is the same as a data block already stored in the local buffer 8 is generated by the graphics processor 1 as the tiles that make up the frame are generated. FIG. 5 shows the architecture of the graphics processor 1 that performs this process, and FIG. 6 is a flowchart showing the steps of the metadata generation process.

As shown in FIG. 5, the graphics processor 1 is modified to include, after its tile rendering logic 40, additional data block generation logic and block comparison logic that is used to generate the appropriate metadata to be associated with the data array (frame) in the frame buffer 3.

The block generation logic 41 operates to generate appropriate data blocks from the tiles generated by the tile rendering logic 40. In the present embodiment, the block generation logic generates blocks that correspond to cache lines in the cache memory 8 of the display controller 7. However, as discussed above, data blocks of other sizes and forms are possible and can be generated by the block generation logic 41 if desired.

The block generation logic stores the successive blocks that it generates in the buffer 42. The comparison logic 43 then compares the relevant data blocks stored in the buffer 42 (in this case, the new data block with the immediately preceding data block) and generates the appropriate metadata output bit on the basis of that comparison. To improve memory efficiency, the metadata output bits for multiple blocks are collected and merged in a buffer, and then stored as appropriate in the metadata bitmap 10 in the main memory 4 (written to off-chip memory). (Other arrangements are, of course, possible.) The data blocks are likewise read from the buffer 42 and stored as appropriate in the frame buffer 3.
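
Collecting the per-block metadata bits before writing them out can be pictured as packing them into bytes. The sketch below assumes, purely for illustration, that the first block's bit occupies the least significant bit of the first byte; the text does not state a bit order, and the function names are hypothetical.

```python
def pack_bits(bits):
    """Pack 0/1 values into bytes, first bit in the least significant position."""
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def read_bit(packed: bytes, position: int) -> int:
    """Return the similarity bit for a given block position."""
    return (packed[position // 8] >> (position % 8)) & 1
```

Packing lets one off-chip write carry the metadata for many block positions rather than issuing a write per bit.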

To facilitate this operation, the data blocks that make up the output frame are processed in a particular, predefined order (both when writing to and when reading from the frame buffer). An order that can exploit the spatial coherence between blocks is preferably used.

This process is shown as a flowchart in FIG. 6.

As shown in FIG. 6, the block generation logic 41 generates data blocks (in this case corresponding to cache lines) from the rendered tiles produced by the tile rendering logic 40 (step 51). Each data block is then stored in the buffer 42.

The comparison logic 43 then compares the new data block with the previous data block (which is already stored in the buffer 42) (step 52). In the present embodiment, the comparison logic 43 compares the contents of the data blocks with each other. Other arrangements would be possible. For example, the comparison logic could generate, for each block in question, a signature representing the content of the block, such as a 32-bit CRC, and then compare the signatures of the blocks rather than their actual contents.
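
The signature-based alternative can be illustrated with the CRC-32 available in Python's standard `zlib` module. This is a sketch only: the text mentions a 32-bit CRC as one example but does not name a particular polynomial, and the helper names are hypothetical.

```python
import zlib


def signature(block: bytes) -> int:
    """32-bit signature representing the content of a data block."""
    return zlib.crc32(block) & 0xFFFFFFFF


def same_signature(new_block: bytes, previous_block: bytes) -> bool:
    """Compare two blocks by signature rather than by content."""
    return signature(new_block) == signature(previous_block)
```

Comparing 32-bit signatures is cheaper than comparing whole blocks, but two different blocks can occasionally share a signature, so a match is strong evidence of equality rather than proof.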

The comparison logic then determines whether the new block should be considered similar to the previous block (step 53). In the present embodiment, this assessment is based on how similar the contents of the two blocks being compared are. A threshold is set on the amount of difference in the LSBs of the pixels; if the difference between the contents of the two blocks is less than this threshold, the blocks are determined to be similar, and otherwise they are not.

(This threshold can be varied (e.g. programmed) in use, for example on the basis of the proportion of static to dynamic frame data and/or the power mode in use (e.g. whether a low-power mode is active). It can also be set per application.)
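
One way to picture a programmable difference threshold is shown below. The text does not define the difference measure precisely, so this sketch assumes the sum of absolute per-byte differences between the two blocks; both that measure and the function names are illustrative assumptions.

```python
def block_difference(a: bytes, b: bytes) -> int:
    """Assumed measure: sum of absolute differences between corresponding bytes."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same size")
    return sum(abs(x - y) for x, y in zip(a, b))


def is_similar(a: bytes, b: bytes, threshold: int) -> bool:
    """Blocks are similar when their difference is less than the threshold."""
    return block_difference(a, b) < threshold
```

Under this measure a threshold of 1 accepts only identical blocks, while larger values tolerate small changes in the low-order bits of the pixel data.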

If the comparison logic determines at step 53 that the blocks are different (not similar), it writes the value “0” to the appropriate location in the metadata bitmap 10 (step 54). The new data block itself is written from the buffer 42 to the frame buffer 3 in the main memory 4 (step 55).

If, on the other hand, it is determined at step 53 that the blocks should be considered similar, the comparison logic 43 causes a “1” to be written to the appropriate location in the metadata bitmap 10 (step 56).

Here again, the new block could then simply be written to the frame buffer 3 in the main memory 4, just as when the blocks are considered to be different. However, FIG. 6 shows a preferred arrangement in which an optional “write exclusion” operation can be enabled in the graphics processor 1. As described further below, this write exclusion process operates so that the graphics processor can avoid writing to the data array in the frame buffer 3 blocks that are determined to be similar to one another. Thus, as shown in FIG. 6, if the write exclusion process is enabled (step 57) and the two blocks are considered similar to each other, the new block is not written to the data array in the frame buffer (step 58). (If, on the other hand, the write exclusion process is not enabled at step 57, the new block is written to the frame buffer as normal (step 55).)

The write exclusion process of step 57 thus operates so that when a data block is determined to be the same as the previous data block (that is, the same as a data block that will already have been stored in the frame buffer 3), the new data block is likewise not written to the frame buffer. In this way, the write exclusion process can avoid write traffic for sections of the data array (frame buffer) that are the same as one another. This can give further savings in bandwidth and power consumption for frame buffer operations. If, on the other hand, the data blocks are determined to be different, the new data block is written to the frame buffer just as it would be without the write exclusion process.

In these arrangements, although the data block itself may not be written to the data array, the similarity metadata should still be generated and stored for the block position in question, because the processing device (the display controller in the present embodiment) still needs that information to determine which other block should be processed instead.
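
The write-side flow of FIG. 6 (steps 51 to 58) can be sketched as follows. The sketch assumes an exact-match comparison with the immediately preceding block and models the frame buffer as a Python list; the function name and the `write_exclusion` parameter are hypothetical.

```python
def write_frame(blocks, frame_buffer, write_exclusion=True):
    """Write blocks to the frame buffer and return the similarity bits."""
    bits = []
    previous = None
    for position, block in enumerate(blocks):
        # Steps 52-53: compare with the previous block (exact match assumed here).
        similar = previous is not None and block == previous
        bits.append(1 if similar else 0)  # steps 54 / 56: metadata is always recorded
        if not (similar and write_exclusion):
            frame_buffer[position] = block  # step 55: normal write
        # Otherwise step 58: the write to the frame buffer is skipped.
        previous = block
    return bits
```

Note that a metadata bit is produced for every block position even when the block itself is not written, which is what lets the reader know which block to process instead.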

In a particularly preferred arrangement of these embodiments, to allow for the data block comparison being inexact (identifying blocks as matching when they in fact differ), the system is configured always to write a newly generated data block to the frame buffer periodically, e.g. once per second, for each given data block (data block position). This ensures that a new data block is written to the frame buffer at least periodically for every data block position, and so avoids, for example, an erroneously matched data block being retained in the frame buffer for longer than a given (e.g. desired or selected) period. This can be done, for example, by simply writing out an entire new output data array periodically (e.g. once per second), or by writing new data blocks out to the frame buffer on a rolling basis in a cyclic pattern, so that over time all data block positions are eventually written out afresh.
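
A rolling refresh of this kind could be scheduled as in the sketch below, where each block position is force-written once every `period_frames` frames. The modulo-based schedule, the function names, and the default of 30 frames (roughly once per second at a 30 fps update rate) are assumptions chosen for illustration.

```python
def forced_write_due(position: int, frame_number: int, period_frames: int) -> bool:
    """Rolling refresh: each position is force-written once every period_frames frames."""
    return position % period_frames == frame_number % period_frames


def should_write(similar: bool, position: int, frame_number: int,
                 period_frames: int = 30) -> bool:
    """Write when the block differs, or when its periodic refresh is due."""
    return (not similar) or forced_write_due(position, frame_number, period_frames)
```

Spreading the forced writes across frames avoids the burst of traffic that writing out the whole array in a single frame would cause.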

Various alternatives and modifications to the above arrangements are possible. For example, the output array of data that the graphics processor is generating could also, or instead, comprise other outputs of a graphics processor, such as a graphics texture (e.g. where the render “target” is a texture that the graphics processor is being used to generate, as in a “render to texture” operation) or another surface to which the output of the graphics processor system is written.

It would also be possible to use a more sophisticated metadata arrangement, for example where a data block is compared not only with the immediately preceding data block but also with several data blocks in the output frame (data array). In this case, the metadata (e.g. bitmap entry) associated with each respective block position should indicate not only that the corresponding data block is similar to another data block in the output data array, but also which data block in the output data array it is similar to.

Similarly, the current, completed data block can be compared with multiple data blocks in the data array. Because this can eliminate reads of data blocks that are similar to data blocks at other positions in the data array, it can help to reduce further the number of data blocks that need to be read from main memory for processing.

In a preferred embodiment, a software application (e.g. one that triggers the generation of the data array and/or that uses and/or receives the generated output array) can indicate and control which regions of the output data array are processed in the manner of the present embodiment and, in particular and preferably, can indicate for which regions of the output array the data block comparison process should be performed. This allows the process of the present invention to be “turned off” by the application for regions of the output array that the application “knows” will always be updated.

This can be achieved as desired. In a preferred embodiment, registers are provided that enable/disable data block (e.g. rendered tile) comparison for regions of the output array, and the software application sets the registers accordingly (e.g. via the graphics processor driver).

Although the embodiments of the present invention have been described above with particular reference to graphics processor operation, the applicant recognizes that the principles of the present invention can equally be applied to other systems that process data in the form of blocks in a manner similar to, for example, tile-based graphics processing systems and that read, for example, a frame buffer or a texture. Thus they can be applied, for example, to a host processor that manipulates a frame buffer, a graphics processor that reads a texture, a composition engine that reads images to be composited, or a video processor that reads reference frames for video decoding. The techniques of the embodiments can therefore equally be used, for example, for video processing (since video processing operates on blocks of data analogous to the tiles of graphics processing) and for composite image processing (since, here again, the composition frame buffer is processed as distinct blocks of data).

They can also be used, for example, when processing data (images) generated by a (digital) camera (video or still). In this case, the data from the camera's sensor can be processed by the camera's controller as described above, for example to generate appropriate metadata for the image data written to memory (and, if desired, to control the writing of the image data). The image and metadata stored in this way can then be processed in the manner of the present invention by, for example, a display controller that is to display the images from the camera.

The present embodiment can also be used where there are multiple master devices that each write to the same output data array, e.g. a frame in the frame buffer. This may be the case, for example, where the host processor 9 generates an “overlay” to be displayed over the image generated by the graphics processor 1.

In this case, each device that writes to the output data array can update the similarity metadata accordingly. Alternatively, the metadata for the parts of the output array to which another master writes can be invalidated or cleared (so that those parts of the output array are read out in full to the output device). The latter is necessary where a given master device is unable to update the similarity metadata. It would also be possible, for example, to invalidate (clear) the metadata for the entire output array when another master modifies a relatively large part of the output array (or modifies the output array at all).

Various other preferred and alternative arrangements of the present embodiment are also possible.

For example, the metadata can also be used to manage the storage of data blocks in the local memory 8 of the display controller 7, and in particular as a factor in deciding which data blocks to evict from the local memory 8. For example, the metadata can be used to identify one or more data blocks that will be used repeatedly; that data block (or those data blocks) can then be locked in the local memory of the processing device (for the time being) once written there, so that it is available in the local memory when needed later.

It is also possible to keep a count of the number of times a given data block in the local memory 8 will be used in the near future (e.g. based on metadata prefetched for the portion of the output array being processed), and to evict the data block from the local memory only when that “use” count reaches zero.
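
The use-count idea can be sketched as follows. The input is assumed to be a list, derived from prefetched metadata, giving for each upcoming block position the index of the block that will actually be supplied; the function name and this input format are hypothetical.

```python
from collections import Counter


def plan_evictions(references):
    """Map each block to the position after which its use count reaches zero."""
    remaining = Counter(references)  # how many upcoming uses each block has
    evict_after = {}
    for position, block in enumerate(references):
        remaining[block] -= 1
        if remaining[block] == 0:
            evict_after[block] = position  # safe to evict once this use is done
    return evict_after
```

A block that is referenced again later keeps a non-zero count and so stays resident, even if other blocks are fetched in between.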

From the above it can be seen that the present invention, in its preferred embodiments at least, can help to reduce, for example, the power consumption and memory bandwidth of a display controller.

This is achieved, in the preferred embodiments of the present invention at least, by eliminating unnecessary “main” memory read transactions. This reduces the amount of data read from main memory, and hence significantly reduces system power consumption and memory bandwidth consumption. It can be applied to graphics frame buffer, graphics render-to-texture, video frame buffer and composition frame buffer read transactions, among others.

The power and bandwidth savings obtained when using the present invention can be relatively large. For example, for game and video content with a standard-definition frame buffer, using 32-byte linear blocks and analyzing the previous four blocks (which requires a multi-bit bitmap), the applicant has found that about 17% of read and write transactions can be eliminated. For a high-definition frame buffer, this elimination rate is even higher. For GUI content in a similar configuration, about 80% of frame buffer read and write transactions can be eliminated.

For HD (1920 × 1080 × 24 bpp), with both reads and writes eliminated and assuming a frame display rate of 60 fps (reads), a frame update rate of 30 fps (writes) and 2.4 nJ per 32-bit off-chip transfer, this corresponds to a bandwidth saving of about 90 MB/s and a power saving of 57 mW for game and video content. For GUI content, the savings are 427 MB/s and 268 mW.
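
The figures above can be reproduced with a short calculation, shown here as a sketch. It assumes that “MB” means 2^20 bytes and that the 17% and 80% elimination rates apply to the combined read and write traffic; both are inferences from the numbers rather than statements in the text, and the function name is hypothetical.

```python
def savings(width, height, bytes_per_pixel, read_fps, write_fps,
            eliminated, nj_per_transfer=2.4):
    """Return (bandwidth saved in MiB/s, power saved in mW)."""
    frame_bytes = width * height * bytes_per_pixel
    traffic = frame_bytes * (read_fps + write_fps)  # bytes per second
    saved_bytes = traffic * eliminated
    saved_mib = saved_bytes / 2**20
    transfers = saved_bytes / 4  # one transfer moves 32 bits
    saved_mw = transfers * nj_per_transfer * 1e-9 * 1e3
    return saved_mib, saved_mw


game = savings(1920, 1080, 3, 60, 30, eliminated=0.17)
gui = savings(1920, 1080, 3, 60, 30, eliminated=0.80)
```

Under these assumptions the game and video case comes to roughly 90.8 MiB/s and 57.1 mW, and the GUI case to roughly 427.1 MiB/s and 268.7 mW, in line with the figures quoted.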

As regards the additional overhead arising from the need to store the metadata, for a system in which only the preceding data block is analyzed (i.e. the metadata comprises a single bit per data block position), it has been found that a high-definition frame using data blocks corresponding to 32-byte cache lines requires 32 KB of additional control data for an HD frame occupying 7.9 MB. With data blocks corresponding to 64-byte tile lines, the control data is 16 KB. For data blocks corresponding to 512-byte half tiles it is 2 KB, and for data blocks corresponding to 1024-byte tiles it is 1 KB.
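
The overhead figures follow from one metadata bit per block, as the sketch below shows. It assumes a frame of 1920 × 1080 pixels at 4 bytes per pixel (8,294,400 bytes, about 7.9 MiB), which is an inference from the quoted frame size; the function name is hypothetical.

```python
def metadata_bytes(frame_bytes: int, block_bytes: int) -> int:
    """Bytes needed for a bitmap holding one bit per data block."""
    blocks = -(-frame_bytes // block_bytes)  # ceiling division
    return -(-blocks // 8)


frame = 1920 * 1080 * 4
overheads = {size: metadata_bytes(frame, size) for size in (32, 64, 512, 1024)}
```

The results are 32,400, 16,200, 2,025 and 1,013 bytes respectively, which round to the 32 KB, 16 KB, 2 KB and 1 KB quoted, well under half a percent of the frame size even for the smallest blocks.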

1 Tile-based graphics processor (GPU)
2 Display device
3 Frame buffer
4 Main memory
5 Interconnect
6 Memory controller
7 Display controller
8 Local memory buffer
9 Host CPU
10 Metadata (data block similarity information)
11 Data block
12 Data block
13 Bitmap entry
14 Bitmap entry
20 Bus interface unit
21 Metadata buffer
22 Display formatter and output unit
23 State machine controller
40 Tile rendering logic
41 Block generation logic
42 Buffer
43 Comparison logic

Claims

A method of processing an array of data, in which a processing device processes the array of data by processing successive blocks of data, each representing a particular region of the array of data, and in which a block of data representing a particular region of the array is read from a first memory in which the array of data is stored and is stored in a memory of the processing device before the block of data is processed by the processing device, the method comprising:
determining whether a block of data to be processed for the data array is similar to a block of data already stored in the memory of the processing device, the block of data already stored being a block of data from the same data array as the block of data to be processed and being for a position in the data array that is different from that of the block of data to be processed; and
on the basis of the similarity determination, processing for the block of data to be processed either the block of data already stored in the memory of the processing device, if the block of data to be processed is determined to be similar to that block, or a new block of data from the array of data stored in the first memory, if the block of data to be processed is determined not to be similar to the block of data already stored in the memory of the processing device.

2. The method of claim 1, wherein the steps of determining whether a block of data to be processed for the data array is similar to a block of data already stored in the memory of the processing device and, based on the similarity determination, processing for the block of data to be processed either a block of data already stored in the memory of the processing device or a new block of data from the array of data stored in the first memory comprise:
if the block of data to be processed is determined to be similar to a block of data already stored in the local memory of the processing device, not reading a new block of data from the data array stored in the first memory and storing it in the memory of the processing device, but instead processing the existing block of data in the memory of the processing device as the block of data to be processed by the processing device; and
if the block of data to be processed is determined not to be similar to a block of data already stored in the memory of the processing device, reading a new block of data from the data array stored in the first memory, storing it in the memory of the processing device, and then processing the new block of data as the block of data to be processed by the processing device.
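
The reuse-or-read decision recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function name `process_array`, the single-entry local buffer, and the per-block Boolean flag meaning "similar to the block currently held in local memory" are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def process_array(first_memory: List[bytes],
                  similar_flags: List[bool]) -> Tuple[List[bytes], int]:
    """Process blocks in order; return (processed blocks, reads from first memory).

    Assumption: similar_flags[i] is True when block i is similar to the block
    currently held in the processing device's (single-entry) local memory.
    """
    local_block: Optional[bytes] = None  # stands in for the device's local memory
    reads = 0
    processed: List[bytes] = []
    for index, flagged_similar in enumerate(similar_flags):
        if not (flagged_similar and local_block is not None):
            # Not similar: read a new block from the first memory and store it locally.
            local_block = first_memory[index]
            reads += 1
        # Similar or not, the block now held locally is the one that is processed.
        processed.append(local_block)
    return processed, reads
```

With five blocks `A, A, B, B, B` and flags `False, True, False, True, True`, this sketch reads the first memory only twice while still producing all five blocks for processing.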

3. The method according to claim 1 or 2, wherein the processing device is one of a display controller, a CPU, a video processor, and a graphics processor.

4. The method according to any one of claims 1 to 3, wherein the similarity determination uses similarity information associated with the array of data to determine whether a data block to be processed is similar to a block already stored in the memory of the processing device.

5. The method according to any one of claims 1 to 4, wherein the data array has associated with it similarity information indicating, for each data block of the data array, whether that data block is similar to other data blocks of the data array, and the similarity determination uses the similarity information associated with a data block to be processed to determine whether that data block is similar to a data block already stored in the memory of the processing device.

6. A method of generating metadata for use when processing an array of data stored in memory, the method comprising, for each of one or more blocks of data representing a particular region of the array of data to be processed:
determining whether the block of data should be considered similar to another block of data for the data array, the other block of data being from the same data array as the block of data and being for a position in the data array that is different from that of the block of data;
generating similarity information indicating whether the block of data has been determined to be similar to other blocks of data for the data array; and
storing, in association with said array of data, said similarity information indicating whether said block of data has been determined to be similar to other blocks of data for said data array.

7. The method of claim 6, wherein the step of determining whether the block of data should be considered similar to other blocks of data for the data array comprises comparing part or all of the actual contents of the data blocks to determine whether the data blocks are to be considered similar.
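
As a rough illustration of generating similarity information by comparing block contents, the following Python sketch builds one Boolean per block. The function name `generate_similarity_flags`, the choice of comparing each block only with the immediately preceding block, and the use of exact equality as the similarity test are assumptions for this example; the claims leave the comparison partner and the similarity criterion open.

```python
from typing import List, Optional


def generate_similarity_flags(blocks: List[bytes]) -> List[bool]:
    """Return one flag per block: True if the block equals the preceding block.

    Assumption: "similar" is modeled as byte-for-byte equality with the
    previous block in processing order; other criteria are possible.
    """
    flags: List[bool] = []
    previous: Optional[bytes] = None
    for block in blocks:
        # Compare the actual contents of this block with the previous block.
        flags.append(previous is not None and block == previous)
        previous = block
    return flags
```

The resulting list plays the role of the similarity metadata that would be stored alongside the data array.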

8. A method of processing an array of data, the method comprising:
generating an array of data to be processed;
for each of one or more blocks of data representing a particular region of the array of data to be processed:
determining whether the block of data should be considered similar to another block of data for the data array, the other block of data being from the same data array as the block of data and being for a position in the data array that is different from that of the block of data; and
generating similarity information indicating whether the block of data has been determined to be similar to other blocks of data for the data array;
storing the array of data and its associated generated similarity information;
reading blocks of data, each representing a particular region of the array of data, from the stored array of data, and storing each block of data in a memory of a processing device that processes the data array before the block of data is processed by the processing device;
using the similarity information generated for the data array to determine whether a block of data to be processed for the data array is similar to a block of data already stored in the memory of the processing device; and
based on the similarity determination, for the block of data to be processed, processing the block of data already stored in the memory of the processing device if the block of data to be processed is determined to be similar to it, or processing a new block of data from the array of data stored in the first memory if the block of data to be processed is determined not to be similar to the block of data already stored in the memory of the processing device.

9. The method according to any one of claims 6 to 8, further comprising not writing a data block to the data array in memory if it is determined that the data block should be considered similar to other data blocks for the data array.
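
The write-avoidance idea can be sketched as follows. This is an illustrative Python example under stated assumptions, not the patented implementation: the name `write_array`, the dictionary standing in for the data array in memory, and exact equality with the preceding block as the similarity test are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def write_array(blocks: List[bytes]) -> Tuple[Dict[int, bytes], List[bool]]:
    """Write only blocks not flagged as similar; return (stored blocks, flags).

    Assumption: a block is "similar" when it equals the preceding block, and
    the dict maps block index to the contents actually written to memory.
    """
    stored: Dict[int, bytes] = {}
    flags: List[bool] = []
    previous: Optional[bytes] = None
    for index, block in enumerate(blocks):
        similar = previous is not None and block == previous
        flags.append(similar)
        if not similar:
            # Only blocks that are not similar are written to the data array.
            stored[index] = block
        previous = block
    return stored, flags
```

For blocks `A, A, B`, only indices 0 and 2 are written; the flag for index 1 tells a later reader that the block already held from index 0 can be reused.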

10. The method according to claim 1, wherein the array of data is data representing an image.

11. A method according to any one of the preceding claims, wherein each block of data considered includes a cache line or a 2D partial tile of the data array.

12. A system comprising:
a first memory for storing an array of data to be processed;
a processing device having a local memory, which processes an array of data stored in the first memory by processing successive blocks of data each representing a particular region of the array of data;
a read controller configured to read a block of data representing a particular region of the array of data stored in the first memory and to store the block of data in the local memory of the processing device before the block of data is processed by the processing device; and
a controller configured to determine whether a block of data to be processed for the data array is similar to a block of data already stored in the memory of the processing device, the block of data already stored being a block of data from the same data array as the block of data to be processed and being for a position in the data array that is different from that of the block of data to be processed, and, based on the similarity determination, to cause the processing device to process the block of data already stored in the memory of the processing device if the block of data to be processed is determined to be similar to it, or to process a new block of data from the array of data stored in the first memory if the block of data to be processed is determined not to be similar to the block of data already stored in the memory of the processing device.

13. The system of claim 12, wherein the read controller and the controller are part of the processing device.

14. An apparatus for use in processing an array of data stored in a first memory, the apparatus comprising:
a read controller configured to read a block of data representing a particular region of the array of data stored in the first memory and to store said block of data in a local memory of a processing device that processes the array of data before the block of data is processed by the processing device; and
a control circuit configured to determine whether a block of data to be processed for the data array is similar to a block of data already stored in the memory of the processing device, the block of data already stored being a block of data from the same data array as the block of data to be processed and being for a position in the data array that is different from that of the block of data to be processed, and, based on the similarity determination, to cause the processing device to process the block of data already stored in the memory of the processing device if the block of data to be processed is determined to be similar to it, or to process a new block of data from the array of data stored in the first memory if the block of data to be processed is determined not to be similar to the block of data already stored in the memory of the processing device.

15. The system or apparatus according to any one of claims 12, 13, or 14, wherein the controller is configured to:
if the block of data to be processed is determined to be similar to a block of data already stored in the local memory of the processing device, cause the processing device to process the existing block of data in the memory of the processing device as the block of data to be processed, without causing the read controller to read a new block of data from the data array stored in the first memory and store it in the memory of the processing device; and
if the block of data to be processed is determined not to be similar to a block of data already stored in the memory of the processing device, cause the read controller to read a new block of data from the data array stored in the first memory and store it in the memory of the processing device, and then cause the processing device to process the new block of data as the block of data to be processed.

16. The system or apparatus according to any one of claims 12, 13, 14, or 15, wherein the processing device is one of a display controller, a CPU, a video processor, and a graphics processor.

17. The system or apparatus according to any one of claims 12 to 16, wherein the controller uses similarity information associated with the array of data to determine whether a data block to be processed is similar to a block already stored in the memory of the processing device.

18. The system or apparatus according to any one of claims 12 to 17, wherein the data array has associated with it similarity information indicating, for each data block of the data array, whether the data block is similar to other data blocks of the data array, and the controller uses the associated similarity information for a data block to be processed to determine whether that data block is similar to a data block already stored in the memory of the processing device.

19. A system comprising:
a data processor for generating and processing an array of data;
means for determining, for each of one or more blocks of data each representing a particular region of the array of data, whether the block of data should be considered similar to another block of data for the data array, the other block of data being from the same data array as the block of data and being for a position in the data array different from that of the block of data;
means for generating similarity information indicating whether the block of data has been determined to be similar to other blocks of data for the data array; and
means for storing, in association with said array of data, said similarity information indicating whether a block of data has been determined to be similar to other blocks of data for said data array.

20. The system of claim 19, wherein the means for determining, for each of one or more blocks of data representing a particular region of the array of data, whether the block of data should be considered similar to other blocks of data for the data array, the means for generating similarity information indicating whether the block of data has been determined to be similar to other blocks of data for the data array, and the means for storing, in association with the array of data, the similarity information indicating whether the block of data has been determined to be similar to other blocks of data for the data array are part of the data processor.

21. The system according to claim 19 or 20, wherein the data processor is one of a camera controller, a graphics processor, a CPU, and a video processor.

22. The system according to any one of claims 19, 20, or 21, wherein the means for determining whether the block of data should be considered similar to other blocks of data for the data array comprises means for comparing part or all of the actual contents of the data blocks to determine whether said data blocks are to be considered similar.

23. An apparatus for use in a data processing system in which an array of data generated by the data processing system is read from an output buffer by reading from the output buffer blocks of data each representing a particular region of the array of data, the apparatus comprising:
means for comparing a block of data for the data array with at least one other block of data for the data array, the at least one other block of data being from the same data array as the block of data and being for a position in the data array that is different from that of the block of data, and for generating, based on the comparison, similarity information indicating whether the block of data should be considered similar to other blocks of data for the data array; and
means for storing the similarity information in association with the data array.

24. A data processing system comprising:
a data processor for generating an array of data to be processed;
means for determining, for each of one or more blocks of data each representing a particular region of the array of data, whether the block of data should be considered similar to other blocks of data for the data array;
means for generating similarity information indicating whether the block of data has been determined to be similar to other blocks of data for the data array;
means for storing said array of data and its associated generated similarity information;
a processing device having a local memory, which processes the stored array of data by processing successive blocks of data each representing a particular region of the array of data;
a read controller configured to read a block of data representing a particular region of the array of data from the stored array of data and to store the block of data in the local memory of the processing device before the block of data is processed by the processing device; and
a controller configured to use the similarity information generated for the data array to determine whether a block of data to be processed for the data array is similar to a block of data already stored in the memory of the processing device, the block of data already stored being a block of data from the same data array as the block of data to be processed and being for a position in the data array that is different from that of the block of data to be processed, and, based on the similarity determination, to cause the processing device to process the block of data already stored in the memory of the processing device if the block of data to be processed is determined to be similar to it, or to process a new block of data from the array of data stored in the first memory if the block of data to be processed is determined not to be similar to the block of data already stored in the memory of the processing device.

25. The system or apparatus according to any one of claims 19 to 24, further comprising means for avoiding writing a data block to the data array in memory if it is determined that the data block should be considered similar to other data blocks for the data array.

26. The system or apparatus according to any one of claims 12 to 25, wherein the array of data is data representing an image.

27. A system or apparatus as claimed in any one of claims 12 to 26, wherein each block of data under consideration comprises a cache line or 2D partial tile of the data array.

28. A computer program comprising code for causing a computer to perform all the steps of the method according to any one of claims 1 to 12.