JP2003067360A

JP2003067360A - Method and device for sum of products calculation

Info

Publication number: JP2003067360A
Application number: JP2001254847A
Authority: JP
Inventors: Mitsuhiro Inazumi; 満広稲積
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 2001-08-24
Filing date: 2001-08-24
Publication date: 2003-03-07

Abstract

PROBLEM TO BE SOLVED: To simplify the sum-of-products calculation, which requires a large amount of computation in orthogonal transform processing. SOLUTION: The sum-of-products calculation device comprises a data selection means 2 that selects, from a data storage means 1, the eight pieces of data forming one sum-of-products calculation unit; a zero/non-zero judgment means 3 that judges whether each of the eight selected pieces of data is zero or non-zero; an address transformation means 4a that, based on the positions of the non-zero elements among the eight pieces of data, transforms the data addresses generated by a CPU 6 so that only the non-zero elements are read out consecutively; and an address transformation means 4b that transforms the conversion coefficient addresses generated by the CPU 6 so that the conversion coefficients corresponding to the non-zero elements are read out.

Description

Detailed Description of the Invention

【０００１】[0001]

[Field of the Invention] The present invention relates to a product-sum calculation method and a product-sum calculation device for performing, at high speed, the orthogonal transform processing used in signal processing such as the compression and coding of sound, image, and video information.

【０００２】[0002]

[Description of the Related Art] As information devices have become more sophisticated, orthogonal transform processing, which is indispensable for encoding and compressing information, has grown in importance. In addition, for the sake of the versatility of information devices and to keep pace with rapid technological innovation, there is demand for orthogonal transform processing methods that run on more general-purpose hardware and handle more general data structures.

[0003] For concreteness, the following description takes JPEG processing, a widely used still-image coding technique, as an example. JPEG is an image compression and coding method whose technical elements include the discrete cosine transform (DCT), quantization, run-length coding, and Huffman coding. FIGS. 6 to 8 illustrate these briefly.

[0004] FIG. 6 schematically shows how data blocks A1, A2, A3, ..., which are the units of JPEG processing, are cut out of image data D1. With N and M natural numbers, JPEG processing is applied to data blocks of size 8N × 8M. Each 8N × 8M data block is subsampled down to a final size of 8 × 8, and the JPEG processing itself is always performed on 8 × 8 data blocks. The following description therefore uses an 8 × 8 data block as the example.

[0005] As a concrete example, suppose that the data of a certain data block are as shown in FIG. 7(a). The first step of JPEG processing is to apply DCT processing to these data. DCT processing is a kind of orthogonal transform that converts image data into the frequency domain. Expressed as a formula, DCT processing is written as follows.

【０００６】[0006]

【数１】 [Equation 1]

[0007] In equation (1), P is the image data, C is a conversion coefficient, and i and j indicate the horizontal and vertical pixel positions, respectively. Likewise, m and n indicate the horizontal and vertical frequency components, respectively. B is a bias value used, for convenience of calculation, to convert the pixel data, which are originally unsigned, into signed data. The result S of this calculation is the original image transformed into the frequency domain.
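
For reference, the standard 8 × 8 forward DCT used in JPEG can be written with the symbols defined above as follows. This is a sketch under the assumption that C denotes the usual normalisation factor and that subscripts are ordered (horizontal, vertical); the exact form of equation (1) in the original may differ.

```latex
S_{mn} = \frac{1}{4}\, C_m C_n \sum_{i=0}^{7} \sum_{j=0}^{7}
         \left( P_{ij} - B \right)
         \cos\frac{(2i+1)\, m \pi}{16}\,
         \cos\frac{(2j+1)\, n \pi}{16},
\qquad
C_k = \begin{cases} 1/\sqrt{2}, & k = 0 \\ 1, & k \neq 0 \end{cases}
```

For 8-bit pixel data, B is typically 128, which maps the unsigned range 0 to 255 onto the signed range -128 to 127.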

[0008] This processing converts the data of FIG. 7(a) into the data of FIG. 7(b). When the image is decoded, the original data are restored by applying to these data the inverse DCT processing of equation (2) below.

【０００９】[0009]

【数２】 [Equation 2]

[0010] The next step in JPEG processing is quantization. Human vision is sensitive to low-frequency data and, conversely, insensitive to high-frequency data. Exploiting this characteristic, the data in the low-frequency region are represented on a finer scale and the data in the high-frequency region on a coarser scale, which reduces the overall amount of data. This operation is called scalar quantization.

[0011] If, for example, the quantization scale of FIG. 7(d) is used, the data of FIG. 7(b) are quantized into the data of FIG. 7(c). As FIG. 7(c) makes clear, the number of zero elements has increased.
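
The effect of scalar quantization on the number of zero elements can be sketched as follows. The coefficient and scale values are hypothetical (they are not the values of FIG. 7), and the rounding rule is an assumption; the block is only meant to show how dividing by a coarser scale turns small coefficients into zeros.

```python
def round_half_away(x):
    """Round to the nearest integer, with ties going away from zero."""
    return int(x + 0.5) if x >= 0 else -int(-x + 0.5)


def quantize(block, scale):
    """Divide each coefficient by the matching scale entry and round."""
    return [[round_half_away(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(block, scale)]


def count_zeros(block):
    """Number of elements that are exactly zero."""
    return sum(v == 0 for row in block for v in row)


# Hypothetical 2 x 2 excerpt of a coefficient block and its quantization scale.
coeffs = [[-415, -30],
          [4, -2]]
scale = [[16, 11],
         [12, 14]]

quantized = quantize(coeffs, scale)
print(quantized, count_zeros(coeffs), count_zeros(quantized))
```

The small coefficients in the second row fall below half a quantization step and become zero, which is what makes the later zero-skipping worthwhile.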

[0012] The next step in JPEG processing is run-length coding, which FIGS. 8 and 9 illustrate. First, the 64 elements of the 8 × 8 block are rearranged, in the zigzag scan order shown in FIG. 8(a), into a single sequence of data as shown in FIG. 8(b). As a result, the data on the low-frequency side, in a dimensional sense, come toward the beginning of the sequence, and the data further along lie increasingly on the high-frequency side.
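
A zigzag scan order of this kind can be generated as in the sketch below. It assumes the standard JPEG zigzag pattern, which FIG. 8(a) is presumed (not confirmed) to follow; `zigzag_order` and `zigzag_scan` are illustrative names.

```python
def zigzag_order(n=8):
    """Return the (row, column) pairs of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + column indexes one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        diagonal = [(r, s - r) for r in rows]  # top-right to bottom-left
        if s % 2 == 0:
            diagonal.reverse()  # even diagonals run bottom-left to top-right
        order.extend(diagonal)
    return order


def zigzag_scan(block):
    """Flatten a square block into a single sequence in zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


order = zigzag_order()
print(order[:6])
```

Because the sum row + column grows along the sequence, elements near the upper-left (low-frequency) corner come first and the high-frequency elements come last.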

[0013] FIG. 9 shows an example of rearranging the data of FIG. 7(c): the data of FIG. 9(a) are rearranged as shown in FIG. 9(b). Actual run-length coding further encodes the runs of zeros in this sequence. For example, as shown in FIG. 9(c), when all the data from a certain position onward are zero, that entire run of zeros is replaced with a single end-of-data symbol (End Of Block, abbreviated EOB). The other zeros are likewise replaced with symbols, but this is omitted here because it is not central to the present invention.
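
The replacement of the trailing run of zeros by a single EOB symbol can be sketched as follows. The marker value and the function name are hypothetical, and the encoding of the remaining zeros is left out, as in the text.

```python
EOB = "EOB"  # hypothetical marker standing in for the end-of-block symbol


def truncate_with_eob(sequence):
    """Replace the trailing run of zeros, if any, with one EOB marker."""
    last = len(sequence)
    while last > 0 and sequence[last - 1] == 0:
        last -= 1
    if last == len(sequence):
        return list(sequence)  # no trailing zeros, nothing to replace
    return list(sequence[:last]) + [EOB]


print(truncate_with_eob([5, 0, 3, 0, 0, 0]))
```

Zeros that are followed by a non-zero value, such as the second element above, are untouched by this step.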

[0014] The next step in JPEG is Huffman coding, which is also omitted because it is not central to the present invention.

[0015] Within JPEG processing as a whole, the DCT processing in particular requires a very large amount of computation. Speeding up this part therefore speeds up JPEG processing as a whole. The following describes this speedup, taking the DCT calculation as an example.

[0016] The DCT calculation is given by equation (1) above, but it is executed as one-dimensional processing in two stages, as shown below.

【００１７】[0017]

【数３】 [Equation 3]

【００１８】[0018]

【数４】 [Equation 4]

[0019] To make this calculation easier to follow, the conversion coefficients are defined as follows.

【００２０】[0020]

【数５】 [Equation 5]

[0021] Written in matrix form, the calculation becomes the following.

【００２２】[0022]

【数６】 [Equation 6]

【００２３】[0023]

【数７】 [Equation 7]

[0024] Schematically, as shown in FIGS. 10(a) and 10(b), this means that the row-direction processing and the column-direction processing (the processing directions are indicated by arrows) are executed one after the other, starting with either the rows or the columns. The important point is that there is freedom in the order in which equations (3) and (4) are executed. In principle, apart from errors caused by performing the calculations at finite precision, the final result does not depend on this execution order.
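
The independence from execution order can be checked numerically with the sketch below. It assumes an orthonormal DCT-II matrix as the conversion coefficients, which may differ from the original equations by a constant factor, and it uses arbitrary test data.

```python
import math

N = 8


def dct_matrix(n=N):
    """Orthonormal DCT-II matrix; row m holds the coefficients of frequency m."""
    return [[math.sqrt((1 if m == 0 else 2) / n)
             * math.cos((2 * i + 1) * m * math.pi / (2 * n))
             for i in range(n)]
            for m in range(n)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))]
            for r in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


C = dct_matrix()
Ct = transpose(C)
X = [[(3 * r + 5 * c) % 17 - 8 for c in range(N)] for r in range(N)]  # arbitrary data

rows_first = matmul(C, matmul(X, Ct))     # row-direction pass, then column pass
columns_first = matmul(matmul(C, X), Ct)  # column-direction pass, then row pass

max_diff = max(abs(a - b)
               for row_a, row_b in zip(rows_first, columns_first)
               for a, b in zip(row_a, row_b))
print(max_diff)
```

The two results agree up to floating-point rounding, which is the finite-precision caveat stated above.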

[0025] Speeding up the DCT calculation is equivalent to speeding up this matrix calculation. The conventional methods used for this purpose include the following.

[0026] The method used in Japanese Patent Laid-Open No. 4-70060 skips the DCT calculation when all the data in a data block are zero.

[0027] As is clear from the equations above, the calculation results are all zero under this condition, so the calculation can be omitted. This condition is especially likely to hold in what is called the progressive type.

[0028] The method used in Japanese Patent Laid-Open No. 4-137975 targets hardware capable of four-way parallel operation and skips the calculation when all four elements are zero.

[0029] The data used in the DCT calculation often have 16-bit precision, so hardware with 64-bit arithmetic registers can process four pieces of data simultaneously. This parallel operation speeds up the processing, and a further speedup is obtained by omitting the calculation when the data are all zero, since the result is then trivial. This method is also described in Japanese Patent Laid-Open No. 4-200079.

[0030] The method used in Japanese Patent Laid-Open No. 4-220081 is similar to that of Japanese Patent Laid-Open No. 4-70060 above, but handles the case in which the data in a data block consist only of the DC component, that is, the (0, 0) element. In this case every value of the DCT calculation result is the same, so the result is obtained by calculating a single value.
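
The DC-only shortcut can be illustrated as below. The sketch assumes an orthonormal inverse DCT and ignores the bias B; under that normalisation every output equals the DC coefficient divided by 8. The function name and the sample value 80.0 are illustrative.

```python
import math


def idct_2d(S, n=8):
    """Direct (slow) orthonormal 2-D inverse DCT of an n x n coefficient block."""
    def basis(k, x):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        return scale * math.cos((2 * x + 1) * k * math.pi / (2 * n))

    return [[sum(S[m][k] * basis(m, i) * basis(k, j)
                 for m in range(n) for k in range(n))
             for j in range(n)]
            for i in range(n)]


# Block whose only non-zero coefficient is the (0, 0) element.
dc_only = [[0.0] * 8 for _ in range(8)]
dc_only[0][0] = 80.0

full_result = idct_2d(dc_only)   # 64 outputs, each a 64-term sum
shortcut = dc_only[0][0] / 8     # one division replaces all of them
print(shortcut, full_result[3][5])
```

The all-zero case of Japanese Patent Laid-Open No. 4-70060 is the same shortcut with a DC coefficient of zero.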

[0031] The method used in Japanese Patent Laid-Open No. 10-63646 provides a plurality of calculation means corresponding to different states of the data and speeds up the calculation by selecting the most suitable of them according to the position of the EOB.

[0032] FIG. 11 illustrates this briefly. Suppose that the EOB appears at the position shown in FIG. 11(a). As explained earlier, all the data after it are zero; conversely, the data before it are likely to be non-zero. If the positions of non-zero data are shown filled in and the positions of zero data are left unfilled, the distribution of zero and non-zero data when the EOB appears at a given point is therefore likely to resemble FIG. 11(b). By preparing a plurality of calculation means matched to such biases in the data and selecting the most appropriate one, the DCT calculation can be sped up.

[0033] Japanese Patent Laid-Open No. 10-322699 is based on a similar idea, but divides the data region into two regions in advance: in the low-frequency region, where the data are likely to be non-zero, it always uses a fast DCT algorithm, and in the remaining region it performs an ordinary DCT. For example, as shown in FIG. 12(a), the data region is divided into four, and the upper-left region Z1 is assumed always to contain data; as shown in FIG. 12(b), the whole of that upper-left region is processed, while in the other regions zero/non-zero judgment and processing are performed individually.

[0034] By fixing the processing to some extent, this reduces the load of the condition judgment that the preceding conventional example required.

[0035] Japanese Patent Laid-Open No. 11-41601 uses all of the techniques described above: four-way parallel operation, a plurality of calculation means, and condition judgment based on the position of the EOB.

【００３６】[0036]

[Problems to Be Solved by the Invention] Each of the conventional examples described above is effective in its own way, but the following problems remain.

[0037] Japanese Patent Laid-Open Nos. 4-70060 and 4-220081 describe speedups that apply only when the data are in a special state, and they are less effective under more general conditions.

[0038] Methods based on four-way parallel hardware, such as those described in Japanese Patent Laid-Open Nos. 4-137975 and 4-200079, impose a heavy load for aligning and decomposing the four pieces of data and for judging whether the four pieces of data are zero or non-zero. Moreover, although four-way parallel processing is highly effective, it is of no use on hardware that lacks this capability.

[0039] As a concrete example, assume the data of FIG. 13(a), in which the filled-in cells are the non-zero elements. With four-way parallel operation, the calculation must be performed for the elements at the filled-in positions of FIG. 13(b). Thanks to the parallel operation, the calculation time presumably does not increase much, but neither is the processing any faster than simple processing. In addition, the zero/non-zero judgment of the data and the aligning and decomposing of the data are required as extra processing.

[0040] Japanese Patent Laid-Open Nos. 10-63646 and 10-322699 provide a plurality of calculation means, which makes the processing more complex. Furthermore, condition judgment based on the EOB is not always effective. For example, assume the data of FIG. 14(a), which are the same as above. Judgment based on the EOB then has to assume data like those of FIG. 14(b), so highly redundant calculations must be performed. The same applies to Japanese Patent Laid-Open No. 11-41601.

[0041] As described above, the conventional methods for speeding up orthogonal transform processing either handle only special data or assume a specific hardware structure, and some of them require a very complicated structure.

[0042] An object of the present invention is to solve the above problems and to provide a product-sum calculation method and a product-sum calculation device for performing, at high speed, the orthogonal transform processing used in signal processing such as the compression and coding of sound, image, and video information.

【００４３】[0043]

[Means for Solving the Problems] To achieve the above object, the product-sum calculation method of the present invention is a method in which the N pieces of data constituting a product-sum operation unit are read out sequentially in a predetermined address order and subjected to a product-sum operation, wherein each of the N pieces of data is judged to be zero or non-zero; based on information indicating the positions of the non-zero elements among the N pieces of data, the addresses are converted so that only the non-zero elements can be read out consecutively; and the product-sum operation is performed sequentially on the data read out at the converted addresses.

[0044] In this product-sum calculation method, the address conversion based on the positions of the non-zero elements among the N pieces of data is also applied to the addresses for reading out the conversion coefficients needed for the product-sum operation, and the conversion coefficients corresponding to the non-zero elements are read out at the converted addresses.

[0045] The product-sum calculation device of the present invention is a device in which the N pieces of data constituting a product-sum operation unit are read out sequentially in a predetermined address order and subjected to a product-sum operation, the device comprising: a data storage means in which the data to be processed are stored; a data selection means that cuts out, from the data to be processed stored in the data storage means, the N pieces of data forming a product-sum operation unit; a zero/non-zero judgment means that judges whether each of the N pieces of data cut out by the data selection means is zero or non-zero; and an address conversion means that, based on information indicating the positions of the non-zero elements in the zero/non-zero judgment result, converts the addresses so that only the non-zero elements can be read out consecutively. A product-sum operation means then performs the product-sum operation on the data read out at the addresses converted by the address conversion means.

[0046] In this product-sum calculation device, the address conversion means, which performs address conversion based on the positions of the non-zero elements among the N pieces of data, comprises a data address conversion means that converts the addresses for sequentially reading out the N pieces of data and a conversion coefficient address conversion means that converts the addresses for reading out the conversion coefficients. The data address conversion means converts the addresses for reading out the data so that only the non-zero elements can be read out consecutively, and the conversion coefficient address conversion means converts the addresses for reading out the conversion coefficients so that the conversion coefficients corresponding to the non-zero elements can be read out.

[0047] Thus, in the present invention, each of the N pieces of data constituting a product-sum operation unit is judged to be zero or non-zero; based on information indicating the positions of the non-zero elements among the N pieces of data, the addresses generated by a product-sum operation processing means such as a CPU or DSP (Digital Signal Processor) are converted so that only the non-zero elements can be read out consecutively; and the product-sum operation is performed sequentially on the data read out in the order of the converted addresses.

[0048] In other words, the consecutive addresses generated by the CPU, DSP, or the like (the addresses for accessing the N pieces of data cut out as a product-sum operation unit in order, for example 1, 2, 3, ... starting from data position 0) are converted into addresses that access only the non-zero data in sequence, and the data are read out in the order of the converted addresses and subjected to the product-sum operation.

[0049] As a result, the CPU, DSP, or the like simply reads out the data to be processed at consecutive addresses and performs the product-sum operation, while in practice only the non-zero elements, which are sparse and scattered, are read out one after another. Therefore, even if the non-zero data to be multiplied and accumulated are sparsely scattered, those sparse non-zero elements can be read out consecutively, so the product-sum operations accompanying orthogonal transform processing can be performed efficiently and the amount of computation can be reduced significantly.

[0050] Because the same address conversion is also applied to the addresses for reading out the conversion coefficients needed for the product-sum operation, only the conversion coefficients corresponding to the sparsely distributed non-zero data are accessed in sequence.
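
The idea of reading only the non-zero data, together with the matching conversion coefficients, can be modelled in software as below. This is a minimal sketch with hypothetical data and coefficient values; it models the effect of the address conversion with an explicit position list rather than with the hardware means described here.

```python
def full_product_sum(data, coeffs):
    """Multiply-accumulate over every element, zeros included."""
    return sum(d * c for d, c in zip(data, coeffs))


def sparse_product_sum(data, coeffs):
    """Multiply-accumulate over the non-zero elements only.

    Returns the sum and the number of multiply-accumulate steps performed.
    """
    positions = [k for k, d in enumerate(data) if d != 0]
    total = 0
    for k in positions:
        # The same position selects both the datum and its coefficient.
        total += data[k] * coeffs[k]
    return total, len(positions)


data = [0, 12, -3, 0, 7, 0, 0, 1]      # hypothetical unit, non-zero at 1, 2, 4, 7
coeffs = [3, -1, 4, 1, -5, 9, 2, -6]   # hypothetical conversion coefficients

print(full_product_sum(data, coeffs), sparse_product_sum(data, coeffs))
```

Both functions give the same sum, but the sparse version needs four multiply-accumulate steps instead of eight for this unit.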

【００５１】[0051]

[Embodiments of the Invention] Embodiments of the present invention are described below. The description of this embodiment covers both the product-sum calculation method and the product-sum calculation device of the present invention.

[0052] The product-sum calculation method and product-sum calculation device of the present invention are not limited to any particular orthogonal transform, but for concreteness this embodiment takes as an example their application to JPEG processing, a widely used still-image coding technique.

[0053] The basic idea of the present invention exploits the fact that a general CPU or DSP performs product-sum operations inefficiently when the data to be processed are arranged sparsely, but efficiently when the data are arranged densely.

[0054] For this reason, the present invention performs address conversion that depends on the positions of the sparsely arranged data, so that, to the software running on a product-sum operation processing means such as a CPU or DSP, the data appear to be densely distributed.

[0055] The present invention can be broadly divided into two processes, referred to as the first process and the second process.

[0056] The first process is the non-zero element position judgment: the N pieces of data constituting one product-sum operation unit (N = 8 here, since JPEG is the example) are cut out from the data storage means in which the data to be processed are stored, and the positions at which non-zero elements exist within that product-sum operation unit are determined.

[0057] The second process is the address conversion: the addresses generated by the CPU or DSP are converted based on the non-zero element positions determined in the first process.

[0058] The addresses generated by the CPU or DSP here are of two kinds: the addresses for sequentially reading out the eight pieces of data (called data addresses) and the addresses for reading out the conversion coefficient corresponding to each piece of data (called conversion coefficient addresses). Both the data addresses and the conversion coefficient addresses must be converted, but because the two conversions work in the same way, the processing of the data addresses is described first.

[0059] FIG. 1 illustrates the first process and FIG. 2 the second process. In FIG. 1, the data selection means 2 cuts out, from the data stored in the data storage means 1, the eight pieces of data forming a product-sum operation unit (as noted above, JPEG is the example here, so one product-sum operation unit consists of eight pieces of data, whose data positions are denoted "0" to "7"), and the zero/non-zero judgment means 3 judges whether each of the eight pieces of data is zero or non-zero.

[0060] Here the cells filled in black are the non-zero elements. If non-zero data are represented by "1" and zero data by "0", the zero/non-zero judgment result for the eight pieces of data cut out is "10010110", which is "150" in decimal. The address conversion is performed based on this information, the zero/non-zero judgment result of the zero/non-zero judgment means, which indicates the positions of the non-zero elements.
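
The zero/non-zero judgment result can be computed as a bit mask, as in the sketch below. It assumes that data position 0 maps to the least significant bit, which is the convention under which "10010110" (150) corresponds to non-zero elements at positions 1, 2, 4, and 7; the data values themselves are hypothetical.

```python
def nonzero_mask(unit):
    """Pack the zero/non-zero flags of a unit into one integer.

    Bit p of the result is 1 when unit[p] is non-zero (position 0 is the
    least significant bit, an assumed convention).
    """
    mask = 0
    for position, value in enumerate(unit):
        if value != 0:
            mask |= 1 << position
    return mask


unit = [0, 12, -3, 0, 7, 0, 0, 1]   # hypothetical data, non-zero at 1, 2, 4, 7
mask = nonzero_mask(unit)
print(format(mask, "08b"), mask)
```

Since 2 + 4 + 16 + 128 = 150, the mask for this unit reproduces the value used in the text.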

[0061] First, using this information indicating the positions of the non-zero elements, the address conversion means performs addressing mode setting. This addressing mode setting uses the position "150", among the positions "0" to "255" of the addressing mode selection means 41, as a key to obtain the addressing data held in the addressing data storage means 42.

[0062] The addressing data storage means 42 holds, for each of the positions "0" to "255" of the addressing mode selection means 41, the corresponding data length and data positions. For example, for "150" of the addressing mode selection means 41, the data length is "4" and the data positions are "1, 2, 4, 7". That is, "150" indicates that four of the eight pieces of data are non-zero and that, among data positions "0" to "7", those four lie at positions "1", "2", "4", and "7".

[0063] Incidentally, "0" of the addressing mode selection means 41 means that there are no non-zero elements and hence no data positions. "255" of the addressing mode selection means 41 indicates that all eight pieces of data are non-zero and that they lie at data positions "0", "1", "2", "3", "4", "5", "6", and "7".
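
The contents of such a 256-entry table can be generated as in the following sketch. The tuple layout (data length, list of data positions) and the function name are assumptions made for illustration; the text does not specify how the addressing data storage means 42 encodes its entries.

```python
def build_addressing_table(n=8):
    """Return one (data_length, data_positions) entry per addressing mode.

    Mode k lists the positions of the 1 bits of k, lowest position first.
    """
    table = []
    for mode in range(1 << n):
        positions = [p for p in range(n) if (mode >> p) & 1]
        table.append((len(positions), positions))
    return table


table = build_addressing_table()
print(table[150], table[0], table[255])
```

Because the table depends only on N, it can be built once and reused for every product-sum operation unit.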

[0064] In this way, the number of non-zero elements and their data positions are set for the eight pieces of data cut out as one product-sum operation unit. Address conversion is then performed based on these settings, as shown in FIG. 2.

【００６５】まず、ベースアドレス記憶手段４３に記憶
されているベースアドレス（積和演算単位として切り出
された８個のデータに対し、最初のアドレスとして設定
されるアドレス）を読み出して、読み出されたベースア
ドレス（０ｘ１０００とする）を、積和演算処理手段と
してのＣＰＵやＤＳＰが、積和演算単位として切り出さ
れた８個のデータを順に読み出すように生成した連続的
なアドレス（これをここでは仮想アドレスと呼び、０ｘ
１０００，０ｘ１００１，０ｘ１００２，…で表す）か
ら引き算する。なお、このＣＰＵやＤＳＰから出力され
る仮想アドレスは仮想アドレス入力手段４４に入力され
ている。First, the base address stored in the base address storage means 43 (the address set as the first address of the eight pieces of data cut out as a product-sum operation unit) is read out. This base address (assumed here to be 0x1000) is subtracted from each of the consecutive addresses that the CPU or DSP serving as the product-sum operation processing means generates in order to read the eight pieces of data in sequence (these are called virtual addresses here and are written 0x1000, 0x1001, 0x1002, ...). The virtual addresses output from the CPU or DSP are input to the virtual address input means 44.

【００６６】たとえば、ベースアドレス０ｘ１０００を
仮想アドレス０ｘ１０００から引き算すると「０」であ
り、その「０」をkeyとして、図1で設定されたアドレッ
シングデータ記憶手段４２のアドレッシングデータ（こ
の場合、データ長が「４」でそのデータ位置が「１，
２，４，７」）を見ると、データ位置として「１」が書
き込まれていて、この「１」をベースアドレス０ｘ１０
００に加算して、物理アドレスとして０ｘ１００１を得
る。この物理アドレス０ｘ１００１は物理アドレス記憶
手段４５に記憶される。For example, subtracting the base address 0x1000 from the virtual address 0x1000 gives "0". Using this "0" as a key into the addressing data of the addressing data storage means 42 set in FIG. 1 (here, data length "4" and data positions "1, 2, 4, 7"), the data position "1" is found. Adding this "1" to the base address 0x1000 gives the physical address 0x1001, which is stored in the physical address storage means 45.
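The lookup in this paragraph can be restated compactly. The symbols below are notation introduced here for illustration and do not appear in the patent: $b$ is the base address, $v$ a virtual address, $d$ the list of data positions, $L$ the data length, and $p$ the resulting physical address.

```latex
p(v) = b + d[\,v - b\,], \qquad 0 \le v - b < L
```

With $b = \mathtt{0x1000}$, $d = (1, 2, 4, 7)$ and $L = 4$, the case above is $p(\mathtt{0x1000}) = \mathtt{0x1000} + d[0] = \mathtt{0x1001}$.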

【００６７】同様に、ベースアドレス０ｘ１０００を仮
想アドレス０ｘ１００１から引き算すると「１」であ
り、その「１」をkeyとして、図１で設定されたアドレ
ッシングデータ記憶手段４２のアドレッシングデータ
（同じく、データ長が「４」でそのデータ位置が「１，
２，４，７」）を見ると、この場合、データ位置として
「２」が書き込まれていて、この「２」をベースアドレ
ス０ｘ１０００に加算して、物理アドレスとして０ｘ１
００２を得る。この物理アドレス０ｘ１００２は物理ア
ドレス記憶手段４５に記憶される。Similarly, subtracting the base address 0x1000 from the virtual address 0x1001 gives "1". Using this "1" as a key into the addressing data of the addressing data storage means 42 set in FIG. 1 (again, data length "4" and data positions "1, 2, 4, 7"), the data position "2" is found. Adding this "2" to the base address 0x1000 gives the physical address 0x1002, which is stored in the physical address storage means 45.

【００６８】また、ベースアドレス０ｘ１０００を仮想
アドレス０ｘ１００２から引き算すると「２」であり、
その「２」をkeyとして、図1で設定されたアドレッシン
グデータ記憶手段４２のアドレッシングデータ（同じ
く、データ長が「４」でそのデータ位置が「１，２，
４，７」）を見ると、この場合、データ位置として
「４」が書き込まれていて、この「４」をベースアドレ
ス０ｘ１０００に加算して、物理アドレスとして０ｘ１
００４を得る。この物理アドレス０ｘ１００４は物理ア
ドレス記憶手段４５に記憶される。Likewise, subtracting the base address 0x1000 from the virtual address 0x1002 gives "2". Using this "2" as a key into the addressing data of the addressing data storage means 42 set in FIG. 1 (again, data length "4" and data positions "1, 2, 4, 7"), the data position "4" is found. Adding this "4" to the base address 0x1000 gives the physical address 0x1004, which is stored in the physical address storage means 45.

【００６９】さらに、ベースアドレス０ｘ１０００を仮
想アドレス０ｘ１００３から引き算すると「３」であ
り、その「３」をkeyとして、図1で設定されたアドレッ
シングデータ記憶手段４２のアドレッシングデータ（同
じく、データ長が「４」でそのデータ位置が「１，２，
４，７」）を見ると、この場合、データ位置として
「７」が書き込まれていて、この「７」をベースアドレ
ス０ｘ１０００に加算して、物理アドレスとして０ｘ１
００７を得る。この物理アドレス０ｘ１００７は物理ア
ドレス記憶手段４５に記憶される。Further, subtracting the base address 0x1000 from the virtual address 0x1003 gives "3". Using this "3" as a key into the addressing data of the addressing data storage means 42 set in FIG. 1 (again, data length "4" and data positions "1, 2, 4, 7"), the data position "7" is found. Adding this "7" to the base address 0x1000 gives the physical address 0x1007, which is stored in the physical address storage means 45.

【００７０】このような処理によって、ＣＰＵやＤＳＰ
が生成した仮想アドレス０ｘ１０００，０ｘ１００１，
０ｘ１００２，０ｘ１００３に対する物理アドレス０ｘ
１００１，０ｘ１００２，０ｘ１００４，０ｘ１００７
が得られる。Through this processing, the physical addresses 0x1001, 0x1002, 0x1004, and 0x1007 are obtained for the virtual addresses 0x1000, 0x1001, 0x1002, and 0x1003 generated by the CPU or DSP.

【００７１】この物理アドレス０ｘ１００１，０ｘ１０
０２，０ｘ１００４，０ｘ１００７は、図１に示すデー
タ選択手段２によって切り出された積和演算単位の８個
のデータに対し、それぞれ非零要素に対応する位置を指
し示すアドレスとなる。These physical addresses 0x1001, 0x1002, 0x1004, and 0x1007 point to the positions of the non-zero elements among the eight pieces of data of the product-sum operation unit cut out by the data selection means 2 shown in FIG. 1.
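The virtual-to-physical conversion walked through above can be sketched in a few lines. This is only an illustrative software model of what the text describes as the address conversion means. The function name `translate` and the range check are assumptions added here.

```python
def translate(virtual_addr, base_addr, positions):
    """Map a consecutive (virtual) address to the address of a non-zero element.

    offset = virtual - base selects an entry in the data-position list, and
    the selected position is added back to the base address.
    """
    offset = virtual_addr - base_addr
    if not 0 <= offset < len(positions):
        raise IndexError("virtual address outside the non-zero data range")
    return base_addr + positions[offset]


BASE = 0x1000
POSITIONS = [1, 2, 4, 7]  # addressing data for mode 150

physical = [translate(v, BASE, POSITIONS) for v in range(BASE, BASE + len(POSITIONS))]
print([hex(p) for p in physical])  # ['0x1001', '0x1002', '0x1004', '0x1007']
```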

【００７２】このように、本発明によれば、積和演算す
べき非零要素のデータが疎な状態で散らばって並んでい
るような場合、上述したようなアドレス変換を行うこと
によって、ＣＰＵやＤＳＰ側からみたとき、疎な状態で
散らばっている非零要素のデータがあたかも密な状態で
連続して並んでいるかのように見えることになる。Thus, according to the present invention, when the non-zero data to be subjected to the product-sum operation are scattered sparsely, the address conversion described above makes those sparsely scattered non-zero data appear, from the viewpoint of the CPU or DSP, as if they were arranged consecutively in a dense state.

【００７３】これは、前述したように、一般的なＣＰＵ
やＤＳＰにおいては、積和演算すべきデータが疎な状態
で並んで存在する場合の積和演算効率は悪いが、密な状
態に並んで存在するデータについては効率よく積和演算
を行うことができることを利用したものであり、このよ
うな連続したデータについて積和演算を行うようなプロ
グラムの設定されたＣＰＵやＤＳＰにとっては極めて都
合のよいものとなる。As described earlier, this exploits the fact that a general CPU or DSP performs product-sum operations inefficiently when the data to be operated on are arranged sparsely, but efficiently when the data are arranged densely. It is therefore extremely convenient for a CPU or DSP programmed to perform product-sum operations on such consecutive data.

【００７４】すなわち、積和演算単位として切り出され
た８個のデータのうち、積和演算すべき非零要素のデー
タが疎な状態でしか存在しない場合でも、上述したアド
レス変換処理を行うことによって、ＣＰＵやＤＳＰは連
続したアドレスによって演算すべきデータを読み出して
積和演算する動作を行うだけで、実際には、疎な状態で
しか存在しない非零要素のみを次々と連続して読み出す
動作を行うことになる。それによって、効率よく積和演
算が行え、その演算量を大幅に削減することができ、高
速な積和演算が可能となる。That is, even when the non-zero data to be operated on exist only sparsely among the eight pieces of data cut out as a product-sum operation unit, the address conversion described above means that the CPU or DSP merely reads the data to be operated on at consecutive addresses and performs the product-sum operation, while in effect reading only the sparsely located non-zero elements one after another. The product-sum operation is thereby performed efficiently, the amount of computation is greatly reduced, and high-speed product-sum operation becomes possible.

【００７５】図3は以上説明した処理の手順を示すフロ
ーチャートである。これまでの説明と重複するが、この
図3を用いてその処理手順を再度説明すると、まず、積
和演算単位として切り出されたＮ個のデータ（この実施
の形態では、ＪＰＥＧを例にしているので8個のデー
タ）を切り出し（ステップｓ１）、切り出された８個の
データの個々のデータが零であるかどうかの判定を行う
（ステップｓ２）。FIG. 3 is a flowchart showing the procedure of the processing described above. Although this overlaps with the preceding description, the procedure is explained again with reference to FIG. 3. First, the N pieces of data forming a product-sum operation unit (eight pieces in this embodiment, since JPEG is used as the example) are cut out (step s1), and each of the eight cut-out pieces of data is judged to be zero or not (step s2).

【００７６】そして、その零・非零判定処理による零・
非零判定結果によって、アドレッシングモード選択手段
４１とアドレッシングデータ記憶手段４２を用いて、ア
ドレッシングモード設定を行う（ステップｓ３）。Then, according to the result of this zero/non-zero determination, the addressing mode is set using the addressing mode selection means 41 and the addressing data storage means 42 (step s3).

【００７７】これは、零・非零判定結果によって得られ
る非零要素の位置を示す情報に依存したアドレッシング
データをアドレッシングデータ記憶手段４２から取得す
ることであり、上述した図１の例においては、切り出さ
れた８個のデータの零・非零判定結果は、“１００１０
１１０”であって、これを10進数で表した「１５０」を
keyとして、アドレッシングモード選択手段４１がアド
レッシングデータ記憶手段４２からデータ長が「４」で
そのデータ位置が「１，２，４，７」のアドレッシング
データを取得してそれを設定する。This means acquiring, from the addressing data storage means 42, the addressing data that depends on the information indicating the positions of the non-zero elements obtained from the zero/non-zero determination result. In the example of FIG. 1 described above, the zero/non-zero determination result for the eight cut-out pieces of data is "10010110". Using "150", its decimal representation, as a key, the addressing mode selection means 41 acquires from the addressing data storage means 42 the addressing data with data length "4" and data positions "1, 2, 4, 7", and sets it.
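The step from the zero/non-zero determination to the key "150" can be sketched as follows. This is a minimal illustration under stated assumptions: data position k is taken to set bit k of the key, counted from the least significant bit, which reproduces the bit string "10010110" from the text, and the sample values in `unit` are invented for the example.

```python
def nonzero_mask(unit):
    """Pack the zero/non-zero judgment of each element into one integer key.

    Bit k of the result is 1 when unit[k] is non-zero (LSB-first bit order
    is an assumption inferred from the 150 <-> positions 1, 2, 4, 7 example).
    """
    mask = 0
    for position, value in enumerate(unit):
        if value != 0:
            mask |= 1 << position
    return mask


unit = [0, 5, -3, 0, 2, 0, 0, 1]  # hypothetical data: non-zero at 1, 2, 4, 7
mask = nonzero_mask(unit)
print(format(mask, "08b"), mask)  # 10010110 150
```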

【００７８】そして、このアドレッシングモード設定と
ともに、ベースアドレスの設定を行う（ステップｓ
４）。このベースアドレスは、前述したように、ここで
は０ｘ１０００としている。そして、残データ数をデー
タ長に設定する（ステップｓ５）。この残データ数をデ
ータ長に設定というのは、アドレッシングデータ記憶手
段４２により得られたデータ長を設定することであり、
上述の例では、アドレッシングデータ記憶手段４２によ
り得られたデータ長は「４」であるので、その「４」が
設定されることになる。Together with this addressing mode setting, the base address is set (step s4). As mentioned above, the base address is assumed here to be 0x1000. The number of remaining data is then set to the data length (step s5), that is, to the data length obtained from the addressing data storage means 42. In the example above, the data length obtained from the addressing data storage means 42 is "4", so "4" is set.

【００７９】そして、その残データ数が零か否かを判断
し（ステップｓ６）、零であれば処理を終了し、零でな
ければ、物理アドレスで指定されるアドレスのデータに
対して積和演算を行う（ステップｓ７）。It is then determined whether the number of remaining data is zero (step s6). If it is zero, the processing ends. If it is not zero, the product-sum operation is performed on the data at the address specified by the physical address (step s7).

【００８０】次に、仮想アドレスに1を加えて（ステッ
プｓ８）、残データ数から１を減じて（ステップｓ
９）、ステップｓ６の処理に戻る。これは、たとえば、
仮想アドレス０ｘ１０００に対する積和演算処理が終了
したら、それに１を加えて０ｘ１００１とし、そのとき
の残データ数（この場合４）から「１」を減算して残デ
ータ数を「３」としてステップｓ６を行い、この場合、
残データ数が零でないので、仮想アドレス０ｘ１００１
に対する物理アドレス０ｘ１００２による積和演算を行
う。Next, 1 is added to the virtual address (step s8), 1 is subtracted from the number of remaining data (step s9), and the process returns to step s6. For example, when the product-sum operation for the virtual address 0x1000 has finished, 1 is added to give 0x1001, and "1" is subtracted from the number of remaining data at that point (4 in this case) to give "3". Step s6 is then performed and, since the number of remaining data is not zero, the product-sum operation is performed using the physical address 0x1002 corresponding to the virtual address 0x1001.

【００８１】そして、この仮想アドレス０ｘ１００１に
対する積和演算処理が終了したら、それに１を加えて０
ｘ１００２とし、そのときの残データ数（この場合３）
から「１」を減算して残データ数を「２」としてステッ
プｓ６を行い、この場合、残データ数が零でないので、
仮想アドレス０ｘ１００２に対する物理アドレス０ｘ１
００４による積和演算を行う。When the product-sum operation for the virtual address 0x1001 has finished, 1 is added to give 0x1002, and "1" is subtracted from the number of remaining data at that point (3 in this case) to give "2". Step s6 is then performed and, since the number of remaining data is not zero, the product-sum operation is performed using the physical address 0x1004 corresponding to the virtual address 0x1002.

【００８２】この仮想アドレス０ｘ１００２に対する積
和演算処理が終了したら、それに１を加えて０ｘ１００
３とし、そのときの残データ数（この場合２）から
「１」を減算して残データ数を「１」としてステップｓ
６を行い、この場合、残データ数が零でないので、仮想
アドレス０ｘ１００３に対する物理アドレス０ｘ１００
７による積和演算を行う。When the product-sum operation for the virtual address 0x1002 has finished, 1 is added to give 0x1003, and "1" is subtracted from the number of remaining data at that point (2 in this case) to give "1". Step s6 is then performed and, since the number of remaining data is not zero, the product-sum operation is performed using the physical address 0x1007 corresponding to the virtual address 0x1003.

【００８３】この仮想アドレス０ｘ１００３に対する積
和演算処理が終了したら、それに１を加えて０ｘ１００
４とし、そのときの残データ数（この場合１）から
「１」を減算して残データ数を「０」としてステップｓ
６を行うと、この場合、残データ数が零であるので、処
理を終了する。When the product-sum operation for the virtual address 0x1003 has finished, 1 is added to give 0x1004, and "1" is subtracted from the number of remaining data at that point (1 in this case) to give "0". When step s6 is performed, the number of remaining data is zero, so the processing ends.
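The loop of steps s1 to s9 traced above can be modeled in software as follows. This is a sketch under several assumptions, not the patented hardware: memory is modeled as a dictionary keyed by address, the coefficient for each element is picked by its data position, and the integer coefficients and data values are invented for the example. The name `sparse_mac` is hypothetical.

```python
def sparse_mac(memory, coeffs, base_addr, n=8):
    """Product-sum over one unit of n values, visiting only non-zero elements."""
    unit = [memory[base_addr + i] for i in range(n)]        # s1: cut out the unit
    flags = [value != 0 for value in unit]                  # s2: zero/non-zero judgment
    positions = [i for i, flag in enumerate(flags) if flag] # s3: addressing data
    virtual_addr = base_addr                                # s4: base address is set
    remaining = len(positions)                              # s5: remaining = data length
    acc = 0
    trace = []
    while remaining != 0:                                   # s6: finished when zero
        physical_addr = base_addr + positions[virtual_addr - base_addr]
        acc += memory[physical_addr] * coeffs[physical_addr - base_addr]  # s7
        trace.append(physical_addr)
        virtual_addr += 1                                   # s8
        remaining -= 1                                      # s9
    return acc, trace


BASE = 0x1000
memory = {BASE + i: v for i, v in enumerate([0, 5, -3, 0, 2, 0, 0, 1])}
coeffs = [1, 2, 3, 4, 5, 6, 7, 8]  # hypothetical; real DCT coefficients are cosines

acc, trace = sparse_mac(memory, coeffs, BASE)
print(acc, [hex(a) for a in trace])  # 19 ['0x1001', '0x1002', '0x1004', '0x1007']
```

The loop body runs four times instead of eight for this unit, which is the reduction in computation that the description attributes to the address conversion.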

【００８４】このように、ＣＰＵやＤＳＰは連続的なデ
ータアドレスを生成し、その連続的なデータアドレスに
従って演算処理を行なうが、実際には、その連続的なア
ドレスが疎の状態で存在する積和演算すべきデータを次
々と指し示す物理アドレスとして変換する処理が行わ
れ、それによって、疎な状態でしか存在しない非零要素
のみを次々と連続して読み出す動作を行うことになる。In this way, the CPU or DSP generates consecutive data addresses and performs its operations according to them, but those consecutive addresses are in fact converted into physical addresses that point, one after another, to the sparsely located data to be operated on. As a result, only the sparsely located non-zero elements are read out one after another.

【００８５】なお、積和演算を行う際は、（６）式や
（７）式で示した変換係数を用いて行うが、この変換係
数に対しても、疎の状態で存在する非零要素のデータに
対応し読み出す必要があり、変換係数を読み出すための
変換係数アドレスも上述同様のアドレス変換を行う必要
がある。The product-sum operation uses the conversion coefficients shown in equations (6) and (7). These conversion coefficients must also be read out in correspondence with the sparsely located non-zero data, so the conversion coefficient addresses used to read them must undergo the same address conversion as described above.
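To make the pairing of data and coefficients concrete, the sketch below applies the same position list to two separate address spaces, one for data and one for conversion coefficients. It is only an illustration: `COEFF_BASE`, the integer coefficient values, the dictionary memory model, and the helper `make_translator` are all assumptions introduced here.

```python
def make_translator(base_addr, positions):
    """Return a function that maps consecutive addresses to non-zero positions."""
    def translate(virtual_addr):
        return base_addr + positions[virtual_addr - base_addr]
    return translate


DATA_BASE = 0x1000
COEFF_BASE = 0x2000          # hypothetical base address of the coefficient store
POSITIONS = [1, 2, 4, 7]     # addressing data for mode 150

data_memory = {DATA_BASE + i: v for i, v in enumerate([0, 5, -3, 0, 2, 0, 0, 1])}
coeff_memory = {COEFF_BASE + i: c for i, c in enumerate([1, 2, 3, 4, 5, 6, 7, 8])}

to_data = make_translator(DATA_BASE, POSITIONS)    # plays the role of means 4a
to_coeff = make_translator(COEFF_BASE, POSITIONS)  # plays the role of means 4b

acc = 0
for k in range(len(POSITIONS)):
    # Both address streams are consecutive; each is converted independently.
    acc += data_memory[to_data(DATA_BASE + k)] * coeff_memory[to_coeff(COEFF_BASE + k)]

# Reference result: the ordinary dense product-sum over all eight elements.
dense = sum(data_memory[DATA_BASE + i] * coeff_memory[COEFF_BASE + i] for i in range(8))
print(acc, dense)  # 19 19
```

Because skipped elements are zero, the four-term sum equals the eight-term dense sum, provided each non-zero element is paired with the coefficient at the same position.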

【００８６】図４は本発明の積和演算装置の全体的な構
成図であり、処理対象となるデータを記憶するデータ記
憶手段１、このデータ記憶手段１から1つの積和演算単
位を構成するデータ（ここでは8個のデータ）を切り出
すデータ選択手段２、このデータ選択手段２で切り出さ
れた積和演算単位の8個のデータに対し、零か非零かの
判定を行う零・非零判定手段３、上述したアドレス変換
処理を行なうアドレス変換手段（このアドレス変換手段
はデータアドレスに対するアドレス変換手段４ａと変換
係数アドレスに対するアドレス変換手段４ｂが存在す
る）、積和演算を行うに必要な変換係数を記憶する変換
係数記憶手段５、積和演算処理を行うＣＰＵあるいはＤ
ＳＰ（ここではＣＰＵとする）６から構成されている。
なお、この図4の構成要素のうち、図1および図２で示し
た構成要素と同じものには同一符号が付されている。FIG. 4 is an overall block diagram of the product-sum calculation apparatus of the present invention. It comprises: a data storage means 1 that stores the data to be processed; a data selection means 2 that cuts out from the data storage means 1 the data forming one product-sum operation unit (eight pieces of data here); a zero/non-zero determination means 3 that judges whether each of the eight pieces of data cut out by the data selection means 2 is zero or non-zero; address conversion means that perform the address conversion described above (an address conversion means 4a for data addresses and an address conversion means 4b for conversion coefficient addresses); a conversion coefficient storage means 5 that stores the conversion coefficients needed for the product-sum operation; and a CPU or DSP 6 (a CPU here) that performs the product-sum operation. Among the components in FIG. 4, those identical to components shown in FIGS. 1 and 2 bear the same reference numerals.

【００８７】アドレス変換手段４ａ，４ｂは、それぞれ
アドレッシングモード選択手段４１、アドレッシングデ
ータ記憶手段４２、ベースアドレス記憶手段４３を有し
た構成となっており、図１および図２で説明したような
動作を行う。The address conversion means 4a and 4b each comprise an addressing mode selection means 41, an addressing data storage means 42, and a base address storage means 43, and operate as described with reference to FIGS. 1 and 2.

【００８８】図５はこの図2をブロック図として表した
ものであり、ベースアドレス記憶手段４３、アドレッシ
ングモード選択手段４１、アドレッシングデータ記憶手
段４２の他に、ＣＰＵ６で生成されるデータアドレスあ
るいは変換係数アドレスを仮想アドレスとして入力する
仮想アドレス入力手段４４と、その仮想アドレスをアド
レス変換処理することによって生成された物理アドレス
を記憶する物理アドレス記憶手段４５を有した構成とな
っている。FIG. 5 shows FIG. 2 in block diagram form. In addition to the base address storage means 43, the addressing mode selection means 41, and the addressing data storage means 42, the configuration includes a virtual address input means 44, which receives as a virtual address the data address or conversion coefficient address generated by the CPU 6, and a physical address storage means 45, which stores the physical address produced by applying address conversion to that virtual address.

【００８９】ＣＰＵ６はデータアドレス生成手段６１、
変換係数アドレス生成手段６２、積和演算手段６３を有
した構成となっており、データアドレスや変換係数アド
レスの生成、積和演算処理を行なうとともに、この図４
に示す各構成要素全体の動作を制御する機能をも有して
いる。The CPU 6 comprises a data address generation means 61, a conversion coefficient address generation means 62, and a product-sum calculation means 63. It generates the data addresses and conversion coefficient addresses, performs the product-sum operation, and also controls the overall operation of the components shown in FIG. 4.

【００９０】データアドレス生成手段６１で生成される
アドレスは、積和演算単位として切り出された８個のデ
ータを順に読み出すためのアドレス（データアドレス）
であり、このデータアドレスはアドレス変換手段４ａに
与えられ、このアドレス変換手段４ａに含まれる仮想ア
ドレス入力手段４４（図５参照）に仮想アドレスとして
入力される。The addresses generated by the data address generation means 61 are the addresses (data addresses) for reading, in order, the eight pieces of data cut out as a product-sum operation unit. These data addresses are supplied to the address conversion means 4a and input as virtual addresses to the virtual address input means 44 (see FIG. 5) included in the address conversion means 4a.

【００９１】また、変換係数アドレス生成手段６２で生
成されるアドレスは、変換係数を読み出すためのアドレ
ス（変換係数アドレス）であり、この変換係数アドレス
は、アドレス変換手段４ｂに与えられ、このアドレス変
換手段４ｂに含まれる仮想アドレス入力手段４４（図５
参照）に仮想アドレスとして入力される。The addresses generated by the conversion coefficient address generation means 62 are the addresses (conversion coefficient addresses) for reading the conversion coefficients. These conversion coefficient addresses are supplied to the address conversion means 4b and input as virtual addresses to the virtual address input means 44 (see FIG. 5) included in the address conversion means 4b.

【００９２】このような構成の積和演算装置におけるア
ドレス変換手段４ａ，４ｂのアドレス変換処理などにつ
いてはすでに詳細に説明したので、このアドレス変換処
理についての説明は省略して、全体的な処理について説
明する。Since the address conversion performed by the address conversion means 4a and 4b in a product-sum calculation apparatus of this configuration has already been described in detail, that description is omitted here and the overall processing is described.

【００９３】まず、図1で説明したように、積和演算単
位として切り出された8個のデータにおける非零要素の
位置を示す情報に基づいて、アドレッシングモード選択
手段４１がアドレッシングデータ記憶手段４２から或る
データ長とデータ位置を示す情報を指示し、そのデータ
長とデータ位置を示す情報がアドレッシングデータとし
て設定される。その後、図2で説明したように、ＣＰＵ
６のデータアドレス生成手段６１で生成される連続的な
データアドレス（仮想アドレス０ｘ１０００，０ｘ１０
０１，０ｘ１００２，…）をアドレス変換手段４ａでア
ドレス変換処理することで、非零要素のみを連続的に指
し示す物理アドレス（上述の例では、０ｘ１００１，０
ｘ１００２，０ｘ１００４，０ｘ１００７）を生成し、
データ記憶手段１からその物理アドレスで示されるデー
タを読み出して、その読み出されたデータが積和演算手
段６３に与えられる。First, as described with reference to FIG. 1, based on the information indicating the positions of the non-zero elements in the eight pieces of data cut out as a product-sum operation unit, the addressing mode selection means 41 designates the information indicating a particular data length and data positions in the addressing data storage means 42, and that information is set as the addressing data. Then, as described with reference to FIG. 2, the consecutive data addresses (virtual addresses 0x1000, 0x1001, 0x1002, ...) generated by the data address generation means 61 of the CPU 6 are converted by the address conversion means 4a into physical addresses that point consecutively to only the non-zero elements (0x1001, 0x1002, 0x1004, 0x1007 in the example above). The data at those physical addresses are read from the data storage means 1 and supplied to the product-sum calculation means 63.

【００９４】一方、同じく、ＣＰＵ６の変換係数アドレ
ス生成手段６２で生成される変換係数アドレスをアドレ
ス変換手段４ｂでデータアドレスと同様にアドレス変換
処理することで、非零要素に対応する変換係数を連続的
に指し示す物理アドレスを生成し、変換係数記憶手段５
からその物理アドレスによって示される変換係数を読み
出して、その読み出された変換係数が積和演算手段６３
に与えられる。Likewise, the conversion coefficient addresses generated by the conversion coefficient address generation means 62 of the CPU 6 are converted by the address conversion means 4b in the same way as the data addresses, producing physical addresses that point consecutively to the conversion coefficients corresponding to the non-zero elements. The conversion coefficients at those physical addresses are read from the conversion coefficient storage means 5 and supplied to the product-sum calculation means 63.

【００９５】このように、アドレス変換手段４ａ，４ｂ
によって生成された物理アドレスによって非零要素のデ
ータが順次読み出されるとともに、その非零要素に対応
する変換係数が順次読み出され、それらが積和演算手段
６３に与えられることで積和演算が行われる。In this way, the physical addresses generated by the address conversion means 4a and 4b are used to read the non-zero data and the corresponding conversion coefficients in sequence, and these are supplied to the product-sum calculation means 63, which performs the product-sum operation.

【００９６】このとき、ＣＰＵ６からは連続的なデータ
アドレスと変換係数アドレスが生成されるだけである
が、その連続的なデータアドレスと変換係数アドレス
は、疎な状態で存在する非零要素とそれに対応する変換
係数を次々と連続的に指し示す物理アドレスとして変換
され、それによって、たとえ、積和演算すべき非零要素
のデータが疎な状態で散らばって存在していても、ＣＰ
Ｕ６は単に連続的なデータアドレスおよび変換係数アド
レスを生成し、それに従ったデータの読み出しを行って
積和演算する動作を行うだけで、実際には、疎に存在す
る非零要素のデータとそれに対応する変換係数のみを順
次読み出して積和演算する動作を行うことになり、効率
のよい積和演算が行え、その演算量を大幅に削減するこ
とができる。At this time, the CPU 6 merely generates consecutive data addresses and conversion coefficient addresses, but these are converted into physical addresses that point, one after another, to the sparsely located non-zero elements and their corresponding conversion coefficients. Thus, even if the non-zero data to be operated on are scattered sparsely, the CPU 6 simply generates consecutive data addresses and conversion coefficient addresses, reads data accordingly, and performs the product-sum operation. In effect, it reads only the sparsely located non-zero data and their corresponding conversion coefficients in sequence and operates on them, so the product-sum operation is performed efficiently and the amount of computation can be greatly reduced.

【００９７】なお、本発明は以上説明した実施の形態に
限定されるものではなく、本発明の要旨を逸脱しない範
囲で種々変形実施可能となるものである。The present invention is not limited to the embodiments described above, and various modifications can be made without departing from the gist of the present invention.

【００９８】また、本発明は、以上説明した本発明を実
現するための処理手順が記述された処理プログラムを作
成し、その処理プログラムをフロッピィディスク、光デ
ィスク、ハードディスクなどの記録媒体に記録させてお
くことができ、本発明はその処理プログラムが記録され
た記録媒体をも含むものである。また、ネットワークか
ら当該処理プログラムを得るようにしてもよい。Further, a processing program describing the processing procedure for realizing the present invention described above can be created and recorded on a recording medium such as a floppy disk, an optical disk, or a hard disk, and the present invention also covers a recording medium on which that processing program is recorded. The processing program may also be obtained from a network.

【００９９】[0099]

【発明の効果】以上で説明したように本発明によれば、
積和演算単位を構成するＮ個のデータのそれぞれについ
て零か非零かを判断し、そのＮ個のデータにおける非零
要素の存在する位置を示す情報に基づき、ＣＰＵやＤＳ
Ｐなどの積和演算処理手段で生成されるアドレスに対
し、非零要素のみを連続的に読み出し可能となるような
アドレス変換を行い、そのアドレス変換後のアドレス順
で読み出されたデータに対して順次積和演算を行うよう
にしている。As described above, according to the present invention, each of the N pieces of data forming a product-sum operation unit is judged to be zero or non-zero. Based on the information indicating the positions of the non-zero elements among the N pieces of data, the addresses generated by a product-sum operation processing means such as a CPU or DSP are converted so that only the non-zero elements can be read out consecutively, and the product-sum operation is performed sequentially on the data read out in the order of the converted addresses.

【０１００】このように、ＣＰＵやＤＳＰなどが生成す
る連続的なアドレスを、非零要素のみを順次アクセス可
能なアドレスに変換して、その変換後のアドレス順でデ
ータを読み出して積和演算を行うようにしている。In this way, the consecutive addresses generated by the CPU, DSP, or the like are converted into addresses that access only the non-zero elements in sequence, and the data are read out in the order of the converted addresses for the product-sum operation.

【０１０１】これにより、ＣＰＵやＤＳＰなどは単に連
続したアドレスによって演算すべきデータを読み出して
積和演算する動作を行うだけで、実際には、疎な状態で
しか存在しない非零要素のみを次々と連続して読み出す
動作を行うことになる。したがって、たとえ、積和演算
すべき非零要素のデータが疎な状態で散らばって存在し
ていたとしても、その疎な状態の非零要素とそれに対応
する変換係数を次々と連続的に読み出すことができ、こ
れによって、直交変換処理に伴う積和演算を効率よく行
うことができ、その演算量を大幅に削減することができ
る。As a result, the CPU, DSP, or the like simply reads the data to be operated on at consecutive addresses and performs the product-sum operation, while in effect reading only the sparsely located non-zero elements one after another. Therefore, even if the non-zero data to be operated on are scattered sparsely, those non-zero elements and their corresponding conversion coefficients can be read out consecutively. The product-sum operations involved in orthogonal transform processing can thus be performed efficiently, and the amount of computation can be greatly reduced.

[Brief description of drawings]

【図１】本発明の積和演算処理を説明する図であり、処
理対象となるデータから積和演算単位となる8個のデー
タを切り出し、その8個のデータ中に非零要素がどのよ
うな位置に存在しているかを判断し、それに基づいてデ
ータ長とデータ位置を示す情報を取得する処理を説明す
る図である。FIG. 1 is a diagram explaining the product-sum operation processing of the present invention: eight pieces of data forming a product-sum operation unit are cut out from the data to be processed, the positions of the non-zero elements among those eight pieces of data are determined, and information indicating the data length and data positions is acquired on that basis.

【図２】本発明の積和演算処理を説明する図であり、図
1で示した処理に基づいて、ＣＰＵやＤＳＰで生成され
るアドレスを変換するアドレス変換処理を説明する図で
ある。FIG. 2 is a diagram explaining the product-sum operation processing of the present invention, specifically the address conversion that converts the addresses generated by the CPU or DSP on the basis of the processing shown in FIG. 1.

【図３】本発明の積和演算処理の全体的な処理手順を説
明するフローチャートである。FIG. 3 is a flowchart illustrating an overall processing procedure of sum of products calculation processing according to the present invention.

【図４】本発明の積和演算装置の全体的な構成図であ
る。FIG. 4 is an overall configuration diagram of a product-sum calculation apparatus of the present invention.

【図５】図２で示したアドレス変換動作をブロック図と
して表した図である。FIG. 5 is a diagram showing the address conversion operation of FIG. 2 in block diagram form.

【図６】画像データから処理単位となるＮ×Ｎ（８×
８）のデータブロックの切り出しを模式的に説明する図
である。FIG. 6 is a diagram schematically explaining how an N × N (8 × 8) data block serving as a processing unit is cut out from image data.

【図７】処理単位となるＮ×Ｎ（８×８）のデータブロ
ックのデータ例と、そのデータを周波数領域に変換した
のち量子化する処理例を説明する図である。FIG. 7 is a diagram illustrating an example of data of an N × N (8 × 8) data block that is a processing unit, and an example of processing of converting the data into a frequency domain and then performing quantization.

【図８】ランレングス符号化処理について説明する図で
ある。FIG. 8 is a diagram explaining run-length encoding processing.

【図９】図７（ｃ）のデータを並べ替え、さらに、零の
連鎖をＥＯＢに置き換えた場合を示す図である。FIG. 9 is a diagram showing a case where the data in FIG. 7C is rearranged and a chain of zeros is replaced with EOB.

【図１０】処理単位となるＮ×Ｎ（８×８）のデータブ
ロックに対する処理を行方向または列方向のいずれから
行うかを説明する図である。FIG. 10 is a diagram illustrating whether the processing for N × N (8 × 8) data blocks as a processing unit is performed in the row direction or the column direction.

【図１１】従来技術において、ＥＯＢの位置により最適
な演算手段を選択して直交変換演算の簡略化を図る例を
説明する図である。FIG. 11 is a diagram for explaining an example of selecting an optimum arithmetic means according to the position of EOB to simplify the orthogonal transform arithmetic in the conventional technique.

【図１２】従来技術において、処理単位となるＮ×Ｎ
（８×８）のデータブロックをいくつかの領域に分割し
て演算を行うことで直交演算の簡略化を図る例を説明す
る図である。FIG. 12 is a diagram explaining a prior-art example in which the N × N (8 × 8) data block serving as a processing unit is divided into several regions, and the operation is performed per region, to simplify the orthogonal operation.

【図１３】処理単位となるＮ×Ｎ（８×８）のデータブ
ロックにおけるデータにおいて、従来技術の１つである
４並列演算（行方向の４並列演算）により積和演算を行
なう場合の問題点を説明する図である。FIG. 13 is a diagram explaining the problem that arises when the product-sum operation is performed on the data in an N × N (8 × 8) data block serving as a processing unit using four-way parallel operation (in the row direction), one of the prior-art techniques.

【図１４】処理単位となるＮ×Ｎ（８×８）のデータブ
ロックにおけるデータにおいて、従来技術の１つである
ＥＯＢの判断を用いて積和演算を行なう場合の問題点を
説明する図である。FIG. 14 is a diagram explaining the problem that arises when the product-sum operation is performed on the data in an N × N (8 × 8) data block serving as a processing unit using EOB determination, one of the prior-art techniques.

[Explanation of symbols]

１データ記憶手段２データ選択手段３零・非零判定手段４ａアドレス変換手段（データアドレスに対するアド
レス変換手段）４ｂアドレス変換手段（変換係数アドレスに対するア
ドレス変換手段）５変換係数記憶手段６ＣＰＵ４１アドレッシングモード選択手段４２アドレッシングデータ記憶手段４３ベースアドレス記憶手段６１データアドレス生成手段６２変換係数アドレス生成手段６３積和演算手段
1: data storage means
2: data selection means
3: zero/non-zero determination means
4a: address conversion means (address conversion means for data addresses)
4b: address conversion means (address conversion means for conversion coefficient addresses)
5: conversion coefficient storage means
6: CPU
41: addressing mode selection means
42: addressing data storage means
43: base address storage means
61: data address generation means
62: conversion coefficient address generation means
63: product-sum calculation means

Claims

[Claims]

1. A product-sum operation method in which N pieces of data forming a product-sum operation unit are sequentially read out in a predetermined address order to perform a product-sum operation, characterized in that: each of the N pieces of data is judged to be zero or non-zero; based on information indicating the positions of the non-zero elements among the N pieces of data, address conversion is applied to the addresses so that only the non-zero elements can be read out consecutively; and the product-sum operation is performed sequentially on the data read out using the converted addresses.

2. The product-sum operation method according to claim 1, wherein the address conversion based on the positions of the non-zero elements among the N pieces of data is also applied to the addresses for reading out the conversion coefficients needed for the product-sum operation, and the conversion coefficients corresponding to the non-zero elements are read out using the converted addresses.

3. A product-sum calculation apparatus which sequentially reads out N pieces of data forming a product-sum operation unit in a predetermined address order to perform a product-sum operation, comprising: data storage means for storing data to be processed; data selection means for cutting out, from the data to be processed stored in the data storage means, N pieces of data serving as a product-sum operation unit; zero/non-zero determination means for determining whether each of the N pieces of data cut out by the data selection means is zero or non-zero; and address conversion means for applying address conversion to the addresses, based on information indicating the positions of the non-zero elements in the zero/non-zero determination result, so that only the non-zero elements can be read out consecutively; wherein the product-sum operation is performed on the data read out using the addresses converted by the address conversion means.

4. The product-sum calculation apparatus according to claim 3, wherein the address conversion means that performs address conversion based on the positions of the non-zero elements among the N pieces of data comprises a data address conversion means that converts the addresses for sequentially reading out the N pieces of data, and a conversion coefficient address conversion means that converts the addresses for reading out the conversion coefficients; the data address conversion means converts the addresses for reading out the data so that only the non-zero elements can be read out consecutively; and the conversion coefficient address conversion means converts the addresses for reading out the conversion coefficients so that the conversion coefficients corresponding to the non-zero elements can be read out.