JP2864598B2 - Digital arithmetic circuit

Info

Publication number: JP2864598B2
Application number: JP33731989A
Authority: JP
Inventors: 清一郎岩瀬
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1989-12-26
Filing date: 1989-12-26
Publication date: 1999-03-03
Anticipated expiration: 2014-03-03
Also published as: JPH03196711A

Description

[Detailed description of the invention]

[Field of industrial application]

The present invention relates to a digital arithmetic circuit applicable to product-sum operations such as those performed in a digital filter.

[Summary of the Invention]

The present invention is a digital arithmetic circuit applicable to filter operations in which input data are multiplied by coefficients. It comprises a shift register into which the input data can be loaded in bit-parallel form and which outputs the input data shifted by a bit-shift amount selected from a plurality of shift amounts; a selector that multiplies the output of the shift register by a power of two and produces a positive or negative version of that output; an accumulator that receives the output of the selector and accumulates it in a form divided into carry and sum; and a circuit that generates control signals for the shift register and the selector in accordance with a program for generating partial products. This arrangement reduces the number of steps required for multiplication.

[Conventional technology]

As shown in FIG. 7, an n-tap (for example, 4-tap) FIR digital filter takes an input sequence x_i, produces an output sequence y_i, and has an impulse response (coefficients) h_0 to h_{n-1}. It performs the operation

$$y_i = \sum_{k=0}^{n-1} h_k \, x_{i-k}$$

In FIG. 7, data taken from the taps of a shift register section made up of four unit delay elements are supplied to multipliers, where they are multiplied by the coefficients h_0, h_1, h_2, and h_3. The multiplier outputs are summed by an adder tree to give the output data y_i.
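As a hedged illustration of the product-sum that the filter of FIG. 7 computes, the following Python sketch evaluates one output sample directly. The function name `fir_output`, the example coefficients and input, and the choice to treat samples before the start of the sequence as zero are assumptions made for this sketch and do not come from the patent.

```python
def fir_output(h, x, i):
    """Return y_i = sum over k of h[k] * x[i - k].

    Samples before x[0] are treated as zero (an assumption of this sketch).
    """
    return sum(hk * x[i - k] for k, hk in enumerate(h) if i - k >= 0)


h = [1, 2, 3, 4]        # hypothetical 4-tap impulse response h_0..h_3
x = [1, 0, 0, 0, 5]     # hypothetical input sequence
y = [fir_output(h, x, i) for i in range(len(x))]
```

Feeding a unit impulse reads the coefficients back out in order, which is a quick check that the tap indexing matches the text.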

In applications such as audio signal processing and modem modulation, the data rate is low, so the filter operation described above is implemented with a program-controlled configuration called a DSP (digital signal processor). A DSP can repeat product-sum operations efficiently by including a hardware multiplier and by using a more elaborate internal bus structure. FIG. 8 shows a DSP applicable to the filter operation.

In FIG. 8, 31 is a multiplier, 32 is an adder/subtractor, and 33 is a register. The output of the register 33 is fed back to the adder/subtractor 32 to form an accumulator. In addition, 34 is a data memory that stores the input data, 35 is an address generation circuit for the data memory 34, 36 is a coefficient memory that stores the coefficients, and 37 is an address generation circuit for the coefficient memory 36. The multiplier 31, the adder/subtractor 32, the register 33, and so on are controlled by control signals from a register 38. Addresses from a sequencer 39 are supplied to a microprogram memory 40, and the control signals read from the microprogram memory 40 are latched into the register 38.

When the configuration of FIG. 8 performs the same operation as the digital filter of FIG. 7, each time new input data x_i arrives, the address generation circuit 35 generates, over four cycles (that is, four clock periods), the addresses at which the four most recent data x_i, x_{i-1}, x_{i-2}, and x_{i-3} are stored. Over the same four cycles, the address generation circuit 37 generates the addresses at which h_0, h_1, h_2, and h_3 are stored. The multiplier 31 multiplies the data and coefficients read from the data memory 34 and the coefficient memory 36 over the four cycles, producing the products h_0 x_i, h_1 x_{i-1}, h_2 x_{i-2}, and h_3 x_{i-3} in sequence. The sum of these products is formed by the accumulator made up of the adder/subtractor 32, the register 33, and the feedback path, giving the output data y_i.
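The four-cycle multiply-accumulate sequence of the conventional DSP can be modelled roughly as below. This is only a behavioural sketch: the circular-buffer addressing, the function name `dsp_fir_step`, and the example values are assumptions chosen for illustration, since the patent does not specify how the address generation circuits 35 and 37 are implemented.

```python
def dsp_fir_step(data_mem, coef_mem, write_ptr, x_new):
    """Process one new sample; return (y, next_write_ptr, cycles_used)."""
    n = len(coef_mem)
    data_mem[write_ptr] = x_new            # store x_i in the data memory
    acc = 0                                # accumulator register starts at zero
    cycles = 0
    for k in range(n):                     # one multiply-accumulate per cycle
        data_addr = (write_ptr - k) % n    # data address: x_i, x_{i-1}, ...
        coef_addr = k                      # coefficient address: h_0, h_1, ...
        acc += coef_mem[coef_addr] * data_mem[data_addr]
        cycles += 1
    return acc, (write_ptr + 1) % n, cycles


coef_mem = [1, 2, 3, 4]                    # hypothetical h_0..h_3
data_mem = [0, 0, 0, 0]
ptr = 0
outputs, cycle_counts = [], []
for sample in [1, 0, 0, 0, 5]:             # hypothetical input sequence
    y, ptr, used = dsp_fir_step(data_mem, coef_mem, ptr, sample)
    outputs.append(y)
    cycle_counts.append(used)
```

Each sample costs n cycles in this model, which is the per-sample cost that the later sections set out to reduce.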

[Problems to be solved by the invention]

The DSP described above has the advantage of a small gate count and a small circuit scale. However, it cannot be applied to video data, whose data rate is far higher than that of audio data. Because the speed of the multiplier 31 and the adder/subtractor 32 is comparable to the data rate, there is not enough time between the arrival of input data x_i and the arrival of the next input data x_{i+1} to execute several program steps. An n-tap FIR filter, for example, requires n steps of processing.

Accordingly, an object of the present invention is to provide a digital signal processing circuit that reduces the number of steps without sacrificing the advantages of a DSP and that can be applied to high-speed data such as video data.

[Means for solving the problem]

The present invention comprises: a shift register (1, 2) into which input data can be loaded in bit-parallel form and which outputs the input data shifted by a bit-shift amount selected from a plurality of shift amounts; a selector (3, 4) that multiplies the output of the shift register (1, 2) by a power of two and produces a positive or negative version of that output; an accumulator (5, 6, 7) that receives the output of the selector (3, 4) and accumulates it in a form divided into carry and sum; and means (12, 13, 14) for generating control signals for the shift register (1, 2) and the selector (3, 4) in accordance with a program for generating partial products.

[Action]

By choosing as the coefficients (that is, the multipliers) values that yield many zero partial products, namely values with few "1" bits or values in which "0"s or "1"s run consecutively as far as possible, the number of steps can be made smaller than when Booth's algorithm is used. The operation can therefore be made faster and the circuit scale smaller.

[Embodiment]

An embodiment of the present invention is described below with reference to the drawings. In FIG. 1, 1 is a shift register into which n-bit input data x_i are loaded in bit-parallel form. The input data x_i are, for example, in two's-complement code. 2 is a shift register that extends the word length by m bits on the upper side. For two's-complement code, the word length can be extended by appending m copies of the MSB (most significant bit). The number of extension bits m corresponds to the word length of the coefficient. The shift registers 1 and 2 consist of n and m series-connected copies, respectively, of the one-bit unit circuit shown in FIG. 2.
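The word-length extension by repeating the MSB can be sketched as follows. The helper names `sign_extend` and `to_signed` and the 4-bit example values are hypothetical and serve only to show that the extended pattern represents the same two's-complement number.

```python
def sign_extend(pattern, n, m):
    """Extend an n-bit two's-complement pattern to n + m bits by repeating the MSB."""
    msb = (pattern >> (n - 1)) & 1
    upper = ((1 << m) - 1) if msb else 0          # m copies of the MSB
    return (upper << n) | (pattern & ((1 << n) - 1))


def to_signed(pattern, width):
    """Interpret a width-bit pattern as a two's-complement integer."""
    return pattern - (1 << width) if (pattern >> (width - 1)) & 1 else pattern


negative = sign_extend(0b1011, 4, 4)   # -5 as a 4-bit pattern, extended to 8 bits
positive = sign_extend(0b0101, 4, 4)   # +5 as a 4-bit pattern, extended to 8 bits
```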

In FIG. 2, 21 is a selector and 22 is a flip-flop that receives the output of the selector 21. The output of the flip-flop 22 is supplied to the following unit circuits. The selector 21 receives the output ST1 from one stage back (the preceding stage), the output ST2 from two stages back, the output ST3 from three stages back, the output ST4 from four stages back, and the output of the flip-flop 22. A control signal S is supplied in common to the selectors of all unit circuits in the shift registers 1 and 2. The selector 21 selects the one input that corresponds to a left shift by the amount specified by the control signal S.

The shift register 1 shifts toward the left of the drawing (from the LSB toward the MSB). In the first cycle, the selector 21 selects the input data x_i, which are latched into the flip-flop 22. This is the data-load state. From the next cycle on, the selector 21 selects something other than the input data. Selecting the output ST1 from one stage back gives an ordinary data shift; selecting ST2 from two stages back shifts while skipping one stage; selecting ST3 from three stages back shifts while skipping two stages; and selecting ST4 from four stages back shifts while skipping three stages. The shift register 2 performs the same shift processing on the upper bits of the input data x_i. The control signal S therefore determines both the number of stages shifted and the timing of the parallel load. The shift registers 1 and 2 are built this way in order to provide the bit offsets needed when the partial products are added.
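A bit-level sketch of this load-or-shift behaviour is given below. It is a simplified model under stated assumptions: zeros are shifted in at the LSB side, `s == 0` is taken to mean "hold" (one reading of feeding the flip-flop output back to its own selector), and the names `shift_register_step`, `to_bits`, and `from_bits` are hypothetical.

```python
def to_bits(value, width):
    """LSB-first list of bits for a non-negative value."""
    return [(value >> j) & 1 for j in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(b << j for j, b in enumerate(bits))


def shift_register_step(state, s, data_in=None):
    """One clock of the register; state[0] is the LSB.

    s == "IN" loads data_in in parallel. s in 0..4 makes every bit take the
    flip-flop output s stages below it, i.e. a left shift by s positions
    (s == 0 holds the current contents, an assumption of this sketch).
    """
    if s == "IN":
        return list(data_in)
    return [state[j - s] if j - s >= 0 else 0 for j in range(len(state))]


state = shift_register_step([0] * 8, "IN", to_bits(0b00000011, 8))
state = shift_register_step(state, 2)          # select ST2: shift left by 2
after_first_shift = from_bits(state)
state = shift_register_step(state, 2)          # select ST2 again: total shift 4
after_second_shift = from_bits(state)
```

Successive shifts add up, which is how the register supplies the cumulative bit offset of each partial product.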

3 and 4 are Booth selectors. When the multiplication (X × Y) is carried out with the second-order Booth algorithm, one partial product is formed for every two bits of the multiplier Y (here, the coefficient). Three consecutive bits of the multiplier Y are examined, the Booth selectors 3 and 4 form a partial product equal to 0, ±1, or ±2 times the multiplicand X, and the partial products are added to obtain the product.

P, Q, and R are control signals supplied in common to the selectors 3 and 4. Ordinarily, the control signals P, Q, and R are formed by supplying three consecutive bits of the multiplier Y to a Booth decoder. In the present invention, as described later, the control signals P, Q, and R are instead formed to correspond to the steps of the program needed to multiply by a known coefficient. The selectors 3 and 4 consist of copies of the unit circuit shown in FIG. 3, connected in series for n bits and m bits, respectively. In FIG. 3, 23 is a selector controlled by the control signal P, 24 is an AND gate that receives the output of the selector 23 and the control signal Q, and 25 is an EX-OR gate that receives the output of the AND gate 24 and the control signal R. The selector 23 receives the corresponding output bit of the shift registers 1 and 2 together with the bit one position lower.

In the Booth selectors 3 and 4, when the selector 23 passes the input from the shift registers 1 and 2 unchanged, the output is 1 times the data; when the selector 23 selects the bit one position lower, the output is shifted by one bit and is 2 times the data. The selector 23 gives the 1-times data when the control signal P is "0" and the 2-times data when P is "1". The AND gate 24 is an inhibit gate that forces the output to 0 when the control signal Q is "0". The EX-OR gate 25 inverts each bit ("0" to "1" and "1" to "0") when the control signal R is "1".
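The three gates of the unit circuit can be modelled on a whole word at once, as in the sketch below. The function name `booth_selector`, the 8-bit width, and the example operand are assumptions for illustration; the returned carry value reflects the fact that bit inversion alone gives the ones' complement, so a 1 still has to be added at the LSB to obtain the two's-complement negative.

```python
def booth_selector(x, p, q, r, width):
    """Word-level model of selector 23 (P), AND gate 24 (Q), and EX-OR gate 25 (R).

    x is a width-bit pattern. Returns (pattern, carry_in); adding carry_in at
    the LSB completes the negation when R = 1.
    """
    mask = (1 << width) - 1
    selected = ((x << 1) if p else x) & mask   # P = 0: 1x, P = 1: 2x
    gated = selected if q else 0               # Q = 0 forces the output to zero
    out = (gated ^ mask) if r else gated       # R = 1 inverts every bit
    return out, r


WIDTH = 8
plus_x, c0 = booth_selector(3, 0, 1, 0, WIDTH)       # +X
minus_2x, c1 = booth_selector(3, 1, 1, 1, WIDTH)     # inverted 2X, still needs +1
zero, c2 = booth_selector(3, 0, 0, 0, WIDTH)         # inhibited
completed = (minus_2x + c1) & ((1 << WIDTH) - 1)     # two's-complement -6
```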

In a second-order Booth decoder, the following control signals P, Q, and R are generated according to three bits of the multiplier Y.

The Booth selectors 3 and 4, in turn, generate partial products according to the control signals P, Q, and R as follows.
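For readers without access to the decoder tables, the sketch below uses the standard radix-4 (second-order) Booth recoding combined with the P, Q, R meanings given in the text (P selects 2X, Q = 0 forces zero, R negates). The mapping is a reconstruction under those assumptions and may differ from the published tables in entries where the output is zero; the names `BOOTH_PQR` and `booth_multiply` are hypothetical.

```python
# (y_{i+1}, y_i, y_{i-1}) -> (P, Q, R); reconstructed, not copied from the patent.
BOOTH_PQR = {
    (0, 0, 0): (0, 0, 0),   # 0
    (0, 0, 1): (0, 1, 0),   # +X
    (0, 1, 0): (0, 1, 0),   # +X
    (0, 1, 1): (1, 1, 0),   # +2X
    (1, 0, 0): (1, 1, 1),   # -2X
    (1, 0, 1): (0, 1, 1),   # -X
    (1, 1, 0): (0, 1, 1),   # -X
    (1, 1, 1): (0, 0, 0),   # 0
}


def booth_multiply(x, y, bits):
    """Multiply x by a bits-wide two's-complement multiplier y (bits must be even)."""
    total = 0
    prev = 0                                   # implicit 0 below the LSB
    for i in range(0, bits, 2):                # one partial product per two bits
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        p, q, r = BOOTH_PQR[(b1, b0, prev)]
        partial = (2 * x if p else x) if q else 0
        if r:
            partial = -partial
        total += partial << i                  # weight 2^i for this bit pair
        prev = b1
    return total


product = booth_multiply(7, 0b000010001000, 12)
```

A 12-bit multiplier always takes six iterations of the loop, one per pair of bits, regardless of the coefficient value.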

The (n + m)-bit partial product formed each cycle by the Booth selectors 3 and 4 is supplied to the accumulators 5 and 6, respectively. As with the shift register 2, an l-bit accumulator 7 is provided to extend the word length, and the MSB is supplied to the accumulator 7. The accumulator 7 prevents overflow when the multiplication results of many taps are added together. The control signal R is supplied as the carry input to the lowest bit of the accumulator 5. This is so that, when the EX-OR gate 25 has inverted the bits, adding the control signal R of "1" at the LSB completes the sign inversion of the two's-complement data.

As shown in FIG. 9, the accumulators 5, 6, and 7 have, for each input bit, a full adder, registers that receive the outputs of the full adder, and a feedback path that returns the register outputs to the full adder. The accumulators 5, 6, and 7 accumulate in a redundant binary form in which sum and carry are kept separate, so each one-bit input produces a two-bit output consisting of a sum and a carry. Of the total of 2(n + m + l) output bits of the accumulators 5, 6, and 7, the sum outputs are supplied to a shift register 8 and the carry outputs are supplied to a shift register 9.

The serial outputs of the shift registers 8 and 9 are supplied to a full adder 10, lowest bit first. The output of the full adder 10 is supplied to a flip-flop 11. The sum output of the flip-flop 11 is taken out, and its carry output is fed back to the full adder 10 as an input for the next higher bit. The full adder 10 produces two outputs: the sum, which is the modulo-2 addition of its three inputs, and the carry, which is formed from the logical products of each pair of inputs. This accumulator, made up of the full adder 10 and the flip-flop 11, performs carry-propagating addition from the lowest bit upward and delivers a bit-serial output. The flip-flop 11 is cleared at the start of accumulating each (n + m + l)-bit word from the shift registers 8 and 9.
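The bit-serial addition performed by the full adder 10 and the flip-flop 11 can be sketched as below. The function name `serial_add` and the 4-bit example are illustrative assumptions; the carry is written as the OR of the three pairwise ANDs, which is the usual full-adder carry.

```python
def serial_add(sum_bits, carry_bits):
    """Add two equal-length LSB-first bit lists, one bit position per clock."""
    carry_ff = 0                                   # carry flip-flop, cleared at the start
    out = []
    for a, b in zip(sum_bits, carry_bits):
        s = a ^ b ^ carry_ff                       # modulo-2 sum of the three inputs
        carry_ff = (a & b) | (a & carry_ff) | (b & carry_ff)   # pairwise ANDs, ORed
        out.append(s)
    return out


# 5 + 3 = 8 with 4-bit LSB-first words
result_bits = serial_add([1, 0, 1, 0], [1, 1, 0, 0])
```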

The accumulators 5, 6, and 7 provide, for each one-bit input, a structure similar to the accumulator made up of the full adder 10 and the flip-flop 11. That is, as shown in FIG. 9, one bit of the output of the selectors 3 and 4 is supplied through a flip-flop as the first input of the full adder; the carry output of the full adder is passed to the next higher bit; the sum is supplied to one flip-flop and the carry from the next lower bit to another; and the two flip-flop outputs are both taken out and fed back to the inputs of the full adder. Because the accumulators 5, 6, and 7 add the partial products in the form of two signals, carry and sum, there is no carry propagation and the addition can be performed at high speed.
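A word-level sketch of this carry-save accumulation follows. It is a behavioural model only: the name `carry_save_accumulate`, the 8-bit width, and the example partial products are assumptions, and the input flip-flop of FIG. 9 is not modelled. The `carry_in` argument stands in for the control signal R applied at the lowest bit.

```python
def carry_save_accumulate(sum_reg, carry_reg, partial, width, carry_in=0):
    """One accumulation step with sum and carry kept in separate registers.

    Every bit position is an independent full adder, so no carry ripples
    across the word within a step.
    """
    mask = (1 << width) - 1
    new_sum = sum_reg ^ carry_reg ^ partial
    majority = (sum_reg & carry_reg) | (sum_reg & partial) | (carry_reg & partial)
    new_carry = (majority << 1) | carry_in      # carries move up one bit; R enters at bit 0
    return new_sum & mask, new_carry & mask


WIDTH = 8
s, c = 0, 0
# Hypothetical partial products: inverted 6 with R = 1 (i.e. -6), then +48.
for pattern, r in [(0xF9, 1), (0x30, 0)]:
    s, c = carry_save_accumulate(s, c, pattern, WIDTH, r)
resolved = (s + c) & ((1 << WIDTH) - 1)         # final carry-propagating add
```

Only the last line performs a carry-propagating addition, corresponding to the serial adder that follows the accumulators.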

A control circuit shown in FIG. 4 is provided in association with the configuration of FIG. 1. In FIG. 4, 12 is a program memory that stores the control signals S, P, Q, R, and so on; 13 is an address generation circuit for the program memory 12; and 14 is a register that holds the control signals S, P, Q, and R read from the memory 12. These control signals S, P, Q, and R depend on the known coefficients.

The memory 12 stores control signals as shown, by way of example, in FIG. 5. As described later, the number of steps needed to multiply by each tap coefficient can be made smaller than with the second-order Booth algorithm. For a 12-bit coefficient, the second-order Booth algorithm always needs 6 steps, whereas the present invention can reduce this to 5 or fewer. FIG. 5 shows the case of a 4-tap digital filter in which the multiplication MPY1 by the first-tap coefficient h_0 takes 4 steps, the multiplication MPY2 by the second-tap coefficient h_1 takes 3 steps, the multiplication MPY3 by the third-tap coefficient h_2 takes 5 steps, and the multiplication MPY4 by the fourth-tap coefficient h_3 takes 3 steps. An operation that would have needed 24 steps in total (4 taps × 6 steps) is thus completed in 15 steps (4 + 3 + 5 + 3). As FIG. 5 shows, if the application is limited to digital filters, the address generation circuit 13 can be a counter that generates addresses cycling through the steps of the program (for example, 15).

The arithmetic processing of the embodiment described above is explained in more detail with reference to FIG. 6, for the case of 12-bit coefficients.

FIG. 6A shows the case of a coefficient with few "1" bits, (000010001000). With the second-order Booth algorithm, as in the conventional approach, the operation proceeds in 6 steps. In the first step, the control signal S selects the input (IN), that is, x_i, and the control signals P, Q, and R are 0, so 0 enters the accumulators 5, 6, and 7. P, Q, and R are 0 because the lower two bits of the coefficient are (00) and, with a 0 assumed below the least significant bit, the three bits examined are (000).

In the next step (cycle), the fourth, third, and second bits from the bottom of the coefficient are (100), so the control signals (P, Q, R) are (111). The control signal S for the shift registers 1 and 2 makes the selector 21 select the output ST2 from two stages back (written simply as 2 in FIG. 6), which shifts the data two bits to the left. Writing the input as X, the selectors 3 and 4 produce (−2X·2²). Since the accumulators 5, 6, and 7 initially held 0, they now hold (−2X·2²).

In the next step, the three bits of the coefficient are (001), so the control signals (P, Q, R) are (010) and +X is formed. The control signal S again makes the selector 21 select the output ST2 from two stages back, which, combined with the two-bit left shift of the previous step, gives a factor of 2⁴. (X·2⁴) is therefore supplied to the accumulators 5, 6, and 7, and their accumulation forms (−2X·2² + X·2⁴).

The same processing is then carried out for every two bits of the coefficient. With a 12-bit coefficient, the second-order Booth algorithm therefore always completes the multiplication in 6 steps.
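As a check on the six Booth steps for the coefficient (000010001000), the partial products can be written out in full. Only the first three terms are stated in the text; the last three are worked out here from the same recoding rule (bit triples (100), (001), and (000) at weights 2⁶, 2⁸, and 2¹⁰) and are not quoted from FIG. 6A.

```latex
\begin{aligned}
X \cdot Y &= 0 + (-2X \cdot 2^{2}) + (X \cdot 2^{4}) + (-2X \cdot 2^{6}) + (X \cdot 2^{8}) + 0 \\
          &= (-8 + 16 - 128 + 256)\,X \\
          &= 136\,X, \qquad 136 = 2^{7} + 2^{3} = (000010001000)_2 .
\end{aligned}
```

Two of the six steps contribute nothing to the sum, which is the inefficiency the improved processing removes.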

In FIG. 6A, a 3-step process that does not use the Booth algorithm is shown to the right of the 6-step process. It looks only at the bits of the coefficient that are set to "1" and selects the input +X shifted by the amount corresponding to each such position. A "0" bit in the coefficient would only accumulate 0, so that processing is omitted. As FIG. 6A shows, for a coefficient with few "1" bits, the number of steps is actually smaller when the Booth algorithm is not used.
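One plausible way to derive such a shortened step sequence is sketched below. It is an assumption-laden illustration, not the schedule of FIG. 6A: the first step is taken to be the parallel load, each later step shifts by at most four positions (the largest shift the register offers), and a shift-only step is inserted when the gap between "1" bits is wider than that. The names `shift_add_program` and `run_program` are hypothetical, and only non-negative coefficients are handled.

```python
def shift_add_program(coef, max_shift=4):
    """Build a list of (S, add) steps that visits only the '1' bits of coef."""
    steps = [("IN", coef & 1)]                 # load cycle; adds X only if bit 0 is set
    pos = 0                                    # cumulative left shift so far
    for bit in range(1, coef.bit_length()):
        if (coef >> bit) & 1:
            while bit - pos > max_shift:       # gap too wide: shift without adding
                steps.append((max_shift, 0))
                pos += max_shift
            steps.append((bit - pos, 1))       # shift to this bit and add +X
            pos = bit
    return steps


def run_program(steps, x):
    """Execute the steps on input x and return the accumulated product."""
    shifted, acc = 0, 0
    for s, add in steps:
        shifted = x if s == "IN" else shifted << s
        if add:
            acc += shifted
    return acc


program = shift_add_program(0b000010001000)
product = run_program(program, 7)
```

Under these assumptions the example coefficient needs three steps, against the six that the second-order Booth algorithm always takes for 12 bits.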

FIGS. 6B to 6F likewise show, for specific coefficients, the steps taken with the second-order Booth algorithm and the steps of the improved processing.

FIGS. 6B and 6C show coefficients with runs of consecutive "0"s or "1"s. Within a run of "0"s or "1"s, the control signals (P, Q, R) are (000) and the Booth selectors 3 and 4 select an output of 0. Omitting the steps that select this 0 data reduces the number of steps needed to multiply by the coefficient.

However, omitting the steps that select 0 data is not very effective for patterns in which "0" and "1" alternate, as shown in FIGS. 6D and 6E. In the example of FIG. 6D only one step is saved, and in the example of FIG. 6E the number of steps is not reduced at all.

Further, as shown in FIG. 6F, for a coefficient that mixes a pattern of alternating "1"s and "0"s with a run of "1"s (or "0"s), two steps can be saved. In practice, coefficients with a pattern like that of FIG. 6F, in which "0"s or "1"s run consecutively in the upper or lower bits, are widely used.

As the examples of FIG. 6 show, choosing filter coefficients that, within the permissible computation error, have few "1" bits or contain runs of "0"s or "1"s makes the number of processing steps much smaller than with Booth's algorithm. Fewer steps means the arithmetic circuit is used more efficiently, which brings higher speed and a smaller circuit scale.

The output of the accumulator may also be left in the two-output form of carry and sum. Moreover, the present invention is not limited to filter operations; it can be applied to operations such as the FFT and the cosine transform with the same advantages.

[Effect of the invention]

The present invention reduces the number of program steps needed to multiply by a coefficient and adds the partial products in an accumulator without carry propagation, so it operates far faster than a circuit containing the multiplier 31 of FIG. 8. It can therefore be applied to processing digital image data. In addition, because the product-sum operation is performed under program control, the circuit scale can be made smaller than in a configuration with multiple multipliers and multiple adder trees.

[Brief description of the drawings]

FIG. 1 is a block diagram of an embodiment of the present invention. FIG. 2 is a block diagram showing one bit of the shift register. FIG. 3 is a block diagram showing one bit of the Booth selector. FIG. 4 is a block diagram of an example of the control circuit. FIG. 5 is a schematic diagram showing an example of the program memory. FIG. 6 is a schematic diagram used to explain the operation of the embodiment. FIG. 7 is a block diagram of an example of a digital filter to which the present invention can be applied. FIG. 8 is a block diagram of an example of a conventional digital arithmetic circuit. FIG. 9 is a block diagram showing one bit of the accumulator.

Description of main reference numerals in the drawings: 1, 2: shift registers; 3, 4: Booth selectors; 5, 6, 7: accumulators; 8, 9: shift registers; 12: program memory.

Claims

(57) [Claims]

1. A digital arithmetic circuit comprising: a shift register into which input data can be loaded in bit-parallel form and which generates the input data shifted by a bit-shift amount selected from a plurality of shift amounts; a selector that multiplies the output of the shift register by a power of two and generates a positive or negative version of the output of the shift register; an accumulator that is supplied with the output of the selector and accumulates the output of the selector in a form divided into carry and sum; and means for generating control signals for the shift register and the selector in accordance with a program for generating partial products.