JPH03196712A

JPH03196712A - Digital arithmetic circuit

Info

Publication number: JPH03196712A
Application number: JP33731889A
Authority: JP
Inventors: Seiichiro Iwase; 岩瀬　清一郎
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1989-12-26
Filing date: 1989-12-26
Publication date: 1991-08-28
Anticipated expiration: 2014-03-03
Also published as: JP2864597B2

Abstract

PURPOSE: To obtain, by time-division operation, the sum of partial products whose number is half the bit count of the multiplier, using a single full adder with pipeline flip-flops (FFs) on its input and output sides. CONSTITUTION: A selector 13 is controlled by a 3-bit control signal generated for every 2 bits of the coefficient, and selectively outputs 0 times, ±1 times, or ±2 times the input data. The selector 13 receives two bits: the lower-order carry passed through a carry connection circuit 14, and the input data (a). Selecting the lower-order bit yields the doubled output. For a two's complement code, polarity inversion is achieved by inverting '0' and '1' and adding '1' to the least significant bit. The output of the selector 13 is fed to a full adder 16 via an FF 15. The two output bits of the full adder 16, a carry (c) and a sum (s), are output via an FF 17 and fed back to the input of the full adder 16, forming an accumulator that adds the partial products by time-division processing. The selector 13, the FFs 15 and 17, and the full adder 16 operate on a clock whose frequency is 4 times the sampling frequency.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Industrial Application]

The present invention relates to a digital arithmetic circuit applicable to product-sum operations such as those of a digital filter.

[Summary of the invention]

The invention of claim (1) comprises: a full adder supplied with a first input that is a binary number, a second input that is a carry, and a third input that is a sum; carry connection means, provided for the second input, for supplying the carry from the lower-order position to the full adder and passing a carry to the higher-order position; means for holding the carry and sum output from the full adder in synchronization with a clock; and a feedback path that supplies the carry from the holding means to the higher-order position via the carry connection means and feeds the sum output back to the full adder. This allows a configuration with a small number of gates.

The invention of claim (2) is a digital arithmetic circuit comprising at least first and second multipliers and an adder that adds the outputs of the first and second multipliers. The first and second multipliers each form partial products of the multiplicand and divided portions of the multiplier, and accumulate the partial products in a form split into a carry and a sum. The adder has a selector that sequentially selects the sum and carry of each of the first and second multipliers, and accumulates the output signals of the selector. This allows a product-sum circuit with a small number of gates.

[Conventional technology]

An n-tap FIR digital filter, with input sequence $x_i$, output sequence $y_i$, and impulse response $h_0$ to $h_{n-1}$, performs the operation

$$y_i = \sum_{k=0}^{n-1} h_k \, x_{i-k}$$

For digital filtering of audio signals, a configuration with one multiplier and one accumulator, in which the above operation is controlled by a program (a so-called DSP), is used. In real-time processing of image data, however, the sampling frequency is high and there is no time margin for time-division use of the multiplier and accumulator. Therefore, as shown in FIG. 5, a configuration in which circuits are laid out in direct correspondence with the above operation has been used. FIG. 5 is an example for n = 4. The configuration of FIG. 5 can be represented as in FIG. 6, that is, as a shift register section 31, a multiplication section 32, and an addition tree section 33.
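As a reference point for the circuits discussed in this document, the product-sum operation of an n-tap FIR filter can be sketched in software. This is only an illustrative model: the function name `fir_filter` and the convention that samples before the start of the sequence are zero are assumptions, not part of the patent.

```python
def fir_filter(x, h):
    """Direct-form FIR: y[i] = sum over k of h[k] * x[i - k].

    Samples before the start of x are treated as zero (an assumption).
    """
    n = len(h)
    y = []
    for i in range(len(x)):
        acc = 0
        for k in range(n):
            if i - k >= 0:
                acc += h[k] * x[i - k]  # one multiply-accumulate per tap
        y.append(acc)
    return y


# A unit impulse reproduces the impulse response of a 4-tap filter.
print(fir_filter([1, 0, 0, 0, 0], [1, 2, 3, 4]))
```

Each output sample needs n multiplications and n - 1 additions, which is the work that the multiplication section and addition tree section carry out in hardware.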

As the multipliers used in such a digital filter, parallel multipliers in which partial-product adder circuits are arranged in parallel are normally used. FIG. 7 shows an example of a parallel multiplier using the Booth multiplication algorithm, previously proposed by the present applicant (see Japanese Unexamined Patent Publication No. Sho 64-86270). FIG. 7 shows the circuit configuration for one bit only, and assumes a 10-bit multiplier (coefficient), so that five partial products are added.

One bit of the input data taken from a tap of the shift register section is supplied via a flip-flop 34 to selectors 35, 36, 37, 38, and 39 and to bit connection circuits 40, 41, 42, 43, and 44. The selectors 35 to 39 are supplied, from a Booth decoder (not shown), with 3-bit control signals formed for every 2 bits of the coefficient, that is, the multiplier. The decoder receives a total of 3 bits: the 2 bits of the coefficient under consideration and the 1 bit immediately below them.

Based on the second-order Booth algorithm, the selectors 35 to 39 select ±2 times, ±1 times, or 0 times the 1-bit input. The ±2-times data is obtained by selecting the lower-order bit through the bit connection circuits 40 to 44; that is, a 1-bit shift forms the doubled value.

Reference numerals 45, 46, 47, and 48 denote 1-bit full adders, each of which receives three inputs and outputs two bits, a carry C and a sum S. Carries from the lower order are supplied as the carry inputs of the full adders 46, 47, and 48 via carry connection circuits 49, 50, and 51, respectively. The carry connection circuits 49, 50, and 51 indicate that each outgoing carry is connected to the circuit of the bit plane above and that the carry from the circuit of the bit plane below is accepted.

In an ordinary parallel multiplier the carries are also added at the end. In the configuration of FIG. 7, however, on the premise that the carries will be added after the addition tree, the output is delivered from flip-flops 52 and 53 as a redundant binary number of two bits, carry and sum. In this case, as shown in FIG. 8, the addition tree section can use a configuration in which six inputs passed through input-side flip-flops 54 are added in sequence by four stages of full adders 55, 56, 57, and 58 and output through output-side flip-flops 59 and 60. FIG. 8 also shows the configuration for one bit only. As is clear from FIG. 8, it differs from the ordinary configuration in that every 1-bit datum is handled on two lines, carry C and sum S.

FIG. 8 is an example of adding the partial products of three taps of an FIR digital filter.

In the ordinary configuration, a 1-bit full adder receives inputs A and B and a carry input ci from the lower order, and produces an addition output S and a carry output co to the higher order, as shown in FIG. 9A. By contrast, the configuration of FIG. 8 treats intermediate results as redundant binary numbers and converts the redundant binary number to an ordinary binary number only after all operations are finished. That is, as shown in FIG. 9B, three inputs A, B, and C of the same bit position are added to give two outputs S1 and S2 of the same bit position. With this approach, many inputs of the same bit position can eventually be reduced to two lines per bit, but not to one. Therefore, as shown in FIG. 9C, a number of full adders must be connected in series to form an ordinary binary output of one line per bit.
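The redundant (carry and sum) representation described here can be illustrated with a small software model. The sketch below is not taken from the patent: the function names are hypothetical, Python integers stand in for the stacked bit planes, and the left shift by one models handing each carry to the next higher bit plane.

```python
def carry_save_add(a, b, c):
    """Add three numbers bit by bit without propagating carries.

    Every bit position acts as an independent full adder: the sum bit is
    the XOR of the three inputs and the carry bit is their majority,
    handed to the next higher position (modeled by the shift).
    """
    s = a ^ b ^ c
    carry = ((a & b) | (b & c) | (a & c)) << 1
    return carry, s


def reduce_to_two(values):
    """Fold any number of inputs down to one (carry, sum) pair."""
    carry, s = 0, 0
    for v in values:
        carry, s = carry_save_add(carry, s, v)
    return carry, s


carry, s = reduce_to_two([13, 7, 22, 5])
# Only this last step needs a carry-propagating adder.
print(carry + s)
```

The pair `(carry, s)` cannot be reduced further without propagating carries, which is why one conventional addition is still required at the very end.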

However, the configuration shown in FIG. 9C is a slow arithmetic circuit, because the carry propagates through many stages.

Carry lookahead and carry select are known methods for speeding up such an adder circuit. These methods, however, have the drawback of increasing the number of gates.

Providing such an adder circuit for every multiplier and every addition tree of the digital filter in FIG. 5 is therefore an obstacle to high speed. In the earlier application, each multiplier and each adder stops partway through the operation, with two lines per bit, and hands the result to the next operation; after all operations are finished, a speeded-up version of the circuit of FIG. 7 converts the redundant binary number to an ordinary binary number.

FIG. 8 shows one bit plane of the multiplication section; to form an n-bit multiplier, n such planes must be stacked as shown in FIG. 10.

For simplicity, FIG. 10 does not take into account the growth in word length caused by multiplication. FIG. 9 likewise shows one bit plane of the addition tree section, and n planes must be stacked as in FIG. 10 to handle n bits. In FIG. 10, MPY corresponds to the configuration of FIG. 7 and AT to that of FIG. 8. Also for simplicity, connection lines are drawn in FIG. 10 only for the plane of the most significant bit; the other bit planes are connected in the same way.

[Problem to be solved by the invention]

In the previously proposed configuration, as in FIGS. 7 and 8, many gate circuits such as full adders are sandwiched between one flip-flop and the next. In other words, many stages of gate circuits in series lie between pipeline registers. Such a circuit is inefficient, because each gate circuit is active for only a short time and idle for most of the clock cycle.

Unless this inefficiency is remedied, a high-speed arithmetic circuit for image signal processing becomes large, increasing power consumption and cost.

To resolve this inefficiency, the gate circuits should be placed between pipeline registers in units as small as possible; accordingly, as shown in FIG. 11, flip-flops 62 and 63 are provided on the input side and output side, respectively, of a full adder 61. However, although pipelining per full adder or per Booth selector as in FIG. 11 allows the clock frequency to be tripled, the added flip-flops increase the number of gates.

Accordingly, an object of the present invention is to provide an improved digital arithmetic circuit that has a small number of gates and in which the gates do not sit idle.

[Means to solve the problem]

The invention of claim (1) is a digital arithmetic circuit comprising: a full adder (16) supplied with a first input that is a binary number, a second input that is a carry, and a third input that is a sum; carry connection means (18), provided for the second input, for supplying the carry from the lower-order position to the full adder (16) and passing a carry to the higher-order position; means (17) for holding the carry and sum output from the full adder (16) in synchronization with a clock; and a feedback path that supplies the carry from the holding means (17) to the higher-order position via the carry connection means (18) and feeds the sum output back to the full adder (16).

The invention of claim (2) is a digital arithmetic circuit comprising at least first and second multipliers (8A, 8B) and an adder (9) that adds the outputs of the first and second multipliers (8A, 8B). The first and second multipliers (8A, 8B) each form partial products of the multiplicand and divided portions of the multiplier, and accumulate the partial products in a form split into a carry and a sum. The adder (9) has a selector (20) that sequentially selects the sum and carry of each of the first and second multipliers (8A, 8B), and accumulates the output signals of the selector (20).

[Effect]

In the invention of claim (1), a single full adder 16 and pipeline flip-flops 15 and 17 on its input and output sides yield, by time-division operation, the sum of partial products whose number is half the bit count of the multiplier.

In the invention of claim (2), when a product-sum operation such as that of a digital filter is performed, the multipliers 8A and 8B and the addition tree 9 all process data in the form of two signals, carry and sum. The addition tree 9 selects the four inputs from the multipliers 8A and 8B in turn with the selector 20 and accumulates them.

Therefore, the number of gates can be reduced, and circuits such as the full adders are prevented from sitting idle.

[Example]

An embodiment in which the present invention is applied to a 4-tap FIR digital filter is described below with reference to the drawings.

FIG. 1 shows the overall configuration of this embodiment. Each sample of the input data is, for example, 8 bits in parallel in two's complement code. FIG. 1, however, shows the configuration for one bit plane only.

In FIG. 1, reference numerals 1, 2, 3, and 4 denote unit delay elements, for example flip-flops, each having a delay equal to the sampling period of the input data. The output data a of flip-flop 1 (first tap) is supplied to a multiplier 8A. The output data of flip-flop 2 (second tap) and of flip-flop 3 (third tap) are supplied to delay circuits 5 and 6, respectively, each having a delay of 2τ (τ: clock period), and the output data b′ and c′ of the delay circuits 5 and 6 are supplied to multipliers 8B and 8C, respectively. The output data d of flip-flop 4 (fourth tap) is supplied to a delay circuit 7 having a delay of 4τ, and the output data d′ of the delay circuit 7 is supplied to a multiplier 8D.

The multipliers 8A to 8D multiply the coefficients by the data of the respective taps using the second-order Booth algorithm. That is, when the multiplication X × Y (X: multiplicand (data), Y: multiplier (coefficient)) is performed, one of the operations 0, +X, −X, +2X, or −2X is carried out according to the pattern of successive bits of the multiplier. Accordingly, the Booth selector provided in each of the multipliers 8A to 8D is supplied with a control signal formed by feeding three successive bits of the coefficient to a Booth decoder. These values 0, ±X, and ±2X are called partial products.
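The second-order (radix-4) Booth recoding referred to here can be sketched as follows. This is a generic textbook formulation, not the patent's decoder circuit; the function names and the list-of-digits output format are assumptions made for illustration.

```python
def booth_radix4_digits(y, bits):
    """Recode a two's complement multiplier y of even width `bits`.

    Returns one digit in {-2, -1, 0, +1, +2} per 2-bit group, least
    significant group first. Each digit depends on three successive bits:
    the two bits of the group and the bit just below it.
    """
    assert bits % 2 == 0
    u = y & ((1 << bits) - 1)      # view y as an unsigned bit pattern
    digits = []
    prev = 0                        # implicit 0 below the least significant bit
    for i in range(0, bits, 2):
        b0 = (u >> i) & 1
        b1 = (u >> (i + 1)) & 1
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits


def booth_multiply(x, y, bits):
    """Sum the partial products d * X, each offset by 2 bits from the last."""
    return sum(d * x * 4 ** i for i, d in enumerate(booth_radix4_digits(y, bits)))


print(booth_radix4_digits(-3, 8))   # digits for an 8-bit coefficient
print(booth_multiply(7, -3, 8))
```

An 8-bit coefficient yields four digits, so only four partial products need to be added, half the number of coefficient bits.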

The multipliers 8A to 8D also accumulate, one per clock cycle, a number of partial products equal to half the coefficient word length for the input data from the shift register section formed by flip-flops 1, 2, 3, and 4. Successive partial products must be offset from one another by 2 bits. The output of the shift register section is therefore not only shifted to the right every four clock cycles but also shifted by 2 bits every clock cycle. This shift can be implemented, for example, by providing a selector on the input side of each of the multipliers 8A to 8D, or by providing a shift register that stores the input of each of the multipliers 8A to 8D and can shift it toward higher bit positions.

The output e of the multiplier 8A and the output f of the multiplier 8B are supplied to an addition tree 9A. The output g of the multiplier 8C and the output h of the multiplier 8D are supplied to an addition tree 9B. The outputs i and j of the addition trees 9A and 9B are supplied to an addition tree 10. Each of the addition trees 9A, 9B, and 10 accumulates 4 bits, that is, two pairs of carry and sum.

The output k (carry and sum) of the addition tree 10 is supplied to flip-flops 11 and 12, from which an output l is obtained. Although not shown, this output l is a redundant binary number and is converted by an accumulator configuration into an ordinary binary number with one line per bit.

FIG. 2 is a timing chart showing the operation of the circuit of FIG. 1. The operating clock of the multipliers 8A to 8D and the addition trees 9A, 9B, and 10 has a frequency four times the sampling frequency of the input data. That is, with T the sampling period of the input data and τ the clock period, T = 4τ. The symbols x1, x2, ... each denote one bit at the same bit position (MSB, LSB, and so on) of the parallel input data.

Relative to the output data a of flip-flop 1, the output data b of flip-flop 2, c of flip-flop 3, and d of flip-flop 4 are delayed by T, 2T, and 3T, respectively.

The output data b′ of the delay circuit 5 is delayed by 2τ relative to b, the output data c′ of the delay circuit 6 by 2τ relative to c, and the output data d′ of the delay circuit 7 by 4τ relative to d.

In the output e of the multiplier 8A, the output f of the multiplier 8B, the output g of the multiplier 8C, and the output h of the multiplier 8D, tn denotes the timing at which the product of input data xn and the coefficient is obtained. For example, in the output e of the multiplier 8A, t4 is the timing at which the product of x4 and the first-tap coefficient is obtained.

In the output i of the addition tree 9A and the output j of the addition tree 9B, tmn denotes the timing at which the sum of the product of a coefficient and xm and the product of a coefficient and xn is obtained. For example, in the output i of the addition tree 9A, t54 is the timing at which the sum of the product of x5 and the first-tap coefficient (obtained at timing t5 in the multiplication output e) and the product of x4 and the second-tap coefficient (obtained at timing t4 in the multiplication output f) is obtained.

Further, in the output k of the addition tree 10, tmnop denotes the timing at which the sum of the sums obtained at tmn and top is obtained. For example, t4321 is the timing at which the sum of the addition output produced at timing t43 in the output i of the addition tree 9A and the addition output produced at timing t21 in the output j of the addition tree 9B is obtained. The flip-flops 11 and 12 sample the output of the addition tree 10 with a clock at the original sampling frequency, and the filter output l is obtained from the flip-flops 11 and 12.

The multiplier 8A described above has the configuration shown in FIG. 3.

In FIG. 3, reference numeral 13 denotes a Booth selector, which is controlled by a 3-bit control signal obtained by supplying the coefficient for the first tap to a Booth decoder. According to the control signal for each 2 bits of the coefficient, the selector 13 selectively outputs 0 times, ±1 times, or ±2 times the input data. The selector 13 is supplied with two bits: the carry from the lower order, passed through a carry connection circuit 14, and the input data a. Selecting the lower-order carry from the carry connection circuit 14 means that the doubled output is produced. For a two's complement code, polarity inversion is achieved by inverting "0" and "1" and adding "1" to the least significant bit.
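The two selector operations mentioned here, doubling by taking the bit from one position lower and negation by inverting every bit and adding 1 at the least significant bit, can be checked with a short sketch. The function names and the fixed word width `bits` are assumptions for illustration only.

```python
def negate_twos_complement(value, bits):
    """Invert every bit of a `bits`-wide word, then add 1 at the LSB."""
    mask = (1 << bits) - 1
    return ((value ^ mask) + 1) & mask


def double_by_shift(value, bits):
    """Each output bit takes the input bit one position lower (1-bit shift)."""
    mask = (1 << bits) - 1
    return (value << 1) & mask


# 5 is 0b00000101; its negation in 8 bits is 0b11111011 (251, i.e. -5).
print(negate_twos_complement(5, 8))
print(double_by_shift(5, 8))
```

Combining the two gives the −2X case: shift first, then invert and add 1.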

The output of the selector 13 is supplied to a full adder 16 via a flip-flop 15. The two bits output by the full adder 16, carry C and sum S, are output via a flip-flop 17 and also fed back to the input side of the full adder 16.

This feedback path forms an accumulator, with which the partial products are added by time-division processing. As stated above, the selector 13, the flip-flops 15 and 17, and the full adder 16 operate on a clock at four times the sampling frequency.

When one bit of the input data a, for example x4, is supplied, the selector 13 produces one partial product for every 2 bits of the first-tap coefficient (8 bits). The four partial products in total are accumulated by the full adder 16 over four clock periods, and the product of x4 and the coefficient is obtained at the timing t4 in the output e of the multiplier 8A in FIG. 2. Before each accumulation, the accumulator is initialized either by clearing the flip-flop 17 or by an AND gate inserted in the feedback path. The carry generated during this accumulation is supplied via a carry connection circuit 18 to the full adder of the next higher bit position, and the carry of the next lower bit position is supplied via the carry connection circuit 18 to the full adder 16.
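The time-division accumulation of four partial products can be modeled in software as a rough behavioral sketch. It is not a gate-level description: Python integers stand in for all bit planes at once, the left shift models the carry connection circuit passing each carry to the next higher bit plane, and the function name and the precomputed Booth digit list are assumptions.

```python
def multiply_time_division(x, digits):
    """Accumulate one Booth partial product per fast clock cycle.

    `digits` holds the Booth digits (0, +/-1, +/-2) of the coefficient,
    least significant 2-bit group first. The result stays in redundant
    form as a (carry, sum) pair.
    """
    carry, s = 0, 0                       # accumulator cleared before each product
    for cycle, d in enumerate(digits):
        partial = (d * x) << (2 * cycle)  # selector output, offset 2 bits per cycle
        new_sum = carry ^ s ^ partial     # full adder sum bits
        new_carry = ((carry & s) | (s & partial) | (carry & partial)) << 1
        carry, s = new_carry, new_sum     # held in the output register
    return carry, s


# Coefficient -3 as 8-bit Booth digits is [1, -1, 0, 0]; data value is 7.
carry, s = multiply_time_division(7, [1, -1, 0, 0])
print(carry, s, carry + s)
```

One full adder per bit plane is reused four times per sample, which is the trade the text describes: a faster clock in exchange for fewer gates.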

The multiplier 8A thus performs the accumulation with each bit of input represented by two bits, carry C and sum S. The multipliers 8B, 8C, and 8D have the same configuration as the multiplier 8A shown in FIG. 3.

FIG. 4 shows the addition tree 9A in detail. As described above, the multipliers 8A and 8B produce 2-bit outputs e and f, each consisting of a carry and a sum, so a configuration that can add these 2-bit values is required.

In FIG. 4, reference numeral 20 denotes a selector for switching among the inputs e and f from the multipliers 8A and 8B (four inputs in total). Because the delay circuit 5 is inserted, there is a delay of two clock periods between the inputs e and f to be added. Carry connection circuits 21 and 22 are inserted for the carries of these inputs. Flip-flops 23 and 24 are inserted for the sum inputs only, so that in each of the inputs e and f the sum reaches the selector 20 one clock period after the carry. Consequently, the carry of input e, the sum of input e, the carry of input f, and the sum of input f are supplied to the selector 20 in that order over four clock periods, and the selector 20 selects and outputs these bits one at a time in the same order.

The input selected by the selector 20 is supplied to a flip-flop 25. The flip-flop 25, a full adder 26, and a flip-flop 27 form an accumulator like that in the multipliers 8A to 8D, which accumulates the four inputs passed through the flip-flop 25. The output i of the addition tree 9A is therefore obtained from the flip-flop 27 (4 + 1 + 1 = 6) clock periods after the input e is supplied. For example, in FIG. 2, the timing t43, six clock periods after the timing t4 of the input e, is the timing at which the output i of the addition tree 9A is obtained.
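How three such addition trees combine the four redundant products of a 4-tap filter can be sketched behaviorally. The sketch ignores the clock-level timing and delay circuits entirely; the function names and the example (carry, sum) pairs are hypothetical, and the final ordinary addition stands in for the conversion stage that the document does not illustrate.

```python
def carry_save_step(carry, s, value):
    """One pass through the full adder: fold `value` into (carry, sum)."""
    return ((carry & s) | (s & value) | (carry & value)) << 1, carry ^ s ^ value


def addition_tree(first, second):
    """Accumulate two (carry, sum) pairs in the order the selector picks them."""
    carry, s = 0, 0
    for value in (first[0], first[1], second[0], second[1]):
        carry, s = carry_save_step(carry, s, value)
    return carry, s


def filter_output(e, f, g, h):
    i = addition_tree(e, f)   # corresponds to addition tree 9A
    j = addition_tree(g, h)   # corresponds to addition tree 9B
    k = addition_tree(i, j)   # corresponds to addition tree 10
    return k[0] + k[1]        # single carry-propagating addition at the end


# Four redundant products whose values are 13, 9, -3, and -7.
print(filter_output((8, 5), (6, 3), (-4, 1), (0, -7)))
```

Every stage before the last line works only on carry and sum pairs, so no carry ever ripples across bit positions inside a tree.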

The addition trees 9B and 10 have the same configuration as in FIG. 4. The addition can also be performed with the single addition tree 10 alone, without the addition trees 9A and 9B. In that case, however, the selector selects one of eight inputs in turn and the number of repeated additions rises to eight, so the circuit must operate at twice the speed of the configuration of FIG. 4.

The number of filter taps, the data word length, the word length of the accumulator, and so on are not limited to those of the embodiment above. In particular, an accumulator with a 1-bit word length is the fastest, but the word length may be extended to n bits. Furthermore, the present invention is applicable not only to digital filters but also to other product-sum operations such as the FFT and the cosine transform.

[Effect of the invention]

In the present invention, the accumulator consisting of a full adder and a feedback path has a pipelined configuration, so that addition or multiplication can be performed with a small number of gates and the gates are prevented from sitting idle. When a product-sum operation such as a filter operation is performed, the processing in the multipliers and addition trees can be carried out in redundant binary form, so carry-propagating addition is not needed in each multiplier and addition tree, and a drop in operating speed is avoided. Moreover, because the invention uses many circuits of identical configuration, such as the accumulators, it is well suited to IC implementation.

[Brief explanation of drawings]

FIG. 1 is an overall block diagram of an embodiment of the present invention; FIG. 2 is a timing chart showing the operation of the embodiment; FIG. 3 is a block diagram showing an example configuration of a multiplier; FIG. 4 is a block diagram showing an example configuration of an addition tree; FIGS. 5 and 6 are block diagrams used to explain an FIR digital filter; FIGS. 7 and 8 are block diagrams showing the previously proposed multiplier and addition tree, respectively; FIG. 9 is a schematic diagram for explaining addition processing; FIG. 10 is a schematic diagram showing the connections between bit planes; and FIG. 11 is a block diagram showing a configuration pipelined per full adder.

Explanation of main reference numerals in the drawings

8A to 8D: multipliers; 9A, 9B, 10: addition trees; 13: Booth selector; 16, 26: full adders; 20: selector.

Claims

[Claims]

(1) A digital arithmetic circuit comprising: a full adder supplied with a first input that is a binary number, a second input that is a carry, and a third input that is a sum; carry connection means, provided for the second input, for supplying a carry from the lower-order position to the full adder and passing a carry to the higher-order position; means for holding the carry and sum output from the full adder in synchronization with a clock; and a feedback path for supplying the carry from the holding means to the higher-order position via the carry connection means and feeding the sum output back to the full adder.

(2) A digital arithmetic circuit comprising at least first and second multipliers and an adder that adds the outputs of the first and second multipliers, wherein the first and second multipliers are each configured to form partial products of the multiplicand and divided portions of the multiplier and to accumulate the partial products in a form split into a carry and a sum, and the adder has a selector that sequentially selects the sum and carry of each of the first and second multipliers and is configured to accumulate the output signals of the selector.