JPH04229323A

JPH04229323A - Arithmetic circuit

Info

Publication number: JPH04229323A
Application number: JP2414814A
Authority: JP
Inventors: Seiichiro Iwase; 岩瀬　清一郎
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1990-12-27
Filing date: 1990-12-27
Publication date: 1992-08-18

Abstract

PURPOSE: To reduce gate count and achieve high-speed operation while keeping a wide range of functions, and to keep the circuit from becoming complex, by extending the second-order Booth algorithm and applying it in 4-bit units, by skipping the partial-product operation in the next arithmetic cycle when that operation is judged unnecessary so that the number of arithmetic cycles decreases, and thus by generating the partial products efficiently.

CONSTITUTION: The selector of circuit block BI is a 4-input selector which, under control signals SFT11 and SFT10, can take in the data bit P1(i), hold the output R(i), select R(i-4) to shift left by four bits, or select R(i-8) to shift left by eight bits. The selector 35 of circuit block BJ is a 4-input selector which, under control signals SFT21 and SFT20, selects one bit from among the outputs R(i), R(i-1), R(i-2), and R(i-3), so that the partial product can be taken as 1, 2, 4, or 8 times the multiplicand.
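
As a rough illustration of the scheme summarized in the abstract, the following Python sketch recodes a two's-complement multiplier in 4-bit units, splits each resulting digit into at most two signed power-of-two multiples of the multiplicand (the 1, 2, 4, 8 choices of selector 35), and skips digits that are zero. The function names, the digit formula, and the particular two-term splits in `TERMS` are assumptions made for illustration; the patent defines the actual mapping in its FIG. 6 and FIG. 7, which are not reproduced in this text. The three 12-bit coefficients are the ones used in the worked example of the description.

```python
# Hypothetical sketch: second-order Booth recoding applied in 4-bit units,
# with zero partial products skipped. Names are illustrative, not from the patent.

# Assumed split of each digit magnitude into at most two power-of-two terms.
TERMS = {0: [], 1: [1], 2: [2], 3: [4, -1], 4: [4],
         5: [4, 1], 6: [8, -2], 7: [8, -1], 8: [8]}


def nibble_digits(multiplier: int, bits: int = 12) -> list[int]:
    """Recode a two's-complement multiplier into one signed digit per nibble.

    Each digit is derived from five bits: the nibble R(3)..R(0) plus R(-1),
    the MSB of the nibble below (0 for the lowest nibble).
    """
    assert bits % 4 == 0
    m = multiplier & ((1 << bits) - 1)
    digits, below = [], 0
    for k in range(0, bits, 4):
        b0, b1, b2, b3 = ((m >> (k + j)) & 1 for j in range(4))
        digits.append(-8 * b3 + 4 * b2 + 2 * b1 + b0 + below)
        below = b3
    return digits


def multiply(x: int, multiplier: int, bits: int = 12) -> tuple[int, int]:
    """Return (product, cycles); one cycle per non-zero power-of-two term."""
    acc, cycles = 0, 0
    for k, d in enumerate(nibble_digits(multiplier, bits)):
        sign = 1 if d >= 0 else -1
        for t in TERMS[abs(d)]:
            acc += sign * t * (x << (4 * k))  # shifted, possibly negated multiple
            cycles += 1
    return acc, cycles


H1, H2, H3 = 0b0000_0100_0010, 0b0010_0000_0101, 0b1111_1110_1100
cycle_counts = [multiply(7, h)[1] for h in (H1, H2, H3)]
print(cycle_counts, sum(cycle_counts))  # [2, 3, 2] 7
```

Under these assumptions the three coefficients take 2, 3, and 2 cycles, 7 in total, which agrees with the cycle count stated in the description's worked example. The sketch does not count the cycle that loads each coefficient.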

Description

【発明の詳細な説明】【０００１】【産業上の利用分野】この発明は、演算回路、特に、ビ
デオ信号処理に好適な演算回路に関する。【０００２】【従来の技術】従来、高速を要求される演算回路、例え
ば、積和演算回路には、並列乗算回路が使用されていた
。しかしながら、フルアダー（全加算器）の多段接続は
、キャリーの伝搬遅延の問題により高速性に限界があっ
たり、または、或るゲートが論理演算を行って出力を決
定しようとしている時に、直列に配置された次段のゲー
トでは出力を待っていてゲートに遊休時間が発生すると
いったような問題があった。【０００３】そこで、高速性を要求される画像のリアル
タイム信号処理用の演算回路では、信号をビット単位に
分割し時分割多重的に処理する技術が、例えば、本願出
願人の提案にかかる特願平１−２８４３９８号明細書、
特願平１−３３７３１９号明細書に開示されている。【０００４】従来から直並列乗算回路或いはパイプライ
ン乗算回路が知られているが、これらは基本的にはビッ
トシリアルの乗算回路であり演算結果をビットシリアル
で出力するので、乗算結果を次々と累加算する積和演算
回路では、図１１或いは図１４に示される回路構成が必
要になる。【０００５】以下、図１１に示される積和演算回路につ
いて説明する。図１１の構成に於いて、ビット長がＢ１
とされている被乗数Ｘのデータがレジスタ１０１　を介
してバレルシフタ１０２　に供給される。【０００６】一方、乗数Ｐ２が図示せぬデコーダに供給
され、このデコーダにて以下に示す制御信号ＳＦＴ　、
制御信号ＭＯＤ　が形成される。そして、制御信号ＳＦ
Ｔ　がバレルシフタ１０２　に供給され、制御信号ＭＯ
Ｄ　が論理回路１０３　に供給される。【０００７】バレルシフタ１０２　は、制御信号ＳＦＴ
　によって算術的に被乗数Ｘを任意ビット数分、シフト
させる論理回路である。この結果、バレルシフタ１０２
　から出力されるデータのビット長Ｂ２は、被乗数Ｘの
ビット長Ｂ１の２倍以上長いものとされる。そして、こ
のバレルシフタ１０２　からビット長Ｂ２のデータが論
理回路１０３　に供給される。【０００８】このバレルシフタ１０２　の具体的な構成
が図１２に示されている。図１２に示されるバレルシフ
タ１０２　は１ビット分の構成であり、これと同様の構
成のバレルシフタ１０２　がビット長Ｂ２分、備えられ
ている。つまり、１ビット分のバレルシフタ１０２　に
対するビット長Ｂ１の被乗数Ｘは、各バレルシフタ１０
２　の全てに共通に接続されている。【０００９】このバレルシフタ１０２　には、（ａ１＋
Ｂ１＋ａ２）ビットのデータが供給されるようになされ
ており、このバレルシフタ１０２では、（ａ１＋Ｂ１＋
ａ２）ビットのデータの内から、上述の制御信号ＳＦＴ
　に基づいて、１ビットデータが選択され、論理回路１
０３　に供給される。従って、このバレルシフタ１０２
　はセレクタとしての機能を果たしている。【００１０】上述のデータａ１は被乗数Ｘのビット長Ｂ
１を拡張するもので、被乗数ＸのＭＳＢの符号ビットを
拡張して左シフトしデータを小さくするための上位ビッ
トを作る。また、データａ２は“０”を入力し右シフト
してデータを大きくするための下位ビットを作る。【００１１】論理回路１０３　は、例えば、ブースのア
ルゴリズムを実現するための部分積生成回路であり、こ
の論理回路１０３　は制御信号ＭＯＤによって制御され
る。この論理回路１０３　の一としての論理回路１０３
１が図１３に示され、同じく他の例としての論理回路１
０３２が図１４図に示されている。図１３図に示される
論理回路１０３１に於いて、単に乗数Ｐ２のＬＳＢ側か
ら順に１ビットずつ乗算するのであれば、各論理回路１
０３１は、図１２に示されるようにアンドゲート１０４
　を用いればよい。しかしながら、乗数Ｐ２のＬＳＢ側
から順次、２ビット毎に２次のブースのアルゴリズムに
基づいて部分積を生成する場合には、各ビット毎に図１
４に示されるような回路構成が必要になる。尚、１０５
　は端子、Ｘ（ｉ）　は被乗数Ｘ中の第ｉビットのデー
タである。【００１２】図１４に示される論理回路１０３２は１ビ
ット分の構成を示すものであり、これと同様の構成がＭ
ＳＢからＬＳＢにかけてビット長Ｂ２分設けられている
。【００１３】この論理回路１０３２では、被乗数Ｘ中の
第ｉビットのデータビットＸ（ｉ）　が端子１０６　　
　を介してセレクタ１０７　と、図示せぬ上位ビット、
即ち、第（ｉ＋１）　ビットの回路のセレクタに供給さ
れる。また、端子１０８　を介して、図示せぬ下位ビッ
ト、即ち、第（ｉ−１）　ビットの回路から供給される
データビットＸ（ｉ−１）　が、セレクタ１０７　に供
給される。【００１４】セレクタ１０７　では、制御信号ＭＯＤ　
に基づいて論理回路１０３２へのデータビットＸ（ｉ）
　或いはデータビットＸ（ｉ−１）　の何れか一方が選
択される。この場合、データビットＸ（ｉ）　を選択す
ることはそのまま通すことであり、データビットＸ（ｉ
−１）　を選択することは１ビット左シフトして２倍す
ることになる。そして、イクスクルーシブオアゲート１
０８　ではセレクタ１０７　からの出力をそのまま通す
か或いは反転するかして最後にアンドゲート１０９　で
イクスクルーシブオアゲート１０８　からの出力をその
まま通すか或いは“零”とするかが選択される。尚、上
述のイクスクルーシブオアゲート１０８　にて、セレク
タ１０７　からの出力を反転した時には、論理回路１０
３２の次段に配されている加算回路１１１　のＬＳＢの
キャリー入力に“１”を与えて単なる２進数の反転では
なく数値の符号反転にする。論理回路１０３　によって
、生成された部分積が加算回路１１１　に供給される。　　【００１５】加算回路１１１　及びレジスタ１１２
　によって部分積の累加算がなされる。加算回路１１１
　では、順次、供給される部分積が、レジスタ１１２　
からフィードバックされる累加算値と加算され、その加
算出力ＳＵＭ　がレジスタ１１２　に供給される。【００１６】レジスタ１１２　では、加算回路１１１　
から供給される加算値を、加算出力ＳＵＭ　として保持
すると共に、加算回路１１１　にフィードバックしてい
る。尚、このレジスタ１１２　は、演算に先立って最初
の被乗数Ｘの部分積をレジスタ１１２　に格納する際に
、クリヤ信号ＣＬによってレジスタ１１２　の内容をク
リヤし、最後の演算サイクルが終了した段階で次段の回
路に上述の加算出力ＳＵＭ　を供給する。このように、
図１１に示される積和演算回路では、乗数Ｐ２の語長に
応じたサイクル数だけ時間がかかる。【００１７】次いで、図１５に示される積和演算回路に
ついて説明する。尚、図１５に於いて、図１１に示され
る構成と共通する部分には同一符号を付し、重複する説
明を省略する。図１５の構成に於いて、ビット長がＢ１
とされている被乗数Ｘのデータがセレクタ１２１　に供
給される。また、このセレクタ１２１　にはレジスタ１
２２　からビット長がＢ２ビットとされているデータが
フィードバックされている。一方、乗数Ｐ２が図示せぬ
デコーダに供給され、このデコーダにて以下に示す制御
信号ＳＦＴ　、制御信号ＭＯＤ　が形成される。そして
、制御信号ＳＦＴ　がセレクタ１２１　に供給され、制
御信号ＭＯＤ　が論理回路１０３　に供給される。【００１８】このセレクタ１２１　の具体的な構成が図
１６に示されている。図１６に示されるセレクタ１２１
　は１ビット分の構成であり、これと同様の構成のセレ
クタ１２１　がビット長Ｂ２分、備えられている。つま
り、１ビット分のセレクタ１２１　に対してビット長Ｂ
２のデータがレジスタ１２２　からフィードバックされ
て供給されており、このビット長Ｂ２のデータは、各セ
レクタ１２１　の全てに共通に接続されている。【００１９】上述のセレクタ１２１　とレジスタ１２２
　とは、シフトレジスタのように動作する。即ち、最初
の演算サイクルに於いて、図１６のセレクタ１２１　で
は、供給される被乗数Ｘを選択してレジスタ１２２　に
取込む。この時、セレクタ１２１　では図１６に於いて
データビットＸ（ｉ）　が選択される。そして、次のサ
イクルでは、乗数Ｐ２のＬＳＢから順次、部分積を生成
するためにセレクタ１２１　とレジスタ１２２　は恰も
シフトレジスタの如く動作する。【００２０】つまり、図１３に示されるような乗算がな
されるのであれば、レジスタ１２２　に取込んだデータ
を１ビット左シフトするために図１６に示される各ビッ
トのセレクタ１２１　は、当該ビットのセレクタ１２１
　の１ビット下位の第（ｉ−１）　ビットのレジスタか
らの出力を選択する。そして、次の演算サイクルからは
前の演算サイクルで選択したビットより１ビット下位の
レジスタの出力を選択する。【００２１】また、図１４に示されるような乗算がなさ
れるのであれば、レジスタ１２２　にロードしたデータ
を２ビット左シフトするために、図１６に示される各ビ
ットのセレクタ１２１　は、第ｉビットより２ビット下
位のレジスタからの出力を選択する。そして、次の演算
サイクルからは前の演算サイクルで選択したビットより
２ビット下位のレジスタの出力を選択する。レジスタ１
２２　では、セレクタ１２１　から供給されるビット長
Ｂ２のデータを論理回路１０３　に保持すると共に、セ
レクタ１２１　にフィードバックしている。【００２２】図１２に示されるセレクタとしてのバレル
シフタ１０２　、図１５に示されるセレクタ１２１　は
、多入力セレクタとされているが、これは次のような理
由による。例えば、〔００００１０００１０００〕のよ
うに“０”の多いデータ、〔１１１００００１１１１０
〕のように“０”或いは“１”の連続の多いデータの場
合には、部分積を生成するための演算をスキップでき高
速化を図ることができる。このようなビットスキップの
ために任意ビットのビットシフトが必要とされる。但し
、〔１０１０１１０１０１０１〕のように“０”と“１
”が交互に現れるようなデータでは演算をスキップ出来
ない。【００２３】【発明が解決しようとする課題】上述したような従来技
術では、いずれの場合でも多入力セレクタが必要なため
にゲート数が多くなり、回路規模が大きくなり回路が複
雑化してしまうという問題点があった。また、不要な演
算をスキップすることなく、全てのビットに於いて演算
を行っていたので、効率が良くないという問題点があっ
た。従って、この発明の目的は、ゲート数、演算サイク
ル数等の点でより効率の良い演算回路を提供することに
ある。【００２４】【課題を解決するための手段】請求項１にかかる演算回
路は、複数のビットから構成されるデータの乗算結果を
、部分積を形成して求める演算回路に於いて、データに
対し、２次ブースのアルゴリズムを４ビット単位で適用
し、データを４ビット単位でシフトして各演算サイクル
で部分積を４ビット単位で生成すると共に、データを先
見することで次の演算サイクルで生成されるべき部分積
が零になると判断される時は次の部分積の演算をスキッ
プする構成としている。【００２５】請求項２にかかる演算回路は、各種制御信
号を形成する制御信号発生回路と、データの１ビット毎
に処理を行なう回路コンポーネントと、回路コンポーネ
ントの出力を累加算する累加算器とを備え、回路コンポ
ーネントは、下位ビットの回路コンポーネントから供給
され所定ビット間隔を有する複数のビットデータ或いは
演算用のビットデータの内から、データのシフト量を規
定するための任意のビットデータを選択する第１の選択
手段と、第１の選択手段から供給される１ビットデータ
と、当該回路コンポーネントより下位側に連続し第１の
選択手段と同一ビット数で構成される１ビットデータの
内から、データのシフト量を規定するための任意の１ビ
ットデータを選択する第２の選択手段と、第２の選択手
段からの出力に基づいて加算出力とキャリー出力とを形
成し、当該回路コンポーネントのキャリー出力を上位ビ
ットの回路コンポーネントに供給すると共に、下位ビッ
トの回路コンポーネントからのキャリー出力と加算出力
とを当該回路コンポーネントの出力として形成するアキ
ュムレータと、上位ビットの回路コンポーネントから供
給される加算出力及びキャリー出力と、アキュムレータ
から供給される加算出力及びキャリー出力との何れか一
方を選択して保持する共に、該保持された値を下位ビッ
トの回路コンポーネントに供給する出力回路とを備えた
構成としている。【００２６】【作用】請求項１にかかる演算回路は、複数のビットか
ら構成されるデータに対し、２次ブースのアルゴリズム
を４ビット単位で適用し、上述のデータを４ビット単位
でシフトして各演算サイクルで部分積を４ビット単位で
生成する。この時、次のデータを先見することで、演算
サイクルの部分積が零になると判断される時は次の演算
サイクルに於ける部分積の演算をスキップする。【００２７】請求項２にかかる演算回路では、回路コン
ポーネントの第１の選択手段にてブースのアルゴリズム
を適用するに際して必要なデータのシフト量が制御信号
に基づいて規定され、このシフト量に対応する任意のビ
ットデータがセレクタによって選択される。第２の選択
手段では上述のブースのアルゴリズムにて規定されてい
る倍数を得るに必要なデータのシフト量が制御信号に基
づいて規定され、このシフト量に対応する任意のビット
データがセレクタによって選択される。アキュムレータ
では第２の選択手段からの出力に基づいて当該回路コン
ポーネントの部分積の累加算がなされ、当該回路コンポ
ーネントの加算出力とキャリー出力とが形成される。各
回路コンポーネントからの出力が、累加算器によって、
順次、加算されて、乗算結果が得られる。【００２８】【実施例】以下、この発明の一実施例について図１乃至
図１０を参照して説明する。第１図に示される演算回路
は、Ｎ個の回路コンポーネントＳ（Ｎ−１）　〜Ｓ（０
）　と、制御信号発生回路１０と、累加算器２０とから
主に構成されている。【００２９】制御信号発生回路１０は、制御信号ＳＦＴ
１１　、ＳＦＴ１０　、ＳＦＴ２１　、ＳＦＴ２０　、
制御信号ＭＯＤ−０、ＭＯＤ−１、クロック信号ＣＬＫ
　、クリヤ信号ＣＬ、ロード信号ＬＤ等を生成すると共
に、これらの信号を上述の各回路コンポーネントＳ（Ｎ
−１）　〜Ｓ（０）　に供給して各回路コンポーネント
Ｓ（Ｎ−１）　〜Ｓ（０）　の回路動作を制御している
。【００３０】回路コンポーネントＳ（Ｎ−１）　〜Ｓ（
０）　は、従来技術を示す図１１或いは図１５中のビッ
ト長Ｂ２に対応する数、即ち、Ｎ（Ｎ＝Ｂ２）だけ設け
られている。ここで、端子ＡＴ（Ｎ−１）　〜ＡＴ（０
）を介して供給される被乗数Ｘのビット長は、本来、図
１１或いは図１５に示されるＢ１であるため、上述のビ
ット長Ｂ２のデータは、ビット長Ｂ１のＭＳＢを符号拡
張して形成されている。【００３１】例えば、回路コンポーネントＳ（Ｎ−１）
　〜Ｓ（０）　に於ける回路コンポーネントＳ（ｉ）　
について見ると、下位ビットの回路コンポーネントから
は各１ビットのデータＲ（ｉ−１）　〜Ｒ（ｉ−４）　
、Ｒ（ｉ−８）　と、１ビットの桁上げ出力ＣＹ（ｉ−
１）　とが供給され、上位ビットの回路コンポーネント
Ｓ（ｉ＋１）　からは各１ビットのデータＵ（ｉ＋１）
　及びＶ（ｉ＋１）　が供給される。また、端子ＡＴ（
ｉ）　を介してデータビットＸ（ｉ）　が供給される。尚、このデータビットＸ（ｉ）　は被乗数Ｘの第（ｉ）
　ビット目のデータを意味するものである。【００３２】当該回路コンポーネントＳ（ｉ）　からは
、上位ビットの回路コンポーネントＳ（ｉ＋１）　に対
して１ビットの桁上げ出力ＣＹ（ｉ）と、１ビットのデ
ータＲ（ｉ）　を供給すると共に、下位ビットの回路コ
ンポーネントＳ（ｉ−１）　に対して各１ビットのデー
タＵ（ｉ）　及びＶ（ｉ）　を供給する。この図１及び
図２の構成に於いて、括弧内、例えば、（ｉ）　、（ｉ
−１）　、（ｉ−４）　、（ｉ−８）　等の値が“負”
になるものは全て“ゼロ（グランド）”に接続されてい
るものとする。上述の各回路コンポーネントＳ（Ｎ−１
）　〜Ｓ（０）　は全て同一の構成とされているので、
以下の回路コンポーネントに対する説明では回路コンポ
ーネントＳ（ｉ）　を例とする。【００３３】累加算器２０は、フルアダー２１と、フリ
ップフロップ２２、２３とから主に構成されている。回
路コンポーネントＳ（０）　からの出力Ｕ（０）　及び
Ｖ（０）　は、フルアダー２１にて前サイクルに於ける
桁上げ出力ＣＹ２０と共に加算され、加算出力ＳＵＭ２
０　、桁上げ出力ＣＹ２０とが形成される。桁上げ出力
ＣＹ２０はフリップフロップ２２を介してフルアダー２
１の入力側に戻され、また、加算出力ＳＵＭ２０　はフ
リップフロップ２３を介して端子２４からＬＳＢ側より
順次、取出される。【００３４】回路コンポーネントＳ（ｉ）　の構成が図
２に示されている。破線で示される回路ブロックＢＩは
図１５のセレクタ１２１　とレジスタ１２２　に対応し
、同様に回路ブロックＢＪは図１４のセレクタ１０７　
に対応し、回路ブロックＢＫは図１４のイクスクルーシ
ブオアゲート１０８　とアンドゲート１０９　に対応し
、回路ブロックＢＭは図１５の加算回路１１１　とレジ
スタ１１２　に対応している。【００３５】端子ＡＴ（ｉ）　を介して供給される被乗
数ＸのデータビットＸ（ｉ）　がフリップフロップ３１
に取込まれる。フリップフロップ３１は、図１５に示さ
れるレジスタ１２２　の第ｉビット目のレジスタに対応
するもので、データビットＸ（ｉ）　はフリップフロッ
プ３１から回路ブロックＢＩのセレクタ３２に供給され
る。【００３６】回路ブロックＢＩのセレクタ３２は４入力
セレクタであり、このセレクタ３２には上述のデータビ
ットＸ（ｉ）　と、下位ビットの回路コンポーネントか
ら供給されるデータＲ（ｉ−４）　、Ｒ（ｉ−８）　と
、フリップフロップ３３からフィードバックされるデー
タＲ（ｉ）　とが供給されている。また、このセレクタ
３２には、制御信号発生回路１０から制御信号ＳＦＴ１
１　、ＳＦＴ１０　が供給されている。【００３７】セレクタ３２では、上述の制御信号ＳＦＴ
１１　、ＳＦＴ１０　に基づいて、データＲ（ｉ−８）
　、Ｒ（ｉ−４）　、Ｒ（ｉ）　、Ｘ（ｉ）　の内の一
つが選択されフリップフロップ３３に供給される。この
セレクタ３２によって、データビットＸ（ｉ）　の他、
データＲ（ｉ）　を保持したり、また、データＲ（ｉ−
４）　を選択して４ビット左シフトしたり、或いはデー
タＲ（ｉ−８）　を選択して８ビット左シフトすること
が可能となる。尚、上述の制御信号ＳＦＴ１１　、ＳＦ
Ｔ１０　による選択については後述する。【００３８】フリップフロップ３３からの出力Ｒ（ｉ）
　は、他の上位ビットの回路コンポーネントに供給され
ると共に、回路ブロックＢＪのセレクタ３５の入力側に
供給され、更に、上述のセレクタ３３の入力側にフィー
ドバックされる。【００３９】回路ブロックＢＪでは、２次のブースのア
ルゴリズムを４ビットに拡張して適用しているもので、
各１ビットの制御信号ＳＦＴ２１　、ＳＦＴ２０　によ
って、データＲ（ｉ）　、Ｒ（ｉ−１）　、Ｒ（ｉ−２
）　、Ｒ（ｉ−３）　の内から一つを選択し、部分積の
とりかたを１倍、２倍、４倍、８倍の中から選択できる
ようにしている。【００４０】回路ブロックＢＪのセレクタ３５は４入力
セレクタであり、このセレクタ３５には上述のフリップ
フロップ３３からのデータＲ（ｉ）　と、下位ビットの
回路コンポーネントＳ（ｉ−１）　、Ｓ（ｉ−２）　、
Ｓ（ｉ−３）　から供給されるデータＲ（ｉ−１）　、
Ｒ（ｉ−２）　、Ｒ（ｉ−３）　が供給されている。ま
た、このセレクタ３５には、制御信号発生回路１０から
制御信号ＳＦＴ２１　、ＳＦＴ２０　が供給されている
。【００４１】セレクタ３５では、上述の制御信号ＳＦＴ
２１　、ＳＦＴ２０　に基づいて、データＲ（ｉ）　、
Ｒ（ｉ−１）　、Ｒ（ｉ−２）　、Ｒ（ｉ−３）　の内
から一つが選択され、データＲがフリップフロップ３６
を介して回路ブロックＢＫに供給される。尚、上述の制
御信号ＳＦＴ２１　、ＳＦＴ２０　によって、どのデー
タＲが選択されるかについては後述する。【００４２】回路ブロックＢＫでは、ナンドゲート４１
、４２、オアゲート４３によって、従来技術を示す図１
４のイクスクルーシブオアゲート１０８　とアンドゲー
ト１０９　が構成されている。上述のフリップフロップ
３６から出力されたデータがナンドゲート４１、４２に
供給される。また、制御信号ＭＯＤ−０がナンドゲート
４１に、制御信号ＭＯＤ−１がナンドゲート４２に夫々
供給される。そして、ナンドゲート４１、４２の出力が
オアゲート４３に供給され、このオアゲート４３の出力
がフリップフロップ４４を介して回路ブロックＢＬに供
給される。【００４３】上述の制御信号ＭＯＤ−０、ＭＯＤ−１の
組合わせと、回路ブロックＢＫの動作には以下のような
関係がある。　　　　　　【００４４】回路ブロックＢＬはアキュム
レータであり、この回路ブロックＢＬは、フルアダー５
１と、フリップフロップ５２、５３とから主に構成され
ている。【００４５】演算の実行に先立ち、制御信号発生回路１
０から供給されるクリヤ信号ＣＬによって、フリップフ
ロップ５２、５３の内容がクリヤされる。【００４６】フルアダー５１には回路ブロックＢＫのフ
リップフロップ４４からの出力と、上述のフリップフロ
ップ５３からフィードバックされる出力と、下位ビット
の回路コンポーネントＳ（ｉ−１）　の回路ブロックＢ
Ｋからの桁上げ出力ＣＹ（ｉ−１）　とが供給され、こ
れらの値の加算がなされる。このフルアダー５１では、
桁上げ出力ＣＹ（ｉ）　、加算出力ＳＵＭ（ｉ）が形成
される。桁上げ出力ＣＹ（ｉ）　は　　フリップフロッ
プ５２に供給され、加算出力ＳＵＭ（ｉ）はフリップフ
ロップ５３に供給される。【００４７】最初の演算サイクルでは、フリップフロッ
プ５２、５３の内容がクリヤされているので、各演算サ
イクルに於いて、加算出力ＳＵＭ（ｉ）はフリップフロ
ップ５３を介してフルアダー５１の入力側にフィードバ
ックされると共に、回路ブロックＢＭのセレクタ５６に
供給され、また、桁上げ出力ＣＹ（ｉ）　はフリップフ
ロップ５２を介して上位ビットの回路コンポーネントＳ
（ｉ＋１）　の回路ブロックＢＬに供給される。一方、
下位ビットの回路コンポーネントＳ（ｉ−１）　から供
給される桁上げ出力ＣＹ（ｉ−１）　がフルアダー５１
の入力側に供給されると共に、回路ブロックＢＭのセレ
クタ５７に供給される。上述の回路ブロックＢＬからの
出力は、各ビットが２ビットで表される冗長２進数であ
る。【００４８】回路ブロックＢＭは、セレクタ５６、５７
と、フリップフロップ５８、５９とから主に構成されて
おり、制御信号発生回路１０からはロード信号ＬＤが各
セレクタ５６、５７に供給される。【００４９】セレクタ５６には加算出力ＳＵＭ（ｉ）と
上位ビットの回路コンポーネントＳ（ｉ＋１）　から供
給される出力Ｕ（ｉ）　が供給されており、また、セレ
クタ５７には下位ビットの回路コンポーネントＳ（ｉ−
１）　から供給される桁上げ出力ＣＹ（ｉ−１）　と、
上位ビットの回路コンポーネントＳ（ｉ＋１）　から供
給される出力Ｖ（ｉ）　が夫々供給されている。【００５０】回路ブロックＢＬからの積和の結果が得ら
れた時に、ロード信号ＬＤが供給されて回路ブロックＢ
Ｍのセレクタ５６、５７が切り換えられて、上述の回路
ブロックＢＬからの加算出力ＳＵＭ（ｉ）がフリップフ
ロップ５８にロードされ、桁上げ出力ＣＹ（ｉ）　がフ
リップフロップ５９にロードされる。【００５１】その後は、出力Ｕ（ｉ）　及びＶ（ｉ）　
がクロック信号ＣＬＫ　のタイミングでＭＳＢの回路コ
ンポーネントＳ（Ｎ−１）　側からＬＳＢの回路コンポ
ーネントＳ（０）　側にシフトし、前述した累加算器２
０にて累加算がなされる。この回路からのシフトによる
出力は、出力語長分だけのサイクル数が必要であるが、
これは１回の積和演算に要する回路ブロックＢＩ〜ＢＭ
のサイクル数に比して小さい限り問題はない。【００５２】ところで、図１及び図２に於いて、注目す
べき点は回路ブロックＢＩ、ＢＪのセレクタ３２、３５
である。これは従来技術を示す図１５のセレクタ１２１
　と論理回路１０３　中のセレクタ１０７　に対し、演
算スキップのための回路配分を変えている。積和演算に
於いて、部分積の生成をスキップしないならば、回路ブ
ロックＢＩは図８に示されるようなものでもよい。【００５３】図８に於いて、セレクタ６１には、端子Ａ
Ｔ（ｉ）　を介して供給されるデータビットＸ（ｉ）　
と、データＲ（ｉ−１）　或いはＲ（ｉ−２）　が供給
されている。この場合、ブースのアルゴリズムを使用　
　しないならば、ただのシフトレジスタ（シフト入力は
出力Ｒ（ｉ−１））になる。また、２次のブースのアル
ゴリズムを使用するのであれば、常に２ビットシフトす
るので図８のシフト入力は出力Ｒ（ｉ−２）とされる。【００５４】一方、部分積の生成をスキップするならば
、回路ブロックＢＩは従来技術を示す図１６のセレクタ
１２１　のようなものでなければならない。しかしなが
ら、このような構成では、頻度の少ない多ビットスキッ
プのために多くのゲート数が必要になり効率が低下する
。【００５５】そこで、図９或いは図１０に示されるよう
に、スキップのできる範囲を或る程度限定した構成が考
えられる。ここで、図９の構成は、２次のブースのアル
ゴリズムを適用することを前提とし２ビットスキップの
みできるようにしたものでありる。また、図１０は、従
来技術を示す図１３のセレクタ１０７　を合成してセレ
クタ６５を構成しているもので、データＲ（ｉ−１）　
〜Ｒ（ｉ−６）　の範囲内で選択的にスキップできるよ
うにしたものである。しかしながら、これらの構成では
、まだ多入力のセレクタ６３、６５が必要であり、その
ため、データＲ（ｉ−１）　〜Ｒ（ｉ−６）　の選択の
ためのループに時間を要するためあまり適切ではない。【００５６】これに対し、図２に示される回路ブロック
ＢＩ、ＢＪは、夫々、４入力のセレクタ３２、３５に均
一化されている。このような少入力のセレクタ３２、３
５は少ないゲート数で構成でき、フィードバックループ
も高速化でき好ましい。【００５７】データＲ（ｉ）　を保持する回路ブロック
ＢＩのフリップフロップ３３のフアンアウトも他の方法
に比べて小さく収まっている。また、回路ブロックＢＪ
〜ＢＭは、回路ブロックＢＬ中のフィードバックループ
を除いては、フィードフォワードばかりであり、このよ
うなフィードフォワードの回路では、パイプライン化に
よって容易に高速化を図ることが可能である。このよう
に、回路構成が均等化されたことと併せ、図２に示され
るようにパイプライン化された回路は非常に高速な動作
が可能である。【００５８】図１に於ける制御信号発生回路１０は、前
述したように各種制御信号を発生するものであり、この
ような機能を実現する一つの手段としてメモリを用いる
ことができる。これは、乗数の数値によって必要な演算
サイクルが異なり、しかも演算に必要なサイクル数は乗
数の部分積分解方法に各種の組み合わせがあって、その
中から最小サイクル数の方法を探索すべきであるから、
このようなことはリアルタイムに行わず予め解を求めて
おき各種制御信号の時系列のみをメモリに記憶させてお
いて発生させるのが適当であるからである。【００５９】また、予め乗数を決定できない場合或いは
乗数が多数あるためにメモリに記憶しきれない場合には
、制御信号発生回路１０として、例えば、図３、図４に
示されるような構成の回路を用いることになる。尚、以
下の図３、図４に関する説明は、ハードウエアの説明で
あるのみならず、前述したようにメモリをコントローラ
の代わりに使用する場合の、予め各種制御信号の時系列
を決定する方法の一例を示している。【００６０】図３の構成に於いて、ＭＳＢからＬＳＢま
での（Ｎ＋１）　個の回路コンポーネントＴ（Ｎ−１）
　〜Ｔ（０）　が縦続接続されている。この回路コンポ
ーネントＴ（Ｎ−１）　〜Ｔ（０）　としては図２の回
路ブロックＢＩが用いられており、回路コンポーネント
Ｔ（Ｎ−１）　〜Ｔ（０）　に供給される制御信号ＳＦ
Ｔ１１　、ＳＦＴ１０　も共通とされている。【００６１】この回路コンポーネントＴ（Ｎ−１）　〜
Ｔ（０）　には端子ＣＴ（Ｎ−１）　〜ＣＴ（０）　を
介して乗数Ｐ２が供給されており、この回路コンポーネ
ントＴ（Ｎ−１）　〜Ｔ（０）　は全体としてシフトレ
ジスタを構成しているがシフトレジスタを構成する時の
接続の仕方が図３と図１とでは異なっている。つまり、
図１では左シフトであったものが図３では右シフトにな
っている。そして、Ｎは乗数Ｐ２の入力語長に一致して
おり、Ｂ２とは異なる。尚、回路ブロックＢＩに供給さ
れるデータＲ（ｉ−８）　、Ｒ（ｉ−４）　の括弧、例
えば、（ｉ−８）　、（ｉ−４）　内の値が（Ｎ−１）
より大なる時は（Ｎ−１）とされる。【００６２】例えば、回路コンポーネントＴ（Ｎ−１）
　〜Ｔ（−１）に於ける任意の回路コンポーネントＴ（
ｉ）　について見ると、上位ビットの回路コンポーネン
トＴ（ｉ＋８）　、Ｔ（ｉ＋４）　からは各１ビットの
データＲ（ｉ＋８）　、Ｒ（ｉ＋４）　と、端子ＣＴ（
ｉ）を介してデータビットＰ２（ｉ）　が供給され、下
位ビットの回路コンポーネントに対しては１ビットのデ
ータＲ（ｉ）　が供給される。【００６３】上述の各回路コンポーネントＴ（Ｎ−１）
　〜Ｔ（−１）は、制御信号発生回路１０から供給され
る制御信号ＳＦＴ１１　、ＳＦＴ１０　、クリヤ信号Ｃ
Ｌ等によって回路動作が制御される。この制御信号ＳＦ
Ｔ１１　、ＳＦＴ１０　は図１と共通であり、シフト方
向が逆になるように制御される。また、上述の各回路コ
ンポーネントＴ（Ｎ−１）　〜Ｔ（−１）は全て同一の
構成とされているので、以下の回路コンポーネントに対
する説明では上述の回路コンポーネントＴ（ｉ）　を例
とする。回路コンポーネントＴ（−１）は２次ブースの
アルゴリズムのためのもので、この回路コンポーネント
Ｔ（−１）は乗数Ｐ２の入力ロード時にクリヤされる。【００６４】上述の回路コンポーネントＴ（７）　〜Ｔ
（−１）からのデータＲ（７）　〜Ｒ（−１）が図４に
示される回路ブロックＢＯに供給される。この回路ブロ
ックＢＯは、信号Ｒ（３）　〜Ｒ（−１）が供給され後
述する図６及び図７の論理を実現するロジック回路７１
と、信号Ｒ（７）　〜Ｒ（３）　が供給されるコンパレ
ータ７２と、次の演算サイクルのシフト量を決定するた
めのスキップ回路７３とから主に構成される。【００６５】ロジック回路７１は、供給される信号Ｒ（
３）　〜Ｒ（−１）に基づいて図６及び図７の上欄のシ
フト量規定信号Ｘ〜８Ｘ、制御信号ＭＯＤ−０及びＭＯ
Ｄ−１に対応する信号ＴＨ、ＩＶ　、スキップ信号ＳＫ
等を形成するものである。このスキップ信号ＳＫがスキ
ップ回路７３に供給される。【００６６】コンパレータ７２では、供給される信号Ｒ
（７）　〜Ｒ（３）　の一致、不一致が調べられ、例え
ば、上述の信号Ｒ（７）　〜Ｒ（３）　の５ビットが全
てハイレベル或いはローレベルの時にのみハイレベルの
比較信号ＣＰが形成され、この比較信号ＣＰがスキップ
回路７３に供給される。【００６７】スキップ回路７３は、図１及び図３の回路
ブロックＢＩで構成されるシフトレジスタに於ける次の
演算サイクルのシフト量を求める回路である。このスキ
ップ信号ＳＫでは、上述の比較信号ＣＰ及びスキップ信
号ＳＫのレベルの組み合わせに基づいて、図５に示され
る内容の制御信号ＳＦＴ１１　、ＳＦＴ１０　が形成さ
れ、端子７４、７５から取出される。【００６８】前述したようにロジック回路７１では、各
種信号、制御信号等が形成されており、以下、これにつ
いて説明する。ロジック回路７１に供給される信号Ｒ（
３）　〜Ｒ（−１）の５ビットのデータは、レベルの組
み合わせに応じて３２通りのパターンに分けることがで
きる。２次ブースのアルゴリズムを４ビットに適用することに
より、各パターンに対応する被乗数Ｘの部分積ＰＰ０は
、図６に示されるようなものとなる。【００６９】この部分積ＰＰ０は、更に部分積ＰＰ１の
ように分解することができる。この部分積ＰＰ１は、被
乗数Ｘの２のべき乗倍の２項以下の組み合わせでなる部
分積ＰＰ２によって形成することができる。【００７０】この部分積ＰＰ２は、何れも２のべき乗倍
の項を２つ組み合わせて形成できるので、図６に示され
るように第１サイクルＣ１のみで、或いは第１サイクル
Ｃ１及び第２サイクルＣ２の夫々で部分積ＰＰ２の演算
を行った後に加算することにより、部分積ＰＰ０を得る
ことができる。【００７１】上述の部分積ＰＰ２は、何れも被乗数Ｘの
２のべき乗倍の項〔０Ｘ、±１Ｘ、±２Ｘ、±４Ｘ、±
８Ｘ〕で表せ、また、被乗数Ｘの２のべき乗倍はシフト
動作により得ることができるので、上述の部分積ＰＰ２
を構成する被乗数Ｘの２のべき乗倍の項は、被乗数Ｘを
所要のビット数分、シフト動作させることによって得る
ことができる。【００７２】このシフト量を規定するための信号が図７
のシフト量規定信号Ｘ〜８Ｘとして定められている。こ
のシフト量規定信号Ｘ〜８Ｘの内、ハイレベルとされる
信号に対応して被乗数Ｘのシフト量が規定される。そし
て、このシフト量規定信号Ｘ〜８Ｘのレベルの組み合わ
せに基づいて制御信号ＳＦＴ２１　、ＳＦＴ２０　が決
定され、この制御信号ＳＦＴ２１　、ＳＦＴ２０　に基
づいてシフト動作がなされる。【００７３】上述の部分積ＰＰ２は、図２に示される回
路ブロックＢＪのセレクタ３５に於けるデータＲ（ｉ）
　〜Ｒ（ｉ−３）　の選択により得ることができる。即
ち、データＲ（ｉ）　を選択することが被乗数Ｘを１倍
することになり、データＲ（ｉ−１）　を選択すること
が被乗数Ｘを２倍することに相当し、データＲ（ｉ−２
）　を選択することが被乗数Ｘを４倍することに相当し
、データＲ（ｉ−３）　を選択することが被乗数Ｘを８
倍することに相当する。【００７４】また、図７の信号ＴＨ、ＩＶは、夫々制御
信号ＭＯＤ−０、ＭＯＤ−１に対応しており、図２の回
路ブロックＢＫに於いて、前段の回路ブロックＢＪから
供給される出力をそのまま通過させるか、反転させるか
或いは“零”とするかの動作を制御している。例えば、
信号ＴＨ（＝Ｌ）、ＩＶ（＝Ｌ）の時は“零”とされ、
信号ＴＨ（＝Ｌ）、ＩＶ（＝Ｈ）の時は反転とされ、更
に信号ＴＨ（＝Ｈ）、ＩＶ（＝Ｌ）の時はそのまま通過
とされる。【００７５】また、図７のスキップ信号ＳＫは、図４に
示されるロジック回路７１に入力される５ビットのデー
タＲ（３）　〜Ｒ（−１）に基づいて部分積の演算の必
要のないときにハイレベルとなるようにされている。例
えば、部分積ＰＰ２のパターン中、被乗数Ｘの２のべき
乗倍の項が１つしかない場合には、第２サイクルＣ２目
の演算は不要なので第１サイクルＣ１目に於いてハイレ
ベルとされる。【００７６】例えば、図７に示される部分積ＰＰ２に於
いて、１つの項しか存在しないパターンでは第２サイク
ルＣ２に於ける部分積ＰＰ２の計算がないので、第１サ
イクルＣ１に於けるスキップ信号ＳＫはハイレベルとな
る。勿論、第２サイクルＣ２ではスキップ信号ＳＫはハ
イレベルとなる。上述の第２サイクルＣ２の欄内に“＊
”の記載されている個所はスキップ可能な部分であるこ
とを意味している。【００７７】次いで、動作例について説明する。ＦＩＲ
デジタルフイルタを例とし、タップ数を３、係数ｈ１〜
ｈ３を夫々１２ビット（Ｎ＝１２）とする。（１）　係数ｈ１＝〔００００　　０１００　００１０
　〕（２）　係数ｈ２＝〔００１０　　００００　０１
０１　〕（３）　係数ｈ３＝〔１１１１　　１１１０　
１１００　〕但し、上述の係数ｈ１〜ｈ３は２の補数表
示とされている。【００７８】（１）　係数ｈ１＝〔００００　　０１０
０　００１０　〕の場合被乗数Ｘは回路コンポーネントＳ（ｉ）　内の回路ブロ
ックＢＩに供給され、乗数Ｐ２、この場合には上述の係
数ｈ１が回路コンポーネントＴ（ｉ）　に供給される。回路コンポーネントＳ（ｉ）　内の回路ブロックＢＩで
は、供給される制御信号ＳＦＴ１１　、ＳＦＴ１０　に
より端子ＡＴ（ｉ）　を介して供給されるデータビット
Ｘ（ｉ）　が選択され、取込まれる。【００７９】係数ｈ１の乗算のための第１サイクルでは
、図４のデータＲ（３）　〜Ｒ（−１）に相当する乗数
４ビットとその下位１ビット〔但し、この場合、Ｒ（−
１）＝０とされる〕がロジック回路７１にて調べられる
。【００８０】ロジック回路７１では、上述の５ビットの
データが〔００１００〕なので、図６及び図７から以下
のことが判る。シフト量規定信号Ｘ〜８Ｘは“ＬＨＬＬ
”でシフト量規定信号２Ｘのみがハイレベルで出力され
ること、信号ＴＨが“Ｈ”、信号ＩＶが“Ｌ”で出力さ
れること、そして、スキップ信号ＳＫが“Ｈ”ベルで出
力される等である。【００８１】これによって、部分積ＰＰ０は“２Ｘ”を
求めればよく、また、回路ブロックＢＫでは回路ブロッ
クＢＪからの出力がそのまま出力されること、更に、演
算は第１サイクルＣ１のみで終了し乗数Ｐ２の次の４ビ
ットの処理へスキップできることが判る。そこで、“２
Ｘ”を求めるため、回路コンポーネントＳ（ｉ）　中の
回路ブロックＢＪでは、制御信号ＳＦＴ２１　、ＳＦＴ
２０　によって、データＲ（ｉ−１）　、即ち、１ビッ
ト下位の回路コンポーネントＳ（ｉ−１）　の回路ブロ
ックＢＩの出力を選択する。【００８２】また、上述したように回路ブロックＢＫで
は回路ブロックＢＪからの出力がそのまま出力され、こ
の出力は回路ブロックＢＬのアキュムレータに保持され
る。【００８３】係数ｈ１の下位４ビットの処理が終了する
と、次の上位側４ビットへ移る。回路コンポーネントＳ
（ｉ）　、Ｔ（ｉ）　の回路ブロックＢＩでは制御信号
ＳＦＴ１１　、ＳＦＴ１０　によって、データＲ（ｉ−
４）　を選択する。【００８４】上述のデータＲ（ｉ−４）　が選択される
ことによって４ビットシフトされる。尚、このシフト動
作は、乗数Ｐ２（ｉ）　、即ち、回路コンポーネントＳ
（ｉ）　では左シフト、データビットＸ（ｉ）　、即ち
、回路コンポーネントＴ（ｉ）　では右シフトとされる
。【００８５】係数ｈ１の中央の４ビットは〔０１００〕
であり、その下位ビットは〔０　〕であるから、図４の
データＲ（３）　〜Ｒ（−１）に相当する５ビットのデ
ータは〔０１０００　〕となる。【００８６】ロジック回路７１では、上述の５ビットの
データが〔０１０００〕なので、図６及び図７から以下
のことが判る。シフト量規定信号Ｘ〜８Ｘは“ＬＬＨＬ
”でシフト量規定信号４Ｘのみがハイレベルで出力され
ること、信号ＴＨが“Ｈ”、信号ＩＶが“Ｌ”で出力さ
れること、そして、スキップ信号ＳＫが“Ｈ”で出力さ
れることが判る。【００８７】これによって、部分積ＰＰ０は“４Ｘ”を
求めればよく、また、回路ブロックＢＫでは回路ブロッ
クＢＪからの出力がそのまま出力されること、更に、演
算は第１サイクルＣ１のみで終了し乗数Ｐ２の次の４ビ
ットの処理へスキップできることが判る。そこで、“４
Ｘ”を求めるため、回路コンポーネントＳ（ｉ）　中の
回路ブロックＢＪでは、制御信号ＳＦＴ２１　、ＳＦＴ
２０　によって、データＲ（ｉ−２）　、即ち、２ビッ
ト下位の回路コンポーネントＳ（ｉ−２）　の回路ブロ
ックＢＩの出力を選択し、乗数Ｐ２の４倍、即ち、“４
Ｘ”のデータを得ることができる。【００８８】また、上述したように回路ブロックＢＫで
は回路ブロックＢＪからの出力がそのまま出力され、こ
の出力は回路ブロックＢＬのアキュムレータに保持され
る。【００８９】係数ｈ１の中位４ビットの処理が終了する
と、次の最上位側４ビット、即ち、〔００００〕の処理
へ移るために回路コンポーネントＳ（ｉ）　、Ｔ（ｉ）
　の回路ブロックＢＩではデータＲ（ｉ−８）　を選択
しようとする。【００９０】この時、図４に示されるコンパレータ７２
には、上述の最上位側４ビット〔００００〕が供給され
、また、上述の中位４ビットの最上位のデータＲ（３）
　〔＝０　〕が付加されて５ビットのデータが〔０００
００　〕となるため、比較信号ＣＰは“Ｈ”で出力され
る。【００９１】スキップ回路７３では、上述の５ビットの
データが〔０００００〕となるため、次の４ビットの処
理が不要であることを判別する。従って、この段階で係
数ｈ１の処理が終了することになる。係数ｈ１の処理は
２サイクルであった。【００９２】（２）　係数ｈ２＝〔００１０　　０００
０　０１０１　〕の場合被乗数Ｘは回路コンポーネントＳ（ｉ）　内の回路ブロ
ックＢＩに供給され、乗数Ｐ２、この場合には上述の係
数ｈ１が回路コンポーネントＴ（ｉ）　に供給される。回路コンポーネントＳ（ｉ）　内の回路ブロックＢＩで
は、供給される制御信号ＳＦＴ１１　、ＳＦＴ１０　に
より端子ＡＴ（ｉ）　を介して供給されるデータビット
Ｘ（ｉ）　が選択され、取込まれる。【００９３】係数ｈ２の乗算のための第１サイクルでは
、図４のデータＲ（３）　〜Ｒ（−１）に相当する乗数
４ビットとその下位１ビット〔但し、この場合、Ｒ（−
１）＝０とされる〕がロジック回路７１にて調べられる
。【００９４】ロジック回路７１では、上述の５ビットの
データが〔０１０１０〕なので、図６及び図７から以下
のことが判る。即ち、第１サイクルＣ１では、シフト量
規定信号Ｘ〜８Ｘは“ＬＬＨＬ”でシフト量規定信号４
Ｘのみがハイレベルで出力されること、信号ＴＨが“Ｈ
”、信号ＩＶが“Ｌ”で出力されること、そして、スキ
ップ信号ＳＫが“Ｌ”ベルで出力され、また、第２サイ
クルＣ２では、シフト量規定信号Ｘ〜８Ｘは“ＨＬＬＬ
”でシフト量規定信号Ｘのみがハイレベルで出力される
こと、信号ＴＨが“Ｈ”、信号ＩＶが“Ｌ”で出力され
ること、そして、スキップ信号ＳＫが“Ｈ”ベルで出力
される等である。【００９５】これによって、第１サイクルＣ１では、部
分積ＰＰ０は“４Ｘ”を求めればよく、また、回路ブロ
ックＢＫでは回路ブロックＢＪからの出力がそのまま出
力されること、更に、演算は第１サイクルＣ１のみで終
了せず乗数Ｐ２の次の４ビットの処理へスキップできな
いことが判る。また、第２サイクルＣ１では、部分積Ｐ
Ｐ０は“Ｘ”を求めればよく、また、回路ブロックＢＫ
では回路ブロックＢＪからの出力がそのまま出力される
こと、更に、演算は第２サイクルＣ２にて終了すること
が判る。【００９６】そこで、第１サイクルＣ１では、“４Ｘ”
を求めるため、回路コンポーネントＳ（ｉ）　中の回路
ブロックＢＪでは、制御信号ＳＦＴ２１　、ＳＦＴ２０
　によって、データＲ（ｉ−２）　、即ち、２ビット下
位の回路コンポーネントＳ（ｉ−２）　の回路ブロック
ＢＩの出力を選択する。【００９７】また、上述したように回路ブロックＢＫで
は回路ブロックＢＪからの出力がそのまま出力され、こ
の出力は回路ブロックＢＬのアキュムレータに保持され
る。【００９８】次に、第２サイクルＣ２では、“Ｘ”を求
めるため、回路コンポーネントＳ（ｉ）　中の回路ブロ
ックＢＪでは、制御信号ＳＦＴ２１　、ＳＦＴ２０　に
よって、データＲ（ｉ）　、即ち、当該ビットの回路コ
ンポーネントＳ（ｉ）　の回路ブロックＢＩの出力を選
択する。【００９９】また、上述したように回路ブロックＢＫで
は回路ブロックＢＪからの出力がそのまま出力され、こ
の出力は回路ブロックＢＬのアキュムレータに保持され
る。【０１００】係数ｈ２の下位４ビットの処理が終了する
と、次の中位４ビット、即ち、〔００００〕の処理へ移
るために回路コンポーネントＳ（ｉ）、Ｔ（ｉ）　の回
路ブロックＢＩでは制御信号ＳＦＴ１１　、ＳＦＴ１０
　によって、４ビットシフトするべくデータＲ（ｉ−４
）　を選択する。【０１０１】この時、図４に示されるコンパレータ７２
には、上述の中位側４ビット〔００００〕が供給され、
また、上述の中位４ビットの最上位の信号Ｒ（３）　〔
＝０　〕が付加されて５ビットのデータ〔０００００〕
が形成されるため、図６及び図７から明らかなようにス
キップ信号ＳＫ及び比較信号ＣＰは共に“Ｈ”で出力さ
れる。【０１０２】スキップ回路７３では、上述の５ビットの
データが〔０００００〕となるため、次の４ビットの処
理が不要であることを判別する。従って、この段階で係
数ｈ２の中位４ビットの処理が終了することになる。【０１０３】係数ｈ２の中位４ビットの処理が終了する
と、次の上位４ビット、即ち、〔００１０〕の処理へ移
るために回路コンポーネントＳ（ｉ）、Ｔ（ｉ）　の回
路ブロックＢＩでは制御信号ＳＦＴ１１　、ＳＦＴ１０
　によって、８ビットシフトするべくデータＲ（ｉ−８
）　を選択する。【０１０４】上述の上位４ビット〔００１０〕のデータ
には、中位４ビットの最上位のデータＲ（３）　〔＝０
　〕が付加されて５ビットのデータ〔００１００　〕が
形成される。【０１０５】ロジック回路７１では、上述の５ビットの
データが〔００１００　〕なので、図６及び図７から以
下のことが判る。第１サイクルＣ１では、シフト量規定
信号Ｘ〜８Ｘは“ＬＨＬＬ”でシフト量規定信号２Ｘの
みがハイレベルで出力されること、信号ＴＨが“Ｈ”、
信号ＩＶが“Ｌ”で出力されること、そして、スキップ
信号ＳＫが“Ｈ”で出力される等である。【０１０６】これによって、部分積ＰＰ０は“２Ｘ”を
求めればよく、また、回路ブロックＢＫでは回路ブロッ
クＢＪからの出力がそのまま出力されること、更に、演
算は第１サイクルＣ１のみで終了することが判る。そこ
で、“２Ｘ”を求めるため、回路コンポーネントＳ（ｉ
）　中の回路ブロックＢＪでは、制御信号ＳＦＴ２１　
、ＳＦＴ２０　によって、データＲ（ｉ−１）　、即ち
、１ビット下位の回路コンポーネントＳ（ｉ−１）　の
回路ブロックＢＩの出力を選択する。【０１０７】また、上述したように回路ブロックＢＫで
は回路ブロックＢＪからの出力がそのまま出力され、こ
の出力は回路ブロックＢＬのアキュムレータに保持され
る。従って、この段階で係数ｈ２の処理が終了すること
になる。係数ｈ２の処理は３サイクルであり、係数ｈ１
、ｈ２のサイクル数の合計は５サイクルである。【０１０８】（３）　係数ｈ３＝〔１１１１　　１１１
０　１１００　〕被乗数Ｘは回路コンポーネントＳ（ｉ
）　内の回路ブロックＢＩに供給され、乗数Ｐ２、この
場合には上述の係数ｈ３が回路コンポーネントＴ（ｉ）
　に供給される。回路コンポーネントＳ（ｉ）　内の回
路ブロックＢＩでは、供給される制御信号ＳＦＴ１１　
、ＳＦＴ１０　により端子ＡＴ（ｉ）　を介して供給さ
れるデータビットＸ（ｉ）　が選択され、取込まれる。【０１０９】係数ｈ３の乗算のための第１サイクルでは
、図４のデータＲ（３）　〜Ｒ（−１）に相当する乗数
４ビットとその下位１ビット〔但し、この場合、Ｒ（−
１）＝０とされる〕がロジック回路７１にて調べられる
。【０１１０】ロジック回路７１では、上述の５ビットの
データが〔１１０００〕なので、図６及び図７から以下
のことが判る。シフト量規定信号Ｘ〜８Ｘは“ＬＬＨＬ
”でシフト量規定信号４Ｘのみがハイレベルで出力され
ること、信号ＴＨが“Ｌ”、信号ＩＶが“Ｈ”で出力さ
れること、そして、スキップ信号ＳＫが“Ｈ”で出力さ
れること等である。【０１１１】これによって、部分積ＰＰ０は“４Ｘ”を
求めればよく、また、回路ブロックＢＫでは回路ブロッ
クＢＪからの出力が反転されること、更に、演算は第１
サイクルＣ１のみで終了し乗数Ｐ２の次の４ビットの処
理へスキップできることが判る。そこで、“４Ｘ”を求
めるため、回路コンポーネントＳ（ｉ）　中の回路ブロ
ックＢＪでは、制御信号ＳＦＴ２１　、ＳＦＴ２０　に
よって、データＲ（ｉ−２）　、即ち、２ビット下位の
回路コンポーネントＳ（ｉ−２）　の回路ブロックＢＩ
の出力を選択する。【０１１２】また、上述したように回路ブロックＢＫで
は回路ブロックＢＪからの出力が反転され、この出力は
回路ブロックＢＬのアキュムレータに保持される。【０１１３】係数ｈ３の下位４ビットの処理が終了する
と、次の中位４ビットへ移る。めに回路コンポーネント
Ｓ（ｉ）　、Ｔ（ｉ）　の回路ブロックＢＩでは制御信
号ＳＦＴ１１　、ＳＦＴ１０　によって、データＲ（ｉ
−４）　を選択する。【０１１４】上述のデータＲ（ｉ−４）　が選択される
ことによって４ビットシフトされる。尚、このシフト動
作は、乗数Ｐ２（ｉ）　、即ち、回路コンポーネントＳ
（ｉ）　では左シフト、データビットＸ（ｉ）　、即ち
、回路コンポーネントＴ（ｉ）　では右シフトとされる
。【０１１５】係数ｈ３の中央の４ビットは〔１１１０〕
であり、その下位ビットは〔１　〕であるから、図４の
データＲ（３）　〜Ｒ（−１）に相当する５ビットのデ
ータは〔１１１０１　〕となる。【０１１６】ロジック回路７１では、上述の５ビットの
データが〔１１１０１〕なので、図６及び図７から以下
のことが判る。シフト量規定信号Ｘ〜８Ｘは“ＨＬＨＬ
Ｌ”でシフト量規定信号Ｘのみがハイレベルで出力され
ること、信号ＴＨが“Ｌ”、信号ＩＶが“Ｈ”で出力さ
れること、そして、スキップ信号ＳＫが“Ｈ”で出力さ
れることが判る。【０１１７】これによって、部分積ＰＰ０は“Ｘ”を求
めればよく、また、回路ブロックＢＫでは回路ブロック
ＢＪからの出力が反転されること、更に、演算は第１サ
イクルＣ１のみで終了し乗数Ｐ２の次の４ビットの処理
へスキップできることが判る。そこで、“Ｘ”を求める
ため、回路コンポーネントＳ（ｉ）　中の回路ブロック
ＢＪでは、制御信号ＳＦＴ２１　、ＳＦＴ２０　によっ
て、データＲ（ｉ）　、当該ビットの回路コンポーネン
トＳ（ｉ）　の回路ブロックＢＩの出力を選択する。【０１１８】また、上述したように回路ブロックＢＫで
は回路ブロックＢＪからの出力が反転され、この出力は
回路ブロックＢＬのアキュムレータに保持される。【０１１９】係数ｈ３の中位４ビットの処理が終了する
と、次の最上位側４ビット、即ち、〔１１１１〕の処理
へ移るために回路コンポーネントＳ（ｉ）　、Ｔ（ｉ）
　の回路ブロックＢＩではデータＲ（ｉ−８）　を選択
しようとする。【０１２０】この時、図４に示されるコンパレータ７２
には、上述の最上位側４ビット〔１１１１〕が供給され
、また、上述の中位４ビットの最上位のデータＲ（３）
　〔＝１　〕が付加されて５ビットのデータが〔１１１
１１　〕となるため、スキップ信号ＳＫ及び比較信号Ｃ
Ｐは“Ｈ”で出力される。【０１２１】スキップ回路７３では、上述の５ビットの
データが〔１１１１１〕となるため、次の４ビットの処
理が不要であつまり、部分積ＰＰ０を形成することが不
要であることを意味している。従って、この段階で係数
ｈ３の処理が終了することになる。係数ｈ３の処理は２
サイクルであり、係数ｈ１〜係数ｈ３までの全サイクル
数は７サイクルであった。【０１２２】１２ビットの場合、通常、アキュムレータ
があるだけだと１２サイクルかかり、３タップだと３６
サイクル要する。２次のブースのアルゴリズムを用いる
と半分の１８サイクル要する。しかしながら、この一実
施例では７サイクルで済ますことができる。【０１２３】一般的には、Ｂビットの乗算は１つのアキ
ュムレータがあれば基本的には演算サイクル数がＢで実
現でき、２次のブースのアルゴリズムを使用すれば演算
サイクル数が（Ｂ／２）となり、４次のブースのアルゴ
リズムを使用すれば演算サイクル数が（３／８）Ｂとな
る。この一実施例では、４次のブースのアルゴリズムを基本
とし演算スキップの工夫を加えているが、係数ｈ１〜ｈ
３を取込むサイクルがスキップできない等の点から、演
算サイクル数は概略（３／８）Ｂとなる。このように、
複数のビットから構成されるデータに対し、２次ブース
のアルゴリズムを４ビット単位で適用し各演算サイクル
で部分積を４ビット単位で生成し、また、この時、デー
タを別途、先見することで次の演算サイクルの部分積が
零になるか否か判断しているので、次の演算サイクルに
於ける部分積の演算が不要になる判断される時は該演算
をスキップすることができ、演算サイクル数を削減でき
て、部分積の生成の効率が良い。固定特性のデジタルフ
イルタ或いはコサイン変換等、乗数Ｐ２が固定値の場合
、演算サイクル数が減少するように乗数Ｐ２を決定する
と演算サイクルの減少の効果が大きくなる。【０１２４】回路ブロックＢＩのセレクタ３２は４入力
セレクタであり、制御信号ＳＦＴ１１　、ＳＦＴ１０　
によってデータビットＰ１（ｉ）の他、出力Ｒ（ｉ）　
を保持したり、また、Ｒ（ｉ−４）　を選択して４ビッ
ト左シフトしたり、或いは、Ｒ（ｉ−８）　を選択して
８ビット左シフトすることが可能となり、また、回路ブ
ロックＢＪのセレクタ３５は４入力セレクタであり、制
御信号ＳＦＴ２１　、ＳＦＴ２０　によって、出力Ｒ（
ｉ）　、Ｒ（ｉ−１）　、Ｒ（ｉ−２）　、Ｒ（ｉ−３
）　の内から１ビットのデータを選択し、部分積のとり
かたを１倍、２倍、４倍、８倍の中から選択できるよう
にしているので、上述の効果に加えて機能を広くとりな
がらゲート規模を小型化でき、高速化を実現でき、回路
の複雑化を防止できる。【０１２５】回路コンポーネントＴ（７）　〜Ｔ（−１
）からの信号Ｒ（７）　〜Ｒ（−１）に基づいて、回路
ブロックＢＯでは、シフト量規定信号Ｘ〜８Ｘ、制御信
号ＭＯＤ−０及びＭＯＤ−１に対応する信号ＴＨ、ＩＶ
　、スキップ信号ＳＫ等を形成するものである。このス
キップ信号ＳＫがスキップ回路７３に供給される。更に
、不要な部分積の生成をスキップし演算サイクル数を最
小化する作業をハードウエアで行わず、予め、別途に求
めているので、最適化が可能となるという効果がある。【０１２６】【発明の効果】請求項１の発明に係る演算回路によれば
、複数のビットから構成されるデータに対し、２次ブー
スのアルゴリズムを４ビット単位で適用して各演算サイ
クルで部分積を４ビット単位で生成すると共に、データ
を別途、先見することで次の演算サイクルの部分積が零
になるか否かを判断しているので、次の演算サイクルに
於ける部分積の演算が不要になると判断される時は該演
算をスキップすることができるという効果があり、演算
サイクル数を削減でき、部分積の生成の効率が良いとい
う効果がある。【０１２７】請求項２の発明に係る演算回路によれば、
２次ブースのアルゴリズムを４ビット単位に拡張して適
用すると共に、ブースのアルゴリズムを適用するに際し
て必要とされる機能を、従来の多入力セレクタを用いず
に少入力セレクタ〔４入力セレクタ〕で実現する構成と
しているので、請求項１の発明の効果に加えて、機能を
広くとりながらゲート規模を小型化でき、高速化を実現
でき、回路の複雑化を防止できるという効果がある。更
に、不要な部分積の生成をスキップし演算サイクル数を
最小化する作業をハードウエアで行わず、予め、別途に
求めているので、最適化が可能となるという効果がある
。
Description:

[0001] [Field of Industrial Application] The present invention relates to an arithmetic circuit, and in particular to an arithmetic circuit suitable for video signal processing.

[0002] [Prior Art] Conventionally, parallel multiplier circuits have been used in arithmetic circuits that require high speed, such as product-sum arithmetic circuits. However, a multistage connection of full adders is limited in speed by carry propagation delay. In addition, while one gate is performing a logic operation to determine its output, the next-stage gate connected in series is waiting for that output, so the gates have idle time.

[0003] Therefore, for arithmetic circuits used in real-time image signal processing, which requires high speed, techniques that divide a signal into bits and process them in a time-division multiplexed manner are disclosed, for example, in the specifications of Japanese Patent Application Nos. Hei 1-284398 and Hei 1-337319, both proposed by the present applicant.

[0004] Serial-parallel multiplier circuits and pipelined multiplier circuits have long been known, but these are basically bit-serial multiplier circuits that output their results bit-serially. A product-sum arithmetic circuit that accumulates successive multiplication results therefore requires the circuit configuration shown in FIG. 11 or FIG. 14.

[0005] The product-sum arithmetic circuit shown in FIG. 11 is described below. In the configuration of FIG. 11, data of the multiplicand X, which has a bit length of B1, is supplied to the barrel shifter 102 via the register 101.

[0006] Meanwhile, the multiplier P2 is supplied to a decoder (not shown), which forms the control signal SFT and the control signal MOD described below. The control signal SFT is supplied to the barrel shifter 102, and the control signal MOD is supplied to the logic circuit 103.

[0007] The barrel shifter 102 is a logic circuit that arithmetically shifts the multiplicand X by an arbitrary number of bits under control of the control signal SFT. As a result, the bit length B2 of the data output from the barrel shifter 102 is at least twice the bit length B1 of the multiplicand X. The data of bit length B2 is supplied from the barrel shifter 102 to the logic circuit 103.

[0008] A specific configuration of the barrel shifter 102 is shown in FIG. 12. The barrel shifter 102 shown in FIG. 12 is the configuration for one bit, and barrel shifters 102 of the same configuration are provided for each of the B2 bits. That is, the multiplicand X of bit length B1 is connected in common to all of the one-bit barrel shifters 102.

[0009] The barrel shifter 102 is supplied with (a1+B1+a2) bits of data. From among these (a1+B1+a2) bits, it selects one bit based on the control signal SFT and supplies it to the logic circuit 103. The barrel shifter 102 therefore functions as a selector.

[0010] The data a1 extends the bit length B1 of the multiplicand X: the sign bit at the MSB of the multiplicand X is extended to create the upper bits used when shifting left to make the data smaller. The data a2 is fed "0" to create the lower bits used when shifting right to make the data larger.

[0011] The logic circuit 103 is, for example, a partial-product generation circuit that implements Booth's algorithm, and it is controlled by the control signal MOD. One example of the logic circuit 103, the logic circuit 1031, is shown in FIG. 13, and another example, the logic circuit 1032, is shown in FIG. 14. In the logic circuit 1031 shown in FIG. 13, if the multiplier P2 is simply multiplied one bit at a time starting from the LSB side, each logic circuit 1031 only needs an AND gate 104, as shown in FIG. 12. However, when partial products are generated two bits at a time from the LSB side of the multiplier P2 according to the second-order Booth algorithm, the circuit configuration shown in FIG. 14 is required for each bit. Reference numeral 105 denotes a terminal, and X(i) is the data of the i-th bit of the multiplicand X.

[0012] The logic circuit 1032 shown in FIG. 14 is the configuration for one bit, and the same configuration is provided for each of the B2 bits from the MSB to the LSB.

[0013] In the logic circuit 1032, the data bit X(i), the i-th bit of the multiplicand X, is supplied via the terminal 106 to the selector 107 and to the selector of the circuit for the next higher bit (not shown), that is, the (i+1)-th bit. The data bit X(i-1), supplied from the circuit for the next lower bit (not shown), that is, the (i-1)-th bit, is supplied to the selector 107 via the terminal 108.

[0014] Based on the control signal MOD, the selector 107 selects either the data bit X(i) or the data bit X(i-1). Selecting the data bit X(i) passes the data through unchanged, whereas selecting the data bit X(i-1) shifts the data left by one bit and so doubles it. The exclusive-OR gate 108 then either passes the output of the selector 107 unchanged or inverts it, and finally the AND gate 109 either passes the output of the exclusive-OR gate 108 unchanged or forces it to "zero". When the exclusive-OR gate 108 inverts the output of the selector 107, a "1" is applied to the LSB carry input of the adder circuit 111 that follows the logic circuit 1032, so that the result is a sign inversion of the numerical value rather than a mere bitwise inversion of the binary number. The partial product generated by the logic circuit 103 is supplied to the adder circuit 111.

[0015] The adder circuit 111 and the register 112 accumulate the partial products. In the adder circuit 111, the partial products supplied in sequence are added to the accumulated value fed back from the register 112, and the addition output SUM is supplied to the register 112.

[0016] The register 112 holds the value supplied from the adder circuit 111 as the addition output SUM and feeds it back to the adder circuit 111. Before the operation starts, when the first partial product of the multiplicand X is to be stored in the register 112, the contents of the register 112 are cleared by the clear signal CL. When the last arithmetic cycle has finished, the register 112 supplies the addition output SUM to the next-stage circuit.
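
The FIG. 11 flow described above (shift the multiplicand, select X(i) or X(i-1), optionally invert, optionally force zero, then accumulate with a carry-in of 1 on inversion) can be sketched in Python as follows. This is only a behavioral model under stated assumptions: the name `booth2_mac`, the default width of 8 bits, the choice B2 = 2 x B1, and the use of integer shifts in place of per-bit selectors are all illustrative and not taken from the patent.

```python
# Hypothetical behavioral model of the FIG. 11 flow with second-order
# (radix-4) Booth recoding. Names and widths are assumptions for illustration.

def booth2_mac(x: int, p2: int, width: int = 8) -> tuple[int, int]:
    """Multiply two width-bit two's-complement values; return (product, cycles)."""
    assert width % 2 == 0
    b2 = 2 * width                        # working bit length B2 (assumed 2 * B1)
    mask = (1 << b2) - 1
    xe = x & mask                         # multiplicand sign-extended to B2 bits
    m = p2 & ((1 << width) - 1)
    acc, cycles, below = 0, 0, 0          # accumulator register cleared by CL
    for k in range(0, width, 2):          # two multiplier bits per cycle, LSB first
        lo, hi = (m >> k) & 1, (m >> (k + 1)) & 1
        digit = -2 * hi + lo + below      # Booth digit in {-2, -1, 0, 1, 2}
        below = hi
        shifted = (xe << k) & mask        # role of barrel shifter 102
        # role of selector 107: X(i) passes through, X(i-1) doubles the value
        sel = (shifted << 1) & mask if abs(digit) == 2 else shifted
        if digit < 0:
            sel ^= mask                   # role of exclusive-OR gate 108: invert
        if digit == 0:
            sel = 0                       # role of AND gate 109: force zero
        carry_in = 1 if digit < 0 else 0  # turns the inversion into a negation
        acc = (acc + sel + carry_in) & mask   # adder 111 plus register 112
        cycles += 1
    if acc & (1 << (b2 - 1)):             # reinterpret the B2-bit result as signed
        acc -= 1 << b2
    return acc, cycles


product, cycles = booth2_mac(-3, 5)
print(product, cycles)  # -15 4
```

In this model an 8-bit multiplier always takes 4 cycles, one per pair of multiplier bits, regardless of how many of the partial products are zero.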
the product-sum operation circuit shown in FIG. 11 requires a number of cycles corresponding to the word length of the multiplier P2. Next, the product-sum operation circuit shown in FIG. 15 will be explained. In FIG. 15, parts common to the configuration shown in FIG. 11 are denoted by the same reference numerals, and redundant explanation is omitted. In the configuration of FIG. 15, data of the multiplicand X, whose bit length is B1, is supplied to the selector 121. Data with a bit length of B2 is also fed back to this selector 121 from the register 122. Meanwhile, the multiplier P2 is supplied to a decoder (not shown), which forms the control signal SFT and the control signal MOD described below. The control signal SFT is supplied to the selector 121, and the control signal MOD is supplied to the logic circuit 103. A specific configuration of this selector 121 is shown in FIG. 16. The selector 121 shown in FIG. 16 is the configuration for one bit, and selectors 121 of the same configuration are provided for the bit length B2. That is, the data of bit length B2 fed back from the register 122 is supplied to the selector 121 for each bit and is connected in common to all of the selectors 121. The selector 121 and the register 122 operate like a shift register. In the first calculation cycle, the selector 121 in FIG. 16 selects the supplied multiplicand X and takes it into the register 122; at this time, the selector 121 selects the data bit X(i) in FIG. 16. From the next cycle, the selector 121 and the register 122 operate like a shift register in order to generate partial products sequentially, starting from the LSB of the multiplier P2. When the multiplication shown in FIG. 13 is performed, the selector 121 for each bit shown in FIG. 16 selects the output from the register of the (i-1)th bit, one bit lower, in order to shift the data taken into the register 122 to the left by one bit. From the next operation cycle onward, the output of the register one bit lower than the bit selected in the previous operation cycle is selected. When multiplication as shown in FIG. 14 is performed, the selector 121 for each bit shown in FIG. 16 selects the output from the register 2 bits lower, in order to shift the data taken into the register 122 to the left by 2 bits. From the next operation cycle onward, the output of the register 2 bits lower than the bit selected in the previous operation cycle is selected. The register 122 holds the data of bit length B2 supplied from the selector 121, supplies it to the logic circuit 103, and feeds it back to the selector 121. The barrel shifter 102 serving as a selector shown in FIG. 12 and the selector 121 shown in FIG. 15 are multi-input selectors for the following reason. For data with many "0"s such as [000010001000], or data with long runs of consecutive "0"s or "1"s such as [111000011110], the operations for generating partial products can be skipped and processing can be sped up. Such bit skipping requires shifts by an arbitrary number of bits. However, for data in which "0" and "1" alternate, such as [101011010101], the calculations cannot be skipped.
[Problem to be Solved by the Invention] In the conventional techniques described above, a multi-input selector is required in any case, so the number of gates increases, the circuit scale grows, and the circuit becomes complicated. In addition, since operations were performed on all bits without skipping unnecessary operations, the circuits were not efficient. Therefore, an object of the present invention is to provide an arithmetic circuit that is more efficient in terms of the number of gates, the number of arithmetic cycles, and so on.
[Means for Solving the Problems] The arithmetic circuit according to claim 1 is an arithmetic circuit that obtains the multiplication result of data consisting of a plurality of bits by forming partial products, in which the second-order Booth algorithm is applied to the data in units of 4 bits, the data is shifted in units of 4 bits so that a partial product for 4 bits is generated in each calculation cycle, and, when looking ahead at the data shows that the partial product to be generated in the next calculation cycle will be zero, the calculation of that next partial product is skipped. The arithmetic circuit according to claim 2 includes a control signal generation circuit that forms various control signals, circuit components that each process one bit of the data, and an accumulator that accumulates the outputs of the circuit components. Each circuit component includes: first selection means for selecting, from among the bit data for the operation and a plurality of bit data supplied from lower-bit circuit components at a predetermined bit interval, arbitrary bit data for defining the shift amount of the data; second selection means for selecting, from among the 1-bit data supplied from the first selection means and a predetermined number of bits of 1-bit data, arbitrary 1-bit data for specifying the shift amount of the data; an accumulator that forms an addition output and a carry output of the circuit component based on the output from the second selection means and the carry output from the lower-bit circuit component, and supplies the carry output of the circuit component to the upper-bit circuit component; and an output circuit that selects and holds either the addition output and carry output supplied from the upper-bit circuit component or the addition output and carry output supplied from the accumulator, and supplies the held values to the lower-bit circuit component.
[Operation] The arithmetic circuit according to claim 1 applies the second-order Booth algorithm in units of 4 bits to data consisting of a plurality of bits, shifts the data in units of 4 bits, and generates a partial product for 4 bits in each calculation cycle. At this time, the next data is looked ahead, and if it is determined that the partial product of the next calculation cycle will be zero, the calculation of the partial product in the next calculation cycle is skipped. In the arithmetic circuit according to claim 2, in the first selection means of the circuit component, the shift amount of the data required when applying the Booth algorithm is defined based on the control signal, and arbitrary bit data corresponding to this shift amount is selected by a selector.
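The operation described for claim 1 can be illustrated with a small software model. The sketch below is not the patented circuit; it only assumes the standard radix-16 form of Booth recoding, in which each 4-bit group, together with the bit just below it, yields a digit in the range -8 to +8, and it skips groups whose digit is zero. The function names `booth4_digits` and `multiply_with_skip` are hypothetical, chosen for illustration.

```python
def booth4_digits(multiplier: int, nbits: int) -> list[int]:
    """Recode an nbits-wide two's-complement multiplier into 4-bit Booth digits.

    Digits are returned LSB group first. Each digit is computed from the four
    bits of the group plus the bit just below the group (0 for the first group).
    """
    if nbits % 4:
        raise ValueError("nbits must be a multiple of 4")
    p = multiplier & ((1 << nbits) - 1)
    digits = []
    below = 0  # bit below the current group, corresponding to R(-1)
    for k in range(0, nbits, 4):
        b0, b1, b2, b3 = ((p >> (k + j)) & 1 for j in range(4))
        digits.append(-8 * b3 + 4 * b2 + 2 * b1 + b0 + below)
        below = b3
    return digits


def multiply_with_skip(x: int, multiplier: int, nbits: int) -> tuple[int, int]:
    """Return (product, number of non-zero 4-bit partial products generated)."""
    acc = 0
    generated = 0
    for group, digit in enumerate(booth4_digits(multiplier, nbits)):
        if digit == 0:
            continue  # zero partial product: nothing to add, group is skipped
        acc += (digit * x) << (4 * group)
        generated += 1
    return acc, generated


if __name__ == "__main__":
    # A 12-bit two's-complement multiplier, [1111 1110 1100] = -20
    print(booth4_digits(0b111111101100, 12))
    print(multiply_with_skip(7, 0b111111101100, 12))
```

The count returned here is the number of non-zero 4-bit groups, not the number of hardware cycles; in the circuit a single group may need more than one cycle.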
In the second selection means, the shift amount of the data necessary to obtain the multiple specified by the Booth algorithm is defined based on the control signal, and arbitrary bit data corresponding to this shift amount is selected by the selector. In the accumulator, the partial products of the circuit component are accumulated based on the output from the second selection means, and an addition output and a carry output of the circuit component are formed. The outputs from the circuit components are added in sequence to obtain the multiplication result.
[Embodiment] An embodiment of the present invention will be described below with reference to FIGS. 1 to 10. The arithmetic circuit shown in FIG. 1 consists of N circuit components S(N-1) to S(0), a control signal generation circuit 10, and an accumulator 20. The control signal generation circuit 10 forms the control signals SFT11, SFT10, SFT21, and SFT20, the control signals MOD-0 and MOD-1, the clock signal CLK, the clear signal CL, the load signal LD, and so on, and supplies these signals to each of the circuit components S(N-1) to S(0) to control their circuit operation. The circuit components S(N-1) to S(0) are provided in a number corresponding to the bit length B2 in FIG. 11 or FIG. 15 showing the prior art, that is, N (N=B2). Here, the data supplied to the terminals AT(N-1) to AT(0) originally has the bit length B1 shown in FIG. 11 or 15, so the data of bit length B2 is formed by sign-extending the MSB of the data of bit length B1. For example, looking at the circuit component S(i) among the circuit components S(N-1) to S(0), the 1-bit data R(i-1) to R(i-4) and R(i-8) and the 1-bit carry output CY(i-1) are supplied from the lower-bit circuit components, and the 1-bit data U(i+1) and V(i+1) are supplied from the upper-bit circuit component S(i+1). In addition, the data bit X(i) is supplied via the terminal AT(i). This data bit X(i) is the i-th bit of the multiplicand X. The circuit component S(i) supplies the 1-bit carry output CY(i) and the 1-bit data R(i) to the upper-bit circuit component S(i+1), and supplies the 1-bit data U(i) and V(i) to the lower-bit circuit component S(i-1). In the configurations of FIGS. 1 and 2, any input whose index in parentheses, for example (i), (i-1), (i-4), or (i-8), is negative is assumed to be connected to "zero (ground)". Since the circuit components S(N-1) to S(0) all have the same configuration, the circuit component S(i) is taken as an example in the following description. The accumulator 20 mainly consists of a full adder 21 and flip-flops 22 and 23. In the full adder 21, the outputs U(0) and V(0) from the circuit component S(0) are added together with the carry output CY20 of the previous cycle, and the addition output SUM20 and the carry output CY20 are formed. The carry output CY20 is fed to the full adder 21 via the flip-flop 22.
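The behaviour of the accumulator 20 can be pictured as an ordinary bit-serial adder. The following is a minimal sketch under stated assumptions: the two input bits arriving in the same clock cycle have the same weight, the carry flip-flop starts cleared, and one result bit is produced per clock with the LSB first. The helper names `to_bits`, `from_bits`, and `serial_accumulate` are hypothetical.

```python
def to_bits(value: int, width: int) -> list[int]:
    """Bits of a non-negative value, LSB first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits: list[int]) -> int:
    """Inverse of to_bits: interpret an LSB-first bit list as an integer."""
    return sum(b << i for i, b in enumerate(bits))


def serial_accumulate(u_bits: list[int], v_bits: list[int]) -> list[int]:
    """Add two LSB-first bit streams one bit per clock, like a full adder
    whose carry output is delayed by one flip-flop and fed back."""
    carry = 0  # models flip-flop 22, assumed cleared before the first bit
    out = []
    for u, v in zip(u_bits, v_bits):
        total = u + v + carry
        out.append(total & 1)  # addition output for this clock
        carry = total >> 1     # carry output, used in the next clock
    return out                 # a final carry beyond the word length is dropped


if __name__ == "__main__":
    u = to_bits(11, 5)
    v = to_bits(6, 5)
    print(from_bits(serial_accumulate(u, v)))
```

Because the carry is resolved one bit per clock, the conversion takes as many clocks as the output word length.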
Further, the addition output SUM20 is taken out sequentially from the terminal 24, LSB first, via the flip-flop 23. The structure of the circuit component S(i) is shown in FIG. 2. The circuit block BI indicated by a broken line corresponds to the selector 121 and the register 122 in FIG. 15, the circuit block BJ corresponds to the selector 107 of the prior art, the circuit block BK corresponds to the exclusive OR gate 108 and the AND gate 109 in FIG. 14, and the circuit block BL corresponds to the adder circuit 111 and the register 112 of the prior art. The data bit X(i) of the multiplicand X supplied via the terminal AT(i) is taken into the flip-flop 31. The flip-flop 31 corresponds to the i-th bit of the register 122 shown in FIG. 15, and the data bit X(i) is supplied from the flip-flop 31 to the selector 32 of the circuit block BI. The selector 32 of the circuit block BI is a 4-input selector, which receives the data bit X(i), the data R(i-4) and R(i-8), and the data R(i) fed back from the flip-flop 33. The selector 32 also receives the control signals SFT11 and SFT10 from the control signal generation circuit 10. Based on the control signals SFT11 and SFT10, the selector 32 selects one of the data R(i-8), R(i-4), R(i), and X(i) and supplies it to the flip-flop 33. With this selector 32, in addition to loading the data bit X(i), it is possible to hold the data R(i), to select the data R(i-4) and shift to the left by 4 bits, or to select the data R(i-8) and shift to the left by 8 bits. The selection based on the control signals SFT11 and SFT10 will be described later. The output R(i) from the flip-flop 33 is supplied to other higher-bit circuit components, is supplied to the input side of the selector 35 of the circuit block BJ, and is also fed back to the input side of the selector 32. In the circuit block BJ, the second-order Booth algorithm is extended to 4 bits and applied.
By selecting one of the data R(i), R(i-1), R(i-2), and R(i-3), the way the partial product is obtained can be chosen from among 1x, 2x, 4x, and 8x. The selector 35 of the circuit block BJ is a 4-input selector, which receives the data R(i) from the flip-flop 33 and the data R(i-1), R(i-2), and R(i-3) supplied from the lower-bit circuit components S(i-1), S(i-2), and S(i-3). The selector 35 is also supplied with the control signals SFT21 and SFT20 from the control signal generation circuit 10. Based on the control signals SFT21 and SFT20, the selector 35 selects one of the data R(i), R(i-1), R(i-2), and R(i-3), and the selected data R is supplied to the circuit block BK via the flip-flop 36. Which data R is selected by the control signals SFT21 and SFT20 will be described later. In the circuit block BK, the NAND gates 41 and 42 and the OR gate 43 constitute the equivalent of the exclusive OR gate 108 and the AND gate 109 in FIG. 14. The data output from the flip-flop 36 is supplied to the NAND gates 41 and 42. The control signal MOD-0 is supplied to the NAND gate 41, and the control signal MOD-1 is supplied to the NAND gate 42. The outputs of the NAND gates 41 and 42 are supplied to the OR gate 43, and the output of the OR gate 43 is supplied to the circuit block BL via the flip-flop 44. A fixed relationship exists between the combination of the control signals MOD-0 and MOD-1 and the operation of the circuit block BK. The circuit block BL is an accumulator consisting of a full adder 51 and flip-flops 52 and 53. Prior to execution of the calculation, the contents of the flip-flops 52 and 53 are cleared by the clear signal CL supplied from the control signal generation circuit 10. The full adder 51 receives the output from the flip-flop 44 of the circuit block BK, the output fed back from the flip-flop 53, and the carry output CY(i-1) from the circuit block BL of the lower-bit circuit component S(i-1), and adds these values. The full adder 51 forms a carry output CY(i) and an addition output SUM(i). The carry output CY(i) is supplied to the flip-flop 52, and the addition output SUM(i) is supplied to the flip-flop 53. The contents of the flip-flops 52 and 53 are cleared for the first calculation cycle; in each calculation cycle, the addition output SUM(i) is fed back to the input side of the full adder 51 via the flip-flop 53. At the same time, the addition output SUM(i) is supplied to the selector 56 of the circuit block BM, and the carry output CY(i) is supplied via the flip-flop 52 to the circuit block BL of the upper-bit circuit component S(i+1). Meanwhile, the carry output CY(i-1) supplied from the lower-bit circuit component S(i-1) is supplied to the input side of the full adder 51 and also to the selector 57 of the circuit block BM. The output from the circuit block BL is a redundant binary number in which each digit is represented by two bits. The circuit block BM has selectors 56 and 57 and flip-flops 58 and 59, and the load signal LD is supplied from the control signal generation circuit 10 to each of the selectors 56 and 57. The selector 56 is supplied with the addition output SUM(i) and the output U(i+1) supplied from the upper-bit circuit component S(i+1), and the selector 57 is supplied with the carry output CY(i-1) supplied from the lower-bit circuit component S(i-1) and the output V(i+1) supplied from the upper-bit circuit component S(i+1). When the product-sum result from the circuit block BL is obtained, the load signal LD is supplied, the selectors 56 and 57 of the circuit block BM are switched, the addition output SUM(i) from the circuit block BL is loaded into the flip-flop 58, and the carry output CY(i) is loaded into the flip-flop 59. After that, the outputs U(i) and V(i) are shifted from the MSB circuit component S(N-1) side toward the LSB circuit component S(0) side at the timing of the clock signal CLK, and cumulative addition is performed in the accumulator 20. Shifting out the result from this circuit requires a number of cycles equal to the output word length, but this is not a problem as long as it is small compared with the number of cycles that the circuit blocks BI to BM require for one product-sum operation. The noteworthy point in FIGS. 1 and 2 is the selectors 32 and 35 of the circuit blocks BI and BJ.
Compared with the selector 121 in FIG. 15 and the selector 107 in the logic circuit 103 of the prior art, the allocation of circuitry for operation skipping has been changed. In the product-sum calculation, if generation of partial products is not skipped, the circuit block BI may be as shown in FIG. 8. In FIG. 8, the selector 61 is supplied with the data bit X(i) via the terminal AT(i) and with the data R(i-1) or R(i-2). In this case, if the Booth algorithm is not used, the block is just a shift register (the shift input is the output R(i-1)). If the second-order Booth algorithm is used, a 2-bit shift is always performed, so the shift input in FIG. 8 is the output R(i-2). On the other hand, if the generation of partial products is to be skipped, the circuit block BI must be similar to the selector 121 in FIG. 16 showing the prior art. In such a configuration, however, a large number of gates are spent on infrequent multi-bit skipping, and efficiency decreases. Therefore, as shown in FIG. 9 or 10, a configuration may be considered in which the range over which skipping is possible is limited to a certain extent. The configuration shown in FIG. 9 allows only 2-bit skipping, on the premise that the second-order Booth algorithm is applied. FIG. 10 shows a selector 65 that also incorporates the selector 107 of FIG. 13 showing the prior art, and can skip selectively within the range of the data R(i-1) to R(i-6). However, these configurations still require the multi-input selectors 63 and 65, and the loop for selecting the data R(i-1) to R(i-6) takes time, so they are not very suitable. In contrast, the circuit blocks BI and BJ shown in FIG. 2 have uniform 4-input selectors 32 and 35, respectively. Such small-input selectors 32 and 35 are preferable because they can be built with a small number of gates and the feedback loop can be made faster. The fan-out of the flip-flop 33 of the circuit block BI, which holds the data R(i), is also kept small compared with other methods. In addition, the circuit blocks BJ to BM are all feedforward except for the feedback loop in the circuit block BL, and such a feedforward circuit can easily be sped up by pipelining. Together with the equalized circuit configuration, the pipelined circuit shown in FIG. 2 is capable of very high-speed operation. The control signal generation circuit 10 in FIG. 1 generates the various control signals described above, and a memory can be used as one means of realizing this function. This is because the required number of calculation cycles differs depending on the value of the multiplier, there are various ways of decomposing the multiplier into partial products, and the decomposition with the minimum number of cycles should be searched for. It is therefore appropriate not to perform such processing in real time, but to obtain the solution in advance and store only the time series of the various control signals in a memory, from which they are then generated. If the multiplier cannot be determined in advance, or if there are too many multipliers to store in the memory, the control signal generation circuit 10 configured as shown in FIGS. 3 and 4, for example, is used. The following explanation of FIGS. 3 and 4 is not only an explanation of the hardware; it also shows an example of the method of determining the time series of the various control signals in advance when a memory is used in place of the controller as described above. In the configuration of FIG. 3, (N+1) circuit components T(N-1) to T(0) are connected in cascade from the MSB to the LSB. The circuit block BI in FIG. 2 is used as the circuit components T(N-1) to T(0), and the control signals SFT11 and SFT10 supplied to the circuit components T(N-1) to T(0) are also common. The multiplier P2 is supplied to the circuit components T(N-1) to T(0) via the terminals CT(N-1) to CT(0), and the circuit components T(N-1) to T(0) as a whole constitute a shift register. However, the way of connection when constructing the shift register differs between FIG. 1 and FIG. 3. That is, what was a left shift in FIG. 1 is a right shift in FIG. 3. Further, N matches the input word length of the multiplier P2 and is different from B2. Note that when the values in parentheses of the data supplied to the circuit block BI as R(i-8) and R(i-4) are larger than (N-1), they are taken to be (N-1). For example, looking at an arbitrary circuit component T(i) among the circuit components T(N-1) to T(0), the 1-bit data R(i+8) and R(i+4) are supplied from the upper-bit circuit components T(i+8) and T(i+4), respectively, the data bit P2(i) is supplied via the terminal CT(i), and the 1-bit data R(i) is supplied to the lower-bit circuit components. The circuit operation of each of the circuit components T(N-1) to T(-1) is controlled by the control signals SFT11 and SFT10, the clear signal CL, and so on supplied from the control signal generation circuit 10. The control signals SFT11 and SFT10 are the same as those in FIG. 1, and control is performed so that the shift directions are opposite. Since the circuit components T(N-1) to T(-1) all have the same configuration, the circuit component T(i) is taken as an example in the following description. The circuit component T(-1) is provided for the second-order Booth algorithm and is cleared when the multiplier P2 is loaded. The data R(7) to R(-1) from the circuit components T(7) to T(-1) are supplied to the circuit block BO shown in FIG. 4. This circuit block BO includes a logic circuit 71, which is supplied with the signals R(3) to R(-1) and realizes the logic of FIGS. 6 and 7 described later
, a comparator 72 to which the signals R(7) to R(3) are supplied, and a skip circuit 73 that determines the shift amount for the next calculation cycle. Based on the supplied signals R(3) to R(-1), the logic circuit 71 forms the shift amount regulation signals X to 8X in the upper columns of FIGS. 6 and 7, the signals TH and IV corresponding to the control signals MOD-0 and MOD-1, the skip signal SK, and so on. The skip signal SK is supplied to the skip circuit 73. The comparator 72 checks whether the supplied signals R(7) to R(3) match one another; for example, it outputs a high-level comparison signal CP only when all 5 bits of the signals R(7) to R(3) are at high level or all are at low level. The comparison signal CP is supplied to the skip circuit 73. The skip circuit 73 determines the shift amount for the next operation cycle in the shift registers constituted by the circuit blocks BI of FIGS. 1 and 3. In the skip circuit 73, the control signals SFT11 and SFT10 with the contents shown in the figure are formed. The various signals and control signals formed in the logic circuit 71 are explained below. The 5-bit data of the signals R(3) to R(-1) can be divided into 32 patterns depending on the combination of levels. When the second-order Booth algorithm is extended to 4 bits and applied, the partial product PP0 of the multiplicand X corresponding to each pattern is as shown in FIGS. 6 and 7. This partial product PP0 can be further decomposed into the partial product PP1. The partial product PP1 can in turn be formed as a partial product PP2 that combines at most two terms, each a power of 2 times the multiplicand X. Since the partial product PP2 consists of at most two such terms, the partial product PP0 can be obtained, as shown in FIG. 7, by computing the terms of the partial product PP2 one at a time and adding them. The terms of the partial product PP2 are all powers of 2 times the multiplicand X [0X, ±1X, ±2X, ±4X, ±8X]. A term that is a power of 2 times the multiplicand X can be obtained by shifting the multiplicand X by the required number of bits. The signals that define this shift amount are defined in the figure as the shift amount regulation signals X to 8X. The shift amount of the multiplicand X is defined by whichever of the shift amount regulation signals X to 8X is at high level. The control signals SFT21 and SFT20 are determined based on the combination of levels of the shift amount regulation signals X to 8X, and the shift operation is performed based on these control signals SFT21 and SFT20. The partial product PP2 can be obtained by selecting among the data R(i) to R(i-3) in the selector 35 of the circuit block BJ shown in FIG. 2. That is, selecting the data R(i) corresponds to multiplying the multiplicand X by 1, selecting the data R(i-1) corresponds to multiplying it by 2, selecting the data R(i-2) corresponds to multiplying it by 4, and selecting the data R(i-3) corresponds to multiplying it by 8. The signals TH and IV in FIG. 7 correspond to the control signals MOD-0 and MOD-1, respectively, and control whether the circuit block BK in FIG. 2 passes its input through as is, inverts it, or sets it to "zero". For example, when the signals are TH (=L) and IV (=L), the output is "zero"; when the signals are TH (=L) and IV (=H), the signal is inverted; and when the signals are TH (=H) and IV (=L), the signal is passed through as is. Furthermore, the skip signal SK in FIG. 7 is set to high level in certain cases, based on the 5-bit data R(3) to R(-1) input to the logic circuit 71 shown in FIG. 4. For example, for a pattern in which the partial product PP2 shown in FIG. 7 has only one term that is a power of 2 times the multiplicand X, no partial product PP2 is computed in the second cycle C2, so the skip signal SK goes to high level in the first cycle C1. Of course, the skip signal SK also goes to high level in the second cycle C2. In the column of the second cycle C2, "*" indicates a skippable part. Next, an operation example will be explained. An FIR
digital filter is taken as an example, with 3 taps and with coefficients h1 to h3 of 12 bits each (N=12).
(1) Coefficient h1 = [0000 0100 0010]
(2) Coefficient h2 = [0010 0000 0101]
(3) Coefficient h3 = [1111 1110 1100]
The coefficients h1 to h3 are expressed as two's complement numbers.
(1) For the coefficient h1 = [0000 0100 0010], the multiplicand X is supplied to the circuit block BI in the circuit component S(i), and the multiplier P2, in this case the coefficient h1, is supplied to the circuit component T(i). In the circuit block BI in the circuit component S(i), the data bit X(i) supplied via the terminal AT(i) is selected and taken in by the supplied control signals SFT11 and SFT10. In the first cycle of the multiplication by the coefficient h1, the lower 4 bits [0010] of the multiplier together with R(-1) [=0], corresponding to the data R(3) to R(-1) in FIG. 4, are checked in the logic circuit 71. Since this 5-bit data is [00100], the following can be seen from FIGS. 6 and 7. The shift amount regulation signals X to 8X are "LHLL", that is, only the shift amount regulation signal 2X is output at high level; the signal TH is output at "H", the signal IV at "L", and the skip signal SK at "H". As a result, the partial product PP0 only needs to be calculated as "2X", the circuit block BK outputs the output from the circuit block BJ as is, and the calculation finishes in the first cycle C1 alone, so it is possible to skip to the processing of the next 4 bits of the multiplier P2.
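The lookup that the logic circuit 71 performs in this step can be sketched in software. This is only an illustrative model: the contents of FIGS. 6 and 7 are not reproduced in the text, so the way digits such as 3, 6, and 7 are split into two terms below is an assumption, and the names `decode_window` and `split_terms` are hypothetical. The 5-character window is ordered R(3), R(2), R(1), R(0), R(-1).

```python
def split_terms(digit: int) -> list[tuple[int, int]]:
    """Split a digit in -8..8 into at most two (sign, shift) terms whose
    signed sum of sign * 2**shift equals the digit. Larger term first."""
    sign = -1 if digit < 0 else 1
    mag = abs(digit)
    if mag == 0:
        return []
    for hi in (3, 2, 1, 0):
        if mag == 1 << hi:
            return [(sign, hi)]            # single term: one cycle is enough
    for hi in (3, 2, 1):
        for lo in range(hi):
            if mag == (1 << hi) + (1 << lo):
                return [(sign, hi), (sign, lo)]
            if mag == (1 << hi) - (1 << lo):
                return [(sign, hi), (-sign, lo)]
    raise ValueError("digit out of range")


def decode_window(window: str) -> tuple[int, list[tuple[int, int]]]:
    """Map a 5-bit window to its 4-bit Booth digit and its power-of-2 terms."""
    b3, b2, b1, b0, below = (int(c) for c in window)
    digit = -8 * b3 + 4 * b2 + 2 * b1 + b0 + below
    return digit, split_terms(digit)


if __name__ == "__main__":
    for w in ("00100", "01000", "01010"):
        print(w, decode_window(w))
```

A shift of 1 in a term corresponds to "2X", a shift of 2 to "4X", and so on; a window that yields a single term would correspond to the case where the skip signal SK goes high in the first cycle.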
In order to obtain X'', in circuit block BJ in circuit component S(i), control signals SFT21, SFT
20 selects the output of the circuit block BI of the data R(i-1), that is, the one-bit lower circuit component S(i-1). Furthermore, as described above, in circuit block BK, the output from circuit block BJ is output as is, and this output is held in the accumulator of circuit block BL. When the processing of the lower 4 bits of the coefficient h1 is completed, the process moves to the next upper 4 bits. circuit component S
(i), T(i) circuit block BI uses control signals SFT11, SFT10 to control data R(i-
4) Select. The above data R(i-4) is selected and shifted by 4 bits. Note that this shift operation is performed by the multiplier P2(i), that is, the circuit component S
(i) is left shifted, and data bit X(i), ie, circuit component T(i), is shifted right. The central 4 bits of coefficient h1 are [0100]
Since the lower bit is [0], the 5-bit data corresponding to data R(3) to R(-1) in FIG. 4 is [01000]. In the logic circuit 71, the above-mentioned 5-bit data is [01000], so the following can be seen from FIGS. 6 and 7. The shift amount regulation signals X to 8X are “LLHL”.
”, only the shift amount regulation signal 4X is output at high level, the signal TH is output at “H”, the signal IV is output at “L”, and the skip signal SK is output at “H”. [0087] As a result, the partial product PP0 only needs to be calculated as "4X", and the circuit block BK outputs the output from the circuit block BJ as it is, and furthermore, the calculation is performed only in the first cycle C1. It can be seen that it is possible to skip to the processing of the next 4 bits of the multiplier P2.
In order to obtain X'', in circuit block BJ in circuit component S(i), control signals SFT21, SFT
20 selects the output of the circuit block BI of the data R(i-2), that is, the 2-bit lower circuit component S(i-2), and selects the output of the circuit block BI of the data R(i-2), that is, the 2-bit lower circuit component S(i-2), and
The data of " When the processing of the middle 4 bits of coefficient h1 is completed, circuit components S(i), T(i)
The circuit block BI attempts to select data R(i-8). At this time, the comparator 72 shown in FIG.
is supplied with the most significant 4 bits [0000] mentioned above, and the most significant data R(3) of the middle 4 bits mentioned above is supplied to
[=0] is added and the 5-bit data becomes [000]
00], the comparison signal CP is output at "H". The skip circuit 73 determines that since the above-mentioned 5-bit data becomes [00000], processing of the next 4 bits is unnecessary. Therefore, the processing of the coefficient h1 ends at this stage. The processing of coefficient h1 took two cycles. (2) Coefficient h2=[0010 000
0 0101], the multiplicand X is supplied to the circuit block BI in the circuit component S(i), and the multiplier P2, in this case the coefficient h2, is supplied to the circuit component T(i). In the circuit block BI in the circuit component S(i), the data bit X(i) supplied via the terminal AT(i) is selected and taken in by the supplied control signals SFT11 and SFT10. In the first cycle of the multiplication by the coefficient h2, the lower 4 bits [0101] of the multiplier together with R(-1) [=0], corresponding to the data R(3) to R(-1) in FIG. 4, are checked in the logic circuit 71. Since this 5-bit data is [01010], the following can be seen from FIGS. 6 and 7. In the first cycle C1, the shift amount regulation signals X to 8X are "LLHL", that is, only the shift amount regulation signal 4X is output at high level; the signal TH is output at "H", the signal IV at "L", and the skip signal SK at "L". In the second cycle C2, the shift amount regulation signals X to 8X are "HLLL", that is, only the shift amount regulation signal X is output at high level; the signal TH is output at "H", the signal IV at "L", and the skip signal SK at "H". As a result, in the first cycle C1 the partial product PP0 only needs to be calculated as "4X", the circuit block BK outputs the output from the circuit block BJ as is, and the calculation does not finish in the first cycle C1 alone, so it is not possible to skip to the processing of the next 4 bits of the multiplier P2. In the second cycle C2, the partial product PP0 only needs to be calculated as "X", the circuit block BK outputs the output from the circuit block BJ as is, and the calculation finishes in the second cycle C2. Therefore, in the first cycle C1, in order to obtain "4X", the control signals SFT21 and SFT20 in the circuit block BJ in the circuit component S(i) select the data R(i-2), that is, the output of the circuit block BI of the circuit component S(i-2) two bits lower. As described above, the circuit block BK outputs the output from the circuit block BJ as is, and this output is held in the accumulator of the circuit block BL. Next, in the second cycle C2, in order to obtain "X", the circuit block BJ in the circuit component S(i) uses the control signals SFT21 and SFT20 to select the data R(i), that is, the output of the circuit block BI of the circuit component S(i) of the same bit. As described above, the circuit block BK outputs the output from the circuit block BJ as is, and this output is held in the accumulator of the circuit block BL. When the processing of the lower 4 bits of the coefficient h2 is completed, in order to proceed to the processing of the next middle 4 bits, that is, [0000], the circuit blocks BI of the circuit components S(i) and T(i) select the data R(i-4) by means of the control signals SFT11 and SFT10. At this time, the comparator 72 shown in FIG. 4 is supplied with the middle 4 bits [0000] together with the most significant bit R(3) [=0] of the lower 4 bits, forming the 5-bit data [00000]. Therefore, as is clear from FIGS. 6 and 7, both the skip signal SK and the comparison signal CP are output at "H". Since the 5-bit data is [00000], the skip circuit 73 determines that processing of the next 4 bits is unnecessary. The processing of the middle 4 bits of the coefficient h2 is therefore completed at this stage. When the processing of the middle 4 bits of the coefficient h2 is completed, in order to proceed to the processing of the next upper 4 bits, that is, [0010], the circuit blocks BI of the circuit components S(i) and T(i) use the control signals SFT11 and SFT10
The data R(i-8
). [0104] The data of the upper 4 bits [0010] mentioned above includes the most significant data R(3) [=0] of the middle 4 bits.
] is added to form 5-bit data [00100]. In the logic circuit 71, the above-mentioned 5-bit data is [00100], so the following can be seen from FIGS. 6 and 7. In the first cycle C1, the shift amount regulation signals X to 8X are "LHLL" and only the shift amount regulation signal 2X is output at a high level, and the signal TH is "H".
For example, the signal IV is output at "L" and the skip signal SK is output at "H". [0106] As a result, the partial product PP0 only needs to be calculated as "2X", and the circuit block BK outputs the output from the circuit block BJ as it is, and furthermore, the calculation ends only in the first cycle C1. I understand. Therefore, in order to find “2X”, the circuit component S(i
) In the circuit block BJ inside, the control signal SFT21
, SFT20 selects the output of the circuit block BI of the data R(i-1), that is, the one-bit lower circuit component S(i-1). Furthermore, as described above, in circuit block BK, the output from circuit block BJ is output as is, and this output is held in the accumulator of circuit block BL. Therefore, the processing of the coefficient h2 ends at this stage. The processing of coefficient h2 takes 3 cycles, and the processing of coefficient h1
and h2 together took a total of 5 cycles.

(3) Coefficient h3 = [1111 1110 1100]

The multiplicand X is supplied to the circuit component S(i), and the multiplier P2, in this case the coefficient h3, is supplied to the circuit component T(i). In the circuit block BI in the circuit component S(i), the data bit X(i) supplied via the terminal AT(i) is selected and taken in under the control signals SFT11 and SFT10.

In the first cycle of the multiplication by coefficient h3, the 5-bit data corresponding to data R(3) to R(-1) in FIG. 4, that is, the lower 4 bits [1100] of the multiplier with the lower bit [R(-1)=0] appended, is checked in the logic circuit 71. Because this 5-bit data is [11000], the following can be seen from FIGS. 6 and 7. The shift amount regulation signals X to 8X are "LLHL", so that only the shift amount regulation signal 4X is at high level, the signal TH is output at "L", the signal IV is output at "H", and the skip signal SK is output at "H".

[0111] As a result, the partial product PP0 need only be "4X", the circuit block BK inverts the output of the circuit block BJ, and the calculation ends with the first cycle C1 alone, so processing can skip to the next 4 bits of the multiplier P2. Therefore, to obtain "4X", the circuit block BJ in the circuit component S(i) uses the control signals SFT21 and SFT20 to select the data R(i-2), that is, the output of the circuit block BI of the circuit component S(i-2). As described above, the circuit block BK inverts the output of the circuit block BJ, and this output is held in the accumulator of the circuit block BL.

[0113] When the processing of the lower 4 bits of coefficient h3 is completed, processing moves on to the next, middle 4 bits. In the circuit block BI of the circuit components S(i) and T(i), the data R(i-4) is selected under the control signals SFT11 and SFT10. Selecting the data R(i-4) shifts the data by 4 bits. Note that this shift is a left shift for the data bit X(i), that is, in the circuit component S(i), and a right shift for the multiplier P2(i), that is, in the circuit component T(i).

[0115] The middle 4 bits of coefficient h3 are [1110] and the bit below them is [1], so the 5-bit data corresponding to data R(3) to R(-1) in FIG. 4 is [11101]. Because the 5-bit data in the logic circuit 71 is [11101], the following can be seen from FIGS. 6 and 7. The shift amount regulation signals X to 8X are "HLLL", so that only the shift amount regulation signal X is at high level, the signal TH is output at "L", the signal IV is output at "H", and the skip signal SK is output at "H".

[0117] As a result, the partial product PP0 need only be "X", the circuit block BK inverts the output of the circuit block BJ, and the calculation ends with the first cycle C1 alone, so processing can skip to the next 4 bits of the multiplier P2. Therefore, to obtain "X", the circuit block BJ in the circuit component S(i) uses the control signals SFT21 and SFT20 to select the data R(i), that is, the output of the circuit block BI of the circuit component S(i) for the bit itself. As described above, the circuit block BK inverts the output of the circuit block BJ, and this output is supplied to the circuit block BL.

When the processing of the middle 4 bits of coefficient h3 is completed, the circuit block BI of the circuit components S(i) and T(i) goes to select the data R(i-8). At this time, the comparator 72 is supplied with the upper 4 bits [1111], to which the most significant bit R(3) [=1] of the middle 4 bits is appended to form the 5-bit data [11111], so the skip signal SK and the comparison signal CP are output at "H".

[0121] Because the 5-bit data is [11111], the skip circuit 73 determines that the next 4 bits need not be processed, that is, that the partial product PP0 need not be formed. The processing of coefficient h3 therefore ends at this stage. The processing of coefficient h3 took 2 cycles.
The total number of cycles for coefficients h1 to h3 was 7.

[0122] With 12-bit multipliers, a circuit with only a single accumulator normally takes 12 cycles per multiplication, and therefore 36 cycles for 3 taps. Using the second-order Booth algorithm halves this to 18 cycles. In this embodiment, however, only 7 cycles are required. Generally speaking, a B-bit multiplication with a single accumulator basically takes B calculation cycles; with the second-order Booth algorithm the number of calculation cycles is reduced to (B/2), and with the fourth-order Booth algorithm it becomes (3/8)B. This embodiment is based on the fourth-order Booth algorithm with calculation skipping added, but because the cycle for taking in each of the coefficients h1 to h3 cannot be skipped, the number of calculation cycles is approximately (3/8)B. In this way,
the second-order Booth algorithm is applied, in units of 4 bits, to data consisting of a plurality of bits, and partial products are generated in units of 4 bits in each calculation cycle. Because it is determined whether the partial product in the next calculation cycle will be zero, the calculation can be skipped when it is judged unnecessary, so the number of calculation cycles is reduced and partial products are generated efficiently. When the multiplier P2 is a fixed value, as in a digital filter with fixed characteristics or a cosine transform, choosing the multiplier P2 so as to reduce the number of calculation cycles makes the reduction even greater.

The selector 32 of the circuit block BI is a 4-input selector which, under the control signals SFT11 and SFT10, can take in the data bit P1(i), hold the output R(i), select R(i-4) to shift left by 4 bits, or select R(i-8) to shift left by 8 bits. The selector 35 of the circuit block BJ is a 4-input selector which selects one of the outputs R(i), R(i-1), R(i-2), and R(i-3), so that the partial product can be chosen from 1x, 2x, 4x, and 8x. In addition to the effects described above, the circuit therefore offers a wide range of functions while reducing the gate scale, achieving high speed, and avoiding circuit complexity.

[0125] In the circuit components T(7) to T(-1), the circuit block BO outputs the shift amount regulation signals X to 8X, the signals TH and IV corresponding to the control signals MOD-0 and MOD-1, the skip signal SK, and so on. This skip signal SK is supplied to the skip circuit 73. Furthermore, because the work of skipping the generation of unnecessary partial products and minimizing the number of calculation cycles is not performed by hardware but is determined separately in advance, optimization becomes possible.

[0126] According to the arithmetic circuit of the invention of claim 1, the second-order Booth algorithm is applied, in units of 4 bits, to data consisting of a plurality of bits, partial products are generated in units of 4 bits in each calculation cycle, and the data is looked ahead to determine whether the partial product in the next calculation cycle will be zero. The calculation of the partial product in the next calculation cycle can therefore be skipped when it is judged unnecessary, the number of calculation cycles can be reduced, and partial products are generated efficiently.

According to the arithmetic circuit of the invention of claim 2, the second-order Booth algorithm is extended and applied in units of 4 bits, and the functions required to apply the Booth algorithm are realized with small selectors (4-input selectors) rather than conventional multi-input selectors. In addition to the effects of the invention of claim 1, this configuration has the advantage that the gate scale can be reduced while a wide range of functions is provided, high speed can be achieved, and circuit complexity can be avoided. Furthermore, because the work of skipping the generation of unnecessary partial products and minimizing the number of calculation cycles is not performed by hardware but is determined separately in advance, optimization becomes possible.
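The worked examples for coefficients h2 and h3 in the description can be retraced with a small software model. The following Python sketch is illustrative only and is not part of the disclosed circuit: the names `TERMS`, `group_digits`, `partial_products`, and `cycles` are hypothetical, and the table that splits a digit into one or two multiples of X is an assumption inferred from the examples in the text (for instance 5 becomes 4X followed by X), since FIGS. 6 and 7 are not reproduced here.

```python
# Hypothetical model of the 4-bit-group Booth recoding walked through in the
# description. The split of each digit into terms is an assumption inferred
# from the worked examples; it is not copied from FIGS. 6 and 7.
TERMS = {0: [], 1: [1], 2: [2], 3: [4, -1], 4: [4], 5: [4, 1],
         6: [8, -2], 7: [8, -1], 8: [8]}


def group_digits(multiplier, bits=12):
    """Split a two's-complement multiplier into signed 4-bit Booth digits.

    Each digit is read from 5 bits R(3)..R(-1): the 4 bits of the group plus
    the most significant bit of the group below (0 for the lowest group).
    """
    word = multiplier & ((1 << bits) - 1)
    digits, below = [], 0
    for shift in range(0, bits, 4):
        nibble = (word >> shift) & 0xF
        r3 = nibble >> 3
        digits.append(-8 * r3 + (nibble & 0x7) + below)
        below = r3
    return digits  # lowest group first


def partial_products(digit):
    """Signed multiples of X (1, 2, 4 or 8 times), one per calculation cycle."""
    sign = -1 if digit < 0 else 1
    return [sign * term for term in TERMS[abs(digit)]]


def cycles(multiplier, bits=12):
    """Count calculation cycles when groups with a zero digit are skipped."""
    return sum(len(partial_products(d)) for d in group_digits(multiplier, bits))


h2 = 0b0010_0000_0101
h3 = 0b1111_1110_1100
for name, h in (("h2", h2), ("h3", h3)):
    print(name, group_digits(h), cycles(h))
# prints:
# h2 [5, 0, 2] 3
# h3 [-4, -1, 0] 2
```

Under these assumptions the model gives 3 cycles for h2 and 2 cycles for h3, matching the counts stated in the description. A negative term corresponds to the case in which the circuit block BK inverts the output of the circuit block BJ.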

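The figure of (3/8)B calculation cycles quoted for the fourth-order Booth algorithm can be reproduced by averaging over all 5-bit patterns R(3) to R(-1). The description does not give this derivation, so the sketch below is one possible reading and rests on stated assumptions: every pattern is taken as equally likely, a digit of magnitude 3, 5, 6, or 7 is assumed to need two terms chosen from X, 2X, 4X, and 8X, and the function names are hypothetical.

```python
# Hypothetical check of the cycle counts quoted in the description, assuming
# every 5-bit pattern R(3)..R(-1) is equally likely. Assumption: a digit of
# magnitude 3, 5, 6 or 7 needs two terms from {X, 2X, 4X, 8X}; others need one.
from fractions import Fraction
from itertools import product


def digit(bits5):
    """Signed value of one 4-bit group given bits (R3, R2, R1, R0, R-1)."""
    r3, r2, r1, r0, rm1 = bits5
    return -8 * r3 + 4 * r2 + 2 * r1 + r0 + rm1


def cycles_for(d, skip_zero):
    """Cycles spent on one digit; a zero digit costs a cycle unless skipped."""
    if d == 0:
        return 0 if skip_zero else 1
    return 2 if abs(d) in (3, 5, 6, 7) else 1


def average_cycles_per_bit(skip_zero):
    """Average cycles per multiplier bit over all 32 five-bit patterns."""
    patterns = list(product((0, 1), repeat=5))
    total = sum(cycles_for(digit(p), skip_zero) for p in patterns)
    return Fraction(total, 4 * len(patterns))  # 4 multiplier bits per group


print(average_cycles_per_bit(skip_zero=False))  # 3/8
print(average_cycles_per_bit(skip_zero=True))   # 23/64
```

Without skipping, the average is 3/8 of a cycle per multiplier bit, that is, (3/8)B cycles for B bits. With zero digits skipped, the average under the same assumptions falls slightly to 23/64 per bit. Fixed coefficients chosen to contain many zero digits, such as h1 to h3 in the examples, fall well below this average.
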
[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of an arithmetic circuit according to the present invention.

FIG. 2 is a block diagram showing the configuration of a circuit component S(i).

FIG. 3 is a block diagram showing the configuration of a control signal generation circuit.

FIG. 4 is a block diagram showing the configuration of a control signal generation circuit.

FIG. 5 is a schematic diagram showing the contents controlled by the control signals.

FIG. 6 is a schematic diagram showing the relationship between data input to a logic circuit and the signals output in response.

FIG. 7 is a schematic diagram showing the relationship between data input to a logic circuit and the signals output in response.

FIG. 8 is a block diagram showing the configuration of a circuit block BI of a circuit component.

FIG. 9 is a block diagram showing the configuration of a circuit block BI of a circuit component.

FIG. 10 is a block diagram showing the configuration of a circuit block BI of a circuit component.

FIG. 11 is a block diagram showing a conventional product-sum calculation circuit.

FIG. 12 is a block diagram showing a barrel shifter.

FIG. 13 is a circuit diagram showing an example of a logic circuit.

FIG. 14 is a circuit diagram showing an example of a logic circuit.

FIG. 15 is a block diagram showing a conventional product-sum calculation circuit.

FIG. 16 is a block diagram showing a selector.

[Explanation of symbols]

32, 35: selector
51: full adder
52, 53, 58, 59: flip-flop
X: multiplicand
P2: multiplier
h1 to h3: coefficients
PP0 to PP2: partial products
SK: skip signal

Claims

[Claims]

Claim 1: An arithmetic circuit that obtains the multiplication result of data consisting of a plurality of bits by forming partial products, characterized in that the second-order Booth algorithm is applied to the data in units of 4 bits, the data is shifted in units of 4 bits so that partial products are generated in units of 4 bits in each calculation cycle, and, when looking ahead at the data shows that the partial product to be generated in the next calculation cycle will be zero, the calculation of that next partial product is skipped.

Claim 2: An arithmetic circuit comprising a control signal generation circuit that forms various control signals, circuit components each of which processes one bit of data, and an accumulator that accumulates the outputs of the circuit components, characterized in that each circuit component comprises: a first selection means for selecting, from among a plurality of bit data supplied from lower-bit circuit components at a predetermined bit interval and bit data for operation, arbitrary bit data for defining a data shift amount; a second selection means for selecting, from among the 1-bit data supplied from the first selection means and 1-bit data supplied from circuit components contiguous on the lower side of the circuit component and equal in number to those of the first selection means, arbitrary 1-bit data for defining the data shift amount; an accumulator that forms an addition output and a carry output based on the output of the second selection means, supplies the carry output to the higher-bit circuit component, and takes the carry output from the lower-bit circuit component and the addition output as the outputs of the circuit component; and an output circuit that selects and holds either the addition output and carry output supplied from the higher-bit circuit component or the addition output and carry output supplied from the accumulator, and supplies the held value to the lower-bit circuit component.