JP2004054384A

JP2004054384A - Arithmetic unit

Info

Publication number: JP2004054384A
Application number: JP2002207978A
Authority: JP
Inventors: Takashi Hirozawa; 広沢　隆
Original assignee: Renesas Technology Corp
Current assignee: Renesas Technology Corp
Priority date: 2002-07-17
Filing date: 2002-07-17
Publication date: 2004-02-19

Abstract

PROBLEM TO BE SOLVED: To achieve multiplication in a Galois extension field with a small circuit scale.
SOLUTION: An element T_41 of one of a plurality of vectors constituting a matrix that is set based on a multiplier is given to one input of an AND gate 41 in a product-sum unit 401. Likewise, the elements T_42, T_43, and T_44 are given to one input of the AND gate 41 in the product-sum units 402, 403, and 404, respectively. One bit of a multiplicand is given to the other input of the AND gate 41 in each of the product-sum units 401-404. The output of the AND gate 41 is given to one input of an XOR gate 42, and the output of a register 43 is given to the other input of the XOR gate 42. The output of the XOR gate 42 is input to the register 43.
COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
この発明は演算器に関し、例えばガロア拡大体における乗算に適する。
【０００２】
【従来の技術】
近年、あらゆる分野においてデジタル化、ＩＴ化が進み、デジタルデータ記録や通信に対して注目が集まっている。この、デジタルデータ記録方式や通信方式における誤り訂正方法として、リードソロモン符号やＢＣＨ符号などがよく利用されている。これらの符号はガロア拡大体の性質を利用して作られており、その扱いにはガロア拡大体における演算が必須となる。特にガロア体ＧＦ（２）からガロア拡大体ＧＦ（２^ｍ）を得る際に採用されるｍ次の原始既約多項式の根をαとすると、ガロア拡大体における乗算ではａ＝α^ｉと、ｂ＝α^ｊとの乗算が必要となる。
【０００３】
図１４は、ガロア拡大体における乗算を行う従来の乗算器の構成を示すブロック図である。被乗数ａ及び乗数ｂはそれぞれ対数変換ブロック１０１，１０２に入力し、各々で指数ｉ，ｊが求められる。指数ｉ，ｊは加算ブロック１０３に与えられ、ここにおいて指数ｉ，ｊの和の（ｑ−１）を法とする剰余として指数ｋが得られる（但し体の位数２^ｍをｑとする）。その後指数ｋは逆対数変換ブロック１０４に与えられ、積ｃがα^ｋとして求められる。
【０００４】
【発明が解決しようとする課題】
上記の乗算器では、対数変換、逆対数変換の際、いずれも変換表を使うのが一般的であった。しかし、リードソロモン符号の符号器に採用される乗算器では、符号を構成するための生成多項式は唯一に決定されて乗数は一定であることが多い。従って、上述の変換表を用いた乗算器をリードソロモン符号の符号器に採用すると、異なる複数の乗数に対応するという不要な機能を果たすので、回路規模が冗長であるという問題点があった。
【０００５】
この発明は上記問題点を解決するためになされたもので、ガロア拡大体における乗算を小さな回路規模で実現する技術を提供することを目的としている。
【０００６】
【課題を解決するための手段】
この発明のうち請求項１にかかるものは演算器であって、入力端と、前記入力端に与えられた信号を少なくとも一つの単位時間だけ遅延して出力する出力端とを有する遅延器と、第１端及び第２端、並びに自身の前記第１端及び前記第２端に入力する信号の二値論理についての論理積を出力する出力端を有する第１ゲートと、前記第１ゲートの出力を入力する第１端と、前記遅延器の出力端に接続された第２端と、並びに自身の前記第１端及び前記第２端に入力する信号の二値論理についての排他的論理和を前記遅延器の前記入力端に出力する出力端とを有する第２ゲートとを備える積和単位を、少なくとも一つ含む。
【０００７】
この発明のうち請求項２にかかるものは、請求項１記載の演算器であって、乗数に基づいて設定される行列を構成する複数のベクトルが順次出力される乗数提供部と、被乗数を１ビットずつ出力する被乗数提供部とを更に備える。そして前記積和単位は前記ベクトルの要素毎に複数設けられて、各々の前記積和単位が自身に対応する前記要素を前記第１ゲートの前記第１端に入力され、いずれの前記積和単位も、自身の前記第１ゲートの前記第２端には前記被乗数の前記１ビットが入力され、前記遅延器は一つの単位時間だけ、自身の前記入力端に与えられた信号を遅延させる。
【０００８】
この発明のうち請求項３にかかるものは、請求項１記載の演算器であって、乗数に基づいて設定される行列を構成する複数のベクトルが、要素毎に順次出力される乗数提供部と、被乗数を１ビットずつ出力する被乗数提供部とを更に備える。そして、前記積和単位の前記第１ゲートの前記第１端には前記要素が入力され、前記積和単位の前記第１ゲートの前記第２端には前記被乗数の前記１ビットが入力され、前記遅延器は前記ベクトルを構成する要素の数と前記単位時間との積で、自身の前記入力端に与えられた信号を遅延させる。
【０００９】
【発明の実施の形態】
実施の形態１．
図１は本発明にかかる乗算器が適用可能な符号器１の構造を示すブロック図であり、（８，４）リードソロモン符号器として公知である。ここでは既約原始多項式及び生成多項式としてそれぞれ式（１），（２）を採用した場合を例示する。
【００１０】
【数１】

【００１１】
【数２】

【００１２】
符号器１は生成多項式に基づいて入力データの剰余を求めるため、生成多項式の係数に基づいた乗算器２００，２０１，２０２，２０３を備えている。生成多項式の係数は、不定元χの低次から高次へと向かう順にα^６，１，α^４，α^１２，１であるので、乗算器２００，２０１，２０２，２０３はそれぞれ乗数α^６，１，α^４，α^１２を用いてガロア拡大体の乗算を行う。最高次に対応する乗算器は不要である。
【００１３】
ここでは符号化の対象となる入力データＤは４ビット毎に更新され、符号器１は４ビットのレジスタ２０４，２０６，２０８，２１０、４ビットの排他的論理和回路（以下「ＸＯＲゲート」と称す）２０５，２０７，２０９，２１１を備えている。入力データＡは１クロック毎に４ビットずつＸＯＲゲート２１１に入力する。
【００１４】
符号化の際の初期値としてレジスタ２０４，２０６，２０８，２１０には４ビットの値“００００”が格納されている。またＸＯＲゲート２０５，２０７，２０９，２１１はビット毎に対をなす二入力の排他的論理和を求め、ビット毎に出力する。図２はＸＯＲゲート２０５，２０７，２０９，２１１の各々の構成例を示す回路図である。各々４ビットのデータＬ１，Ｌ２を入力し、これらの排他的論理和を採って４ビットのデータＬ３を出力する構成が示されている。二入力のＸＯＲゲート２１，２２，２３，２４が備えられる。
【００１５】
ＸＯＲゲート２１１はレジスタ２１０の出力と入力データＤとの排他的論理和を出力する。排他的論理和は通常の和の２を法とする剰余であり、ガロア拡大体ＧＦ（２^ｍ）の加算結果と一致する。ＸＯＲゲート２１１は乗算器２００，２０１，２０２，２０３に与えられ、これらにおいて上述の乗算が行われる。但し、乗算器において採用される乗数は生成多項式に依存する。図１では式（２）で例示された生成多項式Ｇ（χ）に基づいて、それぞれの乗算器の乗数が例示されているが、生成多項式が異なれば、他の構造を採ることができる。乗算器２００，２０１，２０２，２０３から得られた乗算結果たる出力は、それぞれレジスタ２０４，ＸＯＲゲート２０５，２０７，２０９に与えられる。
【００１６】
ＸＯＲゲート２０５はレジスタ２０４の出力と乗算器２０１の出力との排他的論理和を採ってレジスタ２０６へと出力する。ＸＯＲゲート２０７はレジスタ２０６の出力と乗算器２０２の出力との排他的論理和を採ってレジスタ２０８へと出力する。ＸＯＲゲート２０９はレジスタ２０８の出力と乗算器２０３の出力との排他的論理和を採ってレジスタ２１０へと出力する。但し乗算器２０１の乗数は１であるので、これを省略し、ＸＯＲゲート２１１の出力を直接にＸＯＲゲート２０５に与えることができる。
【００１７】
１クロック分の動作が進む毎に、入力データＡは新たな４ビットのデータをＸＯＲゲート２１１に与え、かつレジスタ２０４，２０６，２０８，２１０は、自身に与えられた４ビットのデータをそれぞれＸＯＲゲート２０５，２０７，２０９，２１１へと出力する。これにより、符号語の内の検査ブロックの４ブロックが、それぞれ上位から順にレジスタ２１０，２０８，２０６，２０４の出力として得られる。即ちレジスタ２０４からは検査ブロックの最下位４ビットが、レジスタ２１０からは検査ブロックの最上位４ビットが、それぞれ出力される。
【００１８】
図３は、乗算器２００，２０２，２０３として採用可能な、実施の形態１にかかる乗算器の構成を示す回路図である。
【００１９】
当該乗算器は、被乗数提供部１００と、乗数提供部３００と、積和処理部４００とを備えている。被乗数提供部１００は被乗数Ａの４ビットデータＡ_３Ａ_２Ａ_１Ａ_０を最上位ビットから順に、即ちＡ_３，Ａ_２，Ａ_１，Ａ_０の順に、１ビットずつ積和処理部４００に入力する。被乗数提供部１００は直列に接続された４つのレジスタ１０５，１０６，１０７，１０８を有しており、これらはこの順に積和処理部４００へと近づく配置がなされている。レジスタ１０５，１０６，１０７，１０８はそれぞれ１クロック分だけ遅延してデータを伝達する。
【００２０】
乗数提供部３００は既約原始多項式に依存した構造を有し、乗数に依存した初期値が設定される。図３では式（１）で示される既約原始多項式ｇ（χ）が採用された場合の構造を例示しており、いずれも自身に入力した信号に対して１クロック分の遅延を与えて出力するレジスタ３０１，３０２，３０３，３０５及びＸＯＲゲート３０４を備えている。式（１）で示される既約原始多項式ｇ（χ）が０に等しくなる根をαとおくと、ガロア拡大体ＧＦ（２^４）の元は、αの累乗として表すことができ、指数が増加する順に、ベクトル表現で次の様に表される。即ち（０１００），（００１０），（０００１），（１１００），（０１１０），（００１１），（１１０１），（１０１０），（０１０１），（１１１０），（０１１１），（１１１１），（１０１１），（１００１），（１０００）である。これ以上に指数が増加する場合には上記のベクトル表現が循環して採用される。
【００２１】
上記のベクトル表現を得るため、レジスタ３０１の出力はレジスタ３０２に入力し、レジスタ３０２の出力はレジスタ３０３に入力し、レジスタ３０３の出力はＸＯＲゲート３０４の一方の入力となり、ＸＯＲゲート３０４の出力はレジスタ３０５の入力となり、レジスタ３０５の出力はレジスタ３０１の入力となると共に、ＸＯＲゲート３０４の他方の入力ともなる。乗数提供部３００は１クロック毎にガロア拡大体ＧＦ（２^４）の元を、αの指数を１次ずつ低くして出力する。
【００２２】
一般に符号長が４ビットであるガロア拡大体の乗算は、上記公報にも開示されているように、式（３）で表され
【００２３】
【数３】

【００２４】
但し係数Ｔ_１１，Ｔ_１２，Ｔ_１３，Ｔ_１４は乗数をベクトル表現した値と一致する。また、式（４）〜（６）が成立する。ここで示された乗算記号「×」はガロア拡大体ＧＦ（２^ｍ）におけるベクトルの乗算を示す。
【００２５】
【数４】

【００２６】
【数５】

【００２７】
【数６】

【００２８】
つまり（３）式で表された行列は、乗数に基づいた複数のベクトル（Ｔ_１１，Ｔ_１２，Ｔ_１３，Ｔ_１４），（Ｔ_２１，Ｔ_２２，Ｔ_２３，Ｔ_２４），（Ｔ_３１，Ｔ_３２，Ｔ_３３，Ｔ_３４），（Ｔ_４１，Ｔ_４２，Ｔ_４３，Ｔ_４４）で構成されている。
【００２９】
さて、上述の通り、乗数提供部３００は１クロック毎にガロア拡大体ＧＦ（２^４）の元を、αの指数を１次ずつ低くして出力する。よって乗数Ｎとα^３との間でガロア拡大体の乗算を行った結果を、初期値としてレジスタ３０１，３０２，３０３，３０５に与えておく。こうすれば１クロック毎に式（３）の行列の縦ベクトルが右側から順に出力され、これと被乗数Ａとの積和演算が実行される。
【００３０】
積和処理部４００は、それぞれレジスタ３０１，３０２，３０３，３０５の出力を受け、またいずれもがレジスタ１０８の出力を受ける積和単位４０１，４０２，４０３，４０４を有している。つまり積和単位４０１，４０２，４０３，４０４は縦ベクトルの要素毎に複数設けられる。
【００３１】
積和単位４０１，４０２，４０３，４０４はいずれも同一の構成を有している。即ち、積和単位の各々はアンドゲート４１，ＸＯＲゲート４２、レジスタ４３を備えている。レジスタ４３は、入力端と、自身の入力端に与えられた信号を１クロック分遅延して出力する出力端とを有する。アンドゲート４１は、第１端及び第２端、並びに自身の第１端及び前記第２端に入力する信号の二値論理についての論理積を出力する出力端を有する。ＸＯＲゲートは、アンドゲート４１の出力を入力する第１端と、レジスタ４３の出力端に接続された第２端、並びに自身の第１端及び第２端に入力する信号の二値論理についての排他的論理和をレジスタ４３の入力端に出力する出力端とを有する。そしてアンドゲート４１の第１端にはレジスタ１０８の出力が与えられる。またアンドゲート４１の第２端には、乗算提供部３００が有しているレジスタのうちで、このアンドゲート４１が属する積和単位に対応するものの出力が入力される。例えば積和単位４０１について言えば、そのアンドゲート４１の第２端にはレジスタ３０１の出力が与えられる。
【００３２】
以下、図３に示された乗算器を、図１に示された乗算器２０３として採用した場合の動作について説明する。
【００３３】
図３は最初の積和演算を行う状態を示している。いずれの積和単位においても、その備えるレジスタ４３にはデータ“０”が格納されている。また被乗数提供部１００においては、レジスタ１０５，１０６，１０７，１０８からはそれぞれＡ_０，Ａ_１，Ａ_２，Ａ_３が出力されている。積和処理部４００が備えるレジスタ乗算器２０３の乗数はα^１２であるので積和処理部４００に出力する初期値としてはα^１５＝１が採用される。よってこの初期値に対応して、レジスタ３０１，３０２，３０３，３０５からはそれぞれＴ_４１，Ｔ_４２，Ｔ_４３，Ｔ_４４に対応する値“０”，“０”，“０”，“１”が出力されている。
【００３４】
積和単位４０１において、アンドゲート４１にはいずれも１ビットのデータＴ_４１、Ａ_３が入力する。レジスタ４３からその初期値として“０”が出力されているので、ＸＯＲゲート４２はアンドゲート４１の出力をそのまま出力する。これによりレジスタ４３の入力には、被乗数Ａの１ビットＡ_３と、乗数α^１５の１ビットＴ_４１とのガロア拡大体ＧＦ（２^ｍ）の乗算結果であるＡ_３・Ｔ_４１が与えられる。乗算記号「・」はガロア拡大体ＧＦ（２^ｍ）における乗算を示す。但し積和単位４０１の出力はレジスタ４３の出力から得られるので、この時点では積和単位４０１の出力は“０”である。
【００３５】
同様にして、積和単位４０２のレジスタ４３の入力にはＡ_３・Ｔ_４２が与えられ、積和単位４０３のレジスタ４３の入力にはＡ_３・Ｔ_４３が与えられ、積和単位４０４のレジスタ４３の入力にはＡ_３・Ｔ_４４が与えられる。そして積和単位４０２，４０３，４０４のいずれの出力も“０”である。
【００３６】
図４は、図３に示された状態から１クロック分の時間が経過した場合の状態を示すブロック図である。被乗数提供部１００においてレジスタ１０６，１０７，１０８からは、それぞれ１ビットのデータＡ_０，Ａ_１，Ａ_２が出力される。一方、乗数提供部３００からはレジスタ３０１，３０２，３０３，３０５においてそれぞれＴ_３１，Ｔ_３２，Ｔ_３３，Ｔ_３４に対応する値“１”，“０”，“０”，“１”が出力されている。
【００３７】
積和単位４０１において、レジスタ４３からは１クロック分前の動作で入力されたＡ_３・Ｔ_４１が出力されている。一方、アンドゲート４１にはいずれも１ビットのデータＴ_３１、Ａ_２が入力するので、ＸＯＲゲート４２はＡ_３・Ｔ_４１＋Ａ_２・Ｔ_３１（但し加算記号「＋」は排他的論理和を示す）を出力する。これによりレジスタ４３の入力にはＡ_３・Ｔ_４１＋Ａ_２・Ｔ_３１が与えられる。但し積和単位４０１の出力はＡ_３・Ｔ_４１である。
【００３８】
同様にして、積和単位４０２のレジスタ４３の入力にはＡ_３・Ｔ_４２＋Ａ_２・Ｔ_３２が与えられ、積和単位４０３のレジスタ４３の入力にはＡ_３・Ｔ_４３＋Ａ_２・Ｔ_３３が与えられ、積和単位４０４のレジスタ４３の入力にはＡ_３・Ｔ_４４＋Ａ_２・Ｔ_３４が与えられる。そして積和単位４０２，４０３，４０４の出力は、それぞれＡ_３・Ｔ_４２，Ａ_３・Ｔ_４３，Ａ_３・Ｔ_４４となる。
【００３９】
図５は、図４に示された状態から１クロック分の時間が経過した場合の状態を示すブロック図である。被乗数提供部１００においてレジスタ１０７，１０８からは、それぞれ１ビットのデータＡ_０，Ａ_１が出力される。一方、乗数提供部３００ではレジスタ３０１，３０２，３０３，３０５からそれぞれＴ_２１，Ｔ_２２，Ｔ_２３，Ｔ_２４に対応する値“１”，“１”，“０”，“１”が出力されている。
【００４０】
積和単位４０１において、レジスタ４３からは１クロック分前の動作で入力されたＡ_３・Ｔ_４１＋Ａ_２・Ｔ_３１が格納されている。一方、アンドゲート４１にはいずれも１ビットのデータＴ_２１、Ａ_１が入力するので、ＸＯＲゲート４２はＡ_３・Ｔ_４１＋Ａ_２・Ｔ_３１＋Ａ_１・Ｔ_２１を出力する。これによりレジスタ４３の入力にはＡ_３・Ｔ_４１＋Ａ_２・Ｔ_３１＋Ａ_１・Ｔ_２１が与えられる。但し積和単位４０１の出力はＡ_３・Ｔ_４１＋Ａ_２・Ｔ_３１である。
【００４１】
同様にして、積和単位４０２のレジスタ４３の入力にはＡ_３・Ｔ_４２＋Ａ_２・Ｔ_３２＋Ａ_１・Ｔ_２２が与えられ、積和単位４０３のレジスタ４３の入力にはＡ_３・Ｔ_４３＋Ａ_２・Ｔ_３３＋Ａ_１・Ｔ_２３が与えられ、積和単位４０４のレジスタ４３の入力にはＡ_３・Ｔ_４４＋Ａ_２・Ｔ_３４＋Ａ_１・Ｔ_２４が与えられる。そして積和単位４０２，４０３，４０４の出力は、それぞれＡ_３・Ｔ_４２＋Ａ_２・Ｔ_３２，Ａ_３・Ｔ_４３＋Ａ_２・Ｔ_３３，Ａ_３・Ｔ_４４＋Ａ_２・Ｔ_３４となる。
【００４２】
図６は、図５に示された状態から１クロック分の時間が経過した場合の状態を示すブロック図である。被乗数提供部１００においてレジスタ１０８からは１ビットのデータＡ_０が出力される。一方、乗数提供部３００ではレジスタ３０１，３０２，３０３，３０５からはそれぞれＴ_１１，Ｔ_１２，Ｔ_１３，Ｔ_１４に対応する値“１”，“１”，“１”，“１”が出力されている。
【００４３】
図３乃至図５に示された場合と同様に動作が進み、積和単位４０１のレジスタ４３の入力にはＡ_３・Ｔ_４１＋Ａ_２・Ｔ_３１＋Ａ_１・Ｔ_２１＋Ａ_０・Ｔ_１１が与えられ、積和単位４０２のレジスタ４３の入力にはＡ_３・Ｔ_４２＋Ａ_２・Ｔ_３２＋Ａ_１・Ｔ_２２＋Ａ_０・Ｔ_１２が与えられ、積和単位４０３のレジスタ４３の入力にはＡ_３・Ｔ_４３＋Ａ_２・Ｔ_３３＋Ａ_１・Ｔ_２３＋Ａ_０・Ｔ_１３が与えられ、積和単位４０４のレジスタ４３の入力にはＡ_３・Ｔ_４４＋Ａ_２・Ｔ_３４＋Ａ_１・Ｔ_２４＋Ａ_０・Ｔ_１４が与えられる。そして積和単位４０１，４０２，４０３，４０４の出力は、それぞれＡ_３・Ｔ_４１＋Ａ_２・Ｔ_３１＋Ａ_１・Ｔ_２１，Ａ_３・Ｔ_４２＋Ａ_２・Ｔ_３２＋Ａ_１・Ｔ_２２，Ａ_３・Ｔ_４３＋Ａ_２・Ｔ_３３＋Ａ_１・Ｔ_２３，Ａ_３・Ｔ_４４＋Ａ_２・Ｔ_３４＋Ａ_１・Ｔ_２４となる。
【００４４】
図７は、図６に示された状態から１クロック分の時間が経過した場合の状態を示すブロック図である。被乗数提供部１００において積和単位４０１，４０２，４０３，４０４からは、それぞれ１クロック分前においてレジスタ４３に入力していたＡ_３・Ｔ_４１＋Ａ_２・Ｔ_３１＋Ａ_１・Ｔ_２１＋Ａ_０・Ｔ_１１，Ａ_３・Ｔ_４２＋Ａ_２・Ｔ_３２＋Ａ_１・Ｔ_２２＋Ａ_０・Ｔ_１２，Ａ_３・Ｔ_４３＋Ａ_２・Ｔ_３３＋Ａ_１・Ｔ_２３＋Ａ_０・Ｔ_１３，Ａ_３・Ｔ_４４＋Ａ_２・Ｔ_３４＋Ａ_１・Ｔ_２４＋Ａ_０・Ｔ_１４が出力される。
【００４５】
これにより式（３）が計算されたことになり、被乗数Ａと乗数α^１２とのガウス拡大体ＧＦ（２^４）における乗算が行われたことになる。ＸＯＲゲート４２の一方の入力端にはアンドゲート４１から得られた乗数の１ビット及び被乗数の１ビットのガロア拡大体ＧＦ（２^ｍ）での乗算結果が与えられる。ＸＯＲゲート４２の他方の入力端には、ＸＯＲゲートの出力であって遅延されたものである、乗数の他の１ビット及び被乗数の他の１ビットのガロア拡大体ＧＦ（２^ｍ）での乗算結果が与えられる。よってレジスタ４３の遅延時間を適切に（ここでは１クロック分）設定することにより、ベクトル表現した乗数と被乗数との積を得ることができる。
【００４６】
図３に示された乗算器では乗算を計算するのに４クロック分が必要となる。よって符号化器１において本実施の形態に示された乗算器を採用する場合には、入力データＤを与えるタイミングを４倍引き延ばす必要がある。しかしながら乗数が固定されている場合に特に好適であり、生成多項式に基づいて乗数が固定された乗算器として採用されることにより、符号化器の構成の冗長を小さくすることができる。
【００４７】
符号長をｎビットとするならば、乗数α^ｓを乗算する場合には、乗数提供部３００のレジスタにはα^{ｓ＋（ｎ−１）}をベクトル表現した値を初期値として格納すればよい。そして乗数提供部３００は、上述の通り、ガロア拡大体を得るために用いた既約原始多項式に基づいて、レジスタ、ＸＯＲゲートを用いて構成することができる。
【００４８】
但し、乗数提供部３００としては、図３に示されたような１クロック毎にガロア拡大体ＧＦ（２^４）の元をαの指数を１次ずつ低くして出力するもの以外に、αの指数を１次ずつ高くして出力するものを採用することができる。図８はこのような乗数提供部３００の構造を例示するブロック図であり、例えば特公平６−８０４９１号公報によって開示されている。乗算器３００においてレジスタ３０１の出力Ｔ_１はＸＯＲゲート３０４の一方の入力となると共にレジスタ３０５の入力にもなる。レジスタ３０５の出力Ｔ４はＸＯＲゲート３０４の他方の入力となる。ＸＯＲゲート３０４の出力はレジスタ３０３の入力となる。レジスタ３０３の出力Ｔ_３はレジスタ３０２の入力となる。レジスタ３０２の出力Ｔ_２はレジスタ３０１の入力となる。レジスタ３０１，３０２，３０３，３０５の初期値として値Ｔ_１１，Ｔ_１２，Ｔ_１３，Ｔ_１４を格納しておくことにより、１クロック分の時間が経過する毎に、出力Ｔ_１は値Ｔ_１１，Ｔ_２１，Ｔ_３１，Ｔ_４１を、出力Ｔ_２はＴ_１２，Ｔ_２２，Ｔ_３２，Ｔ_４２を、出力Ｔ_３はＴ_１３，Ｔ_２３，Ｔ_３３，Ｔ_４３を、出力Ｔ_４はＴ_１４，Ｔ_２４，Ｔ_３４，Ｔ_４４を、それぞれ順次に採ってゆく。
【００４９】
図８の乗数提供部３００を採用する場合に式（３）を計算するためには、被乗数Ａが被乗数提供部１００に入力する順序を図３に示された場合と逆にする。即ち、まずＡ_０を入力し、その後順次にＡ_１，Ａ_２，Ａ_３を入力する。
【００５０】
実施の形態２．
図９は、図１に示された符号化器１の乗算器２００，２０２，２０３として採用可能な、実施の形態２にかかる乗算器の構成を示す回路図である。当該乗算器も実施の形態１と同様に、被乗数提供部１００と、乗数提供部３００と、積和処理部４００とを備えている。
【００５１】
実施の形態２の被乗数提供部１００の構造は、実施の形態１と同様に直列に接続された４つのレジスタ１０５，１０６，１０７，１０８を有しており、これらはこの順に積和処理部４００へと近づく配置がなされている。被乗数提供部１００には被乗数Ａの４ビットデータＡ_３Ａ_２Ａ_１Ａ_０が最上位ビットから順に、即ちＡ_３，Ａ_２，Ａ_１，Ａ_０の順に、１ビットずつ積和処理部４００に入力される。但しレジスタ１０５，１０６，１０７，１０８はそれぞれ後述するクロック信号ＣＫ２の１クロック分だけ遅延してデータを伝達する。
【００５２】
実施の形態２の乗数提供部３００は、実施の形態１に示された乗数提供部３００の構造（図３参照）に対して、マルチプレクサ３０６と、２ビットカウンタ３０７とを追加した構成を有している。マルチプレクサ３０６は２ビットカウンタ３０７の制御に基づいてレジスタ３０１，３０２，３０３，３０５の出力から一つを選択して出力する。２ビットカウンタ３０７はクロック信号ＣＫ１に基づいてカウント動作を行い、クロック信号ＣＫ１の４クロック分でカウント動作は一巡する。これに伴い２ビットカウンタ３０７からはクロック信号ＣＫ１の４クロック分を一周期とするクロック信号ＣＫ２を出力する。即ちクロック信号ＣＫ２の１クロック分はクロック信号ＣＫ１の４クロック分に相当する。そしてレジスタ３０１，３０２，３０３，３０５はそれぞれクロック信号ＣＫ２の１クロック分だけ遅延してデータを伝達する。
【００５３】
実施の形態２の積和処理部４００は、実施の形態１に示された積和単位が一つだけ備えられており、そのレジスタ４３を、ＸＯＲゲート４２の出力から近い方から順に、レジスタ４３ａ，４３ｂ，４３ｃ，４３ｄの直列接続として有する構造を呈している。これらのレジスタはいずれもクロック信号ＣＫ１の１クロック分だけ遅延してデータを伝達する。よってレジスタ４３の全体としては、乗数のベクトルを構成する要素の数４と、クロック信号ＣＫ１の１クロックとの積でデータが遅延して伝達されることになる。
【００５４】
図９は最初の積和演算を行う状態を示している。レジスタ４３ａ，４３ｂ，４３ｃ，４３ｄのいずれからも、データ“０”が出力されている。また被乗数提供部１００においては、レジスタ１０５，１０６，１０７，１０８にはそれぞれＡ_０，Ａ_１，Ａ_２，Ａ_３が出力されている。符号化器１が備えるレジスタ乗算器２０３の乗数はα^１２であるので積和処理部４００に出力する初期値としてはα^１５＝１が採用される。よってこの初期値に対応して、レジスタ３０１，３０２，３０３，３０５からそれぞれＴ_４１，Ｔ_４２，Ｔ_４３，Ｔ_４４に対応する値“０”，“０”，“０”，“１”が出力されている。積和処理部４００においてＸＯＲゲート４２からは出力Ａ_３・Ｔ_４１が得られている。
【００５５】
図１０は図９に示された状態からクロック信号ＣＫ１の１クロック分の時間が経過した場合の状態を示すブロック図である。この状態ではレジスタ１０５，１０６，１０７，１０８，３０１，３０２，３０３，３０５が出力するデータは変化していない。しかしマルチプレクサ３０６は２ビットカウンタ３０７の制御を受けて、データＴ_４２を選択して出力する。これによりＸＯＲゲート４２からは出力Ａ_３・Ｔ_４２が得られている。レジスタ４３ａからは、クロック信号ＣＫ１の１クロック分前に入力したデータＡ_３・Ｔ_４１が出力されている。
【００５６】
図１１は図１０に示された状態からクロック信号ＣＫ１の２クロック分の時間が経過した場合の状態を示すブロック図である。マルチプレクサ３０６は２ビットカウンタ３０７の制御を受けて、データＴ_４４を出力する。これによりＸＯＲゲート４２からは出力Ａ_３・Ｔ_４４が得られている。レジスタ４３ａ，４３ｂ，４３ｃ，４３ｄからはそれぞれＡ_３・Ｔ_４３，Ａ_３・Ｔ_４２，Ａ_３・Ｔ_４１，“０”が出力されている。
【００５７】
図１２は図１１に示された状態からクロック信号ＣＫ１の１クロック分の時間が経過した場合の状態を示すブロック図である。この状態では、図９に示された状態からクロック信号ＣＫ２の１クロック分の時間が経過している。よってレジスタ１０６，１０７，１０８，３０１，３０２，３０３，３０５が出力するデータがそれぞれＡ_０，Ａ_１，Ａ_２，Ｔ_３１，Ｔ_３２，Ｔ_３３，Ｔ_３４に変化する。マルチプレクサ３０６は２ビットカウンタ３０７の制御を受けて、データＴ_３１を出力する。
レジスタ４３ａ，４３ｂ，４３ｃ，４３ｄからはそれぞれＡ_３・Ｔ_４４，Ａ_３・Ｔ_４３，Ａ_３・Ｔ_４２，Ａ_３・Ｔ_４１が出力されている。ＸＯＲゲート４２にはアンドゲート４１の出力Ａ_２・Ｔ_３１とレジスタ４３ｄの出力Ａ_３・Ｔ_４１が入力するので、ＸＯＲゲート４２の出力はＡ_３・Ｔ_４１＋Ａ_２・Ｔ_３１となる。このようにして、クロック信号ＣＫ１の４クロック分が経過することにより、実施の形態１において図４で示された積和単位４０１のＸＯＲゲート４２の出力が得られることになる。実施の形態２において積和処理部４００のレジスタ４３は、シリアルの４ビットデータをクロック信号ＣＫ１の１クロック分が経過する度に、ビット毎に順次シフトさせる機能を果たすといえる。
【００５８】
図１３は図９に示された状態からクロック信号ＣＫ１の１３クロック分の時間が経過した状態を示すブロック図である。レジスタ３０１，３０２，３０３，３０５からはそれぞれＴ_１１，Ｔ_１２，Ｔ_１３，Ｔ_１４が出力されており、マルチプレクサ３０６は２ビットカウンタ３０７の制御によってレジスタ３０１から得られたデータＴ_１１を出力している。レジスタ１０８からは乗数Ａの最下位ビットのデータＡ_０が出力されており、従ってアンドゲート４１からはデータＡ_０・Ｔ_１１が出力されている。レジスタ４３ａ，４３ｂ，４３ｃ，４３ｄからはそれぞれＡ_３・Ｔ_４４＋Ａ_２・Ｔ_３４＋Ａ_１・Ｔ_２４，Ａ_３・Ｔ_４３＋Ａ_２・Ｔ_３３＋Ａ_１・Ｔ_２３，Ａ_３・Ｔ_４２＋Ａ_２・Ｔ_３２＋Ａ_１・Ｔ_２２，Ａ_３・Ｔ_４１＋Ａ_２・Ｔ_３１＋Ａ_１・Ｔ_２１が出力されている。よってＸＯＲゲート４２の出力はＡ_３・Ｔ_４１＋Ａ_２・Ｔ_３１＋Ａ_１・Ｔ_２１＋Ａ_０・Ｔ_１１となる。
【００５９】
図１３に示された状態からクロック信号ＣＫ１の３クロック分の時間が経過することにより、レジスタ４３ａ，４３ｂ，４３ｃ，４３ｄにはそれぞれＡ_３・Ｔ_４４＋Ａ_２・Ｔ_３４＋Ａ_１・Ｔ_２４＋Ａ_０・Ｔ_１４，Ａ_３・Ｔ_４３＋Ａ_２・Ｔ_３３＋Ａ_１・Ｔ_２３＋Ａ_０・Ｔ_１３，Ａ_３・Ｔ_４２＋Ａ_２・Ｔ_３２＋Ａ_１・Ｔ_２２＋Ａ_０・Ｔ_１２，Ａ_３・Ｔ_４１＋Ａ_２・Ｔ_３１＋Ａ_１・Ｔ_２１＋Ａ_０・Ｔ_１１が入力する。このように、図３に示された状態からクロック信号ＣＫ１の１６クロック分の時間（クロック信号ＣＫ２の４クロック分の時間）が経過して、実施の形態１において図６で示された積和単位４０１〜４０３のＸＯＲゲート４２の出力が得られることになる。
【００６０】
その後、更にクロック信号ＣＫ１の１クロック分ずつ時間が経過する毎に、積和演算処理部４００の出力Ｂは、順次Ａ_３・Ｔ_４１＋Ａ_２・Ｔ_３１＋Ａ_１・Ｔ_２１＋Ａ_０・Ｔ_１１，Ａ_３・Ｔ_４２＋Ａ_２・Ｔ_３２＋Ａ_１・Ｔ_２２＋Ａ_０・Ｔ_１２，Ａ_３・Ｔ_４３＋Ａ_２・Ｔ_３３＋Ａ_１・Ｔ_２３＋Ａ_０・Ｔ_１３，Ａ_３・Ｔ_４４＋Ａ_２・Ｔ_３４＋Ａ_１・Ｔ_２４＋Ａ_０・Ｔ_１４を出力する。
【００６１】
以上のようにして、本実施の形態によって式（３）の結果を得るまでにはクロック信号ＣＫ１の２０クロック分が必要である。実施の形態１によって同じ結果を得るまでにはクロック信号の５クロック分が必要であった。従って、本実施の形態では実施の形態１と比較して動作が４倍遅くなる。これに伴って符号化器１の動作も４倍遅くなる。しかしながら図３及び図９を比較すると明白なように、本実施の形態の方が、実施の形態１と比較して、アンドゲート３個分、ＸＯＲゲート３個分を省略し、回路規模が小さくなる。
【００６２】
本実施の形態においても、図８に示された乗算提供部３００を採用することができる。この場合、実施の形態１において説明したように、被乗数Ａの各ビットを積和処理部４００に与える順序を逆にする必要がある。
【００６３】
【発明の効果】
この発明のうち請求項１にかかる演算器によれば、第１ゲートの第１端及び第２端に、それぞれ乗数の１ビット及び被乗数の１ビットを与えることにより、両者のガロア拡大体ＧＦ（２^ｍ）での乗算結果が第１ゲートから得られる。当該結果は第２ゲートの第１端に与えられる。第２ゲートの第２端には、被乗数の他の１ビットと乗数の他の１ビットとのガロア拡大体ＧＦ（２^ｍ）での乗算結果が与えられる。よって第２ゲートの第２端に与えるデータとして、第２ゲートの出力を所定の時間遅延したものを採用することにより、ベクトル表現した乗数と被乗数との積を得ることができる。
【００６４】
この発明のうち請求項２にかかる演算器によれば、積和単位の数を低減して回路規模を小さくして、ガロア拡大体ＧＦ（２^ｍ）での乗数と被乗数との積を得ることができる。
【００６５】
この発明のうち請求項３にかかる演算器によれば、積和単位の数を一層低減して回路規模を更に小さくして、ガロア拡大体ＧＦ（２^ｍ）での乗数と被乗数との積を得ることができる。
【図面の簡単な説明】
【図１】本発明の実施の形態にかかる乗算器が適用可能な符号器の構造を示すブロック図である。
【図２】ＸＯＲゲートの構成例を示す回路図である。
【図３】本発明の実施の形態１にかかる乗算器の構成を示す回路図である。
【図４】本発明の実施の形態１にかかる乗算器の動作を示す回路図である。
【図５】本発明の実施の形態１にかかる乗算器の動作を示す回路図である。
【図６】本発明の実施の形態１にかかる乗算器の動作を示す回路図である。
【図７】本発明の実施の形態１にかかる乗算器の動作を示す回路図である。
【図８】乗数提供部の構造を例示するブロック図である。
【図９】本発明の実施の形態２にかかる乗算器の動作を示す回路図である。
【図１０】本発明の実施の形態２にかかる乗算器の動作を示す回路図である。
【図１１】本発明の実施の形態２にかかる乗算器の動作を示す回路図である。
【図１２】本発明の実施の形態２にかかる乗算器の動作を示す回路図である。
【図１３】本発明の実施の形態２にかかる乗算器の動作を示す回路図である。
【図１４】従来の技術を示す回路図である。
【符号の説明】
４１　アンドゲート、４２　ＸＯＲゲート、４３，４３ａ〜４３ｄ，１０５〜１０８，３０１〜３０４　レジスタ、１００　被乗数提供部、２００〜２０３　乗算器、３００　乗数提供部、４００　積和処理部、４０１〜４０４　積和単位。
[0001]
[Technical Field of the Invention]
The present invention relates to an arithmetic unit, and is suitable for, for example, multiplication in a Galois extension field.
[0002]
[Prior art]
In recent years, digitization and the adoption of information technology have advanced in every field, and attention has focused on digital data recording and communication. Reed-Solomon codes, BCH codes, and the like are widely used for error correction in digital data recording systems and communication systems. These codes are constructed using the properties of a Galois extension field, so handling them requires arithmetic in the Galois extension field. In particular, letting α be a root of the m-th degree primitive irreducible polynomial adopted when the Galois extension field GF(2^m) is obtained from the Galois field GF(2), multiplication in the Galois extension field requires multiplying a = α^i by b = α^j.
[0003]
FIG. 14 is a block diagram showing the configuration of a conventional multiplier that performs multiplication in a Galois extension field. The multiplicand a and the multiplier b are input to logarithmic conversion blocks 101 and 102, respectively, which obtain the exponents i and j. The exponents i and j are given to an addition block 103, which obtains the exponent k as the remainder of the sum of i and j modulo (q-1), where q denotes the order 2^m of the field. The exponent k is then given to an inverse logarithmic conversion block 104, and the product c is obtained as α^k.
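The table-based scheme of FIG. 14 can be sketched in software as follows. This is a minimal illustration, not the circuit itself: it assumes m = 4 and the primitive polynomial x^4 + x + 1 (inferred from the vector representations listed later in the description), and it stores each field element as an integer whose bit i is the coefficient of α^i. The names `LOG`, `ANTILOG`, and `gf_mul_tables` are hypothetical, and the zero-operand branch is an addition of this sketch, since the text does not say how the conventional circuit treats zero.

```python
# Log/antilog-table multiplication in GF(2^4), following the flow of FIG. 14.
# Assumption: primitive polynomial x^4 + x + 1; bit i of an int = coefficient of alpha^i.
M = 4
Q = 1 << M             # order of the field, q = 2^m
PRIM_POLY = 0b10011    # x^4 + x + 1 (assumed)

ANTILOG = [0] * (Q - 1)   # ANTILOG[k] = alpha^k   (role of block 104)
LOG = {}                  # LOG[alpha^k] = k       (role of blocks 101 and 102)
value = 1
for k in range(Q - 1):
    ANTILOG[k] = value
    LOG[value] = k
    value <<= 1           # multiply by alpha
    if value & Q:         # reduce modulo the primitive polynomial
        value ^= PRIM_POLY


def gf_mul_tables(a, b):
    """Multiply a and b: take logarithms, add modulo q-1, take the antilogarithm."""
    if a == 0 or b == 0:  # zero has no logarithm; handled separately in this sketch
        return 0
    k = (LOG[a] + LOG[b]) % (Q - 1)   # role of addition block 103
    return ANTILOG[k]
```

Both tables cover every nonzero element of the field, which is the generality that a fixed-multiplier encoder does not need.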
[0004]
[Problems to be solved by the invention]
In the multiplier described above, both the logarithmic conversion and the inverse logarithmic conversion generally use conversion tables. In a multiplier employed in a Reed-Solomon encoder, however, the generator polynomial that defines the code is uniquely determined, so the multiplier value is often constant. Consequently, when a multiplier using the above conversion tables is employed in a Reed-Solomon encoder, it provides an unnecessary capability of handling a plurality of different multiplier values, and the circuit scale is therefore redundant.
[0005]
The present invention has been made to solve the above problems, and has as its object to provide a technique for realizing multiplication in a Galois extension field with a small circuit scale.
[0006]
[Means for Solving the Problems]
The invention according to claim 1 is an arithmetic unit including at least one product-sum unit, the product-sum unit comprising: a delay unit having an input terminal and an output terminal that outputs a signal given to the input terminal after delaying it by at least one unit time; a first gate having a first terminal, a second terminal, and an output terminal that outputs the logical AND, in binary logic, of the signals input to its first and second terminals; and a second gate having a first terminal that receives the output of the first gate, a second terminal connected to the output terminal of the delay unit, and an output terminal that outputs, to the input terminal of the delay unit, the exclusive OR, in binary logic, of the signals input to its first and second terminals.
[0007]
The invention according to claim 2 is the arithmetic unit according to claim 1, further comprising a multiplier providing unit that sequentially outputs a plurality of vectors constituting a matrix set based on a multiplier, and a multiplicand providing unit that outputs a multiplicand one bit at a time. A plurality of the product-sum units are provided, one for each element of the vector, and each product-sum unit receives its corresponding element at the first terminal of its first gate. Every product-sum unit receives the one bit of the multiplicand at the second terminal of its first gate, and the delay unit delays the signal given to its input terminal by one unit time.
[0008]
The invention according to claim 3 is the arithmetic unit according to claim 1, further comprising a multiplier providing unit that sequentially outputs, element by element, a plurality of vectors constituting a matrix set based on a multiplier, and a multiplicand providing unit that outputs a multiplicand one bit at a time. The element is input to the first terminal of the first gate of the product-sum unit, the one bit of the multiplicand is input to the second terminal of the first gate of the product-sum unit, and the delay unit delays the signal given to its input terminal by the product of the number of elements constituting the vector and the unit time.
[0009]
[Embodiments of the Invention]
Embodiment 1.
FIG. 1 is a block diagram showing the structure of an encoder 1 to which a multiplier according to the present invention can be applied, which is known as an (8, 4) Reed-Solomon encoder. Here, a case where equations (1) and (2) are adopted as the irreducible primitive polynomial and the generator polynomial, respectively, will be exemplified.
[0010]
(Equation 1)

[0011]
(Equation 2)

[0012]
The encoder 1 obtains the remainder of the input data with respect to the generator polynomial, and therefore includes multipliers 200, 201, 202, and 203 based on the coefficients of the generator polynomial. The coefficients of the generator polynomial are, in order from the lowest to the highest degree of the indeterminate χ, α^6, 1, α^4, α^12, and 1, so the multipliers 200, 201, 202, and 203 perform Galois extension field multiplication using the multipliers α^6, 1, α^4, and α^12, respectively. No multiplier is needed for the highest-degree term.
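The equation images for (1) and (2) are not reproduced in this text. As a hedged reconstruction: the generator polynomial follows directly from the coefficients listed in paragraph [0012], while the irreducible primitive polynomial is inferred from the vector representations given in paragraph [0020], where α^4 is listed as (1100), that is, α^4 = 1 + α. The form of (1) is therefore an assumption rather than a quotation.

```latex
\begin{align}
g(\chi) &= \chi^{4} + \chi + 1 \tag{1} \\
G(\chi) &= \chi^{4} + \alpha^{12}\chi^{3} + \alpha^{4}\chi^{2} + \chi + \alpha^{6} \tag{2}
\end{align}
```

Under this assumed g(χ), G(χ) expands from (χ + 1)(χ + α)(χ + α^2)(χ + α^3), which is consistent with a Reed-Solomon code having four check symbols.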
[0013]
Here, the input data D to be encoded is updated four bits at a time, and the encoder 1 includes 4-bit registers 204, 206, 208, and 210 and 4-bit exclusive OR circuits (hereinafter referred to as "XOR gates") 205, 207, 209, and 211. The input data D is input to the XOR gate 211 four bits per clock.
[0014]

The registers 204, 206, 208, and 210 hold the 4-bit value "0000" as the initial value for encoding. Each of the XOR gates 205, 207, 209, and 211 takes the exclusive OR of its two inputs bit by bit, pairing corresponding bits, and outputs the result for each bit. FIG. 2 is a circuit diagram showing a configuration example of each of the XOR gates 205, 207, 209, and 211. It shows a configuration that receives two 4-bit data L1 and L2 and outputs their exclusive OR as 4-bit data L3, using four two-input XOR gates 21, 22, 23, and 24.
[0015]
The XOR gate 211 outputs the exclusive OR of the output of the register 210 and the input data D. The exclusive OR is the remainder modulo 2 of the ordinary sum and coincides with addition in the Galois extension field GF(2^m). The output of the XOR gate 211 is given to the multipliers 200, 201, 202, and 203, which perform the multiplication described above. The multiplier used in each of them depends on the generator polynomial. FIG. 1 shows the multipliers derived from the generator polynomial G(χ) of equation (2); a different generator polynomial leads to a different structure. The multiplication results output from the multipliers 200, 201, 202, and 203 are given to the register 204 and the XOR gates 205, 207, and 209, respectively.
[0016]
The XOR gate 205 takes the exclusive OR of the output of the register 204 and the output of the multiplier 201 and outputs the result to the register 206. The XOR gate 207 takes the exclusive OR of the output of the register 206 and the output of the multiplier 202 and outputs the result to the register 208. The XOR gate 209 takes the exclusive OR of the output of the register 208 and the output of the multiplier 203 and outputs the result to the register 210. Since the multiplier of the multiplier 201 is 1, however, the multiplier 201 can be omitted and the output of the XOR gate 211 supplied directly to the XOR gate 205.
[0017]
Each time the operation advances by one clock, the input data D supplies new 4-bit data to the XOR gate 211, and the registers 204, 206, 208, and 210 output the 4-bit data given to them to the XOR gates 205, 207, 209, and 211, respectively. As a result, the four check blocks of the codeword are obtained as the outputs of the registers 210, 208, 206, and 204, in order from the most significant. That is, the register 204 outputs the least significant four bits of the check blocks, and the register 210 outputs the most significant four bits.
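The register and XOR-gate wiring of the encoder 1 can be modelled in a few lines. The sketch below is illustrative only: it assumes the primitive polynomial x^4 + x + 1, represents each 4-bit symbol as an integer whose bit i is the coefficient of α^i, and feeds the data symbols highest-order first. The helper names (`gf_mul`, `alpha_pow`, `rs_encode_parity`, `poly_eval`) are hypothetical.

```python
PRIM_POLY = 0b10011  # assumed g(x) = x^4 + x + 1; bit i = coefficient of alpha^i


def gf_mul(a, b):
    """Multiply two GF(2^4) elements by shift-and-add with modular reduction."""
    result = 0
    for _ in range(4):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:
            a ^= PRIM_POLY
    return result


def alpha_pow(k):
    """Return alpha^k, where alpha is the element 0b0010."""
    value = 1
    for _ in range(k % 15):
        value = gf_mul(value, 0b0010)
    return value


def rs_encode_parity(data):
    """Clock the data symbols through the FIG. 1 structure and return the
    check symbols as [reg 210, reg 208, reg 206, reg 204]."""
    a6, a4, a12 = alpha_pow(6), alpha_pow(4), alpha_pow(12)
    r204 = r206 = r208 = r210 = 0            # all registers start at "0000"
    for d in data:                            # one 4-bit symbol per clock
        f = d ^ r210                          # XOR gate 211
        r204, r206, r208, r210 = (
            gf_mul(f, a6),                    # multiplier 200 -> register 204
            r204 ^ f,                         # multiplier 201 (x1) and XOR gate 205
            r206 ^ gf_mul(f, a4),             # multiplier 202 and XOR gate 207
            r208 ^ gf_mul(f, a12),            # multiplier 203 and XOR gate 209
        )
    return [r210, r208, r206, r204]


def poly_eval(coeffs, x):
    """Evaluate a polynomial (highest-order coefficient first) by Horner's rule."""
    acc = 0
    for c in coeffs:
        acc = gf_mul(acc, x) ^ c
    return acc
```

One way to sanity-check the sketch is to append the check symbols to the data and confirm that the resulting codeword polynomial evaluates to zero at 1, α, α^2, and α^3, the roots of the generator polynomial under the stated assumptions.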
[0018]
FIG. 3 is a circuit diagram showing the configuration of a multiplier according to Embodiment 1, which can be employed as the multipliers 200, 202, and 203.
[0019]
The multiplier includes a multiplicand providing unit 100, a multiplier providing unit 300, and a product-sum processing unit 400. The multiplicand providing unit 100 inputs the 4-bit data A_3 A_2 A_1 A_0 of the multiplicand A to the product-sum processing unit 400 one bit at a time, starting from the most significant bit, that is, in the order A_3, A_2, A_1, A_0. The multiplicand providing unit 100 has four registers 105, 106, 107, and 108 connected in series, arranged so that they come closer to the product-sum processing unit 400 in this order. Each of the registers 105, 106, 107, and 108 passes data on with a delay of one clock.
[0020]
The multiplier providing unit 300 has a structure that depends on the irreducible primitive polynomial and is loaded with an initial value that depends on the multiplier. FIG. 3 illustrates the structure for the irreducible primitive polynomial g(χ) of equation (1); it includes registers 301, 302, 303, and 305, each of which outputs its input signal with a delay of one clock, and an XOR gate 304. Letting α be a root of g(χ) = 0, the nonzero elements of the Galois extension field GF(2^4) can be expressed as powers of α. In order of increasing exponent, their vector representations are (0100), (0010), (0001), (1100), (0110), (0011), (1101), (1010), (0101), (1110), (0111), (1111), (1011), (1001), and (1000). For larger exponents, these vector representations repeat cyclically.
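The list of vector representations can be regenerated with a short script. This sketch assumes g(χ) = χ^4 + χ + 1, which is the polynomial consistent with α^4 = (1100), and prints each element with the lowest-order coefficient first, matching the notation above. The function name `alpha_powers` is hypothetical.

```python
PRIM_POLY = 0b10011  # assumed g(x) = x^4 + x + 1; bit i = coefficient of alpha^i


def alpha_powers():
    """Return alpha^1 .. alpha^15 as strings 'c0c1c2c3' (lowest-order coefficient first)."""
    out = []
    value = 1                      # alpha^0
    for _ in range(15):
        value <<= 1                # multiply by alpha
        if value & 0b10000:        # reduce modulo g(x)
            value ^= PRIM_POLY
        out.append("".join(str((value >> i) & 1) for i in range(4)))
    return out


if __name__ == "__main__":
    for exponent, vector in enumerate(alpha_powers(), start=1):
        print(f"alpha^{exponent:<2} = ({vector})")
```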
[0021]
To generate these vector representations, the output of the register 301 is input to the register 302, the output of the register 302 is input to the register 303, the output of the register 303 is one input of the XOR gate 304, the output of the XOR gate 304 is the input of the register 305, and the output of the register 305 is both the input of the register 301 and the other input of the XOR gate 304. Every clock, the multiplier providing unit 300 outputs an element of the Galois extension field GF(2^4) with the exponent of α lowered by one.
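The feedback wiring just described amounts to a one-line state update. In the sketch below the state tuple is ordered (register 301, register 302, register 303, register 305). Reading these as the coefficients of α^3, α^2, α^1, and α^0 is an interpretation inferred from the register values quoted in the walk-through of FIGS. 3 to 6, not something the text states outright. The name `step_down` is hypothetical.

```python
def step_down(state):
    """Advance the FIG. 3 multiplier providing unit by one clock.

    state = (r301, r302, r303, r305). Register 305 feeds register 301 and the
    XOR gate 304; the XOR of registers 303 and 305 feeds register 305.
    """
    r301, r302, r303, r305 = state
    return (r305, r301, r302, r303 ^ r305)


if __name__ == "__main__":
    state = (0, 0, 0, 1)           # alpha^15 = 1 under the assumed bit ordering
    for _ in range(4):
        print(state)
        state = step_down(state)
```

Starting from (0, 0, 0, 1), successive clocks give (1, 0, 0, 1), (1, 1, 0, 1), and (1, 1, 1, 1), which are the values the description attributes to the registers in FIGS. 4 to 6.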
[0022]
In general, multiplication in a Galois extension field with a code length of 4 bits is expressed by equation (3), as also disclosed in the above publication.
[0023]
(Equation 3)

[0024]
Here, the coefficients T_11, T_12, T_13, and T_14 coincide with the vector representation of the multiplier, and equations (4) to (6) hold. The multiplication symbol "×" used there denotes multiplication of vectors in the Galois extension field GF(2^m).
[0025]
(Equation 4)

[0026]
(Equation 5)

[0027]
(Equation 6)

[0028]
In other words, the matrix in equation (3) is composed of a plurality of vectors based on the multiplier: (T_11, T_12, T_13, T_14), (T_21, T_22, T_23, T_24), (T_31, T_32, T_33, T_34), and (T_41, T_42, T_43, T_44).
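The images for equations (3) to (6) are missing from this text. The following is a reconstruction inferred from the clock-by-clock description of FIGS. 3 to 7 and from the statement that each clock of the multiplier providing unit changes the exponent of α by one. It is an assumption, not a quotation, and the output symbols B_1 to B_4 are names introduced here for illustration only.

```latex
\begin{align}
\begin{pmatrix} B_1 \\ B_2 \\ B_3 \\ B_4 \end{pmatrix}
&=
\begin{pmatrix}
T_{11} & T_{21} & T_{31} & T_{41} \\
T_{12} & T_{22} & T_{32} & T_{42} \\
T_{13} & T_{23} & T_{33} & T_{43} \\
T_{14} & T_{24} & T_{34} & T_{44}
\end{pmatrix}
\begin{pmatrix} A_0 \\ A_1 \\ A_2 \\ A_3 \end{pmatrix} \tag{3} \\
(T_{21}, T_{22}, T_{23}, T_{24}) &= (T_{11}, T_{12}, T_{13}, T_{14}) \times \alpha \tag{4} \\
(T_{31}, T_{32}, T_{33}, T_{34}) &= (T_{21}, T_{22}, T_{23}, T_{24}) \times \alpha \tag{5} \\
(T_{41}, T_{42}, T_{43}, T_{44}) &= (T_{31}, T_{32}, T_{33}, T_{34}) \times \alpha \tag{6}
\end{align}
```

In this form every addition is an exclusive OR, and B_1 = A_0·T_11 + A_1·T_21 + A_2·T_31 + A_3·T_41 is the value that the description later assigns to the product-sum unit 401.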
[0029]
As described above, the multiplier providing unit 300 outputs, every clock, an element of the Galois extension field GF(2^4) with the exponent of α lowered by one. Therefore, the result of the Galois extension field multiplication of the multiplier N and α^3 is given to the registers 301, 302, 303, and 305 as the initial value. The column vectors of the matrix in equation (3) are then output one per clock, starting from the rightmost, and the product-sum operation with the multiplicand A is carried out.
[0030]
The product-sum processing unit 400 has product-sum units 401, 402, 403, and 404, which receive the outputs of the registers 301, 302, 303, and 305, respectively, and all of which receive the output of the register 108. That is, one of the product-sum units 401, 402, 403, and 404 is provided for each element of the column vector.
[0031]
The product-sum units 401, 402, 403, and 404 all have the same configuration: each includes an AND gate 41, an XOR gate 42, and a register 43. The register 43 has an input terminal and an output terminal that outputs the signal given to the input terminal with a delay of one clock. The AND gate 41 has a first terminal, a second terminal, and an output terminal that outputs the logical AND, in binary logic, of the signals input to its first and second terminals. The XOR gate 42 has a first terminal that receives the output of the AND gate 41, a second terminal connected to the output terminal of the register 43, and an output terminal that outputs the exclusive OR, in binary logic, of the signals input to its first and second terminals to the input terminal of the register 43. The output of the register 108 is given to the first terminal of the AND gate 41. The second terminal of the AND gate 41 receives the output of the register, among those of the multiplier providing unit 300, that corresponds to the product-sum unit to which the AND gate 41 belongs. In the product-sum unit 401, for example, the output of the register 301 is given to the second terminal of its AND gate 41.
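A single product-sum unit is small enough to model directly. The sketch below is a behavioural illustration of the AND gate 41, XOR gate 42, and register 43; the class and method names are hypothetical, and one call to `clock` stands for one clock edge.

```python
class ProductSumUnit:
    """Behavioural model of one product-sum unit (AND gate 41, XOR gate 42, register 43)."""

    def __init__(self):
        self.register = 0                              # register 43 starts at "0"

    def clock(self, multiplicand_bit, multiplier_bit):
        """Apply one clock and return the unit output seen before the clock edge."""
        and_out = multiplicand_bit & multiplier_bit    # AND gate 41
        xor_out = and_out ^ self.register              # XOR gate 42
        output = self.register                         # unit output = register 43 output
        self.register = xor_out                        # register 43 latches the XOR output
        return output


if __name__ == "__main__":
    unit = ProductSumUnit()
    for a_bit, t_bit in [(1, 1), (1, 0), (1, 1), (0, 1)]:
        print(unit.clock(a_bit, t_bit), unit.register)
```

Because the output is taken from the register, each accumulated value becomes visible one clock after it is computed, which matches the one-clock lag noted in the walk-through that follows.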
[0032]
Hereinafter, the operation when the multiplier shown in FIG. 3 is adopted as the multiplier 203 shown in FIG. 1 will be described.
[0033]
FIG. 3 shows the state in which the first product-sum operation is performed. In every product-sum unit, the register 43 holds the data "0". In the multiplicand providing unit 100, the registers 105, 106, 107, and 108 output A_0, A_1, A_2, and A_3, respectively. Since the multiplier of the multiplier 203 is α^12, α^15 = 1 is adopted as the initial value output to the product-sum processing unit 400. Accordingly, the registers 301, 302, 303, and 305 output the values "0", "0", "0", and "1", corresponding to T_41, T_42, T_43, and T_44, respectively.
[0034]
In the product-sum unit 401, the 1-bit data T_41 and A_3 are input to the AND gate 41. Because the register 43 outputs its initial value "0", the XOR gate 42 passes the output of the AND gate 41 through unchanged. The input of the register 43 therefore receives A_3·T_41, the product in the Galois extension field GF(2^m) of the bit A_3 of the multiplicand A and the bit T_41 of α^15. The multiplication symbol "·" denotes multiplication in the Galois extension field GF(2^m). Since the output of the product-sum unit 401 is taken from the output of the register 43, it is "0" at this point.
[0035]
Similarly, A_3·T_42 is given to the input of the register 43 of the product-sum unit 402, A_3·T_43 to that of the product-sum unit 403, and A_3·T_44 to that of the product-sum unit 404. The outputs of the product-sum units 402, 403, and 404 are all "0".
[0036]
FIG. 4 is a block diagram showing the state one clock after the state shown in FIG. 3. In the multiplicand providing unit 100, the registers 106, 107, and 108 output the 1-bit data A_0, A_1, and A_2, respectively. In the multiplier providing unit 300, the registers 301, 302, 303, and 305 output the values "1", "0", "0", and "1", corresponding to T_31, T_32, T_33, and T_34, respectively.
[0037]
In the product-sum unit 401, the register 43 outputs A_3·T_41, which was input to it one clock earlier. The AND gate 41 receives the 1-bit data T_31 and A_2, so the XOR gate 42 outputs A_3·T_41 + A_2·T_31, where the addition symbol "+" denotes exclusive OR. This value is given to the input of the register 43. The output of the product-sum unit 401 at this point, however, is A_3·T_41.
[0038]
Similarly, the input of the register 43 of the product-sum unit 402 is given A₃・T₄₂ + A₂・T₃₂, the input of the register 43 of the product-sum unit 403 is given A₃・T₄₃ + A₂・T₃₃, and the input of the register 43 of the product-sum unit 404 is given A₃・T₄₄ + A₂・T₃₄. The outputs of the product-sum units 402, 403, and 404 are A₃・T₄₂, A₃・T₄₃, and A₃・T₄₄, respectively.
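The one-clock lag between the register input and the unit output described above for the product-sum unit 401 can be traced with a small model. The following Python sketch is illustrative only: the class name `ProductSumUnit` and the example values chosen for A₃ and A₂ are assumptions, while T₄₁ = 0 and T₃₁ = 1 follow the vector elements quoted in the description.

```python
class ProductSumUnit:
    """Model of one AND gate 41, one XOR gate 42 and a one-clock register 43."""

    def __init__(self):
        self.reg = 0  # register 43 outputs "0" as its initial value

    def clock(self, t_bit, a_bit):
        """Advance one clock and return the unit output seen during that clock."""
        out = self.reg                 # the unit output is the register output
        and_out = t_bit & a_bit        # AND gate 41
        self.reg = and_out ^ self.reg  # XOR gate 42 drives the register input
        return out


unit = ProductSumUnit()
A3, A2 = 1, 1      # arbitrary example bits of the multiplicand (assumption)
T41, T31 = 0, 1    # first elements of the two vectors supplied first

first = unit.clock(T41, A3)   # first clock: the output is still 0
second = unit.clock(T31, A2)  # next clock: the output is A3・T41
print(first, second, unit.reg)  # unit.reg now holds A3・T41 + A2・T31
```

The value returned by `clock` always trails the register input by one clock, which is why the sum of all four terms only appears at the unit output one clock after the last bit has been applied.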
[0039]
FIG. 5 is a block diagram showing the state when one clock has elapsed from the state shown in FIG. In the multiplicand providing unit 100, the registers 107 and 108 output the 1-bit data A₀ and A₁, respectively. Meanwhile, in the multiplier providing unit 300, the registers 301, 302, 303, and 305 output "1", "1", "0", and "1" as T₂₁, T₂₂, T₂₃, and T₂₄, respectively.
[0040]
In the product-sum unit 401, the register 43 holds A₃・T₄₁ + A₂・T₃₁, which was input to it in the operation one clock earlier. Meanwhile, the 1-bit data T₂₁ and A₁ are input to the AND gate 41, so the XOR gate 42 outputs A₃・T₄₁ + A₂・T₃₁ + A₁・T₂₁. The input of the register 43 is thus given A₃・T₄₁ + A₂・T₃₁ + A₁・T₂₁. The output of the product-sum unit 401, however, is A₃・T₄₁ + A₂・T₃₁.
[0041]
Similarly, the input of the register 43 of the product-sum unit 402 is given A₃・T₄₂ + A₂・T₃₂ + A₁・T₂₂, the input of the register 43 of the product-sum unit 403 is given A₃・T₄₃ + A₂・T₃₃ + A₁・T₂₃, and the input of the register 43 of the product-sum unit 404 is given A₃・T₄₄ + A₂・T₃₄ + A₁・T₂₄. The outputs of the product-sum units 402, 403, and 404 are A₃・T₄₂ + A₂・T₃₂, A₃・T₄₃ + A₂・T₃₃, and A₃・T₄₄ + A₂・T₃₄, respectively.
[0042]
FIG. 6 is a block diagram showing the state when one clock has elapsed from the state shown in FIG. In the multiplicand providing unit 100, the 1-bit data A₀ is output. Meanwhile, in the multiplier providing unit 300, the registers 301, 302, 303, and 305 output "1", "1", "1", and "1" as T₁₁, T₁₂, T₁₃, and T₁₄, respectively.
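The bit patterns quoted for the outputs of the registers 301, 302, 303, and 305 can be cross-checked against powers of α. The sketch below assumes that GF(2^4) is generated by the primitive polynomial x^4 + x + 1; this polynomial is not stated in this passage and is chosen only because it reproduces the quoted patterns. The helper names `alpha_pow` and `bits` are hypothetical.

```python
POLY = 0b10011  # x^4 + x + 1 (assumption, see the lead-in)


def alpha_pow(k, m=4, poly=POLY):
    """Return α^k as an m-bit integer; bit i is the coefficient of α^i."""
    value = 1
    for _ in range(k):
        value <<= 1              # multiply by α
        if value & (1 << m):     # reduce when a term α^m appears
            value ^= poly
    return value


def bits(value, m=4):
    """List the coefficients from α^(m-1) down to α^0, matching T_k1 .. T_k4."""
    return [(value >> i) & 1 for i in range(m - 1, -1, -1)]


for exponent in (15, 14, 13, 12):
    print(exponent, bits(alpha_pow(exponent)))
```

Under this assumption the vectors for α^14, α^13, and α^12 come out as "1", "0", "0", "1"; "1", "1", "0", "1"; and "1", "1", "1", "1", in agreement with the values given for T₃₁ to T₃₄, T₂₁ to T₂₄, and T₁₁ to T₁₄.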
[0043]
The operation proceeds as in the cases shown in FIGS. 3 to 5. The input of the register 43 of the product-sum unit 401 is given A₃・T₄₁ + A₂・T₃₁ + A₁・T₂₁ + A₀・T₁₁, the input of the register 43 of the product-sum unit 402 is given A₃・T₄₂ + A₂・T₃₂ + A₁・T₂₂ + A₀・T₁₂, the input of the register 43 of the product-sum unit 403 is given A₃・T₄₃ + A₂・T₃₃ + A₁・T₂₃ + A₀・T₁₃, and the input of the register 43 of the product-sum unit 404 is given A₃・T₄₄ + A₂・T₃₄ + A₁・T₂₄ + A₀・T₁₄. The outputs of the product-sum units 401, 402, 403, and 404 are A₃・T₄₁ + A₂・T₃₁ + A₁・T₂₁, A₃・T₄₂ + A₂・T₃₂ + A₁・T₂₂, A₃・T₄₃ + A₂・T₃₃ + A₁・T₂₃, and A₃・T₄₄ + A₂・T₃₄ + A₁・T₂₄, respectively.
[0044]
FIG. 7 is a block diagram showing the state when one clock has elapsed from the state shown in FIG. The product-sum units 401, 402, 403, and 404 output A₃・T₄₁ + A₂・T₃₁ + A₁・T₂₁ + A₀・T₁₁, A₃・T₄₂ + A₂・T₃₂ + A₁・T₂₂ + A₀・T₁₂, A₃・T₄₃ + A₂・T₃₃ + A₁・T₂₃ + A₀・T₁₃, and A₃・T₄₄ + A₂・T₃₄ + A₁・T₂₄ + A₀・T₁₄, respectively.
[0045]
In this way, Equation (3) is calculated, and the multiplication of the multiplicand A by the multiplier α^12 in the Galois extension field GF(2^4) is performed. One input terminal of the XOR gate 42 is given the product in the Galois extension field GF(2^m) of one bit of the multiplier and one bit of the multiplicand, obtained from the AND gate 41. The other input terminal of the XOR gate 42 is also given a product in the Galois extension field GF(2^m), supplied from the register 43. Therefore, by appropriately setting the delay time of the register 43 (here, one clock), the product of the vector-represented multiplier and the multiplicand can be obtained.
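The accumulation walked through above can be condensed into a few lines. The following Python sketch models only the register contents of the four product-sum units, not the gate-level timing; the function name `multiply_parallel` is an assumption, and the table `T_ROWS` simply restates the vector elements quoted in the description.

```python
# Vectors in the order they are supplied by the multiplier providing unit:
# T41..T44, then T31..T34, T21..T24 and finally T11..T14.
T_ROWS = [
    [0, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 1],
    [1, 1, 1, 1],
]


def multiply_parallel(a_bits_msb_first, t_rows=T_ROWS):
    """Accumulate A3・T4j + A2・T3j + A1・T2j + A0・T1j for j = 1..4."""
    regs = [0] * len(t_rows[0])  # one register 43 per product-sum unit
    for a_bit, row in zip(a_bits_msb_first, t_rows):
        # each unit ANDs its vector element with the shared multiplicand bit
        # and XORs the result into its own register
        regs = [reg ^ (a_bit & t) for reg, t in zip(regs, row)]
    return regs


# A = α, i.e. (A3, A2, A1, A0) = (0, 0, 1, 0): only the T2j row is selected
print(multiply_parallel([0, 0, 1, 0]))  # [1, 1, 0, 1]
```

With the multiplicand equal to α, only the A₁ term survives, so the result is the vector (T₂₁, T₂₂, T₂₃, T₂₄), which is what multiplying α by α^12 should give.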
[0046]
The multiplier shown in FIG. 3 requires four clocks to calculate a multiplication. Therefore, when the multiplier described in this embodiment is employed in the encoder 1, the interval at which the input data D is supplied must be lengthened by a factor of four. The multiplier is nevertheless particularly suitable when the multiplier value is fixed, and employing it as a multiplier whose multiplier value is fixed according to the generator polynomial reduces the redundancy of the encoder configuration.
[0047]
If the code length is n bits, then to realize the multiplier α^s, the registers of the multiplier providing unit 300 may store α^{s+(n-1)} as the initial value. As described above, the multiplier providing unit 300 can be configured using registers and an XOR gate based on the irreducible primitive polynomial used to obtain the Galois extension field.
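The choice of initial value can be checked with simple exponent arithmetic. The sketch below is only an illustration: the function names are hypothetical, and reducing the exponent modulo 2^m - 1 is an added step that relies on α being primitive, so that α^(2^m - 1) = 1.

```python
def initial_exponent(s, n, m):
    """Exponent of α loaded first when multiplying by α^s with n-bit data."""
    order = (1 << m) - 1            # α^(2^m - 1) = 1 for a primitive α
    return (s + (n - 1)) % order


def exponent_sequence(s, n, m):
    """Exponents supplied on successive clocks, falling by one each clock."""
    order = (1 << m) - 1
    return [(s + (n - 1) - i) % order for i in range(n)]


print(initial_exponent(12, 4, 4))   # 0, since α^15 = α^0 = 1 in GF(2^4)
print(exponent_sequence(12, 4, 4))  # [0, 14, 13, 12]
```

For s = 12 and n = 4 the first vector is α^15 = 1 and the last is α^12, so that the least significant bit of the multiplicand is weighted by the multiplier itself.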
[0048]
However, besides a unit that outputs elements of the Galois extension field GF(2^4) while lowering the exponent of α by one each time, a unit that outputs the elements while raising the exponent by one each time can also be adopted as the multiplier providing unit 300. FIG. 8 is a block diagram illustrating the structure of such a multiplier providing unit 300, which is disclosed, for example, in Japanese Patent Publication No. 6-80491. The output T₁ of the register 301 in the multiplier providing unit 300 is one input of the XOR gate 304 and also the input of the register 305. The output T₄ of the register 305 is the other input of the XOR gate 304. The output of the XOR gate 304 is the input of the register 303. The output T₃ of the register 303 is the input of the register 302. The output T₂ of the register 302 is the input of the register 301. The values T₁₁, T₁₂, T₁₃, and T₁₄ are stored as the initial values of the registers 301, 302, 303, and 305, respectively. Each time one clock period elapses, the output T₁ takes the values T₁₁, T₂₁, T₃₁, T₄₁ in sequence, the output T₂ takes the values T₁₂, T₂₂, T₃₂, T₄₂ in sequence, the output T₃ takes the values T₁₃, T₂₃, T₃₃, T₄₃ in sequence, and the output T₄ takes the values T₁₄, T₂₄, T₃₄, T₄₄ in sequence.
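The register connections listed for FIG. 8 amount to a four-stage feedback shift register. The following Python sketch transcribes those connections directly; the function name `step_fig8` and the tuple layout (T₁, T₂, T₃, T₄) are assumptions made for the illustration, and the starting state uses the values "1", "1", "1", "1" quoted for T₁₁ to T₁₄.

```python
def step_fig8(state):
    """One clock of the FIG. 8 unit; state is (T1, T2, T3, T4)."""
    t1, t2, t3, t4 = state
    return (
        t2,       # register 301 takes the output T2 of register 302
        t3,       # register 302 takes the output T3 of register 303
        t1 ^ t4,  # register 303 takes the output of XOR gate 304
        t1,       # register 305 takes the output T1 of register 301
    )


state = (1, 1, 1, 1)  # T11, T12, T13, T14
history = [state]
for _ in range(3):
    state = step_fig8(state)
    history.append(state)

for row in history:
    print(row)
```

Starting from (1, 1, 1, 1), the three following states are (1, 1, 0, 1), (1, 0, 0, 1), and (0, 0, 0, 1), that is, the vectors with first index 2, 3, and 4 in turn.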
[0049]
To calculate Equation (3) when the multiplier providing unit 300 of FIG. 8 is employed, the order in which the multiplicand A is input to the multiplicand providing unit 100 is reversed from that shown in FIG. That is, A₀ is input first, followed by A₁, A₂, and A₃.
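That the reversed input order yields the same sum can be checked numerically. The sketch below pairs an exponent-raising vector source with multiplicand bits supplied from A₀ upward; the function names are hypothetical, and the update rule restates the register connections described for FIG. 8.

```python
def step_up(state):
    """Advance the vector (T1, T2, T3, T4) to the next higher power of α."""
    t1, t2, t3, t4 = state
    return (t2, t3, t1 ^ t4, t1)


def multiply_lsb_first(a_bits_lsb_first, start=(1, 1, 1, 1)):
    """Accumulate A0・T1j + A1・T2j + A2・T3j + A3・T4j for j = 1..4."""
    regs = [0, 0, 0, 0]
    vector = start                      # T11..T14 as the initial value
    for a_bit in a_bits_lsb_first:      # A0 first, then A1, A2, A3
        regs = [reg ^ (a_bit & t) for reg, t in zip(regs, vector)]
        vector = step_up(vector)
    return regs


# (A0, A1, A2, A3) = (0, 1, 0, 0), i.e. A = α
print(multiply_lsb_first([0, 1, 0, 0]))  # [1, 1, 0, 1]
```

Because exclusive OR is commutative, the four terms of Equation (3) can be accumulated in either order, provided each multiplicand bit still meets the vector it is paired with in the equation.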
[0050]
Embodiment 2.
FIG. 9 is a circuit diagram showing the configuration of the multiplier according to the second embodiment, which can be employed as the multipliers 200, 202, and 203 of the encoder 1 shown in FIG. As in the first embodiment, this multiplier also includes a multiplicand providing unit 100, a multiplier providing unit 300, and a product-sum processing unit 400.
[0051]
The multiplicand providing unit 100 according to the second embodiment has four registers 105, 106, 107, and 108 connected in series, in an arrangement similar to that of the first embodiment. The multiplicand providing unit 100 supplies the 4-bit data A₃A₂A₁A₀ of the multiplicand A to the product-sum processing unit 400 one bit at a time in order from the most significant bit, that is, in the order A₃, A₂, A₁, A₀. However, the registers 105, 106, 107, and 108 transmit data with a delay of one clock of a clock signal CK2 described later.
[0052]
The multiplier providing unit 300 according to the second embodiment has a configuration in which a multiplexer 306 and a 2-bit counter 307 are added to the structure of the multiplier providing unit 300 shown in the first embodiment (see FIG. 3). The multiplexer 306 selects and outputs one of the outputs of the registers 301, 302, 303, and 305 under the control of the 2-bit counter 307. The 2-bit counter 307 performs a counting operation based on the clock signal CK1, and the counting operation completes one cycle every four clocks of the clock signal CK1. Accordingly, a clock signal CK2 whose one cycle equals four clocks of the clock signal CK1 is output from the 2-bit counter 307. That is, one clock of the clock signal CK2 corresponds to four clocks of the clock signal CK1. The registers 301, 302, 303, and 305 each transmit data with a delay of one clock of the clock signal CK2.
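The relationship between the two clock signals and the multiplexer selection can be tabulated with a short model. The following Python sketch is illustrative: the function name `counter_trace` is hypothetical, and the mapping of counter values 0 to 3 onto the registers 301, 302, 303, and 305 is an assumption, since the description gives only the order in which the outputs are selected.

```python
REGISTERS = (301, 302, 303, 305)  # selection order used by the multiplexer 306


def counter_trace(num_ck1_clocks):
    """Return (CK1 clock index, selected register, CK2 tick) for each clock."""
    trace = []
    count = 0                              # state of the 2-bit counter 307
    for ck1 in range(num_ck1_clocks):
        wraps = (count == 3)               # counter wraps after this clock
        trace.append((ck1, REGISTERS[count], wraps))
        count = (count + 1) % 4            # counter advances on every CK1 clock
    return trace


for entry in counter_trace(8):
    print(entry)
```

In this model a CK2 tick is flagged once every four CK1 clocks, so the registers clocked by CK2 hold each vector long enough for the multiplexer to scan all four of its elements.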
[0053]
The product-sum processing unit 400 according to the second embodiment is provided with only one of the product-sum units described in the first embodiment, and its register 43 is configured as a series connection of the registers 43a, 43b, 43c, and 43d, in order from the one closest to the output of the XOR gate 42. Each of these registers transmits data with a delay of one clock of the clock signal CK1. Therefore, the register 43 as a whole transmits data with a delay equal to the product of the number of elements constituting the multiplier vector, namely 4, and one clock of the clock signal CK1.
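The four-clock delay of the series-connected registers can be modeled as a fixed-length shift register. The sketch below is a minimal illustration; the class name `DelayLine` is hypothetical, and the integers fed in are labels used to make the delay visible rather than actual 1-bit data.

```python
from collections import deque


class DelayLine:
    """Four one-clock registers in series, modeled after 43a, 43b, 43c, 43d."""

    def __init__(self, stages=4):
        # index 0 models register 43a, the last index models register 43d
        self.stages = deque([0] * stages, maxlen=stages)

    def clock(self, value):
        """Shift once and return what the last stage was outputting."""
        out = self.stages[-1]
        self.stages.appendleft(value)  # maxlen discards the oldest entry
        return out


line = DelayLine()
outputs = [line.clock(label) for label in (1, 2, 3, 4, 5)]
print(outputs)  # [0, 0, 0, 0, 1]
```

A value entering the first stage reappears at the last stage four clocks later, which is the delay needed for a partial sum to meet the next term belonging to the same vector element.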
[0054]
FIG. 9 shows the state in which the first product-sum operation is performed. Data "0" is output from each of the registers 43a, 43b, 43c, and 43d. In the multiplicand providing unit 100, the registers 105, 106, 107, and 108 output A₀, A₁, A₂, and A₃, respectively. Since the multiplier value of the multiplier 203 included in the encoder 1 is α^12, α^15 = 1 is adopted as the initial value output to the product-sum processing unit 400. Accordingly, the registers 301, 302, 303, and 305 output "0", "0", "0", and "1" as T₄₁, T₄₂, T₄₃, and T₄₄, respectively. The output A₃・T₄₁ is obtained from the XOR gate 42 in the product-sum processing unit 400.
[0055]
FIG. 10 is a block diagram showing the state when a time corresponding to one clock of the clock signal CK1 has elapsed from the state shown in FIG. In this state, the data output from the registers 105, 106, 107, 108, 301, 302, 303, and 305 has not changed. However, the multiplexer 306 selects and outputs the data T₄₂ under the control of the 2-bit counter 307. As a result, the output A₃・T₄₂ is obtained from the XOR gate 42. The register 43a outputs the data A₃・T₄₁, which was input to it one clock of the clock signal CK1 earlier.
[0056]
FIG. 11 is a block diagram showing the state when a time corresponding to two clocks of the clock signal CK1 has elapsed from the state shown in FIG. The multiplexer 306 selects and outputs the data T₄₄ under the control of the 2-bit counter 307. As a result, the output A₃・T₄₄ is obtained from the XOR gate 42. The registers 43a, 43b, 43c, and 43d output A₃・T₄₃, A₃・T₄₂, A₃・T₄₁, and "0", respectively.
[0057]
FIG. 12 is a block diagram showing the state when a time corresponding to one clock of the clock signal CK1 has elapsed from the state shown in FIG. In this state, a time corresponding to one clock of the clock signal CK2 has elapsed from the state shown in FIG. Therefore, the data output from the registers 106, 107, 108, 301, 302, 303, and 305 changes to A₀, A₁, A₂, T₃₁, T₃₂, T₃₃, and T₃₄, respectively. The multiplexer 306 selects and outputs the data T₃₁ under the control of the 2-bit counter 307.
The registers 43a, 43b, 43c, and 43d output A₃・T₄₄, A₃・T₄₃, A₃・T₄₂, and A₃・T₄₁, respectively. Since the XOR gate 42 receives the output A₂・T₃₁ of the AND gate 41 and the output A₃・T₄₁ of the register 43d, the output of the XOR gate 42 is A₃・T₄₁ + A₂・T₃₁. In this way, when four clocks of the clock signal CK1 have elapsed, the output of the XOR gate 42 of the product-sum unit 401 shown in FIG. 4 in the first embodiment is obtained. In the second embodiment, the register 43 of the product-sum processing unit 400 can be said to shift serial 4-bit data by one bit each time one clock of the clock signal CK1 elapses.
[0058]
FIG. 13 is a block diagram showing the state when a time corresponding to 13 clocks of the clock signal CK1 has elapsed from the state shown in FIG. The registers 301, 302, 303, and 305 output T₁₁, T₁₂, T₁₃, and T₁₄, respectively, and the multiplexer 306 selects and outputs the data T₁₁ obtained from the register 301 under the control of the 2-bit counter 307. The register 108 outputs the data A₀ of the least significant bit of the multiplicand A, and the AND gate 41 outputs the data A₀・T₁₁. The registers 43a, 43b, 43c, and 43d output A₃・T₄₄ + A₂・T₃₄ + A₁・T₂₄, A₃・T₄₃ + A₂・T₃₃ + A₁・T₂₃, A₃・T₄₂ + A₂・T₃₂ + A₁・T₂₂, and A₃・T₄₁ + A₂・T₃₁ + A₁・T₂₁, respectively. Therefore, the output of the XOR gate 42 is A₃・T₄₁ + A₂・T₃₁ + A₁・T₂₁ + A₀・T₁₁.
[0059]
When three clocks of the clock signal CK1 have elapsed from the state shown in FIG. 13, A₃・T₄₄ + A₂・T₃₄ + A₁・T₂₄ + A₀・T₁₄, A₃・T₄₃ + A₂・T₃₃ + A₁・T₂₃ + A₀・T₁₃, A₃・T₄₂ + A₂・T₃₂ + A₁・T₂₂ + A₀・T₁₂, and A₃・T₄₁ + A₂・T₃₁ + A₁・T₂₁ + A₀・T₁₁ are input to the registers 43a, 43b, 43c, and 43d, respectively. In this way, when a time corresponding to 16 clocks of the clock signal CK1 (a time corresponding to 4 clocks of the clock signal CK2) has elapsed from the state illustrated in FIG. 3, the outputs of the XOR gates 42 in the product-sum units 401 to 403 illustrated in FIG. are obtained.
[0060]
Thereafter, A₃・T₄₁ + A₂・T₃₁ + A₁・T₂₁ + A₀・T₁₁, A₃・T₄₂ + A₂・T₃₂ + A₁・T₂₂ + A₀・T₁₂, A₃・T₄₃ + A₂・T₃₃ + A₁・T₂₃ + A₀・T₁₃, and A₃・T₄₄ + A₂・T₃₄ + A₁・T₂₄ + A₀・T₁₄ are output in sequence as the output B of the product-sum processing unit 400.
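The serial operation described for the second embodiment can be reproduced with a short simulation. The following Python sketch is written under stated assumptions: the function name `multiply_serial` is hypothetical, the output B is taken from the last of the four series registers, and the AND gate is held at 0 during the final four read-out clocks because the description does not specify its inputs in that phase. The table `T_ROWS` restates the vector elements quoted in the description.

```python
T_ROWS = [
    [0, 0, 0, 1],  # T41..T44
    [1, 0, 0, 1],  # T31..T34
    [1, 1, 0, 1],  # T21..T24
    [1, 1, 1, 1],  # T11..T14
]


def multiply_serial(a_bits_msb_first, t_rows=T_ROWS):
    """Simulate 20 clocks of CK1 and return the four bits read out serially."""
    regs = [0, 0, 0, 0]        # outputs of registers 43a, 43b, 43c, 43d
    serial_out = []
    for ck1 in range(20):
        word, sel = divmod(ck1, 4)  # word follows CK2, sel is the mux choice
        if word < 4:
            and_out = a_bits_msb_first[word] & t_rows[word][sel]
        else:
            and_out = 0             # assumption: no operand bits in read-out
            serial_out.append(regs[3])
        xor_out = and_out ^ regs[3]  # XOR gate 42 with the delayed partial sum
        regs = [xor_out] + regs[:3]  # shift by one stage per CK1 clock
    return serial_out


# (A3, A2, A1, A0) = (0, 0, 1, 0), i.e. A = α
print(multiply_serial([0, 0, 1, 0]))  # [1, 1, 0, 1]
```

The four result bits emerge over clocks 16 to 19 in the order of the first to fourth vector elements, giving 20 clocks in total.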
[0061]
As described above, according to the present embodiment it takes 20 clocks of the clock signal CK1 to obtain the result of Equation (3). According to the first embodiment, it took five clocks to obtain the same result. The operation of the present embodiment is therefore four times slower than that of the first embodiment, and as a result the operation of the encoder 1 also becomes four times slower. However, as is apparent from a comparison between FIGS. 3 and 9, the present embodiment has a smaller circuit scale than the first embodiment because three AND gates and three XOR gates are omitted.
[0062]
The multiplier providing unit 300 shown in FIG. 8 can also be employed in the present embodiment. In this case, as described in the first embodiment, the order in which the bits of the multiplicand A are given to the product-sum processing unit 400 must be reversed.
[0063]
[Effects of the Invention]
According to the arithmetic unit of claim 1 of the present invention, by giving one bit of the multiplier and one bit of the multiplicand to the first end and the second end of the first gate, respectively, their product in the Galois extension field GF(2^m) is obtained from the first gate. This result is given to the first end of the second gate. The second end of the second gate is given the product in the Galois extension field GF(2^m) of another bit of the multiplicand and another bit of the multiplier. Therefore, by using data obtained by delaying the output of the second gate by a predetermined time as the data given to the second end of the second gate, the product of the vector-represented multiplier and the multiplicand can be obtained.
[0064]
According to the arithmetic unit of claim 2 of the present invention, the number of product-sum units is reduced to reduce the circuit scale, and the product of the multiplier and the multiplicand in the Galois extension field GF(2^m) can be obtained.
[0065]
According to the arithmetic unit of claim 3 of the present invention, the number of product-sum units is further reduced to further reduce the circuit scale, and the product of the multiplier and the multiplicand in the Galois extension field GF(2^m) can be obtained.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a structure of an encoder to which a multiplier according to an embodiment of the present invention can be applied.
FIG. 2 is a circuit diagram showing a configuration example of an XOR gate.
FIG. 3 is a circuit diagram showing a configuration of a multiplier according to the first exemplary embodiment of the present invention.
FIG. 4 is a circuit diagram showing an operation of the multiplier according to the first embodiment of the present invention.
FIG. 5 is a circuit diagram showing an operation of the multiplier according to the first exemplary embodiment of the present invention.
FIG. 6 is a circuit diagram showing an operation of the multiplier according to the first embodiment of the present invention.
FIG. 7 is a circuit diagram showing an operation of the multiplier according to the first embodiment of the present invention.
FIG. 8 is a block diagram illustrating the structure of a multiplier providing unit.
FIG. 9 is a circuit diagram showing an operation of the multiplier according to the second embodiment of the present invention.
FIG. 10 is a circuit diagram showing an operation of the multiplier according to the second embodiment of the present invention.
FIG. 11 is a circuit diagram showing an operation of the multiplier according to the second embodiment of the present invention.
FIG. 12 is a circuit diagram showing an operation of the multiplier according to the second embodiment of the present invention.
FIG. 13 is a circuit diagram showing an operation of the multiplier according to the second embodiment of the present invention.
FIG. 14 is a circuit diagram showing a conventional technique.
[Explanation of symbols]
41 AND gate, 42 XOR gate, 43, 43a to 43d, 105 to 108, 301 to 304 register, 100 multiplicand providing unit, 200 to 203 multiplier, 300 multiplier providing unit, 400 product-sum processing unit, 401-404 product-sum unit.

Claims

A delay device having an input terminal and an output terminal, the delay device delaying a signal given to the input terminal by at least one unit time and outputting the delayed signal from the output terminal;
A first gate having a first end, a second end, and an output end that outputs the binary logical AND of the signals input to the first end and the second end of the first gate; and
A second gate having a first end that receives the output of the first gate, a second end connected to the output terminal of the delay device, and an output end that outputs, to the input terminal of the delay device, the binary exclusive OR of the signals input to the first end and the second end of the second gate.

The arithmetic unit according to claim 1, further comprising:
a multiplier providing unit that sequentially outputs a plurality of vectors constituting a matrix set based on the multiplier; and
a multiplicand providing unit that outputs the multiplicand one bit at a time,
wherein a plurality of the product-sum units are provided, one for each element of the vector, and each of the product-sum units receives the element corresponding to itself at the first end of the first gate,
in each of the product-sum units, the one bit of the multiplicand is input to the second end of the first gate of that unit, and
the delay device delays the signal given to the input terminal of the delay device by one unit time.

The arithmetic unit according to claim 1, further comprising:
a multiplier providing unit that sequentially outputs, element by element, a plurality of vectors constituting a matrix set based on the multiplier; and
a multiplicand providing unit that outputs the multiplicand one bit at a time,
wherein the element is input to the first end of the first gate of the product-sum unit,
the one bit of the multiplicand is input to the second end of the first gate of the product-sum unit, and
the delay device delays the signal given to the input terminal of the delay device by the product of the number of elements constituting the vector and the unit time.