JP2007295128A

JP2007295128A - Logic integrated circuit and source of circuit for operation thereof, and computer readable recording medium for recording the same

Info

Publication number: JP2007295128A
Application number: JP2006118502A
Authority: JP
Inventors: Ryohei Tanaka; 良平田中
Original assignee: Daihen Corp
Current assignee: Daihen Corp
Priority date: 2006-04-21
Filing date: 2006-04-21
Publication date: 2007-11-08
Anticipated expiration: 2026-04-21
Also published as: JP4979975B2

Abstract

PROBLEM TO BE SOLVED: To save the space taken by arithmetic logic on a logic integrated circuit, such as an FPGA, by constructing a simple, high-performance arithmetic circuit on it.

SOLUTION: The data memory in a coprocessor 1 is divided into memories 19, 20 for storing multiplication results and memories 21, 22 for storing addition results. An adder 15 adds two pieces of data stored in the memories 19, 20 for storing multiplication results. A multiplier 16 multiplies two pieces of data stored in the memories 21, 22 for storing addition results. As a result, addition can be executed in parallel with multiplication. Because addition and multiplication are performed alternately in digital signal processing, executing them in parallel makes processing faster than when a CPU core is built into the FPGA.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to reprogrammable logic integrated circuits, such as field programmable gate arrays (hereinafter, FPGAs) and programmable logic devices (hereinafter, PLDs), that are incorporated in FA equipment, communication equipment, home appliances, medical equipment, and the like. More particularly, it relates to a technique for constructing an arithmetic circuit such as a coprocessor on a logic integrated circuit.

In conventional logic integrated circuits whose function is completed by the user, such as FPGAs and PLDs, complex arithmetic logic has been built by describing the logic directly in a hardware description language and creating the hardware circuit that corresponds to it. In the field of ASICs (application-specific ICs), by contrast, many designs adopt a system LSI configuration having a CPU core (see, for example, Patent Document 1), and complex arithmetic is handled by a program running inside the CPU core.

However, in an FPGA or PLD for which a hardware circuit corresponding to the arithmetic logic is created as described above, the circuit becomes large when complex arithmetic logic is constructed, such as logic for floating-point arithmetic in digital signal processing. In addition, when a CPU core of the kind used in conventional ASICs is incorporated directly into an FPGA or PLD, gate utilization rises and the CPU core occupies a large area on the FPGA or PLD.

Patent Document 1: JP-A-6-250871

The present invention has been made to solve the above problems. Its object is to make it possible to construct a simple, high-performance arithmetic circuit on a logic integrated circuit such as an FPGA or PLD, and thereby to provide a logic integrated circuit that saves the space taken by arithmetic logic, a source for the arithmetic circuit, and a computer-readable recording medium on which the source is recorded.

To achieve the above object, the invention of claim 1 is a logic integrated circuit, such as a field programmable gate array, having an arithmetic circuit, wherein the arithmetic circuit comprises a multiplier, an adder, a dedicated multiplication-result memory capable of storing a plurality of data items produced by the multiplier, a dedicated addition-result memory capable of storing a plurality of data items produced by the adder, and a control unit for controlling these parts of the circuit. The adder adds two of the data items stored in the dedicated multiplication-result memory, and the multiplier multiplies two of the data items stored in the dedicated addition-result memory.

The invention of claim 2 is the logic integrated circuit according to claim 1, wherein the control unit specifies to the dedicated multiplication-result memory the address corresponding to the data to be added, so that this data is output to the adder from among the plurality of data items stored in that memory, and specifies to the dedicated addition-result memory the address corresponding to the data to be multiplied, so that this data is output to the multiplier from among the plurality of data items stored in that memory.

The invention of claim 3 is the logic integrated circuit according to claim 2, wherein the control unit has a program memory storing microinstructions that include horizontal microcode, and has control lines for address specification and for write-enable signal output to the dedicated multiplication-result memory and the dedicated addition-result memory. The control unit reads a microinstruction from the program memory and transmits the on/off state of each bit of the horizontal microcode contained in the microinstruction to those memories via the control lines, thereby controlling the reading and writing of data in the dedicated multiplication-result memory and the dedicated addition-result memory.

The invention of claim 4 is the logic integrated circuit according to any one of claims 1 to 3, further comprising a register for temporarily storing data produced by the adder.

The invention of claim 5 is the logic integrated circuit according to any one of claims 1 to 4, wherein the operation by the multiplier and the operation by the adder can be executed simultaneously.

The invention of claim 6 is a source for an arithmetic circuit on a logic integrated circuit, the source being a hardware-description-language-level source for the arithmetic circuit on the logic integrated circuit according to any one of claims 1 to 5.

The invention of claim 7 is a computer-readable recording medium on which a source for an arithmetic circuit on a logic integrated circuit is recorded, the source being a hardware-description-language-level source for the arithmetic circuit on the logic integrated circuit according to any one of claims 1 to 5.

According to the inventions of claims 1 and 2, the arithmetic circuit consists mainly of a multiplier, an adder, a dedicated multiplication-result memory, a dedicated addition-result memory, and a control unit. An arithmetic circuit of simple configuration can therefore be constructed on a logic integrated circuit such as a field programmable gate array, which saves the space taken by arithmetic logic on the logic integrated circuit. Furthermore, the data memory in the arithmetic circuit is divided into the dedicated multiplication-result memory and the dedicated addition-result memory. The adder adds two of the data items stored in the dedicated multiplication-result memory, and the multiplier multiplies two of the data items stored in the dedicated addition-result memory, so the arithmetic circuit can execute addition and multiplication in parallel. In digital signal processing such as filtering in an adaptive digital filter, addition and multiplication are often performed alternately. Executing them in parallel therefore allows digital signal processing to run faster, at a comparable clock frequency, than when a CPU core of the kind used in conventional ASICs is incorporated directly into an FPGA or PLD.

According to the invention of claim 3, the control unit has control lines for address specification and for write-enable signal output to the dedicated multiplication-result memory and the dedicated addition-result memory. It reads a microinstruction from the program memory and transmits the on/off state of each bit of the horizontal microcode contained in the microinstruction to those memories via the control lines, thereby controlling the reading and writing of data in them. The control unit can thus control these reads and writes without decoding an instruction to generate control signals for registers and memories, which simplifies the processing it performs. Consequently, the control unit can have a simple configuration, and reading and writing of data in the dedicated multiplication-result memory and the dedicated addition-result memory can be made faster.

According to the invention of claim 4, a register for temporarily storing data produced by the adder is further provided, which speeds up digital signal processing in cases where additions are executed consecutively.

According to the inventions of claims 6 and 7, effects equivalent to those of the inventions described above can be obtained by having a computer read the source, performing logic synthesis on the computer using the source, and downloading the result to the logic integrated circuit.

The best mode for carrying out the present invention will be described with reference to the drawings. The embodiment described below does not cover the whole of the present invention, and the present invention is not limited to the following form.

A field programmable gate array (hereinafter, FPGA), which is a logic integrated circuit according to one embodiment of the present invention, will now be described with reference to the drawings. FIG. 1 shows the configuration around the coprocessor (the arithmetic circuit in the claims) in the FPGA of this embodiment. The coprocessor 1 is a kind of IP incorporated to reduce the scale of the arithmetic logic on the FPGA (see FIG. 6). The coprocessor 1 includes an arithmetic unit 3, a control unit 2 that controls operations by the arithmetic unit 3, and a clock generator 14. The arithmetic unit 3 consists of a multiplier 16; an adder 15; a memory 19 and a memory 20 (the dedicated multiplication-result memory in the claims), each capable of storing a plurality of data items produced by the multiplier 16; a memory 21 and a memory 22 (the dedicated addition-result memory in the claims), each capable of storing a plurality of data items produced by the adder 15; MUX1 to MUX3, which are multiplexers that switch the route of data input to the adder 15; MUX4 to MUX6, which are multiplexers that switch the route of data input to the multiplier 16; and A_Reg 23, which is a register that temporarily stores data produced by the adder 15. The memory 19 is a dual-port memory having a read port 19a and a write port 19b, so data can be read from the memory 19 and written to it at the same time. The same applies to the memories 20 to 22. Each of the memories 19 to 22 can store 256 items of 32-bit data. MUX2 is a multiplexer that switches the data input to the adder 15 between the data output from MUX1 and the data from A_Reg 23. MUX17 is a multiplexer for switching external input data. MUX6 is a multiplexer that switches the data input to the multiplier 16 between the data output from MUX5 and the external data output from MUX17.

The adder 15 mainly reads two of the data items stored in the memory 19 and the memory 20 via lines (data buses) L5 and L10 and MUX1 and MUX3, and adds them. However, when the immediately preceding multiplication result from the multiplier 16 is to be used directly in the next addition, the adder 15 uses the data input via line (data bus) L4 and MUX1, or the data input via line (data bus) L9 and MUX3. In other words, lines L4 and L9 are bypass lines that send the immediately preceding multiplication result from the multiplier 16 directly to MUX1 or MUX3.

The multiplier 16 mainly reads two of the data items stored in the memory 21 and the memory 22 via lines (data buses) L15 and L20 and MUX4 and MUX5, and multiplies them. However, when the immediately preceding addition result from the adder 15 is to be used directly in the next multiplication, the multiplier 16 uses the data input via line (data bus) L14 and MUX4, or the data input via line (data bus) L19 and MUX5. In other words, lines L14 and L19 are bypass lines that send the immediately preceding addition result from the adder 15 directly to MUX4 or MUX5.
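The cross-coupled arrangement of the adder 15 and the multiplier 16 can be illustrated with a small behavioral model. The following Python sketch is only an illustration under simplifying assumptions: the class name `Datapath` and the method `step` are hypothetical, the bypass lines, MUX selection, 32-bit width, and pipeline latency are ignored, and each result is written to both memories of its pair.

```python
from dataclasses import dataclass, field

DEPTH = 256  # each of the memories 19 to 22 holds 256 data items


def _blank():
    return [0] * DEPTH


@dataclass
class Datapath:
    # Memories 19 and 20 hold multiplication results and feed the adder 15.
    mem19: list = field(default_factory=_blank)
    mem20: list = field(default_factory=_blank)
    # Memories 21 and 22 hold addition results and feed the multiplier 16.
    mem21: list = field(default_factory=_blank)
    mem22: list = field(default_factory=_blank)

    def step(self, add_rd, mul_rd, add_wr=None, mul_wr=None):
        """Model one cycle in which the adder and multiplier both operate.

        add_rd: (address in memory 19, address in memory 20) read by the adder.
        mul_rd: (address in memory 21, address in memory 22) read by the multiplier.
        add_wr / mul_wr: optional write addresses for the two results.
        """
        # Both units read the state as it was before this cycle.
        total = self.mem19[add_rd[0]] + self.mem20[add_rd[1]]
        product = self.mem21[mul_rd[0]] * self.mem22[mul_rd[1]]
        # The addition result goes to the multiplier-side memories ...
        if add_wr is not None:
            self.mem21[add_wr] = total
            self.mem22[add_wr] = total
        # ... and the multiplication result goes to the adder-side memories.
        if mul_wr is not None:
            self.mem19[mul_wr] = product
            self.mem20[mul_wr] = product
        return total, product
```

Because the adder reads only from the memories 19 and 20 and the multiplier reads only from the memories 21 and 22, the two operations in `step` never contend for the same read port, which is the property that lets them run in the same cycle.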

The control unit 2 has a program counter 11, a program memory 12, and an instruction register 13. The program counter 11 indicates the address in the program memory 12 at which the microinstruction to be executed next is located. The program memory 12 stores 64-bit microinstructions in horizontal microcode format (a simple format in which each bit of the microinstruction is assigned to one control signal). The microinstruction stored at the address in the program memory 12 indicated by the program counter 11 is set in the instruction register 13.

Between the instruction register 13 and the memory 19 there are provided a control line L1 for supplying the memory 19 with the address of the data to be read for addition when data in the memory 19 is output to MUX1, a control line L2 for supplying the address in the memory 19 at which data is to be stored when the multiplication result output from the multiplier 16 is written to the memory 19, and a control line L21 for outputting a write-enable signal to the memory 19. Likewise, between the instruction register 13 and the memory 20 there are provided a control line L6 for supplying the memory 20 with the address of the data to be read for addition when data in the memory 20 is output to MUX3, a control line L7 for supplying the address in the memory 20 at which data is to be stored when the multiplication result output from the multiplier 16 is written to the memory 20, and a control line L22 for outputting a write-enable signal to the memory 20. In addition, a control line (not shown; hereinafter, the A_Reg control line) for outputting a write-enable signal from the instruction register 13 to A_Reg 23 is provided between the instruction register 13 and A_Reg 23. With the A_Reg control line, A_Reg 23, and MUX2, the adder 15 can perform the next addition using the previous addition result as it is.
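The feedback path through A_Reg 23 and MUX2 can be pictured as an accumulator loop. The sketch below is a hedged illustration only: the function names `adder_left_input` and `running_sum` are hypothetical, the right-hand operand is assumed to arrive through MUX3, and A_Reg 23 is assumed to be written on every addition.

```python
def adder_left_input(select_a_reg, mux1_out, a_reg):
    """Model of MUX2: pass either the MUX1 output or the A_Reg 23 feedback."""
    return a_reg if select_a_reg else mux1_out


def running_sum(first, rest):
    """Compute first + rest[0] + rest[1] + ... by feeding each sum back via A_Reg."""
    a_reg = 0  # contents of A_Reg 23 (assumed initial value)
    for i, right in enumerate(rest):  # right operand assumed to arrive via MUX3
        # The first addition takes its left operand from MUX1;
        # every later addition reuses the previous sum held in A_Reg.
        left = adder_left_input(i > 0, first, a_reg)
        a_reg = left + right  # write enable asserted on the A_Reg control line
    return a_reg
```

For example, `running_sum(1, [2, 3, 4])` chains three additions without writing any intermediate sum to the memories 21 and 22.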

Between the instruction register 13 and the memory 21 there are provided a control line L11 for supplying the memory 21 with the address of the data to be read for multiplication when data in the memory 21 is output to MUX4, a control line L12 for supplying the address in the memory 21 at which data is to be stored when the addition result output from the adder 15 is written to the memory 21, and a control line L23 for outputting a write-enable signal to the memory 21. Likewise, between the instruction register 13 and the memory 22 there are provided a control line L16 for supplying the memory 22 with the address of the data to be read for multiplication when data in the memory 22 is output to MUX5, a control line L17 for supplying the address in the memory 22 at which data is to be stored when the addition result output from the adder 15 is written to the memory 22, and a control line L24 for outputting a write-enable signal to the memory 22.

Control lines (not shown; hereinafter, the multiplexer control lines) are provided between the instruction register 13 and MUX1 to MUX6 in the arithmetic unit 3.

A line L3, which is a data input bus, is provided between the multiplier 16 and the memory 19, and a line L8, which is a data input bus, is provided between the multiplier 16 and the memory 20. Similarly, a line L13, which is a data input bus, is provided between the adder 15 and the memory 21, and a line L18, which is a data input bus, is provided between the adder 15 and the memory 22.

The program counter 11 reads a microinstruction from the program memory 12 and sets it in the instruction register 13, and the on/off state of each bit of the microinstruction is transmitted to the memories 19 to 22 via the control lines L1, L2, L6, L7, L11, L12, L16, L17, L21, L22, L23, and L24, thereby controlling the reading and writing of data in the memories 19 to 22. In the same way, the on/off state of each bit of the microinstruction set in the instruction register 13 is transmitted to MUX1 to MUX6 in the arithmetic unit 3 via the multiplexer control lines, thereby controlling the flow of data within the arithmetic unit 3. In short, the program counter 11 reads a microinstruction from the program memory 12, sets it in the instruction register 13, and transmits the on/off state of each of its bits to the memories 19 to 22 and MUX1 to MUX6, thereby controlling the arithmetic processing in the arithmetic unit 3.

In the above configuration, the adder 15 mainly performs addition using the multiplication results of the multiplier 16, and the multiplier 16 mainly performs multiplication using the addition results of the adder 15. The data flow in the arithmetic unit 3 therefore looks like the image shown in FIG. 2(b). Some conventional processors include a plurality of ALUs. However, as shown in FIG. 2(a), each of the ALUs 101 and 102 in this type of processor performs its next operation using only data that the same ALU processed in the past. In contrast, the adder 15 of the coprocessor 1 in this embodiment performs its next operation, in principle, using data that the multiplier 16 processed in the past, and the multiplier 16 performs its next operation, in principle, using data that the adder 15 processed in the past.

FIG. 3 is a table showing the correspondence between the microinstructions stored in the program memory 12 and the commands from which they are derived. Columns 39 and 40 in the figure show the contents of the first and second operands, respectively, of the command underlying each microinstruction. Each microinstruction is 64 bits long in total and consists of a 16-bit command part 31, made up of a first command part and a second command part, and a 48-bit operand part 32, made up of first to sixth OP parts. Each of the first to sixth OP parts is 8 bits long. These OP parts store the addresses in the memories 19 to 22 that correspond to the operands 39 and 40 of the command underlying the microinstruction.
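The 64-bit layout described here (a 16-bit command part followed by six 8-bit OP parts) can be sketched as simple bit packing. The field order used below, with the command part in the upper 16 bits and the first OP part immediately after it, is an assumption made for illustration; the text does not state the bit positions, and the function names `pack_microinstruction` and `unpack_microinstruction` are hypothetical.

```python
def pack_microinstruction(command, ops):
    """Pack a 16-bit command part and six 8-bit OP parts into one 64-bit word."""
    if not 0 <= command < (1 << 16):
        raise ValueError("command part must fit in 16 bits")
    if len(ops) != 6 or any(not 0 <= op < (1 << 8) for op in ops):
        raise ValueError("exactly six OP parts of 8 bits each are required")
    word = command
    for op in ops:  # assumed order: first OP part nearest the command part
        word = (word << 8) | op
    return word


def unpack_microinstruction(word):
    """Split a 64-bit word back into (command, [op1, ..., op6])."""
    command = (word >> 48) & 0xFFFF
    ops = [(word >> (8 * shift)) & 0xFF for shift in range(5, -1, -1)]
    return command, ops
```

An 8-bit OP part can address 256 locations, which matches the 256 data items that each of the memories 19 to 22 can hold.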

In the figure, "mult" is a multiplication instruction, "w_mult" is an instruction that writes the multiplication result to the memories 19 and 20, "add" is an addition instruction, and "w_adds" is an instruction that writes the addition result to the memories 21 and 22. "sub_ab" is a subtraction instruction meaning "(data at the address indicated by the first operand) - (data at the address indicated by the second operand)", and "sub_ba" is a subtraction instruction meaning "(data at the address indicated by the second operand) - (data at the address indicated by the first operand)". "lda" is an instruction that writes data from A_Reg 23, and "sta" is an instruction that reads data into A_Reg 23. "ld_div", "ld_alimit", and "ld_din" are all load instructions, and "st_ua1", "st_ua2", "st_ub1", "st_ub2", "st_m1", "st_m2", "st_n1", and "st_n2" are all store instructions.

Two or more commands within a group marked "exclusive" in the figure cannot be executed at the same time. Accordingly, in the program sheet 50 shown in FIG. 4, two or more commands within a group marked "exclusive" in FIG. 3 (hereinafter, an exclusive group) cannot be written on the same row. For example, two or more commands from the first exclusive group in FIG. 3 (the group consisting of "add", "sub_ab", and "sub_ba"), such as "add" and "sub_ab", cannot be written on the same row of the program sheet 50. Conversely, any two or more commands shown in FIG. 3 can be executed at the same time as long as they do not belong to the same exclusive group. For example, "mult", "w_mult", "w_adds", and "add" in FIG. 3 can be executed simultaneously, so they can be written on the same row of the program sheet 50 shown in FIG. 4.
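The row rule for the program sheet can be expressed as a small validity check. The sketch below is illustrative only: the function name `exclusive_conflicts` is hypothetical, and the group list contains only the one exclusive group that the text spells out; the remaining groups of FIG. 3 are not reproduced here and would have to be added by the reader.

```python
# Only the first exclusive group is named in the text; the others in FIG. 3 are omitted.
EXCLUSIVE_GROUPS = [{"add", "sub_ab", "sub_ba"}]


def exclusive_conflicts(row, groups=EXCLUSIVE_GROUPS):
    """Return the sets of commands on one row that share an exclusive group."""
    conflicts = []
    for group in groups:
        clash = sorted(group.intersection(row))
        if len(clash) >= 2:  # two or more commands of one group on the same row
            conflicts.append(clash)
    return conflicts
```

A row such as `["add", "sub_ab"]` is reported as a conflict, whereas `["mult", "w_mult", "w_adds", "add"]` passes because no two of its commands belong to the same listed group.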

In FIG. 3, reference numeral 35 denotes a bit for inverting the sign of the data in the memories 19 and 20 stored at the address corresponding to the label in the second operand 40, and 36 denotes a bit for inverting the sign of the data in the memories 19 and 20 stored at the address corresponding to the label in the first operand 39. Reference numeral 34 denotes the bit corresponding to the write-enable signal for the memories 21 and 22, and 33 denotes the bit corresponding to the write-enable signal for the memories 19 and 20. Further, 37 is a bit for switching the data flow in MUX2, and 38 is a bit for switching the data flow in MUX6. The bits of the second command part other than 37 and 38 switch the data flow in the multiplexers of the arithmetic unit 3 other than MUX2 and MUX6.

FIG. 4 shows a program sheet, which is an editor for entering a program that combines the commands shown in FIG. 3. The program shown in the program sheet 50 in the figure describes part of the processing of an IIR (infinite impulse response) filter. Because pipeline processing in the coprocessor 1 has four stages, the result of a multiplication or addition must be written to the memories 19 to 22 in a step that is four or more steps after the operation itself. If the number of pipeline stages is changed, however, the number of steps that must elapse before the result of a multiplication or addition is written changes accordingly.

The command "mult const1" on row 1 of program sheet 50 multiplies externally input data (EXT_Data, input from MUX17 in FIG. 1) by the constant 1. As arrow (1) shows, the result of this multiplication is stored by the second command on row 5, "w_mult v_vc", at the address corresponding to label v_vc in memories 19 and 20 on the adder 15 side. The externally input data is multiplied by the constant 1 and stored in memories 19 and 20 on the adder 15 side because there is no route for writing externally input data directly into those memories. Performing this multiplication makes it possible to write externally input data into memories 19 and 20 on the adder 15 side. This is one of the measures taken to keep the structure of coprocessor 1 of this embodiment simple.

The command "mult v_vb v_aa" on row 3 of program sheet 50 multiplies the data stored at the address corresponding to label v_vb in memory 21 or 22 on the multiplier 16 side by the data stored at the address corresponding to label v_aa in memory 21 or 22 on the multiplier 16 side. As arrow (2) shows, the result of this multiplication is stored by the second command on row 7, "w_mult v_ve", at the address corresponding to label v_ve in memories 19 and 20 on the adder 15 side.

The command "mult v_vi v_bb" on row 5 of program sheet 50 multiplies the data stored at the address corresponding to label v_vi in memory 21 or 22 on the multiplier 16 side by the data stored at the address corresponding to label v_bb in memory 21 or 22 on the multiplier 16 side. As arrow (4) shows, the result of this multiplication is stored by the second command on row 9, "w_mult v_vf", at the address corresponding to label v_vf in memories 19 and 20 on the adder 15 side.

The multiplication result stored at the address corresponding to label v_ve in memories 19 and 20 on the adder 15 side by the second command on row 7, "w_mult v_ve", and the multiplication result stored at the address corresponding to label v_vf in the same memories by the second command on row 9, "w_mult v_vf", are used, as arrows (3) and (5) show, in the addition performed by the third command on row 10, "add v_vf v_ve". As arrow (6) shows, the result of this addition is stored by the fourth command on row 14, "w_adds v_vd", at the address corresponding to label v_vd in memories 21 and 22 on the multiplier 16 side.

The command "mult v_vb const1" on row 7 multiplies the data stored at the address corresponding to label v_vb in memory 21 or 22 on the multiplier 16 side by the constant 1. As arrow (7) shows, the result of this multiplication is stored by the second command on row 11, "w_mult v_vi", at the address corresponding to label v_vi in memories 19 and 20 on the adder 15 side. As arrow (8) shows, this product is then used in the addition performed by the third command on row 12, "add v_vi const0", which adds the data stored at the address corresponding to label v_vi in memory 19 or 20 on the adder 15 side to the constant 0. As arrow (9) shows, the result of this addition is stored by the fourth command on row 14, "w_adds v_vi", at the address corresponding to label v_vi in memories 21 and 22 on the multiplier 16 side.

As described above, the data stored at the address corresponding to label v_vb in memory 21 or 22 on the multiplier 16 side is first multiplied by the constant 1 and stored in memories 19 and 20 on the adder 15 side (the command "mult v_vb const1" on row 7). The product held in memory 19 or 20 on the adder 15 side is then added to the constant 0 and stored at the address corresponding to label v_vi in memories 21 and 22 on the multiplier 16 side. This two-step procedure is used because there is no route for writing data stored at a given address in memory 21 or 22 on the multiplier 16 side directly to another address in memories 21 and 22. Processing the data in this way copies data stored at a given address in the multiplier-side memory to another address in the same memory. This, too, is one of the measures taken to keep the structure of coprocessor 1 of this embodiment simple.
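
The two routing tricks (multiplying by 1 to move data to the adder side, then adding 0 to move it back) can be sketched in Python as follows. This is a toy model under stated assumptions: each memory pair is represented by a single dictionary keyed by label, pipeline timing is ignored, and the function names are hypothetical.

```python
def load_external(adder_side, label, ext_data):
    """Route external data to the adder-side memories via the multiplier.

    Models "mult const1" followed by "w_mult <label>".
    """
    adder_side[label] = ext_data * 1.0


def copy_within_multiplier_side(multiplier_side, adder_side, src, dst):
    """Copy multiplier_side[src] to multiplier_side[dst] with no direct path."""
    # "mult <src> const1" then "w_mult <dst>": the product lands on the adder side.
    adder_side[dst] = multiplier_side[src] * 1.0
    # "add <dst> const0" then "w_adds <dst>": the sum returns to the multiplier side.
    multiplier_side[dst] = adder_side[dst] + 0.0


adder_side = {}                    # stands in for memories 19 and 20
multiplier_side = {"v_vb": 0.75}   # stands in for memories 21 and 22

load_external(adder_side, "v_vc", 0.5)
copy_within_multiplier_side(multiplier_side, adder_side, "v_vb", "v_vi")
print(adder_side["v_vc"], multiplier_side["v_vi"])  # 0.5 0.75
```

The cost of omitting the direct write paths is extra program rows and pipeline latency for each move, in exchange for fewer multiplexer inputs in the datapath.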

Although not shown in FIG. 4, replacing the command "mult v_vb const1" on row 7 with the row 1 command "mult const1" makes it possible to write externally input data into memories 21 and 22 on the multiplier 16 side. This, too, is one of the measures taken to keep the structure of coprocessor 1 of this embodiment simple.

Coprocessor 1 of this embodiment can process in parallel the commands written on the same row of program sheet 50 (for example, the first command on row 5, "mult v_vi v_bb", and the second command on that row, "w_mult v_vc"). That is, coprocessor 1 can process in parallel the multiplication by a mult command, the writing of a multiplication result to memories 19 and 20 on the adder 15 side by a w_mult command, the addition by an add command, the writing of an addition result to memories 21 and 22 on the multiplier 16 side by a w_adds command, and so on.

The multiplication, the writing of the multiplication result to the memories on the adder 15 side, the addition, and the writing of the addition result to the memories on the multiplier 16 side can be processed in parallel for two reasons. First, instead of using an ALU for arithmetic processing as in the prior art, the operations are performed by adder 15, an arithmetic unit dedicated to addition, and multiplier 16, an arithmetic unit dedicated to multiplication. Second, the data memory is divided into memories 19 and 20 on the adder 15 side and memories 21 and 22 on the multiplier 16 side.
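
The row-level parallelism can be illustrated with a small cycle-by-cycle model in Python. It is a sketch under several assumptions that are not taken from the figures: each memory pair is one dictionary, every result leaves its pipeline exactly four rows after issue, and a row is a dictionary with optional `mult`, `w_mult`, `add` and `w_adds` slots. The row positions follow the description of program sheet 50; the numeric values are made up.

```python
from collections import deque

LATENCY = 4  # pipeline depth of the embodiment


def value(bank, operand):
    """Look up a label in a bank; pass numeric constants through."""
    return bank[operand] if isinstance(operand, str) else operand


def run(rows, mul_bank, add_bank, latency=LATENCY):
    """Execute program rows; all slots of one row act in the same cycle."""
    mul_pipe = deque([None] * latency, maxlen=latency)
    add_pipe = deque([None] * latency, maxlen=latency)
    for row in rows:
        # Results issued `latency` rows earlier leave the pipelines now.
        mul_out, add_out = mul_pipe[0], add_pipe[0]
        # Both units read their operands before any write takes effect.
        mul_in = add_in = None
        if "mult" in row:
            a, b = row["mult"]
            mul_in = value(mul_bank, a) * value(mul_bank, b)
        if "add" in row:
            a, b = row["add"]
            add_in = value(add_bank, a) + value(add_bank, b)
        if "w_mult" in row:
            add_bank[row["w_mult"]] = mul_out   # product goes to the adder side
        if "w_adds" in row:
            mul_bank[row["w_adds"]] = add_out   # sum goes to the multiplier side
        mul_pipe.append(mul_in)
        add_pipe.append(add_in)


rows = [{} for _ in range(14)]           # rows[0] is sheet row 1
rows[0]["mult"] = (0.5, 1.0)             # "mult const1" on made-up EXT_Data 0.5
rows[2]["mult"] = ("v_vb", "v_aa")       # row 3
rows[4]["mult"] = ("v_vi", "v_bb")       # row 5, first slot
rows[4]["w_mult"] = "v_vc"               # row 5, second slot (same cycle)
rows[6]["w_mult"] = "v_ve"               # row 7
rows[8]["w_mult"] = "v_vf"               # row 9
rows[9]["add"] = ("v_vf", "v_ve")        # row 10
rows[13]["w_adds"] = "v_vd"              # row 14

mul_bank = {"v_vb": 2.0, "v_aa": 3.0, "v_vi": 4.0, "v_bb": 5.0}
add_bank = {}
run(rows, mul_bank, add_bank)
print(add_bank["v_vc"], add_bank["v_ve"], add_bank["v_vf"], mul_bank["v_vd"])
# 0.5 6.0 20.0 26.0
```

Because the multiplier reads only from one bank and writes only to the other, and the adder does the reverse, the four slots of a row never contend for the same memory port in this model.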

Next, with reference to FIGS. 5 and 6, the procedure for downloading to FPGA 10 the logic synthesis result based on the hardware-description-language-level source for coprocessor 1 will be described. The user creates the program to be stored in program memory 12 and simulates it on a personal computer (offline debugging). Once the simulation has verified the program, the personal computer, in response to the user's instructions, generates the hardware description language (for example, VHDL (VHSIC Hardware Description Language)) level source for coprocessor 1 (S1), generates the hardware description language level source for the other IPs 76 and 77 on FPGA 10 (#2), performs logic synthesis on these sources (#3), and downloads the logic synthesis result to ROM 75 in FPGA 10 (#4). ROM 75 corresponds to the computer-readable recording medium of claim 7.

As described above, in coprocessor 1 of this embodiment the arithmetic circuit consists mainly of multiplier 16, adder 15, dedicated multiplication-result memories 19 and 20, dedicated addition-result memories 21 and 22, and control unit 2. An arithmetic circuit of simple configuration can therefore be built on FPGA 10, which saves space for arithmetic logic on FPGA 10. In addition, the data memory in FPGA 10 is divided into the dedicated multiplication-result memories 19 and 20 and the dedicated addition-result memories 21 and 22; adder 15 adds two of the data items stored in the dedicated multiplication-result memories 19 and 20, and multiplier 16 multiplies two of the data items stored in the dedicated addition-result memories 21 and 22. As a result, coprocessor 1 can execute addition and multiplication in parallel.

In digital signal processing such as filtering in an adaptive digital filter, addition and multiplication are often performed alternately. Because addition and multiplication can be executed in parallel as described above, digital signal processing can be executed faster, at a comparable clock frequency, than when a CPU core of the kind used in conventional ASICs is built directly into an FPGA or a programmable logic device (PLD). In particular, floating-point operations on the FPGA can be sped up. Specifically, when coprocessor 1 of this embodiment is used for floating-point arithmetic and its clock frequency is 50 MB/S, its processing speed can be ten times that of an ordinary DSP (Digital Signal Processor) with a clock frequency of 300 MB/S.

Control unit 2 has control lines L1, L2, L6, L7, L11, L12, L16, and L17 for address indication and control lines L21, L22, L23, and L24 for write-enable signal output, all connected to the dedicated multiplication-result memories 19 and 20 and the dedicated addition-result memories 21 and 22. It reads a microinstruction from program memory 12 and transmits the on/off information of each bit of the horizontal microcode contained in that microinstruction over these control lines to memories 19, 20, 21, and 22, thereby controlling the reading and writing of data in those memories. Control unit 2 can thus control reads and writes to memories 19 to 22 without decoding an instruction to generate control signals for registers and memories, which simplifies the processing it performs. Consequently, control unit 2 can have a simple configuration, and reading and writing of data in the dedicated multiplication-result memories 19 and 20 and the dedicated addition-result memories 21 and 22 can be made faster.
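
The idea of horizontal microcode, in which each bit or bit group of the microinstruction drives a control line directly, can be sketched in Python. The field names, offsets and widths below are hypothetical; the actual bit assignment is the one tabulated in FIG. 3 and is not reproduced here.

```python
# Hypothetical layout: name -> (bit offset, width). Not the layout of FIG. 3.
FIELDS = {
    "we_mult_mem": (0, 1),    # write enable, memories 19 and 20
    "we_add_mem": (1, 1),     # write enable, memories 21 and 22
    "addr_mult_mem": (2, 6),  # address lines, memories 19 and 20
    "addr_add_mem": (8, 6),   # address lines, memories 21 and 22
}


def assemble(**values):
    """Pack field values into one microinstruction word."""
    word = 0
    for name, v in values.items():
        offset, width = FIELDS[name]
        if v < 0 or v >> width:
            raise ValueError(f"{name}={v} does not fit in {width} bit(s)")
        word |= v << offset
    return word


def control_lines(word):
    """Slice the word into control-line values: wiring only, no opcode decode."""
    return {
        name: (word >> offset) & ((1 << width) - 1)
        for name, (offset, width) in FIELDS.items()
    }


word = assemble(we_mult_mem=1, addr_mult_mem=5)
print(control_lines(word))
# {'we_mult_mem': 1, 'we_add_mem': 0, 'addr_mult_mem': 5, 'addr_add_mem': 0}
```

In hardware, `control_lines` corresponds to nothing more than routing register bits to memory pins, which is why no instruction decoder is needed; the price is a wider microinstruction word.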

Further, providing A_Reg 23, a register for temporarily storing the result data of adder 15, speeds up processing when additions are executed consecutively in the digital signal processing performed by coprocessor 1.

The present invention is not limited to the above embodiment, and various modifications are possible. For example, the above embodiment shows the case where the logic integrated circuit on which coprocessor 1 is built is an FPGA, but coprocessor 1 according to the present invention may instead be built on a reprogrammable logic integrated circuit other than an FPGA (for example, a PLD). Likewise, the above embodiment shows the case where coprocessor 1 has a four-stage pipeline, but the number of pipeline stages in the coprocessor is not limited to four.

FIG. 1 is a configuration diagram of a coprocessor on a logic integrated circuit according to one embodiment of the present invention.
FIG. 2(a) shows an image of the data flow in a conventional processor, and FIG. 2(b) shows an image of the data flow in the arithmetic unit of the coprocessor according to the embodiment.
FIG. 3 is a table showing the correspondence between the commands from which the microinstructions stored in the program memory of FIG. 1 are derived and those microinstructions.
FIG. 4 shows the program sheet, which is an editor for entering programs that combine the commands shown in FIG. 3.
FIG. 5 is a flowchart showing the procedure for downloading to the FPGA the logic synthesis result based on the hardware-description-language-level source for the coprocessor.
FIG. 6 is a configuration diagram of the equipment used for downloading the logic synthesis result and for testing on the actual device after the download.

Explanation of symbols

1 Coprocessor (arithmetic circuit)
2 Control unit
10 FPGA
12 Program memory
15 Adder
16 Multiplier
19 Memory (dedicated multiplication-result memory)
20 Memory (dedicated multiplication-result memory)
21 Memory (dedicated addition-result memory)
22 Memory (dedicated addition-result memory)
23 A_Reg (register for temporarily storing result data from the adder)
L1, L2, L6, L7, L11, L12, L16, L17 Control lines (for address indication)
L21, L22, L23, L24 Control lines (for write-enable signal output)
75 ROM (computer-readable recording medium)

Claims

1. A logic integrated circuit, such as a field programmable gate array, having an arithmetic circuit, wherein
the arithmetic circuit comprises:
a multiplier;
an adder;
a dedicated multiplication-result memory capable of storing a plurality of data items resulting from operations by the multiplier;
a dedicated addition-result memory capable of storing a plurality of data items resulting from operations by the adder; and
a control unit that controls each of these circuit parts,
the adder adds two of the plurality of data items stored in the dedicated multiplication-result memory, and
the multiplier multiplies two of the plurality of data items stored in the dedicated addition-result memory.

2. The logic integrated circuit according to claim 1, wherein the control unit:
indicates to the dedicated multiplication-result memory the address corresponding to the data to be added, so that the data to be added is output to the adder from among the plurality of data items stored in the dedicated multiplication-result memory; and
indicates to the dedicated addition-result memory the address corresponding to the data to be multiplied, so that the data to be multiplied is output to the multiplier from among the plurality of data items stored in the dedicated addition-result memory.

3. The logic integrated circuit according to claim 2, wherein
the control unit has a program memory storing microinstructions that include horizontal microcode, and
the control unit has control lines for address indication and for write-enable signal output to the dedicated multiplication-result memory and the dedicated addition-result memory, reads a microinstruction from the program memory, and controls the reading and writing of data in the dedicated multiplication-result memory and the dedicated addition-result memory by transmitting the on/off information of each bit of the horizontal microcode included in the microinstruction to those memories via the control lines.

4. The logic integrated circuit according to claim 1, further comprising a register for temporarily storing result data from the adder.

5. The logic integrated circuit according to claim 1, wherein the arithmetic processing by the multiplier and the arithmetic processing by the adder can be executed simultaneously.

6. A source for an arithmetic circuit on a logic integrated circuit,
wherein the source is a hardware-description-language-level source for the arithmetic circuit on the logic integrated circuit according to any one of claims 1 to 5.

7. A computer-readable recording medium on which a source for an arithmetic circuit on a logic integrated circuit is recorded,
wherein the source is a hardware-description-language-level source for the arithmetic circuit on the logic integrated circuit according to any one of claims 1 to 5.