JP2003167730A

JP2003167730A - Instruction set variable microprocessor

Info

Publication number: JP2003167730A
Application number: JP2001368080A
Authority: JP
Inventors: Shunzo Yamashita; 春造山下
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2001-12-03
Filing date: 2001-12-03
Publication date: 2003-06-13

Abstract

PROBLEM TO BE SOLVED: To provide a microprocessor whose instruction set can be rewritten, combining speed performance comparable to that of a conventional hard-wired microprocessor with the high flexibility of programmable logic, and to provide an SoC having such a microprocessor. SOLUTION: The control logic, such as the instruction decoder, and a part of the data path are implemented in programmable logic PL1, and the contents of the programmable logic are rewritten in accordance with the instruction set. COPYRIGHT: (C)2003,JPO

Description

Detailed Description of the Invention

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a microprocessor whose instruction set can be flexibly changed on the user side, and to a semiconductor integrated circuit in which such a microprocessor and peripheral circuits are integrated on one chip.

[0002]

2. Description of the Related Art

Thanks to large improvements in the integration density of semiconductor integrated circuits, a system that used to be built from multiple semiconductor chips on a printed circuit board (typically a microprocessor chip plus peripheral circuits such as interface circuits) can now be integrated on a single chip, known as a System-on-a-Chip (SoC). Single-chip integration reduces chip manufacturing cost and allows a smaller printed circuit board, so system cost falls. In addition, the chip-to-chip wiring on the board is replaced by on-chip wiring that is orders of magnitude shorter, which reduces wiring capacitance and, as a result, lowers power consumption and raises the speed of the system as a whole. The SoC thus has many advantages.

[0003] However, because the combination of peripheral circuits required by an SoC differs from application to application, a dedicated mask pattern must be prepared for each application and a dedicated SoC manufactured. This requires (1) a design step, in which design data combining the microprocessor and the peripheral circuits to be integrated is written in a hardware description language (HDL) such as Verilog HDL or VHDL and logic synthesis is performed to produce circuit data, and (2) a manufacturing step, in which a mask pattern is created from the circuit data and the chip is manufactured according to that mask pattern. The design step (1) takes at least about one month and the manufacturing step (2) likewise takes about one month, so the period from fixing the SoC specification to completion (the turnaround time, TAT) is at least several months.

[0004] On the other hand, as product cycles shorten under fierce competition, with mobile phones as a typical example, a large reduction in TAT has become an important theme for semiconductor chips such as SoCs. Various methods have therefore been proposed to reduce TAT.

[0005] The first method is the reuse of IP (Intellectual Property). As described in Nikkei Electronics, August 10, 1998, pp. 100-109 (Reference 1), the interface specifications of the individual modules that make up an SoC are standardized, and the individual modules are designed in advance according to the common specification. As a result, most of the design data can be reused even when a new SoC is designed with a different combination of modules, which shortens the design period. This alone, however, does not shorten the time required for the manufacturing step.

[0006] The second method is the use of programmable logic, whose circuit configuration the user can change. Programmable logic is known as FPGA (Field Programmable Gate Array), PLD (Programmable Logic Device), and so on, and comes in several types. The structure of programmable logic is disclosed, for example, in Architecture and CAD for Deep-Submicron FPGAs, Kluwer Academic Publishers, pp. 1-21 (Reference 2). By rewriting the contents of the LUTs (Look Up Tables) and the configuration of the programmable wiring, logic circuits with various logic functions can be realized.

[0007] Integrating such programmable logic in an SoC lets the user program peripheral circuits with various functions after the chip has been manufactured. This effectively shortens the time required for the manufacturing step. For example, Nikkei Electronics, January 15, 2001, pp. 153-160 (Reference 3) introduces an SoC that integrates, on one chip, an FPGA capable of implementing logic circuits on the scale of several hundred thousand gates and a microprocessor core built as an ordinary custom chip.

[0008] Meanwhile, in communication applications that make heavy use of digital signal processing, where SoCs are expected to be widely used, there are cases in which the general-purpose instruction set of a general-purpose processor core does not deliver satisfactory processing performance. For example, Nikkei Electronics, March 27, 2000, pp. 210-219 (Reference 4) presents an example in which adding a special instruction that supports the bit-string reordering operation used in communication protocol processing improves processing speed by a factor of ten or more. A processor whose instruction set can likewise be changed is disclosed in Computer Architecture: A Quantitative Approach, Morgan Kaufmann Publishers, pp. 238-245 (Reference 5). In that conventional example, the control logic is composed of a kind of software called a microprogram, and the instruction set is changed by storing the microprogram in rewritable RAM or the like.

[0009]

Problems to be Solved by the Invention

In the prior art disclosed in Reference 3, only some of the peripheral circuits are programmable, and the processor core is built from hard-wired logic, so the user cannot change the instruction set. As Reference 4 suggests, a general-purpose instruction set may not provide sufficient performance for some applications.

[0010] Of course, if the processor core itself is made programmable, the instruction set can be changed freely. However, for the same semiconductor process, an arithmetic unit formed on programmable logic is generally said to be about three times slower than a hard-wired arithmetic unit made with a dedicated mask pattern (Reference 2), so performance degradation is unavoidable. The prior art disclosed in Reference 5 also suffers from low operating speed, because the memory, such as ROM, that stores the microprogram is slow.

[0011] Thus, there has conventionally been a trade-off between the operation speed of a processor and its flexibility (whether or not its instruction set can be freely changed).

[0012] An object of the present invention is to provide a microprocessor that has speed performance comparable to that of a full-custom processor core and whose instruction set can nevertheless be changed to suit the user application with a short TAT, just as if it were built from a programmable device such as an FPGA, or to provide an SoC having such a processor core.

[0013] Another object of the present invention is to provide a microprocessor that can switch to an instruction set optimized for each application and perform processing with that optimized instruction set, or to provide an SoC having such a processor core.

[0014]

Means for Solving the Problems

To achieve the above objects, a preferred aspect of the present invention is a microprocessor that has a data path section and a control section on the same chip, in which the control section is built from programmable logic such as an FPGA and the data path section is built from hard-wired logic rather than programmable logic.

[0015] Further, in a data processing apparatus equipped with such a microprocessor, the programmable logic of the microprocessor is rewritten so that processing is performed with an instruction set suited to the application program.

[0016]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The variable instruction set microprocessor of the present invention is described in detail below with reference to several embodiments shown in the drawings. In the following, the same reference numerals denote the same or similar items.

<First Embodiment>

FIG. 1 is a block diagram of the variable instruction set microprocessor of the present invention. The variable instruction set microprocessor C10 includes programmable logic PL1 and hard-wired logic.

[0017] FIG. 20 shows a configuration example of the programmable logic PL1. The programmable logic PL1 consists of a plurality of logic blocks LB1, LB2, ..., vertical wiring layers VL1, VL2, ..., horizontal wiring layers HL1, HL2, ..., and switch matrices CS1, CS2, ..., MS1, MS2, ... that set the connections between the vertical/horizontal wiring layers and the logic blocks. Each switch matrix consists of a plurality of programmable switches. FIG. 21(a) shows a configuration example of a programmable switch. The connection state of the wiring layers is set by controlling the conduction/non-conduction of an nMOS transistor T100 with a connection information holding memory M100. The connection information holding memory M100 belongs to the circuit configuration information holding unit CM1 (FIG. 1) and is built from SRAM or from a non-volatile memory such as flash memory. Its contents can be rewritten from outside via the rewrite interface CI1, and rewriting them allows the connections between wires to be changed arbitrarily. To prevent the signal voltage level on the wiring layer from dropping by the threshold voltage of the nMOS transistor T100, it is effective to give the switch a complementary configuration of nMOS T100 and pMOS T110, as shown in FIG. 21(b). Furthermore, buffers (inverter pairs IV200 + IV201, IV202 + IV203, IV204 + IV205) may be inserted, as shown in FIG. 22. This switch is effective when signal delay in the wiring becomes a problem, for example because the vertical and horizontal wires are long. In a switch of this configuration the buffers restore the signal level, so wiring delay can be kept to a minimum.
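The threshold-voltage drop mentioned for FIG. 21 can be illustrated with a deliberately idealized model. The supply and threshold values below, and the convention that a stored 1 turns the switch on, are assumptions made only for this sketch; they do not come from the patent.

```python
# Idealized voltage model of the programmable switches in FIG. 21.
VDD_MV = 1800   # assumed supply voltage, in millivolts
VTH_MV = 500    # assumed nMOS threshold voltage, in millivolts

def nmos_switch(v_in_mv, config_bit):
    """FIG. 21(a): one nMOS pass transistor gated by the configuration bit."""
    if not config_bit:
        return None                      # switch off: the two wires are not connected
    # With its gate at VDD, an nMOS cannot pull the output above VDD - Vth.
    return min(v_in_mv, VDD_MV - VTH_MV)

def cmos_switch(v_in_mv, config_bit):
    """FIG. 21(b): nMOS/pMOS pair; the pMOS passes high levels without the drop."""
    if not config_bit:
        return None
    return v_in_mv

high_through_nmos = nmos_switch(VDD_MV, 1)   # degraded high level
high_through_cmos = cmos_switch(VDD_MV, 1)   # full high level
low_through_nmos = nmos_switch(0, 1)         # low levels pass unchanged
```

In this model only the high level is degraded by the single-transistor switch, which is why the complementary pair (or a restoring buffer, as in FIG. 22) is attractive.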

[0018] FIG. 23 shows a configuration example of the logic blocks LB1, LB2, .... A logic block LB consists of basic logic elements LUT (Look Up Table) LU300, LU301, LU302, selectors S300, S301, ... that select an arbitrary input from a plurality of input terminals, a D-type flip-flop D300, and memories M300 to M303 that store circuit configuration information. A basic logic element LUT is a logic element that can realize an arbitrary logic function by rewriting the contents of its built-in memory. FIG. 24 shows a configuration example of the basic logic element LUT. The conduction/non-conduction of transistors T400 to T441 is determined by the logic values applied to inputs in1 to in4, and the value held in the corresponding one of the logic value holding memories M400 to M415, which are built from SRAM or flash memory, is output. The logic value holding memories M400 to M415 belong to the circuit configuration holding unit CM1, and their contents can be rewritten from outside via the rewrite interface CI1. In the basic logic element LUT, a logic circuit with an arbitrary logic function can be realized by rewriting the contents of the logic value holding memories. For example, the output value that the desired logic circuit should produce when the four inputs in1 to in4 are "0000" is stored in logic value holding memory M400. Likewise, the output value for the input "1000" is stored in logic value holding memory M401. Storing the output values for the remaining input patterns in logic value holding memories M402 to M415 in the same way yields a logic circuit with the intended logic function. By setting the contents of the logic value holding memories appropriately, any logic function of four or fewer inputs can be realized, regardless of its type.
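The lookup-table principle described above can be sketched in a few lines of Python. This is a behavioral illustration only, not the transistor-level circuit of FIG. 24; the class and function names are hypothetical, and the input ordering (in1 as the least significant index bit, so that "1000" selects the entry corresponding to M401) is an assumption inferred from the example in the text.

```python
class Lut4:
    """4-input lookup table: 16 configuration bits, one per input pattern."""

    def __init__(self, bits):
        if len(bits) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 configuration bits")
        self.bits = list(bits)   # bits[k] plays the role of memory M400 + k

    def __call__(self, in1, in2, in3, in4):
        # Assumed ordering: in1 is the least significant index bit, so the
        # pattern in1=1, in2=in3=in4=0 ("1000") selects entry 1 (M401).
        index = in1 | (in2 << 1) | (in3 << 2) | (in4 << 3)
        return self.bits[index]

def program(func):
    """Build the 16 configuration bits for an arbitrary 4-input function."""
    return [func(k & 1, (k >> 1) & 1, (k >> 2) & 1, (k >> 3) & 1)
            for k in range(16)]

# Two different logic functions from the same hardware, differing only in contents.
xor4 = Lut4(program(lambda a, b, c, d: a ^ b ^ c ^ d))
and4 = Lut4(program(lambda a, b, c, d: a & b & c & d))
```

The same `Lut4` object type implements both functions; only the 16 stored bits differ, which is the property the patent relies on when it rewrites CM1.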

[0019] Thus, the basic logic element LUT realizes arbitrary combinational logic circuits, and a logic block LB, because it includes the flip-flop D300, realizes arbitrary sequential circuits. Furthermore, larger logic functions can be realized by setting the programmable switches described above appropriately and connecting a plurality of such logic blocks LB. In this way, by rewriting the contents of the connection information holding memories and the logic value holding memories in the circuit configuration holding unit CM1, the programmable logic PL1 can be rewritten into a logic circuit with the intended logic function.
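As a small illustration of how a LUT plus a flip-flop gives a sequential circuit, the sketch below wires two simplified logic blocks into a 2-bit counter. The `LogicBlock` class is a hypothetical reduction of the block in FIG. 23 (one LUT, one flip-flop, no selectors), and the wiring between the blocks stands in for the programmable switches.

```python
class LogicBlock:
    """One 4-input LUT feeding one D flip-flop (simplified stand-in for LB)."""

    def __init__(self, lut_bits):
        self.lut_bits = list(lut_bits)   # 16 truth-table bits
        self.q = 0                       # flip-flop state

    def lut(self, a, b, c, d):
        return self.lut_bits[a | (b << 1) | (c << 2) | (d << 3)]

def make_bits(func):
    """Truth table of a 4-input function, in LUT index order."""
    return [func(k & 1, (k >> 1) & 1, (k >> 2) & 1, (k >> 3) & 1)
            for k in range(16)]

# 2-bit counter: bit 0 toggles on every clock, bit 1 toggles when bit 0 is 1.
bit0 = LogicBlock(make_bits(lambda q0, q1, c, d: q0 ^ 1))
bit1 = LogicBlock(make_bits(lambda q0, q1, c, d: q0 ^ q1))

def clock():
    # Evaluate both LUTs from the current state, then update both flip-flops together.
    next0 = bit0.lut(bit0.q, bit1.q, 0, 0)
    next1 = bit1.lut(bit0.q, bit1.q, 0, 0)
    bit0.q, bit1.q = next0, next1
    return bit0.q | (bit1.q << 1)

counts = [clock() for _ in range(5)]   # the counter wraps after reaching 3
```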

[0020] The hard-wired logic includes the circuit elements that make up the data path section: an external bus interface IB1, a program counter PC1, an instruction register IR1, a register file RF1, switch matrices S1 and S2, an ALU (Arithmetic Logic Unit) A1, and an arithmetic unit A2. The ALU A1 implements various operations such as addition, subtraction, logical operations (AND, OR, XOR, etc.), and bit shifts, and executes them selectively under control line CL11. The arithmetic unit A2 implements operations of types not implemented in the ALU, such as multiplication and floating-point operations, and executes them selectively under control line CL12. The register file RF1 consists of a plurality of registers that store operand data, result data, and the like. It is configured so that the register specified by the register selection terminals can be written via the in terminal and read via the out1/out2 terminals. The registers to be written and read are specified via control line CL10. The switch matrices S1 and S2 each consist of a plurality of selectors controlled by control terminals, and the connections between their inputs and outputs can be set. Specifically, switch matrix S1 selects, according to control line CL13, either data lines DL10 to DL13 from the register file RF1 or data lines DL14 to DL17 from the input logic PC20 configured on the programmable logic (described later), and feeds the selection to the ALU A1 and/or the arithmetic unit A2. Similarly, switch matrix S2 selects, according to control line CL14, among data line DL20 from the arithmetic unit A2, data line DL22 from the ALU A1, and data lines DL21 and DL23 from the output logic PC30 configured on the programmable logic (described later), and writes the selection to the register file RF1 via data line DL50. The program counter PC1 holds the memory address of the instruction to be executed next. Each time an instruction is executed, this address is counted up via control line CL15 and updated to the memory address of the next instruction to be executed. The external bus interface IB1 reads into the processor the instruction at the address in memory MEM1 specified by the program counter PC1. It also writes the contents of the register file RF1 to memory MEM1 and passes them to peripheral circuits PF1, PF2, and so on. The instruction register IR1 holds the instruction read in by the external bus interface IB1.
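The data path described above can be approximated by a short behavioral model. This is a sketch under stated assumptions: a 32-bit data width, string-valued stand-ins for the control lines, and a `select` helper that reduces each switch matrix to a single selector. None of these names come from the patent.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit data width

def alu_a1(op, a, b):
    """ALU A1: the operation is chosen by the value on control line CL11."""
    ops = {
        "add": a + b, "sub": a - b,
        "and": a & b, "or": a | b, "xor": a ^ b,
        "shl": a << (b & 31),
    }
    return ops[op] & MASK32

def unit_a2(op, a, b):
    """Arithmetic unit A2: only multiplication is modeled here."""
    if op != "mul":
        raise ValueError(f"operation {op!r} is not modeled")
    return (a * b) & MASK32

def select(sources, choice):
    """A switch matrix reduced to one selector: pick one named source."""
    return sources[choice]

regs = [0] * 32          # register file RF1
regs[1], regs[2] = 6, 7

# S1 picks register-file outputs rather than the programmable input logic PC20.
a = select({"rf": regs[1], "pc20": 0}, "rf")
b = select({"rf": regs[2], "pc20": 0}, "rf")

# S2 picks which result is written back to the register file.
regs[3] = select({"alu": alu_a1("add", a, b), "a2": unit_a2("mul", a, b)}, "a2")
regs[4] = select({"alu": alu_a1("add", a, b), "a2": unit_a2("mul", a, b)}, "alu")
```

Both functional units compute in parallel here and the selector decides which result is kept, which mirrors the role of S2 in the text.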

[0021] The individual circuits of this hard-wired logic can themselves be realized with known circuit configurations. The hard-wired logic listed above is an example; omitting some of the listed circuit elements, or including circuit elements other than those listed, is of course permitted.

[0022] On the programmable logic PL1, control circuits that control the circuit elements of the data path section are configured. Specifically, these include an instruction decoder PC10, a program counter control circuit PC11, a register file control circuit PC12, an ALU/arithmetic unit control circuit PC13, and a switch matrix control circuit PC14. These circuits are examples, however; depending on the instruction set to be realized, omitting some of them or including circuit elements other than those listed is of course permitted.

[0023] For example, in addition to these control circuits, the programmable logic PL1 may include an input logic circuit PC20 and/or an output logic circuit PC30 between the register file and the ALU/arithmetic unit, in order to apply simple arithmetic processing to the data of the register file RF1. It also includes an input/output logic control circuit PC15, which controls the logic circuits PC20/PC30 according to the instruction set.

[0024] As described later, the variable instruction set microprocessor of the present invention can support various instruction sets by rewriting the circuit configurations of these control circuits. The circuit configuration is rewritten by rewriting the contents of the circuit configuration holding unit CM1, which holds the logic function of each LUT on the programmable logic and the connection information of the programmable switches that connect the LUTs (in the configuration example described with FIGS. 20 to 24, CM1 includes the connection information holding memories, the selection information holding memories, and the logic value holding memories). The contents of CM1 are rewritten from outside the chip via the rewrite terminal CW0 and the circuit configuration rewrite interface CI1. For example, the JTAG interface commonly used in FPGAs (Reference 3) can be used as the circuit configuration rewrite interface CI1.

[0025] Furthermore, a peripheral circuit PF1 of the microprocessor can be formed in an unused area of the programmable logic. Integrating peripheral circuits on the same chip reduces the number of chips in the overall system.

[0026] Thus, in the variable instruction set microprocessor C10 of the present invention, the circuit configurations of the control circuits and arithmetic processing circuits configured on the programmable logic can be rewritten, so the instruction format of the supported instruction set can be changed and new instructions can be added. Meanwhile, the ALU A1, the arithmetic unit A2, and so on are hard-wired logic and therefore operate at high speed, which makes it possible to achieve both the speed and the flexibility of the processor.
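The division of labor between a fixed data path and rewritable control logic can be sketched as follows. This is a software analogy, not the patent's hardware: the `ControlLogic` class and its `config` table are hypothetical stand-ins for the control circuits on PL1 and the contents of CM1, and the SUB opcode "000011" is invented purely for illustration.

```python
# Fixed ("hard-wired") data path operations: these never change.
DATAPATH = {
    "alu_add": lambda a, b: (a + b) & 0xFFFFFFFF,
    "alu_sub": lambda a, b: (a - b) & 0xFFFFFFFF,
    "a2_mul":  lambda a, b: (a * b) & 0xFFFFFFFF,
}

class ControlLogic:
    """Stand-in for the control circuits on PL1; `config` plays the role of CM1."""

    def __init__(self, config):
        self.config = dict(config)

    def rewrite(self, new_config):
        # Corresponds to loading new contents through the rewrite interface CI1.
        self.config = dict(new_config)

    def execute(self, opcode, a, b):
        if opcode not in self.config:
            raise ValueError(f"opcode {opcode:06b} is not in the current instruction set")
        return DATAPATH[self.config[opcode]](a, b)

# Instruction set with ADD and MUL only.
control = ControlLogic({0b000001: "alu_add", 0b000010: "a2_mul"})
before = control.execute(0b000010, 6, 7)

# Hypothetical second instruction set that adds SUB as opcode 000011.
control.rewrite({0b000001: "alu_add", 0b000010: "a2_mul", 0b000011: "alu_sub"})
after = control.execute(0b000011, 7, 6)
```

Only the control table changes between the two instruction sets; the arithmetic functions are untouched, which is the property that lets the hard-wired units keep their speed.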

[0027] The following describes, using a simple example, how a defined instruction set is implemented on the variable instruction set microprocessor C10 of the present invention and how each circuit element operates in that case. Here, an instruction set means the set of instructions that the microprocessor can execute.

(A) Processor implementing a given instruction set

FIG. 2 shows an example instruction set (IS1 and IS2) consisting of an addition instruction ADD and a multiplication instruction MUL.

[0028] Each instruction consists of a field called the opcode, which indicates the type of instruction, and fields that indicate the identifiers of the registers to be operated on (called register numbers here). For example, the ADD instruction IE1, which adds the contents of registers Rs and Rt and writes the result to register Rd, is defined by the format IS1. Similarly, the MUL instruction IE2, which multiplies the contents of registers Rs and Rt and writes the result to register Rd, is defined by the format IS2. The upper 6 bits are assigned to the opcode; in the example of FIG. 2, the ADD instruction is "000001" and the MUL instruction is "000010". A 5-bit field is assigned to each of the register numbers Rs, Rt, and Rd.

[0029] The control circuits analyze these fields and control which type of operation is performed on which registers in the register file. In the example instruction set of FIG. 2, the instruction length is 32 bits, of which 6 bits are allocated to the opcode and 5 bits to each of the Rs, Rt, and Rd register fields. The format can therefore support up to 2^6 = 64 instructions and 2^5 = 32 registers. Widening the fields increases the number of instructions and registers that can be supported.
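The field arithmetic above can be checked with a small encoder. The opcode values and field widths come from the text; the positions of Rs, Rt, and Rd below the opcode are not stated in this section, so the packing order used here is an assumption for illustration, as is the `encode` helper itself.

```python
# Field widths from the text: 6-bit opcode and 5-bit Rs, Rt, Rd in a 32-bit word.
OPCODE_BITS, REG_BITS, WORD_BITS = 6, 5, 32
OPCODE_SHIFT = WORD_BITS - OPCODE_BITS   # the opcode occupies the upper 6 bits
RS_SHIFT = OPCODE_SHIFT - REG_BITS       # assumed: Rs directly below the opcode
RT_SHIFT = RS_SHIFT - REG_BITS           # assumed: then Rt
RD_SHIFT = RT_SHIFT - REG_BITS           # assumed: then Rd

OPCODES = {"ADD": 0b000001, "MUL": 0b000010}

def encode(mnemonic, rs, rt, rd):
    """Pack one instruction word; rejects register numbers that do not fit."""
    for reg in (rs, rt, rd):
        if not 0 <= reg < (1 << REG_BITS):
            raise ValueError(f"register number {reg} does not fit in {REG_BITS} bits")
    return ((OPCODES[mnemonic] << OPCODE_SHIFT) | (rs << RS_SHIFT)
            | (rt << RT_SHIFT) | (rd << RD_SHIFT))

max_instructions = 1 << OPCODE_BITS   # 2**6 distinct opcodes
max_registers = 1 << REG_BITS         # 2**5 distinct register numbers

word = encode("ADD", 1, 2, 3)
print(f"{word:032b}")
```

With these widths, 6 + 3 × 5 = 21 bits are used, leaving 11 bits of the 32-bit word unassigned in this sketch.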

[0030] The way instructions are encoded is not limited to the format of FIG. 2. For example, the opcode can instead be assigned to the lower 6 bits. There are also instructions that specify an address directly and operate on that address in memory. In such cases, not just a register number but the address itself must be written directly into the instruction. These cases too can be realized on the variable instruction set microprocessor of the present invention by a method similar to the one described below.

[0031] FIG. 3 shows an example in which a processor that executes the instruction set of FIG. 2 is realized with the variable instruction set microprocessor C10. FIG. 3 shows only the minimum circuit elements needed to execute the instruction set of FIG. 2. On the programmable logic PL1, an instruction decoder PC10a, a program counter control circuit PC11a, a register file control circuit PC12a, an ALU/arithmetic unit control circuit PC13a, and a switch matrix control circuit PC14a are configured so as to realize the ADD and MUL instructions of FIG. 2.

[0032] As in an ordinary microprocessor, processing starts by reading an instruction from memory MEM1 via the external bus interface IB1. The instruction read in this way is stored in the instruction register IR1 and analyzed by the instruction decoder PC10a configured on the programmable logic PL1.

[0033] From the instruction transferred to the instruction register IR1, the instruction decoder PC10a separates the upper 6 bits, which correspond to the opcode, from the bits that correspond to the Rs/Rt/Rd addresses. It outputs the upper 6 bits to the ALU/arithmetic unit control circuit PC13a and the switch matrix control circuit PC14a, and the address data to the register file control circuit PC12a.
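The field separation performed by the instruction decoder can be sketched in software. The text only states that the opcode occupies the upper 6 bits; the 32-bit word width and the positions and 5-bit widths of the Rs/Rt/Rd fields below are assumptions made for illustration, as are the names `Decoded`, `decode`, and `encode`.

```python
from typing import NamedTuple

class Decoded(NamedTuple):
    opcode: int  # upper 6 bits, as stated in the text
    rs: int      # field position assumed: bits 25..21
    rt: int      # field position assumed: bits 20..16
    rd: int      # field position assumed: bits 15..11

def decode(instr: int) -> Decoded:
    """Split a 32-bit instruction word into opcode and register fields."""
    instr &= 0xFFFFFFFF
    return Decoded(
        opcode=(instr >> 26) & 0x3F,
        rs=(instr >> 21) & 0x1F,
        rt=(instr >> 16) & 0x1F,
        rd=(instr >> 11) & 0x1F,
    )

def encode(opcode: int, rs: int, rt: int, rd: int) -> int:
    """Inverse of decode, used here only to build test words."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11)
```

In this sketch the opcode would go to the ALU/arithmetic unit and switch matrix control logic, and the three register numbers to the register file control logic.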

[0034] The register file control circuit PC12a decodes the Rs/Rt/Rd address data passed from the instruction decoder and selects the corresponding registers in the register file RF1 via the control line CL10.

[0035] If the 6-bit signal corresponding to the opcode is "000001", the ALU/arithmetic unit control circuit PC13a activates the control line CL11 of the ALU A1 and starts the ALU in addition mode. If the signal from the instruction decoder is "000010", it activates the control line CL12 of the multiplier A2.

[0036] Via the control line CL13, the switch matrix control circuit PC14a sets the switch matrix S1 to select the data lines (DL10, DL11) and (DL12, DL13), which connect the register file RF1 directly to the switch matrix S1. Via the control line CL14, it sets the switch matrix S2 to select the data line DL22 from the ALU if the signal from the instruction decoder is "000001", and the data line DL20 from the multiplier if the signal is "000010".

[0037] For either opcode from the instruction decoder, "000001" or "000010", the program counter control circuit PC11a increments the program counter PC1 via the control line CL15, setting it to the memory address of the next instruction to be executed.

[0038] Through the above operations, the instruction set variable microprocessor of the present invention implements the instruction set shown in FIG. 2. For each control circuit, circuit data (here called a netlist) is created so as to implement the processing described above, and writing these netlists into the programmable logic PL1 yields a processor that executes the target instruction set. Circuits can be synthesized onto the programmable logic with the logic synthesis tools used in ordinary LSI design, so known methods can be used. These methods can be applied as they are to the instruction set variable microprocessor of the present invention.
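The combined behaviour of the control circuits for ADD and MUL can be modelled roughly as follows. This is a behavioural sketch, not the netlist itself: the opcodes and the control and data line names come from the text, while the 32-bit wraparound, the increment of the program counter by one word, and the function names are assumptions.

```python
MASK32 = 0xFFFFFFFF
OP_ADD = 0b000001  # opcode values as given in the text
OP_MUL = 0b000010

def control_signals(opcode):
    """Control-line settings for one opcode, mirroring PC13a, PC14a, PC11a."""
    if opcode == OP_ADD:
        return {"CL11": True, "CL12": False, "S2": "DL22", "pc_inc": True}
    if opcode == OP_MUL:
        return {"CL11": False, "CL12": True, "S2": "DL20", "pc_inc": True}
    raise ValueError(f"unsupported opcode {opcode:06b}")

def step(regs, pc, opcode, rs, rt, rd):
    """Execute one instruction on a register list and return the next PC."""
    sig = control_signals(opcode)
    a, b = regs[rs], regs[rt]
    alu_out = (a + b) & MASK32 if sig["CL11"] else None  # ALU in add mode
    mul_out = (a * b) & MASK32 if sig["CL12"] else None  # multiplier
    # Switch matrix S2 picks which unit's output is written back to Rd.
    regs[rd] = alu_out if sig["S2"] == "DL22" else mul_out
    return pc + 1 if sig["pc_inc"] else pc
```

For example, with `regs = [0, 6, 7, 0, 0]`, calling `step(regs, 0, OP_ADD, 1, 2, 3)` writes 13 to `regs[3]`.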

[0039] Another type of instruction found in real instruction sets specifies the address of the next instruction to be executed (for example, a jump instruction). A jump instruction encodes the address of the next instruction in a specific field of the instruction format. To support such a jump instruction, the program counter control circuit PC11a is configured so that, via the control line CL15, it resets the contents of the program counter PC1 to the jump destination address.

[0040] Because the instruction decoder and the other control circuits are built from programmable logic such as an FPGA, the circuit configuration of the control circuits can be changed even after the chip has been manufactured, so the supported instructions can be changed flexibly.

[0041] For example, suppose that, whereas the ADD instruction in the instruction set of FIG. 2 is "000001", there is another instruction set in which the ADD instruction is "100000". In general, different types of microprocessor use different opcodes even for instructions that perform the same operation, so a program written for one processor could not run on another. With the processor of the present invention, however, it suffices to modify the control circuits configured on the programmable logic as follows. First, the ALU/arithmetic unit control circuit PC13a is modified so that, if the 6-bit signal corresponding to the opcode is "100000", it activates the control line CL11 of the ALU A1 and starts the ALU in addition mode. Likewise, the switch matrix control circuit PC14a is modified so that, if the 6-bit opcode signal is "100000", the control line CL14 to the switch matrix S2 selects the data line DL22 from the ALU.
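The idea that the same fixed datapath can serve a different opcode assignment once the control logic is rewritten can be illustrated with a table-driven sketch. The dictionary stands in for the netlist written to the programmable logic; `DATAPATH`, `make_processor`, and the choice to leave the MUL opcode unchanged in the second instruction set are illustrative assumptions.

```python
MASK32 = 0xFFFFFFFF

# Fixed (hard-wired) arithmetic units; these never change.
DATAPATH = {
    "add": lambda a, b: (a + b) & MASK32,
    "mul": lambda a, b: (a * b) & MASK32,
}

def make_processor(opcode_map):
    """Build an execute function from an opcode-to-unit table.

    The table plays the role of the reconfigurable control logic.
    """
    def execute(opcode, a, b):
        return DATAPATH[opcode_map[opcode]](a, b)
    return execute

# Instruction set of FIG. 2: ADD is 000001.
fig2_cpu = make_processor({0b000001: "add", 0b000010: "mul"})
# Another instruction set in which ADD is 100000.
other_cpu = make_processor({0b100000: "add", 0b000010: "mul"})
```

Only the table passed to `make_processor` differs between the two processors, which mirrors the point that the arithmetic units themselves need no change.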

[0042] Beyond the opcode, the instruction format itself may also differ. For example, an instruction may operate on the data in registers Rd and Rs and write the result back to register Rd. In this case, the following modifications are made. First, the instruction decoder PC10a is modified so that it separates the bits corresponding to the Rs/Rd addresses from the instruction transferred to the instruction register IR1 and outputs the address data to the register file control circuit PC12a. The register file control circuit PC12a is modified so that it decodes the Rs/Rd address data passed from the instruction decoder and selects the corresponding registers in the register file RF1 via the control line CL10.

[0043] In other words, the instruction set does not have to be finalized before the chip is designed. The instruction set can therefore be defined independently of, and in parallel with, the design and manufacture of the chip. Furthermore, if a defect is found in the instruction set, there is no need to redo the design and manufacture of the chip, as there would be for an ordinary microprocessor chip; the circuit configuration of the control circuits is simply changed to match the corrected instruction set.

(B) Pipeline-controlled processor

To increase speed, ordinary microprocessors widely adopt pipelined operation, in which processing is divided into blocks that are executed in separate clock cycles. There are various ways to divide a pipeline; the example of FIG. 4 uses five stages: instruction fetch IF, instruction decode ID, execute EX, memory access MEM, and register write-back WB. Dividing the processing into multiple stages simplifies what must be done in each clock cycle, so the clock frequency can be raised and processing sped up.

[0044] FIG. 5 shows an example in which the instruction set variable microprocessor of the present invention is sped up by such pipelined operation; it shows the circuit elements that operate in each of the IF/ID/EX/MEM/WB stages. The processing performed by each circuit element is the same as in FIG. 3.

[0045] In the IF stage, an instruction is read via the external bus interface IB1 from the memory address specified by the program counter PC1. The instruction read is held in the pipeline register PR10. The pipeline registers PR10 to PR13 consist of hard-wired storage elements such as latches and flip-flops; each holds the data passed from the preceding stage and passes it to the following stage, which starts operating in the next clock cycle. Pipeline registers of exactly the same construction as those used in ordinary microprocessors can be used. The instruction read is passed through the pipeline register PR10 and input to the instruction decoder PC10a in the following ID stage.

[0046] In the ID stage, the instruction decoder and the various control circuits configured on the programmable logic generate the control signals that control the registers, arithmetic units, and so on. In the ID stage, only the Rt/Rs selection signals generated by the register file control circuit PC12a are passed to the register file RF1. The other control signals, namely the Rd selection signal, the ALU/arithmetic unit control signals, and the switch matrix control signals, are needed in the EX and WB stages in later clock cycles, so they must be held until the clock cycle in which they are needed. These control signals are therefore input to and held in the delay circuits LD10 to LD14, which consist of storage elements such as latches and flip-flops. The same applies to the program counter control signal.
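The role of the delay circuits, holding a control signal generated in the ID stage until the later stage that consumes it, can be sketched as a simple shift register. The class name `DelayLine` and the depth of three clocks for the Rd selection signal (ID to WB in a five-stage pipeline) are assumptions for illustration; the text does not state the depth of each delay circuit.

```python
from collections import deque

class DelayLine:
    """Shift register that returns each value `depth` clocks after it is input."""

    def __init__(self, depth):
        # Slots start empty (None) until real control signals arrive.
        self.slots = deque([None] * depth, maxlen=depth)

    def clock(self, value):
        """Advance one clock: output the oldest value, then store the new one."""
        out = self.slots[-1]
        self.slots.appendleft(value)  # the oldest entry falls off the right end
        return out

# Rd selection generated in ID is consumed in WB, three clocks later (assumed).
rd_delay = DelayLine(depth=3)
outputs = [rd_delay.clock(sel) for sel in ["rd3", "rd4", "rd5", "rd6"]]
```

In this run the first selection, `"rd3"`, emerges on the fourth clock, when the instruction that produced it would reach the WB stage.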

[0047] In the EX stage, the ALU/arithmetic units perform the operation. They are controlled by the control signals held in the delay circuits LD10 and LD11. The result is held in the pipeline register PR12. The contents of the program counter PC1 are also updated according to the program counter control signal.

[0048] When the result is to be written to memory, it is written via the external bus interface IB1 in the MEM stage. When it is to be written to a register, the destination Rd register is selected in the WB stage on the basis of the held control signal, and the result is written to it. Pipelined operation is thus achieved in the instruction set variable microprocessor of the present invention.

[0049] As noted above, programmable logic such as an FPGA is said to be about three times slower than a full-custom chip such as an ASIC when a circuit of the same function is built in the same semiconductor process. FIG. 6 shows an analysis of the pipeline control of FIG. 4. In the pipeline control of FIG. 4, the IF, EX, MEM, and WB stages are processed by hard-wired logic (hard macros), whereas the ID stage is processed by programmable logic. The delay of the ID stage may therefore be larger than that of the other stages and limit the clock frequency of the microprocessor, because in pipelined operation the clock frequency must be matched to the stage with the largest delay.

[0050] Therefore, the ID stage, in which the programmable logic mainly operates, is divided into an ID1 stage and an ID2 stage. This raises the clock frequency and eliminates idle time in the other stages.

[0051] FIG. 7 shows the operation of the instruction set variable microprocessor under the subdivided pipeline control shown in FIG. 6. In the example of FIG. 7, the ID stage shown in FIG. 5 is divided into an ID1 stage, in which the instruction decoder PC10a operates, and an ID2 stage, in which the program counter control circuit PC11a, register file control circuit PC12a, ALU/arithmetic unit control circuit PC13a, and switch matrix control circuit PC14a operate. Dividing the ID stage in two can reduce the delay of each of the ID1 and ID2 stages to about half. Consequently, when the ID stage limits the clock frequency of the chip, the clock frequency can be roughly doubled. The stage may also be divided further, for example into three or more stages, and the division points may be chosen to suit the delay of each circuit element.
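The effect of splitting the ID stage on the achievable clock frequency can be checked with a small calculation. The stage delays below are purely hypothetical numbers chosen so that the ID stage is the bottleneck; they are not taken from the text, and `max_clock_mhz` is an illustrative helper.

```python
def max_clock_mhz(stage_delays_ns):
    """Highest clock frequency allowed by the slowest pipeline stage."""
    return 1000.0 / max(stage_delays_ns.values())

# Hypothetical delays: the programmable-logic ID stage is twice as slow
# as the hard-wired stages.
five_stage = {"IF": 5.0, "ID": 10.0, "EX": 5.0, "MEM": 5.0, "WB": 5.0}

# Same pipeline with ID split into ID1 and ID2, each about half the delay.
six_stage = {"IF": 5.0, "ID1": 5.0, "ID2": 5.0, "EX": 5.0, "MEM": 5.0, "WB": 5.0}

before = max_clock_mhz(five_stage)  # limited by the 10 ns ID stage
after = max_clock_mhz(six_stage)    # every stage now takes 5 ns
```

With these assumed delays the clock limit doubles, and the hard-wired stages no longer sit idle for half of each cycle. If ID were still slower than the other stages after one split, a further split would be needed to gain more.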

[0052] Increasing the number of pipeline stages enables faster operation, but it may also enlarge the circuit, for example because more pipeline registers are needed. In the present invention, however, the arithmetic units and the like are implemented as hard macros, and the only stages that may limit the speed are those that generate their control signals. The extra stage divisions needed because programmable logic is used can therefore be kept few and, although the area increases slightly, speed comparable to that of a full-custom ASIC can be obtained.

(C) Processor implementing types of arithmetic instruction not built into the ALU or arithmetic units

This example takes a simple instruction set and shows that the instruction set variable microprocessor of the present invention can also implement instructions for types of operation that are not built into the hard-wired ALU and arithmetic units. The implementation of the instruction set shown in FIG. 8 is described below. The instruction set of FIG. 8 adds a new instruction (IS3, IE3) to the instruction set of FIG. 2. As IE3 shows, the added instruction IS3 adds the upper 16 bits and the lower 16 bits of the register specified by Rs as 16-bit data and writes the result to the register specified by Rd.

[0053] When the built-in ALU and arithmetic units do not support such an operation mode, an instruction like this is normally implemented by using the ALU/arithmetic units several times. For example, basic instructions are executed over four cycles:
(1) copy the contents of Rs to another register Rt;
(2) shift Rt right by 16 bits;
(3) set the upper 16 bits of Rs to 0;
(4) add Rs and Rt and write the result to Rd.
However, as explained in the prior art section, the modulation and demodulation of digital signals required in communication applications must execute at high speed instructions of the type that shift the bit positions of data before operating on it (see, for example, Reference 4). It is therefore desirable that such bit manipulation be executable at high speed in as few cycles as possible.
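The four-step sequence of basic instructions can be written out as follows. This is a sketch under assumptions: registers are unsigned 32-bit values, the shift is logical, and the carry out of the 16-bit addition is kept in the 32-bit result, none of which the text specifies. The function name `sadd_basic` is illustrative.

```python
MASK32 = 0xFFFFFFFF

def sadd_basic(regs, rs, rt, rd):
    """Emulate SADD with four basic instructions; returns the instruction count.

    Note that, as in the sequence described, Rs and Rt are overwritten.
    """
    regs[rt] = regs[rs]                        # (1) copy Rs to Rt
    regs[rt] = (regs[rt] >> 16) & MASK32       # (2) shift Rt right by 16 bits
    regs[rs] = regs[rs] & 0x0000FFFF           # (3) clear the upper 16 bits of Rs
    regs[rd] = (regs[rs] + regs[rt]) & MASK32  # (4) add and write to Rd
    return 4

regs = [0x00030004, 0, 0]
count = sadd_basic(regs, rs=0, rt=1, rd=2)  # upper half 3 plus lower half 4
```

Besides taking four instructions, the sequence destroys the original contents of Rs and uses Rt as scratch space, which is a further cost of not having the operation in hardware.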

[0054] In the instruction set variable microprocessor shown in FIG. 1, an input logic circuit PC20 and an output logic circuit PC30, both built from programmable logic, are provided between the register file and the ALU/arithmetic units. By having these logic circuits perform processing such as the bit manipulation above, operations that the ALU does not support can be executed at high speed.

[0055] FIG. 9 shows an example in which the instruction set of FIG. 8 is implemented using the function described above. As shown in FIG. 9, this is done by programming the ALU/arithmetic unit control circuit PC13d, the switch matrix control circuit PC14d, the input logic circuit PC20d, and the input/output logic control circuit PC15d onto the programmable logic. The operation of each circuit is described below as far as it relates to the SADD instruction. For the ADD and MUL instructions, the circuits are configured to operate as in FIG. 3.

[0056] The ALU/arithmetic unit control circuit PC13d is configured so that, when the signal from the instruction decoder is "000100", corresponding to the SADD instruction, it activates the control line CL11 and starts the ALU in addition mode.

[0057] The switch matrix control circuit PC14d is configured so that, when the signal from the instruction decoder is "000100", the opcode of the SADD instruction, it makes the switch matrix S1 select the data lines DL14 and DL15 via the control line CL13 and makes the switch matrix S2 select the data line DL22 via the control line CL14.

[0058] Only when the signal from the instruction decoder PC10a is "000100", the opcode of the SADD instruction, does the input/output logic control circuit PC15d activate the input logic circuit PC20d via the control line CL20.

[0059] When activated by the control line CL20, the input logic circuit PC20d splits the 32-bit data line DL10 input at in1 into its upper 16 bits and lower 16 bits and outputs them on the data lines DL14 and DL15, which are connected to out1 and out2. The switch matrix S1 selects these two data lines and feeds them to the ALU, which has been started in addition mode.

[0060] As a result, the ALU outputs "the sum of the upper 16 bits and the lower 16 bits of the register specified by Rs", which is the desired SADD operation shown in FIG. 8.
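The single-pass SADD data path described above can be modelled as a bit split followed by one ALU addition. The function names are illustrative, and the assignment of the upper half to out1 and the lower half to out2, as well as the unsigned 32-bit result, are assumptions; the text only says the two halves are distributed to out1/out2.

```python
MASK32 = 0xFFFFFFFF

def input_logic_split(word32):
    """Model of the input logic circuit: route the two 16-bit halves of in1."""
    out1 = (word32 >> 16) & 0xFFFF  # upper 16 bits (assumed to drive DL14)
    out2 = word32 & 0xFFFF          # lower 16 bits (assumed to drive DL15)
    return out1, out2

def alu_add(a, b):
    """Hard-wired ALU in addition mode."""
    return (a + b) & MASK32

def sadd(regs, rs, rd):
    """SADD in one pass through the input logic and the ALU."""
    out1, out2 = input_logic_split(regs[rs])
    regs[rd] = alu_add(out1, out2)

regs = [0x12340001, 0]
sadd(regs, rs=0, rd=1)  # 0x1234 + 0x0001
```

Because the split is pure wiring and selection, it adds no arithmetic of its own; unlike the basic-instruction sequence, the source register is left unchanged and no scratch register is needed.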

[0061] FIG. 10 shows the data path operation when this SADD instruction is executed. The register whose number is specified by Rs is read, the input logic circuit PC20d separates it into upper 16 bits and lower 16 bits, and the ALU in addition mode adds the two. FIG. 11 shows an example of an equivalent circuit of the input logic circuit PC20d. Programming a logic circuit composed of tri-state buffers G100 to G132 and G200 to G232 as the input logic circuit PC20d provides a circuit that performs the desired operation.

[0062] FIG. 12 shows an example in which the circuit of FIG. 9 is pipelined. Here, as in FIGS. 6 and 7, the pipeline is divided into the stages IF/ID1/ID2/EX/MEM/WB, but other stage divisions are also possible.

[0063] In the EX stage, the delay may increase because the input logic circuit PC20d performs part of the operation. However, the processing done by the input logic circuit PC20d is simple and, as shown in FIG. 11, can be built from a circuit with few logic levels. The increase in delay due to the input logic circuit PC20d is therefore small, and the input logic circuit PC20d is operated within the EX stage. If, on the other hand, complex processing is implemented in the input logic circuit PC20d, it is desirable to implement the control circuits so that only the instructions that use the input logic circuit PC20d become multi-cycle instructions whose EX stage takes several clock cycles. For instructions that do not use the input logic circuit PC20d, the switch matrix S1 connects directly to the ALU/arithmetic units A1 and A2, bypassing the input logic circuit PC20d. Such instructions can therefore execute the EX stage in one clock, and the loss of speed is confined to the instructions concerned.

[0064] Even when an instruction is implemented as a multi-cycle instruction in this way, it can often be executed in fewer clocks than when it is split into several basic instructions. For example, even if the SADD instruction in the example of FIG. 9 were implemented as a multi-cycle instruction requiring two clocks in the EX stage, it would need fewer clocks than executing four basic instructions. As a result, processing is faster.

[0065] As described above, in the instruction set variable microprocessor of the present invention, an input logic circuit built from programmable logic and inserted between the register file and the ALU/arithmetic units performs part of the operation, so instructions for operations that the built-in ALU/arithmetic units do not provide can be executed at high speed.

[0066] Next, taking the instruction set of FIG. 13 as an example, it is explained that more complex instructions can be implemented by using the programmable logic connected to the outputs of the ALU/arithmetic units. The instruction set of FIG. 13 adds a new instruction (IS4, IE4) to the instruction set of FIG. 8. As IE4 shows, the added instruction IS4 multiplies the upper 16 bits and the lower 16 bits of the register specified by Rs as 16-bit data, adds the register specified by Rt to the product, and writes the result to the register specified by Rd. As in the case of Example 3, when the ALU and arithmetic units on the chip do not support such an operation mode, the instruction must normally be implemented by using the ALU/arithmetic units several times. For example, five basic instructions must be executed:
(1) copy the contents of Rs to another register R2;
(2) shift R2 right by 16 bits;
(3) set the upper 16 bits of Rs to 0;
(4) read Rs and R2, multiply them in the multiplier, and write the result to another register R3;
(5) read R3 and Rt, add them in the ALU, and write the result to Rd.
However, the modulation and demodulation of digital signals required in communication systems must execute such multiply-accumulate (MAC) operations at high speed, for example in filtering. For this reason, many microprocessors and digital signal processors include a dedicated arithmetic unit.
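The five-instruction sequence can be sketched in the same way as a baseline for comparison. Unsigned 32-bit registers, a logical shift, and truncation of results to 32 bits are assumptions, and `smac_basic` is an illustrative name.

```python
MASK32 = 0xFFFFFFFF

def smac_basic(regs, rs, rt, rd, r2, r3):
    """Emulate SMAC with five basic instructions; returns the instruction count.

    r2 and r3 are scratch registers; Rs is overwritten in step (3).
    """
    regs[r2] = regs[rs]                        # (1) copy Rs to R2
    regs[r2] = (regs[r2] >> 16) & MASK32       # (2) shift R2 right by 16 bits
    regs[rs] = regs[rs] & 0x0000FFFF           # (3) clear the upper 16 bits of Rs
    regs[r3] = (regs[rs] * regs[r2]) & MASK32  # (4) multiply, write to R3
    regs[rd] = (regs[r3] + regs[rt]) & MASK32  # (5) add Rt, write to Rd
    return 5

regs = [0x00030004, 10, 0, 0, 0]
count = smac_basic(regs, rs=0, rt=1, rd=2, r2=3, r3=4)  # 3 * 4 + 10
```

The sequence needs two scratch registers in addition to the five instructions, which is the overhead the reconfigurable input and output logic is meant to avoid.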

[0067] In contrast, in the instruction set variable microprocessor of the present invention shown in FIG. 1, the input logic circuit PC20 and the output logic circuit PC30, which are built from programmable logic and inserted between the register file and the arithmetic units, take on simple parts of the operation, so that a MAC instruction can be implemented even with an ALU that does not support MAC operations.

[0068] FIG. 14 shows an example in which the instruction set of FIG. 13 is implemented with the instruction set variable microprocessor of the present invention. As shown in FIG. 14, this is done by programming the ALU/arithmetic unit control circuit PC13e, the switch matrix control circuit PC14e, the input logic circuit PC20e, the input/output logic control circuit PC15e, and the output logic circuit PC30e onto the programmable logic. The operation of each circuit is described below as far as it relates to the SMAC instruction. For the ADD and MUL instructions, the circuits are configured to operate as in FIG. 3, and for the SADD instruction, as in FIG. 9.

[0069] When the signal from the instruction decoder is "000101", corresponding to the SMAC instruction, the ALU/arithmetic unit control circuit PC13e activates the control line CL11 and starts the ALU A1 in addition mode. It also activates the control line CL12 to start the multiplier A2.

[0070] When the signal from the instruction decoder is "000101", the opcode of the SMAC instruction, the switch matrix control circuit PC14e makes the switch matrix S1 select the data lines DL16 and DL17 and the data lines DL14 and DL11 via the control line CL14. It also sets the switch matrix S2 to select DL22 via the control line CL13.

[0071] When the signal from the instruction decoder is "000100" or "000101", the opcodes of the SADD and SMAC instructions, the input/output logic control circuit PC15e activates the input logic circuit PC20e and the output logic circuit PC30e via the control lines CL20 and CL30.

[0072] The input logic circuit PC20e is activated by the control line CL20. Unlike for the SADD instruction, for the SMAC instruction it first, in the first cycle, splits the 32-bit data of the Rs register input at in1 into upper 16 bits and lower 16 bits, outputs them on the data lines DL16 and DL17, and feeds them to the multiplier A2 via the switch matrix S1. The product is held in a latch configured in the output logic circuit PC30e. In the next clock cycle, the held result is input to in3 of the input logic circuit via the data line DL100. In this clock cycle, the input logic circuit PC20e operates so that the in3 signal is output unchanged on out1/out2, and the signal is input via the switch matrix S1 to the ALU A1 in addition mode. The contents of the Rt register are output on the data line DL11 and added in the ALU A1 to the product output on the data line DL14, so that "the result of adding the Rt register to the product of the upper 16 bits and the lower 16 bits of the register specified by Rs" is output and the desired MAC instruction is executed.

[0073] In this way, the instruction set variable microprocessor configured as in FIG. 14 can realize the SMAC instruction shown in FIG. 13. FIG. 15 shows the data path operation during execution of the SMAC instruction. The register whose number is specified by Rs is read and split into its upper 16 bits and lower 16 bits by the input logic circuit PC20e. The two halves are then multiplied by the multiplier. The product is latched in the output logic circuit PC30e and then input, via the input logic circuit PC20e, to the ALU started in addition mode, where the addition yields the MAC result.
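The two-cycle SMAC data path described above can be modeled in a few lines. The following is a minimal sketch, not the circuit of the invention: the function name `smac`, the treatment of both 16-bit halves as unsigned values, and the truncation of the sum to 32 bits are assumptions, since the text does not specify signedness or overflow behavior.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def smac(rs: int, rt: int) -> int:
    """Model SMAC: (upper 16 bits of Rs) * (lower 16 bits of Rs) + Rt."""
    # Cycle 1: the input logic circuit (PC20e in the text) splits Rs,
    # the multiplier (A2) forms the product, and the output logic
    # circuit (PC30e) latches it.
    upper = (rs >> 16) & MASK16
    lower = rs & MASK16
    latched_product = upper * lower

    # Cycle 2: the latched product is routed back to the ALU (A1),
    # which runs in addition mode and adds Rt.
    # Assumption: the result wraps to 32 bits.
    return (latched_product + rt) & MASK32


print(hex(smac(0x00020003, 5)))
```

Keeping the latched product in its own variable mirrors the role of the latch in PC30e, which carries the multiplier output from the first cycle into the second.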

[0074] The input logic circuit PC20e and the output logic circuit PC30e can be configured in several ways. As one example, FIG. 16 shows an equivalent circuit composed of tri-state buffers G500 to G732 and selectors S100 to S132. In the configuration of FIG. 14 as well, pipelining can increase the processing speed.

<Second Embodiment>

As a second embodiment, a configuration is described in which the instruction set variable microprocessor of the present invention executes multiple application programs stored on various servers on a computer network. As shown in FIG. 17, the computer terminal PD1 is connected via the computer network N100 to the servers AS100 to AS200, which store various application programs AP100 to AP200. The computer terminal PD1 receives an instruction from the user via the input/output interface IF1, such as the keyboard KY1, downloads the specified application program via the computer network N100 to the storage device MS1, which consists of semiconductor memory, a magnetic disk device, or the like, and executes the program on the instruction set variable microprocessor C10 of the present invention. The execution result is presented to the user as needed via the display device DS1 or the like. The computer network N100 is shown schematically; the connection method, whether wired or wireless, is not limited. The terminal PD1 is illustrated with independent chips mounted on a board and connected by the bus BU1, but other forms are possible, for example integrating the instruction set variable processor C10 and the rewrite control circuit WC0 on one chip.

[0075] A computer terminal that does not keep such application programs in its own storage device, but downloads them from servers and executes them as needed, requires only a small storage capacity. It is therefore suited to portable terminals, so-called PDAs (Personal Digital Assistants). The user is also spared the work of installing each application program. Furthermore, by setting the download server appropriately, the user can always use the latest applications, among other advantages. Such terminals are therefore expected to realize the next-generation distributed computing environment (see, for example, Nikkei Electronics, July 30, 2001 issue, pp. 108-117 (Reference 6)).

[0076] At present, however, microprocessor instruction sets come in many varieties that are mutually incompatible. Consequently, to realize a distributed computing environment like that of FIG. 17, the application programs stored on the servers AS100 to AS200 have had to be written in the instruction set specific to the processor C10 of the terminal PD1. Because the server's processor usually differs from the terminal's, the application programs stored on a server are not programs that the server's own processor can execute; they have had to be prepared separately for the processor of the terminal PD1. A dedicated application program written in a dedicated instruction set must therefore be prepared for each type of terminal processor, and the management effort becomes enormous.

[0077] To address this problem, the prior art shown in Reference 6 discloses a method in which the application program is written in an intermediate, processor-independent instruction set such as Java, and the downloading terminal executes it while converting it into the instruction set of the terminal's own processor.

[0078] With this method, however, converting the intermediate instruction set into the terminal's native instruction set takes time, so the program may run more slowly than a dedicated application program written in the native instruction set of the processor of the terminal PD1.

[0079] In this embodiment, the instruction set variable microprocessor of the present invention is used in combination with the instruction set mapping program P10 described below, so that application programs written in various instruction sets can be downloaded and executed, eliminating the need to store dedicated programs on the servers. In addition, because the circuit configuration itself is changed to match the instruction set of the application before the application is executed, operating speed is also greatly improved. The operation of the instruction set mapping program P10 is described below with reference to FIGS. 17 and 18. Instruction set mapping can be performed in two ways: on the terminal or on a server.

(1) Instruction set mapping performed on the terminal

The instruction set mapping program P10 is stored in the storage device MS1 (see FIG. 17). It rewrites the circuit configuration of the programmable logic PL1 of the microprocessor C10 so that the instruction set required by the downloaded application program AP10 can be executed.

[0080] As shown in FIG. 18, the instruction set mapping program P10 determines the circuit configuration of the programmable logic PL1 from the instruction set description file D10 and the resource information file R10. The instruction set description file D10 describes, for each instruction to be realized, its operation, format, opcode, and so on, as illustrated in FIG. 2. The resource information file R10 describes the types of arithmetic units implemented in the instruction set variable microprocessor C10 of the present invention, control information on how to control them, and information such as the size of the implemented programmable logic.

[0081] Both the instruction set description file D10 and the resource information file R10 can be kept permanently in the storage device MS1 of the terminal PD1. If the storage capacity of the storage device MS1 is limited, however, it is preferable to keep these files on the network server AS200 and download them to the terminal PD1 as needed. The instruction set mapping program P10 analyzes the instruction set description file D10 and the resource information file R10 and determines the arithmetic unit to use for each instruction to be realized (P11). Next, for each instruction, it analyzes the arithmetic unit control procedure needed to realize that instruction and determines the specification of the control logic circuit to be configured on the programmable logic (P12). From this specification, it performs logic synthesis by a conventional method, synthesizes a logic circuit that realizes the specification, and creates the programmable logic netlist N10 (P13). Finally, it writes the netlist N10 into the programmable logic PL1 via the circuit configuration rewriting interface.
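The first three steps of the mapping program (P11 to P13) can be pictured as a small pipeline. The sketch below is illustrative only: the dictionary layouts standing in for the instruction set description file and the resource information file, the function names, the unit assignments for each instruction, and the "netlist" (a sorted list of opcode and enabled-unit pairs) are all assumptions. In particular, `synthesize` is a placeholder and performs no real logic synthesis.

```python
def select_units(description, resources):
    """P11: decide which arithmetic units each instruction uses."""
    unit_map = {}
    for name, instr in description.items():
        missing = [u for u in instr["units"] if u not in resources["units"]]
        if missing:
            raise ValueError(f"{name} needs unavailable units: {missing}")
        unit_map[name] = list(instr["units"])
    return unit_map


def control_spec(description, unit_map):
    """P12: derive, per opcode, which units the control logic must enable."""
    return {
        instr["opcode"]: {"instruction": name, "enable": unit_map[name]}
        for name, instr in description.items()
    }


def synthesize(spec):
    """P13 (placeholder): flatten the specification into a 'netlist'."""
    return sorted((opcode, tuple(entry["enable"])) for opcode, entry in spec.items())


# Hypothetical inputs; only the opcodes are taken from the text.
description = {
    "SADD": {"opcode": "000100", "units": ["A1"]},
    "SMAC": {"opcode": "000101", "units": ["A2", "A1"]},
}
resources = {"units": ["A1", "A2"]}

netlist = synthesize(control_spec(description, select_units(description, resources)))
print(netlist)
```

Separating the three functions keeps the boundary visible between the steps that could run on a server and the final write to the programmable logic, which must happen on the terminal.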

[0082] Thus, by using the instruction set mapping program P10 together with the instruction set variable microprocessor C10 of the present invention, a microprocessor that realizes a given instruction set is obtained automatically once the instruction set is supplied. Various application programs residing on various servers on the network can thereby be executed seamlessly on the terminal PD1.

[0083] Among the routines of the instruction set mapping program P10, those up to the logic synthesis of the programmable logic (P13) need not be executed on the terminal PD1. The steps through the logic synthesis (P13) of the programmable logic PL1 can be executed on a fast server on the network, such as AS100, and only the resulting programmable logic netlist downloaded. This configuration reduces the storage capacity required of the storage device MS1. Also, when the control logic for an instruction set is large and logic synthesis would take a relatively long time, running the synthesis on a fast server keeps the time needed for rewriting to a minimum. Alternatively, the steps through the logic synthesis P13 of FIG. 18 can be executed in advance on the server AS200 for each required instruction set and the resulting programmable logic netlists kept ready, so that only the netlist is downloaded to the terminal when the application's instruction set is switched.

[0084] The method of writing the netlist N10 into the programmable logic PL1 is described with reference to FIGS. 25 to 27. FIG. 25 shows an example of the memory space map MA0 of the instruction set variable processor of the present invention. In FIG. 25, the circuit information holding unit CM1, which is written via the rewrite control circuit WC0, is assigned to addresses AC0 to AF of the memory space, and the remaining addresses A0 to AC0 are used as ordinary memory. The circuit information holding unit CM1 may of course be assigned to any predetermined addresses in the memory space map MA0; the assignment is not limited to this one.

[0085] Rewriting is executed in steps WP100 to WP130 of FIG. 26.

(a) WP100
The processor C10 accesses a server via the network interface N11 and the computer network N100, obtains the programmable logic netlist N10 corresponding to the desired instruction set, and places it in the memory space specified by addresses AS to AE. In an embodiment in which the terminal PD1 performs the logic synthesis, the generated netlist is instead stored in the memory space specified by addresses AS to AE. This memory space is assumed to reside in the memory MS1. FIG. 27(a) shows the memory space map in this state.

(b) WP110
The processor C10 must transfer the netlist stored in the memory MS1 to the circuit information holding unit CM1. To do so, the processor C10 notifies the DMA (Direct Memory Access) controller DMA0 of the transfer source addresses (start address AS, end address AE) and grants the DMA controller DMA0 the right to use the bus BU1.

(c) WP120
The DMA controller DMA0 temporarily stores the netlist from the memory MS1 in its own buffer memory and transfers it, via the rewrite control circuit WC0 and the circuit configuration rewriting interface C11, to the circuit information holding unit CM1 defined as addresses AC1 to AF. FIG. 27(b) shows the memory space map in this state. Ultimately, the netlist is written into the circuit information holding unit CM1. FIG. 27(c) shows the memory space map in this state.

(d) WP130
The DMA controller DMA0 returns the bus use right to the processor C10. The processor C10 then resumes operation from an instruction (for example, a boot instruction) at a predetermined memory address in the ordinary memory area.
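Steps WP100 to WP130 amount to staging the netlist in ordinary memory and then copying it into the address range mapped to the circuit information holding unit. The following sketch models that with a flat `bytearray`; the concrete addresses, the buffer size, and the function name `dma_rewrite` are assumptions chosen for illustration, and bus arbitration is not modeled.

```python
def dma_rewrite(memory, src_start, src_end, cfg_start, cfg_end, buf_size=4):
    """Copy memory[src_start..src_end] into the configuration region.

    Addresses are inclusive. Returns the number of bytes copied.
    Assumption: the source and configuration regions do not overlap.
    """
    length = src_end - src_start + 1
    if length > cfg_end - cfg_start + 1:
        raise ValueError("netlist does not fit in the configuration region")

    copied = 0
    while copied < length:
        # WP120: the controller buffers a chunk, then writes it out.
        chunk_len = min(buf_size, length - copied)
        buffer = bytes(memory[src_start + copied : src_start + copied + chunk_len])
        memory[cfg_start + copied : cfg_start + copied + chunk_len] = buffer
        copied += chunk_len
    return copied


# Hypothetical 32-byte memory map: ordinary memory at 0..23,
# configuration region (CM1 in the text) at 24..31.
memory = bytearray(32)
netlist = b"NETLST"
AS, AE = 8, 8 + len(netlist) - 1
memory[AS : AE + 1] = netlist                 # WP100: stage the netlist
copied = dma_rewrite(memory, AS, AE, 24, 31)  # WP110 and WP120
print(copied, bytes(memory[24:30]))
```

The chunked loop stands in for the controller's own buffer memory mentioned in step (c); a real controller would size its transfers according to the bus width.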

[0086] In this way, the programmable logic PL1 of the instruction set variable processor C10 of the present invention can be rewritten.

[0087] FIG. 19 shows how application programs written in different instruction sets are executed on the terminal PD1 while the instruction set is rewritten. First, application program A (AP1) is downloaded. If the description file D1 of instruction set A, needed to execute application program A, is not present in the storage device MS1, it is downloaded at the same time. Next, the instruction set mapping program generates the programmable logic netlist N1 that realizes instruction set A. Finally, the circuit configuration of the instruction set variable microprocessor C10 of the present invention is rewritten according to the netlist N1, and the application program AP1 is executed. To execute an application program AP2 written in another instruction set B, the same procedure generates the programmable logic netlist N2 for instruction set B, rewrites the instruction set variable microprocessor C10, and executes the application program AP2. When the netlist N1 is generated on the server side, the application program AP1 and its corresponding netlist N1 are downloaded together.
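The sequence of FIG. 19 can be summarized as a small state machine on the terminal. The sketch below is a toy model under stated assumptions: the class `Terminal`, the `fetch_description` callback, the log strings, and the string used as a stand-in netlist are all hypothetical, and the mapping program is reduced to a one-line placeholder.

```python
class Terminal:
    """Toy model of terminal PD1 switching between instruction sets."""

    def __init__(self, fetch_description):
        self.descriptions = {}      # stand-in for description files kept in MS1
        self.fetch_description = fetch_description
        self.configured = None      # netlist currently written to PL1
        self.log = []

    def run(self, app, instruction_set):
        # Download the description file only if it is not stored locally.
        if instruction_set not in self.descriptions:
            self.descriptions[instruction_set] = self.fetch_description(instruction_set)
            self.log.append(f"download description {instruction_set}")
        # Placeholder for the mapping program: derive a netlist label.
        netlist = f"netlist[{self.descriptions[instruction_set]}]"
        # Rewrite the programmable logic, then execute the application.
        self.configured = netlist
        self.log.append(f"rewrite {instruction_set}")
        self.log.append(f"execute {app}")


terminal = Terminal(fetch_description=lambda name: f"description-{name}")
terminal.run("AP1", "A")
terminal.run("AP2", "B")
terminal.run("AP1", "A")
print(terminal.log)
```

In this model the description file for instruction set A is fetched only once, while the programmable logic is rewritten on every switch, which matches the order of operations given in the text.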

[0088]

[Effects of the Invention] The instruction set variable microprocessor of the present invention can support various instruction sets with different instruction formats, providing a microprocessor with little degradation in speed performance.

[Brief description of drawings]

FIG. 1 is a diagram showing the configuration of the instruction set variable processor of the present invention.

FIG. 2 is a diagram explaining the format of each instruction of the instruction set described in the first embodiment.

FIG. 3 shows an example in which the instruction set of FIG. 2 is realized by the instruction set variable microprocessor.

FIG. 4 is a diagram outlining the processing in each stage when the instruction set variable microprocessor is operated as a pipeline.

FIG. 5 is a diagram showing the processing flow when the instruction set variable microprocessor is realized with the pipeline operation shown in FIG. 4.

FIG. 6 is a diagram outlining the processing in each stage when the instruction set variable microprocessor is operated as a pipeline.

FIG. 7 is a diagram showing the processing flow when the instruction set variable microprocessor is realized with the pipeline operation shown in FIG. 6.

FIG. 8 is a diagram explaining the format of each instruction of the instruction set described in the first embodiment (C).

FIG. 9 shows an example in which the instruction set of FIG. 8 is realized by the instruction set variable microprocessor.

FIG. 10 is a diagram explaining the signal flow in the data path unit during execution of the SADD instruction in the instruction set variable microprocessor of the present invention shown in FIG. 9.

FIG. 11 is an equivalent circuit of a configuration example of the input logic circuit PC20d of the instruction set variable microprocessor of FIG. 9.

FIG. 12 is a diagram outlining the processing in each stage when the instruction set variable microprocessor is operated as a pipeline.

FIG. 13 is a diagram explaining the format of each instruction of the instruction set described in the first embodiment (C).

FIG. 14 shows an example in which the instruction set of FIG. 13 is realized by the instruction set variable microprocessor.

FIG. 15 is a diagram explaining the signal flow in the data path unit during execution of the SMAC instruction in the instruction set variable microprocessor of the present invention shown in FIG. 14.

FIG. 16 is an equivalent circuit of a configuration example of the input logic circuit PC20e and the output logic circuit PC30e of the instruction set variable microprocessor of the present invention shown in FIG. 14.

FIG. 17 is a diagram showing a configuration example of a computer terminal that uses the instruction set variable microprocessor to execute various application programs stored on various servers on a network.

FIG. 18 is a diagram showing the processing flow of the instruction set mapping program P10.

FIG. 19 is a diagram outlining the processing flow when multiple programs written in different instruction sets are executed by the instruction set variable microprocessor.

FIG. 20 is a diagram showing a configuration example of the programmable logic PL1.

FIG. 21(a) is a diagram showing a configuration example of a programmable switch of the programmable logic PL1, and FIG. 21(b) is a diagram showing a modification of it.

FIG. 22 is a diagram showing a further modification of the programmable switch of the programmable logic PL1.

FIG. 23 is a diagram showing a configuration example of a logic block LB of the programmable logic PL1.

FIG. 24 is a diagram showing a configuration example of a basic logic element LUT of the logic block LB.

FIG. 25 is a diagram showing the memory space map of the instruction set variable processor of the present invention.

FIG. 26 is a diagram explaining the procedure for rewriting the programmable logic PL1 of the instruction set variable processor of the present invention.

FIG. 27 is a diagram explaining the procedure for rewriting the programmable logic PL1 of the instruction set variable processor of the present invention.

[Explanation of symbols]

C10 ... processor, PL1 ... programmable logic, IB1 ... external bus interface, IR1 ... instruction register, PC1 ... program counter, RF1 ... register file, S1, S2 ... switch matrices, A1 ... ALU, A2 ... arithmetic unit.

Claims


1. A microprocessor comprising: a register file composed of a plurality of registers; an arithmetic circuit, composed of hard-wired logic, that executes a predetermined operation on data output from the register file; and programmable logic, wherein an instruction decoder that decodes an instruction and a first control circuit that selects a designated register from the register file according to the decoded instruction are formed in the programmable logic.

2. The microprocessor according to claim 1, wherein the programmable logic has a plurality of basic logic elements that realize predetermined logic functions, programmable wiring that includes switch elements and interconnects the plurality of basic logic elements by switching the switch elements, and storage elements that store switching information for the switch elements.

3. The microprocessor according to claim 1, further comprising a first switch circuit composed of hard-wired logic and connected to an input of the arithmetic circuit, wherein a first logic circuit that performs predetermined processing on the data according to the decoded instruction and a second control circuit that controls the first switch circuit are formed in the programmable logic, the output of the register file and the output of the first logic circuit are input to the first switch circuit, and the output of the register file and the output of the first logic circuit are selectively input to the arithmetic circuit by the second control circuit.

4. The microprocessor according to claim 1, further comprising a second switch circuit composed of hard-wired logic and connected to an output of the arithmetic circuit, wherein a second logic circuit that performs predetermined processing on the operation result of the arithmetic circuit according to the decoded instruction and a third control circuit that controls the second switch circuit are formed in the programmable logic, the inputs of the second switch circuit are connected to the output of the arithmetic circuit and the output of the second logic circuit, and the output of the arithmetic circuit and the output of the second logic circuit are selectively stored in the register file by the third control circuit.

5. A microprocessor comprising: a data path unit having a register file and an arithmetic circuit, composed of hard-wired logic, that performs a predetermined operation on data stored in the register file; and programmable logic in which a control circuit that controls the arithmetic processing in the data path unit is formed, wherein the control circuit divides the processing from fetching an instruction to writing the processing result of the data path unit into the register file into a plurality of stages and performs pipeline processing.

6. The microprocessor according to claim 5, wherein the control circuit executes, as separate stages, a stage that decodes the instruction and a stage that generates control signals for the data path unit based on the decoded instruction.

7. The microprocessor according to claim 5 or 6, wherein the programmable logic has a plurality of basic logic elements that realize predetermined logic functions, programmable wiring that includes switch elements and interconnects the plurality of basic logic elements by switching the switch elements, and storage elements that store switching information for the switch elements.

8. A microprocessor comprising: a data path unit having a register file and an arithmetic circuit, composed of hard-wired logic, that executes a predetermined operation on data stored in the register file; programmable logic; and a rewrite interface for rewriting the contents of the programmable logic, wherein the circuit configuration of an instruction decoder that decodes an instruction for executing an operation in the data path unit is written into the programmable logic from the rewrite interface.

9. The microprocessor according to claim 8, wherein the programmable logic has a plurality of basic logic elements that realize predetermined logic functions, programmable wiring that includes switch elements and interconnects the plurality of basic logic elements by switching the switch elements, and storage elements that store switching information for the switch elements.

10. A data processing device having a processor for performing processing according to an instruction set according to an application.