JPH10207720A - Information processor

Info

Publication number: JPH10207720A
Application number: JP9009407A
Authority: JP
Inventors: Takao Yamamoto; 崇夫山本
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1997-01-22
Filing date: 1997-01-22
Publication date: 1998-08-07

Abstract

PROBLEM TO BE SOLVED: To improve the utilization efficiency of hardware resources while achieving high-speed information processing by switching freely between a scalar mode and a vector mode. SOLUTION: A control unit 30 includes an instruction fetch unit 40 for fetching instructions from a memory unit 20, a sequencer 50 for controlling the operation of the entire information processor, and first to third multiplexers 60 to 62. The sequencer 50 decodes an instruction INST supplied from the third multiplexer 62. The sequencer 50 switches between a scalar mode, in which scalar instructions are executed using each of the first and second register sets 11, 12 as a scalar register set, and a vector mode, in which vector instructions are executed using the first and second register sets 11, 12 together as a two-element vector register set. The mode switching is performed dynamically in response to events that occur during operation of the information processor.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a multithreaded information processing apparatus.

[0002]

[Description of the Related Art] Multithreaded pipelined information processing apparatuses having a plurality of threads (instruction streams) are known. Such an apparatus achieves high-speed information processing by exploiting task-level parallelism in application programs. Specifically, each thread has one program counter and one register set composed of a plurality of registers, and the threads share at least one arithmetic unit. A task is assigned to each of the threads, and the tasks are executed in parallel by time division.

[0003] Information processing apparatuses of the vector processing type are also known. These achieve high-speed information processing by exploiting data parallelism in matrix computations and the like.

[0004]

[Problems to Be Solved by the Invention] In general, an information processing apparatus must take both task-level parallelism and data parallelism into account.

[0005] Tzi-cker Chiueh, "Multi-Threaded Vectorization", Proceedings of 18th International Symposium on Computer Architecture, pp. 352-361, 1991, proposes an information processing apparatus that combines the multithreaded pipeline approach with the vector processing approach. However, that apparatus has a scalar register set and a separate, independent vector register set, so its hardware resources are used inefficiently.

[0006] An object of the present invention is to improve the utilization efficiency of hardware resources in a multithreaded information processing apparatus that incorporates vector processing.

[0007]

[Means for Solving the Problems] To achieve the above object, the multithreaded information processing apparatus according to the present invention switches between a scalar mode, in which scalar instructions are executed using individual register sets among a plurality of register sets as scalar register sets, and a vector mode, in which vector instructions are executed using N of the register sets (N being an integer of 2 or more) as an N-element vector register set.

[0008] According to the information processing apparatus of the present invention, a task that executes only scalar instructions is allocated one register set as a scalar register set. Every register set can be used as a scalar register set. A task that executes N-element vector instructions is allocated N register sets as an N-element vector register set.
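
As an illustration of the allocation described in this paragraph, the following minimal sketch models N physical register sets that can be viewed either as N independent scalar register sets or as a single N-element vector register set. The class name `RegisterSets`, its method names, and the choice of eight registers per set are assumptions made for this illustration and do not appear in the specification.

```python
class RegisterSets:
    """N physical register sets, viewable as scalar sets or as one N-element vector set.

    Hypothetical model for illustration only; not part of the specification.
    """

    def __init__(self, num_sets=2, regs_per_set=8):
        # One list of registers per physical register set.
        self.sets = [[0] * regs_per_set for _ in range(num_sets)]

    # Scalar view: a task owns exactly one register set.
    def read_scalar(self, set_index, reg):
        return self.sets[set_index][reg]

    def write_scalar(self, set_index, reg, value):
        self.sets[set_index][reg] = value

    # Vector view: vector register `reg` is register `reg` of every set,
    # so element n of the vector lives in register set n.
    def read_vector(self, reg):
        return tuple(s[reg] for s in self.sets)

    def write_vector(self, reg, values):
        if len(values) != len(self.sets):
            raise ValueError("vector length must equal the number of register sets")
        for register_set, value in zip(self.sets, values):
            register_set[reg] = value
```

In this model no storage is reserved exclusively for vectors: the same registers serve both views, which is the source of the improved hardware utilization claimed above.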

[0009] Switching between the scalar mode and the vector mode is performed dynamically in response to events that occur during operation of the information processing apparatus. For example, the switch to the vector mode is performed in response to an internal interrupt generated when a vector instruction is fetched in the scalar mode, and the switch to the scalar mode is performed in response to an internal interrupt generated when a dedicated interrupt instruction is fetched or decoded in the vector mode.
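
The switching rule in this paragraph can be read as a two-state machine. The sketch below is one possible rendering of it; the names `Mode`, `Decoded`, and `next_mode` are hypothetical, and the behavior for a return instruction encountered in the scalar mode (left unchanged here) is an assumption, since the specification does not describe that case.

```python
from enum import Enum


class Mode(Enum):
    SCALAR = "scalar"
    VECTOR = "vector"


class Decoded(Enum):
    SCALAR_INSTRUCTION = "scalar"
    VECTOR_INSTRUCTION = "vector"
    RETURN_INSTRUCTION = "return"  # the dedicated interrupt instruction


def next_mode(mode, decoded):
    """Return the mode in effect after the given instruction kind is decoded."""
    if mode is Mode.SCALAR and decoded is Decoded.VECTOR_INSTRUCTION:
        return Mode.VECTOR  # internal interrupt: scalar -> vector
    if mode is Mode.VECTOR and decoded is Decoded.RETURN_INSTRUCTION:
        return Mode.SCALAR  # internal interrupt: vector -> scalar
    # Assumption: every other combination leaves the mode unchanged.
    return mode
```

Scalar instructions decoded in the vector mode do not change the mode in this sketch, which matches the later statement that scalar instructions remain executable in the vector mode.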

[0010] Processing an external interrupt requires stopping a thread and saving its resources, which includes saving the contents of its register set. Saving the resources of a thread in the scalar mode is simpler than saving those of a thread in the vector mode. Therefore, when threads in the scalar mode and threads in the vector mode coexist, it is preferable to have a thread in the scalar mode handle external interrupts preferentially.
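
The preference stated in this paragraph could be expressed as a simple selection rule. The following sketch is illustrative only: the function name, the mode strings, and the fallback used when no scalar-mode thread exists (the lowest-numbered thread) are assumptions, not details taken from the specification.

```python
def select_thread_for_external_interrupt(thread_modes):
    """Pick the thread that should service an external interrupt.

    thread_modes maps a thread id to "scalar", "main_vector", or "slave_vector".
    A scalar-mode thread is preferred because only one register set must be saved.
    """
    scalar_threads = sorted(t for t, m in thread_modes.items() if m == "scalar")
    if scalar_threads:
        return scalar_threads[0]
    # Assumption: with no scalar-mode thread available, fall back to the
    # lowest-numbered thread. The specification does not fix this choice.
    return min(thread_modes)
```

For example, with threads 1 and 2 forming a two-element vector pair and a hypothetical thread 3 in the scalar mode, the rule selects thread 3.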

[0011]

[Embodiments of the Invention] FIG. 1 shows a configuration example of a multithreaded pipelined information processing apparatus according to the present invention. The apparatus of FIG. 1 broadly consists of a data path unit 10, a memory unit 20, and a control unit 30. The memory unit 20 stores the instruction sequence constituting each of a plurality of tasks, together with the data associated with each task.

[0012] The data path unit 10 includes first and second register sets 11, 12, each having a plurality of registers (R0, R1, R2, ...); one arithmetic unit (ALU) 15 for executing arithmetic and logic operations; and an M register (REG M) 16 for temporarily storing operation results. The first register set 11 has one write port WT1 and first and second read ports RD11, RD12. The second register set 12 has one write port WT2 and first and second read ports RD21, RD22. The arithmetic unit 15 receives its first input from the first read port RD11 of the first register set 11 or the first read port RD21 of the second register set 12, and its second input from the second read port RD12 of the first register set 11 or the second read port RD22 of the second register set 12. The result of the arithmetic unit 15 is temporarily stored in the M register 16. From the M register 16, the result can be supplied to the memory unit 20 and the control unit 30 as an address ADRS for data access (or for specifying a branch target), or supplied to the write port WT1 of the first register set 11 and the write port WT2 of the second register set 12. The write ports WT1 and WT2 can also receive data from the memory unit 20. Note, however, that both write ports WT1, WT2 are connected to a common data bus.

[0013] The control unit 30 includes an instruction fetch unit 40 for fetching instructions from the memory unit 20, a sequencer 50 for controlling the operation of the entire information processing apparatus, and first, second, and third multiplexers 60, 61, 62. The instruction fetch unit 40 includes first and second program counters (PC1, PC2) 41, 42 and first and second instruction buffers (IB1, IB2) 45, 46. The first and second program counters 41, 42 each hold the address of the next instruction to be fetched for the task associated with that counter, and each is incremented as directed by the sequencer 50. The first multiplexer 60, as directed by the sequencer 50, selects either the address supplied from the first program counter 41 or the address supplied from the second program counter 42 and supplies the selected address to the second multiplexer 61. The second multiplexer 61, as directed by the sequencer 50, selects either the address supplied from the first multiplexer 60 or the address ADRS supplied from the arithmetic unit 15 via the M register 16 and supplies the selected address to the memory unit 20. The first instruction buffer 45 temporarily stores the instruction fetched from the memory unit 20 at the address held in the first program counter 41 for the corresponding task. The second instruction buffer 46 temporarily stores the instruction fetched from the memory unit 20 at the address held in the second program counter 42 for the corresponding task. The first and second instruction buffers 45, 46 each supply their temporarily stored instruction to the third multiplexer 62. The third multiplexer 62, as directed by the sequencer 50, selects either the instruction supplied from the first instruction buffer 45 or the instruction supplied from the second instruction buffer 46 and supplies the selected instruction to the sequencer 50. INST in FIG. 1 denotes the instruction supplied to the sequencer 50 in this way. When multiple instructions are fetched at once, as in many information processing apparatuses, the fetched instruction sequence may be stored in the instruction buffer, and the buffer may operate as a FIFO (first-in, first-out) so as to supply the instructions to the third multiplexer 62 one at a time. This reduces the number of instruction fetches.
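
To make the fetch path of this paragraph concrete, the sketch below models the three multiplexers and the two instruction buffers in software. It is a behavioral illustration under stated assumptions: the class `FetchPath`, the helper `mux`, the dictionary-based memory, and the rule that the program counter advances to the fetched address plus one are all hypothetical and are not taken from the specification.

```python
from collections import deque


def mux(select, input0, input1):
    """Two-input multiplexer: select 0 passes input0, anything else passes input1."""
    return input0 if select == 0 else input1


class FetchPath:
    """Behavioral model of program counters 41/42, buffers 45/46, and multiplexers 60-62."""

    def __init__(self, memory, pc1, pc2):
        self.memory = memory          # address -> instruction (assumed word-addressed)
        self.pc = [pc1, pc2]          # program counters 41 and 42
        self.ib = [deque(), deque()]  # instruction buffers 45 and 46, used as FIFOs

    def fetch(self, thread, adrs=None):
        """Fetch one instruction for thread 0 or 1; adrs models a branch target ADRS."""
        pc_address = mux(thread, self.pc[0], self.pc[1])  # multiplexer 60
        use_adrs = 0 if adrs is None else 1
        address = mux(use_adrs, pc_address, adrs)         # multiplexer 61
        self.ib[thread].append(self.memory[address])
        # Assumption: the counter continues from the address just fetched.
        self.pc[thread] = address + 1
        return address

    def issue(self, thread):
        """Supply the oldest buffered instruction of the chosen thread as INST."""
        return mux(thread, self.ib[0], self.ib[1]).popleft()  # multiplexer 62
```

Interleaving `fetch(0)` and `fetch(1)` and then calling `issue` for either thread reproduces, at a coarse level, how the sequencer alternates between the two instruction streams.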

[0014] The sequencer 50 decodes the instruction INST supplied from the third multiplexer 62. The sequencer 50 also switches between a scalar mode, in which scalar instructions are executed using each of the first and second register sets 11, 12 as a scalar register set, and a vector mode, in which vector instructions are executed using the first and second register sets 11, 12 as a two-element vector register set. This mode switching is performed dynamically in response to events that occur during operation of the apparatus of FIG. 1. Specifically, the switch from the scalar mode to the vector mode is performed in response to an internal interrupt generated when a vector instruction is fetched and decoded in the scalar mode. The switch from the vector mode to the scalar mode is performed in response to an internal interrupt generated when a dedicated interrupt instruction for returning is fetched and decoded in the vector mode. In the vector mode, scalar instructions can be executed as well as vector instructions. In addition, the sequencer 50 stores "task information", that is, information indicating which task is being executed in the current cycle. Suppose that, in the scalar mode, the first program counter 41 holds an instruction address for a particular task 1 and the second program counter 42 holds an instruction address for another particular task 2. Suppose further that the instructions of task 1 are executed using the first register set 11 and the instructions of task 2 are executed using the second register set 12. In this case, the first program counter 41 and the first register set 11 constitute one thread, and the second program counter 42 and the second register set 12 constitute another thread. Alternatively, the instructions of task 1 may be executed using the second register set 12 and the instructions of task 2 using the first register set 11. In that case, the first program counter 41 and the second register set 12 constitute one thread, and the second program counter 42 and the first register set 11 constitute the other thread. The sequencer 50 also stores information on this thread configuration, that is, "thread information".

[0015] The sequencer 50 includes a mode register 51 for storing "mode information" indicating the current operation mode of each thread, and an interrupt handler 52 for processing internal interrupts and the external interrupt INT. When each of the first and second register sets 11, 12 is used as a scalar register set, each of the two threads is in the scalar mode. When the first and second register sets 11, 12 are used as a two-element vector register set, one thread is in the main vector mode and the other thread is in the slave vector mode. The mode register 51 is also used to store the task information and thread information described above. In accordance with the task information, thread information, and mode information stored in the mode register 51, the sequencer 50 controls the operation of each of the first and second register sets 11, 12, the arithmetic unit 15, the M register 16, the instruction fetch unit 40, and the first, second, and third multiplexers 60, 61, 62.
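
The three kinds of information held in the mode register 51 can be pictured as a small record per thread plus the currently running task. The layout below is a hypothetical sketch: the specification does not define field names or an encoding, so `ThreadMode`, `ThreadInfo`, `ModeRegister`, and their fields are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ThreadMode(Enum):
    SCALAR = "scalar"
    MAIN_VECTOR = "main vector"
    SLAVE_VECTOR = "slave vector"


@dataclass
class ThreadInfo:
    program_counter: int  # reference numeral 41 or 42 (thread information)
    register_set: int     # reference numeral 11 or 12 (thread information)
    task: int             # task assigned to this thread
    mode: ThreadMode      # mode information


@dataclass
class ModeRegister:
    threads: dict       # thread id -> ThreadInfo
    running_task: int   # task information: task executing in the current cycle

    def in_vector_mode(self):
        """True when some thread is the main thread of a vector pair."""
        return any(t.mode is ThreadMode.MAIN_VECTOR for t in self.threads.values())


# Scalar-mode configuration: PC 41 with set 11 runs task 1, PC 42 with set 12 runs task 2.
mode_register = ModeRegister(
    threads={
        1: ThreadInfo(program_counter=41, register_set=11, task=1, mode=ThreadMode.SCALAR),
        2: ThreadInfo(program_counter=42, register_set=12, task=2, mode=ThreadMode.SCALAR),
    },
    running_task=1,
)
```

Because the pairing of program counter and register set is stored rather than fixed, the alternative pairing (PC 41 with set 12, PC 42 with set 11) is represented simply by different field values.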

[0016] FIG. 2 shows a specific example of pipeline operation in the scalar mode of the apparatus of FIG. 1. Here, the first program counter 41 and the first register set 11 constitute thread 1 for executing the instruction sequence of task 1 (A0, A1, A2, A3, A4, ...), and the second program counter 42 and the second register set 12 constitute thread 2 for executing the instruction sequence of task 2 (B0, ...). The two threads share the arithmetic unit 15. Instructions A0, A1, A2, A3, A4, and B0 are all scalar instructions. Of these, the two instructions A0 and B0 are load instructions with register-indirect addressing, and the other four instructions A1, A2, A3, and A4 are register arithmetic instructions. Each row of FIG. 2 other than the instruction INST row indicates which instruction the data or address in that row belongs to.

[0017] According to FIG. 2, in cycle 1, the instruction A0 fetched from the memory unit 20 at the address supplied from the first program counter 41 is decoded by the sequencer 50. The decoding result identifies A0 as a scalar load instruction with register-indirect addressing. The addresses stored in the two designated registers of the first register set 11 are then read out to the read ports RD11, RD12.

[0018] In cycle 2, the arithmetic unit 15 adds the address on the first read port RD11 to the address on the second read port RD12, that is, it performs the address calculation for instruction A0. In addition, the instruction B0 fetched from the memory unit 20 at the address supplied from the second program counter 42 is decoded by the sequencer 50. The decoding result identifies B0 as a scalar load instruction with register-indirect addressing. The addresses stored in the two designated registers of the second register set 12 are then read out to the read ports RD21, RD22.

[0019] In cycle 3, the result of the address calculation by the arithmetic unit 15 is supplied to the memory unit 20 as the address ADRS for the data access of load instruction A0. The arithmetic unit 15 also adds the address on the first read port RD21 to the address on the second read port RD22, that is, it performs the address calculation for instruction B0. Furthermore, the instruction A1 fetched from the memory unit 20 at the address supplied from the first program counter 41 is decoded by the sequencer 50. The decoding result identifies A1 as a scalar register arithmetic instruction. The data stored in the two designated registers of the first register set 11 are then read out to the read ports RD11, RD12.

[0020] In cycle 4, the data read from the memory unit 20 in response to instruction A0 is supplied to the write port WT1 of the first register set 11 and written to the designated register in the first register set 11. This completes execution of instruction A0. The result of the address calculation by the arithmetic unit 15 is supplied to the memory unit 20 as the address ADRS for the data access of load instruction B0. The arithmetic unit 15 also performs an arithmetic operation on the data on the first read port RD11 and the data on the second read port RD12, that is, the scalar arithmetic operation of instruction A1. Furthermore, the instruction A2 fetched from the memory unit 20 at the address supplied from the first program counter 41 is decoded by the sequencer 50. The decoding result identifies A2 as a scalar register arithmetic instruction. The data stored in the two designated registers of the first register set 11 are then read out to the read ports RD11, RD12.

[0021] In cycle 5, the data read from the memory unit 20 in response to instruction B0 is supplied to the write port WT2 of the second register set 12 and written to the designated register in the second register set 12. The data bus is used for this transfer from the memory unit 20 to the second register set 12. For this reason, the result of the operation by the arithmetic unit 15, that is, the result of the scalar arithmetic operation of instruction A1, is temporarily held in the M register 16. The arithmetic unit 15 also performs an arithmetic operation on the data on the first read port RD11 and the data on the second read port RD12, that is, the scalar arithmetic operation of instruction A2. Furthermore, the instruction A3 fetched from the memory unit 20 at the address supplied from the first program counter 41 is decoded by the sequencer 50. The decoding result identifies A3 as a scalar register arithmetic instruction. The data stored in the two designated registers of the first register set 11 are then read out to the read ports RD11, RD12.

[0022] In cycle 6, the operation result held in the M register 16, that is, the result of the scalar arithmetic operation of instruction A1, is supplied to the write port WT1 of the first register set 11 and written to the designated register in the first register set 11. The result of the scalar arithmetic operation of instruction A2 is temporarily stored in the M register 16. The arithmetic unit 15 also performs an arithmetic operation on the data on the first read port RD11 and the data on the second read port RD12, that is, the scalar arithmetic operation of instruction A3. Furthermore, the instruction A4 fetched from the memory unit 20 at the address supplied from the first program counter 41 is decoded by the sequencer 50. The decoding result identifies A4 as a scalar register arithmetic instruction. The data stored in the two designated registers of the first register set 11 are then read out to the read ports RD11, RD12.

[0023] As described above, in the scalar mode each of the first and second register sets 11, 12 is used as a scalar register set, and the scalar instruction sequence of task 1 and the scalar instruction sequence of task 2 are executed in parallel in a multithreaded pipelined manner.
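
The cycle-by-cycle walk through FIG. 2 follows a regular pattern: one instruction is decoded per cycle, and each instruction then spends one cycle each in the arithmetic unit, in the memory access or M register stage, and in register write-back. The sketch below tabulates that pattern. The stage names and the function `schedule` are assumptions chosen for this illustration; the specification describes the cycles in prose and does not name the stages.

```python
# Assumed stage names; the specification describes these steps without naming them.
STAGES = ("decode", "execute", "memory_or_m_register", "write_back")


def schedule(issue_order):
    """Return {cycle: {stage: instruction}} with one instruction decoded per cycle.

    The first instruction in issue_order is decoded in cycle 1.
    """
    table = {}
    for index, instruction in enumerate(issue_order):
        for offset, stage in enumerate(STAGES):
            table.setdefault(index + 1 + offset, {})[stage] = instruction
    return table


# Issue order taken from the description of cycles 1 to 6.
fig2 = schedule(["A0", "B0", "A1", "A2", "A3", "A4"])
for cycle in sorted(fig2):
    print(cycle, fig2[cycle])
```

Under this model, cycle 4 holds A0 in write-back, B0 in memory access, A1 in the arithmetic unit, and A2 in decode, which agrees with the description of cycle 4 above.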

[0024] FIG. 3 shows an example of the transition from the scalar mode to the vector mode in the apparatus of FIG. 1. Suppose that thread 1 is executing the instruction sequence of task 1 and thread 2 is executing the instruction sequence of task 2, both in the scalar mode. When a vector instruction is fetched from the memory unit 20 at the address supplied from the first program counter 41, which forms part of thread 1, and the fetched vector instruction is decoded by the sequencer 50, an internal interrupt is generated in the sequencer 50. The interrupt handler 52 examines the interrupt cause and, on confirming in step S1 that a vector instruction interrupt occurred in thread 1, proceeds to steps S3 and S4. In step S3, execution of task 2 assigned to thread 2 is stopped and the resources of thread 2 (the contents of the second register set 12) are saved. In step S4, the interrupt handler 52 shifts thread 1 to the main vector mode and thread 2 to the slave vector mode. After the contents of the mode register 51 are updated accordingly, processing returns to the execution of task 1. If the vector instruction interrupt occurs in thread 2, step S2 stops thread 1 and saves its resources, shifts thread 2 to the main vector mode, and shifts thread 1 to the slave vector mode.
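
The transition of FIG. 3 can be summarized in a few lines of code. The sketch below is a simplified software analogy, assuming two threads numbered 1 and 2; the function name, the dictionary-based mode table, and the `save_area` used to hold the saved register contents are hypothetical and stand in for hardware the specification describes only at block level.

```python
def enter_vector_mode(interrupting_thread, modes, register_sets, save_area):
    """Handle a vector instruction interrupt raised by thread 1 or thread 2.

    modes:         thread id -> "scalar", "main_vector", or "slave_vector"
    register_sets: thread id -> list of register values owned by that thread
    save_area:     thread id -> saved copy of that thread's registers
    """
    # Step S1: determine which thread raised the interrupt; the other one must yield.
    other = 2 if interrupting_thread == 1 else 1

    # Step S3 (or S2 when thread 2 interrupts): stop the other thread's task
    # and save its resources, i.e. the contents of its register set.
    save_area[other] = list(register_sets[other])

    # Step S4 (or S2): the interrupting thread becomes the main vector thread,
    # and the other thread becomes the slave vector thread.
    modes[interrupting_thread] = "main_vector"
    modes[other] = "slave_vector"

    # Execution then resumes in the task that raised the interrupt.
    return interrupting_thread
```

Saving only the other thread's register set reflects the point of the design: the interrupting task keeps its own registers and gains the second set as the upper element of each vector register.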

[0025] FIG. 4 shows an example of pipeline operation in the vector mode of the apparatus of FIG. 1. Here, thread 1, composed of the first program counter 41 and the first register set 11, is in the main vector mode, and thread 2, composed of the second program counter 42 and the second register set 12, is in the slave vector mode. Threads 1 and 2 execute the vector instruction sequence of task 1 (V0, V1, V2, ...). Instruction V0 is a load instruction with register-indirect addressing, and instructions V1 and V2 are each register add instructions.

【００２６】図４によれば、ロード命令Ｖ０の実行手順
は次のとおりである。すなわち、サイクル１では、第１
のプログラムカウンタ４１から供給されたアドレスに応
じてメモリユニット２０からフェッチされた命令Ｖ０
が、シーケンサ５０によりデコードされる。このデコー
ドの結果から、命令Ｖ０がレジスタ間接アドレス指定の
ベクトルロード命令であることが認識される。そして、
第１のレジスタセット１１の中の指定された２個のレジ
スタＲｉ，Ｒｊの記憶アドレスが、読出しポートＲＤ１
１，ＲＤ１２に読出される。サイクル２では、第１の読
出しポートＲＤ１１上のアドレスと第２の読出しポート
ＲＤ１２上のアドレスとの加算、すなわち命令Ｖ０に係
る第１ベクトル要素Ｄ０のアドレス計算（Ｒｉ＋Ｒｊ）
が演算器１５により実行される。サイクル３では、演算
器１５によるアドレス計算（Ｒｉ＋Ｒｊ）の結果が、ロ
ード命令Ｖ０に係る第１のデータアクセスのためのアド
レスＡＤＲＳとしてメモリユニット２０へ供給される。
また、サイクル３では、命令Ｖ０に係る第２ベクトル要
素Ｄ１のアドレス計算（Ｒｉ＋Ｒｊ＋ａ）が演算器１５
により実行される。サイクル４では、メモリユニット２
０から読出された第１ベクトル要素Ｄ０が第１のレジス
タセット１１の書込みポートＷＴ１へ供給され、該第１
ベクトル要素Ｄ０が第１のレジスタセット１１の中の指
定されたレジスタＲｋに書込まれる。また、サイクル４
では、演算器１５によるアドレス計算（Ｒｉ＋Ｒｊ＋
ａ）の結果が、ロード命令Ｖ０に係る第２のデータアク
セスのためのアドレスＡＤＲＳとしてメモリユニット２
０へ供給される。そして、サイクル５では、メモリユニ
ット２０から読出された第２ベクトル要素Ｄ１が第２の
レジスタセット１２の書込みポートＷＴ２へ供給され、
該第２ベクトル要素Ｄ１が第２のレジスタセット１２の
中の指定されたレジスタＲｋに書込まれる。以上の動作
によりベクトルロード命令Ｖ０の実行が完了する。その
結果、第１ベクトル要素Ｄ０と第２ベクトル要素Ｄ１と
で構成された２元ベクトルが、第１のレジスタセット１
１と第２のレジスタセット１２とで構成された２元ベク
トルレジスタセットの中の指定された２元ベクトルレジ
スタＲｋに格納される。Referring to FIG. 4, the load instruction V0 is executed as follows. In cycle 1, the instruction V0, fetched from the memory unit 20 according to the address supplied from the first program counter 41, is decoded by the sequencer 50. The decoding result identifies the instruction V0 as a vector load instruction that specifies register-indirect addressing. The addresses stored in the two designated registers Ri and Rj in the first register set 11 are then read out to the read ports RD11 and RD12. In cycle 2, the arithmetic unit 15 adds the address on the first read port RD11 to the address on the second read port RD12, that is, it performs the address calculation (Ri + Rj) for the first vector element D0 of the instruction V0. In cycle 3, the result of the address calculation (Ri + Rj) is supplied to the memory unit 20 as the address ADRS for the first data access of the load instruction V0. Also in cycle 3, the arithmetic unit 15 performs the address calculation (Ri + Rj + a) for the second vector element D1 of the instruction V0. In cycle 4, the first vector element D0 read from the memory unit 20 is supplied to the write port WT1 of the first register set 11 and written to the designated register Rk in the first register set 11. Also in cycle 4, the result of the address calculation (Ri + Rj + a) is supplied to the memory unit 20 as the address ADRS for the second data access of the load instruction V0. In cycle 5, the second vector element D1 read from the memory unit 20 is supplied to the write port WT2 of the second register set 12 and written to the designated register Rk in the second register set 12. This completes execution of the vector load instruction V0. As a result, the binary (two-element) vector consisting of the first vector element D0 and the second vector element D1 is stored in the designated binary vector register Rk of the binary vector register set formed by the first register set 11 and the second register set 12.
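
The five-cycle schedule of the vector load can be replayed in a few lines of Python. This is only an illustrative sketch: the function name `vector_load`, the dictionary model of the memory and register sets, and the trace format are assumptions made for the example and do not appear in the specification.

```python
def vector_load(memory, set1, set2, ri, rj, rk, a):
    """Replay the five-cycle schedule described for the vector load V0.

    memory maps address -> value; set1 and set2 map register name -> value
    (hypothetical stand-ins for register sets 11 and 12).
    Returns a list of (cycle, action) pairs.
    """
    trace = []
    # Cycle 1: decode V0 and read the two base addresses from register set 11.
    rd11, rd12 = set1[ri], set1[rj]
    trace.append((1, "decode, read Ri and Rj"))
    # Cycle 2: the shared arithmetic unit computes the address of D0.
    addr0 = rd11 + rd12
    trace.append((2, "compute Ri+Rj"))
    # Cycle 3: first memory access starts; the unit computes the address of D1.
    d0 = memory[addr0]
    addr1 = rd11 + rd12 + a
    trace.append((3, "access D0, compute Ri+Rj+a"))
    # Cycle 4: D0 is written to Rk of set 11 (WT1); second access starts.
    set1[rk] = d0
    d1 = memory[addr1]
    trace.append((4, "write D0, access D1"))
    # Cycle 5: D1 is written to Rk of set 12 (WT2).
    set2[rk] = d1
    trace.append((5, "write D1"))
    return trace
```

The sketch makes one property of the schedule visible: the single arithmetic unit is used in two consecutive cycles (2 and 3), once per vector element, while the two halves of Rk are written in cycles 4 and 5.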

【００２７】加算命令Ｖ１の実行手順は次のとおりであ
る。すなわち、サイクル３では、第１のプログラムカウ
ンタ４１から供給されたアドレスに応じてメモリユニッ
ト２０からフェッチされた命令Ｖ１が、シーケンサ５０
によりデコードされる。このデコードの結果から、命令
Ｖ１がベクトルレジスタ加算命令であることが認識され
る。そして、第１のレジスタセット１１の中の指定され
た２個のレジスタＲ０，Ｒ１の記憶データが、読出しポ
ートＲＤ１１，ＲＤ１２に読出される。サイクル４で
は、第１の読出しポートＲＤ１１上のデータと第２の読
出しポートＲＤ１２上のデータとの加算、すなわち命令
Ｖ１に係る第１ベクトル要素の加算（Ｒ０＋Ｒ１）が演
算器１５により実行される。また、サイクル４では、第
２のレジスタセット１２の中の指定された２個のレジス
タＲ０，Ｒ１の記憶データが、読出しポートＲＤ２１，
ＲＤ２２に読出される。サイクル５では、演算器１５に
よる加算の結果、すなわち命令Ｖ１に係る第１ベクトル
要素の加算（Ｒ０＋Ｒ１）の結果が、Ｍレジスタ１６に
一時記憶される。また、サイクル５では、第１の読出し
ポートＲＤ２１上のデータと第２の読出しポートＲＤ２
２上のデータとの加算、すなわち命令Ｖ１に係る第２ベ
クトル要素の加算（Ｒ０＋Ｒ１）が演算器１５により実
行される。サイクル６では、Ｍレジスタ１６に一時記憶
されていた加算結果、すなわち命令Ｖ１に係る第１ベク
トル要素の加算（Ｒ０＋Ｒ１）の結果が第１のレジスタ
セット１１の書込みポートＷＴ１へ供給され、該結果が
第１のレジスタセット１１の中の指定されたレジスタＲ
０に書込まれる。また、サイクル６では、演算器１５に
よる加算の結果、すなわち命令Ｖ１に係る第２ベクトル
要素の加算（Ｒ０＋Ｒ１）の結果が、Ｍレジスタ１６に
一時記憶される。そして、サイクル７では、Ｍレジスタ
１６に一時記憶されていた加算結果、すなわち命令Ｖ１
に係る第２ベクトル要素の加算（Ｒ０＋Ｒ１）の結果が
第２のレジスタセット１２の書込みポートＷＴ２へ供給
され、該結果が第２のレジスタセット１２の中の指定さ
れたレジスタＲ０に書込まれる。以上の動作によりベク
トルレジスタ加算命令Ｖ１の実行が完了する。その結
果、第１のレジスタセット１１と第２のレジスタセット
１２とで構成された２元ベクトルレジスタセットの中の
指定された２個のベクトルレジスタＲ０，Ｒ１の記憶内
容の和が、指定されたベクトルレジスタＲ０に格納され
る。なお、図４には、次のベクトルレジスタ加算命令Ｖ
２の実行過程において、第１のレジスタセット１１の中
の指定された２個のレジスタＲｘ，Ｒｙの記憶データが
読出しポートＲＤ１１，ＲＤ１２に読出されることまで
が示されている。The addition instruction V1 is executed as follows. In cycle 3, the instruction V1, fetched from the memory unit 20 according to the address supplied from the first program counter 41, is decoded by the sequencer 50. The decoding result identifies the instruction V1 as a vector register addition instruction. The data stored in the two designated registers R0 and R1 in the first register set 11 are then read out to the read ports RD11 and RD12. In cycle 4, the arithmetic unit 15 adds the data on the first read port RD11 to the data on the second read port RD12, that is, it performs the addition (R0 + R1) of the first vector elements for the instruction V1. Also in cycle 4, the data stored in the two designated registers R0 and R1 in the second register set 12 are read out to the read ports RD21 and RD22. In cycle 5, the result of the addition by the arithmetic unit 15, that is, the result of the first-element addition (R0 + R1) for the instruction V1, is temporarily stored in the M register 16. Also in cycle 5, the arithmetic unit 15 adds the data on the first read port RD21 to the data on the second read port RD22, that is, it performs the addition (R0 + R1) of the second vector elements for the instruction V1. In cycle 6, the addition result held in the M register 16, that is, the result of the first-element addition (R0 + R1), is supplied to the write port WT1 of the first register set 11 and written to the designated register R0 in the first register set 11. Also in cycle 6, the result of the second-element addition (R0 + R1) by the arithmetic unit 15 is temporarily stored in the M register 16. In cycle 7, the addition result held in the M register 16, that is, the result of the second-element addition (R0 + R1), is supplied to the write port WT2 of the second register set 12 and written to the designated register R0 in the second register set 12. This completes execution of the vector register addition instruction V1. As a result, the sum of the contents of the two designated vector registers R0 and R1 in the binary (two-element) vector register set formed by the first register set 11 and the second register set 12 is stored in the designated vector register R0. FIG. 4 also shows the execution of the next vector register addition instruction V2 up to the point where the data stored in the two designated registers Rx and Ry in the first register set 11 are read out to the read ports RD11 and RD12.
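
The staggered use of the shared arithmetic unit and the M register in cycles 3 to 7 can be sketched as follows. The function name `vector_add` and the dictionary model of the register sets are hypothetical; the variables `alu` and `m` stand in for the output of the arithmetic unit 15 and the M register 16.

```python
def vector_add(set1, set2, ra, rb, rd):
    """Replay cycles 3 to 7 of the vector register addition V1.

    set1 and set2 map register name -> value (stand-ins for sets 11 and 12).
    Computes rd <- ra + rb element by element and returns both sums.
    """
    # Cycle 3: decode V1; read element-1 operands onto RD11 and RD12.
    rd11, rd12 = set1[ra], set1[rb]
    # Cycle 4: the arithmetic unit adds element 1; element-2 operands are
    # read onto RD21 and RD22 in the same cycle.
    alu = rd11 + rd12
    rd21, rd22 = set2[ra], set2[rb]
    # Cycle 5: the element-1 sum is latched in M; the unit adds element 2.
    m = alu
    alu = rd21 + rd22
    # Cycle 6: M is written to set 11 through WT1; the element-2 sum is latched.
    set1[rd] = m
    first_sum = m
    m = alu
    # Cycle 7: M is written to set 12 through WT2.
    set2[rd] = m
    return first_sum, m
```

Because there is only one arithmetic unit, the two element additions occupy consecutive cycles (4 and 5) rather than running side by side, and the M register decouples each addition from its write-back.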

【００２８】以上のとおり、スカラモードでは第１及び
第２のレジスタセット１１，１２の各々がスカラレジス
タセットとして使用されるのに対し、ベクトルモードで
は第１及び第２のレジスタセット１１，１２が２元ベク
トルレジスタセットとして使用される。つまり、２元ベ
クトル命令を実行するタスクには第１及び第２のレジス
タセット１１，１２が２元ベクトルレジスタセットとし
て自動的に割付けられる結果、ハードウェア資源の高効
率利用が達成される。As described above, in the scalar mode each of the first and second register sets 11 and 12 is used as a scalar register set, whereas in the vector mode the first and second register sets 11 and 12 are used together as a binary (two-element) vector register set. In other words, the first and second register sets 11 and 12 are automatically allocated as a binary vector register set to a task that executes binary vector instructions, so that hardware resources are used efficiently.

【００２９】さて、スレッド１及び２がベクトルモード
でタスク１の命令列を実行している間にメモリユニット
２０からスカラモードへの復帰のための専用割込み命令
がフェッチされ、かつ該フェッチされた専用割込み命令
がシーケンサ５０によってデコードされると、該シーケ
ンサ５０に内部割込みが発生する。割込み処理ハンドラ
５２は、スカラモードへの復帰のための専用割込み命令
がフェッチされたことを確認すると、スレッド１及び２
をそれぞれスカラモードへ移行させる。これに応じて、
モードレジスタ５１の記憶内容が更新される。従ベクト
ルモードであったスレッド２では、退避されていた第２
のレジスタセット１２の記憶内容の復帰が行われた後
に、タスク２の復帰アドレスからその実行が再開され
る。主ベクトルモードであったスレッド１では、引き続
きタスク１が実行される。Now suppose that, while threads 1 and 2 are executing the instruction sequence of task 1 in the vector mode, a dedicated interrupt instruction for returning to the scalar mode is fetched from the memory unit 20 and decoded by the sequencer 50. An internal interrupt then occurs in the sequencer 50. On confirming that the dedicated interrupt instruction for returning to the scalar mode has been fetched, the interrupt handler 52 shifts threads 1 and 2 each to the scalar mode. The contents of the mode register 51 are updated accordingly. In thread 2, which was in the slave vector mode, the saved contents of the second register set 12 are restored, after which execution of task 2 resumes from its return address. In thread 1, which was in the main vector mode, task 1 continues to execute.
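
The return to the scalar mode can be sketched as a small state update. The mode names, the function `return_to_scalar`, and the layout of the saved context (a pair of register contents and return address) are assumptions made for illustration; the specification does not describe how the saved state is stored.

```python
SCALAR, MAIN_VECTOR, SLAVE_VECTOR = "scalar", "main_vector", "slave_vector"

def return_to_scalar(mode_register, register_sets, saved):
    """Handle the dedicated return-to-scalar interrupt.

    mode_register: thread -> mode (stand-in for mode register 51).
    register_sets: thread -> dict of register values.
    saved: thread -> (register contents, return address), a hypothetical
           layout for the context saved when the thread became a slave.
    Returns thread -> address at which its suspended task resumes.
    """
    resume_at = {}
    for thread, mode in list(mode_register.items()):
        if mode == SLAVE_VECTOR:
            # Restore the register set that was lent to the vector task.
            registers, return_address = saved.pop(thread)
            register_sets[thread] = dict(registers)
            resume_at[thread] = return_address
        # Every thread ends up in the scalar mode; the former main-vector
        # thread simply keeps running its task.
        mode_register[thread] = SCALAR
    return resume_at
```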

【００３０】以上のとおり、図１の情報処理装置によれ
ば、その動作中にフェッチされた命令の種類に応じて動
的にモード切換えが行われるので、行列演算の有無など
のプログラムの特質に適応したハードウェア資源の利用
が達成される。As described above, the information processing apparatus of FIG. 1 switches modes dynamically according to the type of instruction fetched during operation, so that hardware resources are used in a manner adapted to the characteristics of the program, such as whether it performs matrix operations.

【００３１】図５は、本発明に係るマルチスレッド型パ
イプライン方式の情報処理装置の他の構成例を示してい
る。図５の情報処理装置では、スレッド数が３に拡張さ
れている。図５において、データパスユニット１０は、
第１及び第２のレジスタセット１１，１２と、演算器１
５と、Ｍレジスタ１６とに加えて、複数個のレジスタ
（Ｒ０，Ｒ１，Ｒ２，…）を有する第３のレジスタセッ
ト１３を備えている。第３のレジスタセット１３は、１
つの書込みポートＷＴ３と、第１及び第２の読出しポー
トＲＤ３１，ＲＤ３２とを備えている。また、図５の命
令フェッチユニット４０は、第１及び第２のプログラム
カウンタ４１，４２と、第１及び第２の命令バッファ４
５，４６とに加えて、第３のプログラムカウンタ（ＰＣ
３）４３と、第３の命令バッファ（ＩＢ３）４７とを備
えている。図５中の他の構成は、図１と同様である。FIG. 5 shows another configuration example of a multi-thread pipeline type information processing apparatus according to the present invention. In the information processing apparatus of FIG. 5, the number of threads is expanded to three. In FIG. 5, the data path unit 10 includes, in addition to the first and second register sets 11 and 12, the arithmetic unit 15, and the M register 16, a third register set 13 having a plurality of registers (R0, R1, R2, ...). The third register set 13 has one write port WT3 and first and second read ports RD31 and RD32. The instruction fetch unit 40 of FIG. 5 includes, in addition to the first and second program counters 41 and 42 and the first and second instruction buffers 45 and 46, a third program counter (PC3) 43 and a third instruction buffer (IB3) 47. The rest of the configuration in FIG. 5 is the same as in FIG. 1.

【００３２】図５の情報処理装置では、第１、第２及び
第３のレジスタセット１１，１２，１３の各々をスカラ
レジスタセットとして使用してスカラ命令を実行するス
カラモードと、第１、第２及び第３のレジスタセット１
１，１２，１３のうちの２個のレジスタセットを２元ベ
クトルレジスタセットとして使用してベクトル命令を実
行するベクトルモードとが切換えられる。つまり、ベク
トル命令を実行する１個のタスクと、スカラ命令のみを
実行する他の１個のタスクとの同時処理が可能である。The information processing apparatus of FIG. 5 switches between a scalar mode, in which scalar instructions are executed using each of the first, second, and third register sets 11, 12, and 13 as a scalar register set, and a vector mode, in which vector instructions are executed using two of the first, second, and third register sets 11, 12, and 13 as a binary (two-element) vector register set. That is, one task that executes vector instructions and one other task that executes only scalar instructions can be processed concurrently.

【００３３】図６〜図８は、図５の情報処理装置におけ
るスカラモードからベクトルモードへの移行処理の例を
示している。ここで、スレッド１がタスク１の命令列
を、スレッド２がタスク２の命令列を、スレッド３がタ
スク３の命令列をそれぞれスカラモードで実行している
ものとする。このとき、スレッド１の一部を構成する第
１のプログラムカウンタ４１から供給されたアドレスに
応じてメモリユニット２０からベクトル命令がフェッチ
され、かつ該フェッチされたベクトル命令がシーケンサ
５０によってデコードされると、該シーケンサ５０に内
部割込みが発生する。割込み処理ハンドラ５２は、図６
に示すように、割込み要因を調べてスレッド１でベクト
ル命令割込みが発生したことをステップＳ１１で確認
し、かつスレッド２がスカラモードであることをモード
レジスタ５１の記憶内容からステップＳ１３で確認する
と、ステップＳ１４及びＳ１５へ処理を進める。ステッ
プＳ１４では、スレッド２に割付けられたタスク２の実
行が停止され、かつスレッド２の資源（第２のレジスタ
セット１２の記憶内容）が退避される。ステップＳ１５
では、割込み処理ハンドラ５２は、スレッド１を主ベク
トルモードへ、スレッド２を従ベクトルモードへそれぞ
れ移行させる。これに応じてモードレジスタ５１の記憶
内容が更新された後、タスク１の実行へ復帰する。この
際、スレッド３はスカラモードを維持する。したがっ
て、タスク３の処理は継続される。なお、スレッド２又
は３でベクトル命令割込みが発生した場合には、ステッ
プＳ１２において、各々の場合に応じた割込み受付け処
理が実行される。FIGS. 6 to 8 show an example of the transition from the scalar mode to the vector mode in the information processing apparatus of FIG. 5. Assume that thread 1 is executing the instruction sequence of task 1, thread 2 that of task 2, and thread 3 that of task 3, each in the scalar mode. When a vector instruction is fetched from the memory unit 20 according to the address supplied from the first program counter 41, which forms part of thread 1, and the fetched vector instruction is decoded by the sequencer 50, an internal interrupt occurs in the sequencer 50. As shown in FIG. 6, the interrupt handler 52 examines the interrupt cause and confirms in step S11 that a vector instruction interrupt has occurred in thread 1. If it also confirms in step S13, from the contents of the mode register 51, that thread 2 is in the scalar mode, it proceeds to steps S14 and S15. In step S14, execution of task 2 assigned to thread 2 is stopped and the resources of thread 2 (the contents of the second register set 12) are saved. In step S15, the interrupt handler 52 shifts thread 1 to the main vector mode and thread 2 to the slave vector mode. After the contents of the mode register 51 are updated accordingly, processing returns to the execution of task 1. Thread 3 remains in the scalar mode throughout, so processing of task 3 continues. If a vector instruction interrupt occurs in thread 2 or thread 3, the interrupt acceptance processing appropriate to that case is executed in step S12.
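
Steps S11 to S15 of FIG. 6 amount to a short decision sequence, sketched below. The function name, the string return values, and the dictionary model of the mode register and register sets are assumptions for illustration; steps S12 and S21 onward are represented only by labels because this passage does not detail them.

```python
SCALAR, MAIN_VECTOR, SLAVE_VECTOR = "scalar", "main_vector", "slave_vector"

def on_vector_instruction_interrupt(thread, modes, register_sets, saved):
    """Sketch of steps S11 to S15 for the three-thread apparatus.

    modes: thread -> mode (stand-in for mode register 51).
    saved: receives the register contents of the thread that is suspended.
    """
    # S11: did the vector-instruction interrupt come from thread 1?
    if thread != 1:
        return "S12"  # case-specific acceptance processing
    # S13: is thread 2 still in the scalar mode?
    if modes[2] != SCALAR:
        return "S21"  # continue with the priority check of FIG. 7
    # S14: stop task 2 and save the contents of the second register set.
    saved[2] = dict(register_sets[2])
    # S15: thread 1 becomes the main vector thread, thread 2 the slave.
    modes[1] = MAIN_VECTOR
    modes[2] = SLAVE_VECTOR
    # Thread 3 is untouched and keeps running task 3 in the scalar mode.
    return "resume task 1"
```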

【００３４】スレッド１でベクトル命令割込みが発生し
ており、かつスレッド２及び３が既にベクトルモードへ
移行している場合には、割込み処理ハンドラ５２は、図
６中のステップＳ１３から図７中のステップＳ２１へ処
理を進める。ここで、スレッド２及び３がベクトルモー
ドであっても、該スレッド２及び３のうちの主ベクトル
モードのスレッドよりもスレッド１が高い優先順位を持
つことがステップＳ２１〜Ｓ２３で確認されると、割込
み処理ハンドラ５２は、ステップＳ２４〜Ｓ２９へ処理
を進める。その結果、スレッド２及び３のベクトルモー
ドが解除され、スレッド１が主ベクトルモードへ、スレ
ッド２が従ベクトルモードへそれぞれ移行する。この
際、スカラモードの待機タスクがある場合には、該待機
タスクがスレッド３に割当てられる。スレッド２及び３
のうちの主ベクトルモードのスレッドよりもスレッド１
が高い優先順位を持たないことがステップＳ２１〜Ｓ２
３で確認されると、割込み処理ハンドラ５２は、スレッ
ド１の主ベクトルモードへの移行を待たせるように、図
８中のステップＳ３１〜Ｓ３４へ処理を進める。この
際、スカラモードの待機タスクがある場合には、該待機
タスクがスレッド１に割当てられる。If a vector instruction interrupt has occurred in thread 1 and threads 2 and 3 have already shifted to the vector mode, the interrupt handler 52 proceeds from step S13 in FIG. 6 to step S21 in FIG. 7. Even though threads 2 and 3 are in the vector mode, if steps S21 to S23 confirm that thread 1 has a higher priority than whichever of threads 2 and 3 is in the main vector mode, the interrupt handler 52 proceeds to steps S24 to S29. As a result, the vector mode of threads 2 and 3 is cancelled, and thread 1 shifts to the main vector mode and thread 2 to the slave vector mode. If a scalar-mode task is waiting at that time, it is assigned to thread 3. If steps S21 to S23 confirm that thread 1 does not have a higher priority than whichever of threads 2 and 3 is in the main vector mode, the interrupt handler 52 proceeds to steps S31 to S34 in FIG. 8 so that the transition of thread 1 to the main vector mode is made to wait. If a scalar-mode task is waiting at that time, it is assigned to thread 1.
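
The priority decision of FIGS. 7 and 8 can be summarized in a simplified sketch. It assumes that a larger number means a higher priority and that waiting scalar tasks are held in a simple list; neither detail is stated in the specification, and the saving and restoring of register contents in steps S24 to S29 is omitted.

```python
SCALAR, MAIN_VECTOR, SLAVE_VECTOR = "scalar", "main_vector", "slave_vector"

def arbitrate_vector_request(priority, modes, waiting_scalar_tasks):
    """Thread 1 requests the vector mode while threads 2 and 3 hold it.

    priority: thread -> number (assumption: larger means higher priority).
    modes: thread -> mode; exactly one of threads 2 and 3 must be MAIN_VECTOR.
    Returns (thread that becomes free for a scalar task, task given to it).
    """
    main = next(t for t in (2, 3) if modes[t] == MAIN_VECTOR)
    if priority[1] > priority[main]:
        # S24 to S29: cancel the existing pairing and regroup around thread 1.
        modes[1], modes[2], modes[3] = MAIN_VECTOR, SLAVE_VECTOR, SCALAR
        free_thread = 3
    else:
        # S31 to S34: thread 1 has to wait; the modes stay as they are.
        free_thread = 1
    task = waiting_scalar_tasks.pop(0) if waiting_scalar_tasks else None
    return free_thread, task
```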

【００３５】図５の情報処理装置におけるベクトルモー
ドからスカラモードへの復帰は、図１の場合と同様であ
るので、説明を省略する。The return from the vector mode to the scalar mode in the information processing apparatus of FIG. 5 is the same as in the case of FIG. 1, so its description is omitted.

【００３６】図９は、図５の情報処理装置における外部
割込みＩＮＴの受付け処理の例を示している。外部割込
みＩＮＴの処理はスレッド資源の停止及び退避処理を必
要とする。ベクトルモードのスレッド資源の退避に比べ
て、スカラモードのスレッド資源の退避の方が簡便であ
るという事情がある。そこで、割込み処理ハンドラ５２
は、図９中のステップＳ４１〜Ｓ４５に従って、スカラ
モードのスレッドに優先的に外部割込みの処理を実行さ
せる。これにより、ベクトルモードのタスクを中断する
機会を減らすことができる。FIG. 9 shows an example of the acceptance processing for an external interrupt INT in the information processing apparatus of FIG. 5. Processing an external interrupt INT requires stopping a thread and saving its resources. Saving the resources of a thread in the scalar mode is simpler than saving those of a thread in the vector mode. The interrupt handler 52 therefore follows steps S41 to S45 in FIG. 9 so that a thread in the scalar mode preferentially executes the external interrupt processing. This reduces the chances of interrupting a task in the vector mode.
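
The preference for scalar-mode threads can be expressed as a simple selection rule. In this sketch the function name and the tie-breaking order (lowest thread number first) are assumptions, as is the fallback used when no thread is in the scalar mode, which this passage does not specify.

```python
SCALAR, MAIN_VECTOR, SLAVE_VECTOR = "scalar", "main_vector", "slave_vector"

def pick_thread_for_external_interrupt(modes):
    """Choose the thread that services an external interrupt INT.

    modes: thread -> mode (stand-in for mode register 51).
    """
    # Prefer a scalar-mode thread: only one register set has to be saved.
    for thread in sorted(modes):
        if modes[thread] == SCALAR:
            return thread
    # Assumption: if every thread is in the vector mode, fall back to the
    # lowest-numbered thread.
    return min(modes)
```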

【００３７】なお、４個以上のスレッドを有する情報処
理装置への拡張は、図５から明らかであろう。一般化す
れば、複数個のレジスタセットのうちの個々のレジスタ
セットをスカラレジスタセットとして使用してスカラ命
令を実行するスカラモードと、該複数個のレジスタセッ
トのうちのＮ個（Ｎは２以上の整数）のレジスタセット
をＮ元ベクトルレジスタセットとして使用してベクトル
命令を実行するベクトルモードとが切換えられる。Extension to an information processing apparatus having four or more threads will be apparent from FIG. 5. In general terms, the apparatus switches between a scalar mode, in which scalar instructions are executed using each of a plurality of register sets as a scalar register set, and a vector mode, in which vector instructions are executed using N of the plurality of register sets (N being an integer of 2 or more) as an N-ary vector register set.
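
The generalization to an N-ary vector register set can be pictured as viewing the same-named register in N register sets as one vector register. The helper names `read_vector` and `write_vector` and the dictionary model below are hypothetical and serve only to illustrate the grouping.

```python
def read_vector(register_sets, group, name):
    """Read register `name` from each set in `group` as one N-element vector."""
    return tuple(register_sets[s][name] for s in group)

def write_vector(register_sets, group, name, values):
    """Write an N-element vector into register `name` of each set in `group`."""
    if len(values) != len(group):
        raise ValueError("vector length must equal the number of grouped sets")
    for s, value in zip(group, values):
        register_sets[s][name] = value

# Four register sets; sets 1, 2 and 4 are grouped as a 3-ary vector register
# set while set 3 stays available to a scalar task.
sets = {s: {"R0": 0, "R1": 0} for s in (1, 2, 3, 4)}
write_vector(sets, (1, 2, 4), "R0", (10, 20, 30))
```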

【００３８】スカラモードにおいてベクトル命令を含む
関数の呼び出し命令（ＣＡＬＬ命令）が実行されたとき
にベクトルモードへ切換え、ベクトルモードにおいてベ
クトル命令を含まない関数への復帰命令（ＲＥＴＵＲＮ
命令）が実行されたときにスカラモードへ切換えること
としてもよい。また、複数個のスレッドの各々に割付け
られたタスクがベクトル命令を実行するタスクであるか
否かを示す情報をオペレーティングシステムに持たせて
おき、該情報に応じて、タスク切換えの際に、次の実行
タスクがベクトル命令を実行するタスクであればベクト
ルモードへ切換え、次の実行タスクがベクトル命令を実
行しないタスクであればスカラモードへ切換えることと
してもよい。Alternatively, the apparatus may switch to the vector mode when a call instruction (CALL instruction) for a function containing vector instructions is executed in the scalar mode, and switch to the scalar mode when a return instruction (RETURN instruction) to a function containing no vector instructions is executed in the vector mode. As a further alternative, the operating system may hold information indicating whether the task assigned to each of the plurality of threads is a task that executes vector instructions. At each task switch, according to this information, the apparatus switches to the vector mode if the next task to run executes vector instructions, and to the scalar mode if it does not.
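
Both alternative triggers reduce to small decision functions, sketched below. The function names, the string encoding of instructions and modes, and the per-task flag table are assumptions for illustration; the specification does not say how the function or task attribute is recorded.

```python
SCALAR, VECTOR = "scalar", "vector"

def mode_after_call_or_return(instruction, target_has_vector, current_mode):
    """Mode after a CALL or RETURN, given whether the function being
    entered (CALL) or returned to (RETURN) contains vector instructions."""
    if instruction == "CALL" and current_mode == SCALAR and target_has_vector:
        return VECTOR
    if instruction == "RETURN" and current_mode == VECTOR and not target_has_vector:
        return SCALAR
    return current_mode

def mode_for_next_task(task_uses_vector, next_task):
    """Mode chosen at a task switch from a per-task flag table, a
    hypothetical form of the information held by the operating system."""
    return VECTOR if task_uses_vector[next_task] else SCALAR
```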

【００３９】[0039]

【発明の効果】以上説明してきたとおり、本発明に係る
マルチスレッド型の情報処理装置によれば、複数個のレ
ジスタセットを個々にスカラレジスタセットとして使用
するモードと、そのうちのＮ個（Ｎは２以上の整数）の
レジスタセットをベクトルレジスタセットとして使用す
るモードとを自由に切換えることができるようにしたの
で、行列演算の有無などのプログラムの特質に適応して
情報処理の高速化を達成しつつ、ハードウェア資源の利
用効率を高めることができる。As described above, the multi-thread information processing apparatus according to the present invention can freely switch between a mode in which a plurality of register sets are each used as a scalar register set and a mode in which N of those register sets (N being an integer of 2 or more) are used as a vector register set. It can therefore adapt to the characteristics of a program, such as whether it performs matrix operations, and improve the utilization efficiency of hardware resources while speeding up information processing.

[Brief description of the drawings]

【図１】本発明に係る情報処理装置の構成例を示すブロ
ック図である。FIG. 1 is a block diagram illustrating a configuration example of an information processing apparatus according to the present invention.

【図２】図１の情報処理装置のスカラモードにおける動
作例を示すタイミングチャート図である。FIG. 2 is a timing chart illustrating an operation example of the information processing apparatus in FIG. 1 in a scalar mode.

【図３】図１の情報処理装置におけるスカラモードから
ベクトルモードへの移行処理の例を示すフローチャート
図である。FIG. 3 is a flowchart illustrating an example of a transition process from a scalar mode to a vector mode in the information processing apparatus of FIG. 1.

【図４】図１の情報処理装置のベクトルモードにおける
動作例を示すタイミングチャート図である。FIG. 4 is a timing chart illustrating an operation example of the information processing apparatus of FIG. 1 in a vector mode.

【図５】本発明に係る情報処理装置の他の構成例を示す
ブロック図である。FIG. 5 is a block diagram illustrating another configuration example of the information processing apparatus according to the present invention.

【図６】図５の情報処理装置におけるスカラモードから
ベクトルモードへの移行処理の例を示すフローチャート
図である。FIG. 6 is a flowchart illustrating an example of a transition process from a scalar mode to a vector mode in the information processing apparatus of FIG. 5.

【図７】図６に続く処理の例を示すフローチャート図で
ある。FIG. 7 is a flowchart illustrating an example of processing subsequent to FIG. 6.

【図８】図７に続く処理の例を示すフローチャート図で
ある。FIG. 8 is a flowchart illustrating an example of processing subsequent to FIG. 7.

【図９】図５の情報処理装置における外部割込みの受付
け処理の例を示すフローチャート図である。FIG. 9 is a flowchart illustrating an example of an external interrupt acceptance process in the information processing apparatus of FIG. 5.

[Explanation of symbols]

１０ データパスユニット
１１，１２，１３ レジスタセット
１５ 演算器（演算手段）
１６ Ｍレジスタ
２０ メモリユニット
３０ 制御ユニット（制御手段）
４０ 命令フェッチユニット
４１，４２，４３ プログラムカウンタ
４５，４６，４７ 命令バッファ
５０ シーケンサ
５１ モードレジスタ
５２ 割込み処理ハンドラ
６０，６１，６２ マルチプレクサ
10: data path unit
11, 12, 13: register sets
15: arithmetic unit (computing means)
16: M register
20: memory unit
30: control unit (control means)
40: instruction fetch unit
41, 42, 43: program counters
45, 46, 47: instruction buffers
50: sequencer
51: mode register
52: interrupt handler
60, 61, 62: multiplexers

Claims

[Claims]

1. A multi-thread information processing apparatus having a plurality of threads, comprising: a plurality of register sets, each having a plurality of registers and each belonging to one of the plurality of threads; computing means, shared by the plurality of threads, for performing operations based on data supplied from the plurality of register sets; and control means for controlling the plurality of register sets and the computing means so as to switch between a scalar mode, in which scalar instructions are executed using each of the plurality of register sets as a scalar register set, and a vector mode, in which vector instructions are executed using N of the plurality of register sets (N being an integer of 2 or more) as an N-ary vector register set.

2. The information processing apparatus according to claim 1, wherein the control means has a function of dynamically switching to either the scalar mode or the vector mode in response to an event occurring during operation of the information processing apparatus.

3. The information processing apparatus according to claim 1, wherein the control means has a function of switching to the vector mode in response to an internal interrupt generated when a vector instruction is fetched in the scalar mode, and of switching to the scalar mode in response to an internal interrupt generated when a dedicated interrupt instruction is fetched or decoded in the vector mode.

4. The information processing apparatus according to claim 1, wherein the control means has a function of switching to the vector mode when a call instruction for a function containing vector instructions is executed in the scalar mode, and of switching to the scalar mode when a return instruction to a function containing no vector instructions is executed in the vector mode.

5. The information processing apparatus according to claim 1, wherein the control means has a function of switching, at a task switch and according to information indicating whether the task assigned to each of the plurality of threads is a task that executes vector instructions, to the vector mode if the next task to run is a task that executes vector instructions and to the scalar mode if the next task to run is a task that does not execute vector instructions.

6. The information processing apparatus according to claim 1, wherein the control means has a function of causing a thread in the scalar mode, among the plurality of threads, to preferentially execute external interrupt processing.