JPS62143177A

JPS62143177A - Vector processor

Info

Publication number: JPS62143177A
Application number: JP28409085A
Authority: JP
Inventors: Tomoo Aoyama; 青山　智夫; Hiroshi Murayama; 浩村山
Original assignee: Hitachi Ltd; Hitachi Computer Engineering Co Ltd
Current assignee: Hitachi Ltd; Hitachi Computer Engineering Co Ltd
Priority date: 1985-12-17
Filing date: 1985-12-17
Publication date: 1987-06-26

Abstract

PURPOSE: To facilitate program debugging by suspending the decoding of vector instructions while a first bit is set and by executing scalar processing and vector processing serially while a second bit is set. CONSTITUTION: When a vector processor produces an incorrect result, the location in the program that caused it can be found by suspending vector processing at an arbitrary position in a DO loop and writing data out to a main storage device 1, even when the value of a variable or array referenced in the DO loop resides not in the main storage device 1 but in a register 6. Even when a variable or array referenced in the DO loop exists only in the register 6 of the processor, a debugging value can be assigned to it and the vector processing retried. In addition, the parallel execution of vector processing and scalar processing that is inherent in the vector processor can be suppressed, so that faults in the ordering of main storage references caused by parallel execution, and faults caused by rewriting of the vector instruction sequence, can be easily isolated. Consequently, program debugging is made easy.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of application of the invention]

The present invention relates to a vector processing device, and more particularly to a vector processing device suited to tracing the cause when vector processing produces an incorrect result.

[Background of the invention]

In a processing device oriented toward vector computation, and in a vector processing device in particular, debugging means more powerful than those of a general-purpose computer must be provided. For example, when a program consisting of process A, which has a DO block structure, and process B, which does not, is executed on a vector processing device, process A is executed by a vector arithmetic unit or a vector data transfer circuit because it has a DO block structure. Hereinafter, the arithmetic units and data transfer circuits in the processing device are collectively called resources.

Vector computation generally involves a very large number of operations and therefore consumes considerable time even with high-speed vector resources. For this reason, control is commonly arranged so that process B (scalar processing) starts without waiting for process A (vector processing) to complete. However, parallel processing is performed only when the program logic allows processes A and B to be executed independently.

If the program logic is correct and reflects the programmer's intent, parallel processing of this kind still yields valid results. If, however, the program logic contains an error contrary to the programmer's intent, parallel processing makes the error hard to detect. For example, array C read in process A and area D written in process B do not overlap in main memory in a correct program, but suppose that a DO control variable has been set incorrectly, so that the pointer into array C grows and points into part of area D. In such a case, if the store in process B overlaps in time with the read of array C in process A, which is vector processing, the order of the store and the vector read can change, because the time taken by a main memory reference in vector processing varies greatly with the processing conditions, and the results are no longer reproducible. Here, the processing conditions mean, for example, the busy state of the main memory ports. Consequently, the temporal order of the read of array C in vector processing and the store in scalar processing is no longer as specified in the program; sometimes a correct result is obtained and sometimes an incorrect one.
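
The dependence of the result on timing can be illustrated with a small sketch. It is not taken from the patent: the memory layout, the values, and the names `run`, `process_a`, and `process_b` are all assumptions chosen for illustration, and the two possible orderings are selected explicitly rather than by real hardware timing.

```python
def run(store_first: bool) -> list[int]:
    # Hypothetical layout: array C occupies memory[0:4], area D occupies memory[4:6].
    memory = [1, 2, 3, 4, 0, 0]
    wrong_bound = 5  # erroneous DO control value; the correct bound would be 4

    def process_a() -> list[int]:
        # Stands in for the vector read of C. With the wrong bound it also
        # reads memory[4], which belongs to area D.
        return memory[0:wrong_bound]

    def process_b() -> None:
        # Stands in for the scalar store into area D.
        memory[4] = 99

    if store_first:
        process_b()
        return process_a()
    result = process_a()
    process_b()
    return result


read_then_store = run(store_first=False)
store_then_read = run(store_first=True)
print(read_then_store, store_then_read)
```

The same faulty program yields two different values for the last element read, depending only on which memory reference happens to complete first.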

Furthermore, when a vector instruction is modified by the immediately preceding scalar processing, the vector processing device, which prefetches instructions, may already have fetched the unmodified vector instruction into its instruction buffer and be processing it. When this instruction prefetch involves data transfer between main memory and buffer storage, contention with the other data transfer circuits of the processing device causes the instruction rewrite to take effect in some runs and not in others.

Achieving parallelism in this way thus means breaking the processing order specified in the program.

On the other hand, an analysis of the work of debugging a program shows that debugging is possible only on the premise that the logic written in the program is carried out in order, and that processing of the next statement starts only after the result of the current statement has been determined.

This premise is incompatible with parallel processing and with vector processing. Therefore, some interface is needed between the program and the hardware of the processing device, either to serialize the logical operations of the processing device or to guarantee the order of some of them.

Conventionally, to address the difficulty of debugging programs under parallel processing of this kind, a maintenance architecture has been added to a data flow machine (Hiraki, Nishiguchi, Sekiguchi, Shimada, "Maintenance Architecture of the SIGMA-1 Data-Driven Computer for Scientific and Technical Computation", Proceedings of the 31st Conference of the Information Processing Society of Japan, 6D-3). A data flow machine is a kind of computer that performs parallel processing, but the arithmetic processing of each instruction is executed independently. Therefore, by suspending the computer at a specified point and reading out its internal registers, memory, and so on through the maintenance architecture, the state of the computation at that point can be known.

In contrast, vector computation expands DO loop processing into a set of vector instructions, whose method of computation differs from that of scalar processing, and executes them, so stopping a vector computation at an arbitrary point is extremely difficult.

Even if that could be done, it is questionable how much information the partially transformed data in the pipeline of a vector arithmetic unit could provide for program debugging. It is therefore considered appropriate to designate suitable break points for vector processing between vector instructions, and to stop vector processing when it reaches such a break point. We now consider what stopping a vector computation means for a vector processing device.

Vector processing begins with the decoding of a vector instruction and ends when all vector elements have been processed. Because one vector instruction processes multiple data items, the time between the start and the end of vector processing is generally one to two orders of magnitude longer than the execution of a scalar instruction. Consequently, even if the start of execution of vector instructions is suspended in the same way as for scalar processing, vector instructions that have already been started keep running, without stopping, for several tens of machine cycles or more. Debugging against program source code, however, presupposes that the internal operations of the computer have completed and that the results have been written to storage means such as main memory, so it is necessary to wait for the resources in the vector processor to stop. Therefore, to debug vector processing, the issue of vector instructions must be suspended at the desired program position, and the various debugging operations must begin only after the vector instructions executed before this break point have completed.

Debugging means converting bit patterns held in the storage means of the processing device into characters and outputting them to an external device. The data conversion methods used on general-purpose machines can be used for this. Therefore, if the registers in the vector processor can be written out to main memory, the existing software assets of conventional methods can be used. To simplify the discussion, the storage means in the vector processor are limited here to vector registers.

To write the contents of a vector register to main memory, a vector store instruction must be issued.

The problem, however, is how to issue this vector store instruction. It cannot be solved by having a compiler debug mode embed vector store instructions in the vector instruction sequence in advance. With this store embedding method, vector processing itself runs without stopping, and only a trace of the vector computation is obtained after the series of computations has completed. Consequently, the following operations, which are important for program debugging, cannot be performed:

1. stopping program processing, setting an arbitrary data pattern in the storage means of the processing device at that point, and restarting the processing device;

2. at that stopping point, outputting the data at an arbitrary location in the storage means of the processing device, and restarting the processing device.

Moreover, specifying the compiler debug mode changes the vector instruction sequence by inserting extra store instructions, so that the processing state of the vector processor, in particular the state of data processing involving main memory, differs from that outside debug mode. This is also a problem.

The discussion above concerned vector registers, but exactly the same holds when the storage means in the vector processor are vector mask registers or other registers. For these reasons, no debugging method for vector processing devices has yet been established.

[Purpose of the invention]

An object of the present invention is to provide a vector processing device that facilitates program debugging.

[Summary of the invention]

To enable program debugging on a vector processing device, the present invention provides, as part of the instruction set, an instruction that tests the processing state of the vector processor (TVP instruction), an instruction that directs suspension of vector instruction decoding (QVD instruction), an instruction that directs release of that suspension (RDR instruction), an instruction that directs suppression of parallel execution of vector processing and scalar processing (SBON instruction), and an instruction that directs release of that suppression (SBOFF instruction). In addition, the control register used to pass control between the scalar processor and the vector processor is provided with a bit indicating that instruction decoding in the vector processor is suspended (D bit) and a bit indicating serialization of scalar and vector processing (S bit). The D bit in the control register is set and reset by the QVD and RDR instructions, and the S bit is set and reset by the SBON and SBOFF instructions. With these, program debugging of the vector processing device can be achieved without difficulty.
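
As a rough illustration of how the five instructions relate to the two bits, the following sketch models the control register in software. It is only a toy model under stated assumptions: the class names, the `vector_busy_cycles` counter, and the choice to have `tvp` return a boolean are hypothetical and are not specified by the patent.

```python
from dataclasses import dataclass, field


@dataclass
class ControlRegister:
    d_bit: bool = False  # vector instruction decoding is suspended
    s_bit: bool = False  # scalar and vector processing are serialized


@dataclass
class Machine:
    creg: ControlRegister = field(default_factory=ControlRegister)
    vector_busy_cycles: int = 0  # hypothetical stand-in for "vector resources still running"

    def qvd(self) -> None:
        """Quit Vector Decode: suspend vector decoding and set the D bit."""
        self.creg.d_bit = True

    def rdr(self) -> None:
        """Reset the D bit and resume vector decoding."""
        self.creg.d_bit = False

    def sbon(self) -> None:
        """Set the S bit: suppress parallel scalar/vector execution."""
        self.creg.s_bit = True

    def sboff(self) -> None:
        """Reset the S bit: allow parallel execution again."""
        self.creg.s_bit = False

    def tvp(self) -> bool:
        """Test Vector Processor: report whether vector processing is still active."""
        return self.vector_busy_cycles > 0


m = Machine()
m.qvd()
m.sbon()
print(m.creg)
```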

[Embodiments of the invention]

First, the basic concept of the present invention is described. Program debugging on a vector processing device requires the coordinated operation of four components: the hardware, the architecture, the compiler, and the operating system (OS).

In the hardware, two kinds of storage means are provided for passing control between the scalar processor and the vector processor (hereinafter called the D bit and the S bit of the control register). The D bit indicates that instruction decoding in the vector processor is suspended, and the S bit directs serialization of scalar and vector processing. When the D bit turns on, the scalar processor transfers control to the space in which the OS resides.

As a result, scalar processing of the user program is suspended and OS processing starts. At this point the vector processor is still executing the user program.

Thus scalar processing has been switched from user space to OS space, while vector processing continues in user space. This duality differs from the conventional concept of a task switch. The OS then passes control to the address stored at a specific location in main memory.

Here the structure of user space is discussed. User space is generally thought of as a single structure, and most programs build their logic within this single space. However, some processing systems allow a user program execution environment to be set up with the OS in advance so that, when an unpredictable event such as an address exception occurs in the user program, the program does not terminate abnormally in user space; instead, control passes to the OS at the point of the exception, the OS transfers control to a second user space, and processing continues in this new user space. Such processing systems are not limited to vector processing devices. In them, the user program runs not in a single user space but in two user spaces of different structure. Hereinafter these are called the type 1 user space and the type 2 user space. The type 2 user space is a space to which control does not pass unless an unpredictable event occurs in the type 1 user space, and to which program control cannot pass unless the OS intervenes because of that event. The operation of the OS in the present invention resembles this passing of control between the two user spaces, but differs in the following respect.

Conventional switching between the two user spaces always takes place while the processing device is operating at a single point, and only after that operation has completed. Here, "a single point of operation" is defined as follows. Consider a two-dimensional space with time on the vertical axis and main memory address on the horizontal axis (this space belongs to a different classification from the user and OS spaces; hereinafter it is called the π space). When the operation of a processing device is plotted in this space, a general-purpose computer, that is, a scalar computer, is represented by a single line. Hence, for any given time, the phrase "a single point of operation" has meaning.

In conventional switching between the two user spaces, the processing device does not access both spaces at the same time. The vector processing device of the present invention, however, traces multiple processing trajectories in the π space. Hence, at a given time there are "multiple points of operation" rather than "a single point of operation", and the two user spaces can be accessed at the same time by the scalar and vector processors. Both user spaces may therefore be under processing at the same instant.

The architecture therefore provides the following instruction set.

As scalar processing instructions:

1. an instruction that tests the processing state of the vector processor (hereinafter the Test Vector Processor instruction, abbreviated TVP);

2. an instruction that resets the D bit and resumes vector instruction decoding (the Reset D bit and Restart instruction, abbreviated RDR);

3. instructions that turn the S bit on and off (S bit on and S bit off, abbreviated SBON and SBOFF);

4. an instruction that returns from the type 2 user space to the type 1 user space (the Supervisor Call (SVC) instruction can be used in its place).

As a vector processing instruction:

5. an instruction that suspends vector instruction decoding and turns the D bit on (the Quit Vector Decode instruction, abbreviated QVD).

Next, the processing of the compiler is described. The compiler must determine the boundary between the type 1 and type 2 user spaces. Because the compiler compiles both the user program and the debugging program that is loaded into the type 2 space, it can predict the sizes of the type 1 and type 2 spaces and can therefore determine the boundary between them.

The debugging program loaded into the type 2 user space must be able, when vector processing is suspended, to output the contents of any register in the processing device to main memory and to set a value in any register. For this purpose, the debugging program must begin by waiting for the vector processor to complete its processing (using the TVP instruction).

What the user doing the debugging wants to know is not register values in the processing device or data at absolute addresses in main memory, but the values, at a particular time, of the variables or arrays used in the program. It is therefore necessary to pass to the type 2 user space the correspondence between the variables and arrays in the type 1 user space and the register numbers in the processing device and location addresses in main memory.

In particular, a vector processor stores intermediate results of vector computation in buffer registers, called vector registers, that are managed by the user program; so that the user program does not have to manage vector register numbers itself, the compiler maps arrays and variables to vector register numbers. A table of correspondence between arrays or variables and vector register numbers is therefore passed between the type 1 and type 2 user spaces, and this table is used to find the vector register numbers of the arrays or variables that the user specifies and to write their contents to specified addresses in main memory. As for passing the table, both the type 1 and type 2 user spaces form a single user space from the standpoint of multiple virtual spaces (MVS), so it suffices either to assign a specific main memory address to the table or, as in a subroutine call, to store the address of the table in a particular general-purpose register before switching between the two user spaces. The latter method, however, is difficult to realize unless the processing device has an architecture in which the vector processor does not write values directly into general-purpose registers. This is because, when the scalar processor switches between the type 1 and type 2 user spaces, the vector processor has not yet finished processing its instructions, and if the architecture allows the contents of general-purpose registers to be changed during vector processing, the value in the general-purpose register can no longer be guaranteed.
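
One way to picture the correspondence table is sketched below. The representation is an assumption made for illustration only: the patent does not specify a table format, and the names `Location`, `symbol_table`, and `dump_symbol`, the register numbers, and the addresses are all hypothetical. Copying a register into the `memory` dictionary stands in for a vector store.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Location:
    # Exactly one of the two fields is expected to be set in this sketch.
    vector_register: Optional[int] = None
    memory_address: Optional[int] = None


# Hypothetical table produced by the compiler for the type 1 user space.
symbol_table = {
    "C": Location(vector_register=3),   # array held only in vector register 3
    "N": Location(memory_address=0x1000),  # variable held in main memory
}


def dump_symbol(name, table, vector_registers, memory, dump_address):
    """Return the current value(s) of a source-level name.

    If the name lives in a vector register, its contents are first copied to
    main memory at dump_address, mimicking a vector store.
    """
    loc = table[name]
    if loc.vector_register is not None:
        values = list(vector_registers[loc.vector_register])
        for offset, value in enumerate(values):
            memory[dump_address + offset] = value
        return values
    return [memory[loc.memory_address]]


vector_registers = {3: [1.0, 2.0, 3.0]}
memory = {0x1000: 7}
print(dump_symbol("C", symbol_table, vector_registers, memory, 0x2000))
print(dump_symbol("N", symbol_table, vector_registers, memory, 0x2000))
```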

Once the correspondence table between variables or arrays and vector registers or main memory addresses can be passed between the type 1 and type 2 user spaces, the value of any register in the vector processor can be read or written by using this table and starting vector load/store instructions (the RDR instruction is used at this point).

Note that a QVD instruction is needed again after the load/store instructions.
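
The sequence described above (wait with TVP, release decoding with RDR, issue the vector store, then suspend again with QVD) can be sketched as follows. This is a toy model under assumptions: the `VectorProcessor` class, the per-poll cycle countdown, and the `VST` mnemonic are hypothetical and do not come from the patent.

```python
class VectorProcessor:
    def __init__(self, busy_cycles: int) -> None:
        self.busy_cycles = busy_cycles  # cycles left in already started vector instructions
        self.d_bit = True               # decoding was suspended by an earlier QVD
        self.log: list[str] = []

    def tvp(self) -> bool:
        """Report whether vector processing is still running (one cycle elapses per poll)."""
        busy = self.busy_cycles > 0
        if busy:
            self.busy_cycles -= 1
        return busy

    def rdr(self) -> None:
        self.d_bit = False
        self.log.append("RDR")

    def vector_store(self, register: int) -> None:
        if self.d_bit:
            raise RuntimeError("vector decoding is suspended; issue RDR first")
        self.log.append(f"VST V{register}")  # hypothetical mnemonic

    def qvd(self) -> None:
        self.d_bit = True
        self.log.append("QVD")


def debug_dump(vp: VectorProcessor, register: int) -> int:
    """Dump one vector register from the debugging program; return the number of TVP polls."""
    polls = 0
    while vp.tvp():          # wait for earlier vector instructions to complete
        polls += 1
    vp.rdr()                 # allow vector decoding again
    vp.vector_store(register)
    vp.qvd()                 # suspend decoding again after the store
    return polls


vp = VectorProcessor(busy_cycles=3)
print(debug_dump(vp, register=2), vp.log)
```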

In this way, vector processing can be suspended, data in the internal registers of the vector processing device and in main memory can be read, and specific debugging data can be written into these storage means. This is achieved by a vector instruction decode suspension instruction embedded in the vector instruction sequence, which suspends instruction execution in the scalar processor and starts the debugging program. The vector instruction sequence has thus been artificially altered by the insertion of QVD instructions. The effect of this alteration on the processing of the vector processor is slight, however. Compared with a scalar instruction, a vector instruction is characterized by an execution time that is long relative to its decoding time. Within the vector processor, therefore, a vector instruction is likely to have finished decoding and to be waiting for the resources it needs for execution to become free.

For this reason, even when a decode suspension instruction (QVD instruction) is inserted into the vector instruction sequence, its effect is confined to the instruction decoding circuit of the vector processing device and does not extend to the operating state of the resources. This is where the method differs from inserting load/store instructions into the vector instruction sequence for debugging. Inserting decode suspension instructions into the vector instruction sequence does modify the program, but the modification is local and does not amount to a major change to the program for the sake of debugging.

The debugging technique described above is sufficient for a scalar processing device but is still insufficient for a vector processing device. A vector processing device can run vector processing in parallel with the remaining scalar processing. If a program unsuited to this parallel operation is coded through a user's mistake, and the compiler also fails to detect the unsuitability, incorrect behavior with poor reproducibility can occur. In such a case, debugging requires a way to specify that a particular operation in the program be executed after some other operation. This could be realized by introducing special statements; supervisor macros, for example, provide task post/wait instructions. With this approach, however, when the ordering among multiple operations must be indicated explicitly, the program becomes very complicated and difficult to write.

The program context also becomes hard to read. This stems from the fact that conventional programming languages have no way of specifying "when an operation is to be performed". This approach therefore cannot be adopted for judging whether parallel operation of the vector processing device is appropriate.

In the present invention, the suitability of parallel operation of the vector processing device is checked as follows.

1. The user specifies the range of the program to be examined.

2. The compiler inserts the necessary scalar instructions into the object code so that the S bit of the control register of the processing device is on within that range (using the SBON and SBOFF instructions).

3. The instruction decoding circuit of the scalar processor then controls execution so that vector processing and the remaining scalar processing are serialized within that range of the program under test.

4. The result of running the program with this serialized operation is compared with the result of running it with parallel operation, to determine whether the incorrect result arose from misuse of the parallel processing function of the vector processing device.
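
Steps 2 and 4 above can be sketched roughly as follows. The object code is represented here as a plain list of mnemonic strings, which is an assumption for illustration; the function names and the placeholder instructions `S1`, `V1`, and so on are hypothetical.

```python
def insert_serialization(code: list[str], start: int, end: int) -> list[str]:
    """Bracket code[start:end] with SBON/SBOFF so that the S bit is on in that range."""
    if not 0 <= start <= end <= len(code):
        raise ValueError("range must lie within the instruction list")
    return code[:start] + ["SBON"] + code[start:end] + ["SBOFF"] + code[end:]


def parallel_misuse_suspected(serialized_result, parallel_result) -> bool:
    """Step 4: differing results point to misuse of parallel execution in the range."""
    return serialized_result != parallel_result


object_code = ["S1", "V1", "S2", "S3"]  # hypothetical scalar (S) and vector (V) instructions
print(insert_serialization(object_code, 1, 3))
```

If the serialized run and the parallel run give the same result, the fault presumably lies elsewhere, and a different range can be examined.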

With this arrangement, if the program runs away and overwrites an SBOFF instruction, the processing device remains in serialized operation until the user program ends. This does not seriously hinder program debugging, however.

An embodiment of the present invention is described below with reference to the drawings. The instruction decoding of the vector processing device is assumed to use a scheme in which the two instruction sets, scalar and vector, are decoded by two instruction decoding sections that control instruction execution. Because the decoding sections for scalar and vector instructions are separate, the two instruction streams can execute independently and parallel processing is easy to realize.

Another approach to instruction decoding in a vector processing device is to decode a mixed stream of scalar and vector instructions in a single instruction decoding section and control execution from there. Because scalar and vector instructions are interleaved, this approach makes control between the two kinds of instructions, starting with guaranteeing the order of scalar and vector processing, easy to implement. The present invention may of course adopt this approach as well.

FIG. 1 is a block diagram of an embodiment of the vector processing device of the present invention. In FIG. 1, 1 is a main memory device and 8 is a main memory control circuit. 2 denotes data transfer circuits (hereinafter called requesters); five exist in this embodiment and are distinguished by the suffixes a, b, ..., e. 3 denotes instruction decoding circuits; two exist and are distinguished by the suffixes a and b.

4 is a scalar arithmetic unit (including general-purpose and floating-point registers), 5 is an instruction activation circuit, 6 is a vector register, and 7 is a vector arithmetic unit. Because the resources of the vector processing device are controlled by the two instruction decoding circuits 3a and 3b respectively, they can be divided into a scalar processor section SP and a vector processor section VP, as enclosed by dotted lines in FIG. 1.

When the vector processing device is activated, requester 2a sends an instruction read request to main memory control circuit 8 through path 10. Main memory control circuit 8 determines the bank of main memory device 1 to be accessed from the main memory reference address of the instruction read request and issues a read request to that bank. When there are multiple reference requests to the same bank of the main memory device, main memory control circuit 8 determines the priority among them.

The instruction read from the bank of main memory device 1 is sent to instruction decoding circuit 3a through main memory control circuit 8, path 11, requester 2a, and path 12.

Instruction decoding circuit 3a decodes scalar instructions and sends the directions needed for execution to scalar arithmetic unit 4 and requester 2b via paths 13 and 14. When a vector processing start instruction (hereinafter called the Execute Vector Processing instruction, abbreviated EXVP) appears in the scalar instruction stream, instruction decoding circuit 3a instructs requester 2c via path 15 to read vector instructions. At the same time, it activates vector instruction decoding circuit 3b via path 16. Requester 2c issues a read request to main memory control circuit 8 using path 17, in the same manner as for scalar instruction reads.

A vector instruction read from main memory device 1 is sent to instruction decoding circuit 3b through main memory control circuit 8, path 18, requester 2c, and path 19. Instruction decoding circuit 3b sends the decoded vector instruction to instruction activation circuit 5. Instruction activation circuit 5 constantly monitors the state of the resources of vector processor section VP via paths 20 and 21, which are bundles of signal lines. When the resources needed to execute the vector instruction are not free, instruction activation circuit 5 notifies requester 2c via path 22 to stop reading vector instructions. When the resources are free, instruction activation circuit 5 activates the resources needed for execution via path 23 (also a bundle of signal lines).

If the activated resource is requester 2d or 2e, that requester sends the request address to main memory control circuit 8 using path 24 or 25.

In a load operation, the requester receives data via path 26 or 27 and writes it to vector register 6 via path 28 or 29. In a store operation, it reads the data in vector register 6 via path 28 or 29 (in practice, a vector register control circuit, not shown, controls the transfer so that data is sent from the vector register side to requesters 2d and 2e via paths 28 and 29) and sends it to main memory control circuit 8 via path 26 or 27. If the activated resource is vector arithmetic unit 7, the operand vector data is read from vector register 6, the operation is performed, and the result is written to vector register 6 through path 30.

Control of writes to and reads from vector register 6 is not directly related to the present invention, so the configuration needed for it is omitted from FIG. 1.

Next, the operation of the instructions required by the present invention is described with reference to FIG. 2.

FIG. 2 shows details of instruction decoding circuits 3a and 3b of FIG. 1 and their peripheral circuits; parts identical to those in FIG. 1 carry the same numbers. The boundary between scalar processor section SP and vector processor section VP is shown by dotted line 57.

In FIG. 2, counter 50 belongs to the address generation section of requester 2a of FIG. 1. Counter 50 is incremented every cycle by the instruction word length stored in register 51 and sends the instruction read address to the main memory control circuit via register 91 and path 10.

Main memory control circuit 8 reads an instruction from the specified address of the main memory device and sends it onto path 11. The instruction on path 11 passes through register 65, and comparison circuit 52 determines in parallel whether it is a TVP, SBON, SBOFF, EXVP, or RDR instruction. Register 53 holds the data used to check for these five instructions. When the instruction in register 65 is one of TVP, SBON, SBOFF, EXVP, or RDR, "1" is sent onto the corresponding one of paths 80 to 84.

Register 54 is a control register; flip-flops 55 and 56 correspond to the S and D bits of that control register.

When comparison circuit 52 detects the SBON instruction, flip-flop 55 is set to "1".

Instruction activation circuit 5, which manages the resources in the vector processor, sends onto path 31 information on whether the vector processor section is busy (here, busy is represented by "1"). The busy information of the vector processor section and the output of flip-flop 55 are ANDed by AND circuit 58, and the result is sent onto path 85. The signal on path 85 stops counter 50. At the same time, it drops the write enable of register 65. Therefore, when the vector processor section is busy and the S bit of control register 54 is "1", scalar instruction reads are suspended, and the parallel processing of the scalar and vector processor sections is serialized.
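The gating performed by AND circuit 58 can be sketched in a few lines. The Python below is a simplified model under stated assumptions: the function names and the default word length of 4 are hypothetical, and only the address counter (counter 50) is modelled, not the write enable of register 65.

```python
def scalar_fetch_stalled(vp_busy: bool, s_bit: bool) -> bool:
    """AND circuit 58: scalar fetch stalls only while the vector processor
    section is busy AND the S bit is set."""
    return vp_busy and s_bit


def run_scalar_fetch(busy_per_cycle, s_bit, word_length=4):
    """Advance a fetch address (standing in for counter 50) once per cycle
    unless stalled; return the address after each cycle.

    word_length=4 is an assumed value, not taken from the patent.
    """
    address = 0
    trace = []
    for vp_busy in busy_per_cycle:
        if not scalar_fetch_stalled(vp_busy, s_bit):
            address += word_length
        trace.append(address)
    return trace


if __name__ == "__main__":
    busy = [False, True, True, False]
    print(run_scalar_fetch(busy, s_bit=True))   # fetch pauses while busy
    print(run_scalar_fetch(busy, s_bit=False))  # fetch never pauses
```

With the S bit off, the busy signal has no effect on scalar fetch, which corresponds to normal parallel operation.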

When comparison circuit 52 detects the SBOFF instruction, flip-flop 55 is reset by the signal on path 82.

When comparison circuit 52 detects the EXVP instruction, the signal on path 83 starts counter 59. Counter 59 forms part of the address generation section of requester 2c of FIG. 1, which reads vector instructions. Once started, counter 59 is incremented every cycle by the vector instruction word length stored in register 60 and sends the address onto path 17. Main memory control circuit 8 reads the vector instruction from the main memory device according to the address on path 17 and sends it onto path 18. The vector instruction read is stored in register 63. Comparison circuit 61 checks whether this vector instruction is a QVD instruction. Register 62 holds the data for detecting the QVD instruction. When comparison circuit 61 detects a QVD instruction, a "1" signal is sent onto path 86. This signal sets flip-flop 56 of control register 54 to "1". With flip-flop 56 set, counter 59 is stopped via path 87. At the same time, the write enable of register 63 is dropped. This suspends vector instruction decoding.

Vector instructions other than QVD pass through path 88 and are decoded by decoder 64, and the decode information is sent to instruction activation circuit 5.

When comparison circuit 52 detects the RDR instruction, a "1" signal is sent onto path 84. This signal resets flip-flop 56.

As a result, register 63 becomes writable again via path 87, and at the same time counter 59 resumes operation.
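The interplay of QVD, RDR, and the D bit can be modelled as a small state machine. The sketch below is illustrative only: the class name `VectorDecodeControl` and the mnemonics `VLOAD` and `VADD` are hypothetical, and the model collapses counter 59, register 63, and flip-flop 56 into a single boolean.

```python
class VectorDecodeControl:
    """Toy model of flip-flop 56 (D bit) gating vector instruction decoding."""

    def __init__(self):
        self.d_bit = False   # flip-flop 56
        self.decoded = []    # instructions forwarded to decoder 64

    def fetch_vector_instruction(self, opcode):
        """Try to take one vector instruction.

        Returns False while decoding is suspended (counter 59 stopped and
        register 63 write-disabled), True otherwise.
        """
        if self.d_bit:
            return False
        if opcode == "QVD":
            self.d_bit = True        # path 86 sets flip-flop 56
            return True
        self.decoded.append(opcode)  # any other vector instruction is decoded
        return True

    def scalar_rdr(self):
        """RDR in the scalar stream resets flip-flop 56 (path 84)."""
        self.d_bit = False


if __name__ == "__main__":
    ctl = VectorDecodeControl()
    for op in ["VLOAD", "QVD", "VADD"]:
        print(op, ctl.fetch_vector_instruction(op))
    ctl.scalar_rdr()
    print("VADD", ctl.fetch_vector_instruction("VADD"))
```

Between the QVD and the RDR, the scalar side is free to inspect or modify state, which is the window the debugging scheme relies on.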

When comparison circuit 52 detects the TVP instruction, a "1" signal is sent onto path 80. The signal on path 80 is ANDed with the signal on path 31 by AND circuit 66, and the result is set in register 67. Register 67 holds the condition code; executing the TVP instruction makes the busy state of the vector processor section at that moment available to the instructions that follow TVP.
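One way a program might use TVP is to poll the condition code until the vector processor section is idle. The Python sketch below assumes this usage; the patent only states that later instructions can reference the condition code, so the polling loop and both function names are hypothetical.

```python
def execute_tvp(is_tvp: bool, vp_busy: bool) -> int:
    """AND circuit 66: the condition code (register 67) becomes 1 only when
    a TVP instruction executes while the vector processor section is busy."""
    return int(is_tvp and vp_busy)


def wait_for_vector_processor(busy_samples):
    """Issue TVP once per busy sample until the condition code reads 0.

    Returns the number of TVP executions needed. The polling pattern is an
    assumption for illustration, not something the patent prescribes.
    """
    polls = 0
    for vp_busy in busy_samples:
        polls += 1
        if execute_tvp(True, vp_busy) == 0:
            return polls
    raise RuntimeError("vector processor still busy after all samples")


if __name__ == "__main__":
    print(wait_for_vector_processor([True, True, False]))
```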

FIG. 3 is a detailed diagram of instruction activation circuit 5 of FIG. 1. In FIG. 3, a vector instruction arriving via path 150 is stored in register 100. The instruction set in register 100 is given to resource determination circuit 101, which determines which resources are needed to execute it.

The resource numbers (there may be several) are thereby determined. Resource determination circuit 101 can be built from a storage means (for example, a ROM) that reads out resource numbers using the opcode as the address. The output of resource determination circuit 101 is decoded by decoder 104. To keep the explanation below simple, assume that two resources are assigned to execute the vector instruction in register 100. This assignment information is input to selectors 102 and 103 via paths 151 and 152. Register 105 holds the state of the resources.

The state of each resource is managed as busy ("1") or not busy. The outputs of register 105, which indicate the states of the multiple resources, are selected by selectors 102 and 103, and the results are set in registers 106 and 107. The values of registers 106 and 107 therefore indicate the busy state of the resources that would execute the vector instruction in register 100.

The values of registers 106 and 107 are ORed by OR circuit 108, and the result is sent onto path 22. When the signal on path 22 is "1", the vector instruction in register 100 cannot be executed, so this information is used to inhibit the operation of vector instruction requester 2c of FIG. 1.

The inverted output of OR circuit 108 is sent onto path 153. A "1" on path 153 indicates that the vector instruction in register 100 can be executed. The outputs of registers 106 and 107 are input, via inverting circuits 121 and 122, to resource selection circuit 109, which selects the resource to execute the vector instruction in register 100. Resource selection circuit 109 selects a resource according to a predetermined priority. Its outputs correspond to the individual resources, and a "1" on one of the signals on path 154 indicates that the corresponding resource has been selected. The signals on path 154 are ANDed with the signal on path 153 by AND circuit 110, and the result is sent onto path 155 as the set signal for register 105. Register 105 is reset by the vector instruction completion report signals that the resources send on path 21.
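The busy check around OR circuit 108 reduces to a small boolean function. The sketch below is a simplified Python model: the function name `can_issue` and the list-based representation of register 105 are assumptions, and the set and reset of register 105 are not modelled.

```python
def can_issue(resource_busy, needed):
    """Return (stop, go) for an instruction that needs the resources `needed`.

    resource_busy: list of 0/1 busy bits, standing in for register 105.
    needed: indices picked out by selectors 102 and 103.
    stop models path 22 (OR circuit 108); go models path 153 (its inverse).
    """
    selected = [resource_busy[i] for i in needed]  # registers 106 and 107
    stop = int(any(selected))                      # any needed resource busy
    go = 1 - stop
    return stop, go


if __name__ == "__main__":
    busy_bits = [0, 1, 0, 0]             # only resource 1 is busy
    print(can_issue(busy_bits, [0, 2]))  # both needed resources are free
    print(can_issue(busy_bits, [1, 2]))  # resource 1 blocks the instruction
```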

The vector instruction stored in register 100 moves to register 111 when the signal on path 153 is "1". The opcode of the vector instruction in register 111 is input to order generation circuit 112 through path 156. Order generation circuit 112 generates the order information needed to execute the vector instruction by looking up a storage means such as a ROM with the opcode as the address. The output of order generation circuit 112 passes through path 156, is sorted by resource in switching circuit 113, and is sent via path 23 to the resource that will execute the instruction. Likewise, the operands of the vector instruction in register 111 pass through path 157, are sorted by resource in switching circuit 114, and are sent onto path 23.

The resource busy bits in register 105 are ORed by OR circuit 115, and the result is sent onto path 31.

The signal on path 31 is used as the signal indicating that the vector processor section is busy.

FIG. 4 is a detailed diagram of resource selection circuit 109 in FIG. 3; for convenience of explanation, it shows the case in which four resources are candidates for selection. In FIG. 4, the information on which of the four resources are available is supplied on paths 200a to 200d. FIG. 4 is an example in which path 200a has the highest priority, followed in descending order by paths 200b and 200c, with path 200d the lowest. When the signal on any of paths 200a to 200d becomes "1", that signal is selected if all higher-priority signals are "0". Consequently, even when several of the signals on paths 200a to 200d are "1", only the signal on path 154 corresponding to the single highest-priority resource becomes "1".
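The behaviour of the selection circuit in FIG. 4 matches a fixed-priority, one-hot selector. A minimal Python sketch follows; the function name `select_resource` is hypothetical, and index 0 is assumed to stand for path 200a.

```python
def select_resource(available):
    """Fixed-priority one-hot select.

    available: list of 0/1 bits, index 0 (path 200a) having the highest
    priority. Returns a list of the same length with at most one bit set,
    marking the highest-priority available resource (path 154).
    """
    selected = [0] * len(available)
    for i, bit in enumerate(available):
        if bit:
            selected[i] = 1
            break  # lower-priority candidates are masked out
    return selected


if __name__ == "__main__":
    print(select_resource([0, 1, 1, 0]))  # [0, 1, 0, 0]
    print(select_resource([0, 0, 0, 0]))  # [0, 0, 0, 0]
```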

[Effect of the invention]

According to the present invention, when a vector processing device produces an incorrect result, the part of the program where the cause arises can be investigated by suspending vector processing at any point within a DO loop. Even if the value of a variable or array referenced in the DO loop exists not in main memory but in a register inside the vector processing device, that data can be written out to main memory and examined. In addition, even when a variable or array referenced in the DO loop exists only in a register of the processing device, a debugging value can be assigned to it and vector processing retried.

Furthermore, by suppressing the parallel execution of vector processing and scalar processing that is characteristic of vector processing devices, defects arising from the ordering of main memory references under parallel processing, and defects caused by rewriting of the vector instruction stream, can be isolated easily.

[Brief explanation of drawings]

FIG. 1 is a schematic block diagram of an embodiment of the vector processing device of the present invention; FIG. 2 is a detailed diagram of the portion of FIG. 1 concerned with instruction reading and decoding; FIG. 3 is a detailed diagram of the instruction activation circuit of FIG. 1; and FIG. 4 is a detailed diagram of the resource selection circuit of FIG. 3.

1: main memory device; 8: main memory control circuit; 2a to 2e: requesters (data transfer circuits); 3a, 3b: instruction decoding circuits; 4: scalar arithmetic unit; 5: instruction activation circuit; 6: vector register; 7: vector arithmetic unit.

Claims

[Claims]

A vector processing device comprising a main memory device that stores programs and data, a scalar processor that reads scalar instructions from the main memory device and executes scalar processing, and a vector processor that is activated by the scalar processor and reads vector instructions from the main memory device to execute vector processing, characterized in that, for program debugging, the device provides a first instruction that tests the processing state of the vector processor, a second instruction that directs suspension of the decoding of vector instructions, a third instruction that directs release of the suspension of decoding caused by the second instruction, a fourth instruction that directs suppression of parallel execution of vector processing and scalar processing, and a fifth instruction that directs release of that suppression; and in that a control register used for passing control between the two processors has a first bit that is set by the second instruction and reset by the third instruction, and a second bit that is set by the fourth instruction and reset by the fifth instruction, wherein the decoding of vector instructions is suspended while the first bit of the control register is set, and scalar processing and vector processing are executed serially while the second bit is set.