JP2970512B2

JP2970512B2 - Vector processor

Info

Publication number: JP2970512B2
Application number: JP34250695A
Authority: JP
Inventors: 政人西田
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1995-12-28
Filing date: 1995-12-28
Publication date: 1999-11-02
Anticipated expiration: 2015-12-28
Also published as: JPH09185602A

Description

DETAILED DESCRIPTION OF THE INVENTION

[Technical Field of the Invention] The present invention relates to a vector processor, and more particularly to a vector processor that performs the coordinate transformation processing used in computer graphics and similar applications.

[0001]

[Prior Art] Conventionally, in this type of technology, the coordinate transformation processing for projecting three-dimensional coordinates, as given by the following equation, has been implemented by connecting one or more microprocessors and performing the processing under software control on those processors.

[0002]

[0003] Alternatively, a general-purpose DSP or the like has been connected to a microprocessor to implement the coordinate transformation processing for projecting three-dimensional coordinates given by the above equation. Furthermore, as in Japanese Patent Application Laid-Open No. 1-270174, for example, a configuration in which a pipelined arithmetic unit is provided in a dedicated ASIC has been proposed. That publication describes a technique in which the coordinate transformation processing is divided into four stages, each stage other than the final stage has registers on its input side and output side, and the operation of each stage is executed when the input-side register is empty.

[0004]

[Problems to Be Solved by the Invention] The conventional technique described above uses a general-purpose processor or the like and implements the coordinate transformation processing in a software program, and therefore has the problem that the coordinate transformation processing cannot be performed at high speed.

[0005] In the other conventional technique, the coordinate transformation processing is divided into four stages, each stage other than the final stage has registers on its input side and output side, and the operation of each stage is executed when the input-side register is empty. This technique therefore also has the problem that the coordinate transformation processing cannot be performed at high speed.

[0006] An object of the present invention is to provide a vector processor capable of performing image generation processing at high speed.

[0007]

[Means for Solving the Problems] To solve the above problems, a vector processor according to the present invention includes a plurality of pipeline operation mechanisms and performs vector processing by multiple parallel pipelines. Each pipeline operation mechanism includes control means for switching between a mode for executing normal operation processing, in which vector processing by the multiple parallel pipelines is performed, and a mode for executing coordinate transformation processing. The vector processor further includes a load store unit that, in the mode for executing normal operation processing, distributes the same vector data among the plurality of pipeline operation mechanisms and, in the mode for executing coordinate transformation processing, sends a plurality of different vector data to each of the plurality of pipeline operation mechanisms. Each pipeline operation mechanism further includes a plurality of first input vector registers and a plurality of selection means for selecting the data to be stored in the first input vector registers. The control means instructs the selection means so that, in the mode for executing normal operation processing, different vector data are stored in the first input vector registers at the same position in each of the plurality of pipeline operation mechanisms and, in the mode for executing coordinate transformation processing, the same vector data are stored in the first input vector registers at the same position in each of the plurality of pipeline operation mechanisms.

[0008] In another vector processor according to the present invention, the pipeline operation mechanism further includes a plurality of second input vector registers, a plurality of scalar registers, second selection means for selectively outputting either the output from the second vector registers or the output from the scalar registers, a plurality of multipliers for multiplying the output from the input vector registers by the output from the selection means, an adder for adding the outputs from the multipliers, and an output vector register for storing the output of the adder. The control means instructs the second selection means to select the output from the second vector registers in the mode for executing normal operation processing, and the output from the scalar registers in the mode for executing coordinate transformation processing.

[0009] Another vector processor according to the present invention performs vector processing by multiple parallel pipelines and includes k pipeline operation mechanisms. Each pipeline operation mechanism includes a plurality of first input vector registers, first selection means for selecting the input to the first input vector registers, a plurality of second input vector registers, a plurality of scalar registers, second selection means for selectively outputting either the output from the second vector registers or the output from the scalar registers, a plurality of multipliers for multiplying the output from the input vector registers by the output from the selection means, an adder for adding the outputs from the multipliers, an output vector register for storing the output of the adder, and control means. When the result of decoding an instruction indicates a coordinate transformation, the control means causes consecutive vectors to be stored one after another in the first input vector registers and instructs the selection means to selectively output the output from the scalar registers. When the result indicates a normal operation, the control means causes vector elements to be stored in the first vector registers at intervals of (k-1) elements and instructs the selection means to selectively output the output from the second vector registers.

[0010]

[0011]

[Embodiments of the Invention] Next, an embodiment of the vector processor according to the present invention will be described in detail with reference to the drawings.

[0012] The vector processor according to one embodiment of the present invention implements the following three-dimensional matrix operation. In the following equation, X, Y, Z, x, y and z denote the vectors X[k], Y[k], Z[k], x[k], y[k] and z[k] (0 ≤ k ≤ m), each consisting of m elements.

[0013]
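
The equation itself is not reproduced in this text. Judging from the per-pipeline formulas given in paragraph [0026], it presumably has the following form. This is a reconstruction, not the original drawing, and the row-vector layout is an assumption.

```latex
\begin{pmatrix} X & Y & Z \end{pmatrix}
=
\begin{pmatrix} x & y & z \end{pmatrix}
\begin{pmatrix}
a_{1,1} & a_{1,2} & a_{1,3} \\
a_{2,1} & a_{2,2} & a_{2,3} \\
a_{3,1} & a_{3,2} & a_{3,3}
\end{pmatrix}
```

Expanding the first column gives X[k] = a[1,1]*x[k] + a[2,1]*y[k] + a[3,1]*z[k], which matches the operation assigned to pipeline operation mechanism 1 in paragraph [0026].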

[0014] Referring to FIG. 2, the vector processor according to one embodiment of the present invention is configured with pipeline operation mechanisms 1, 2 and 3 connected to a load store unit 4 via buses 6, 7, 8, 9, 10 and 11. The load store unit 4 is connected to a main storage device 5.

[0015] The pipeline operation mechanisms 1, 2 and 3 perform pipelined vector operations, and their internal configurations are all identical.

[0016] Buses 6, 7 and 8 are connected to the input side of all of the pipeline operation mechanisms 1, 2 and 3 and supply input data to each pipeline operation mechanism. The input data are stored in the main storage device 5 and are supplied by the load store unit 4 to the pipeline operation mechanisms 1, 2 and 3 via buses 6, 7 and 8. Buses 9, 10 and 11 are connected to the output sides of the pipeline operation mechanisms 1, 2 and 3, respectively, and carry the output of each pipeline operation mechanism. The output data are stored in the main storage device 5 via the load store unit 4.

[0017] Referring to FIG. 3, the pipeline operation mechanism 1 according to one embodiment of the present invention comprises input vector registers 1-1, 1-2 and 1-3, multipliers 1-5, 1-6 and 1-7, a multi-input adder 1-8, and an output vector register 1-4. Each of buses 6, 7 and 8 is connected to selectors 1-12, 1-13 and 1-14. Selector 1-12 is connected to input vector register 1-1, selector 1-13 to input vector register 1-2, and selector 1-14 to input vector register 1-3, and each selector selects the input to its register.

[0018] Scalar register 1-9 is connected to selector 1-15. Selector 1-15 selects between the contents of scalar register 1-9 and the contents of a vector register (not shown). Multiplier 1-5 is connected so that it receives and multiplies the output of selector 1-15 and the output of input vector register 1-1. Scalar register 1-10 is connected to selector 1-16. Selector 1-16 selects between the contents of scalar register 1-10 and the contents of a vector register (not shown). Multiplier 1-6 is connected so that it receives and multiplies the output of selector 1-16 and the output of input vector register 1-2. Scalar register 1-11 is connected to selector 1-17. Selector 1-17 selects between the contents of scalar register 1-11 and the contents of a vector register (not shown). Multiplier 1-7 is connected so that it receives and multiplies the output of selector 1-17 and the output of input vector register 1-3. The outputs of multipliers 1-5, 1-6 and 1-7 are connected to the multi-input adder 1-8 and added. The output of the multi-input adder 1-8 is connected so that it is stored in output vector register 1-4.

[0019] The pipeline operation mechanism 1 has a control circuit 1-21, which switches between a normal operation mode, in which conventional vector processing by multiple parallel pipelines is executed, and a coordinate transformation mode, in which coordinate transformation processing is executed. The mode switching signal from the control circuit 1-21 is input to selectors 1-12, 1-13, 1-14, 1-15, 1-16, 1-17, 1-18, 1-19 and 1-20, and these selectors are switched according to the mode.

[0020] Next, the operation of the vector processor according to one embodiment of the present invention will be described in detail with reference to the drawings.

[0021] An instruction control circuit (not shown) decodes an instruction and notifies the control circuits 1-21, 2-21 and 3-21 of the pipeline operation mechanisms 1, 2 and 3 whether the mode is the normal operation mode or the coordinate transformation mode. Each of the control circuits 1-21, 2-21 and 3-21 receives this notification and controls switching between the normal operation mode and the coordinate transformation mode. In the normal operation mode, a single set of vector data V[k] (0 ≤ k ≤ 3m) is distributed among the pipeline operation mechanisms 1, 2 and 3, so that different vector elements are stored in the input vector registers of each pipeline operation mechanism. In the coordinate transformation mode, the same vector data x[k], y[k] and z[k] (0 ≤ k ≤ m) are sent to each of the pipeline operation mechanisms 1, 2 and 3 and stored in the respective input vector registers of each pipeline operation mechanism. In the normal operation mode, three vector operations are executed simultaneously; in the coordinate transformation mode, one coordinate transformation is executed at a time. The notification of whether the mode is the normal operation mode or the coordinate transformation mode is also sent to the load store unit 4. In the normal operation mode, the load store unit 4 distributes the vector V[k] over buses 6, 7 and 8, sending vector elements V[1], V[4], ... on bus 6, vector elements V[2], V[5], ... on bus 7, and vector elements V[3], V[6], ... on bus 8. In the coordinate transformation mode, it sends vector x[k] on bus 6, vector y[k] on bus 7, and vector z[k] on bus 8.
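
The two distribution patterns can be illustrated with a minimal Python sketch. The function names and the list-based model of buses and registers are hypothetical and serve only as illustration; the patent describes hardware, not software. Python lists are zero-based, so the element the text calls V[1] sits at index 0.

```python
from typing import List, Sequence

NUM_PIPELINES = 3  # pipeline operation mechanisms 1, 2 and 3


def distribute_normal(v: Sequence[float]) -> List[List[float]]:
    """Normal operation mode: deal one vector V round-robin onto buses 6, 7 and 8."""
    return [list(v[i::NUM_PIPELINES]) for i in range(NUM_PIPELINES)]


def distribute_transform(x: Sequence[float], y: Sequence[float],
                         z: Sequence[float]) -> List[List[float]]:
    """Coordinate transformation mode: bus 6 carries x, bus 7 carries y, bus 8 carries z."""
    return [list(x), list(y), list(z)]


def input_registers(buses: List[List[float]], transform_mode: bool) -> List[List[List[float]]]:
    """Return, for each pipeline, the contents latched into its input vector registers."""
    if transform_mode:
        # Every pipeline latches all three buses (registers n-1, n-2 and n-3).
        return [[list(b) for b in buses] for _ in range(NUM_PIPELINES)]
    # Pipeline i latches only its own bus into its first input register.
    return [[list(buses[i])] for i in range(NUM_PIPELINES)]
```

For example, `distribute_normal([1, 2, 3, 4, 5, 6])` places the first and fourth elements on the first bus, mirroring V[1], V[4], ... on bus 6.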

[0022] Referring to FIG. 3, in the normal operation mode, the control circuit 1-21 of the pipeline operation mechanism 1 instructs selector 1-12 to select bus 6 so that the data on bus 6 are input to the pipeline operation mechanism 1. Elements V[1], V[4], ... are stored in input vector register 1-1.

[0023] The control circuit 1-21 instructs selector 1-15 to select either scalar data or the contents of another vector register (not shown). The scalar data or the contents of the other vector register selected by selector 1-15 are sent to multiplier 1-5 as one of its inputs.

[0024] Multiplier 1-5 multiplies the contents output from input vector register 1-1 by the scalar data or the contents of the other vector register (not shown) selected by selector 1-15. The product is stored in another vector register (not shown). It can also be supplied as one input of the multi-input adder 1-8 by selector 1-19 under instruction from the control circuit 1-21.

[0025] In the pipeline operation mechanism 2, vector elements V[2], V[5], ... from bus 7 are stored in input vector register 2-1, and in the pipeline operation mechanism 3, vector elements V[3], V[6], ... from bus 8 are stored in input vector register 3-1; the vector operation is then performed in the same way.

[0026] In the coordinate transformation mode, the following operations are performed.
Pipeline operation mechanism 1: X[k] = a[1,1]*x[k] + a[2,1]*y[k] + a[3,1]*z[k]
Pipeline operation mechanism 2: Y[k] = a[1,2]*x[k] + a[2,2]*y[k] + a[3,2]*z[k]
Pipeline operation mechanism 3: Z[k] = a[1,3]*x[k] + a[2,3]*y[k] + a[3,3]*z[k]

[0027] Vector x[k] (0 ≤ k ≤ m) is sent via bus 6, vector y[k] (0 ≤ k ≤ m) via bus 7, and vector z[k] (0 ≤ k ≤ m) via bus 8 to each of the pipeline operation mechanisms 1, 2 and 3.

[0028] In the pipeline operation mechanism 1, the control circuit 1-21 instructs selector 1-12 to select bus 6 and store vector elements x[k] (0 ≤ k ≤ m) in input vector register 1-1. The control circuit 1-21 instructs selector 1-13 to select bus 7 and store vector elements y[k] (0 ≤ k ≤ m) in input vector register 1-2. The control circuit 1-21 instructs selector 1-14 to select bus 8 and store vector elements z[k] (0 ≤ k ≤ m) in input vector register 1-3. Scalar register 1-9 holds the value of matrix element a[1,1], scalar register 1-10 holds the value of matrix element a[2,1], and scalar register 1-11 holds the value of matrix element a[3,1]. Multiplier 1-5 multiplies the contents output from input vector register 1-1 by the contents of scalar register 1-9, which selector 1-15 selects under instruction from the control circuit 1-21. Multiplier 1-6 multiplies the contents output from input vector register 1-2 by the contents of scalar register 1-10, which selector 1-16 selects under instruction from the control circuit 1-21. Multiplier 1-7 multiplies the contents output from input vector register 1-3 by the contents of scalar register 1-11, which selector 1-17 selects under instruction from the control circuit 1-21. The results of multipliers 1-5, 1-6 and 1-7 are input to the multi-input adder 1-8 and added. The sum is stored in output vector register 1-4. The contents of output vector register 1-4 are sent to the load store unit 4 via bus 9 for storage.

[0029] The coordinate transformation mode processing described for the pipeline operation mechanism 1 is performed in the same way in the pipeline operation mechanisms 2 and 3.
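
As a rough software analogy of the coordinate transformation mode datapath, the sketch below models each pipeline operation mechanism as three multipliers feeding one multi-input adder. The class and function names are hypothetical, and the element-by-element loop stands in for what the hardware does in a pipelined fashion.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class PipelineUnit:
    """Toy model of one pipeline operation mechanism in coordinate transformation mode."""
    scalars: Sequence[float]  # scalar registers n-9, n-10, n-11 (one matrix column)

    def transform(self, x: Sequence[float], y: Sequence[float],
                  z: Sequence[float]) -> List[float]:
        out = []  # stands in for output vector register n-4
        for xk, yk, zk in zip(x, y, z):
            # Multipliers n-5, n-6, n-7: vector element times the selected scalar.
            products = [xk * self.scalars[0], yk * self.scalars[1], zk * self.scalars[2]]
            # Multi-input adder n-8 sums the three products.
            out.append(sum(products))
        return out


def coordinate_transform(a: Sequence[Sequence[float]], x: Sequence[float],
                         y: Sequence[float], z: Sequence[float]
                         ) -> Tuple[List[float], List[float], List[float]]:
    """a[i][j] holds matrix element a[i+1, j+1]; pipeline j+1 uses column j."""
    units = [PipelineUnit([a[0][j], a[1][j], a[2][j]]) for j in range(3)]
    X, Y, Z = (u.transform(x, y, z) for u in units)
    return X, Y, Z
```

All three units receive the same x, y and z vectors and differ only in the scalar register contents, which is the point of storing one matrix column per pipeline operation mechanism.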

[0030] Referring to FIG. 4, the horizontal axis of the timing chart represents time. The length of the upper and lower sides of each diamond shape represents the time required to process all the elements of a vector register.

[0031] The load processing is the processing in which the load store unit simultaneously stores, from main memory, vector x[k] (0 ≤ k ≤ m) in input registers 1-1, 2-1 and 3-1, vector y[k] (0 ≤ k ≤ m) in input registers 1-2, 2-2 and 3-2, and vector z[k] (0 ≤ k ≤ m) in input registers 1-3, 2-3 and 3-3. This load processing takes a turnaround time TL. The loaded vector elements are multiplied by the respective elements of the preset coordinate transformation matrix, simultaneously in all of multipliers 1-5, 1-6, 1-7, 2-5, 2-6, 2-7, 3-5, 3-6 and 3-7. This multiplication takes a turnaround time TM. The products output from the multipliers are sent to the multi-input adders 1-8, 2-8 and 3-8, and the additions are performed simultaneously in the respective multi-input adders. This addition takes a turnaround time TA.

[0032] From the above, the turnaround time for the whole process is (TL + TM + TA). As for throughput, one coordinate transformation can be performed per clock cycle.
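
To see how latency and throughput combine, the following sketch estimates the number of clock cycles needed for m coordinate transformations. It assumes that TL, TM and TA are expressed in clock cycles and that the pipeline is ideal (no stalls), neither of which the text states explicitly; the function name is hypothetical.

```python
def cycles_for_transforms(m: int, tl: int, tm: int, ta: int) -> int:
    """Cycles to finish m coordinate transformations, assuming ideal pipelining."""
    if m <= 0:
        return 0
    latency = tl + tm + ta    # turnaround time until the first result appears
    return latency + (m - 1)  # afterwards, one further result per clock cycle
```

Under these assumptions the fixed latency is paid once, so for large m the cost per transformation approaches one clock cycle.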

[0033] Thus, in the first embodiment of the present invention, input vector registers 1-1, 1-2, 1-3, 2-1, 2-2, 2-3, 3-1, 3-2 and 3-3 are provided in a number corresponding to the order of the coordinate transformation, consecutive vector elements are stored in these input vector registers, and the operations between these elements and the coordinate transformation matrix are processed in parallel. As a result, one coordinate transformation per clock cycle can be achieved.

[0034]

[0035]

[0036]

[0037]

[0038]

[0039]

[0040]

[0041]

[0042]

[0043]

[0044]

[0045]

[0046]

[0047]

[0048]

[Effects of the Invention] As is clear from the above description, according to the present invention, a vector processor that includes a plurality of pipeline operation mechanisms and performs vector processing by multiple parallel pipelines is provided with a coordinate transformation mode and a normal operation mode and switches between them. In the normal operation mode, different data are stored in the input vector registers at the same position in each of the plurality of pipeline operation mechanisms; in the coordinate transformation mode, the same data are stored in the input vector registers at the same position in each of the plurality of pipeline operation mechanisms. Consequently, a plurality of vector operations can be executed simultaneously during normal operation, and one coordinate transformation can be executed at a time during coordinate transformation.

[0049] Further, according to the present invention, during coordinate transformation, consecutive vectors are stored in a plurality of input vector registers, the values of these input vector registers are multiplied by the values of a plurality of scalar registers holding the elements of the coordinate transformation matrix, and the products are added by a multi-input adder. The coordinate transformation processing can therefore be executed at high speed.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of an embodiment of the vector processor of the present invention.

FIG. 2 is a block diagram showing the internal configuration of the pipeline operation mechanism according to one embodiment of the present invention.

FIG. 3 is a timing chart showing the operation timing of the vector processor according to one embodiment of the present invention.

FIG. 4 is a block diagram showing the configuration of a vector processor according to a second embodiment of the present invention.

[Explanation of symbols]

1, 2, 3: Pipeline operation mechanism
4: Load store unit
5: Main storage device
6, 7, 8, 9, 10, 11: Bus
1-1, 1-2, 1-3: Input vector register
1-4: Output vector register
1-5, 1-6, 1-7: Multiplier
1-8: Multi-input adder
1-9, 1-10, 1-11: Scalar register
1-12, 1-13, 1-14: Selector
1-15, 1-16, 1-17: Selector
1-18, 1-19, 1-20: Selector
1-21: Control circuit

Claims

(57) [Claims]

1. A vector processor that includes a plurality of pipeline operation mechanisms and performs vector processing by multiple parallel pipelines, wherein each of the plurality of pipeline operation mechanisms includes: control means for controlling switching between a mode for executing normal operation processing, in which vector processing by the multiple parallel pipelines is performed, and a mode for executing coordinate transformation processing; first input vector registers equal in number to the order of the coordinate transformation; and a plurality of selection means for selecting the data to be stored in the first input vector registers; and wherein the control means instructs the selection means so that, in the mode for executing normal operation processing, different vector elements are stored in the first input vector registers of the different pipeline operation mechanisms, and, in the mode for executing coordinate transformation processing, the same vector elements are stored in the first input vector registers of the different pipeline operation mechanisms.

2. The vector processor according to claim 1, wherein the pipeline operation mechanism further includes: a plurality of second input vector registers; a plurality of scalar registers; second selection means for selectively outputting either the output from the second vector registers or the output from the scalar registers; a plurality of multipliers for multiplying the output from the first input vector registers by the output from the selection means; an adder for adding the outputs from the multipliers; and an output vector register for storing the output of the adder; and wherein the control means instructs the second selection means to select the output from the second vector registers in the mode for executing normal operation processing, and the output from the scalar registers in the mode for executing coordinate transformation processing.

3. A vector processor that performs vector processing by multiple parallel pipelines and includes k pipeline operation mechanisms, wherein each of the pipeline operation mechanisms includes: a plurality of first input vector registers; first selection means for selecting the input to the first input vector registers; a plurality of second input vector registers; a plurality of scalar registers; second selection means for selectively outputting either the output from the second vector registers or the output from the scalar registers; a plurality of multipliers for multiplying the output from the first input vector registers by the output from the selection means; an adder for adding the outputs from the multipliers; an output vector register for storing the output of the adder; and control means which, when the result of decoding an instruction indicates a coordinate transformation, causes consecutive vector elements to be stored one after another in the first input vector registers and instructs the selection means to selectively output the output from the scalar registers, and which, when the result indicates a normal operation, causes vector elements to be stored in the first vector registers at intervals of (k-1) elements and instructs the selection means to selectively output the output from the second vector registers.