JPS6182272A

JPS6182272A - Vector processor

Info

Publication number: JPS6182272A
Application number: JP59205040A
Authority: JP
Inventors: Tomoo Aoyama; 青山　智夫; Hiroshi Murayama; 浩村山
Original assignee: Hitachi Ltd; Hitachi Computer Engineering Co Ltd
Current assignee: Hitachi Ltd; Hitachi Computer Engineering Co Ltd
Priority date: 1984-09-29
Filing date: 1984-09-29
Publication date: 1986-04-25

Abstract

PURPOSE:To use a large memory space by providing a storage managing circuit using an expanded storage as a base in a vector processor and using the storage management circuit so as to assign the large scale memory space to the user. CONSTITUTION:An instruction stack has distinguishly instructions for each processor so that three kinds of processors: a paging processor 107, a vector processor 102 and a control processor 101 are processed in parallel at the same time. Further, a decode processing managing device executes plural decode processings for instructions of the said instruction stack in time division. Since a logical device is provided and a program existing on an expanded storage device 108 used in the partial set of the program is executed by plural main storage devices 104, 105 and 109, the area on the expanded storage device 108 used in the partial set of the program is loaded and stored to a designated main storage device.

Description

[Detailed description of the invention]

〔発明の利用分野〕本発明はベタ１〜ル処理装置に係り、特に大規模利学技
術剖算に好適な拡張記憶装置を具備したマル千−ｆｒ＋
セッサ構成のべ々１ヘル処理装置に関する。「発明の背景１今日、科学技術計算の分野では、演算の高速要求に伴な
い大きなメモリプ９間（数百ＭＢ）が要求さねている。二のような大規模なメモリ空間を複数１−ぜに割当て、
かつ、そ才１を主記憶装置のみで実現することけ費用の
点から不可能である。特じべ′７１．ル処理装置は汎用
機よりも高速性を要求されるため、高速のメモリ素子を
使用することに７１１、＝＋ストを下げろことが困難で
ある。ユーザの大規模メモリ空間内空間る要求とコスト面の要
求を満足させるため、主記憶装置の外に大容量の拡張記
憶装置を実装したスーパコンピュータが公表さおでいる
（小高他、［日経ニレノア　１゜［１ニクスＪ　Ｎ、ｑ
　３１４、Ｐ、＋５９〜１８４、日経マグロウヒル　Ｉ
！１８３）。この拡張記憶装置は１−記憶装置どけ異１
１、ワード慴位のアクセスではなく、ページ（４ＫＢ）
中位の連続アクセスによって、ｆｌＲＮＯＯＯＭバーｒ
ト）７秒７）’−タノデータ転送速度を達成している。このデータ転送速度は汎用計算機の主記憶装置上のデー
タ転送処理速度よ１１も大きいが、成る大きなデータブ
ロックの連続的なアクセスでのみ実現されるデータ転送
速度なので、ベクトルプロセッサに拡張記憶装置を直結
し、ユーザのベクトル処理で数百ＭＢに及ぶ大メモリ空
間を実現することはできない。〔発明の目的〕本発明の目的は、拡張記憶をベースとし・た記憶管理回
路をベクトル処理装置に具備し、該記憶管理回路によ−
）でユーザに大規模なメモリ空間を割当てるτとを可能
にし・、大メモリ空間を使用するフ、−規摸利学技術Ｈ
４算を実行できるベグ１ヘル処理装置をｔ７Ｆｆ　ｒｌ
Ｅすることにある。「発明の概要］本発明はベグ１ヘル処理装置を、複数の主記憶装置　鉱
Ｑｆｉ記憶装置、ページングプロセッサ、ベクトル処理１７、各プ「１セツサのデータ処理の並列実行を実現[Field of Application of the Invention] The present invention relates to a flat file processing device, and in particular to a flat file processing device equipped with an extended storage device suitable for large-scale computer science calculations.
The present invention relates to a processing device having a processor configuration. Background of the Invention 1 Today, in the field of scientific and technical computing, large memory spaces (several 100 MB) are required due to high-speed calculations. Assigned to
In addition, it is impossible to realize Sosai 1 using only the main memory device due to cost. Tokujibe'71. Since a file processing device is required to be faster than a general-purpose machine, it is difficult to reduce the cost of using a high-speed memory element. In order to satisfy users' demands for large-scale memory space and cost, supercomputers equipped with a large-capacity expansion storage device outside of the main memory have been announced (Kodaka et al., [Nikkei Nirenor). 1° [1 Nix J N, q
314, P, +59-184, Nikkei McGraw-Hill I
! 183). This extended storage device is 1-storage device is different 1
1. Page (4KB), not word-level access
By medium continuous access, flRNOOOM barr
g) 7 seconds 7)'-Tano data transfer speed has been achieved. This data transfer speed is 11 times higher than the data transfer processing speed on the main memory of a general-purpose computer, but since it is a data transfer speed that can only be achieved by continuous access of large data blocks, the expansion storage is directly connected to the vector processor. However, it is not possible to realize a large memory space of several hundred MB through user vector processing. [Object of the Invention] An object of the present invention is to equip a vector processing device with a memory management circuit based on expanded memory, and to
) allows the user to allocate a large memory space with τ, and uses a large memory space.
t7Ff rl a Beg1Hel processing device that can perform 4 arithmetic
It is about doing E. ``Summary of the Invention'' The present invention realizes parallel execution of data processing for one set of data in a plurality of main storage devices, a paging processor, vector processing 17, and each processor using a VEG1H processing device.

【
７で、拡張記憶装置にに存在する大規模なプログラムの
実行を拡張記憶装置よりも小さな容量の複数の主記憶装
置上で実施するようにしたことである。〔発明の実施例〕以下、本発明の一実施例を図面を用いて詳細に説明する
。第１図は本発明に係るベクトル処理装置の基本１ｉ１’
ｆ　ＩＲで、−１ントロールプロセツサ１０１、ベクト
ルプロセッサ１０２、ページングプロセッサ１０７、を
記憶装置１０／Ｉ、１０５，１０９．拡張記憶装置１０
８、スイッチング回路１０３，１０６に上って構成され
る。ベクトル処理を規定するプログラムを次のような構成の
オブジェタトコードで記述する。プログラムを構成して
いるベクトルおよびスカシ命令群の成る集合が、他の集
合とは異なるメモリ空間内のデータに対しての処理動作
を規定している場合、二の命令集合をセグメントと云い
、該命令集合のポイントするデータ空間を局所メモリ空
間という。このように定義されたセグメントと局所メモリ空間は科
学技術計算プログラム中に数多く見出すことができる。たとえば次のようなりｏループでは、Ｄ○　１００　１
＝１．Ｎ＋００　　Ａ（Ｔ）＝Ｂ（Ｔ）十Ｃ（Ｔ）ｒ′）Ｏルー
プを構成するスカシ、ベクトル命令列がセグメントであ
り、配列Ａ、Ｂ、Ｃとｒ）０制御変数Ｔ、Ｎが占有して
いるメモリ空間が局所メモリ空間である。従って、プロ
グラムはセグメントと局所メモリ空間によって定義する
ことができる。この定義によりげ、プログラム実行に必要十分な処理と
け、（Ｉ″ｌｌ現在セグ、メントを実行し、でいるのかを示
す指標。Ｉｇｌ　　局所メモリ空間を生成格納する指示、である
、このことを第２図のプログラムＰＬ、についていうと
、ａ　−Ｃけセグメントを示し、Ｔ−ａはセグメントが
ボーｒントする局所メモリ空間を主記憶装置上に生成指
示する命令、ＳＴａけ該メモリ空間を拡張記憶装置上に
格納する命令を示す。ａ。ｂのセグメントについても同様である。プログラムはデータ処理を規定するセグメントと局所、
メモリ空間と該空間を管理する命令とによ一〕で構成す
ることができるが、プログラム自体もまたセグメントと
局所メモリ空間によって定義される。このことを第２図
のＰＩｌについて説明する。成る命令ここでけＳＴｂにおいて、データ処理がＳＴｈ
以前と以後に分割できるとする。この仮定はプログラム
が各構成セグメントを順序だって実行するものという仮
定の下に論理的に構成されていわば妥当である。ＳＴｂ
点におけるプログラム分割を行−〕だ場合、■、ａ命令
からＳＴｂ命令に到ろ命令群を主記憶装置上にロープイ
ンクする命令と該命令群を拡張記憶′ＪＡ置にストアす
る命令とプロゲラ１１の実行を指示する命令を定義する
ことによ１）、プロクラ１１をプログラム自体のローデ
ィング、実行、ス１〜７の３種類の動作定義に縮約可詣
である。この縮約を第２図のプログラムＰＩに示す。二
のように縮約されたプログラムは、比較的小さなメモ＋
１空間上にプロクラムの実行が終るまで保持することか
できる。科学技術計算の場合は。プログラムが自分自身を処理することけないが、今後よ
り複雑なデータ処理を行う場合、プログラムが自分自分
を変形することが起り得る。このような場合、プロクラ
ムを主記憶装置にローディングする命令の次にプログラ
ム自身を変形処理する命令を置くことにより、目的の処
理を行うことができる。第２図のＰ２は、このプログラ
ム自身を変形するプログラムの構造を示したものである
。Ｐ２において、α部がプロクラ１２α部（これはセグメ
ントａ、ｂから構成されている）を変形するセグメント
である。ＪＴｉ、　ｔｂ　、第２図に示り、たように、プロクラ
ムを、プ「１ごノ→ム白Ｉ・を主記憶装置上ロート、編
集、スト・アする部分を［プロ／７ラムカーネル」、プ
ロゲラ１１カーネルによって引用対象になる部分を「プ
ログラムボテ、イ」と呼ぶ。第１図のベクトル処理装置において、第２図Ｆ）　に上
って表示さ才ｌたプログラムは次のように処理される。プ「１ダラムは初め拡張記憶装置１０８上に置か狛でい
る。ベクトル処理装置が起動されると、コン１−「１−
ルプロセノサ１０１はページングプロセリ什１０７を起
ｆｆ１ｌ＋　Ｌ７、第２図のＰ７のプログラムカーネル
部を主記憶装置１０９にロードする。ぺ一＝ジ〉・グプ
ロセッ升１０７のロープインク完了報告をＴ１ンＩ・ロ
ールプロセラ廿ｌｏｔが受取った後、該プ「１セツサ１
０１けプログラムカーネル部の先頭の命令より実行を開
始する。プログラムカーネル部の先頭命令はプロゲラ１
１ボデ７ｆαを主記憶装置へ「１−トする命令であるの
で、二の命令の指示により、プ［１ηう１１が拡張記憶
装置１０８から主記憶装置１０９にロードされる。次に
コントロールプロセッサ１０１けセク、メントαの指示
により、主記憶装置１０９にロードしたプログラムボデ
イヲ修飾し７１次のセグメントＥ　Ｘ、　Ｅ　Ｃでコン
トロールプロセッサ１０１の命令処理位置を示している
ポ・インク（以後ｃｒ’ｓｗという）を書替え、制御を
プロゲラ１３ボディ部に移す。この動作は汎用計算機の
無条件ブランチと同様である。プログラムボティ部の先頭命令はａセグメントの局所メ
モＩＪ空間を主記憶装置１０４又け１０５ヘロートする
命令Ｌ　ａであるので、コントロールプロセッサ１０１
けページプロセッサ１０７．スイッチング回路１０６に
指示をケ、え、拡張記憶装置＋０８上のａセグメントの
局所メモリ空間を主記憶装置１０４又は１０５ヘローデ
イングする。こ二では仮に局所メモリ空間が主記憶装置１０４ヘロー
トされたとする。コントロールプロセッサ１０１は続い
て、ｃｐｓｗの制御に従ってセグメントａ内の命令を実
行する。仮にセグメントａ内にベグ１ヘル命令が出現し
たとする。この時、コン１へ「１−ルプロセソ什１０１
けベクトルプロセッサＩ　ｎ　２　、スイッチング回路
１０３に起動および指示を行い、ベクトル処理を主記憶
装置１０４に対１−４て行う二とが出来るようにパスの
接続を指示する。ベグ１ヘル命令はコンｌ−「＋−ルプ
ロセッサ１０１によ、）で主記憶装置１０９から読出さ
れ、ベクトルプロセッサ１０２に送られ、すてにローデ
ィングさ才１ている主記憶装置１０　’Ｉ　」；の局所
メモリ空間を対象にベクトル処理が行われる。ベクトル
プ「１セツ什１０２でベクトルクル環が行われている間
、−１ントロールプロセッ廿１０１はベグ１ヘル命令の
直後にベタ１ヘル処理の終了を待つ命令がなけ才棗ば、
次の命令を実行する。次命令が仮にセグメン１〜ａの次
のセグメンｌ−ｈの局所メモリ空間の主記憶装置へのロ
ープ、インク指示とする。このとき〜にノ１〜ロールプ
「１セツサ１０１けスイッチング回７ｆｊ　Ｉ　ｎ　６
　、ページングプロセッサ１０７に指示をりえ　拡張記
憶装置１０８１−のセグメントｂの局所メモリ空間を主
記憶装置１０５ヘロートする。ごのｒ＋　−＝−１：動作はベクトル処理狸と同時に行
うことができる。この局所゛メモリ空間の主記憶装置へのロー１−動作の
べ／７トル処理との同時実行機能を有効に使用するため
、オブジェクトコードのプログラムボデーイ部を最適化
する必要がある。拡張記憶装置上のプ「１ノノラ１１ボ
ディ部は、最初、第３図ａのよう内・構）告をとってい
る。第３図ａのプログラムにおいて、各セグメントはそ
のセグメンｉ・を実行するのに必要な局所メモリ空間を
主記憶装置ヘロートする命令と、セグメントの規定する
処理の結果を拡張記憶装着へストアする命令を前後に有
する。この形式のオブジェク１ヘコートは、本発明のベ
クｉ・生処理装置では局所メモリ空間の主記憶装置への
ロード、データ処理、処理結果の拡張記憶装置へのスト
アの順序だった処理を指示するので、性能向上が期待で
きない。コントロールプロセッサ１０１どページングプ
ロセッサ１０２間の並列動作を可能とするため、命令列
を次の４種類に分割する。 ■　コントロールプロセッサにおいて、データ処理を行
う命令（以下Ｃ命令という）（？）ベージンクプ「１セツサでページング動作を規定
する命令（以下Ｐ命令という）（ｊ）コントロールプロセッサでページングプロセッサ
の状態を調べる命令（以下ＣＷ命令という）の１　ペー
ジングプロセッサでコン１〜［１−ルプロセノサの状態
を調べる命令（以下ＰＷ命令という）プ「ｌグラ！いけ
第１図の主記憶装置１０９にロードされ、コン１〜ロー
ルプロセツサ１０１によって解読されるが、Ｐ命令が出
現し、た時は、ページングプ［Ｉセッサの状態によって
該命令の指示がコントロールプ「１セツサ内に保持され
るかページングプロセッサ用に送ら汎るかが決定される
。Ｐ命令がコントロールプロセッサ側に保持される場合
、たとえば、ページングプロセッサがビジー状態でＰ命
令を受付られない場合、このときコントロールプロセッ
サかページプロセッサの終了を待つように処理を行うと
、コントロールプロセッサの命令デコート処理にシリア
ルな状態を発生させることになる。このような場合、目
的とし、ているＰ命令がページングプロセッサがビジー
であって実行できなくとも、Ｐ命令の次の命令が論理的
に実行可能ならば、この次命令を実行し、かつ、ページ
ングプロセッサのビジー状態が解除された場合、直ちに
実行留保状態のＰ命令を実行する必要がある。このよう
な一種の命令追越し、動作を行うため、命令デコード処
理を、Ｃ９Ｐ命令の両系統に分割する。し・かり、コン
トロールプロセッサの命令実行は、第３図ａのコートの
ように概念的に実行されわずなら、ないので、Ｃ，Ｐ両
系統のテコート処理をタイムスライスし１、時分割で交
互に命令を実行するように制御する。この制御により、
Ｃ命令を実行し２なから、Ｐ命令をある一定のタイミン
グで起動し、ページングプロセッサのビジー状態が解除
された場合、直ちにページングのための情報をページン
グプロセッサに送るような処理が可能になる。しかＬ・
、この時分割デコード処理は両系統の命令の処理に何ら
の同期処理も含まない。このため、第３図ａのようなペ
ージングに対して、シリアルな処理を規定しているプロ
グラムの動作と一１２＝一’ＰＩ　１−ｆＪ−い場合が発生する。たとえば第３
図ａにおいて、Ｐ命令の■、ａが行われたとき、時分割
ダニ１−１−処理を行うと、Ｌ　ａ命令の処理が行われ
ない時でもセ・グメン［−ａのデータ処理が開始されて
し、まう。このような不都合を解消するため、ＣＷ。ＰＷ系列の命令が必要になる。この系列の命令を追加【
７、新たにオブジェクトコードを生成したもの右第３図
１）に示す３．第３図すのオブジェクトコードは次の処
理を仮定している。（１）　　ページングプロセッサは局所メモリ字間をロ
ードすべき主記憶装置を区別する手段をもち。か−）　局所空間のローディングが完了した際、どの主
記憶装置へのデータ転送が完了したかを示す情報をコン
トロールプロセッサへ報告する。す）　局所メモリ空間が「ｌ−］〜さオｌる主記憶装置
を１．２と命令し、二の順序に従って使用する。を蓬１　　第３図りの７１列はページングプロセッサ用
の命令列で５　ν列は一１ントロールプロセッサ用の命
令である。ただし、ページングプロセッサ用の命令も、
テコ１−ド処理はコントロールプロセッサで行われる。第３図すの構造のオブジェクトコードは次のように実行
される。まずμ列のＬ　ａ命令がデコートされ、ページングプロ
セッサにセグメントａに必要な局所メモリ空間の転送の
開始指示が行われる。次のタイミングではν列のＴＪｎ
Ｌｊｌｌ命令がデコートされるが、この命令は主記憶装
置（１）へのデータ転送の完了を待つ命令であるので、
ν列の次のセグメントａは実行されない。あるタイミン
グ後、μ列のｒ−ｂ命令がデコードされる。この時ペー
ジングプロセッサがビジー状態であると、このＬ　ｂ命
令は実行されず、命令スタック上に保持される。次のタ
イミングでは再び制御はν列の命令に移行する。この時
、再びＴＪｎｔｉｌｌ命令が実行され、コントロールプ
ロセッサはページングプロセッサの終了を待つことにな
る。ページングプロセッサがセグメントａのための局所
メモリ空間の転送を完了すると、コントロールプロセッ
サの制御がμ列の命令上にあるときは、命令スタック上
の命令即ちＬｌｌが直ちに実ｔ−ｔされる。該−１ン１
〜ロールプロセツサの制御が１・列の命令卜にある時は
ＴＪｌｔｊｌｌが実行され、二（ｈ小部は直に完了Ｌ７
　　次のタイミングからセクノントａが実行さｈろ、こ
の状態で、ある時間ではμ列の命令が、その池の時間で
はν列の命令が実行さ１１ろ。７１列の命令がＷ命令（第３図ｉ）参照）に至ると、μ
列の命令のデコート処理がν列の命令列でＲｅｌ命令を
検出するまで停止１．する。工のＷ命令によって、１゛
ノド「１−ルプ［１セラ升かセフメンｌ−ａのデータ処
理を完了１−２た後に、対象となった局所メモリ空間を
拡張記憶ヤ装置へ転送する順序性が保証できる。　　−
ｊｒ、　ν列の命令は、Ｒｅｌ命令後はＴＪｎｔｉ、］
２命令であるかＩ）、　主記憶装置（２）への局所メモ
リ空間の転送が完了し、ていれば直ちに次のセグメント
ｂの実行に移行する。第：３図すにおいて、ν列にＩＴ
ｎ白１系の命令の連続かＥ出されるが、こればｐ列でＳ
Ｔａ、ｆ−、ｃ命令のように同一の主記憶装置に−）い
ての局所メモ１１空間の転送指示が行わわろため、ペー
ジングプロセッサからの完了報告＝１５− が２同乗ることに対応し・ている。第３図すの構造のオブジェクトコ−１くを本発明のペク
ト・生処理装置で実行し、た場合のコン１へロールプロ
セッサとページングプロセッサのビジー状況を第４図に
示す。第４図において１点線の部分がプ「１セラ升・ウ
ェイｉ・時間である。第４図がｌ）明らかなように、因
果関係を満す範囲内で両プロセッ廿は自動的に最大のデ
ータ処理が行ゎ狛るように制御される。以上示Ｌ７たコン［・ロールプロセッサとページングプ
ロセラ升間の制御法が、コントロールプロセッサから起
動されるベクトルプロセッサとコントロールプロセッサ
の間にも成立する。ただし、ベクトルプロセッサはデー
タ処理の性質−１＝、プロセッサウェイトが困難である
。このベクトルプ［１セノ廿の制御をとり入れた場合の
オブジェクトコードの構造に１いて以下にのべる。ベクトルプロセッサの起動は、１セグメン１〜内でのみ
行わわ、複数のセグメントにまたがって行われない。最
も甲純に考えると、第５図ａのような構造になる。第５
図ａに、おいて、ν１列の命令けベタ１ヘル命令を示し
、ν１列の命令はベクトル命令ＩＪＪ、　、＋ｌの命令
を示す。Ｘりトル命令以外の命令のてどを以下スカラ命
令という。コン１−ロールプ「１セノ什から起ｉｌｌさ
れるベタ１〜ルプロセツサを制御する命令とし、て最小
限１次の命令系列が必要である。中　ベタ１−ルプ［１セツサを正規に動作させろための
準備を行うセットアツプ命令、ｔ２＞　　７りトルブ「１セツサを起動するための命令
（Ｊ゛〕、下ＥＸ入７ｐという）。（■　べ！１１ヘル処理の完了をチェックするため命令
（〕−゛Ｊ、下ＴＶＰという）。二の３種類の命令系によ−〕で、　＝＋ン１−ロールプ
「１セノ什の命令デ：１−ドタイミング論理を変更する
−とかくベクトルプロセッサの制御が可能になる。し、
かり、第５図ａに示さｈている命令によって記１ｉｔｓ
さｆ［ろプロセッサの動作を考慮すｔｔ、　ｌず明らが
なように、：１ン１−［１−ルプ［１セツサの制御がＴ
■Ｐ命令に至ると、その時点で＝１ン１〜［１−ルプロ
セッサのデータ処理けりエイ１〜し、、ＴＶＰＪ′、Ｊ
、降のスカラ処理ａ−命令列を実行することができない
。二わけベク［・ルプ「１センサをコン１−ロールプロセ
ッサに付加させたために生じた無効時間である。二の無効時間をｔ側車化し、上り高速のベクトル処理装
置を実現するため、セグメントａをさらにそのデータ処
理の型に従って次のように細分する。中　べ））［・ル９！！！埋以前に開始し、なければな
らない処理（以下ａ１処理という）ｑ・　ベクトル処理と同時に実行できる処理（以下ａ７
処理という）（ＡＩ　　ベクトル処理以後に行わなければならない処
理（以下ａ、、処理という）ｒ７ｊ）　　ベタ１〜ル処理の完了のチェックを開始す
る命令（以下５ＴＶＰという）弔）　ａ、処理どａ４処理の境界を示す命令（以下Ｂｎ
ｒＴＮｒ１という）上記の分類に上−〕で示された命令によってセグメント
−ａを再構成すると第５図すをｔｌ）る−第５図りの構
ｊ告のプロゲラ！いについてコン１〜ローＪレプロセノ
什とベクトルプロセッサの制御は次のように八・、乙、
：ｆｌｌ系列の命令け＝１ンｉ・［］−ルプロセッサτ
実行さ１１７１．ＥＸＶＰ命令は＝１ント「１−ルプロ
セノ升τ子−１−ドさ４１．ベクトル処理−動オろ、ベ
グトＩ＋、／命餐列は起動さ１またベタ１〜ルプ［１セ
ノ什て一１ンｌ−「１−　ルブロセノサを介して主記憶
°（重囲１０９（第１図参照）から読出され、ベク１、
ルプｒ１セッ什てデー１−１．さｉ＋実行さ］する。ａ
。系列の命令はコントロールプロセッサセッせプｒ１ヤソ
甘の動作どけ非同期にテコ１−ドされ実行さｔ＋ろ−Ｓ
Ｔ■Ｐ命令によって、：Ｉントロールプ［１セノ升の子
：」−ドサｒクルけ２−］のフェイズに分割前１１ろ　
第１のフエ、ｒスではベクトルプロセリ什でのデータ処
理が終了し、たか否かがチェック主ＩＩ、第２σｉ　”
７　工、（ズでけａ２系列の命令が実行さ１１イ、　該
２フエ（ブけあｒ）、かし、め定義されたりｒミン′ノ
間陪で相互に定着さおだ時間を占める。ハ゛シ１．ルプ「１セツサのチー　タ処理が完了し、た
場合第１のフェイスけｔｌ’ｌ　ｉｈｔする。ａ７命令
の処理において、ＢｎｕＮｎ命令を検出し７た場合は第
２のフエイズは消滅する　−の２様のデ：１−トフエイ
スの導入により、ベクトル処理の完了を調へながら、同
時にスカラ命令の一部の処理を実行することがτきろ。二の機能により、コン１へ［１−ループ［１セツサにベ
クトルプロセッサを付加したために生じた「無効時間］
を極小化できる。１！Ｌ　Ｉ：、ベタ１ヘル処理の終結についての無効時
間の極小化について述べたが、ベクトルプロセッサ起動
についても全く同様の議論が可能である。次にプログラムのロケーションについて考察する。いま
まで述べて東だようにプロゲラ１１とデータは拡張記憶
装置ｌ−の実アドレスによって、そのロケーション位置
が定義されている。当然プログラムのデータ処理手続も
当該ロケーションによって記述されている。一方、局所
メモＩｌ空間を主記憶装置ヘロードし、このロードされ
た新たな空間に対し２データ処理を行う場合、以前のロ
ケーションによって行う二とが不可能になるにの場合、
局所メモリ空間を主記憶装置のどの番地に１−＋　−Ｉ
し、たのかを示す情報をコントロールプロセッサが、−
２０＝局所・メモ１１空間をｒｌ　−１ニオる命令をデニ１−
ドする際抽出（、ての情報を基に″１ント「Ｊ−ルプ「
１セツ什（ハフ力う処理、ベクトルプロセッサのベクト
ル処理１−Ｆ−１て発生するメモリアクセスのアト１ノ
スリロケーシヨンを行う。二のりロケーションされたア
ト１ノスは主記憶装置の実アドレスを示している、従っ
て＝１ントロールプロセッサ、ベクトルプ「１セツサに
７ドＩノスリ「１ケ一シヨン機構を具備するニどに上り
局所メモリ空間におけるプログラム実行か可能になる４次にページングプ「１セソ叶の機能について考慮する。前述の議論はページングプｒ：１セッサの局所メモ１１
空間管理についてのみ行わ打ており、局所メモリ空間そ
のものの構造についてけ行わわていない−ベク［・ル処
理けある特定のデータ群に対し一様のデータ処理を行う
ことであるので、スカラ処理のように［処理の順序性」
について、上記のデ〜　タ群内の要素について必ずしも
保証する必要は八・い−次のような１１０ループを考え
ると、ｎＯ１００１＝１．］ＯＯＯＡ　（Ｉ）二Ｂ　（■）　十Ｃ（■）１００　　Ｃ０ＮＴ　Ｔ　へＩ　ＴＪ　ＥΔへ９１〜ル
要素けＩ＋　２＋　　３＋・・・のようか順序でＲ］算
する必要けろ・く、１　＋’　３＋　２＋　４　＋　５
＋　　・・の順序でもよい。ｍ＝でに記のＤＯ小ループ
同じセグメントに次のよう７ｊ、ｙ　ｒ）　Ｑループが
存在し、たどする。ｎｎ　　２ｎＯＴ＝１，９９９，２ｎ　（Ｔ’）　＝ｌ”３　（Ｔ）＊Ｃ（Ｔ’）２　ｏ　
ｎ　　ＣｏＮＴＴ　ＮＴＪＥ後者のＤＯ小ループ前者のループとけ異なり、Ｂ。Ｃベタ１ヘルへのアクセスは連続でなく非連続である。ベグ１ヘル処理はテーク処理を間断なくパイプライン演
算器に行わせ、処理速度を向上させる処理であるため、
被演算テークは非連続よりも連続番地に置く方が効率が
良くなる。Ｂベタ１〜ル要素がＩ　１２１３１　　のよ
うにロケーションされている場合、二の［Ｉケーション
は前者のＤ○のベタ１〜ル処理にけ適しているが、後者
のＤ○の処理には前者程適し、ではいない。し、かし、
Ｂベタ１−ルのロケーパ、・コンを、Ｉ、３，２．４，
５．７．６＋　　８＋ａ）よ−）にすると、後者のＩ］
○クル理（二於いても連続７′ノセスとな：１、テーク
処理効率か向１−する。すた＾１Ｔ昔σ）ｎｏ処理において、ベク］・ル処理で
けテークの順序性は保証し、なくて白いので、アクセフ
、　ｔｙ＋沖１青性は変Ｉ′″ｌオ、データ処理効率は
低下し、ない　　射にｎｌどびのア′ノセスヒｎとびの
アクセスか連続７　））セスどｌＩＨ存する場合、配列
をｍ＊ｎのｆＩ１余に１〕てｍ＊ｎ個の部分集合に分離
することにｔｌｌ、ｍどびの７グセスがポイントする部
分沖合！−１１とびの７！ノセスかポインダする部分集
合とじ分けろ二とができるｍ−のよう内・操作を行って
も’；ｊｌｔ　Ｉ＃ｌニア’、）セスにけ影響をりえ＾
・い。二の分離操作しＬ；１部分に５合内でけそ旧ぞわ
のアクセスは連ｉ、プ１：１７．　ｆＩ：乙−従一、て
、拡張記憶装置−に１′Ｚある局所メモリ空間を主記憶
（−１ｉ　ＩＦｉへ「１−ドする際、−１−記の配列の
部分集合再配置化をイイら二とに上って局所メモリ空間
のｔｉＶｔ造の最適化を行い、ベク［・ル処理の高原ｆ
ヒ右図る二ｋかできろ。てのようｆ′８′、処理分散を
ページソタプ「１セソ甘どベクトルプ「１セソ甘間で行
い、ペクト・ル処理装置の高速性をより向上させうる。１−記の局所メモｊｌ空間の部分集合再配置による最適
化どけ、セグメントのアクセスが行わ第１ない部分集合
を主記憶装置へローディングＬ、ないことではない。ア
クセスの行わわない部分集合を主記憶装置ヘロードし、
なければ主記憶８袖は小さくなるが、局所メモリ空間を
拡張記憶装置へ書き込む際、アクセスの行われなかった
主記憶装置−ににローディングさねている・い部分沖合
を書込力に先立って拡張記憶装置から読み出し１部分集
合再配圃のｊψ操作を行い、その結果を拡張記憶装置に
書込まなければならない。拡張記憶装置へのアクセスは主記憶装置のアクセスとは
異なり、一種類のアクセスが連続し、た時にのみ高速と
なるので、上記の、ように拡張記憶装置への続出書込が
交互に発生し、た場合、著し、いテーク転送速度の低下
を来す。このため、セグメントがアクセスし、ない局所
メモリ空間の部分集合をも主記憶装置へロードし、でお
き１局所メモリ空間ａ：　ｌｉｊ’、　ｌ’ｊｆ３記憶
装置に書込む際には、主記憶装置への舌１．４！ｌ１ｆ
ｉ：アクセスを行い、部分１ニ合再配置のｊゲ！操作を
行−１て１１）：りに記憶ヤ装置ヘス１へ７するように
する。Ｊ″ｌ、１　プ「１グラ１１のセグメント処理ヒを行い
、主記憶装置ｒｔの情＋ｌＩを局所メモ１１窄間に限定
し、か−）局所メモ＋１空間の部分集合の再配置による
メモリ空間のセグメント処理に対する最適化を行うこと
にトリ　拡弓１４記憶装置１−に定義されだ大規模内・
プログラムどぞのプログラム実行に必要な大メモリ空間
右十記憶装置トし存在させた場合と同様のベタ１〜ル処
ｐｌ＋を、上り小さい主記憶装置上で行うことが丁きろ
１第６Ｕ２１は本発明のベダトル処理装置にｆｊけるコン
ト［１−ルー／’ｒ＋セッ什の命令デコーダの前処理部
分の概略ブ［１ツ！７図である。第６図にｔｊいて、命令は主記憶装置１０９（第１図参
照）よ１１パス６５０を介し７てフリップフロ・・ノブ
（以下ＦＦど呻す）６００に入ろ−ＦＦ６００１−の命
令は命令管理回路６０１によって、ペーシングプ「Ｊセ
ッサ、Ｔ１ント「１−ルプロセソ什、又けへ９１−ルプ
「ｌセラ什用の命令に分類さ１シ、タイミン′ノ合せの
ためのＦＦ６０２を介し７てス、ｒソチング回路６０３
に送［″Ｊ第１る。ス、ｒツチンク回路６０３により命
令の型に従−］で、ペーシンノノ゛プロセソ什川の命令
は命令スタック６０４に、　ｍｌントロールプロセノサ
用の命令はスタック６０５に、ベタ１ヘルプ［１セツサ
用の命令はスタツノノ６０６に入る。６０８は命令）、
タライノ６０／ｌ内の命令移行を管理する回路であ暴）
、該回路はフエイスジエネ１ノータロ０７によって定義
される特定フェイズたけ動作する。以下二のフェイズタ
・ｒミンクをＡフェイスどよぶ、スタック「３０４に入
った命令は１．ヘフエ・ｒス毎にス９ッグを移行し１．
命令ｌノシスタ６０９に入る。該命令１ノジスタ６０９
上の命令はテコーダ６１０によって子コードされ１局所
−メモリ空間をロート又はストアする命令とウェイト命
令とその池の命令とに分類される。ロート・ス１−ア系
命令の場合、セ１ツクタロ１２けパス５１を、ウェイト
命令時はパス６５２を、その他の命令の時はパス６５３
を選択する。 −Ｊｊ　　ページングプロセッサ１０７　（第１図参照
）ノ状態（１パスｎ　５　・Ｉ　ｌ　’＋ｆｆ４　Ｌ　
テＦ　Ｆ　６１　］　１１：　（呆ＥＩＦ主１１τいイ
）６′″二で　ページングプ「１セツサか１′シーのＩ
Ｉ　’ｐをＦ　Ｆ　ｔ；　１１力状態が″ビ′とし１．
そうτへ゛い場合も−”　ｎ　”とする。またスタック
管理回路（’ｉｎ８にけパス６　！”ｉ　５右）介［、
τ向合移行押出情報か１″！、＋’、　４１ろものどオ
ろ−「ｌ　−ｌ：・ス１〜７系命令の場合、セレクタ６
１２（」パフ、［）５１ど〔３５５の接続右行うので、
ページソリ゛プ「１セ・・）什１０７がビジーのｔｌｊ
ｌ　１１１　、　ｏ−ド又はストゴ′命令は命令１ノシ
スタ６０９Ｆに留る。ペーシソゲプ「Ｉセソ什１０７か
ビジー状態でろ・くな−Ｉ九用今、パブ６５１むよび６
５５を介して命令移行許可がスタック管理回路６０８に
対し、て行わ１１、局所メモリ空間σ）ｒｌ−ド又はス
トア命令がパス６５　（ｉ　％通し、てページングプロ
セッサ＋０７にｉ’ｊ；　ｒ’Ｉｉ１：ｇ。一′でページン））ブ「１セソ廿ウエイ１へ命令動作は
一１ント「１−ルプ「１セツサ１０１のＲＥｌ、（リリ
ース）命令に関連するので　コントロールプロセツ什系
列の命令の処理について説明する。コントロールプ「１セラ什系列の命令はスタック６０５
に入った後、スタック管理回路６１４によ−２て命令移
行が管理さ狛る。スタック管理回路６１４けフエイズシ
ェネ１ノータロ０７がＡフエ、（ズ以外の７エイズを生
成【−１たどき（Ｊ！Ｊ、下このフェイズをＢフェイス
という）に作動する。また該管理回路６１４け２種類の
命令移行押出指示をパス６５７．６５８を通して受けと
る。こわらの抑１１−指示かない時、命令けＢフェイズ
が生成される毎にスタック６０５を移行し、命令レジス
タ６１３に入る。命令レジスタ６１３上の命令はデコー
ダ６１５によって解読され、Ｒ，Ｅ　Ｌ命令以外の場合
。ＦＦ６１６に′１″がセット・される。このＦＦ６１６
け一１ントロールプロセッサ１０１が成るプログラムの
セグメントを実行中であることを示す。テ：】−タ゛６
１５がＲＥＴ、命令を解読すると、ＦＦ６１５けＩＩ　
ｎ　Ｄにリセットされる。第１図の主記憶装置ｆ’ｆｆ
１ｌｌ’）４，１０５の状態は次の状態を考える。（１）主記憶装置に局所メモリ空間が［１−ドご１１゜
一１ソｌ−ｒｌ−ルプ［１セツ什およびベクトルプロセ
ノ什かｌ’、　％、川用ヒｔ（Ｉニー：）た状態（］゛
ノ、下ｌノティ状態１・い５）ｔＷ’　：’Ｅ記憶（＝！２１Ｆ’ｆにページング−１
ロ七ソ什かアクセス［τいろ状態Ｃ以下ジェネレ−ト状
態という）＋２Ｊ、、ｌ−の上記憶装圃の状態はページ
ングプロセッサ１０７が１ｉ′！−理し、その報告がコ
ントロールプロセノ仕１０１にろ・される。第６図では
パス６６０゜６（５１を介しτそｌ＋ぞ］ｌ主記憶！Ｘ
Ｊｉ　１面１０／ｌ、１０５の状態の報告がかさＪ’ｔ
　、　　Ｆ　Ｆ　６２０　、６２１に１い［寺さ才する
一ＦＦ６．ｌ！ｎ、６２］の１直はページン′ノ゛ずロ
ヤッ廿１０７がジェネレート状態ならば゛ビ′、１ノテ
イ状態な１″１げＩｆ　ｎ　１４とする。テニ１−夕６１５でＴＪ　ｎ　ｔ、　ｉ　＋命令が解読
されると、そのフ、１象とする主記憶ヤ装置１０４ある
いけ１０５の状態を調へろために、セレクタ６２２に指
示が行わり、ＦＦＩ’ｉ２Ｑ又け６２１の状態をパス６
６２ｖけ６６３を介し７てセ１ツクタロ２３へ送る。セ
１ノリタロ２　、ｉ　′ｒ−けＴＴｎｔｊｌ系命令の時
はパス〔；６４１１１１１を、１丁ｒ＋Ｉｉｌ系１゛Ｊ
、外の命令の時にはパス６６５側を１ｉｔＲする。仮に
ＬＴｎｔｉｌｌなる命令がデコーダ６１５で解読されて
、”］’″が主記憶装置１０４側たったとすると、パス
６６２．６６／ｌ、６５７の１妾続が行われ、ＦＦ６２
０の状態がスタック管理回路６］４に送られる。このと
きＦＦ６２０の内容が゛１″であると、スタック管理回
路６１４の命令移行処理は押出さ１（、命令レジスタ６
１３」−にＴＪｎｔ、ｊｌｌの命令は保持される。この
命令滞留処理はページングプロセッサ１０７がレデーイ
状態となるまで行われる。ページングプロセッサ１０７
がレデーイ状態になると、命令移行が行わト、パス６６
７を介し７てコントロールプロセッサ１０１の次段命令
解読部に命令が送られる。Ｕｎｔｉｌ系以外の曲以外は
、セレクタ６２３はパス６６５側を選ｌＲするので、ス
タック管理回路６１４に命令移行押出指示は行われず、
ページングプロセッサ１０７の状態とは関係なく命令は
パス６６７　、ｔ：に送り出される。再びページングプロセッサ１０７の命令テコード処理の
説明にもどる。命令トジスダに０９に１１工ｒ１・命令か東だ場合、７
−１−タ〔；１０の指示によりセ１）））タロ１２けパ
スｒｉ　！”ｉ　：！Δ選ＩＲ第２）　従って一１ン１
へ［１−ルブロセ・・）甘１０１のための命令てＲＥ■
、を発行Ｌ　＠、い限す　バージングプ「１セツ廿１０
７け命令待ちの状態てつ１ｒ１・する〜該ＲＥ■、命令
はセグメント処哩の終了時点で発行され１局所ｌモリ空
間内の処理がｊ疼−２！−ｆす、該空間が拡張記憶装置
１０８に１ト込オＩＩ　ｔ＋：け、１１けなｒ：、、　
ｔ））いという因東関係か保証さ１１ろ、べ戸１トル命令はス（ソチン１）回路６０３を経て。スタッグ６０Ｇに入る　二のスタックを管理する回Ｖｆ
Ｉ　Ｍ　６３　ｎ　′７：＄＋　Ｚ＋　、該スタック管
理回路３’ｆｌ　６３　Ｑけパス６７０を介ｌ−てべ）
７　ｌ・ルプロセｙ廿１０２（第１１″ス１参照）から
のべ））１−ル命令読出要求信号Ｉ−、、ｌニー、、て
動作り、、　フ、９　ニア　−’）　６　ｎ　６ト（７
）へ４７　’（Ｊｌ／命告ヘーパフ９７１１−に送信オ
ろ５各プクソ′）管理回路ｒ；Ｑ８，６１４．６’３０てけ
命令かスタ・・ツクに入＋７．ろ゛くなろど、そわぞオ
ーパスに８ｎ、６８１，６８２を介し、て、命令管理回
ｖ各６０１に命令送出抑１１−を要求する。命令管理回
路６０１では、ＦＦ　Ｂ　Ｄ　Ｏ内の命令と該送出押出
指示との間の関係を調へ、命令続出処理を要求し・てい
るスタックとＦＦ６０　Ｑ内の命令タイプが一致し、て
いる場合は、主起憶装＠１０９の命令続出処理を中断す
るようにディセーブル信号をパス６８３を介し７て、］
］ン１−ロールプロセッサ１０の命令読出回路に送る。第７図はコン１へロールプロセッサの命令デフ１−ダの
後処理部分の概酩ブロック図である。コントロールプロセッサ１０１のための命令はパス６６
７を介してスタック７０１にスタックされる。：のスタ
ック７０１はスタック管理回路７０２によってスタック
内の命令移行が管理されている。該命令移行動作は第６
図のフェイスジェネレータ６０７のＢフェイズ出力信号
によって行わわる。該フェイズ信号はパス７５０を介し
、第７図の回路ブロックに送ら才ｌるものとする。セｌ
ツクタフ０３は初期状態ではパス７５１を選択し、でい
るどする。従って、スタック７０　］　：Ｊ−，の命令
けＢフ工ｒブ１１″ｆ号に同期してスタック１−を移行
し、命令１ノジフタ７０４に格納さ４１る。命１＝　１．′シフ９７　ＣＩ　、４十の命令はテコ−
タフ０５によ−、て解読さＪＩ、ベクトル−１「１セツ
サ起動命令（ＦＸＶＰｉ　ベクトルプロセッサ状態チェ
ック開始命令（ＳＴＶＰ）、オン、トびヘノノトル命令
と並ｔ１１τ実行τきる命令ど出来ない命令のＩｆｆ界
を示す命令（１１ｎＴ■Ｎｎ）どに分類さ才］る。Ｆ　Ｘ　Ｎ、７　ｒ’又はＳ　Ｔ　Ｖ　ｒが解読さ１１
だ場合、デコーダ７（１５はスイソチンク回路７０６に
よって命令１ノジスタ７０４十の命令か第２命令Ｉノジ
スタ７０７ｔ＼移行するようにテータパスの接続を行う
よう制御する９第２命令１ノシスタ７０７１−の命令は
第２す１−タ゛７０８によ−〕てデコー１；され、べ′
ノドルーツ°「１セツサ１０２の起動又はチェック情報
が牛１＋Ｑさ狛　パス７５”４（Ｔｌマ〕ノド信号どオ
ーダ又１」子−９信６を識別するため２重線で記述さ才
［ている）４介ｌ、てベタトルーｆｒ＋セッサ１０２に
必要＊、、！Ｗ示がろ゛さオ；ろ。二の１￥、テコ１−
夕７０５によ一、　−ｒ　Ｆｒｒ　７　ｎ　ｑが”　ｌ
　”に七ノドされ、二の↑前傾かパス７５５を介し、て
セレクタ７０３に送られる。一方、パス７５０を介し、て送られて来たＢ７エ、ｒズ
信号は第２フエ・ｒスジエネレータ７１０に入力さお、
Ｂフェイズをさらに分割したＣ、Ｄフエ、ｒスを生成す
る。Ｃ，Ｄ両フェイスは同時には生成さおない。ここで
１すＣフェイズをスタック７０１」二の命令デコードに
使用し、Ｄフェイスを第２命令レジスタ７０７上のＥ　
Ｘ　Ｖ　Ｐ又は５ＴＶＰ命令のデコート処理に用いるも
のとする。今仮に第２命令レジスタ７０７にＥ　Ｘ　Ｖ
　Ｐ命令が（８納されているとする。このときすでにＦ
Ｆ７０９け１′ビ′にセットさ汎て才ｊす、該ＦＦ７０
９の出力はパス７５５を介し、てセレクタ７０３を動作
せしめ、パス７５２上のＣフェイズ信号を選択し１てい
る。一方、ベタ１〜ルプロセノサ＋０２に送られた起動又は
チェック情報は該ベクトルプロセッサ１０２に作用し５
．そのレスポンスがパス７５６を介してＦＦ７１］に送
ら第１る。ここではＦＦ７］］のｆ直がＬＬ　Ｄ　ＩＩ
の場合、ベクトルプロセッサ指示が成功した二どを示し
，、１″″の場合失敗したてどを示すビする。ＦＦ７　
１　１の値は比較回路７１；）にトー１τ′″０″どＪ
上申ンさ才する。ーで′７′ー次の仮定を画く。セレ９タフ１４け初期：
Ｉｋｆｌ？τ目パス７５３を；巽ｔＲするどする，パス
７５゛３にけＩ）７エ、、（ス信号か送Ｉ）、土ｌて来
るものとする。！ヒｄｉ’／回ｖＩＩ７　１　２　力出力はパス７５７
を介し、で。ｎ　”１　ｘ　４ズ信号とＡ　Ｎ　Ｄ回部７１３で論理
積がどらｔ＋．ＦＦ７１１の内容が０′″となー〕だ場
合でか−）ｎ７７、（　スのとき、即ちＥ　Ｘ　Ｖ　Ｐ
　又けＳ　Ｔ　Ｖ　Ｐ命令が実１：？さ４劃たタイミニ
ノブでＦＦ７０９を″０′″にｊｌ＋・ノトオる一Ｆ　
Ｆ　７　０　９が１１　０　１１にリセッ１へさ、ｌま
た場合、セ１ツクタフ０３けパス７５１を再び選↑Ｈし
、、初期状態のようにＢ７エ、ｒズで命令のデコー１り
が行わオ（る−　ニオ１に反し、ＡＮＤ回路７１３τ論
理積をどった結果，ＦＦ７０９か１１セツトさｉ’Ｌ　
ｌ’ｉ”、い４０合　　セ１〕））り７０３けパス７５
２を選択［−、セ１ノテノタ７１４（土パス７５コをＩ
Ｉ択する。即ち，Ｃ′７エイスでは命令スタック７　０
　］　−１−の命令移行か行わす」、１１１反のＤフェ
イズでは第２命令レジスク７　０７．１−（７］ＥＸＶ
Ｐ又けＳ　Ｔ　Ｖ　Ｐ命令処理か行わ才する。Ｃフェイズ時、命令スタック７０１の命令けＥＸＶＰ又
けＳ　Ｔ　Ｖ　Ｐ命令を含まない。こわけベー′ノＦ・
ルプ「１セツサ１０２の起動を行った後に再び起動を行
ったり、直後にベクトルプロセッサ１０２の状態をチェ
ックする二とけ論理的に意味がないからである．もしＥ
　Ｘ　Ｖ　Ｐによってベクトルプロセνす１０２の起動
を行い，数タイミング後に該ベクトルプロセッサ１０２
の状態を調べるような場合けＳ　Ｔ　Ｖ　Ｐ命令の直前
にＴ３　０　Ｕ　Ｎ　ｒ１命令を必要とする。この条件
の下に以下の説明が成立する二とになるーＣフェイズ時、デコーダ７０５け命令Ｉノジスタ７　０
　４　１１の命令を解読し、スイッチング回路７０６に
よって、パス７５８と７５９を接続する。従−〕で、Ｃ
フェイズでは命令は命令１ノジスタ７０４から第３命令
１ノジスタ７１５に移される。第３命令レジスタ７１５
ｊ−の命令は上記の条件よりＥＸＶＰとＳＴ　Ｖ　Ｐ命
令は含まれない。第３１ノジスタ７　１　５）１の命令
はデコーダ７１６によって解析さ−　：１Ｇ− ｉｌ．ｐ，ｎＴＴＮｎ命令の時に限りＦＦ７　１　７が
′１″′じヤ・ソトさ：ｌする。ＦＦ７　１　７が′１
″′にセラ１へされる長二，パス７６０を介してセレク
タ７１４１オパス７５１”ｌど７５７を接続する。同時
にスタック管理回路７０２に命令の移行押上を指示する
。この処Ｔ！Ｈ　ニより．Ｔ’ｌｎＴＩＴｒｌ命令が出
現し，た後，Ｃ，　Ｄ７、ｘ（スけ／１１１滅し，、Ｂ
７エイズで第２命令レジスタ７０７ヒのＥＸＶＰ．ＳＴ
ＶＰ命令の実行が行わトろ。またＦＦ７　１　７かパ１
”となった時点又はスタック管理回路７０２で命令スタ
ック７０１に新たへ・命令をスタックできなくなった時
点の画集（！Ｉの論理和がＯＲ回路７１８でとＩ″Ｊわ
５命令送出抑＋ｌ−１’Ｎ示が一１ント・ロールプＬ１
セッ廿１０１の前段テ゛ー１ード回ＶｆＩ（第６図）に
パス６５８を介して送１゛＋　ｉｔろ，ＦＦ７１７のリ
セット−はベクトルプロセッサ１０２の１ノスポンプが
ＩＩ　ｎ　１１となるまで行われ内゛い、、該７＋トソ
プフ「１ツブ７１７のリセットはＡＮ　Ｄ回路７］３の
出力にＹリパス７６１を通して行わ４Ｉる。一Ｉｆ、第；３命令１ノシスタ７１５Ｊ−の命令がＢ０
ＴＪ　Ｎ　Ｄ命令以外の場合、ＦＦ７１７は゛′ビ′に
セットされず、当該Ｉノシスタ７１５１−の命令けＣフ
ェイズタイミングでパス７６２上に送り出される。パス７６２は汎用計算機と同様の命令デコード処理を行
うコントロールプロセッサ回路に接続されている。次に第８図乃至第１０図を用いて，ページングプロセッ
サによる拡張記憶→主記憶データ転送動作についてη（
ａ明する。ページングプロセッサ１０７においては、局
所メモリ空間の部分集合の生成と再配置をローカルメモ
リとマーク１ノジスタを用いて行う。ローカルメモリと
は拡張記憶装置上のデータをロードする際、データを一
時的に保持するメモリのことであし）、二のローカルメ
モリー１−でセグメントのアクセスが局所的に連続にな
るようｔ二要素のならべ替えを行う作業領域の：とをい
う。マークレジスタとはレジスタのビットが拡張記憶装置か
らロードしたデータの論理的な区切（たとえば語境界）
に対応し・、該区分がセグメントによってアクセス対象
となっている場合にマークレジスタの対応ビットが１″
′であり、アクセス対象でろ・い場合に”　ｏ　”とな
るデータ（以下このデータをマークという）を保持する
レジスタをいう。第８図はページングプロセッサ１０７の拡張記憶装置へ
の１１ク工スト生成部と読出されたデータをバッファリ
ングし，、マーク１ノジスタによる局所空間の再配置を
行う論理回路の概略ブロック図である。第８図の論理回路が起動されると，パス８５０を介し７
てＦＦ８００に拡張記憶装置からロードすべきデータの
個数がセットさ１する。同時にパス８５１を介してカウ
ンタ８０１．８０２がリセットされ、カウンタ８０２に
は拡張記憶装置からロードする局所メモ＋１空間の先頭
番地がセットされる。カウンタ８０２け該先頭番地とＦＦ８０３上の増分値を
毎クロツク加算し、、１ノジスタ８２３，パス８５３を
介し，て拡張記憶装置８０４ヘアドレスとして送る。第
８図には図面の簡約化のため、拡張記憶′ＪＡＷ８０４
へ送る起動及び動作指示信号パスとそれに関する論理は
省かれている。 −３９＝カウンタ８０１はカウンタ８０２と同期し，て動作１，
、増分値けＦＦ８２１によりＩＩ　Ｉ　ＩＩにとら第１
ている。該カウンタ８０１の出力はレジスタ８２２を介
し７比較回路８０５によって毎夕，ｒユング後比較され
、ＦＦ８００のロードすべきデータ数との一致がとられ
ると、カウンタ８０１．８０２のカウントアツプ抑止が
レジスタ８２４、パス８５２を介し、で行われる，この
抑止はパス８５１からリセット指示が来るまで解除され
ない。拡張記憶装置８０４にパス８５３を介してリクエストと
アドレスが送られた後，拡張記憶装置８０４から数十タ
イミング後アドバンス信号がパス８５４を介し・て送ら
れ、続いて読出されたデータがパス８５５を介り，で送
られる。ここでは説明を簡約ｆヒし，ページングプロセ
ッサの動作の基本的部分を明確にするため次の仮定を置
くことにする。１℃　　拡張記憶装置のタイミングはページングプロセ
ッサの４倍の周期である。（２）　　拡張記憶装置へのＩＪグエストに対するレス
ポンスデータ１１１けページングプロセッサの処理テ一
タ巾の４倍である。（′１１　　第８図は拡張記憶装着のタイミングで動作
し。でいるが、点線で囲まれた部分はページングプロセッサ
のタイミングで動作する。論理回路８１０はページングプロセッサのタイミングで
作動するサイクルカウンタで，値を０。１、２，３，Ｏ・・・・・・のように周期的にとり，こ
の値をパス８　５　６」二に送る。セレクタ８１１けパ
スＲ５６　１−の信号ｌ二よって制御され、拡張記憶装
置８０４からＤ−ドされて＋？ＦＢ　＋　２〜８１５上
に保持されているデータを逐次ｉ！！！ｔＲ　Ｌ−、結
果をパス８５７」二に送る。一方マークテータはパス８６０を介して送られ、ＦＦ８
　］　６にいったんラッチされた後、スタック８１７に
送られる。スタック８１７では拡張記憶”Ｊｉ　置８　
０　４からのデータアドバンスが来るまでマークデータ
が保持される。第８図では図面の簡慴化のため，スタッ
ク８１７け２段しか示されていない。パス８５１にアド
バンスか来た後、スタック８１７中のクークデータけＦ
Ｆ８　１　１’ｌに移されるａＦＦ８１８上のマークデ
ータも又ＦＦｌ’ｌ　Ｉ　２へ・８１５１のデータと同
様セレクタ８１９に十−］τ升４　’７　ＩＩ　ソり　
ｌ：ｉ！ｌｔｌ’ｉ！が行わわ、パス８６１１ｍ送られ
る。このマークデータは論理回路１’１２０の入力とな
り、マークか１″′ならば′パス８５７−ｔ：の信号を
加工せずにパス８６２−）：に送る。もしマークかｔｒ
　（Ｉ　ＩＩならけ入力テータに関係なくパス８６２上
のデータを”　０”とするここまでの処理によって、局所メモｉ１空間の部分集合
の再配置に関する準備が行われた二どになる。第９図は第８図のパス８６２上のデータを、主記憶装置
に書込むため、ｍとびのアクセスを部分的に連続とする
ように一部のデータの順序を入替る処理を行い、かつ主
記憶装置の書込先のアト１ノスを生成する論理回路の概
略ブロック図である。パス８６２上のデータは一度ＦＦ９００にラッチされる
。次のタイミングでＦＦ９００のデータがパ０″″であ
るか否かのチェックが比較回路９０１でなされる。ＦＦ
９００上のデータが１７０　ＩＩでない時　パフ、！１
　”、ｓ　０．１−にバ１１ノド信号か送り出され。子−９が”　ｎ　”ろ・らげパス！１８０−ににバリッ
ドがｊス出七１１ｚ、　　論理回路９０２け、ＲＡＭ書
込ポインタ！Ｉ　ｎ　’ｌ、ＲＡＩＶ（読出ボ・ｒシタ
９０４、主記憶書ｉ３ポｒンタ９０５の作動と停車を制
御する管理回路τある８仮に管理回路１〕０２のパス９
５２　ｈの信号を・１ト込ポインタ個１作可どする。（
パス９５１（１パス！１５２，９５３，９５４の東線で
ある。）二の時、パス９５０トのバリフ１；信壮どパス
９５′：！１・の許可信号けＡ　Ｎ　Ｄ回路９０６で論
理積がとｔ′Ｉ才ｊ　井込ポインタ９０３のカラン１ヘ
アンプバリリドと九゛ろ　該カウントアツプバリッドか
生成さ１１ｔ−とき、書込ポ（ンタｎ０３は、ＦＦ９１
７の更新値によるカウントアツプを行い、ＲＡＭ９１０
゜９１１の書込ア１〜ｌノスを牛Ｔ友する。ＲＡＭ９１
０　。ｑｌｌ（士［１−カルメモリであり、次のように用い１
゛、狛ろ。・１−１　　け[
7, a large-scale program existing in the expanded storage device is executed on a plurality of main storage devices each having a smaller capacity than the expanded storage device. [Embodiment of the Invention] Hereinafter, an embodiment of the present invention will be described in detail with reference to the drawings. FIG. 1 shows the basics 1i1' of the vector processing device according to the present invention.
f IR, -1 control processor 101, vector processor 102, paging processor 107, and storage devices 10/I, 105, 109 . Expanded storage device 10
8. The switching circuits 103 and 106 are configured. A program that specifies vector processing is written using object code with the following structure. When a set of vector and scan instructions that make up a program specifies processing operations for data in a memory space that is different from other sets, the second set of instructions is called a segment. The data space pointed to by a set of instructions is called the local memory space. Many segments and local memory spaces defined in this manner can be found in scientific computing programs. For example, in the o-loop as follows, D○ 100 1
=1. N +00 A (T) = B (T) + C (T) r') O The sequence of vector instructions that constitute the loop is a segment, and the arrays A, B, C and r) 0 control variables T, N are The memory space it occupies is local memory space. Thus, programs can be defined by segments and local memory spaces. According to this definition, sufficient processing is necessary for program execution. Regarding the program PL in Figure 2, a-C segments are shown, T-a is an instruction to create a local memory space in the main memory where the segment will be borrowed, and STa is an instruction to create the memory space in expanded storage. Indicates instructions to be stored on the device.The same applies to segments a and b.A program consists of segments that specify data processing, local areas,
Although a program can consist of a memory space and instructions that manage that space, the program itself is also defined by segments and local memory space. This will be explained with respect to PIl in FIG. Here, in STb, the data processing is STh
Assume that it can be divided into before and after. This assumption is logically constructed based on the assumption that the program executes each component segment in order, so it is valid. STb
If the program division at the point is line -], then ① an instruction to rope ink the instruction group from the a instruction to the STb instruction onto the main memory, an instruction to store the instruction group in the extended memory 'JA location, and the progera 11 By defining instructions for instructing the execution of 1), the programmer 11 can be reduced to three types of operation definitions: loading, execution, and steps 1 to 7 of the program itself. This reduction is shown in program PI of FIG. The condensed program shown in 2 is a relatively small memo +
It can be held in one space until the execution of the program is completed. For scientific and technical calculations. A program cannot process itself, but if it performs more complex data processing in the future, it is possible that the program will transform itself. In such a case, the desired processing can be performed by placing an instruction for transforming the program itself next to an instruction for loading the program into the main memory. P2 in FIG. 2 shows the structure of a program that transforms this program itself. In P2, the α portion is a segment that deforms the procla 12α portion (this is composed of segments a and b). JTi, tb, as shown in Figure 2, the part that loads, edits, and stores the program into the main memory [Pro/7 RAM Kernel] , the portion that is subject to quotation by the Progera 11 kernel is called ``Program Bote, I''. In the vector processing device of FIG. 1, the program shown in FIG. 2F) is processed as follows. The controller 1 is initially placed on the expanded storage device 108. When the vector processing unit is started, the controller 1 is
The program processor 101 starts the paging processor 107 and loads the program kernel portion of P7 in FIG. 2 into the main storage device 109. After the T1 roll processor lot receives the rope ink completion report for the processor set 107,
Execution starts from the first instruction of the 01-digit program kernel section. The first instruction of the program kernel part is Progera 1
Since this is a command to load 1 body 7fα to the main memory, the second command loads the program [1η to 11] from the expanded storage 108 to the main memory 109.Next, the control processor 101 The program body loaded into the main memory 109 is modified according to the instruction from the section α and the next segment EX and E C indicate the instruction processing position of the control processor 101. ) is rewritten and control is transferred to the body of the progera 13. This operation is similar to an unconditional branch in a general-purpose computer. Since it is an instruction L a to perform a herotation, the control processor 101
Page processor 107. An instruction is given to the switching circuit 106 to load the local memory space of segment a on the extended storage device +08 to the main storage device 104 or 105. In this second example, it is assumed that the local memory space is allocated to the main storage device 104. Control processor 101 then executes the instructions in segment a under the control of cpsw. Assume that a VEG1HELL instruction appears in segment a. At this time, to Con 1, "1-Le Processor 101
The vector processor I n 2 and the switching circuit 103 are activated and instructed, and instructions are given to connect paths so that vector processing can be performed in the main storage device 104 on pairs 1-4. The vector command is read from the main memory 109 by the processor 101 and sent to the vector processor 102, where it has already been loaded into the main memory 10'I'. Vector processing is performed on the local memory space of ; While the vector circle is being executed in the vector loop 102, the -1 control processor 101 has no command to wait for the completion of the beta 1 hell processing immediately after the vector 1 hell instruction.
Executes the next instruction. Assume that the next command is a rope and ink instruction for the local memory space of segments l-h following segments 1-a to the main memory. At this time, 1 setter 101 switching times 7fj I n 6
, instructs the paging processor 107 to allocate the local memory space of segment b of the extended storage device 1081- to the main storage device 105. Gono r+ -=-1: The operation can be performed at the same time as the vector processing raccoon. In order to effectively use this function of simultaneous execution of row 1 operations to the main memory in the local memory space and the parallel processing, it is necessary to optimize the program body of the object code. The program 11 body part on the extended storage device initially has internal/structural information as shown in Figure 3a.In the program of Figure 3a, each segment executes its segment i. It has an instruction to allocate the local memory space necessary for the main memory to the main memory, and an instruction to store the result of the processing specified by the segment to the expanded storage. Since the raw processing unit instructs the sequential processing of loading the local memory space to the main memory, processing the data, and storing the processing results to the extended storage, performance cannot be expected to improve.The control processor 101 and the paging processor 102 In order to enable parallel operations between the two, the instruction string is divided into the following four types: ■ Instructions for data processing in the control processor (hereinafter referred to as C instructions) instruction (hereinafter referred to as P instruction) (j) 1 of the instructions (hereinafter referred to as CW instruction) to check the status of the paging processor in the control processor. It is loaded into the main memory 109 in FIG. It is determined whether the instruction is held in the control processor or sent to the paging processor.If the P instruction is held in the control processor, for example, if the paging processor is busy and the P instruction is If it is not accepted, if you wait for the control processor or page processor to finish, it will cause a serial state in the control processor's instruction decoding process. Even if the P instruction cannot be executed because the paging processor is busy, if the next instruction after the P instruction is logically executable, the next instruction is executed and the busy state of the paging processor is released. It is necessary to immediately execute the P instruction in the execution pending state.In order to perform this type of instruction overtaking, the instruction decoding process is divided into two systems of C9P instructions.However, the instruction execution of the control processor Since this cannot be executed conceptually like the code in Figure 3a, the code processing of both the C and P systems is time-sliced 1, and the commands are controlled to be executed alternately in a time-sharing manner. . With this control,
After executing the C instruction, the P instruction is started at a certain timing, and when the paging processor is released from the busy state, it is possible to immediately send paging information to the paging processor. Only L・
, this time-division decoding processing does not include any synchronization processing in the processing of instructions of both systems. For this reason, for paging as shown in FIG. 3A, a case may occur where the operation of a program that specifies serial processing is different from the one12=1'PI1-fJ-. For example, the third
In Figure a, when the P instruction ■, a is executed, if the time-sharing tick 1-1- processing is performed, the data processing of Se Gumen [-a starts even when the La instruction is not processed. It's been done and it's gone. In order to eliminate such inconvenience, CW. PW series instructions are required. Add this series of instructions [
7. New object code generated as shown in Figure 3 (1) on the right. The object code in Figure 3 assumes the following processing. (1) The paging processor has means for distinguishing between the local memory characters and the main memory to which they are to be loaded. -) When loading of the local space is completed, information indicating to which main storage device the data transfer has been completed is reported to the control processor. ) The main memory whose local memory space is ``l-'' is commanded as 1.2 and used according to the order of 2. 5 The ν column is an instruction for the 11 control processor. However, the instruction for the paging processor is also
Leverage processing is performed by the control processor. The object code having the structure shown in Figure 3 is executed as follows. First, the L a instruction in the μ column is decoded, and the paging processor is instructed to start transferring the local memory space required for segment a. At the next timing, TJn of column ν
The Ljll instruction is decoded, but since this instruction waits for the completion of data transfer to the main memory (1),
The next segment a in column ν is not executed. After a certain timing, the r-b instruction in the μ column is decoded. If the paging processor is busy at this time, this L b instruction will not be executed and will be held on the instruction stack. At the next timing, control shifts again to the ν-column command. At this time, the TJntill instruction is executed again, and the control processor waits for the paging processor to finish. When the paging processor completes the transfer of local memory space for segment a, the instruction on the instruction stack, ie Lll, is immediately executed when the control processor's control is over the instruction in the μ column. -1 n1
~ When the control of the roll processor is in the command column 1, TJltjll is executed, and the 2(h part is immediately completed L7
From the next timing, secnont a will be executed. In this state, at a certain time, the instruction in the μ column will be executed, and at the same time, the instruction in the ν column will be executed. When the instruction in column 71 reaches the W instruction (see Figure 3 i), μ
The decoding process of the instructions in the column is stopped until the Rel instruction is detected in the instruction column of the ν column.1. do. After the data processing of 1 cell or 1 cell is completed 1-2, the order in which the target local memory space is transferred to the extended storage device is determined by the W command of the worker. can be guaranteed. −
jr, ν column instructions are TJnti, ] after the Rel instruction.
2 instructions (I), the transfer of the local memory space to the main memory (2) has been completed, and if so, execution immediately moves to the next segment b. In Figure 3, IT is in column ν.
A series of n white 1 series commands or E is issued, but this causes S in the p column.
Since the instruction to transfer the local memory 11 space in the same main memory like the Ta, f-, and c instructions is not performed, the completion report from the paging processor = 15- corresponds to the fact that two instructions are sent at the same time. ing. FIG. 4 shows the busy status of the roll processor and paging processor in the object code 1 having the structure shown in FIG. 3 when it is executed by the object/raw processing device of the present invention. In Fig. 4, the dotted line represents 1 cell, way i, and time. As is clear from Fig. The control method between the control processor and the paging processor shown in L7 above also holds true between the vector processor activated by the control processor and the control processor. However, the vector processor has data processing characteristics that make it difficult to control the processor weight.The structure of the object code when this vector processor is controlled is described below. It is performed only within one segment 1~, and is not performed across multiple segments.If you think about it in the simplest way, it will have a structure like that shown in Figure 5a.
In FIG. 1A, the instructions in the ν1 column represent the vector instructions IJJ, , +l. Instructions other than the X little instruction are hereinafter referred to as scalar instructions. Control 1-role "This is an instruction to control the processor that is generated from the 1st processor, and a minimum 1st-order instruction sequence is required. Setup instruction to prepare for t2 > 7 reset command (called J゛〕, lower EX input 7p). -J, lower TVP). According to the three types of instruction system in 2-], =+n1-role ``One second instruction de:1-de changes the timing logic-and controls the vector processor. becomes possible.
Then, by the instruction shown in FIG.
Considering the operation of the processor, as is clear, the control of the processor is
■When the P instruction is reached, at that point the data processing of the =1-[1-le processor ends], TVPJ',
, the following scalar processing a-instruction sequence cannot be executed. This is the invalid time caused by adding one sensor to the control processor. is further subdivided into the following according to the type of data processing. Processing (hereinafter referred to as a7)
Processing) (Processing that must be performed after AI vector processing (hereinafter referred to as a, processing) r7j) Command to start checking the completion of solid 1 to 3 processing (hereinafter referred to as 5TVP) Condolences) a, processing, a4 processing Command indicating the boundary of (hereinafter referred to as Bn
rTNr1) When segment-a is reconfigured by the instructions shown in the above classification, Figure 5 is produced. The control of controllers 1 to 2 and the vector processor is as follows:
:fll series of instructions = 1 n i [] - le processor τ
Executed 1171. The EXVP command is = 1 nt ``1-proceno square τ child-1-de 41. Vector processing - move, Vegt I +, / life sequence is activated 1 The vector 1 is read out from the main memory 109 (see Figure 1) via the
Day 1-1. s i+execute]. a
. The series of instructions are asynchronously leveraged and executed by the control processor set r1.
By the T■P command, the phase of :I Trolp [1 senomasu child:"-Dosarkurke 2-] is 11 times before division.
In the first step, the data processing in the vector processor is completed, and the check is made as to whether or not the second σi ”
7. When the two series of commands are executed, the two are defined, and occupy the time between each other. If the cheater processing for 1 set is completed, the first phase is turned off. In the processing of the a7 instruction, if the BnuNn instruction is detected, the second phase disappears. By introducing the two-way de:1-toface, it is possible to check the completion of vector processing and at the same time execute part of the processing of the scalar instruction. -Loop [“Invalid time” caused by adding a vector processor to one setter]
can be minimized. 1! L I:, Although we have discussed minimizing the ineffective time for terminating solid 1-hell processing, exactly the same argument can be made for vector processor activation. Next, consider the location of the program. As mentioned above, the locations of the progera 11 and data are defined by the real addresses of the extended storage device l-. Naturally, the data processing procedure of the program is also described by the location. On the other hand, when loading the local memory Il space to the main memory and performing two data processes on this loaded new space, if the previous location makes it impossible to perform two data operations,
Set the local memory space to any address in main memory 1-+ -I
The control processor sends information indicating whether
20 = local/memo 11 space rl −1 command to deni 1−
``1 nt ``J-rupu''
1 set (Huff processing, vector processing of vector processor 1-F-1) Performs the memory access at 1 nos relocation that occurs. The 2nd located at 1 nos indicates the real address of the main memory. Therefore, 1 -torped processor, Vector Palp "7 de Inosuri""One -one shyon mechanism" It is possible to execute a program in the local memory space 4: The above discussion is based on paging program r: 1 session's local memo 11
It is only concerned with space management, and not with the structure of the local memory space itself.Vector processing involves performing uniform data processing on a specific data group, so scalar processing As in [order of processing]
Regarding the above data group, it is not necessarily necessary to guarantee the elements in the data group. Considering the following 110 loops, nO1001=1. ]OOO A (I) 2B (■) 10C (■) 100 C0NT To I TJ EΔ To 91~R elements in the order I+ 2+ 3+...R] Need to be calculated, 1 +' 3+ 2+ 4 + 5
+... may be the order. m = DO small loop described above In the same segment, there is a Q loop as follows, which is traced. nn 2nOT=1,999,2 n (T') = l"3 (T)*C(T')2 o
n CoNTT NTJE The latter DO small loop is different from the former loop. Access to C Beta 1 Hell is not continuous but discontinuous. Veg1Hell processing is a process that allows the pipeline arithmetic unit to perform take processing without interruption to improve processing speed.
It is more efficient to place operand takes at consecutive addresses than at non-consecutive addresses. If the B solid 1~le element is located as I 12131, the second [I cation is suitable only for the former solid 1~le processing of D○, but the former is suitable for the latter D○ processing. It's not quite suitable. but,
I,3,2.4,
5.7.6+ 8+a) yo-), the latter I]
○Kuru principle (Even in 2, it is continuous 7' nose: 1, take processing efficiency is 1 -. In no processing, the order of takes is guaranteed in vector] le processing. However, since it is white without access, the data processing efficiency will be lowered and the data processing efficiency will be lowered. If lIH exists, divide the array into m*n subsets by using m*n's fI1 plus 1]. Even if you perform an operation like ';jlt I#lnear',) you can separate the subsets that are pointers and separate them.
·stomach. The second separation operation L; The access to the old zowa within 5 minutes in 1 part is 1:17. fI: When loading the local memory space 1'Z in the extended storage device to the main memory (-1i IFi), it is recommended to rearrange the subset of the array in In the second step, we optimize the tiVt structure of the local memory space, and the plateau f of vector processing.
You should be able to do 2k. As f'8', the processing distribution is performed in 1 seso vectorp and 1 seso sweet, and the high speed of the vector processing device can be further improved. Optimizing by set relocation, loading the first subset that is not accessed into the main memory L, does not mean that the segment is not accessed.Loading the subset that will not be accessed into the main memory,
If not, the main memory 8 size will be smaller, but when writing the local memory space to the expanded storage device, the portion of the main memory that has not been accessed will be loaded prior to the write operation. It is necessary to read from the expanded storage device, perform the jψ operation for redistributing one subset, and write the result to the expanded storage device. Access to the expanded storage device is different from access to the main storage device, and is fast only when one type of access is continuous, so successive writes to the expanded storage device occur alternately as shown above. , the take transfer speed will drop significantly. Therefore, when a segment accesses and loads a subset of the missing local memory space into the main memory and writes it to the local memory space a: lij', l'jf3, the main memory Tongue 1.4! l1f
i: Access and rearrange part 1 j game! Perform the operation 11) to transfer the data to the storage device 1. J″l, 1 Perform segment processing of 1 graph 11, limit the information + lI of the main storage device rt to the local memo 11 space, or -) create memory space by rearranging a subset of the local memo + 1 space. To optimize the segment processing of
The large memory space required for each program's execution is 6 U21. The outline of the preprocessing part of the instruction decoder of fj control [1-rou/'r+set] in the Vedator processing device of the invention is [1]! This is Figure 7. In FIG. 6, the command is sent from the main memory 109 (see FIG. 1) to the flip-flop knob (hereinafter referred to as FF) 600 via path 650.The command of FF6001 is sent to the command management circuit. 601, the pacing processor ``J processor'', the T1 nt ``1-le processor,'' and the 91- loop ``l processor 1'' are classified as commands for the pacing processor 7 through the FF 602 for timing adjustment. r soting circuit 603
The instructions for the pacing processor are sent to the instruction stack 604, the instructions for the ML control processor are sent to the stack 605, and Betta 1 help [The command for 1 set is entered in the status 606. 608 is the command),
This is a circuit that manages instruction transfer in Taraino 60/l)
, the circuit operates during a specific phase defined by the phase generator 1 notaro 07. The following two phase r minks are called A face, and the commands entered in the stack 304 are 1.
The instruction enters register 609. The instruction 1 register 609
The above instructions are child-coded by the decoder 610 and classified into one local memory space rotor or store instruction, a wait instruction, and its pool instruction. In the case of a rotary/S1-A type instruction, use the selector 12-digit path 51, in the case of a wait instruction, use the path 652, and in the case of other instructions, use the path 653.
Select. -Jj Paging processor 107 (see Figure 1) state (1 pass n 5 ・I l '+ff4 L
TE F F 61] 11: (Stupid EIF main 11τ ai) 6'''2 paging group ``1 set or 1' sea I
I 'p is F F t; 11 The force state is "B" and 1.
Even in the case where τ is smaller than τ, it is assumed to be −”n”. In addition, the stack management circuit ('in8 ni ke pass 6!' i 5 right) [,
τ Opposite transition extrusion information 1"!, +', 41romono-oro-"l -l:・S For 1 to 7 series commands, selector 6
12 (''puff, [) 51 [355 connection right, so
Page solip "1st...) 107 is busy tlj
The l111, o-do or stgo' instruction remains in the instruction 1 register 609F. Pacisogep ``I'm busy at 107. I'm busy now, pub 651 and 6.''
The instruction transfer permission is given to the stack management circuit 608 via the path 65 (11), and the local memory space σ) read or store instruction is transferred to the paging processor +07 via the path 65 (i'j;r'Ii1). :g. Pagen at 1')) The command operation for the 1st setter 101 is related to the RE1 (release) command of the 1st setter 101, so it is The processing will be explained below.
After entering, the stack management circuit 614 manages instruction migration. The stack management circuit 614 ke phase shene 1 notaro 07 operates when A fue (generates 7 AIDS other than Z) [-1 (J!J, this phase below is called B face). Receive the type of instruction transfer extrusion instruction through paths 657 and 658. When there is no instruction, the stack 605 is transferred every time the instruction B phase is generated, and it enters the instruction register 613. On the instruction register 613 The instruction is decoded by the decoder 615, and in cases other than the R, E, and L instructions, '1' is set in the FF616.
This indicates that the control processor 101 is currently executing a segment of the program. TE:】-T6
15 is RET, when the command is decoded, FF615 ke II
n Reset to D. Main storage device f'ff in Figure 1
1ll') 4,105 consider the following state. (1) The local memory space in the main memory is [1-dogo11゜11soll-rl-loop [1 set and vector prostheno or l', %, river hit (I knee:) tW': 'E memory (=!21F' paging to f-1)
The paging processor 107 is in the state of the storage device 1i'! - The report is sent to the control processor 101. In Fig. 6, the path 660°6 (through 51 τ sol +) l main memory!X
Ji 1st page 10/l, 105 condition report is covered J't
, FF 620, 621 to 1 [Terasa Saisuruichi FF6. l! If the pagen'nozu Royale 107 is in the generate state, the 1st shift of pagen'nozu Royale 107 is in the generate state, 1"1 is in the 1 note state.If n 14. When the i + instruction is decoded, an instruction is given to the selector 622 to adjust the state of the main memory device 104 or 105, which is the main memory device 104 or 105, and the state of the FFI'i2Q cross 621 is changed to pass. 6
The data is sent to the selector 23 via 62v and 663. Se1 Noritaro 2, i'r-keTTntjl type command, pass [;6411111, 1dr+Iil type 1゛J
, 1itR is applied to the path 665 side when the instruction is outside. If the instruction LTntill is decoded by the decoder 615 and "]'" is located on the main storage device 104 side, one concatenation of paths 662.66/l and 657 is performed, and the FF62
A state of 0 is sent to the stack management circuit 6]4. At this time, if the content of the FF 620 is "1", the instruction transfer process of the stack management circuit 614 is pushed out 1 (, instruction register 6
13''-, the TJnt and jll instructions are retained. This instruction retention process is performed until the paging processor 107 becomes ready. Paging processor 107
When the is in the ready state, instruction transfer is performed, and pass 66
The instruction is sent to the next-stage instruction decoding section of the control processor 101 via 7. For songs other than those in the Until series, the selector 623 selects the path 665 side, so no command transfer instruction is issued to the stack management circuit 614.
Instructions are sent out on path 667, t: regardless of the state of paging processor 107. Returning again to the explanation of the instruction code processing by the paging processor 107. If the command is 09 to 11 engineering r1, the command is east, 7
-1-ta [;According to the instructions of 10, set 1))) Taro 12-digit pass ri! ”i:! Δ selection IR 2nd) Therefore, 11-1
To [1-Le Brosset...) Order for Sweet 101 RE■
, issued L@, limited to 1 set of 10
In the state of waiting for 7 commands, the command is issued at the end of segment processing, and the processing within one local memory space is 1r1-2! -f, 1 space is included in the extended storage device 108.
t)) It is guaranteed that the Into relationship is 11. The Beto 1 Tor command passes through the Su (Sochin 1) circuit 603. Enter stag 60G, manage the second stack Vf
I M 63 n'7: $+ Z+, via the stack management circuit 3' fl 63 Q path 670)
7 1-le instruction read request signal from 102 (see 11th section 1)) 1-le instruction read request signal (7)
) to 47' (Jl/Order sent to Hepuff 9711-5 each pukso') Management circuit r; Q8, 614.6' 30 Enter command or start +7. Of course, it requests the instruction management circuits 601 to suppress instruction sending 11- through the OPUS 8n, 681, and 682. The instruction management circuit 601 checks the relationship between the instruction in the FF BDO and the sending extrusion instruction, and determines that the stack requesting continuous instruction processing and the instruction type in the FF 60 Q match. In this case, a disable signal is sent via the path 683 to interrupt the instruction successive processing of the main memory @109.]
] to the instruction reading circuit of the 1-roll processor 10. FIG. 7 is a general block diagram of the post-processing portion of the instruction differential 1-der of the controller 1 roll processor. Instructions for control processor 101 pass 66
7 to the stack 701. The stack 701 of : is managed by a stack management circuit 702 to transfer instructions within the stack. The instruction transfer operation is the sixth
This is performed by the B phase output signal of the face generator 607 shown in the figure. It is assumed that the phase signal is sent to the circuit block of FIG. 7 via path 750. Cell
In the initial state, the Tsukutafu 03 selects the path 751 and inputs it. Therefore, the stack 1- is shifted in synchronization with the instruction number B of the stack 70]:J-, and is stored in the shifter 704 of the instruction 1. Instruction 1 = 1.' shift 97 CI, 40 commands are a lever
Tough 05 decoded JI, Vector-1 "1 setter activation instruction (FXVPi) Vector processor status check start instruction (STVP), ON, t11τ execution τ instruction as well as t11τ execution τ instruction IF. Commands that indicate the field (11nT■Nn) are classified as: F X N, 7 r' or S T V r is decoded 11
In the case of It is decoded by the second stage 708 and
The start or check information of the 1 setter 102 is written with double lines to identify the 1 + Q path 75 4 (Tl MA) throat signal order or 1 child - 9 signal 6. ) 4-year-old, required for Betatoru fr + Sessa 102*,,! W can't be shown. 2 no 1 yen, lever 1-
At 705 pm, -r Frr 7 n q” l
” and sent to the selector 703 via the second forward tilt path 755. On the other hand, the B7 and r signals sent via the path 750 are sent to the second forward tilt generator 710. Enter the
The B phase is further divided into C, D phase, and r phase. Both faces C and D are not generated at the same time. Here, the first C phase is used to decode the second instruction on the stack 701, and the D phase is used to decode the second instruction on the second instruction register 707.
It is assumed that this is used for decoding processing of XVP or 5TVP instructions. Now, if the second instruction register 707 is
Suppose that the P command has been stored (8).At this time, F
The FF70 is generally set to F709.
The output of signal 9 passes through path 755 and operates selector 703, which selects the C phase signal on path 752. On the other hand, the activation or check information sent to Beta 1 to Luprosenosa+02 acts on the vector processor 102.
．． The response is sent to the FF 71 via a path 756. Here, the f direction of FF7]] is LL D II
A value of 1'' indicates that the vector processor instruction was successful, and a value of 1'' indicates that the vector processor instruction was unsuccessful. FF7
1 The value of 1 is sent to the comparator circuit 71;
I'm sorry for the inconvenience. - and '7' - draw the following hypothesis. Sele 9 tough 14 ke initial stage:
Ikfl? Assume that the τ-th path 753 is sent by Tatsumi tR, and the path 75'3 is I) 7E, , (S signal is sent I), and SatI is sent. ! Hdi'/timesvII7 1 2 Force output is pass 757
Through and with. If the logical product of the 1 x 4 signal and the AND circuit 713 is t+.The content of FF 711 is 0''', then -) n77, (When S, that is, E
The S T V P command is actually 1:? Set FF709 to ``0'' with the tied mini knob that you turned.
If F 7 0 9 goes to reset 1 to 11 0 11, select the setter 03 path 751 again ↑H, and decode the instruction 1 at B7 and rs as in the initial state. Contrary to 1, as a result of calculating the AND circuit 713τ, FF 709 or 11 is set i'L
l'i", 40 go, 1))
Select 2 [-, Se1notenota 714 (Sat pass 75 pieces I
I choose. That is, in C'7 Ace, the instruction stack is 7 0
] In the D phase of 111, execute the command transition of -1-. 07.1-(7) EXV
P-straight S T V Executes P command processing. During the C phase, the instructions in the instruction stack 701 do not include the EXVP or STVP instructions. Kowakebe'no F.
This is because there is no logical meaning in starting up the vector processor 102 after starting up the vector processor 102 or checking the state of the vector processor 102 immediately after starting up the setter 102.
The vector processor ν102 is activated by XVP, and after several timings, the vector processor 102 is activated.
If you want to check the status of , you need the T3 0 U N r1 instruction immediately before the S T V P instruction. Under this condition, the following explanation holds: During the C phase, the decoder 705 and the instruction I register 70
4 decodes the 11 command and connects paths 758 and 759 by switching circuit 706. Follow-], C
In the phase, the instruction is transferred from the instruction 1 nozzle 704 to the third instruction 1 nozzle 715. Third instruction register 715
J- instructions do not include EXVP and STVP instructions due to the above conditions. The instruction of the 31st register 715)1 is analyzed by the decoder 716. Only when p, nTTNn instruction, FF7 1 7 is '1'''.FF7 1 7 is '1''
The selector 7141, the opus 751'', and the like 757 are connected via the path 760, which connects the selector 7141 and the selector 757 to the cell 1 through the path 760. At the same time, the stack management circuit 702 is instructed to move the instruction. This place T! From H d. After the T'lnTITrl instruction appears, C, D7,
EXVP.7 in the second instruction register 707. ST
The VP command is executed. Also FF7 1 7 or Pa 1
” or when the stack management circuit 702 can no longer stack a new instruction on the instruction stack 701, the OR circuit 718 outputs the logical sum of ! 'N is 11th rolep L1
In the first stage VfI (FIG. 6) of the first stage of the set 101, the reset of the FF 717 is carried out through the path 658 until the 1 node pump of the vector processor 102 reaches II n 11. Yes, the reset of the 7+tosopf 1 block 717 is performed through the Y repass 761 to the output of the AND circuit 7]3.
In cases other than the TJ ND instruction, the FF 717 is not set to 'BI', and the instruction of the relevant I/O sister 7151- is sent onto the path 762 at the C phase timing. A path 762 is connected to a control processor circuit that performs instruction decoding processing similar to that of a general-purpose computer. Next, using FIGS. 8 to 10, we will explain the data transfer operation from expanded storage to main memory by the paging processor η(
a. In the paging processor 107, a subset of the local memory space is generated and rearranged using the local memory and the mark 1 register. Local memory is a memory that temporarily holds data when loading data on an extended storage device.The local memory 1- is a memory that temporarily holds data when loading data on an extended storage device. The : of the work area where the sorting is to be performed. A mark register is a register bit that marks a logical delimitation (for example, a word boundary) of data loaded from extended storage.
Correspondingly, the corresponding bit in the mark register is 1'' if the segment is being accessed by a segment.
', and is a register that holds data that becomes "o" when it is not an access target (hereinafter, this data is referred to as a mark). FIG. 8 is a schematic block diagram of the paging processor 107's logic circuit for buffering read data and relocating the local space using the mark 1 register. When the logic circuit of FIG.
The number of data to be loaded from the extended storage device is set to 1 in the FF 800. At the same time, counters 801 and 802 are reset via path 851, and the starting address of the local memory+1 space to be loaded from the extended storage device is set in counter 802. The starting address of the counter 802 and the increment value on the FF 803 are added every clock, and the result is sent to the extended storage device 804 as an address via the 1 register 823 and the path 853. In order to simplify the drawing, FIG.
The activation and operation instruction signal paths to and associated logic have been omitted. −39= Counter 801 is synchronized with counter 802 and operates 1,
, the first value is taken from II II II by the incremental value FF821.
ing. The output of the counter 801 is compared every evening by the 7-comparison circuit 805 via the register 822, and when it matches the number of data to be loaded into the FF 800, the count up of the counters 801 and 802 is inhibited by the register 824, This inhibition, which is performed via path 852, is not released until a reset instruction is received from path 851. After the request and address are sent to the expanded storage device 804 via the path 853, an advance signal is sent from the expanded storage device 804 to the path 854 after several tens of timings, and then the read data is sent to the expanded storage device 804 via the path 855. Sent via intermediary. Here, the explanation will be simplified and the following assumptions will be made to clarify the basic part of the operation of the paging processor. 1°C Extended storage timing is four times the period of the paging processor. (2) The response data for the IJ request to the extended storage device is 111 times the processing data width of the paging processor. ('11 In Figure 8, the circuit operates at the timing of expansion memory installation. However, the part surrounded by dotted lines operates at the timing of the paging processor. The logic circuit 810 is a cycle counter that operates at the timing of the paging processor. The value is periodically taken as 0, 1, 2, 3, O, etc., and this value is sent to the path 856''2.It is controlled by the signal l2 of the selector 811 and the path R561-. , the data stored in +?FB+2 to 815 that has been D-coded from the extended storage device 804 is sequentially sent to the path 857''2.Meanwhile, the mark data is sent to the path 860. Sent via FF8
] 6 and then sent to the stack 817. In the stack 817, the extended memory "Ji"
The mark data is held until the data advance from 04 is received. In FIG. 8, only two stacks of stacks 817 are shown to simplify the drawing. After advance comes to path 851, cook data table F in stack 817
The mark data on aFF 818 transferred to F8 1 1'l is also transferred to FFl'l I 2. Similarly to the data of 8151, the mark data on aFF 818 is transferred to selector 819.
l:i! ltl'i! is performed, and path 8611m is sent. This mark data becomes an input to the logic circuit 1'120, and if the mark is 1'', the signal on the path 857-t: is sent to the path 862-): without processing.
(I II By setting the data on the path 862 to "0" regardless of the input data so far, preparations for relocation of the subset of the local memory i1 space have been completed. 9th In order to write the data on path 862 in FIG. 8 to the main memory, the figure performs processing to rearrange the order of some data so that m-separate accesses are partially continuous, and This is a schematic block diagram of a logic circuit that generates At1NOS as the writing destination of the device. The data on path 862 is once latched into FF900. At the next timing, it is determined whether the data in FF900 is PA0''''. A check is made in the comparison circuit 901.FF
When the data on 900 is not 170 II, puff! 1
``, s 0.1- is sent a bar 11 node signal. Child-9 is `` n '' low pass! !I n 'l, RAIV (readout port 904, main memory i3 porter 905 management circuit τ that controls the operation and stop 8 temporary management circuit 1) 02 path 9
The signal of 52 h can be used for one pointer including one pointer. (
Pass 951 (1 pass! It is the east line of 152,953,954.) At the second time, pass 950 to Balifu 1; Shinsodo pass 95':! The enable signal of 1 is generated in the A N D circuit 906, and the logical product is t'I. is FF91
Count up based on the updated value of 7 and store it in RAM910.
゜911's writing a1~lnos is a cow T friend. RAM91
0. qll (shi[1-cal memory, used as follows: 1
゛、Komaro.・1-1

【゛めＲ／１．１９］ｎか書込対象どな
る。（２）　　当該ＲＡＭに書込めなくなると、管理回路９
０２けＲＡへ４９１１を書込に使用するようにスイツチ
ング回路９１２に作用し、データパスの接続を行う。ｔｌ）Ｊ−記■によって書込先のスイッチングが行われ
た時、すでにデータの書込まれたＲＡＭ９］０け読出対
象のＲＡ、　Ｍとなり５続出が可能になるようなテ゛−
タパスの接続がスイッチング回路９１２で行われる。 ■　書込対象のＲ，ＡＭ９１１が書込めなくなると、項
番Ｑ）と同様のＲＡＭの切替が起る。１）〜■のくりかえし・に於いて、第８図のＦＦ８１８
のマークデータに゛０″が存在すると、書込ポインタ９
０３の値の更新は、読出ポインタ９０４の値の更新より
も遅くなるが、書込読出ＲＡＭのバンクが異っているの
で両ポインタ間の比較は不要で、ある。ＲＡＭ書込時、
書込バリッドが生成されるまで、書込対象データはＦＦ
９１３に保持される。　　　・ＲＡＭ９１０，９１１の読出は次のように行われる。管
理回路９０２によってパス９５２を介して読出ポインタ
９０４が起動されると、ページン＝４４− グー／’ｒｌセソ什のタイミング毎に該ポインタはＦＦ
９１８の値で更新される。このポインタ９０４の出力値
は直接ＲＡＭ’９１０．９１１を読出すためには用いｒ
゛＋　ｉｔず、ＲＡＭ９　］　Ｏ，！１１１を読出すア
ト１ノスを生成するための表が格納されているＲＡ　Ｍ
９１４を引用するためのアト１ノスとし、て用いられる
。二二で１才ＲＡＭ９１４をアクセフ、するための各信
号の１ノベル変換回路は全て省略されている、二のＲＡ
Ｍ９１４の弓Ｉ用に、上るＲ、ＡＭ９１０ヌ目９１］目
間１］イントによって、］、２，３゜４．５・・・　順
のように第８図の拡張記憶装置８０４力臼）、読出され
たデータを、１．３’、２．／ｌ、５・・・・のように
順序を一部変更し、て請出すことができる。この操作に
よａ）拡張記憶装置−１−の実メモリ空間の構造と局所
メモリ空間の構造を一致させないで、局所メモリ空間を
サグメン１〜対応に最適なＦＴ市に変換する二とが可能
になる。ＲＡＭ９１４１−の変換テーブルの値は第矩図
の論理回路が起動さオ鵞ろ前にパス゛９５５を通し、で
イニシャライズしてオンく必要がある。このようにして
読出されたデ−タは２書込先の主記憶装置を管理し、て
いる論理回路９１５によって、スイッチング回路９１６
に指示が句、えられ、適切なパス９６０又け９６１が選
択される。ここでパス９６０，９６１は主記憶装置１０
４，１０５　（第１図参照）へのデータバスである。Ｒ
ＡＭ続出ポインタ９０４で値がＲＡＭ９１４のメモリ範
囲を越えた場合、その報告がパス９６８を介して管理回
路９０２に行われる。主記憶装置のアドレスは次のようにして生成される。主
記憶書込ポインタ９０５は、第９図の論理回路が起動さ
れた時、主記憶書込開始アドレスデータ９６Ｇを介して
、イニシャルセットされる。ＲＡＭ続出ポインタ９０４が管理回路９０２によって起
動されるタイミンクと同期して、主記憶書込ポインタ９
０５はＦＦ９　＋　９の値に従ってカウントアツプされ
る。生成された書込アドレスは。レジスタ９２２．パス９６７を介し、てスイッチング回
路９１６に送られ、それぞれ対応する主記憶装置１０４
あるいは１０５ヘパス９６２．９６３を介して送られる
。ポインタ９０４と９０５け管理回路！″ｌＯ２を介り
、で同期し、て作動し、ている。第１０図はセグメントがアクセスし、ｆ８−いメモリ空
間を主記憶装置１０・１．１０５に格納するための論理
に関オろ概呻ブロック図である。第８図のパス８５７１−のデータは第１０図のＦＦＩｎ
ｎ１１にラッチされる。第１０図にオンいて、論理回路
１００２は’Ｒ，Ａ　Ｎ、ｙ書込ポインタ＋００３゜Ｒ
ＡＭ読読出ポインツタ１００４主記憶書込ポインタ１０
０５の起動停止を制御する管理回路である。初期状態では各ポインタは全てカウントアツプ可能ろ・
状態どする。ＦＦ１００Ｏにデータがセラ１〜さ才１ろ
時、書込ポイ〕ツタカウントアツプのためのバリッド信
号かパス９８０４こに送出されている。該ｚ＜リッド信号とパスＩ　Ｏ５２１＝のカウントアツ
プ可信吐けＡＮＤ回路１００８で論理積がとらｈ　、書
込カウンタ１００３に入力される。同時に管理回路１０
０２けＲＡＭ　Ｉ　ＯＩ　Ｏ又は１０１１に対１５てデ
ータ書込可能なようにスイッチング回路１００Ｇに作用
し、データパス１０５５と書込対象のＲＡ　Ｍどの接続
を行う。ＦＦｌ０ｆ’）１は」ニー・１７− 記の処理のためのデータの保持に置かれている。２つのＲＡＭｌ０ＩＯ，Ｉｆ）１１の使用法は前述のセ
グメントアクセスの行われる局所メモリ空間の場合と同
様である。ＲＡＭ読出ポインタ１００４の起動は管理回路１００２
からパス】０５１を介して行われ、すでに書込まれてい
るＲＡＭの続出アドレスを生成する。該続出アドレスは
レジスタ１０５６を経てスイッチング回路１００６を介
し、てＲＡＭ１．０ＩＯあるいは＋０１１をアへセスす
る。ＲＡＭｌ０ＩＯあるいは１０１１から読出されたデ
ータはスイッチ：ノブ回路１００７によって、データを
書込むべき主記憶装置＋０４あるいは１０５ヘパス１０
６０又は１０６１を介して送られる。データを書込む主
記憶アドレスは書込ポインタ１００５によって生成され
る。該ポインタは管理回路１００２によって起動される
が、カウントアツプを開始する以前にパス１０５７を介
し・て、データを格納すべき主記憶装置上の領域の先頭
番地をセットしておく必要がある。主記憶装置書込ポイ
ンタ１００５けＲＡ　Ｍ　ｉｆ？、出ポ、インタ１００
４ど同期し、て作動[゛MeR/1.19] What happens to n or the writing target? (2) When it becomes impossible to write to the RAM, the management circuit 9
It acts on the switching circuit 912 to use 4911 for writing to 02 RA, and connects the data path. tl) When the writing destination is switched by writing J-■, RAM 9 to which data has already been written is set to 0, RA becomes M, and 5 successive outputs become possible.
Tapas connections are made in switching circuit 912. (2) When it becomes impossible to write to R and AM911 to be written to, the same RAM switching as in item number Q) occurs. 1) In repeating ~■, FF818 in Figure 8
If "0" exists in the mark data, the write pointer 9
Updating the value of 03 is slower than updating the value of read pointer 904, but since the banks of the write/read RAM are different, there is no need to compare the two pointers. When writing to RAM,
The data to be written is FF until write valid is generated.
913. - Reading from RAMs 910 and 911 is performed as follows. When the read pointer 904 is activated by the management circuit 902 via the path 952, the pointer is set to FF at every timing of paging=44-g/'rl seso.
It is updated with a value of 918. The output value of this pointer 904 is used to directly read RAM'910.911.
゛+ itzu, RAM9] O,! RAM in which the table for generating At1nos for reading 111 is stored
It is used as an atnos to cite 914. The 1 novel conversion circuit for each signal to access the 1 year old RAM 914 in 22 is all omitted, the 2 RA
For the bow I of M914, go up R, AM910 nu 91] 1] int, ], 2, 3° 4.5... As in the order of expansion storage device 804 in Fig. 8), The read data is 1.3', 2. /l, 5, etc. You can partially change the order and request it. This operation makes it possible to a) convert the local memory space into the optimal FT city for sagmen 1 to 2 without making the structure of the real memory space of extended storage device-1 match the structure of the local memory space; Become. The values in the conversion table in RAM 9141- must be initialized and turned on through path 955 before the logic circuit shown in the rectangular diagram is activated. The data read in this way is processed by a switching circuit 916 by a logic circuit 915 that manages the main storage device to which the data is written.
An instruction is then given to select the appropriate path 960 and 961. Here, paths 960 and 961 are the main storage device 10.
4,105 (see Figure 1). R
If the value in the AM successive pointer 904 exceeds the memory range of the RAM 914, a report thereof is sent to the management circuit 902 via a path 968. The main memory address is generated as follows. Main memory write pointer 905 is initialized via main memory write start address data 96G when the logic circuit of FIG. 9 is activated. In synchronization with the timing when the RAM successive pointer 904 is activated by the management circuit 902, the main memory write pointer 9
05 is counted up according to the value of FF9+9. The generated write address is. Register 922. are sent to the switching circuit 916 via the path 967, and are sent to the corresponding main memory device 104.
Or sent via 962.963 to 105. Pointer 904 and 905 management circuit! Figure 10 shows the logic for the segments to access and store the memory space in main memory 10.1.105. This is a general block diagram. The data of path 8571- in FIG.
It is latched to n11. 10, the logic circuit 1002 is 'R, A N, y write pointer +003°R
AM read read pointer 1004 main memory write pointer 10
This is a management circuit that controls starting and stopping of 05. In the initial state, all pointers can be counted up.
What's the condition? When data is stored in the FF 1000 from cell number 1 to cell number 1, a valid signal is sent to the path 9804 for counting up the writing point. The AND circuit 1008 performs a logical product of the z<lid signal and the path IO521=, and inputs the result to the write counter 1003. At the same time, the management circuit 10
It acts on the switching circuit 100G so that data can be written to the 02 RAM I OI O or 1011, and connects the data path 1055 to the RAM to be written. FF10f')1 is placed to hold data for the processing described in Section 17-1. The usage of the two RAMs 10IO, If) 11 is the same as in the case of the local memory space where segment access is performed as described above. The RAM read pointer 1004 is activated by the management circuit 1002.
051 to generate a subsequent address for the RAM that has already been written. The subsequent address passes through the register 1056 and the switching circuit 1006 to access RAM 1.0IO or +011. The data read from RAM10IO or 1011 is transferred to the main memory +04 or 105 by a switch/knob circuit 1007.
60 or 1061. The main memory address at which data is written is generated by the write pointer 1005. The pointer is started by the management circuit 1002, but before starting counting up, it is necessary to set the starting address of the area on the main memory where data is to be stored via the path 1057. Main memory write pointer 1005 RAM if? , Depo, Inter 100
4. Sync and operate

【−・ている、生成された主記憶書
込アト１ノスはレジスタＩ　ｎ　＋　７．パス１０５８
を通ってス・（ツチング回路１００７に送ｌ二、オｌ、
十記憶装置１０／ｌ、１０５の書込読１１−を管理し、
でいる管理回路９１５（第９図）に上って、書込先の主
記憶装置が選択され、パス１０６２又け１０６３上にア
ドレスデータどり、で送り出さ才する。次にページジングプロセッサの主起憶装り上の局所ス干
ＩＩ空間の拡張記憶装置へのストア動作に−）いて、第
１１図、第１２図を用いてｉ＋ｈ明する。第１１図は局所メモリ空間内のセグメン１へで参照さ才
ｊろ部分集合内のデータのならべ替えを行う部分の［略
ブ［ｌツク図である。第１１図において、主記憶装置１
０４あるいけ＋０５から読出されたデータは、該データ
の有効性を示すバリッド信号とともにそｉ＋ぞわパス１
１５２．１１５０又は１１５３．１１５１を経由し、て
スイッチング回路１１０２又け１１０１の入力となる。 −ｊｆ、ページングブロセノサが現在アクセスを行って
いる主記憶装置の番号を管理する論理回路１１００があ
る。該管理回路１１００の内容はパス１１５４を介してスイ
ッチング回路１１０１．１１０２に作用し５、ページン
グプロセッサのアクセスし７ている側の主記憶装置との
データバスの結合を行う。この結合によって５バリッド
信号はパス］］５５．Ｊ−に、データはパス１１５９ト
に送出される。論理回ｖｆＳ］１０４．１１０５はそｔ
ぞｔｌ、　ＲＡ　Ｍ書込ポインタ、読出ポインタである
。こわらのポインタは管理回路１１０６によってパス１
１５６．１１５７を介し・で制御される。初期状態では
ポインタ１１０４はカウントアツプ可能状態即ち、パス
＋１５７−１の信号値が″ビ′とする。：のときパスｌ
　ｌ　５５４二のバリッド信号け、へＮｎ回路１１０３
で論理積がと＃）れ、ＲＡＭ書込ポインタ１１０４をＦ
ＦＩＩＩＧの値に従ってカウント・アップする、この動
作によ１１、主記憶装置からのデータの到着と同期し７
てＲ’Ａ　Ｍ書込アトＩノスが生成される。第１１図において、１１１０と１１１１はＲＡＭを示す
。この２つの’ＲＡＭの使用法は第９図におけるＲＡＭ
ｎ　１０と９１１と同様であって、管理回ＦＩＲ＋　１
０６によるスイッチング回路１１１２゜１１１３へのパ
ス結合指示によって、Ｒ，ＡＭ＋１１０．１１１１のど
ちらか一方が書込に用いられ。他方か読出に用いられる。該当ＲＡＭ　への書込が終了
すると、管理回路１１０６け読出ポインタ１１０５を起
動する。該ポインタ＋１０５はＦＦ１１１７の値に従っ
て毎クロックカウントアツプされるか、このポインタの
出力はレジスタ１１２０を介り、、ＲＡＭ］］１４を引
用するためのアドレスと６・ろ。ＲＡＭ　＋、　］　］
　／ｌを引用し、た結果は、読出対象どなっているＲ　
Ａ　Ｆ１４即ち１１１０又は１１１１のア１くレスとな
る。二の間接的なアドレス生成により、局所空間内での
データのならべ替えをＲＡＭの記憶容量の範囲で行うこ
とができる。このようにし、てデータのならべ替えを行
った結果はパス１１６（ｍｌ−に送出さ才ｌろ。一方読
出ポインタ１１０５の内容のカウントアツプ毎にバリッ
ド信号がパス＋１６１１−に送出される。該読出バリッ
ドけＦＦ］Ｉ５で必要な遅延処理が行われた後パス１１
６２１上に送出される。以上の処理は、局所、メモリ空間内のセグメント・がア
クセスを行う部分集合について成立する。セグメントの
アクセスが行われない部分集合については、読出ポイン
タ１１０５の値が直接ＲＡＭ１１１０．１１１１の読出
アドレスとなるような論理回路によって主記憶装置の読
出が行われる。この部分はＲＡＭｌ１１／１部分を除い
て第１１図と同じ、なので省略されている。このようにして、主記憶装置から請出したデータとバリ
ッドは、次の論理回路においてセグメントアクセスを行
う部分集合と行わない集合との併合処理が行われる。こ
の時、２つの部分集合を主記憶装置から読出す際、同し
ピッチで読出すと、セグメントアクセスがどちらか一方
の部分集合に片よっている場合、遅いピッチの読出デー
タが次段の論理回路内でぶつかりを生ずる。このため主
記憶装置からの局所メモリ空間の続出を次段の論理回路
の正常動作のために制御する必要がある。このために設けられているのがパス１１７０である。二
のパスＩ＋７０．．Ｉ−に有意な指示即ち主記憶装置か
らの続出を一時中断するような要求が来た場合、二の指
示は主記憶装置の読出論理回路の動作を中断させるとと
もに、ポインタ管理回路１１０６に作用し１、パス１１
５８．］　１５７．１１５６を介１７て書込、続出ポイ
ンタの動作を中断させる。第１２図は第１１図によって生成された読出バ１１ツド
とデータ信号を貯え、ページングプロセッサと拡張記憶
装置とのタイミングの相異の一致をとるとどもに、マー
ク１ノジスタを使用してセグメ：ノｌ−７クセスの有無
によって局所メモリ空間の部分集合の再配置を行った操
作の逆操作を行う論理回路の慨酩ブロック図である。第１２図によ１いて、パス］　１６２上のバリッド信呼
はスタック管理回路１２００に入力される。該スタック管理回路１２００は、データスタック１２０
２内のデータ移行をバリッド信号の到着によって管理し
、でいる。ここではスタック管理回路１２００とスタッ
ク１２０２を局所メモリ空間の部分集合のうち、セグメ
ン１へアクセスか行われる部分の読出に関するものとす
る。同様にスタック管理回路１２０１とスタック１２０
３をセグメントアクセスの行われない部分の読出に関す
るものとする。第１２図において、マークデータはパス１２５０を介し
てスタック１２０４にいったんラッチされ、後にマーク
レジスタ１２０５に格納される。サイクリックカウンタ１２０６はこのマークレジスタ１
２０５を順次引用し、１２０２又は１２０３のどちらの
スタックからのデータを引用すべきかを決定する。この
引用情報はパス１２５１を経由して、スタック管理回路
１２００，１２０１、スイッチング回路１２０７に伝え
られる。スタック１２０２．１２０３に入るデータのピ
ッチと。その引用ピッチは同期していないので、スタックが一杯
になることが起りうる。この時、管理回路＋２００．１
．２０１はパス１１７０又は１２５２を介してデータを
送出している論理回路に送出の中断を指示する。スイッチング回路１２０７によって選択されたデータは
スタック１２０９に格納される。該スタック１２０９内
のデータの移行を管理する回路が第２スタック管理回路
１２０８である。この管理回路１２０８は、スイッチン
タ゛回路１２０７から送出されるパス＋　２５３、−．
１−のバリッド信号によってスタツイノ＋　２０９−ヒ
のデータの移行を管理するとＩｔ、に、バリッドが４個
分受信し、で、スタック１２０９が・杯にろ・ろと、ノ
、タック内の全データを同時に１ノジスタ１２１０へ移
行させる。このバリッドを４す−）管理するのけ、ベー
シングプロセッサのタイミングが拡張記憶装置のそれの
１／４同期であることから来ている。 −・方スイッチン９回路１２０７からパス１２５３１−
に出力さ才ｌているバリン１へ信号は、サイクル力１ノ
ンタ１２１１によって１／４のバリッドに減しＩＥ、れ
パス＋　２５４、．１１に送出されろ。このパスＩ２５
　／１．１−のバリン１−けレジスタ１２１０にデータ
がセソ１〜されろ夕・ｒミンクに同期し２ている。それ
故、拡張記憶装置に１ノジスタ１２１０の出力をそのま
まデータとして送出することができる。次に前述のページングプロセッサの説明で用いられてい
る局所メモリ空間における部分集合再配置とセグメント
アクセスの有無の関係、論理回路にデータとして与えら
れるマークデータ、データの順序入替のためのＲＡＭ　
（たとえば第９図のＲＡＭ９１４）の初期情報ｌ二つい
て説明する。これらの情報は本発明のベクトル処理装置
上で動作させられるソフトウェアに関するものであるが
、第８図乃至第１２図の論理回路動作に密接に関連して
いるのでここで説明する。プログラムはセグメント化された直後、局所メモリ空間
は第１３図ａのような構成になっている。第１３図ａにおいて、各部分集合Ａ、Ｂ、Ｃはそれぞれ
連続アクセスを行うステーメントのデータ、２とびの非
連続アクセスを行うステートメントのデータ、連続と２
とびの両方のアクセスを行うステーメントの部分集合と
する。この局所メモリ空間を最適に構成するために次の
ような操作を行う。 ■　部分集合は拡張記憶装置からそのまま主記憶装置へ
子′−タ転送を行う。ＱＩ　　量分焦合Ｂは、セグメントでデータ処理を行う
ために必要な部分とそうでない部分とが交互に存在［て
いる。従って、１，０，１．Ｏ・・・・又けＯ，Ｉ、ｎ
、］　　・・・・のよう内・パターンのマークデータを
用意すオ＋ｌｆ、Ｒｅ台をセグメントアクセス、の行わ
れるＢ１集合と、７クセスの行わｔ’Ｌ　Ｐａ：いＢ７
集合とに分離する二とができる。項番中どの相異は、このマークデータのみで他のデータ
たとえば拡張記憶装置又は主記憶装置上の読出又は書込
の先頭番地の管理については同様の処理でよい。この時
、Ｂ、集合はＢ、沖合の直後に置く。この部分集合の再
配置によって、２どびの非連続アクセスが連続アクセス
に変る。てｔｌけブ［１グラム自身の修正を意味する。このような場合も、先に説明したように、プロツノラム
自身を修正し７てベクトル処理を行う：ことが１本発明
のベクトル処理装置では可能である。ｔｇ＋　　部分集合Ｃはベタ１〜ル処理の特徴である、
演算順序を入替えても結果が変わらないことを利用して
、２の剰余かＯであるデータと１であろ子−夕を対にＩ
−てｆＪ′らへろような主記憶構成どする。二の上うな
集合をＣ′で表示する。例を、＋７】げろど　Ｉ、３，
２，４．５，７，６，８゜のような番地づけの主記憶装
置１−の部分集合かぞＪｌに１１する。この、ような番
地−５げにより２どびの非連続７′ノセスは常に非連続
なのではなく部分的に連続とね・る。ての構成の主記憶
装置に対し、ては非連続７リセスのプログラム部を項番
ｌ四重と同Ｋ（ｎ正する必要がある。ただし連続ア′ノ
セスめプ「１ゲ→ム部分については修正の必要か７’、
、い。１・記のクロき番地っけを行った局所メモリ空間を生成
するには５たどえげ第９図のＲＡ　Ｍ　９１１にｌ、３
，２，４．・・・のようなデータ（アドレス）をセット
−すればよい。以トのようにし、て再構成された局所メモリ空間の概要
を第１３図すに示す。マークデー勺お上びデータ順序入
替データは、それらをセノ１へするようろ・命令をコン
１へロールプロセッサに設置−１乙、Ｌつな１段で実現
する二とかできろ。次に−ｌント「１−ルブロセソサ士ｊよびベタ１−ル−
ｊ１１セツサの主記憶装置に対するり［１ケーシヨ゛）
に−）いｒ説明オろ９局所゛メモリ空間の再構成ｔＮ行
ｈ　ｆ＋’ｌけ４１；↑　リ［１ケーシヨンは各セグメ
ンＩ・（ｒｎ、　１１１甲（、”ｆに　・笛的に定−冷
さ才１、各プロセッサか’：＋　（ｈｌｊ　ｐ）ニスド
ア１・１ノフ、に定数を加算すれは、目的どする１ｆ記
憶装百１−のアト１ノスを求めるこ１・かできる−局所
′メモリ空間の再構成を行った！４合て１１．第１３図
１）の上うに部分集合の境界４変え／Ｉ・い範囲てあｌ
′Ｌ、　ｔｌ、局所メモリ空間の再１清１ノ（をｔｒわ
ろ・い場合ど同様ち・リロケーション処理々を斤うてと
かできる。ｒ発明の効ｑシ〕本発明にｋｔｔば、拡張記憶装置の容量ど同１ご寸−タ
のメモ１１空間を必要とする大規模なプロ′ノ→ノ、の
べ′ノ１〜ル処理が可能となる。拡張記憶゛（に開目）
τ記憶ヤｉ［ご上りも遅いサイクルタ（ムのメモ１ｊ素
了てｔｌＷ　１＋９可能であ−って、大容鼠の上記憶装
間を・＼コ、７トルクル理装置に実装するよりも経済的
に有利になる。拡張記憶装置のサイクルタイムの遅いメ
モリ素子採用によるデータ転送速度の低下は、ページン
グプロセッサの処理に上り拡張記憶装置へのアクセスを
連続アクセスとする二と１士ンよびデータ転送１１を大
きくとりインタリーブ数を多くとることにより、該デー
タ転送速度の低下を補うことができる。複数の主記憶装置を具備することによる経済的不利は、
現在の科学技術計算の実務から予測する限り無視できる
。すなわち、２次元又は３次元測微分方程式を解く場合
、有限要素法などで離散化する際、１０４オーダの節点
をとる二とが多い。このときデータｌＪを４〜８Ｂとす
ると、作業配列を使用し７たとし、ても、１６０Ｋｒ３
オーダのメモリで１つのＤｏ処理を行うことができる。１セグメントを数個のＤＯ処理とし。各配列間で共通に使用する配列がない場合を仮定しても
１つの主記憶装置は２ＭＢ以下で十分である。こトに対
し１、拡張記憶装置は数百ＭＢのオーダであるから、容
量的には主記憶装置け拡張記憶装置の１０−２のオーダ
である。従って複数の主記憶装置をベクトル処理装置に
実装する経済的損失は無視し、うる。本発明のベタ１ヘル処理装置では、コン１−ロールプ［
１セソ什、ベタ１−ルプロセッ什、ページン９′プロセ
ッサを並行し、て動作させるための種々の命令か定着で
き、かつ、その命令の制御をハードウェアで行うので、
３種部のプロセッサをＭｉ列作動させるためのロスタイ
ムは罹小化される。[−・The generated main memory write at 1 nos is the register I n + 7. pass 1058
through the switching circuit 1007.
(10) Manages writing/reading 11- of storage device 10/l, 105;
The main storage device to be written to is selected by the management circuit 915 (FIG. 9), and the address data is sent out on the path 1062 and 1063. Next, the operation of storing the local memory space on the main memory of the paging processor into the expanded storage device will be explained using FIGS. 11 and 12. FIG. 11 is an approximate block diagram of a part that performs sorting of data in a subset of the local memory space when referenced to segment 1. In FIG. 11, main storage device 1
The data read from 04 or Ike+05 is sent to Soi+Zowa path 1 along with a valid signal indicating the validity of the data.
It passes through 152.1150 or 1153.1151 and becomes an input to the switching circuit 1102 and 1101. -jf, there is a logic circuit 1100 that manages the number of the main memory device currently being accessed by the paging brothenosa. The contents of the management circuit 1100 act on switching circuits 1101 and 1102 via a path 1154, thereby establishing a data bus connection with the main memory device being accessed by the paging processor. Through this combination, 5 valid signals pass]]55. J-, the data is sent out on path 1159. Logical time vfS] 104.1105 is
These are the RAM write pointer and read pointer. The Kowara pointer is set to pass 1 by the management circuit 1106.
156.1157. In the initial state, the pointer 1104 is in a count-up enabled state, that is, when the signal value of the path +157-1 is ``bit''.
l 554 2nd valid signal to Nn circuit 1103
The AND is done with #), and the RAM write pointer 1104 is set to F.
This operation counts up according to the value of FIIIG11 and synchronizes with the arrival of data from main memory7.
The R'AM write at Inos is generated. In FIG. 11, 1110 and 1111 indicate RAMs. The usage of these two 'RAMs is shown in Figure 9.
Same as n 10 and 911, management times FIR + 1
According to the path coupling instruction to the switching circuits 1112 and 1113 by 06, either R or AM+110 or 1111 is used for writing. The other is used for reading. When writing to the corresponding RAM is completed, the management circuit 1106 activates the read pointer 1105. The pointer +105 is counted up every clock according to the value of the FF 1117, or the output of this pointer is passed through the register 1120, and is the address for quoting the RAM]]14. RAM +, ] ]
/l and the result is the read target R
A F14, ie address 1110 or 1111. By the second indirect address generation, data can be rearranged within the local space within the storage capacity of the RAM. In this way, the result of rearranging the data is sent to the path 116 (ml-). On the other hand, every time the contents of the read pointer 1105 count up, a valid signal is sent to the path +1611-. Valid FF] Pass 11 after necessary delay processing is performed in I5
621. The above processing holds true for a subset that is accessed locally by a segment in the memory space. For a subset whose segments are not accessed, reading from the main memory is performed by a logic circuit such that the value of the read pointer 1105 directly becomes the read address of the RAMs 1110 and 1111. This part is the same as in FIG. 11 except for the RAM11/1 part, so it is omitted. In this way, the data and valid retrieved from the main memory are merged into a subset to which segment access is performed and a set to which segment access is not performed in the next logic circuit. At this time, when reading two subsets from the main memory, if they are read at the same pitch, if segment access is biased to one of the subsets, the read data with the slower pitch will be read out to the next stage logic circuit. This causes conflict within the body. For this reason, it is necessary to control successive access of local memory space from the main memory for normal operation of the next stage logic circuit. A path 1170 is provided for this purpose. Second pass I+70. ．． When a significant instruction, that is, a request to temporarily suspend successive reading from the main memory, comes to I-, the second instruction interrupts the operation of the read logic circuit of the main memory and acts on the pointer management circuit 1106. 1. Pass 11
58. ] 157. Write via 1156 to interrupt the operation of the successive pointer. FIG. 12 stores the readout signal and data signal generated by FIG. 11, matches the timing differences between the paging processor and the expansion storage device, and uses the Mark 1 register to segment the data: FIG. 7 is a schematic block diagram of a logic circuit that performs the inverse operation of rearranging a subset of a local memory space depending on the presence or absence of a No. 1-7 access. According to FIG. 12, valid calls on path 162 are input to stack management circuit 1200. The stack management circuit 1200 controls the data stack 120
Data migration within 2 is managed by the arrival of a valid signal. Here, it is assumed that the stack management circuit 1200 and the stack 1202 are related to reading a portion of a subset of the local memory space in which segment 1 is accessed. Similarly, the stack management circuit 1201 and the stack 120
3 relates to reading of a portion where segment access is not performed. In FIG. 12, mark data is once latched in stack 1204 via path 1250 and later stored in mark register 1205. The cyclic counter 1206 uses this mark register 1.
205 are sequentially quoted, and it is determined whether data from the stack 1202 or 1203 should be quoted. This quotation information is transmitted to the stack management circuits 1200 and 1201 and the switching circuit 1207 via a path 1251. and the pitch of the data going into the stacks 1202, 1203. Since the quote pitches are not synchronized, it is possible for the stack to become full. At this time, the management circuit +200.1
．． 201 instructs the logic circuit that is sending data via path 1170 or 1252 to interrupt the sending. Data selected by switching circuit 1207 is stored in stack 1209. A circuit that manages data migration within the stack 1209 is a second stack management circuit 1208. This management circuit 1208 receives the paths +253, -.
When the data migration of Statuino + 209-hi is managed by the valid signal of 1-, four valid data are received, and the stack 1209 is filled with all the data in the tack. is transferred to the 1-no register 1210 at the same time. The reason for managing this validity comes from the fact that the timing of the basing processor is 1/4 synchronous with that of the expanded storage device. - Path from switch 9 circuit 1207 to path 12531-
The signal outputted to Barin 1 is reduced to 1/4 valid by the cycle force 1 nonta 1211, and is passed to IE, +254, . Be sent to 11. This path I25
/1.1- data is stored in the register 1210 of the register 1210 in synchronization with the signal 1210. Therefore, the output of the 1-no register 1210 can be sent as data to the extended storage device. Next, we will discuss the relationship between subset relocation and segment access in the local memory space used in the explanation of the paging processor described above, mark data given as data to logic circuits, and RAM for rearranging the order of data.
(For example, the RAM 914 in FIG. 9) will be described using two pieces of initial information. These pieces of information relate to the software operated on the vector processing device of the present invention, and will be explained here because they are closely related to the logic circuit operations shown in FIGS. 8 to 12. Immediately after the program is segmented, the local memory space has a configuration as shown in FIG. 13a. In FIG. 13a, each subset A, B, and C are the data of a statement that is accessed consecutively, the data of a statement that is accessed two times non-successively, and the data of a statement that is accessed consecutively and two
A subset of statements that performs both jump and jump accesses. The following operations are performed to optimally configure this local memory space. (2) The subset is transferred directly from the expanded storage to the main storage. In the QI quantity focus B, parts necessary for data processing in the segment and parts not necessary exist alternately. Therefore, 1,0,1. O...over O, I, n
,] ...Prepare the mark data of the inner pattern as shown in B1 set where segment access is performed on O + LF and Re units, and 7 accesses are performed t'L Pa:I B7
The set and the two that are separated into the two are created. Regarding the differences among the item numbers, the same process may be used to manage other data, such as the starting address for reading or writing on the extended storage device or the main storage device, with only this mark data. At this time, B, set, is placed immediately after B, offshore. This rearrangement of the subset turns two non-consecutive accesses into continuous accesses. tetl kebu [means modification of one gram itself. Even in such a case, the vector processing device of the present invention allows vector processing to be performed by modifying the protonorum itself, as described above. tg+ Subset C is a feature of Beta 1~R processing,
Taking advantage of the fact that the result does not change even if the order of operations is swapped, we can combine the data that is the remainder of 2 or O with the data that is 1 and set it as I.
-The main memory structure is as follows: fJ'. The second set is denoted by C'. For example, +7] Gerodo I, 3,
2, 4.5, 7, 6, 8 degrees, etc., to a subset of the main memory 1-Jl. Due to this address -5, two discontinuous 7' noses are not always discontinuous, but are partially continuous. For a main memory device with a configuration of Is it necessary to modify 7'?
,stomach. 1. To generate a local memory space with the black address shown in the table, add l, 3 to the RAM 911 in Figure 9.
, 2, 4. All you have to do is set the data (address) like... An overview of the local memory space reconstructed as described above is shown in FIG. For the mark data and data order change data, send them to Seno 1.Insert the instructions to the controller 1 and install it in the roll processor. Next, ``1-Rubrosessor and Beta 1-Ru-''
Access to main memory of j11 setter [1 case]
ni-) r explanation 9 local ゛ Reconstruction of memory space tN row h f+'l ket 41; -Coldness 1, each processor': + (hlj p) By adding a constant to 1.1 nof, we can find the at1 nos of the 1f storage device 101-1. The local 'memory space has been reorganized! 11. Figure 13 1) The boundary of the subset 4 has been changed/I.
'L, tl, if you want to re-create the local memory space, you can use the same relocation processing. It is possible to process large-scale professional notebooks and notebooks that require 11 memo spaces of the same size as the device capacity.Extended memory (opens the eyes)
It is possible to use 1 + 9 memory devices, which are also slow to use, and is more economical than implementing a 7-torque processing device. The reduction in data transfer speed due to the use of memory elements with slow cycle times in the expanded storage device increases the processing speed of the paging processor, which increases the processing speed of the expanded storage device and data transfer. By increasing 11 and increasing the number of interleaves, this decrease in data transfer speed can be compensated for.The economic disadvantage of having multiple main storage devices is as follows:
As expected from current scientific computing practices, this can be ignored. That is, when solving a two-dimensional or three-dimensional differential equation, when discretizing the equation using the finite element method, nodes of the order of 104 are often taken. At this time, if the data lJ is 4 to 8B, and the working array is 7, it will be 160Kr3.
One Do process can be performed in the order memory. One segment is treated as several DO processes. Even assuming that there is no array commonly used between arrays, one main storage device of 2 MB or less is sufficient. On the other hand, since an expanded storage device is on the order of several hundred MB, its capacity is on the order of 10-2 times that of the main storage device and the expanded storage device. Therefore, the economic loss of implementing multiple main memories in a vector processing device can be ignored. In the Beta 1 Health processing device of the present invention, the Control 1 Roll [
It is possible to fix various instructions for operating 1 processor, 1 processor, and 9' processor in parallel, and the instructions are controlled by hardware.
The loss time for operating three processors in Mi rows is minimized.

[Brief explanation of the drawing]

第１図は本発明に係るベクトル処理装置の概略ブ「１ツ
ク図、第２図はブロクラムの概略構成図。第３図はペッツトル処理プログラムの概略構成図、第４
図はベーシング、コントロール両プロセッサの小娘動作
図、第５図はベクトル処理プログラムの最適１１″、概
念図、第６図はコン１〜〇−ルプ「１セツ廿の第１段目
のテコ１−１回路のブロック図、第７図はコントロール
プロセッサの第２段目のテコ＝１・回路のブロック図、
第８図はページンクプロセノせのｒｌ−１・（１（能〕
ｒ１ツク図、第９図、第１０図はページ゛ツターｊ′ロ
セッ什の局所メモリ空間に対する編集機能ブ「１ノ））
図、第１１図はペーシングブ「１セノ什のストア動作の
第１段論理のブロック図、第）２図１１同フ、１〜７重
）】作の第２段論理のブ「１ノ９図、第１３図は局所メ
モリ空間概念図′ｒ−ある、Ｉ　ｎ　Ｉ−ＸＩント［１
−ルプ「Ｉセラ什、　　１０２ベグｊ・ルプ「１セツサ
、　　＋０３．１０６　　スｒノチング回路、　　　＋
０４，１０５　　主記憶装置。１０７　ベージソ′Ｊプロセッサ、　　１０８・１広張
記憶装置　　１０９　主記憶装置。゛・′１図２１）　　　　　　　Ｉ嘱４図一ノ第　　５久第１２図第　１３ｊ”１久　　　　　　　　　　ｔFIG. 1 is a schematic block diagram of the vector processing device according to the present invention, FIG. 2 is a schematic block diagram of the block diagram, FIG.
The figure is a diagram of the operation of both the basing and control processors, Figure 5 is a conceptual diagram of the optimal 11'' vector processing program, and Figure 6 is the lever 1- 1 circuit block diagram, Figure 7 is a block diagram of the second stage lever = 1 circuit of the control processor,
Figure 8 shows the rl-1 (1 (noh)
The r1 block diagram, Figures 9 and 10 show the editing function block "1)" for the local memory space of the page loader.
Figure 11 is a block diagram of the first stage logic of the store operation of the pacing block ``1-9 Figure 1. , FIG. 13 is a conceptual diagram of the local memory space.
-Rup "I Sera, 102 Beg J.Rup" 1 Setsa, +03.106 Slnoting Circuit, +
04,105 Main storage. 107 Begeso'J processor, 108.1 Widespread storage device 109 Main storage device.゛・'1Figure 21)

Claims

[Claims]

(1) A plurality of main storage devices, an expanded storage device of the main storage device, and three types of processors: a paging processor, a vector processor, and a control processor, and the control processor is provided with data processing of the three types of processors. In order to execute the above in parallel, an instruction stack for separately holding instructions for the three types of processors and a decoding process management mechanism for time-sharing decoding of multiple instructions on the instruction stack are provided, and the paging process is performed in parallel. Loading an area on the extended storage device used by a subset of the program into a specified main storage device in order to cause the processor to execute the program existing on the expanded storage device on the plurality of main storage devices. - A vector processing device, characterized in that it is provided with a logic mechanism for storing and a logic mechanism for editing the arrangement of data in a designated area, and the vector processor processes vector instructions on a main storage device under the control of the control processor. .

(2) The vector according to claim 1, wherein the expanded storage device has a larger capacity and lower speed than the plurality of main storage devices, and is used to store a large-scale program. Processing equipment.